# Set Combination Algorithm

Roshan Lal

Ranch Hand

Posts: 64

posted 10 years ago

Hi everybody,

I have a problem on which some pointers can be very useful

and would appreciate some help.

The problem can be formulated like this:

There are members like, A,B,C...L, M, N,O (15 members).

These members are grouped together by some criteria and we get

get groups(sets) like (A,B), (C,D), (J,K,L), (B,D,I), (B,C,D), (A,K)

,(J,K,M,N,O) and so forth. There will be dozens and in worse cases

hundreds of such groups.

These groups have some properties which makes a group better than the other. First is the size. The larger size group is better.

Second, color.can be be green or yellow. If

there are two groups the same size say G(A,B) and Y(A,B) then G(A,B) is considered better than Y(A,B).

The goal is to combine these groups into arrangements(set of groups)

and find the best arrangement so that in an arrangement of groups,

i)none of the member in the combined group arrangement is repeated

ie [Y(A,B), G(B,C)] is invalid arrangement becuase B appears twice here.

ii) an arrangement with larger number of member is considered better

ie Y(A,B,C) is superior to G(A,B).

iii) if two arrangement have same number of members, then the one with

fewer group is considered better. eg [Y(A,B,C,D)] is better than [Y(A,B), G(C,D)].

Finding the best arrangement is proving to be an expensive operation.

My initial logic would be:

a) remove all the yellow groups who have a corresponding

sized green group as they are guarnteed to be inferior in the presence of green

group. That is if both G(A,B) and Y(A,B) exist keep only G(A,B)

b)Then partition groups into two categories the ones which don't have any of its member in any other groups and the others which may have at least one of its member present in any other group.

c) Then for each group which have at least of one of its member in common with others, we generate an arrangement by adding that group and all the disjoint groups. Compare this arrangement with the last one and keep the better one.

I hope I'm making the problem understandble. Any help, suggestion is

highly appreciated.

Thanks

Roshan

I have a problem on which some pointers can be very useful

and would appreciate some help.

The problem can be formulated like this:

There are members like, A,B,C...L, M, N,O (15 members).

These members are grouped together by some criteria and we get

get groups(sets) like (A,B), (C,D), (J,K,L), (B,D,I), (B,C,D), (A,K)

,(J,K,M,N,O) and so forth. There will be dozens and in worse cases

hundreds of such groups.

These groups have some properties which makes a group better than the other. First is the size. The larger size group is better.

Second, color.can be be green or yellow. If

there are two groups the same size say G(A,B) and Y(A,B) then G(A,B) is considered better than Y(A,B).

The goal is to combine these groups into arrangements(set of groups)

and find the best arrangement so that in an arrangement of groups,

i)none of the member in the combined group arrangement is repeated

ie [Y(A,B), G(B,C)] is invalid arrangement becuase B appears twice here.

ii) an arrangement with larger number of member is considered better

ie Y(A,B,C) is superior to G(A,B).

iii) if two arrangement have same number of members, then the one with

fewer group is considered better. eg [Y(A,B,C,D)] is better than [Y(A,B), G(C,D)].

Finding the best arrangement is proving to be an expensive operation.

My initial logic would be:

a) remove all the yellow groups who have a corresponding

sized green group as they are guarnteed to be inferior in the presence of green

group. That is if both G(A,B) and Y(A,B) exist keep only G(A,B)

b)Then partition groups into two categories the ones which don't have any of its member in any other groups and the others which may have at least one of its member present in any other group.

c) Then for each group which have at least of one of its member in common with others, we generate an arrangement by adding that group and all the disjoint groups. Compare this arrangement with the last one and keep the better one.

I hope I'm making the problem understandble. Any help, suggestion is

highly appreciated.

Thanks

Roshan

Frank Carver

Sheriff

Posts: 6920

posted 10 years ago

This is really an algorithm performance question rather than one about servlets, so I have moved it to our Performance and Algorithms forum where you may get a better answer.

Lear nable

Greenhorn

Posts: 3

posted 10 years ago

How do you color groups? Does it serve any practical purpose?

Originally posted by Roshan Lal:

Hi everybody,

These groups have some properties which makes a group better than the other. First is the size. The larger size group is better.

Second, color.can be be green or yellow. If

there are two groups the same size say G(A,B) and Y(A,B) then G(A,B) is considered better than Y(A,B).

How do you color groups? Does it serve any practical purpose?

William Brogden

Author and all-around good cowpoke

Rancher

Rancher

Posts: 13074

6

posted 10 years ago

I would start with smallest groups, because they tend to have the smallest intersections.

When I understood correctly, a supergroup of 8 Groups with a total of 16 elements is better than a supergroup of one group with 15 elements.

Then I would try to reach the maximal number of elements.

Then I would try to replace groups with larger groups of the same elements.

X (A, B) + Y (C, D) -> Z (A, B, C, D)

This would still be a hard job, and I guess it is not guaranteed to find the best combination.

It reminds me of the backpacker problem.

Are groups of 1 element possible groups? X (A)

When I understood correctly, a supergroup of 8 Groups with a total of 16 elements is better than a supergroup of one group with 15 elements.

Then I would try to reach the maximal number of elements.

Then I would try to replace groups with larger groups of the same elements.

X (A, B) + Y (C, D) -> Z (A, B, C, D)

This would still be a hard job, and I guess it is not guaranteed to find the best combination.

It reminds me of the backpacker problem.

Are groups of 1 element possible groups? X (A)

Lear nable

Greenhorn

Posts: 3

posted 10 years ago

If I understand correctly, an optimal and valid arrangement is one in which all 15 members are included, there is NO duplication of members, and better groups prefered over worse groups. Is that correct? By "no duplication", do you mean "minimize duplication"?

Can you end up with a set of groups in partition 2 (the one with groups that have at least one member in common with another group) such that you can never form an arrangement with no duplication AND include all 15 members?

For e.g.

Partition 1: G{A,B,C,D,E,F,G}, G{H, I, J, K, L}

Partition 2: G{M, O}, G{N, O}

What would you do in this case?

Also, take a loop at Set Cover Problem

b)Then partition groups into two categories the ones which don't have any of its member in any other groups and the others which may have at least one of its member present in any other group.

c) Then for each group which have at least of one of its member in common with others, we generate an arrangement by adding that group and all the disjoint groups. Compare this arrangement with the last one and keep the better one.

If I understand correctly, an optimal and valid arrangement is one in which all 15 members are included, there is NO duplication of members, and better groups prefered over worse groups. Is that correct? By "no duplication", do you mean "minimize duplication"?

Can you end up with a set of groups in partition 2 (the one with groups that have at least one member in common with another group) such that you can never form an arrangement with no duplication AND include all 15 members?

For e.g.

Partition 1: G{A,B,C,D,E,F,G}, G{H, I, J, K, L}

Partition 2: G{M, O}, G{N, O}

What would you do in this case?

Also, take a loop at Set Cover Problem

Roshan Lal

Ranch Hand

Posts: 64

posted 10 years ago

If I understand correctly, an optimal and valid arrangement is one in which all 15 members are included, there is NO duplication of members, and better groups prefered over worse groups. Is that correct? By "no duplication", do you mean "minimize duplication"?

You are right, Lear.

But to be considered "best" it doesn't have to have all

the members in it. In many cases, it is not even possible to have an arrangement which will cover all the members.

Best here simply will mean there is no other better arrangement. That

is, if there are 15 members and we could make only 3 arrangements with

10, 12, 13 total members then the arrangement with 13 is best.

Absolutely no duplication. Even if a single member is common amongs

groups inside the arrangement, the whole arrangement is invalid.

I'll do more search on Set Cover problem. That wiki link just gives definition.

If I understand correctly, an optimal and valid arrangement is one in which all 15 members are included, there is NO duplication of members, and better groups prefered over worse groups. Is that correct? By "no duplication", do you mean "minimize duplication"?

You are right, Lear.

But to be considered "best" it doesn't have to have all

the members in it. In many cases, it is not even possible to have an arrangement which will cover all the members.

Best here simply will mean there is no other better arrangement. That

is, if there are 15 members and we could make only 3 arrangements with

10, 12, 13 total members then the arrangement with 13 is best.

Absolutely no duplication. Even if a single member is common amongs

groups inside the arrangement, the whole arrangement is invalid.

I'll do more search on Set Cover problem. That wiki link just gives definition.

Roshan Lal

Ranch Hand

Posts: 64

posted 10 years ago

Thanks Stefan.

1. The input groups are unsorted, if we start with smallest then

we have to sort them first.

2. "a supergroup of 8 Groups with a total of 16 elements is better than a supergroup of one group with 15 elements". Yes, any arrangement with more total member is better.

I guess, the harder part is finding the intersecting sets efficiently as there could be hundreds of sets.

A naive solution could be like this.

Set findBestArrangement(Group[] groups) {

Set best = = new HashSet();;

for (int i=0; i < groups.length; i++) {

Set arrangement = new HashSet();

Group g = groups[i];

arrangement.add(g);

for (int j=0; j < groups.length; j++) {

if (i != j) {

Group g1 = groups[j];

if (!g.intersects(g1)) {

arrangement.add(g1);

}

}

}

if (arrangement.compareTo(best) > 0) {

best = arrangement;

}

}

return best;

}

We may improve upon this, say by storing the intersecting groups found so

far and so may have to sacrifice some memory for speed.

Thanks everybody for your input

Roshan

Originally posted by Stefan Wagner:

I would start with smallest groups, because they tend to have the smallest intersections.

When I understood correctly, a supergroup of 8 Groups with a total of 16 elements is better than a supergroup of one group with 15 elements.

Then I would try to reach the maximal number of elements.

Then I would try to replace groups with larger groups of the same elements.

X (A, B) + Y (C, D) -> Z (A, B, C, D)

This would still be a hard job, and I guess it is not guaranteed to find the best combination.

It reminds me of the backpacker problem.

Are groups of 1 element possible groups? X (A)

Thanks Stefan.

1. The input groups are unsorted, if we start with smallest then

we have to sort them first.

2. "a supergroup of 8 Groups with a total of 16 elements is better than a supergroup of one group with 15 elements". Yes, any arrangement with more total member is better.

I guess, the harder part is finding the intersecting sets efficiently as there could be hundreds of sets.

A naive solution could be like this.

Set findBestArrangement(Group[] groups) {

Set best = = new HashSet();;

for (int i=0; i < groups.length; i++) {

Set arrangement = new HashSet();

Group g = groups[i];

arrangement.add(g);

for (int j=0; j < groups.length; j++) {

if (i != j) {

Group g1 = groups[j];

if (!g.intersects(g1)) {

arrangement.add(g1);

}

}

}

if (arrangement.compareTo(best) > 0) {

best = arrangement;

}

}

return best;

}

We may improve upon this, say by storing the intersecting groups found so

far and so may have to sacrifice some memory for speed.

Thanks everybody for your input

Roshan

Ilja Preuss

author

Sheriff

Sheriff

Posts: 14112

posted 10 years ago

Sorting is relatively inexpensive - it can typically be done in O(n log n).

Originally posted by Roshan Lal:

Thanks Stefan.

1. The input groups are unsorted, if we start with smallest then

we have to sort them first.

Sorting is relatively inexpensive - it can typically be done in O(n log n).

The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus

Roshan Lal

Ranch Hand

Posts: 64

posted 10 years ago

Lear,

There are business rules whereby groups can be in different "color"/catagories. The "color" word is used for

illustrated purpose only.

There is practical purpose for assigning these catagories.

This is one of the parameter in maximization function for the

arrangements.

Thanks

Roshan

Originally posted by Lear nable:

How do you color groups? Does it serve any practical purpose?

Lear,

There are business rules whereby groups can be in different "color"/catagories. The "color" word is used for

illustrated purpose only.

There is practical purpose for assigning these catagories.

This is one of the parameter in maximization function for the

arrangements.

Thanks

Roshan

Roshan Lal

Ranch Hand

Posts: 64

Roshan Lal

Ranch Hand

Posts: 64

Randall Kippen

Greenhorn

Posts: 15