posted 3 years ago

Oh no! Disaster has struck some of ACME's redundant data centers. The administrators have managed to restore backups, but some data sets are still missing from some data centers. Fortunately, every data set can be found at least once in one or more of the data centers. However, before ACME can resume normal operations, it needs to ensure that each data center has a copy of every data set.

Your goal is to help ACME resume normal operations by writing a program to synchronize data sets between data centers using as few copies as possible.

The first line of input will contain an integer between 0 and 999999 inclusive, representing the number of data centers.

Following that will be one line of input for each data center. Each line will contain an integer from 0 to 299 representing the number of data sets at that data center, followed by a space and space-separated list of data set ids currently present at that data center. Data set ids are each an integer between 1 and 999999, inclusive. Each line will be at most 999 characters long.

Data set ids are not necessarily consecutive. The list of data sets will not be in any particular order.

Output:

The program must output an optimal data set copy strategy to ensure that every data center has a copy of every data set. Output one line for every copy instruction.

A copy instruction is of the form <data-set-id> <from> <to>, where <data-set-id> is the data set id, <from> is the index of the data center the data set will be copied from (1 = the first data center), and <to> is the index of the data center to copy the data set to.

When there are no more copy instructions, the program must output the word "done" on a line by itself.

There is often more than one correct output with minimal number of operations for a given input, and any output that satisfies the requirements is valid.

Example 1:

----------

Input:

4

3 1 3 4

3 1 2 3

2 1 3

3 1 4 2

One Possible Correct Output:

2 2 1

4 1 2

2 2 3

4 4 3

3 1 4

done

Example 2:

----------

Input:

2

2 1 2

2 2 1

Output:

done

Your goal is to help ACME resume normal operations by writing a program to synchronize data sets between data centers using as few copies as possible.

The first line of input will contain an integer between 0 and 999999 inclusive, representing the number of data centers.

Following that will be one line of input for each data center. Each line will contain an integer from 0 to 299 representing the number of data sets at that data center, followed by a space and space-separated list of data set ids currently present at that data center. Data set ids are each an integer between 1 and 999999, inclusive. Each line will be at most 999 characters long.

Data set ids are not necessarily consecutive. The list of data sets will not be in any particular order.

Output:

The program must output an optimal data set copy strategy to ensure that every data center has a copy of every data set. Output one line for every copy instruction.

A copy instruction is of the form <data-set-id> <from> <to>, where <data-set-id> is the data set id, <from> is the index of the data center the data set will be copied from (1 = the first data center), and <to> is the index of the data center to copy the data set to.

When there are no more copy instructions, the program must output the word "done" on a line by itself.

There is often more than one correct output with minimal number of operations for a given input, and any output that satisfies the requirements is valid.

Example 1:

----------

Input:

4

3 1 3 4

3 1 2 3

2 1 3

3 1 4 2

One Possible Correct Output:

2 2 1

4 1 2

2 2 3

4 4 3

3 1 4

done

Example 2:

----------

Input:

2

2 1 2

2 2 1

Output:

done

posted 3 years ago

You forgot to ask a question.

You might want to read HowToAskQuestionsOnJavaRanch, especially the NotACodeMill link.

You might want to read HowToAskQuestionsOnJavaRanch, especially the NotACodeMill link.

Joanne

Neeta Bharadwaj

Greenhorn

Posts: 10

Joanne Neal

Rancher

Posts: 3742

16

posted 3 years ago

I must be missing something but I don't understand how there can be an optimal number of copies - each problem will always require a fixed number of copies as far as I can see.

In your example, data centres 1, 2 and 4 are missing one data set and data centre 3 is missing 2 data sets, so five copies will be needed and those copies can be from any data center that has the required data set.

Maybe you can show us an example where there are an optimal and a non-optimal solution to the problem, but as far as I can tell at the moment the algorithm will be to find a data centre that has the required data set and copy it to the data centre that doesn't have it.

In your example, data centres 1, 2 and 4 are missing one data set and data centre 3 is missing 2 data sets, so five copies will be needed and those copies can be from any data center that has the required data set.

Maybe you can show us an example where there are an optimal and a non-optimal solution to the problem, but as far as I can tell at the moment the algorithm will be to find a data centre that has the required data set and copy it to the data centre that doesn't have it.

Joanne

posted 3 years ago

As with Joanne, I don't know about a specific algorithm, but there are datasets (Sets, Maps, etc) that can help you out. But the

Once you have

This is also just ONE way to do it - there are many others - but pretty much all of them will work much better if you have proper classes to hold your information, so don't be dealing with this stuff simply as Strings and numbers.

Also: write out what you need to do

HIH

Winston

Neeta Bharadwaj wrote:I dont know how to start with this problem. Can anyone suggest an algorithm?

As with Joanne, I don't know about a specific algorithm, but there are datasets (Sets, Maps, etc) that can help you out. But the

*first*thing you need to do is to create some classes - at the very least, a

`Dataset`class and a

`DataCentre`one - and

`Dataset`s should only be

`equal()`if their IDs are equal.

Once you have

`objects`, you can read them into a

`Set`(in the case of

`Dataset`s) or a

`List`(for your

`DataCentre`s) and, if each DataCentre also contains a

`List`of the

`Dataset`s it contains at the start, you can use the Set.removeAll() method to work out which ones it's missing (or alternatively, write a method to do it yourself). Just be warned:

`removeAll()`

*updates*

`Set`contents, so you'll need to copy the

`Set`before you run it.

This is also just ONE way to do it - there are many others - but pretty much all of them will work much better if you have proper classes to hold your information, so don't be dealing with this stuff simply as Strings and numbers.

Also: write out what you need to do

__in English__(or your native language) before you write your

*first line of code*.

HIH

Winston

"Leadership is nature's way of removing morons from the productive flow" - Dogbert

Articles by Winston can be found here

It is sorta covered in the JavaRanch Style Guide. |