The second time, it may export the same file with the following data:
I am not bothered about the change in row order, since I have a program that can compare the two CSV files correctly even if their rows are out of order. What I don't know is how to overcome the change in column order. Is there a way to process this CSV file and produce another file with a fixed column order?
How many columns are there? Can you create factory methods which take the different values in different orders? Remember the number of methods required is n! where n is the number of columns which might be reordered.
Are the column names always the same? Can you create some sort of map from column name to column value?
If so, then it's simply a matter of mapping against row and column headers instead of just row headers (since you mentioned that the row ordering doesn't bother you). Show us some code so we have a better idea of how you're doing the comparison and where it is messing up.
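A minimal sketch of the name-to-value map suggested above, assuming the first line of each file is the header and that fields are plain comma-separated values with no quoted commas. The class name `RowMapper` and the method `toMap` are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RowMapper {
    // Pairs each header name with the value at the same position in one data line.
    // Assumption: plain comma-separated fields, no quoting or embedded commas.
    static Map<String, String> toMap(String headerLine, String dataLine) {
        String[] names = headerLine.split(",", -1);
        String[] values = dataLine.split(",", -1);
        if (names.length != values.length) {
            throw new IllegalArgumentException("Column count mismatch: " + dataLine);
        }
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            row.put(names[i].trim(), values[i].trim());
        }
        return row;
    }
}
```

Because `Map.equals` ignores insertion order, a row read under the header `A,B,C` and the same row read under `B,A,C` produce equal maps, so the column order of the export stops mattering for the comparison.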
A, B and C are column headers. But 1, 2, 3 need not be row headers, since the order in which the data is exported is uncertain. Assuming I get a consistent column order (say A,B,C every time), I use the following code. But I am not sure how to deal with it when the columns come out of order (B,A,C).
Padmanabh Sahasrabudhe wrote: I only wish to see if the rows which file 1 has are all present in file 2 or not. I need not retain them. The contents of the two files should be the same row-wise.
It's not yet clear from your explanation, but I suspect you have two separate problems here: row order and column order. The first is (probably) a simple sorting exercise; the other is a mapping one, which supports Campbell's post.
For the latter, you will need some way of specifying the new order for your columns.
The simplest way to do that in Java is to supply column indexes in the order that you want them output, so if you intend to supply column identifiers instead (which, I assume, is what 'A', 'B', 'C'...etc. are), then you will need some way of translating your "new column order" input into a set (or array) of indexes, and then using that to rearrange the output for each line.
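As a rough sketch of that translation step: look up each wanted column name in the file's actual header to get an array of indexes, then use those indexes to rearrange every line. The names `ColumnReorderer`, `indexesFor` and `reorder` are hypothetical, and the naive `split(",")` assumes there are no quoted or embedded commas.

```java
import java.util.Arrays;
import java.util.List;

class ColumnReorderer {
    // Translates the wanted column names into positions within the file's actual header.
    static int[] indexesFor(String[] actualHeader, String[] wantedOrder) {
        List<String> actual = Arrays.asList(actualHeader);
        int[] indexes = new int[wantedOrder.length];
        for (int i = 0; i < wantedOrder.length; i++) {
            int pos = actual.indexOf(wantedOrder[i]);
            if (pos < 0) {
                throw new IllegalArgumentException("Missing column: " + wantedOrder[i]);
            }
            indexes[i] = pos;
        }
        return indexes;
    }

    // Rebuilds one line with its fields in the order given by indexes.
    // Assumption: fields are separated by plain commas only.
    static String reorder(String line, int[] indexes) {
        String[] fields = line.split(",", -1);
        String[] out = new String[indexes.length];
        for (int i = 0; i < indexes.length; i++) {
            out[i] = fields[indexes[i]];
        }
        return String.join(",", out);
    }
}
```

For a file whose header is `B,A,C` and a wanted order of `A,B,C`, `indexesFor` yields `{1, 0, 2}`, and `reorder("2,1,3", ...)` with those indexes gives `1,2,3`.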
It should be added that reading CSV files can be quite involved: it's not simply a case of splitting data on commas (unless you're absolutely sure that's all your data needs), so you might want to look at third-party libraries for reading your files.
Alternatively, if this CSV is generated from an Excel spreadsheet, you might want to look at Apache POI, because you may well be able to process it directly, rather than via CSVs.
I think I need to reframe my problem. I was chasing the wrong issue. My actual issue is that I have these two files:
Technically, both of these files contain the same data, and I want to verify that through my code. Winston, I am not generating data from Excel, but thanks for your pointers.
Padmanabh Sahasrabudhe wrote: Technically, both of these files contain the same data, and I want to verify that through my code. Winston, I am not generating data from Excel, but thanks for your pointers.
Right, well first you need to arrange the data from both files in the same column sequence; and to do that you must have:
In addition, you may also need to know:
After that, it's simply an issue of mapping columns in a consistent order and (I suspect) sorting rows based on their "mapped" content - unless you want some form of diff algorithm, which is rather more advanced.
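One possible way to put the mapping and sorting together, offered only as a sketch: rewrite every row into a canonical column order (here, header names sorted alphabetically, which is an arbitrary choice), sort the rewritten rows, and compare the results. `CsvComparer`, `normalize` and `sameData` are illustrative names. The sketch assumes each file is already read into a list of lines with the header first, that header names are unique, that every row has as many fields as the header, and that no field contains a quoted comma.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CsvComparer {
    // Returns the canonical header followed by all rows, each rewritten
    // into canonical column order, with the rows sorted.
    static List<String> normalize(List<String> lines) {
        String[] header = lines.get(0).split(",", -1);

        // Canonical order: column positions sorted by header name.
        Integer[] order = new Integer[header.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (x, y) -> header[x].compareTo(header[y]));

        List<String> rows = new ArrayList<>();
        for (String line : lines.subList(1, lines.size())) {
            rows.add(pick(line.split(",", -1), order));
        }
        Collections.sort(rows);

        List<String> result = new ArrayList<>();
        result.add(pick(header, order));
        result.addAll(rows);
        return result;
    }

    // Joins the fields at the given positions with commas.
    private static String pick(String[] fields, Integer[] order) {
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < order.length; k++) {
            if (k > 0) {
                sb.append(',');
            }
            sb.append(fields[order[k]]);
        }
        return sb.toString();
    }

    // True when both files hold the same columns and the same rows,
    // regardless of row order or column order.
    static boolean sameData(List<String> file1, List<String> file2) {
        return normalize(file1).equals(normalize(file2));
    }
}
```

This only answers "same data or not"; it does not report which rows differ, which is where a proper diff algorithm would come in.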