Hello All!
I have been trying to come up with the best possible way to compare the contents of two comma-delimited text files and create a third file whose lines append the corresponding lines from both files, matched on the first column value.
With the help of Jeff Albrechtsen (thanks, Jeff!), I have used HashMap, ArrayList, Set, Iterator, the regex utility and other suggested APIs for JDK 1.4.2. So far, what I have works, but to me the code looks ugly. I'm wondering if you gurus can advise me on a much better solution for what I want to accomplish. Eventually I will be manipulating certain column data (e.g. totalling the values of each column) from the final array list constructed from the two files to create the third file.
Here's the code, along with the usage, input files and output -
import java.io.*;
import java.util.*;
import java.util.regex.*;

public class CompareFiles {

    public static void main(String[] argv) {
        try {
            String basefile = argv[0];
            String inputfile = argv[1];
            List final_list = new ArrayList();

            // Calling the method to read files and create a HashMap object
            HashMap base_hash = readFile(basefile);
            HashMap input_hash = readFile(inputfile);
            System.out.println("Base Hash: " + base_hash);
            System.out.println("Input Hash: " + input_hash);

            // Iterating through the first input (base) file
            Set entries = base_hash.entrySet();
            for (Iterator it = entries.iterator(); it.hasNext(); ) {
                Map.Entry entry = (Map.Entry) it.next();
                String k = (String) entry.getKey();
                List val = (List) entry.getValue();

                // Look up the same key in the second input file;
                // skip keys with no match to avoid a NullPointerException
                List l = (List) input_hash.get(k);
                if (l == null) {
                    continue;
                }

                // Appending the key value to a StringBuffer object
                StringBuffer sb = new StringBuffer();
                sb.append(k);

                // Appending each base-file value to the same StringBuffer, prefixed with a comma
                Iterator iter = val.iterator();
                while (iter.hasNext()) {
                    String str = (String) iter.next();
                    sb.append(',').append(str);
                }

                // Appending each value from the second input file
                Iterator i = l.iterator();
                while (i.hasNext()) {
                    String s = (String) i.next();
                    sb.append(',').append(s);
                }

                // Add StringBuffer object into a List object
                final_list.add(sb);
            }

            // Iterate through the list
            Iterator a = final_list.iterator();
            while (a.hasNext()) {
                System.out.println("Array List " + a.next());
            }
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("\n" + "You must specify the base file and a file name as argument" + "\n");
            System.out.println("USAGE: java CompareFiles <BaseFileName> <InputFileName>" + "\n");
        }
    }

    public static HashMap readFile(String file_name) {
        HashMap hm = new HashMap();
        try {
            // Reading file
            BufferedReader reader = new BufferedReader(new FileReader(file_name));
            List list = new ArrayList();
            String line = reader.readLine();
            while (line != null) {
                list.add(line);
                line = reader.readLine();
            }
            reader.close();

            // Setting delimiter
            Pattern p = Pattern.compile(",");

            // Records are now in the list, one line per element
            Iterator iterator = list.iterator();
            while (iterator.hasNext()) {
                String str = (String) iterator.next();
                // Parsing each line by delimiter
                String[] result = p.split(str);
                // Storing the first value from the String array as the key
                String key = result[0];
                //Set value = new LinkedHashSet();
                List value = new ArrayList();
                // Rest of the String array will be the value
                for (int i = 1; i < result.length; i++) {
                    value.add(result[i]);
                }
                hm.put(key, value);
            }
        } catch (Exception ex) {
            System.out.println(ex);
        }
        return hm;
    }
}
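Since the match is on the first column only, the merge step can probably be shortened. The following is a minimal sketch, not a drop-in replacement: the class MergeSketch and its methods index and merge are hypothetical names, it keeps raw collections and casts so that it stays within JDK 1.4.2, and it assumes every usable line has a key followed by at least one comma. It uses LinkedHashMap so the merged rows come out in base-file order rather than HashMap order, and it skips keys that are missing from the second file.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; names are illustrative only.
class MergeSketch {

    // Map each line's first column to the remainder of the line,
    // preserving the order in which the lines were supplied.
    static Map index(String[] lines) {
        Map byKey = new LinkedHashMap();
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            int comma = line.indexOf(',');
            if (comma < 0) {
                continue; // assumption: lines without a delimiter are ignored
            }
            byKey.put(line.substring(0, comma), line.substring(comma + 1));
        }
        return byKey;
    }

    // Join rows on the key; keys absent from 'input' are skipped.
    static List merge(Map base, Map input) {
        List merged = new ArrayList();
        for (Iterator it = base.entrySet().iterator(); it.hasNext(); ) {
            Map.Entry e = (Map.Entry) it.next();
            String other = (String) input.get(e.getKey());
            if (other == null) {
                continue;
            }
            merged.add(e.getKey() + "," + e.getValue() + "," + other);
        }
        return merged;
    }

    public static void main(String[] args) {
        Map base = index(new String[] {"1,abc,cde,efg", "2,ghi,jkl,lmn", "3,nop,pqr,stw"});
        Map input = index(new String[] {"1,111,cde,efg", "2,222,jkl,lmn", "3,333,stv,lmn"});
        for (Iterator it = merge(base, input).iterator(); it.hasNext(); ) {
            System.out.println(it.next());
        }
    }
}
```

Keeping the remainder of each line as a single string avoids splitting and re-joining the fields when all that is needed is the concatenated row; if individual columns are needed later, the merged row can be split at that point.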
Input data files -
input1 :
1,abc,cde,efg
2,ghi,jkl, lmn
3,nop,pqr,stw
input2 :
1,111,cde,efg
2,222,jkl, lmn
3,333,stv,lmn
% java CompareFiles input1 input2
Base Hash: {3=[nop, pqr, stw], 2=[ghi, jkl, lmn], 1=[abc, cde, efg]}
Input Hash: {3=[333, stv, lmn], 2=[222, jkl, lmn], 1=[111, cde, efg]}
Array List 3,nop,pqr,stw,333,stv,lmn
Array List 2,ghi,jkl, lmn,222,jkl, lmn
Array List 1,abc,cde,efg,111,cde,efg
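For the later step of totalling a column over the merged rows, a rough sketch could look like the one below. ColumnSumSketch and sumColumn are made-up names, the column index is zero-based, and the code assumes the chosen column holds plain integers (Integer.parseInt throws NumberFormatException otherwise). It calls toString() on each element so it would accept either String or StringBuffer entries, as in final_list above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper class; names are illustrative only.
class ColumnSumSketch {

    // Sum the integer values in the given zero-based column of each row.
    // Rows that are too short to contain the column are skipped.
    static int sumColumn(List rows, int column) {
        int total = 0;
        for (Iterator it = rows.iterator(); it.hasNext(); ) {
            String[] fields = it.next().toString().split(",");
            if (column < fields.length) {
                // trim() tolerates stray spaces around a value, e.g. " 222"
                total += Integer.parseInt(fields[column].trim());
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List rows = new ArrayList();
        rows.add("3,nop,pqr,stw,333,stv,lmn");
        rows.add(new StringBuffer("2,ghi,jkl, lmn,222,jkl, lmn"));
        rows.add("1,abc,cde,efg,111,cde,efg");
        System.out.println("Total of column 4: " + sumColumn(rows, 4));
    }
}
```

On the three merged rows shown above, column 4 holds 333, 222 and 111, so the total is 666.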