I have to join data in Java, but I can't use a database. I have data coming from 3 sources (as if it came from 3 tables, but not database tables). Had it come from SQL tables, I would have applied a SQL join and joined them. But this data comes from text files, not SQL, and I am confused about how to join it. Please suggest a way. One approach I am considering is putting the data in lists, iterating over them in nested loops, and matching rows, but that would be complex. Please help.
It's not clear exactly what you mean or just what format these text files will have. I'm assuming it's something like this:
And you need to join them into
If so, I suggest:
1) Define a class corresponding to the joined row.
2) Define classes corresponding to each of the "sub-rows" in the individual files.
3) Create one SortedMap per file. The keys will be the IDs or whatever the common column is that you're joining on. The values will be the objects corresponding to that file's "sub-rows".
4) Read each file, populating the appropriate Map as you go.
5) Pick a Map as your starting point. Iterate over its entrySet(). Create a new "joined row" object for each entry, filling in whatever data is present in the value of that map. Call get(key) on each of the other Maps to get their corresponding data and fill in the rest of that joined row. If get(key) returns null, that file has no row for the key, so decide whether to skip the entry or leave those fields empty.
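The five steps above can be sketched roughly as follows. This is only a sketch under assumptions, since the actual file format isn't known: it assumes each source is a list of "id,value" lines with an integer id, and the names MapJoinSketch, JoinedRow, load, and join are all made up for illustration. To keep it short, each "sub-row" is a single String rather than its own class, and the lines are passed in as lists instead of being read from files (with real files, something like Files.readAllLines could supply them).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical example: three sources keyed by an integer id.
class MapJoinSketch {

    // Step 1: the joined row. Field names are illustrative assumptions.
    static final class JoinedRow {
        final int id;
        final String name;   // from source 1
        final String city;   // from source 2, null if that source has no match
        final String phone;  // from source 3, null if that source has no match

        JoinedRow(int id, String name, String city, String phone) {
            this.id = id;
            this.name = name;
            this.city = city;
            this.phone = phone;
        }

        @Override
        public String toString() {
            return id + "," + name + "," + city + "," + phone;
        }
    }

    // Steps 3 and 4: build one SortedMap per source from "id,value" lines.
    static SortedMap<Integer, String> load(List<String> lines) {
        SortedMap<Integer, String> map = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split(",", 2);
            map.put(Integer.parseInt(parts[0].trim()), parts[1].trim());
        }
        return map;
    }

    // Step 5: iterate over one map and look up the same key in the others.
    static List<JoinedRow> join(SortedMap<Integer, String> names,
                                Map<Integer, String> cities,
                                Map<Integer, String> phones) {
        List<JoinedRow> rows = new ArrayList<>();
        for (Map.Entry<Integer, String> entry : names.entrySet()) {
            int id = entry.getKey();
            rows.add(new JoinedRow(id, entry.getValue(), cities.get(id), phones.get(id)));
        }
        return rows;
    }

    public static void main(String[] args) {
        SortedMap<Integer, String> names = load(Arrays.asList("2,Bob", "1,Alice"));
        SortedMap<Integer, String> cities = load(Arrays.asList("1,Paris"));
        SortedMap<Integer, String> phones = load(Arrays.asList("1,555-0100", "2,555-0101"));
        for (JoinedRow row : join(names, cities, phones)) {
            System.out.println(row);
        }
    }
}
```

As written this behaves like a left outer join on the first map: an id missing from another source leaves that field null. For inner-join behavior you would skip the entry whenever one of the get(id) calls returns null.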
Part of how to solve it also depends on the size of the files. If each file only has a few hundred rows, just about anything will work.
If you are talking about millions of rows per file... how long do you want to wait for it to finish?
Software design is always about balancing many competing factors. You have to figure out which are the most important, and then let the others suffer slightly. Whether it is memory, complexity, speed, technologies (e.g. database vs. arrays) or something else... you can never have it all.
So CAN you put them into lists and iterate? Yes, but you will sacrifice speed and efficiency, because every row of one list gets compared against every row of the others.
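For comparison, here is roughly what the lists-and-iterate approach looks like. It is a sketch under assumptions: each row is a hypothetical {id, value} String pair, and NestedLoopJoinSketch and join are invented names. The logic itself is not very complex, but the work grows with the product of the list sizes, which is where the speed goes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical example: inner join of three lists by nested iteration.
class NestedLoopJoinSketch {

    // Each row is assumed to be a two-element array: {id, value}.
    static List<String> join(List<String[]> a, List<String[]> b, List<String[]> c) {
        List<String> out = new ArrayList<>();
        for (String[] ra : a) {
            for (String[] rb : b) {
                if (!ra[0].equals(rb[0])) {
                    continue; // ids differ, so this pair cannot join
                }
                for (String[] rc : c) {
                    if (ra[0].equals(rc[0])) {
                        // All three rows share the id: emit one joined line.
                        out.add(ra[0] + "," + ra[1] + "," + rb[1] + "," + rc[1]);
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> names = Arrays.asList(new String[] {"1", "Alice"}, new String[] {"2", "Bob"});
        List<String[]> cities = Arrays.asList(new String[] {"1", "Paris"}, new String[] {"3", "Rome"});
        List<String[]> phones = Arrays.asList(new String[] {"1", "555-0100"}, new String[] {"2", "555-0101"});
        for (String line : join(names, cities, phones)) {
            System.out.println(line);
        }
    }
}
```

With a few hundred rows per list this is fine. With millions, looking rows up by key in a Map instead of scanning the inner lists avoids most of the repeated comparisons.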
There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors
Speed is not a worry, because I am using all this logic in Hadoop. The only thing I am not clear about is, once I get all this data into 3 lists, how to compare and join them. The logic will become quite complex.