
write huge data into a File in java

 
Nikki Tha
Greenhorn
Posts: 13
Hi,

I have to write a large amount of data to a file in Java.

1. I retrieve data from Oracle DB with 4 Queries.
Query 1: Employee data: EmpID, EmpName, EmpDept and many more
Query 2: Leave Data:    EmpID, Number of Days of leave, type of leave and many more
Query 3: Department Data:EmpID, DeptID, Dept Name and many more
Query 4: EmploymentData:EmpID, Current Position, previous Position and many more
-> Each Query has a common field: EmpID
-> Each query is too big and too complex, which is why I wrote separate queries; otherwise I would have combined them through the "WITH" clause in Oracle.

Right now, my requirement is to write each employee's data together in the file, as in the sample data below:

EMP001 EMPName123 DEPT111
EMP001 2Days   Casual Leave
EMP001 3Days   Sick Leave
EMP001 DEPT111 HR Dept
EMP001 SR.SoftwareEngineer SoftwareEngineer
EMP002 EMPName987 DEPT999
EMP002 10Days   Casual Leave
EMP002 12Days   Sick Leave
EMP002 DEPT999 Fire Dept
EMP002 Manager TeamLead
EMP002 TeamLead Sr.Engineer

As you can see, each Employee will have only one line for the "Employee Data" and for other data he can have any number of lines of data.

So the solution I thought of for this is:

1. Run all 4 queries and store the results in 4 different ArrayLists in Java.
2. Just before writing the data to the file, take each EmpID from the Employee Data and, in an inner while loop (one inner while loop for each list), write all the lines for that EmpID to the file.
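
For concreteness, the two steps above could be sketched roughly as follows. This is only an illustration under assumptions: each query row is represented as a String[] whose first element is the EmpID (the real code would use the value objects), for-each loops stand in for the while loops, and the class and method names are made up for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: every row is a String[] with the EmpID at index 0.
class NaiveEmployeeReport {

    // Inner loop: scans the whole list and appends every row with a matching EmpID.
    static void appendMatching(String empId, List<String[]> rows, StringBuilder out) {
        for (String[] row : rows) {
            if (row[0].equals(empId)) {
                out.append(String.join(" ", row)).append('\n');
            }
        }
    }

    // Outer loop over employees, then one inner loop per detail list.
    static String build(List<String[]> employees, List<String[]> leaves,
                        List<String[]> departments, List<String[]> positions) {
        StringBuilder out = new StringBuilder();
        for (String[] emp : employees) {
            out.append(String.join(" ", emp)).append('\n');
            appendMatching(emp[0], leaves, out);
            appendMatching(emp[0], departments, out);
            appendMatching(emp[0], positions, out);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String[]> employees = Arrays.asList(
                new String[]{"EMP001", "EMPName123", "DEPT111"},
                new String[]{"EMP002", "EMPName987", "DEPT999"});
        List<String[]> leaves = Arrays.asList(
                new String[]{"EMP001", "2Days", "Casual Leave"},
                new String[]{"EMP002", "10Days", "Casual Leave"});
        List<String[]> departments = Arrays.asList(
                new String[]{"EMP001", "DEPT111", "HR Dept"},
                new String[]{"EMP002", "DEPT999", "Fire Dept"});
        List<String[]> positions = Arrays.asList(
                new String[]{"EMP001", "SR.SoftwareEngineer", "SoftwareEngineer"},
                new String[]{"EMP002", "Manager", "TeamLead"});
        System.out.print(build(employees, leaves, departments, positions));
    }
}
```

Note that every employee triggers a full pass over each of the other three lists, so the work grows with the number of employees multiplied by the number of detail rows.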

Could you please give your suggestions for improving this solution, or any new solutions with the best possible performance?

---Thanks in Advance
 
Campbell Ritchie
Sheriff
Posts: 55351
157
What sort of file are you writing to? Random access? Text? Why do you need to extract data from the database in the first place? What do your queries look like when they are printed? How many lines do you expect to write?
 
Campbell Ritchie
Sheriff
Posts: 55351
157
...and can you create an object to encapsulate those data?
 
Nikki Tha
Greenhorn
Posts: 13
1. This is a text file.
2. The number of employees will be around 100,000.
3. For each Employee line, there might be around 3-10 lines of each other query.
4. I need to extract data from the database because there are many conditions and flags based on which I have to retrieve data.
5. Yes, I have created one Value Object for each query (because each of the 4 queries has different fields).

Thanks
 
Campbell Ritchie
Sheriff
Posts: 55351
157
Nikki Tha wrote:. . . For each Employee line, there might be around 3-10 lines of each other query. . . .
And how long is a line?
Thanks
That's a pleasure

I presume you can give these value objects a toString method. Separate the fields in the text by reproducible characters, e.g. pipe |, so you can regard the output as a CSV. Then you can create a Formatter which prints the lines. You may find it easier to read all the Employee objects into a List.
Or even to write them into a blocking queue on Thread 1 and use Thread 2 to read them from the other end of the queue and write them to the file.
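
A minimal sketch of those suggestions, under stated assumptions: LeaveRecord and its fields are hypothetical stand-ins for one of the value objects, the END marker is an invented way to tell the reader that the queue is finished, and the calling thread plays the part of Thread 2.

```java
import java.util.Arrays;
import java.util.Formatter;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical value object for one row of the leave query.
class LeaveRecord {
    final String empId;
    final int days;
    final String type;

    LeaveRecord(String empId, int days, String type) {
        this.empId = empId;
        this.days = days;
        this.type = type;
    }

    // Pipe-separated fields, so the output can be treated like a CSV.
    @Override
    public String toString() {
        return empId + "|" + days + "|" + type;
    }
}

class QueueWriterSketch {
    // Marker object (an assumption of this sketch) that signals the end of the data.
    private static final Object END = new Object();

    // A producer thread puts the records on the queue; the calling thread
    // takes them off the other end and formats one line per record into 'out'.
    static void writeAll(List<?> records, Appendable out) {
        final BlockingQueue<Object> queue = new ArrayBlockingQueue<>(1024);
        Thread producer = new Thread(() -> {
            try {
                for (Object r : records) {
                    queue.put(r);
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        Formatter formatter = new Formatter(out);
        try {
            for (Object r = queue.take(); r != END; r = queue.take()) {
                formatter.format("%s%n", r); // %s uses the record's toString()
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while writing", e);
        }
        formatter.flush(); // flushed but not closed; the caller owns 'out'
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        writeAll(Arrays.asList(new LeaveRecord("EMP001", 2, "Casual Leave"),
                               new LeaveRecord("EMP001", 3, "Sick Leave")), out);
        System.out.print(out);
    }
}
```

Passing a BufferedWriter obtained from Files.newBufferedWriter instead of the StringBuilder would send the same lines to a file. Whether the second thread buys anything depends on how slow the database side is compared with the disk, so it is worth measuring first.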

A 3000 character line for each of 100000 instances is 3000 × 100000 × 2 = 6×10⁸ bytes (at 2 bytes per char), so it is not huge at all. You should have no difficulty with 600MB files. Of course, finding something in that file requires a linear search, which won't be fast.
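
The estimate can be checked with a few lines of code. This is only a sketch: FileSizeEstimate is a made-up name and the sample line is invented. The 2 bytes per char figure is the size of a Java char in memory; the size on disc depends on the encoding the writer uses, and plain ASCII text written as UTF-8 takes 1 byte per char.

```java
import java.nio.charset.StandardCharsets;

class FileSizeEstimate {
    // Bytes needed for 'lines' lines of 'charsPerLine' chars at 'bytesPerChar' bytes each.
    static long estimateBytes(long lines, long charsPerLine, long bytesPerChar) {
        return lines * charsPerLine * bytesPerChar;
    }

    public static void main(String[] args) {
        // Figures from the post: 100000 lines of 3000 chars at 2 bytes per char.
        System.out.println(estimateBytes(100_000, 3_000, 2)); // 600000000

        // The same (invented) 21-character line in two encodings.
        String sample = "EMP001|2|Casual Leave";
        System.out.println(sample.getBytes(StandardCharsets.UTF_8).length);    // 21
        System.out.println(sample.getBytes(StandardCharsets.UTF_16BE).length); // 42
    }
}
```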
 
Tony Docherty
Bartender
Posts: 3264
81
I would suggest you start by ensuring all the data returned by each of your DB queries is sorted into ascending EMPID order; that way you don't have to search through each of the lists for a given EMPID value.
But, as Campbell has said, it's not an enormous amount of data, so I'd suggest you write it out and see how long it takes, and then, if it's taking too long, look at how to improve the performance.
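
With sorted results, one way to avoid the searching is to keep a position in each list and only ever move it forwards. The sketch below makes several assumptions: rows are String[] with the EmpID first, the lists are random-access (e.g. ArrayList), all names are invented, and the IDs are fixed-width like EMP001 so that String.compareTo agrees with the database's ORDER BY.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: all four lists are already sorted by EmpID (index 0).
class SortedMergeReport {

    // Appends the run of rows for 'empId' starting at 'from' and returns
    // the index of the first row that was not consumed.
    static int appendRun(String empId, List<String[]> rows, int from, StringBuilder out) {
        int i = from;
        // Skip rows whose EmpID sorts before this employee (no matching employee row).
        while (i < rows.size() && rows.get(i)[0].compareTo(empId) < 0) {
            i++;
        }
        while (i < rows.size() && rows.get(i)[0].equals(empId)) {
            out.append(String.join(" ", rows.get(i))).append('\n');
            i++;
        }
        return i;
    }

    static String build(List<String[]> employees, List<String[]> leaves,
                        List<String[]> departments, List<String[]> positions) {
        StringBuilder out = new StringBuilder();
        int l = 0, d = 0, p = 0; // one cursor per detail list, never moved backwards
        for (String[] emp : employees) {
            out.append(String.join(" ", emp)).append('\n');
            l = appendRun(emp[0], leaves, l, out);
            d = appendRun(emp[0], departments, d, out);
            p = appendRun(emp[0], positions, p, out);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String[]> employees = Arrays.asList(
                new String[]{"EMP001", "EMPName123", "DEPT111"},
                new String[]{"EMP002", "EMPName987", "DEPT999"});
        List<String[]> leaves = Arrays.asList(
                new String[]{"EMP001", "2Days", "Casual Leave"},
                new String[]{"EMP001", "3Days", "Sick Leave"},
                new String[]{"EMP002", "10Days", "Casual Leave"});
        List<String[]> departments = Arrays.asList(
                new String[]{"EMP001", "DEPT111", "HR Dept"},
                new String[]{"EMP002", "DEPT999", "Fire Dept"});
        List<String[]> positions = Arrays.asList(
                new String[]{"EMP001", "SR.SoftwareEngineer", "SoftwareEngineer"},
                new String[]{"EMP002", "Manager", "TeamLead"});
        System.out.print(build(employees, leaves, departments, positions));
    }
}
```

Each list is walked exactly once, so the cost is proportional to the total number of rows rather than to employees times rows.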
 
Campbell Ritchie
Sheriff
Posts: 55351
157
I can't see that writing the file will take long at all. I am pretty sure you can select those 100000 lines from the database and write them to disc in a few seconds. But I still can't see what advantage the text file will have over the database when it comes to searching, etc.
 
Nikki Tha
Greenhorn
Posts: 13
Campbell Ritchie wrote:I can't see that writing the file will take long at all. I am pretty sure you can select those 100000 lines from the database and write them to disc in a few seconds. But I still can't see what advantage the text file will have over the database when it comes to searching, etc.


Actually, I have to generate this file and send it to another application, which accepts it as its input.
 
Campbell Ritchie
Sheriff
Posts: 55351
157
That still sounds an iffy bit of design. I would get suspicious of applications which “require” a particular form of input like a text file.
 