Performance Problem?

 
Santana Iyer
Ranch Hand
Posts: 335
I have a web application.

I have a file with 2 million records, one record per line. Record length varies, with a maximum of 32 characters.

I display the records on a page (1,000 records per page).

The user can add records and can edit any of the 1,000 records shown on the page. The user can also delete a record by ticking the checkbox that appears next to it.

Assume the user is on page 5, viewing records 4001 to 5000. After adding and editing records he clicks Save, and the following happens:

1. I create a temp file.
2. I read records 1 to 4000 sequentially from the original file using readLine() and copy them to the temp file.
3. I write the 1,000 records from the form displayed on screen, so that the user's modifications to the viewed records are saved.
4. I skip to record 5001 in the original file and write all remaining records to the temp file. (The temp file now contains the updated data.)
5. I back up the original and rename the temp file to the original name.
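The five steps above could be sketched roughly as follows. This is only an illustration under assumptions: the class and method names (`PageSaver`, `savePage`), the `.bak` backup name, and UTF-8 encoding are made up for the example and do not come from the original post.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Hypothetical helper illustrating the temp-file rewrite described in the post.
final class PageSaver {

    /**
     * Rewrites the file, replacing one page of records with the edited list.
     * firstRecord is the 0-based index of the first record on the page;
     * pageSize is how many original records that page covered.
     */
    static void savePage(Path original, int firstRecord, int pageSize, List<String> edited)
            throws IOException {
        Path dir = original.toAbsolutePath().getParent();
        // Step 1: create the temp file next to the original.
        Path temp = Files.createTempFile(dir, "records", ".tmp");
        Path backup = original.resolveSibling(original.getFileName() + ".bak");

        try (BufferedReader in = Files.newBufferedReader(original, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
            String line;
            int index = 0;
            // Step 2: copy every record before the page unchanged.
            while (index < firstRecord && (line = in.readLine()) != null) {
                out.write(line);
                out.newLine();
                index++;
            }
            // Step 3: write the page as submitted (may be longer or shorter than pageSize).
            for (String record : edited) {
                out.write(record);
                out.newLine();
            }
            // Step 4: skip the old versions of the page's records, then copy the rest.
            for (int i = 0; i < pageSize && in.readLine() != null; i++) {
                // nothing to do: the old record is simply dropped
            }
            while ((line = in.readLine()) != null) {
                out.write(line);
                out.newLine();
            }
        }
        // Step 5: back up the original, then put the temp file in its place.
        Files.copy(original, backup, StandardCopyOption.REPLACE_EXISTING);
        Files.move(temp, original, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

For page 5 this would be called as `PageSaver.savePage(file, 4000, 1000, editedRecords)`. Because the edited list is written as-is, added and deleted records on the page are handled by the same pass.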


I also show page links, so when the user clicks on, say, page 100, I have to read the file sequentially (skipping 99,000 lines using readLine()) and then read the 1,000 records that will be shown to the user.
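The sequential page read described here might look like the sketch below. The names `PageReader` and `readPage` and the UTF-8 encoding are assumptions for illustration. For page p it skips (p - 1) × pageSize lines, so the cost still grows with the page number.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: reads one page of line-based records by skipping earlier lines.
final class PageReader {

    /** pageNumber is 1-based; returns at most pageSize records. */
    static List<String> readPage(Path file, int pageNumber, int pageSize) throws IOException {
        long toSkip = (long) (pageNumber - 1) * pageSize;
        // Files.lines reads lazily, but skipped lines are still read and decoded.
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.skip(toSkip).limit(pageSize).collect(Collectors.toList());
        }
    }
}
```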



I don't find this solution very good, but I cannot think of a better one.

I have to do so many readLine() calls because the record size is not fixed. And since records can be inserted in between and deleted, I have to create a temp file and copy all the records.

I cannot use a database; this is a restriction.

Is there a better solution?
 
Ernest Friedman-Hill
author and iconoclast
Posts: 24207
If you can make all the records the same size, then you can use RandomAccessFile, which will let you skip directly to a record and modify it without rewriting the whole file.
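A minimal sketch of the fixed-size idea, assuming ASCII records padded with spaces to the 32-character maximum from the original post, plus a newline. The class `FixedRecords` and its methods are hypothetical names, not part of any library.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: every record occupies exactly RECORD_LEN bytes on disk.
final class FixedRecords {
    static final int DATA_LEN = 32;              // maximum record length from the post
    static final int RECORD_LEN = DATA_LEN + 1;  // plus a trailing '\n'

    /** Pads the record with spaces to DATA_LEN bytes and appends a newline. */
    static byte[] encode(String record) {
        byte[] data = record.getBytes(StandardCharsets.US_ASCII);
        if (data.length > DATA_LEN) {
            throw new IllegalArgumentException("record longer than " + DATA_LEN);
        }
        byte[] out = new byte[RECORD_LEN];
        Arrays.fill(out, (byte) ' ');
        System.arraycopy(data, 0, out, 0, data.length);
        out[DATA_LEN] = '\n';
        return out;
    }

    /** Overwrites the record at the given 0-based index in place. */
    static void write(RandomAccessFile file, long index, String record) throws IOException {
        file.seek(index * RECORD_LEN);
        file.write(encode(record));
    }

    /** Reads the record at the given 0-based index, dropping the space padding. */
    static String read(RandomAccessFile file, long index) throws IOException {
        byte[] buf = new byte[RECORD_LEN];
        file.seek(index * RECORD_LEN);
        file.readFully(buf);
        int end = DATA_LEN;
        while (end > 0 && buf[end - 1] == ' ') {
            end--; // note: trailing spaces that belong to the record are lost too
        }
        return new String(buf, 0, end, StandardCharsets.US_ASCII);
    }
}
```

With this layout, page 100 starts at byte offset 99,000 × 33, so no earlier lines need to be read. Inserting or deleting a record in the middle would still shift everything after it.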

If you can't, then you could do something a little more complex: make an "index file" which shows the starting offset of each record. Then when you modify a record, append the new one to the end of the file, and modify the index file to point to the new offset (you'd use RandomAccessFile on the index.) Occasionally (overnight?) you could rewrite the main file, omitting all the "dead" records.
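The index-file idea could be sketched like this, assuming the index stores one 8-byte offset per record slot and the data file holds ASCII lines. `IndexedRecords`, `put`, and `get` are hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: variable-length records in an append-only data file,
// located through a fixed-width index of byte offsets.
final class IndexedRecords {
    static final int ENTRY_LEN = Long.BYTES; // each index entry is one 8-byte offset

    /** Appends the record to the data file and points index slot `slot` at it. */
    static void put(RandomAccessFile data, RandomAccessFile index, long slot, String record)
            throws IOException {
        long offset = data.length();
        data.seek(offset);
        data.write((record + "\n").getBytes(StandardCharsets.US_ASCII));
        index.seek(slot * ENTRY_LEN);
        index.writeLong(offset); // any older copy of the record is now "dead"
    }

    /** Looks up the slot's offset in the index and reads that line from the data file. */
    static String get(RandomAccessFile data, RandomAccessFile index, long slot)
            throws IOException {
        index.seek(slot * ENTRY_LEN);
        long offset = index.readLong();
        data.seek(offset);
        return data.readLine(); // byte-per-char decoding, fine for ASCII records
    }
}
```

Modifying a record is then one append plus one 8-byte index write, whatever the new length. The periodic rewrite that drops dead records is not shown. Inserting or deleting a slot would still mean shifting index entries, but for 2 million records the index is only about 16 MB.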
 
Santana Iyer
Ranch Hand
Posts: 335
Thanks, but there are problems:

1. I have to write a record between two existing records, not at the end.
2. With RandomAccessFile I can only replace a record with one of the same length. For example, a record ABCD can only be overwritten with exactly the same number of characters, no more and no less, or my next record gets corrupted.
3. For deletion I have to rewrite the file.

Because of the above problems, it seems to me that I have to rewrite the file every time, i.e. create a temp file and rename it to the original.
 
Ernest Friedman-Hill
author and iconoclast
Posts: 24207
If you have to keep the records in a file, in order, and they're variable length, then yes, you're pretty much hosed. In this kind of problem, you can improve performance only by changing the data structures you use. Since this is apparently not possible, you have two options:

1) Go to whoever is insisting on this storage format, explain why it's a bad choice, and offer alternatives.

2) Wait until that same person complains to you about the performance of the deployed system, tell them it's because of the storage format, and suffer the consequences at that time.

But there's no magic way to make rewriting the file go ten times faster.
 
Santana Iyer
Ranch Hand
Posts: 335
Thanks a lot.
I am more worried about corruption of data rather than performance.
But it seems this is the only way; I just wanted a senior's opinion. Thank you, Sir.
 
Ernest Friedman-Hill
author and iconoclast
Posts: 24207

Originally posted by Santana Iyer:

I am more worried about corruption of data rather than performance.



If you write a new file, while saving the old one, and only rename the new one once you know that the file writing went OK, as you've described, then this is generally a safe thing to do. Of course, you do have to worry about concurrent updates, something you haven't mentioned here. If there is more than one user of the system at a time, then rewriting a single file obviously becomes a difficult and dangerous thing to manage.
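As a rough sketch of the write-then-rename pattern with a guard against overlapping saves: the code below assumes all saves go through a single JVM, so one in-process lock is enough, and it takes the full record list in memory only to keep the example short. `SafeRewrite` and `replaceContents` are hypothetical names.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper: serializes rewrites and only swaps files after a full write.
final class SafeRewrite {
    // Assumption: every writer runs in this JVM; other processes are not guarded.
    private static final ReentrantLock SAVE_LOCK = new ReentrantLock();

    static void replaceContents(Path original, List<String> allRecords) throws IOException {
        SAVE_LOCK.lock();
        try {
            Path dir = original.toAbsolutePath().getParent();
            Path temp = Files.createTempFile(dir, "records", ".tmp");
            Path backup = original.resolveSibling(original.getFileName() + ".bak");
            try {
                // Write the complete new file first; the original is untouched so far.
                Files.write(temp, allRecords, StandardCharsets.UTF_8);
                Files.copy(original, backup, StandardCopyOption.REPLACE_EXISTING);
                // Swap in the new file. Whether an atomic move replaces an existing
                // target is implementation specific; it does on typical POSIX systems.
                Files.move(temp, original, StandardCopyOption.ATOMIC_MOVE);
            } finally {
                // Cleans up the temp file if anything failed before the move.
                Files.deleteIfExists(temp);
            }
        } finally {
            SAVE_LOCK.unlock();
        }
    }
}
```

The lock only prevents two rewrites from interleaving. It does not stop one user from overwriting changes another user saved after the first user's page was loaded; that would need some form of version check, which is outside this sketch.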
 
Ranch Hand
Posts: 38
Even though you can't use a database, the entities in your file should have a valid maximum size.
So in practice you should be able to use fixed-width records, but that might also mean wasted space and a larger file.
 