java.util.Scanner class - splitting a large text file

 
Greenhorn
Posts: 1
Hi all,

I have a situation in my project where I am writing a parser for text files ranging in size from 100 to 700 MB. The records inside are delimited by control characters, given here as hex escapes: each record starts with "\u0002" and ends with "\u0003". In one of the postings I found that Scanner is useful for avoiding loading the whole file into memory, meaning I can read the file record by record. Each file holds thousands of records, each delimited by the characters mentioned above.

Could anyone please help me with a piece of code that reads the input file record by record without loading the whole file into memory? I have already written a parser for individual records, which are small enough to hold in memory.

A quick reply would be much appreciated.

regards,
Jay.
 
Bartender
Posts: 9626
Welcome to the JavaRanch.
We like to help people, but we ask that you show some effort.
Have a look at the Scanner Javadoc and the Java Tutorial and give it a try. If you have any problems, feel free to come back with some code and we'll see what we can do.
 
Ranch Hand
Posts: 262
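I would recommend using the findWithinHorizon() method with a horizon of zero (which means the search is unbounded) and a regex that matches one whole record, from the "\u0002" start character to the "\u0003" end character. If you don't want the delimiters returned as part of the record, you can use lookbehind and lookahead to match them.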
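The regexes from this post did not survive on the page, so what follows is only a minimal sketch of the idea, not the original code. The class name `RecordSplitter`, the method `forEachRecord`, and both patterns are assumptions made for illustration. It also assumes that "\u0003" never appears inside a record.

```java
import java.util.Scanner;
import java.util.function.Consumer;
import java.util.regex.Pattern;

// Hypothetical helper; names and patterns are illustrative assumptions.
class RecordSplitter {

    // One record including its delimiters: \u0002, any run of
    // characters other than \u0003, then \u0003.
    static final Pattern WITH_DELIMS =
            Pattern.compile("\\u0002[^\\u0003]*\\u0003");

    // Same record, but the delimiters are matched by lookbehind and
    // lookahead, so they are not part of the returned text.
    // "+" instead of "*" is deliberate: a zero-length match would not
    // advance the scanner, so empty records are skipped here.
    static final Pattern WITHOUT_DELIMS =
            Pattern.compile("(?<=\\u0002)[^\\u0003]+(?=\\u0003)");

    // Hands each record to the handler as soon as it is found, so only
    // the scanner's buffer and the current record are held in memory.
    static void forEachRecord(Readable source, Pattern recordPattern,
                              Consumer<String> handler) {
        try (Scanner scanner = new Scanner(source)) {
            String record;
            // Horizon 0 means "search without bound"; null means no
            // further record was found before the end of the input.
            while ((record = scanner.findWithinHorizon(recordPattern, 0)) != null) {
                handler.accept(record);
            }
        }
    }
}
```

To run this against a file, you could pass something like `Files.newBufferedReader(path, charset)` as the source and your existing record parser as the handler. One caveat from the Scanner Javadoc: with a horizon of zero the scanner may buffer all remaining input while it searches, so a record that is missing its "\u0003" terminator could pull the rest of the file into memory.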
 