Reading large XML files in Java

 
Nidheesh Krishna
Ranch Hand
What is the fastest way to read an XML file in Java that contains more than 3 million (30 lakh) lines and is larger than 100 MB?
I used the normal method,

but it takes too long to read the file.
 
Tim Moores
Saloon Keeper
Using the java.io classes is unlikely to be the right approach for handling XML. Check out the SAX and StAX APIs instead. Those should work even for large files where using DOM might fail.
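As a rough illustration of the StAX suggestion, here is a minimal sketch that streams through a file and counts its elements without building a tree in memory. The class name `StaxElementCounter` and the choice to count start tags are assumptions made for the example, not something from the original question.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class: pulls events from the parser one at a time.
public class StaxElementCounter {

    /** Streams through the XML and returns the number of start tags seen. */
    public static long countElements(InputStream in) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Precaution for untrusted input: no DTD processing, no external entities.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);

        XMLStreamReader reader = factory.createXMLStreamReader(in);
        long elements = 0;
        try {
            while (reader.hasNext()) {
                // next() advances to the following event and returns its type.
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    elements++;
                }
            }
        } finally {
            reader.close(); // frees parser resources; does not close 'in'
        }
        return elements;
    }

    public static void main(String[] args) throws Exception {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(countElements(in) + " elements");
        }
    }
}
```

Only the current event is held at any time, so memory use should stay roughly flat as the file grows.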
 
Nidheesh Krishna
Ranch Hand
I just want to read and display the contents.
 
Dave Tolls
Ranch Foreman
Not sure how you would go any faster than you are.
Possibly change the buffer size in the BufferedReader?

In any case, what counts as a "long time"?
It's 3 million lines.
Why output it to the screen in the first place?
Who's going to be able to read that?
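The buffer-size idea mentioned above could look something like the sketch below. The 1 MiB value and the names `BigBufferLineCounter` and `countLines` are assumptions for illustration; whether a larger buffer helps at all is something to measure on the actual file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical example class: reads lines through an enlarged buffer.
public class BigBufferLineCounter {

    // Assumed value for the experiment; OpenJDK's default is 8192 chars.
    static final int BUFFER_SIZE = 1 << 20;

    /** Reads the source line by line and returns the number of lines. */
    public static long countLines(Reader source) throws IOException {
        long lines = 0;
        // The second constructor argument sets the buffer size in chars.
        try (BufferedReader reader = new BufferedReader(source, BUFFER_SIZE)) {
            while (reader.readLine() != null) {
                lines++;
            }
        }
        return lines;
    }
}
```

It can be called with any `Reader`, for example `countLines(new FileReader(path))`, and timed with `System.nanoTime()` against a version that uses the default buffer.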
 
Campbell Ritchie
Marshal
Dave Tolls wrote: . . . Why output it to the screen in the first place? . . .
And screen output will probably be anything from 10× to 100× slower than reading from the file. That may be the explanation for the slow execution.
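One way to check whether printing is the bottleneck is to time the same read with and without output. This is only a sketch: the class name `ReadVersusPrint` is made up, and the second pass may also benefit from the operating system's file cache, so the numbers are indicative at best.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical example class: compares "read only" with "read and print".
public class ReadVersusPrint {

    /** Reads every line; echoes each to 'out' unless 'out' is null. Returns the line count. */
    public static long readAll(BufferedReader reader, PrintStream out) throws IOException {
        long lines = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (out != null) {
                out.println(line);
            }
            lines++;
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // First pass reads silently, second pass prints to the console.
        for (PrintStream out : new PrintStream[] {null, System.out}) {
            long start = System.nanoTime();
            long lines;
            try (BufferedReader r =
                    Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
                lines = readAll(r, out);
            }
            long ms = (System.nanoTime() - start) / 1_000_000;
            // Timings go to stderr so they do not mix with the echoed content.
            System.err.println((out == null ? "read only: " : "read + print: ")
                    + lines + " lines in " + ms + " ms");
        }
    }
}
```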
 
Liutauras Vilda
Sheriff
Well, I work on tasks somewhat similar to the one you mention.

Tim is right: you need the right tools for this, and SAX is the one to use, since the XML file is this big. DOM is somewhat more natural to use, but each XML file needs to be relatively small.
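To give an idea of what the SAX style looks like, here is a small sketch that counts how often each element name occurs. The parser pushes events to a handler, so nothing but the running counts is kept in memory. The class name `SaxTagHistogram` and the counting task are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: event-driven parsing with a callback handler.
public class SaxTagHistogram {

    /** Parses the stream and returns element name -> number of occurrences. */
    public static Map<String, Long> countByName(InputStream in)
            throws ParserConfigurationException, SAXException, IOException {
        final Map<String, Long> counts = new TreeMap<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(in, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // Called once per start tag, in document order.
                counts.merge(qName, 1L, Long::sum);
            }
        });
        return counts;
    }
}
```

Compared with StAX, the control flow is inverted: the parser drives the loop and the handler reacts, which is why any state has to live in fields or captured variables.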

Don't worry about reading the XML yet. What are you going to do with its content afterwards? What is the task about? Start by thinking more abstractly about what needs to be achieved, so that you can find the right toolbox for the whole problem, and only then start thinking about an implementation.
 
Liutauras Vilda
Sheriff
Nidheesh Krishna wrote: i want to just read and display the contents.
Wait. Is that file truly an XML file with tags/elements? Don't you need to extract specific content from it? Or did somebody just give it an .xml extension, so you decided to read it line by line as you would an ordinary text file?
 
Nidheesh Krishna
Ranch Hand
I want to read the content of the XML file and then format it, because at the moment the XML content is not displayed properly (the spacing and indentation are wrong).
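For the re-indenting goal described here, one commonly used option is an identity transform with indentation switched on. The sketch below assumes the transformer bundled with the JDK; the class name `XmlIndenter` is made up, the `indent-amount` property is implementation-specific, and whether the transform streams or buffers a 100 MB input depends on the implementation, so it is worth measuring.

```java
import java.io.Reader;
import java.io.Writer;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class: copies XML from 'in' to 'out', re-indented.
public class XmlIndenter {

    public static void indent(Reader in, Writer out) throws TransformerException {
        // newTransformer() without a stylesheet gives an identity transform.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Assumption: understood by the Xalan-derived JDK transformer;
        // other implementations may silently ignore it.
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
        transformer.transform(new StreamSource(in), new StreamResult(out));
    }
}
```

Writing to a `BufferedWriter` over a file, rather than to the console, keeps screen output out of the timing.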
 
Tim Moores
Saloon Keeper
Indentation has no semantic meaning in XML. Fiddling around with it implies that humans will look at this file, but to echo what Dave said, no human being can realistically look at such a large file. So why do you want to do this?
 
Campbell Ritchie
Marshal
TM is right that indenting the file is a waste of time.
If you simply want practice writing indentation programs, you can read and write one line at a time and count the start and end tags as you go. You should end up with the same number of each when you finish the file, if the XML is well-formed.
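A rough sketch of that counting exercise follows. It is deliberately naive: the regular expression and the class name `TagBalanceCounter` are assumptions for illustration, a self-closing tag is counted as both a start and an end, and tags split across lines or attribute values containing `>` will confuse it. Equal counts are also only a necessary sign of well-formedness, not a proof.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class: line-by-line tag counting, not a real XML parser.
public class TagBalanceCounter {

    // Group 1 is "/" for an end tag; group 2 is "/" for a self-closing tag.
    // Declarations (<?...?>) and comments (<!--...-->) do not match.
    private static final Pattern TAG =
            Pattern.compile("<(/?)[A-Za-z_][^\\s/>]*[^>]*?(/?)>");

    /** Returns {startTagCount, endTagCount} for the given text. */
    public static long[] count(Reader source) throws IOException {
        long starts = 0;
        long ends = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                Matcher m = TAG.matcher(line);
                while (m.find()) {
                    if (!m.group(1).isEmpty()) {
                        ends++;               // </name>
                    } else {
                        starts++;             // <name ...>
                        if (!m.group(2).isEmpty()) {
                            ends++;           // <name ... /> opens and closes
                        }
                    }
                }
            }
        }
        return new long[] {starts, ends};
    }
}
```

The running difference between the two counters is the current nesting depth, which is the number an indentation program would use to decide how far to indent each line.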
 