Unable to read a large text file using BufferedReader

praveen Tummalapally
Greenhorn

Joined: Mar 05, 2013
Posts: 2
I am trying to read a text file of size 70MB.
The format of the text file (which contains 2 integers per line separated by a space) is as follows:
1 2
1 9
2 89
34 97

etc...

I am reading them into an ArrayList<Integer> as shown below:



BufferedReader br = null;
try {
    br = new BufferedReader(new FileReader(new File("xyz.txt")));
} catch (FileNotFoundException e) {
    System.out.println("Unable to open the input file... ");
    e.printStackTrace();
}
String[] vertices = null;
for (int i = 0; i < numOfLinesInTheFile; i++) {
    vertices = br.readLine().split(" ");
    // The line below makes use of the 2 integers in the line read.
    gOrig[Integer.parseInt(vertices[0]) - 1].adjVertexList.add(Integer.parseInt(vertices[1]) - 1);
    // Print progress every 1 lakh (100,000) lines read
    if (i > 0 && i % 100000 == 0) {
        System.out.println(i);
    }
} // end of for loop


There are around 51 lakh (5.1 million) lines in the text file.
But I get the error below once about 42 lakh (4.2 million) lines have been read:
100000
200000
300000
400000
500000
600000
700000
800000
900000
1000000
1100000
1200000
1300000
1400000
1500000
1600000
1700000
1800000
1900000
2000000
2100000
2200000
2300000
2400000
2500000
2600000
2700000
2800000
2900000
3000000
3100000
3200000
3300000
3400000
3500000
3600000
3700000
3800000
3900000
4000000
4100000
4200000
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.lang.String.substring(Unknown Source)
at java.lang.String.subSequence(Unknown Source)
at java.util.regex.Pattern.split(Unknown Source)
at java.lang.String.split(Unknown Source)
at java.lang.String.split(Unknown Source)
at SCC.readGraph(SCC.java:154)
at SCC.main(SCC.java:34)

I am running this program on 32-bit Windows using the Eclipse IDE.
I changed the -Xmx argument to 1024m.
When I increase it further, Eclipse fails to launch.
Even with a larger reader buffer, as in the line below, I still get the same error:
br = new BufferedReader(new FileReader(new File("xyz.txt")), 1024*1024);



Any help in solving the above problem is appreciated.
Or is there a better way to read very large files?

Thanks in advance,
Praveen.
Tony Docherty
Bartender

Joined: Aug 07, 2007
Posts: 2250
The memory problem is not down to reading the file; it's because you are storing everything you read in. If you remove the line of code that stores the value in the ArrayList, you will find it happily reads the whole file.
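
One way to check this claim is a minimal sketch that reads and parses every line but stores nothing. The class name ReadWithoutStoring, the method name countParsedLines, and the in-memory sample are assumptions made for illustration; they are not from the original post. The sketch also assumes every non-blank line holds exactly two integers separated by a space.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ReadWithoutStoring {

    // Reads every "a b" line and parses both integers, but keeps nothing.
    // Returns the number of lines parsed.
    public static long countParsedLines(Reader source) {
        long count = 0;
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                // Assumes a single space separates the two integers.
                int space = line.indexOf(' ');
                Integer.parseInt(line.substring(0, space));
                Integer.parseInt(line.substring(space + 1).trim());
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical sample standing in for xyz.txt;
        // use new FileReader("xyz.txt") for the real file.
        Reader sample = new StringReader("1 2\n1 9\n2 89\n34 97\n");
        System.out.println(countParsedLines(sample) + " lines parsed");
    }
}
```

If this runs to completion on the full file with the same heap setting, the reader itself is not what exhausts memory.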
praveen Tummalapally
Greenhorn

Joined: Mar 05, 2013
Posts: 2
Tony Docherty wrote: The memory problem is not down to reading the file; it's because you are storing everything you read in. If you remove the line of code that stores the value in the ArrayList, you will find it happily reads the whole file.


But I need to store the values that I am reading.
All of the data that is read needs to be processed.
Processing the data piecewise is not an option for me.
Tony Docherty
Bartender

Joined: Aug 07, 2007
Posts: 2250
If you need to store it all then you need to either:

1. Assign more memory using -Xmx.
I'm not sure why you are having a problem setting values greater than 1024 MB in Eclipse, or why they would stop Eclipse from starting, as this value should be set in your application's run configuration and not for Eclipse itself. Are you sure you are setting the correct thing? You need to edit the run configuration for the application you are running, go to the Arguments tab, and enter the value in the VM arguments field.

2. Store the data in a more memory efficient manner.
Try storing the values as ints rather than as Integers.
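
Both suggestions can be sketched together. The following is only an illustration under stated assumptions: IntList is a hypothetical growable array of primitive ints (it is not a JDK class and not from the original post), and the sample lines stand in for the real file. The main method first prints the maximum heap the JVM reports, which is a quick way to see whether the -Xmx value reached the application rather than Eclipse, and then parses each line with indexOf instead of String.split.

```java
import java.util.Arrays;

public class IntList {
    private int[] data = new int[4];
    private int size = 0;

    // Appends a value, doubling the backing array when it is full.
    public void add(int value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, size * 2);
        }
        data[size++] = value;
    }

    public int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return data[index];
    }

    public int size() {
        return size;
    }

    public static void main(String[] args) {
        // Reports the maximum heap the JVM will attempt to use,
        // which reflects the -Xmx value passed to this application.
        long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap: " + maxMb + " MB");

        // Hypothetical sample lines in the "a b" format from the question.
        String[] lines = {"1 2", "1 9", "2 89", "34 97"};
        IntList targets = new IntList();
        for (String line : lines) {
            int space = line.indexOf(' ');
            // Parse the second integer without String.split
            targets.add(Integer.parseInt(line.substring(space + 1)) - 1);
        }
        System.out.println("Stored " + targets.size() + " ints");
    }
}
```

In the question's code, adjVertexList could hold such a list instead of an ArrayList<Integer>. Each stored value then occupies one 4-byte array slot rather than a reference plus a separate Integer object, apart from any unused capacity in the backing array.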
 