Out of memory error when opening an XML of size 18 MB

aadhar sharma
Ranch Hand

Joined: Oct 09, 2006
Posts: 38
Hi,

Code :

DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.parse(new File("c:\\xmlfiles\\first.xml"));

I am trying to open an XML file using the code above, and I receive the exception below. I changed my Eclipse settings to
C:\eclipse_new\eclipse\eclipse.exe -vmargs -Xms256m -Xmx512M
but even this doesn't help.


Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at org.apache.xerces.dom.DeferredDocumentImpl.createChunk(Unknown Source)
at org.apache.xerces.dom.DeferredDocumentImpl.ensureCapacity(Unknown Source)
at org.apache.xerces.dom.DeferredDocumentImpl.createNode(Unknown Source)
at org.apache.xerces.dom.DeferredDocumentImpl.createDeferredTextNode(Unknown Source)
at org.apache.xerces.parsers.AbstractDOMParser.characters(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
at MySampleCode.main(MySampleCode.java:36)


Thanks and Regards
Sebastian Janisch
Ranch Hand

Joined: Feb 23, 2009
Posts: 1183
What happens on line 36, and how big is your XML file?


aadhar sharma
Ranch Hand

Joined: Oct 09, 2006
Posts: 38
There should be some way of reading huge XML files in Java.

I guess this might not be the right way, but there should be a way to do it.
Ulas Ergin
Ranch Hand

Joined: Oct 10, 2002
Posts: 77
Try using a SAX parser instead of a DOM parser.
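As a rough illustration of the SAX suggestion, the sketch below counts ShippedPtvDevice elements (the tag name from the original post) by streaming through the input instead of building a tree in memory. The class name SaxCountSketch, the method countElements, and the tiny inline document with its Devices root are assumptions made up for the example; they are not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxCountSketch {

    /** Counts elements with the given name without building a DOM tree. */
    static int countElements(InputStream in, final String elementName) throws Exception {
        final int[] count = {0};
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(in, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // Called once per start tag; nothing is retained after the callback returns.
                if (elementName.equals(qName)) {
                    count[0]++;
                }
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for the real file; "Devices" is an invented root element.
        String xml = "<Devices>"
                + "<ShippedPtvDevice><PtvID>1</PtvID></ShippedPtvDevice>"
                + "<ShippedPtvDevice><PtvID>2</PtvID></ShippedPtvDevice>"
                + "</Devices>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println("ShippedPtvDevice elements: " + countElements(in, "ShippedPtvDevice"));
    }
}
```

For a real file, a FileInputStream (or the parse overload that takes a File) would replace the in-memory stream; memory use then depends on what the handler chooses to keep, not on the size of the document.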
Vlado Zajac
Ranch Hand

Joined: Aug 03, 2004
Posts: 245
Does your program run within the Eclipse process?

If it runs in a separate JVM (which it probably does), then you need to increase that JVM's memory, not Eclipse's memory. I think there is some way to set JVM options for your project in Eclipse (I don't use Eclipse, so I don't know exactly where it is).

SAX is probably better than DOM for large files.
aadhar sharma
Ranch Hand

Joined: Oct 09, 2006
Posts: 38
Is it possible to update an XML file using a SAX parser?

As far as I know, the SAX parser loads the XML in chunks. Is it still possible to update the XML using SAX? If yes, where can I find a sample for this?

My sample code for DOM, which works on smaller files, is below. I am unable to figure out how it needs to be changed so that it can work on files of size 18 MB.



static List voList = new ArrayList();

public static void main(String argv[]) {
    try {

        DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
        Document doc = docBuilder.parse(new File("c:\\xmlfiles\\first.xml"));

        // normalize text representation
        doc.getDocumentElement().normalize();
        System.out.println("Root element of the doc is "
                + doc.getDocumentElement().getNodeName());

        NodeList listOfBodies = doc.getElementsByTagName("ShippedPtvDevice");
        int totalBodies = listOfBodies.getLength();
        System.out.println("Total no of Body Tags are : " + totalBodies);

        for (int s = 0; s < listOfBodies.getLength(); s++) {

            Node firstPersonNode = listOfBodies.item(s);
            // MyIdsVo myIdVo = (MyIdsVo) voList.get(s);
            if (firstPersonNode.getNodeType() == Node.ELEMENT_NODE) {

                Element firstPersonElement = (Element) firstPersonNode;
                NodeList firstNameList = firstPersonElement
                        .getElementsByTagName("PtvID");

                Element firstNameElement = (Element) firstNameList.item(0);

                // firstNameElement.setTextContent(myIdVo.getAuthCode());

                NodeList lastNameList = firstPersonElement
                        .getElementsByTagName("IntegratedReceiverDeviceSN");
                Element lastNameElement = (Element) lastNameList.item(0);

                // lastNameElement.setTextContent(myIdVo.getDeviceID());

                NodeList textLNList = lastNameElement.getChildNodes();
                System.out.println("Last Name : "
                        + ((Node) textLNList.item(0)).getNodeValue().trim());

            }

            // set up a transformer
            TransformerFactory transfac = TransformerFactory.newInstance();
            Transformer trans = transfac.newTransformer();

            // create string from xml tree
            StringWriter sw = new StringWriter();
            StreamResult result = new StreamResult(sw);
            DOMSource source = new DOMSource(doc);
            trans.transform(source, result);
            String xmlString = sw.toString();

            OutputStream f0;
            byte buf[] = xmlString.getBytes();
            f0 = new FileOutputStream("c:\\xmlfiles\\first.xml");
            for (int i = 0; i < buf.length; i++) {
                f0.write(buf[i]);
            }
            f0.close();
            buf = null;
        } // end of for loop
    } catch (Exception e) {
        System.out.println(e);
    }

}

private void getParameters() {
    String str, tmp, tmp1, tmp2, csvFileName = "c:\\Dummy_Devices_3_20000.prn";
    StringTokenizer tok;
    int lineCount = 0;

    try {
        FileInputStream fis = new FileInputStream(csvFileName);
        BufferedReader br = new BufferedReader(new InputStreamReader(fis));
        do {
            str = br.readLine();
            if (str == null) {
                break;
            }
            lineCount++;
            // taking tokens based on "," token separator and adding them to
            // two vectors
            tok = new StringTokenizer(str, ",");
            try {
                MyIdsVo vo = new MyIdsVo();
                tmp = tok.nextToken();
                vo.setAuthCode(tmp);
                tmp1 = tok.nextToken();
                vo.setDeviceID(tmp1);
                tmp2 = tok.nextToken();
                System.out.println(" " + tmp2);
                voList.add(vo);

            } catch (NoSuchElementException ex) {
                System.out.println(ex);
            }
        } while (true);
        fis.close();
    } catch (FileNotFoundException fnofe) {
        System.out.println(fnofe);
    } catch (Exception e) {
        System.out.println(e);
    }

}

Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18669
    

aadhar sharma wrote: Is it possible to update an XML file using a SAX parser?

As far as I know, the SAX parser loads the XML in chunks. Is it still possible to update the XML using SAX? If yes, where can I find a sample for this?

My sample code for DOM, which works on smaller files, is below. I am unable to figure out how it needs to be changed so that it can work on files of size 18 MB...


No, you can't update an XML document using a SAX parser. But then, you can't update one using a DOM parser either. What your DOM code does is to parse the entire document into an internal form in memory, then change that internal form, then serialize that internal form back into a new document.

(At least I assume it does that... you didn't post it inside the Code tags, so it was unreadable, unindented code, and I didn't look at it much.)

So, the equivalent would be to parse the document with a SAX parser, which produces a stream of SAX events. You would convert that stream of SAX events into a new stream of SAX events, changing whatever your requirements need to change, and send that stream of events to a SAX writer which would produce the new document. Since I didn't look at your code much I have no idea how easy it would be to do that.

See chapter 8 of ERH's online book for examples of that sort of thing.
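A minimal sketch of that parse, transform, and write pipeline, using only the JDK: an XMLFilterImpl subclass sits between the SAX parser and an identity Transformer, which serves here as the "SAX writer". This is one possible way to do it, not necessarily the approach the book takes. The class names SaxRewriteSketch and PtvIdFilter, the rewrite method, the replacement value, and the Devices root element are all invented for illustration; only the PtvID and ShippedPtvDevice tag names come from the posted code.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Result;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

public class SaxRewriteSketch {

    /** Replaces the text of every PtvID element; all other events pass through unchanged. */
    static class PtvIdFilter extends XMLFilterImpl {
        private final String newValue;
        private boolean insidePtvId;

        PtvIdFilter(XMLReader parent, String newValue) {
            super(parent);
            this.newValue = newValue;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            super.startElement(uri, localName, qName, atts);
            if ("PtvID".equals(qName)) {
                insidePtvId = true;
                // Emit the replacement text once, right after the start tag.
                super.characters(newValue.toCharArray(), 0, newValue.length());
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if ("PtvID".equals(qName)) {
                insidePtvId = false;
            }
            super.endElement(uri, localName, qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            // Drop the original text of PtvID; forward everything else.
            if (!insidePtvId) {
                super.characters(ch, start, length);
            }
        }
    }

    /** Streams the input through the filter and serializes the resulting events to out. */
    static void rewrite(InputSource in, Result out, String newValue) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader filter = new PtvIdFilter(spf.newSAXParser().getXMLReader(), newValue);
        // A Transformer created without a stylesheet copies its input to its output.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.transform(new SAXSource(filter, in), out);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample input; "Devices" is an invented root element.
        String xml = "<Devices><ShippedPtvDevice><PtvID>OLD</PtvID></ShippedPtvDevice></Devices>";
        StringWriter sw = new StringWriter();
        rewrite(new InputSource(new StringReader(xml)), new StreamResult(sw), "NEW");
        System.out.println(sw);
    }
}
```

With real files, the InputSource would point at the original document and the StreamResult at a different output file, because the input is still being read while the output is written. Per-record values, such as those read from the CSV file in the posted code, would need a counter or iterator inside the filter instead of a single fixed string.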
Jamie Zhang
Greenhorn

Joined: Jul 31, 2009
Posts: 9

You could try vtd-xml to save memory. The rule of thumb is that DOM consumes 5~10x the memory of the XML document, while vtd-xml consumes only 1.3~1.5x, so you are less likely to run into out-of-memory issues. In addition, vtd-xml enables random access and natively supports XPath 1.0.
Tanzy Akhtar
Ranch Hand

Joined: Jul 19, 2009
Posts: 110
Hi Aadhar,

I faced the same kind of issue.

I found two solutions:

1. Right-click the Java class --> Run As --> Run Configurations --> choose the Arguments tab --> in VM arguments, provide -Xmx1024m

2. Write a batch file containing: "%JAVA_HOME%"\bin\java -cp %CLASSPATH% -Xmx1024m <class name>

Hope it works for you.

Thanks,
Tanzy.

Waldemar Hummer
Greenhorn

Joined: Sep 13, 2013
Posts: 1
You may want to check out ScaleDOM, which can parse very large XML files:
https://github.com/whummer/scaleDOM

ScaleDOM has a small memory footprint due to lazy loading of XML nodes. It only keeps a portion of the XML document in memory and reloads nodes from the source file when necessary.

(Note: this response comes a bit late, but it may be of interest to users who find this page via Google...)
Jesper de Jong
Java Cowboy
Saloon Keeper

Joined: Aug 16, 2005
Posts: 14270
    

aadhar sharma wrote: I changed my Eclipse settings to
C:\eclipse_new\eclipse\eclipse.exe -vmargs -Xms256m -Xmx512M
but even this doesn't help.

No, that will not help. The only thing that does is give Eclipse itself more memory, not the program that you run from within Eclipse.

You have to add these parameters to the run configuration of your program inside Eclipse, instead of setting parameters on the Eclipse command line.
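One way to check whether the setting actually reached the JVM that runs your program is to print the maximum heap from inside that program. A small sketch follows; the class name HeapCheckSketch and the helper maxHeapMegabytes are made up for the example.

```java
public class HeapCheckSketch {

    /** Maximum heap this JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMegabytes() + " MB");
    }
}
```

Run it with the same run configuration as the XML program. If the number does not change after you edit the VM arguments, the option is going to a different JVM. The reported value can differ somewhat from the exact -Xmx figure.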

