
Out of memory error when opening an XML of size 18 MB

 
aadhar sharma
Ranch Hand
Posts: 38
Hi,

Code :

DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.parse(new File("c:\\xmlfiles\\first.xml"));

I am trying to open an XML file using the code above, and I get the exception below. I changed my Eclipse settings to
C:\eclipse_new\eclipse\eclipse.exe -vmargs -Xms256m -Xmx512M
but even this doesn't help.


Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at org.apache.xerces.dom.DeferredDocumentImpl.createChunk(Unknown Source)
at org.apache.xerces.dom.DeferredDocumentImpl.ensureCapacity(Unknown Source)
at org.apache.xerces.dom.DeferredDocumentImpl.createNode(Unknown Source)
at org.apache.xerces.dom.DeferredDocumentImpl.createDeferredTextNode(Unknown Source)
at org.apache.xerces.parsers.AbstractDOMParser.characters(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
at MySampleCode.main(MySampleCode.java:36)
 
Sebastian Janisch
Ranch Hand
Posts: 1183
What happens in line 36, and how big is your XML file?
 
aadhar sharma
Ranch Hand
Posts: 38
There should be some way of reading huge XML files in Java.

I guess this might not be the way, but there should be a way to do it.
 
Ulas Ergin
Ranch Hand
Posts: 77
Try using a SAX parser instead of a DOM parser.
 
Vlado Zajac
Ranch Hand
Posts: 245
Does your program run within the Eclipse process?

If it runs in a separate JVM (which it probably does), then you need to increase that JVM's memory, not Eclipse's memory. I think there is some way to set JVM options for your project in Eclipse (I don't use Eclipse, so I don't know exactly where it is).

SAX is probably better than DOM for large files.
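
As a rough illustration of the SAX suggestion, here is a minimal sketch that streams through a document and collects values without building a tree. The element names ShippedPtvDevice and IntegratedReceiverDeviceSN are taken from the code posted in this thread; the class name SaxCountSketch, the handler, and the inline sample XML are hypothetical and only there to make the example self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxCountSketch {

    /** Counts device elements and records each serial number as it streams past. */
    static class DeviceHandler extends DefaultHandler {
        int deviceCount = 0;
        final List<String> serials = new ArrayList<String>();
        private StringBuilder text; // non-null only while inside a serial-number element

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("ShippedPtvDevice".equals(qName)) {
                deviceCount++;
            } else if ("IntegratedReceiverDeviceSN".equals(qName)) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver one text node in several pieces, so accumulate.
            if (text != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("IntegratedReceiverDeviceSN".equals(qName) && text != null) {
                serials.add(text.toString().trim());
                text = null;
            }
        }
    }

    /** Parses the input with a default (non-namespace-aware) SAX parser. */
    static DeviceHandler parse(InputSource in) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        DeviceHandler handler = new DeviceHandler();
        parser.parse(in, handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample; a real run would read the large file instead.
        String sample = "<Devices><ShippedPtvDevice>"
                + "<IntegratedReceiverDeviceSN> SN-1 </IntegratedReceiverDeviceSN>"
                + "</ShippedPtvDevice></Devices>";
        DeviceHandler h = parse(new InputSource(new StringReader(sample)));
        System.out.println("Devices: " + h.deviceCount + ", serials: " + h.serials);
    }
}
```

For a file on disk, parser.parse(new File("c:\\xmlfiles\\first.xml"), handler) can be used in place of the InputSource. Because the handler only keeps what it explicitly stores, memory use depends on what you collect rather than on the size of the file.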
 
aadhar sharma
Ranch Hand
Posts: 38
Is it possible to update an XML file using a SAX parser?

As far as I know, the SAX parser loads the XML in chunks. Is it still possible to update the XML using SAX? If yes, where can I find a sample?

My sample code for DOM, which works on smaller files, is below. I am unable to figure out how it needs to be changed so that it can work on files of size 18 MB.



import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.StringTokenizer;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// MyIdsVo (not shown) is a value object with setAuthCode/setDeviceID and matching getters
public class MySampleCode {

    static List<MyIdsVo> voList = new ArrayList<MyIdsVo>();

    public static void main(String[] argv) {
        try {
            DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
            Document doc = docBuilder.parse(new File("c:\\xmlfiles\\first.xml"));

            // normalize text representation
            doc.getDocumentElement().normalize();
            System.out.println("Root element of the doc is "
                    + doc.getDocumentElement().getNodeName());

            NodeList listOfBodies = doc.getElementsByTagName("ShippedPtvDevice");
            int totalBodies = listOfBodies.getLength();
            System.out.println("Total no of Body Tags are : " + totalBodies);

            for (int s = 0; s < totalBodies; s++) {
                Node firstPersonNode = listOfBodies.item(s);
                // MyIdsVo myIdVo = voList.get(s);
                if (firstPersonNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element firstPersonElement = (Element) firstPersonNode;

                    NodeList firstNameList = firstPersonElement.getElementsByTagName("PtvID");
                    Element firstNameElement = (Element) firstNameList.item(0);
                    // firstNameElement.setTextContent(myIdVo.getAuthCode());

                    NodeList lastNameList = firstPersonElement
                            .getElementsByTagName("IntegratedReceiverDeviceSN");
                    Element lastNameElement = (Element) lastNameList.item(0);
                    // lastNameElement.setTextContent(myIdVo.getDeviceID());

                    NodeList textLNList = lastNameElement.getChildNodes();
                    System.out.println("Last Name : "
                            + textLNList.item(0).getNodeValue().trim());
                } // end of if clause
            } // end of for loop

            // set up a transformer
            TransformerFactory transfac = TransformerFactory.newInstance();
            Transformer trans = transfac.newTransformer();

            // create string from xml tree (once, after all elements have been processed)
            StringWriter sw = new StringWriter();
            StreamResult result = new StreamResult(sw);
            DOMSource source = new DOMSource(doc);
            trans.transform(source, result);
            String xmlString = sw.toString();

            // write the string back over the original file in a single call
            byte[] buf = xmlString.getBytes();
            OutputStream f0 = new FileOutputStream("c:\\xmlfiles\\first.xml");
            f0.write(buf);
            f0.close();
        } catch (Exception e) {
            System.out.println(e);
        }
    }

    private void getParameters() {
        String csvFileName = "c:\\Dummy_Devices_3_20000.prn";
        int lineCount = 0;

        try {
            BufferedReader br = new BufferedReader(
                    new InputStreamReader(new FileInputStream(csvFileName)));
            String str;
            while ((str = br.readLine()) != null) {
                lineCount++;
                // split the line on "," and store the first two tokens in a value object
                StringTokenizer tok = new StringTokenizer(str, ",");
                try {
                    MyIdsVo vo = new MyIdsVo();
                    vo.setAuthCode(tok.nextToken());
                    vo.setDeviceID(tok.nextToken());
                    String tmp2 = tok.nextToken();
                    System.out.println(" " + tmp2);
                    voList.add(vo);
                } catch (NoSuchElementException ex) {
                    System.out.println(ex);
                }
            }
            br.close();
        } catch (FileNotFoundException fnofe) {
            System.out.println(fnofe);
        } catch (Exception e) {
            System.out.println(e);
        }
    }
}

 
Paul Clapham
Sheriff
Posts: 21316
aadhar sharma wrote: Is it possible to update an XML file using a SAX parser?

As far as I know, the SAX parser loads the XML in chunks. Is it still possible to update the XML using SAX? If yes, where can I find a sample?

My sample code for DOM, which works on smaller files, is below. I am unable to figure out how it needs to be changed so that it can work on files of size 18 MB...


No, you can't update an XML document using a SAX parser. But then, you can't update one using a DOM parser either. What your DOM code does is to parse the entire document into an internal form in memory, then change that internal form, then serialize that internal form back into a new document.

(At least I assume it does that... you didn't post it inside the Code tags so it was unreadable unindented code so I didn't look at it much.)

So, the equivalent would be to parse the document with a SAX parser, which produces a stream of SAX events. You would convert that stream of SAX events into a new stream of SAX events, changing whatever your requirements need to change, and send that stream of events to a SAX writer which would produce the new document. Since I didn't look at your code much I have no idea how easy it would be to do that.

See chapter 8 of ERH's online book for examples of that sort of thing.
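
A minimal sketch of that parse, filter, and write pipeline, using only standard JAXP classes: an XMLFilterImpl subclass rewrites selected events, and an identity Transformer serializes whatever events come out of the filter. The element name PtvID comes from the code posted above; the class names SaxRewriteSketch and PtvIdFilter, the iterator of replacement values, and the assumption that PtvID contains only text are all hypothetical choices for illustration, not something confirmed by the original post.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.Iterator;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

public class SaxRewriteSketch {

    /** Passes every event through unchanged, except the text inside PtvID elements. */
    static class PtvIdFilter extends XMLFilterImpl {
        private final Iterator<String> newIds;
        private StringBuilder oldText; // non-null only while inside a PtvID element

        PtvIdFilter(XMLReader parent, Iterator<String> newIds) {
            super(parent);
            this.newIds = newIds;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            if ("PtvID".equals(qName)) {
                oldText = new StringBuilder();
            }
            super.startElement(uri, localName, qName, atts);
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            if (oldText != null) {
                oldText.append(ch, start, length); // hold back the old text
            } else {
                super.characters(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if (oldText != null && "PtvID".equals(qName)) {
                // Emit the replacement, or the original text if no replacement is left.
                String text = newIds.hasNext() ? newIds.next() : oldText.toString();
                super.characters(text.toCharArray(), 0, text.length());
                oldText = null;
            }
            super.endElement(uri, localName, qName);
        }
    }

    /** Streams the input through the filter and serializes the resulting events. */
    static void rewrite(InputSource in, StreamResult out, Iterator<String> newIds)
            throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.transform(new SAXSource(new PtvIdFilter(reader, newIds), in), out);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document and replacement value.
        String sample = "<Devices><ShippedPtvDevice><PtvID>old</PtvID>"
                + "</ShippedPtvDevice></Devices>";
        StringWriter sw = new StringWriter();
        rewrite(new InputSource(new StringReader(sample)), new StreamResult(sw),
                Arrays.asList("new-1").iterator());
        System.out.println(sw);
    }
}
```

With real files, the output should go to a different file than the input (for example a temporary file that is renamed afterwards), because the input is still being read while the output is being written.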
 
Jamie Zhang
Greenhorn
Posts: 9

You could try vtd-xml to save memory. The rule of thumb is that DOM consumes 5~10x the memory of the XML document, while vtd-xml consumes only 1.3~1.5x, so you are less likely to run into an out-of-memory issue. In addition, vtd-xml enables random access and natively supports XPath 1.0.
 
Tanzy Akhtar
Ranch Hand
Posts: 110
Hi Aadhar,

I faced the same kind of issue.

I found two solutions:

1. Right-click on the Java class --> Run As --> Run Configurations --> choose the Arguments tab --> in VM arguments, provide -Xmx1024m

2. Write a batch file: "%JAVA_HOME%"\bin\java -cp %CLASSPATH% -Xmx1024m <class name>

Hope it works for you.

Thanks,
Tanzy.
 
Waldemar Hummer
Greenhorn
Posts: 1
You may want to check out ScaleDOM, which allows parsing very large XML files:
https://github.com/whummer/scaleDOM

ScaleDOM has a small memory footprint due to lazy loading of XML nodes. It only keeps a portion of the XML document in memory and re-loads nodes from the source file when necessary.

(note: this response comes a bit late, but it may be of interest for users who find this page via Google...)
 
Jesper de Jong
Java Cowboy
Saloon Keeper
Posts: 15436
aadhar sharma wrote: I changed my Eclipse settings to
C:\eclipse_new\eclipse\eclipse.exe -vmargs -Xms256m -Xmx512M
but even this doesn't help.

No, that will not help. All that does is give Eclipse itself more memory, not the program that you run from within Eclipse.

You have to add these parameters to the run configuration of your program inside Eclipse, instead of setting parameters on the Eclipse command line.
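
One way to confirm that the setting reached the right JVM is to print the maximum heap from inside the program itself. This is a small sketch; the class name HeapCheck is hypothetical.

```java
public class HeapCheck {

    /** Returns the maximum amount of heap this JVM will attempt to use, in megabytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMb() + " MB");
    }
}
```

Run it with the same run configuration as the XML program. If -Xmx1024m was applied to the program's JVM, the printed value should be in the neighborhood of 1024 (the exact figure can be somewhat lower, depending on the JVM and garbage collector). If it stays the same no matter what is passed to eclipse.exe, the option is only reaching Eclipse.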
 