
Memory leak while using String tokenizer

 
Greenhorn
Posts: 10
Hi All,

I have a program that reads a text file of comma-separated values (about 3 MB). The file is read into a StringBuffer, which is then tokenized using StringTokenizer.
I see a memory leak in the JVM when I store the strings returned by nextToken() in a static HashMap.

When I run JProbe, I find a char array in the JVM that is not released until the key or value referencing the string returned by nextToken() is removed from the HashMap.
The heap memory occupied in this case is around 7 MB.

Why is the JVM not releasing the char array's memory during garbage collection? Do you have any idea?

I have attached the sample program below


[ October 21, 2008: Message edited by: vijay kumar ]


[Nitesh: Added code tags. Kindly use code tags while posting code.]
[ October 21, 2008: Message edited by: Nitesh Kant ]
 
Greenhorn
Posts: 15
StringTokenizer uses substring() to return the tokens, so the token objects don't store their own char data, just a reference to the original string's char[] plus an offset and a length. As long as the token strings are referenced (here, held in the map), the original character data stays referenced, too.

This is a common gotcha when parsing small chunks out of large strings. One way around it is to create a new String from the token (token = new String(token);). This copies the string contents and releases the reference to the original character data.
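
To make the workaround concrete, here is a minimal sketch of the defensive copy. The class name, the map, and the key/value layout of the input are hypothetical, since the original program is not shown here. It assumes a JVM of that era, where substring() shares the parent's char[]; from Java 7 update 6 onward substring() copies the characters itself, so the extra copy is harmless but no longer needed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical names throughout; a sketch, not the original poster's program.
class TokenCopyExample {

    // Stand-in for the static HashMap mentioned in the question.
    static final Map<String, String> CACHE = new HashMap<String, String>();

    // Assumes the text is a flat sequence of key,value pairs separated by
    // commas or line breaks. Each token is copied before it is stored.
    static void load(String contents) {
        StringTokenizer tokenizer = new StringTokenizer(contents, ",\r\n");
        while (tokenizer.hasMoreTokens()) {
            // new String(...) gives the token its own character data, so the
            // map entry no longer keeps the whole file contents reachable.
            String key = new String(tokenizer.nextToken());
            if (!tokenizer.hasMoreTokens()) {
                break; // dangling key without a value: ignore it
            }
            String value = new String(tokenizer.nextToken());
            CACHE.put(key, value);
        }
    }

    public static void main(String[] args) {
        load("a,1\nb,2");
        System.out.println(CACHE.get("a") + " " + CACHE.get("b"));
    }
}
```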

More importantly, there's a much easier way to do what you want. You are already using BufferedReader, so why not use the .readLine() method? Similarly, .split() is an easier way to get the fields.
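
A rough sketch of the readLine()/split() approach follows. The class and method names are made up for illustration, and the input is assumed to be plain comma-separated lines without quoting or escaped commas.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical names; a sketch of line-by-line parsing, not a full CSV parser.
class LineSplitExample {

    // Reads the source one line at a time and splits each line on commas.
    // Note that split(",") drops trailing empty fields.
    static List<String[]> readRows(Reader source) throws IOException {
        List<String[]> rows = new ArrayList<String[]>();
        BufferedReader reader = new BufferedReader(source);
        String line;
        while ((line = reader.readLine()) != null) {
            rows.add(line.split(","));
        }
        return rows;
    }

    // Convenience wrapper for in-memory text.
    static List<String[]> parse(String text) {
        try {
            return readRows(new StringReader(text));
        } catch (IOException e) {
            // Not expected when reading from a String.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String[] row : parse("a,1\nb,2\n")) {
            System.out.println(row[0] + " -> " + row[1]);
        }
    }
}
```

For a file, readRows could be given a FileReader instead of a StringReader. On a JVM where substring() shares the backing array, the fields returned by split() would still share the char[] of their own line, but that is one line's worth of data rather than the whole 3 MB file.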


[ October 21, 2008: Message edited by: Dmitri Bichko ]
 
vijay kumar
Greenhorn
Posts: 10
Thanks for the explanation. I never knew that substring returns a reference instead of a new string.
 
Ranch Hand
Posts: 49
 
vijay kumar
Greenhorn
Posts: 10
The above code says that substring returns a new String, not a reference, unless the count is begin or end. So if I retrieve a string from the middle, it is a new string and the reference should not apply. So the original string should be garbage collected, but that is not the case?
 
Dmitri Bichko
Greenhorn
Posts: 15

Originally posted by vijay kumar:
The above code says that substring returns a new String, not a reference, unless the count is begin or end. So if I retrieve a string from the middle, it is a new string and the reference should not apply. So the original string should be garbage collected, but that is not the case?



I can't say I see where it says that. It always returns a new String object; it's the backing data we are talking about.

Nothing beats experimentation: throw something like this into a debugger and look at the 'value' field of both objects:


They will reference the same char[].
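
The snippet posted with this reply is not preserved in this copy of the thread. A minimal sketch of that kind of experiment could look like the following; the class, method, and variable names are hypothetical. The shared 'value' array is what a JVM of that period would show. On Java 7 update 6 and later, substring() copies the characters, so the two objects would show separate arrays.

```java
// Hypothetical names; a sketch of the experiment, not the code originally posted.
class SubstringExperiment {

    // Cuts the middle field out of the given text by fixed positions.
    static String middleField(String original) {
        return original.substring(6, 10);
    }

    public static void main(String[] args) {
        String original = "alpha,beta,gamma";
        String token = middleField(original);
        // Set a breakpoint on the next line and compare the 'value' field
        // of 'original' and 'token' in the debugger's variables view.
        System.out.println(token);
    }
}
```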
 
vijay kumar
Greenhorn
Posts: 10
Yes, the returned string still refers to the original char[] even though substring returns a new String. The memory held by the char[] is not released if we store the returned string in the HashMap.

Thanks.
[ October 23, 2008: Message edited by: vijay kumar ]
 
Greenhorn
Posts: 1
You just made my day!
I was actually stuck on this nagging issue for almost a week: my tenured heapspace was overbroiling, no matter how big I made it...
Thanks a bunch!
 