
use of listFiles() shoots up memory usage occasionally

 
Biji Nair
Greenhorn
Posts: 2
Hi,

I have an application deployed in a Unix environment where one piece of functionality searches the file system for a specific file pattern. The drive is on a SAN. Occasionally, on the listFiles() API call, the thread hangs; the heap size keeps increasing until it reaches its maximum and the application throws an OutOfMemoryError. This happens within a short span of time.
Load is not a factor: this happens when there is not much load on the system, only a few transactions are doing the same processing, and only one thread hangs.

Please share your inputs on this behaviour and any experience with a similar issue.

Thanks,
Biji
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24213
Hi,

Welcome to JavaRanch!

If you call listFiles() on a directory with many, many files in it, then the array of File objects can use up a lot of memory. "A lot" is relative to the amount of memory available to your application -- remember that the JVM's maximum heap size is fixed at application startup.

Just to give you an idea: a File contains the path as a String object, plus an int. If we assume each object has a 16-byte overhead (typical), then the File, the String, and the String's internal character array account for 48 bytes; the various other primitive members add another 12 (let's say), so 60 bytes. On top of that come two bytes for each character in the path/filename. If you have a longish path to the directory and a longish filename -- say 150 characters altogether -- that's roughly 300 more, or about 350 bytes per File object. A directory with 100,000 files would therefore return about 35 megabytes of data from listFiles().
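
If you want to sanity-check those numbers on one of your own directories, here's a rough sketch (Runtime-based measurement is only approximate, and the directory path comes in as an argument):

import java.io.File;

public class ListFilesMemory {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        rt.gc(); // best-effort collection so the baseline is cleaner
        long before = rt.totalMemory() - rt.freeMemory();

        File[] entries = new File(args[0]).listFiles(); // directory to measure
        if (entries == null) return; // not a directory, or an I/O error

        long after = rt.totalMemory() - rt.freeMemory();
        System.out.println(entries.length + " entries, roughly "
                + ((after - before) / 1024) + " KB retained by the array");
    }
}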
 
Biji Nair
Greenhorn
Posts: 2
True. As you rightly mentioned, each File object takes about 302 bytes, and with 25K files in the folder that comes to about 8 MB. The heap we have is 2 GB, and the application's average utilization is 450 MB. Given this, I would expect the operation to take 8 MB and then be garbage-collected. Under normal conditions it is never an issue.

If the application consistently used up all the memory, we could draw that inference and switch to alternatives to listFiles().
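
For example, since we are only after a specific file pattern anyway, a minimal sketch (the suffix check stands in for the real pattern) would be to use list() with a FilenameFilter, which returns just the matching names as Strings instead of a full File[] array:

import java.io.File;
import java.io.FilenameFilter;

public class PatternList {
    public static void main(String[] args) {
        File dir = new File(args[0]); // directory to search
        // list() hands back plain String names; the filter keeps only
        // matches, and no File objects are built for the entries at all.
        String[] matches = dir.list(new FilenameFilter() {
            public boolean accept(File d, String name) {
                return name.endsWith(".log"); // hypothetical pattern
            }
        });
        System.out.println(matches == null ? "not a directory"
                : matches.length + " matching files");
    }
}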

But this behaviour of eating up the entire heap (from 500 MB to 2 GB) is hard to understand. Also the fact that the thread hangs. In addition, I would point out that this folder is dynamic: files get added and deleted all the time. Could the initialization of the array, or the SAN itself, have an impact?

Thanks,
Biji
 
Nitesh Kant
Bartender
Posts: 1638
Biji:
Also the fact that the thread hangs.

"Hangs" can be a vague term because as you have told that the directory has more than 25K files, the listFiles() itself may take a long time. Complicating the scenario, it is a SAN setup so maybe that adds to the latency.

If you really think the memory usage is not justified, I would suggest running your application under a profiler. It will tell you what is taking the time and where the memory is going.
(It may be that the returned array is never GCed, i.e. something holds on to the reference, so it is not eligible for garbage collection.)
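
For instance, a contrived sketch (not your code) of how that happens: if each scan stores its result in a long-lived collection, every File[] stays reachable and the heap climbs in exactly the way you describe:

import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class RetainedScans {
    // long-lived reference: nothing added here is ever eligible for GC
    private static final List<File[]> history = new ArrayList<File[]>();

    static void scan(File dir) {
        File[] entries = dir.listFiles();
        if (entries != null) {
            history.add(entries); // retained forever -> heap keeps growing
        }
    }
}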

If you are on JDK 6, the -XX:+HeapDumpOnOutOfMemoryError JVM option will get you a heap dump when the OOM occurs. That heap dump can then be analyzed with jhat.
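
A minimal invocation (the jar name and dump directory are placeholders; the dump file is written as java_pid<pid>.hprof under the given path):

java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dumps -jar yourapp.jar
jhat /tmp/dumps/java_pid12345.hprof

jhat then serves the analysis as a web page, on port 7000 by default.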
 