
Garbage collection & memory leaks in Java

 
Mahdad Zarafshan
Greenhorn
Posts: 12
Hi all,

Are memory leaks possible in Java? My understanding is that the answer is yes. However, there are different GC implementations in Java, which:
- collect circular references and even weak references
- collect the object when the object's counter becomes 0
etc.

My question is:

What's the scenario in which GC fails to collect an object? Any idea about this?

Thanks,

Mahdad
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24217
Memory leaks in manual-memory-management languages happen because an object becomes unreferenced before anyone frees it. Leaks in Java happen because an object you're no longer using remains referenced. This happens often in GUI programs, for example, when forgotten event handlers contain references to screens you no longer need, but remain attached to some existing component.
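
The forgotten-listener case can be sketched roughly as below. This is only an illustration: `EventSource`, `Screen`, and `ListenerLeakDemo` are made-up names standing in for a long-lived GUI component and a screen that registers on it, not classes from any real toolkit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a long-lived GUI component such as a button.
class EventSource {
    private final List<Runnable> listeners = new ArrayList<>();

    void addListener(Runnable l) { listeners.add(l); }
    void removeListener(Runnable l) { listeners.remove(l); }
    int listenerCount() { return listeners.size(); }
}

// Hypothetical "screen" that registers a listener on the long-lived source.
class Screen {
    private final byte[] pixels = new byte[1024 * 1024]; // stands in for heavy UI state
    private final EventSource source;
    private final Runnable onEvent = this::refresh;      // the listener captures 'this'

    Screen(EventSource source) {
        this.source = source;
        source.addListener(onEvent);
    }

    void refresh() { /* a real screen would repaint here */ }

    // The fix: detach the listener when the screen is no longer needed.
    void close() { source.removeListener(onEvent); }
}

class ListenerLeakDemo {
    // Returns how many listeners the source still holds after the screen is dropped.
    static int listenersAfterDrop(boolean closeFirst) {
        EventSource button = new EventSource();
        Screen screen = new Screen(button);
        if (closeFirst) {
            screen.close();
        }
        screen = null; // local reference gone; the source may still hold the listener
        return button.listenerCount();
    }

    public static void main(String[] args) {
        System.out.println("without close(): " + listenersAfterDrop(false));
        System.out.println("with close():    " + listenersAfterDrop(true));
    }
}
```

Without `close()`, the source still holds one listener, and through it the whole `Screen` (including its megabyte of `pixels`) stays reachable for as long as the source lives. With `close()`, the count drops to zero and the screen becomes eligible for collection.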

I don't know of any modern JVMs that use reference-counting; it's slow and uses too much memory.
 
Mahdad Zarafshan
Greenhorn
Posts: 12
Thanks for your reply Ernest.

My Q's were about automatic GC's such as Java, not manual ones such as C++.

More Q's:

- You mentioned **an object you're no longer using remains referenced**. How is this possible? Who will reference this unused object? Is it the developer's job to set the reference to null to mark this object for collection?

- You mentioned **JVMs that use reference-counting; it's slow and uses too much memory**. Which implementation of GC is efficient? Why is ref.counting slow? Which GC implementation is more vulnerable to leaks? Any idea?

Thanks,

Mahdad
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24217
Originally posted by Mahdad Zarafshan:
Thanks for your reply Ernest.

My Q's were about automatic GC's such as Java, not manual ones such as C++.


Yes, I know that. I was explaining that "memory leak" means something fundamentally different in a GC'd language. In C++, if you lose your last pointer to an object without deleting it, the object is unreachable, and therefore can never be freed. In Java, the problem happens when something you no longer need is still reachable.


Originally posted by Mahdad Zarafshan:
- You mentioned **an object you're no longer using remains referenced**. How is this possible? Who will reference this unused object? Is it the developer's job to set the reference to null to mark this object for collection?


It's possible because an object can have multiple references, and the programmer may not realize that she's put some object in a situation where it will always be reachable. As I said, event listeners are easy to forget about, but there are all sorts of other possibilities. It's easy to think that when your local reference to an object goes out of scope or is set to null, the object will be collectible, but that's not always the case -- the object may be referenced somewhere else.
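
A minimal sketch of that "referenced somewhere else" situation, assuming a hypothetical static `REGISTRY` list plays the part of the forgotten second reference. A `WeakReference` is used only as a probe: it doesn't keep the object alive, so it shows whether something else does.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

class HiddenReferenceDemo {
    // Hypothetical long-lived registry; anything added here stays strongly reachable.
    static final List<Object> REGISTRY = new ArrayList<>();

    // Returns true if the object is still alive after the local reference is cleared.
    static boolean survivesAfterNulling() {
        Object session = new Object();
        REGISTRY.add(session);                                  // the "forgotten" second reference
        WeakReference<Object> probe = new WeakReference<>(session);

        session = null;                                         // local reference dropped
        System.gc();                                            // only a request to the JVM

        // The probe cannot have been cleared: REGISTRY still strongly references the object.
        return probe.get() != null;
    }

    public static void main(String[] args) {
        System.out.println("still alive: " + survivesAfterNulling());
    }
}
```

Setting `session` to null changes nothing here, because the collector only cares whether the object can be reached from somewhere. If the entry were also removed from `REGISTRY`, the object would become eligible for collection, although exactly when the JVM reclaims it is not guaranteed.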

Again, it's the opposite of the situation in (for example) C++. In that language, if you have multiple pointers to an object, then freeing the object via one pointer gets rid of the object -- but now you have all these bad "dangling" pointers.


Originally posted by Mahdad Zarafshan:
- You mentioned **JVMs that use reference-counting; it's slow and uses too much memory**. Which implementation of GC is efficient? Why is ref.counting slow? Which GC implementation is more vulnerable to leaks? Any idea?


Modern JVMs typically use a generational mark-and-sweep collector. "Generational" means the heap is divided into sections called "generations", where objects are segregated according to their age. Most objects die young, and the longer an object has existed, the longer it's likely to continue to exist, so it's efficient to concentrate garbage collection efforts on the section containing the newest objects. The "mark and sweep" algorithm basically involves tracing from a set of roots (thread stacks, static fields and the like) to mark every reachable object, then sweeping through the heap and freeing whatever wasn't marked. Because it works from reachability rather than counts, it handles cycles, unlike reference counting, so it can pretty much collect all unreachable objects -- the only leaks are due to "accidentally referenced" objects, as I discussed above.
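
To make the mark-and-sweep idea concrete, here is a toy model of it. It is a sketch of the general technique only, not how any real JVM lays out its heap; `ToyObject`, `ToyHeap`, and `MarkSweepDemo` are invented for the illustration, and the `roots` list stands in for stacks and static fields.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy object: a name plus outgoing references to other toy objects.
class ToyObject {
    final String name;
    final List<ToyObject> refs = new ArrayList<>();
    ToyObject(String name) { this.name = name; }
}

class ToyHeap {
    final List<ToyObject> objects = new ArrayList<>(); // everything allocated and not yet freed
    final List<ToyObject> roots = new ArrayList<>();   // stands in for stacks and static fields

    ToyObject allocate(String name) {
        ToyObject o = new ToyObject(name);
        objects.add(o);
        return o;
    }

    // Mark everything reachable from the roots, then sweep away the rest.
    // Returns the names of the freed objects in allocation order.
    List<String> collect() {
        Set<ToyObject> marked = new HashSet<>();
        Deque<ToyObject> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            ToyObject o = pending.pop();
            if (marked.add(o)) {          // first visit: follow its references
                pending.addAll(o.refs);
            }
        }
        List<String> freed = new ArrayList<>();
        objects.removeIf(o -> {
            if (marked.contains(o)) {
                return false;             // reachable: keep
            }
            freed.add(o.name);
            return true;                  // unmarked: sweep
        });
        return freed;
    }
}

class MarkSweepDemo {
    static List<String> run() {
        ToyHeap heap = new ToyHeap();
        ToyObject app = heap.allocate("app");
        ToyObject a = heap.allocate("a");
        ToyObject b = heap.allocate("b");
        ToyObject cache = heap.allocate("cache");

        heap.roots.add(app);
        app.refs.add(cache); // "accidentally referenced": survives whether it's needed or not
        a.refs.add(b);       // a and b reference each other...
        b.refs.add(a);       // ...but nothing reachable from the roots references them

        return heap.collect();
    }

    public static void main(String[] args) {
        System.out.println("freed: " + run());
    }
}
```

Running it frees `a` and `b`, even though they reference each other, because neither is reachable from a root. `cache` survives simply because `app` still points at it, which is the only kind of "leak" this scheme allows.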

Reference counting is slow because you have to increment or decrement a counter every time a reference to an object is created or dropped. It's expensive in terms of storage because you have to reserve extra space for the count in each object. And it's fragile because no matter how much space you reserve for the reference count, an object might be referenced more times than that, so the count can overflow (this last argument strikes me as rather weak; nonetheless, you'll hear it made). Although mark-and-sweep's "searching" sounds slow, it's really not -- it typically takes much less time overall than reference counting.
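
For comparison, a naive reference-counting scheme might look like the sketch below. It is a hand-rolled illustration, not anything a JVM does; `Counted` and `RefCountDemo` are hypothetical names, and `retain()`/`release()` have to be called by hand wherever a real runtime would do it implicitly.

```java
import java.util.ArrayList;
import java.util.List;

class Counted {
    final String name;
    int refCount = 0;                 // the extra per-object storage
    boolean freed = false;
    final List<Counted> fields = new ArrayList<>();

    Counted(String name) { this.name = name; }

    // Bookkeeping on every new reference.
    void retain() { refCount++; }

    // Bookkeeping on every dropped reference; frees at zero and releases children.
    void release() {
        if (--refCount == 0) {
            freed = true;
            for (Counted c : fields) {
                c.release();
            }
            fields.clear();
        }
    }

    // Stores a reference to target in this object.
    void pointTo(Counted target) {
        target.retain();
        fields.add(target);
    }
}

class RefCountDemo {
    // Returns { lone.freed, a.freed, b.freed }.
    static boolean[] run() {
        Counted lone = new Counted("lone");
        lone.retain();   // a local variable refers to it
        lone.release();  // local dropped: count hits 0, object is freed

        Counted a = new Counted("a");
        Counted b = new Counted("b");
        a.retain();      // local references
        b.retain();
        a.pointTo(b);    // cycle: each count is now 2
        b.pointTo(a);
        a.release();     // locals dropped: each count is 1, never 0
        b.release();

        return new boolean[] { lone.freed, a.freed, b.freed };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("lone freed: " + r[0] + ", a freed: " + r[1] + ", b freed: " + r[2]);
    }
}
```

The lone object is freed as soon as its count reaches zero, but `a` and `b` keep each other's counts at 1 forever, so the cycle is never reclaimed. Every `pointTo` and every dropped reference also costs a counter update, which is the overhead described above.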
 