Wondering About Class Memory Overhead

 
Sam Sylva
Greenhorn
Posts: 6
Hi,

I've been tasked with processing a large dataset as part of a class assignment. One of the fields is a 24-digit unsigned hex number. I realized that, rather than storing the field verbatim in a char array of length 24, I could store the actual value of the hex number in an array only 6 chars long (I chose char over int because chars are unsigned). To do this, I wrote the following simple class to accept the hex string and convert it so that it can be stored in that manner:


Assuming all this works (I haven't tested it, but I think it should), what I'm wondering is how much memory this will save me, if any, compared to just storing everything in a char[24] array. The underlying char[6] is obviously quite a bit smaller, but an object must take up more space than just its fields, since Java needs to know what kind of object it is so that it knows what methods it has, etc. I have no idea how to accurately compare the size of a Hex24 object to the size of a 24-character array. How do you figure this out?
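
One rough way to reason about this is to add up header, field, and padding sizes by hand. The sketch below does that arithmetic, assuming a 64-bit HotSpot JVM with compressed references (12-byte object headers, 16-byte array headers, 4-byte references, 8-byte alignment). Those constants and the class name SizeEstimate are assumptions for illustration only; other JVMs and settings will give different numbers.

```java
/** Back-of-the-envelope heap sizes under ASSUMED 64-bit HotSpot defaults. */
final class SizeEstimate {
    // Assumed layout constants; none of these are guaranteed by the Java language.
    static final int OBJECT_HEADER = 12; // mark word + compressed class pointer
    static final int ARRAY_HEADER = 16;  // object header + 4-byte array length
    static final int REFERENCE = 4;      // compressed object reference
    static final int ALIGNMENT = 8;      // every object is padded to a multiple of 8

    /** Rounds a size up to the next multiple of ALIGNMENT. */
    static long align(long bytes) {
        return (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** Estimated size of a char[length]: header plus 2 bytes per char. */
    static long charArray(int length) {
        return align(ARRAY_HEADER + 2L * length);
    }

    /** Estimated size of an object whose only field is a reference to a char[length]. */
    static long wrapperAroundCharArray(int length) {
        return align(OBJECT_HEADER + REFERENCE) + charArray(length);
    }

    /** Estimated size of an object that stores 96 bits inline as one long and one int. */
    static long inlineLongPlusInt() {
        return align(OBJECT_HEADER + 8 + 4);
    }

    public static void main(String[] args) {
        System.out.println("char[24]:              " + charArray(24));
        System.out.println("wrapper + char[6]:     " + wrapperAroundCharArray(6));
        System.out.println("object with long+int:  " + inlineLongPlusInt());
    }
}
```

Under those assumptions the wrapper-plus-char[6] design comes to 48 bytes against 64 for a bare char[24], and an object holding the 96 bits in a long and an int with no array comes to 24. Measuring on the actual JVM is the only way to confirm the real figures.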

edit: I guess another option/thing I might want to compare size efficiency for is putting the char[6] along with the conversion/comparison logic directly into the classes (as fields/member methods) where I'm currently using Hex24 fields. I'd guess this is the most memory-efficient option I've come up with, but it would lead to a lot of code duplication.

edit2: Another thing I'd like to compare is the size of a String vs. the size of the equivalent char array for shortish text fields. If this difference is big enough it might be worth storing those fields as arrays rather than Strings.
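
The same kind of hand arithmetic gives a rough feel for String versus char[]. This sketch again assumes 64-bit HotSpot with compressed references, and it assumes a String is one small object (header, array reference, cached hash, and on newer JDKs a couple of one-byte flags) pointing at a backing array. The backing array is assumed to be a char[] on older JDKs and a byte[] with one byte per character for Latin-1 text on JDK 9 and later. The class and method names are hypothetical.

```java
/** Rough String vs char[] footprint under ASSUMED 64-bit HotSpot defaults. */
final class StringVsCharArray {
    // Assumed layout constants; not guaranteed by the Java language.
    static final int OBJECT_HEADER = 12;
    static final int ARRAY_HEADER = 16;
    static final int REFERENCE = 4;
    static final int ALIGNMENT = 8;

    static long align(long bytes) {
        return (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** The String object itself: header + array reference + int hash + two 1-byte flags. */
    static long stringObject() {
        return align(OBJECT_HEADER + REFERENCE + 4 + 2);
    }

    /** A plain char[n]. */
    static long charArray(int n) {
        return align(ARRAY_HEADER + 2L * n);
    }

    /** A String whose backing array uses 2 bytes per character. */
    static long stringTwoBytesPerChar(int n) {
        return stringObject() + align(ARRAY_HEADER + 2L * n);
    }

    /** A String whose backing array uses 1 byte per character (Latin-1 text, newer JDKs). */
    static long stringOneBytePerChar(int n) {
        return stringObject() + align(ARRAY_HEADER + (long) n);
    }

    public static void main(String[] args) {
        int n = 10; // a "shortish" text field
        System.out.println("char[" + n + "]:                 " + charArray(n));
        System.out.println("String, 2 bytes per char: " + stringTwoBytesPerChar(n));
        System.out.println("String, 1 byte per char:  " + stringOneBytePerChar(n));
    }
}
```

With these assumptions a 10-character field costs about 40 bytes as a char[] and about 56 to 64 bytes as a String, so the fixed per-String overhead matters most for very short fields.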
 
Junilu Lacar
Sheriff
Posts: 17734
Programmers are notoriously bad at optimizing based on gut feeling and intuition. Use a profiler if memory is a big concern. However, remember what Turing Award winner Donald Knuth said: "Premature optimization is the root of all evil."

Strive for clean and clear code and designs before you start worrying about performance and efficiency. Efforts to optimize based on opinion and guesswork are, more often than not, not worth the minimal gains you get, if any.
 
Junilu Lacar
Sheriff
Posts: 17734
And if you think you should test this gnarly looking code, yes, you definitely should.
 
Sam Sylva
Greenhorn
Posts: 6
Okay. I don't know how to use a profiler and I don't really have time to learn right now since it's the end of the semester. The reason I'm so worried about optimizing for memory is that, if I did the calculations correctly, reducing the memory requirements of a field by a single byte will reduce the total amount of memory my program needs to run by about 100 MB. I need to get it down to the point where it can run on a machine with 8 GB of RAM. I guess for now I'll just use the 24-char array and hope it's good enough.

edit: So basically what you recommend is that I initially write everything in as straightforward, object-oriented a manner as possible and then go back later and fix what's killing my memory? For example, it would be better to start off encapsulating each record in the data set in an object and maintain one array that references these objects rather than creating a set of parallel arrays, one for each field, where each index across the arrays corresponds to a record (which would definitely use less memory since you don't have the overhead from millions of record objects)?
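
For reference, the parallel-array layout described above might look like the sketch below. The 96-bit hex field is split into a long column and an int column, and quantity stands in for some other per-record field. All names here (RecordColumns, idHigh, idLow, quantity) are hypothetical, since the actual record layout isn't shown in this thread.

```java
/** Parallel-array ("column") storage: one primitive array per field, one index per record. */
final class RecordColumns {
    private final long[] idHigh;   // upper 64 bits of the 96-bit hex field
    private final int[] idLow;     // lower 32 bits of the 96-bit hex field
    private final int[] quantity;  // hypothetical second field
    private int size;              // number of records stored so far

    RecordColumns(int capacity) {
        idHigh = new long[capacity];
        idLow = new int[capacity];
        quantity = new int[capacity];
    }

    /** Appends one record; throws ArrayIndexOutOfBoundsException once capacity is exceeded. */
    void add(long high, int low, int qty) {
        idHigh[size] = high;
        idLow[size] = low;
        quantity[size] = qty;
        size++;
    }

    int size() { return size; }
    long idHigh(int record) { return idHigh[record]; }
    int idLow(int record) { return idLow[record]; }
    int quantity(int record) { return quantity[record]; }
}
```

Each record then costs only the sum of its primitive fields (16 bytes here), with no per-record object header or reference. The trade-off is that a record is no longer a single object you can pass around.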
 
Liutauras Vilda
Marshal
Posts: 8988
Hi Sam,

What about other important things, such as code indentation, magic numbers, and misleading variable names? Are they part of the marking grid?

Also, since you're worrying about tiny memory/speed improvements and are using pre-increment in the "for" loop (if I'm not mistaken, I know why), which is not the usual form of a "for" loop, it might be worth thinking about which branch of the "if" statement is likely to be taken more often (you mentioned you're working with a large amount of data). Based on that assumption, you could change != to == and swap the return statements if that is the case (it's not a usual efficiency improvement either, but...).

I have a hunch that someone will criticize that.
 
Junilu Lacar
Sheriff
Posts: 17734

Sam Sylva wrote:edit: So basically what you recommend is that I initially write everything in as straightforward, object-oriented a manner as possible and then go back later and fix what's killing my memory? For example, it would be better to start off encapsulating each record in the data set in an object and maintain one array that references these objects rather than creating a set of parallel arrays, one for each field, where each index across the arrays corresponds to a record (which would definitely use less memory since you don't have the overhead from millions of record objects)?



I don't know the specifics of your requirements, so beyond the advice to write clean code first, anything else you read into that is coming from you alone. I do question why you need to keep millions of objects in memory. As with most things in programming, the way to manage complexity, limited resources, etc. is by Divide and Conquer. Even your operating system will page data in and out of memory to make more efficient use of it. If you want to tighten your belt but still eat a ton of food, at some point you're still going to burst. The only way to eat a whole elephant is one bite at a time.
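
As a minimal illustration of the one-bite-at-a-time idea, the sketch below reads records one line at a time and keeps only a running result, so memory use does not grow with the size of the input. It assumes, purely for illustration, that records arrive one per line with comma-separated fields and that the task can be computed incrementally; the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class StreamingCount {
    /** Counts lines whose first comma-separated field equals key, holding one line in memory at a time. */
    static long countMatching(BufferedReader in, String key) throws IOException {
        long count = 0;
        String line;
        while ((line = in.readLine()) != null) {
            int comma = line.indexOf(',');
            String first = comma < 0 ? line : line.substring(0, comma);
            if (first.equals(key)) {
                count++;
            }
        }
        return count;
    }

    /** Convenience overload for small in-memory samples; a real run would wrap a file reader instead. */
    static long countMatching(String data, String key) {
        try (BufferedReader in = new BufferedReader(new StringReader(data))) {
            return countMatching(in, key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countMatching("abc,1\nxyz,2\nabc,3", "abc"));
    }
}
```

Whether this applies depends on the assignment: tasks that need random access to every record at once (sorting everything in memory, for example) cannot be streamed this simply.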

BTW, your equals method is flawed. It should return false when passed null but instead your implementation will throw a NullPointerException. Also, if you are passed a reference to the same object itself, you do unnecessary calculations instead of immediately returning true. An object is always equal to itself and it is never equal to null.
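
A sketch of an equals method that follows those two rules is shown below. The original class isn't reproduced in this thread, so this Hex24 is a stand-in: the field name hexAsInt is taken from the discussion, while the constructor and the hashCode override are assumptions added so the example is complete.

```java
import java.util.Arrays;

/** Stand-in for the Hex24 class under discussion; only equals/hashCode are the point here. */
final class Hex24 {
    private final char[] hexAsInt; // six 16-bit chunks of the 96-bit value

    Hex24(char[] sixChars) {
        if (sixChars.length != 6) {
            throw new IllegalArgumentException("expected exactly 6 chars");
        }
        this.hexAsInt = sixChars.clone(); // defensive copy
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;  // an object is always equal to itself
        }
        if (!(other instanceof Hex24)) {
            return false; // covers null as well as unrelated types
        }
        Hex24 that = (Hex24) other;
        for (int i = 0; i < hexAsInt.length; i++) {
            if (hexAsInt[i] != that.hexAsInt[i]) {
                return false; // first mismatch settles it
            }
        }
        return true;
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(hexAsInt); // keep hashCode consistent with equals
    }
}
```

The loop could also be replaced by Arrays.equals(hexAsInt, that.hexAsInt); it is written out here to mirror the element-by-element comparison being discussed.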

 
Junilu Lacar
Sheriff
Posts: 17734

Liutauras Vilda wrote:
Also, since you're worrying about tiny memory/speed improvements and are using pre-increment in the "for" loop (if I'm not mistaken, I know why), which is not the usual form of a "for" loop, it might be worth thinking about which branch of the "if" statement is likely to be taken more often (you mentioned you're working with a large amount of data). Based on that assumption, you could change != to == and swap the return statements if that is the case (it's not a usual efficiency improvement either, but...).

I have a hunch that someone will criticize that.


Ok, I'll bite.

Using pre-increment vs post-increment in the for-loop here makes no difference, since the increment part is always evaluated after the loop body is executed. As a matter of style, you should use post-increment so you don't confuse other folks the way you did with Liutauras.

Regarding the if-statement and using != vs. ==: in this case != is required, because as soon as one pair of elements differs you know the objects are not equal and the rest of the array doesn't matter. To return true, all elements must be the same as the corresponding elements in the other object.

The magic number, 6, can be replaced by hexAsInt.length.
 
Liutauras Vilda
Marshal
Posts: 8988

Junilu Lacar wrote:Using pre-increment vs post-increment in the for-loop here makes no difference since the increment part is always evaluated after the loop body is executed.


Well, agree, Junilu.
I was just wondering whether the OP had in mind that, depending on the implementation, pre-increment may be faster than post-increment: post-increment "index++" must create temporary storage to keep the original value of index, increment and store index, and then return the original value, while pre-increment "++index" only has to increment and store the value and return it, with no temporary.
 
Marshal
Posts: 80097
LV, I would suggest you write methods with i++ and ++i in and print their bytecode; I think you use
javap -c MyClass
Then you can see what the bytecode does. I have seen websites (I cannot remember where) saying you can save a few milliseconds if you use ++i several million times instead of i++, which seems to mean any difference is very small.
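
A small class along these lines would do for that experiment. The class and method names are made up for illustration; the two methods differ only in the increment form used in the loop header.

```java
/** Two loops that differ only in i++ vs ++i, for comparing results and bytecode. */
final class IncrementDemo {
    /** Sums 0..n-1 using post-increment in the loop header. */
    static int sumWithPostIncrement(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    /** Sums 0..n-1 using pre-increment in the loop header. */
    static int sumWithPreIncrement(int n) {
        int sum = 0;
        for (int i = 0; i < n; ++i) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumWithPostIncrement(10));
        System.out.println(sumWithPreIncrement(10));
    }
}
```

After compiling, javap -c IncrementDemo prints both method bodies side by side. Because the value of the increment expression is discarded in a loop header, I would expect the two listings to match, but check the output from your own compiler rather than taking that on trust.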

SS, you should not worry about memory use, unless you need millions of objects “live” simultaneously. Saving a few bytes here and a few bytes there is trivial compared to the amount of effort you are putting into it.

How are you calculating your chars from a 24‑digit hex number? If you pass 0x1000_0000_0000_0000_0000_0000, what will your array contain? Will five of the six chars equal (char) 0?
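
To make the question concrete, here is one way the packing could work; it is only a sketch and not the original poster's code, which isn't shown in this thread. It assumes each group of four hex digits becomes one 16-bit char, most significant group first. The names HexPacking, pack, and unpack are hypothetical.

```java
/** Packs a 24-digit hex string into six chars (4 hex digits per char) and back. */
final class HexPacking {
    static char[] pack(String hex) {
        if (hex.length() != 24) {
            throw new IllegalArgumentException("expected 24 hex digits, got " + hex.length());
        }
        char[] packed = new char[6];
        for (int i = 0; i < packed.length; i++) {
            int value = 0;
            for (int j = 0; j < 4; j++) {
                int digit = Character.digit(hex.charAt(4 * i + j), 16);
                if (digit < 0) {
                    throw new IllegalArgumentException("not a hex digit at index " + (4 * i + j));
                }
                value = (value << 4) | digit; // shift in the next 4 bits
            }
            packed[i] = (char) value; // value is at most 0xFFFF, so it fits a char
        }
        return packed;
    }

    /** Rebuilds the 24-digit lowercase hex string, keeping leading zeros. */
    static String unpack(char[] packed) {
        StringBuilder sb = new StringBuilder(4 * packed.length);
        for (char c : packed) {
            sb.append(String.format("%04x", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        char[] packed = pack("100000000000000000000000");
        for (char c : packed) {
            System.out.println((int) c);
        }
    }
}
```

With this scheme, the value 1 followed by 23 zeros packs to a first char of 0x1000 and five chars equal to (char) 0, so zero chars in the array are expected and valid.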
 
Bartender
Posts: 10780

Sam Sylva wrote:I've been tasked with processing a large dataset as part of a class assignment. One of the fields is a 24-digit unsigned hex number. I realized that, rather than storing the field verbatim in a char array of length 24, I could store the actual value of the hex number in an array only 6 chars long (I chose char over int because chars are unsigned). To do this, I wrote the following simple class...


And you wrote this to convert just one of the fields?

You might want to have a look at BigInteger, which has methods to convert to and from hex.
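
For example, a round trip through BigInteger might look like the sketch below. The BigInteger(String, int) constructor and toString(int) are standard; the wrapper class, the method names, and the zero-padding back to 24 digits are assumptions for illustration.

```java
import java.math.BigInteger;

/** Converts between 24-digit hex strings and BigInteger values. */
final class BigIntegerHex {
    /** Parses an unsigned hex string (radix 16). */
    static BigInteger parse(String hex) {
        return new BigInteger(hex, 16);
    }

    /** Formats as lowercase hex, left-padded with zeros to at least 24 digits. */
    static String format24(BigInteger value) {
        String digits = value.toString(16); // drops leading zeros
        StringBuilder sb = new StringBuilder(24);
        for (int i = digits.length(); i < 24; i++) {
            sb.append('0');
        }
        return sb.append(digits).toString();
    }

    public static void main(String[] args) {
        BigInteger value = parse("00000000000000000000000a");
        System.out.println(value);
        System.out.println(format24(value));
    }
}
```

Note that a BigInteger is itself an object wrapping an array, so it is convenient rather than compact; it addresses the conversion code, not the memory question.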

That said, I applaud your thinking. I've often wondered if BigInteger could possibly be improved by using chars internally, rather than ints; but I suspect that, even there, it's premature optimization, because the natural size for arithmetic in Java is an int.

Other than that: listen to the others. They speak not with forked tongue.

Winston
 