Thanks and Regards
SCJP, SCWCD
William Brogden wrote:A lot would depend on exactly how your program uses these "huge data structures" - random access? sequential access? - does your program have to modify and save these structures?
Explain in more detail please if you want better advice.
Are we talking about Java objects here? If so can you create a more compact representation based on the real characteristics of the data items?
Are you on a network where some work can be farmed out to another machine?
Bill
William Brogden wrote:Obviously the very best performance would come from an all-in-memory HashMap.
Assuming that is impossible, let's consider some more questions.
Can lookups be batched? If you are going to have to go to some sort of file or network lookup, the bigger the batch you can create the better.
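To illustrate the batching point, here is a minimal sketch, not taken from the thread. It assumes a hypothetical `SlowStore` interface standing in for whatever file or network lookup is used, with a made-up `fetchAll` method that answers many keys in one round trip. The names `BatchLookup` and `replaceWords` are also invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class BatchLookup {
    // Hypothetical stand-in for a slow file- or network-backed lookup.
    public interface SlowStore {
        // Returns replacements for those keys that have one; a single round trip.
        Map<String, String> fetchAll(Set<String> keys);
    }

    // Collects the distinct words first, asks the store once, then substitutes.
    public static List<String> replaceWords(List<String> words, SlowStore store) {
        Set<String> distinct = new HashSet<>(words);
        Map<String, String> found = store.fetchAll(distinct);
        List<String> out = new ArrayList<>(words.size());
        for (String w : words) {
            // Words without a replacement pass through unchanged.
            out.add(found.getOrDefault(w, w));
        }
        return out;
    }
}
```

The bigger the list handed to `replaceWords`, the fewer round trips are paid per word, which is the effect the post describes.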
William Brogden wrote:More questions:
Do these strings have to be Unicode or are the characters in the ASCII set?
Do you have to check every single word for a possible substitution?
What is the likely frequency of having to replace a string? It may make a big difference if there are only a few per document versus wholesale replacement.
I can't see how a linked hash map would help, but a cache is a good idea since words tend to repeat in documents. Note that I am assuming typical text documents because that's the kind of thing I used to work with.
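For what it is worth, one common way to build a small bounded cache in Java does use `LinkedHashMap`, in access order with `removeEldestEntry` overridden. The sketch below is an illustration under that assumption, not something proposed in the thread; the class name `WordCache` and the capacity parameter are invented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A least-recently-used cache of word -> replacement, capped at maxEntries.
public class WordCache extends LinkedHashMap<String, String> {
    private final int maxEntries;

    public WordCache(int maxEntries) {
        // Third argument true = iterate in access order, so get() refreshes an entry.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        // Called after each insertion; returning true drops the least recently used entry.
        return size() > maxEntries;
    }
}
```

Because words repeat within a document, a few thousand entries in front of a slower lookup may absorb most requests; the right size would have to be measured.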
David O'Meara wrote:Are you using substring on file contents? On older JVMs (before Java 7 update 6) that method can cause memory leaks: the substring shares the original String's backing character array, so you think you're holding part of the file but you actually retain the whole file in memory. Maybe not, but then again maybe you're not using as much memory as you think.
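A minimal sketch of the usual workaround on those older JVMs: copy the substring so it no longer points at the large original. The helper name `detach` is invented for illustration. On current JVMs `substring` already copies, so the extra `new String(...)` is redundant there.

```java
public class SubstringCopy {
    // Returns the slice as an independent String.
    // On pre-7u6 JVMs, new String(String) trims the shared backing array,
    // allowing the large source String to be garbage collected.
    public static String detach(String big, int from, int to) {
        return new String(big.substring(from, to));
    }
}
```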
William Brogden wrote:With all-ASCII replacement text you could save considerable memory in a HashMap by storing the replacement text as byte[] rather than String, since String uses 16-bit Unicode characters.
If the input numbers and replacement numbers fit in a Java Integer or Long you could save even more and maybe keep the whole HashMap in memory. So, what is the range of these numbers?
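A minimal sketch of the byte[] idea, assuming the replacement text really is pure ASCII. The class and method names are invented for illustration. Note that JVMs from Java 9 onward already store Latin-1-only Strings at one byte per character, so the saving is likely to be smaller there.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class AsciiReplacements {
    // Values are kept as raw ASCII bytes instead of String objects.
    private final Map<String, byte[]> table = new HashMap<>();

    public void put(String key, String replacement) {
        table.put(key, replacement.getBytes(StandardCharsets.US_ASCII));
    }

    // Decodes on demand; returns null when there is no replacement.
    public String get(String key) {
        byte[] raw = table.get(key);
        return raw == null ? null : new String(raw, StandardCharsets.US_ASCII);
    }
}
```

The trade-off is a decode on every hit, which a small cache of recently used words could offset.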
Bill
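If the keys and replacements do fit in a long, one way to go further than a HashMap of boxed Long objects is a pair of sorted primitive arrays searched with `Arrays.binarySearch`. This is a different technique from the HashMap Bill mentions, sketched here only as an assumption about what "save even more" could look like; the class name `LongTable` and the fallback parameter are invented.

```java
import java.util.Arrays;

public class LongTable {
    private final long[] keys;   // must be sorted ascending
    private final long[] values; // values[i] is the replacement for keys[i]

    public LongTable(long[] sortedKeys, long[] values) {
        if (sortedKeys.length != values.length) {
            throw new IllegalArgumentException("keys and values must have equal length");
        }
        this.keys = sortedKeys;
        this.values = values;
    }

    // Returns the replacement for key, or fallback when the key is absent.
    public long lookup(long key, long fallback) {
        int i = Arrays.binarySearch(keys, key);
        return i >= 0 ? values[i] : fallback;
    }
}
```

Each entry costs 16 bytes of array space with no per-entry objects, at the price of O(log n) lookups and a table that is awkward to modify after it is built.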