maximum length of String = 32k ?

 
Greenhorn
Posts: 15
maximum length of String = 32k ?
 
Ranch Hand
Posts: 57
Simplistically, yes.
Here's a discussion of the topic:
String length
Linda
 
Greenhorn
Posts: 28
Java has no limit - the JLS does not specify one. There is an implicit limit if one presumes that a String must be implemented as a single array.
Because of the way the JVM spec lays out the class file format, there are limits on the length of a String constant and on how it is specified.
There is a limit for each object based on the Java heap. Strings live on the heap, so a single one can never be bigger than the heap. Keep in mind that copying a large string effectively doubles the space needed. And characters take two bytes each, so one meg of characters takes up 2 meg of space.
There is also a limit based on serialization, due to a 'bug' which might or might not have been fixed (the Sun folks were arguing as to whether it is a bug or not). This limits it to 64k, or maybe 32k, but only when serialization is used.
For most basic purposes the limit is imposed by the heap size and the serialization limit.
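
The post does not say which serialization mechanism is meant. One place where a 64k figure certainly appears is DataOutputStream.writeUTF, which stores the encoded length in two bytes and rejects anything longer than 65535 bytes. Whether that is the 'bug' referred to above is an assumption; the sketch below only demonstrates the writeUTF limit, and the class and method names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;

// Illustration only: WriteUtfLimit and fits are hypothetical names.
final class WriteUtfLimit {

    /** Returns true if writeUTF accepts a string of n ASCII chars (1 byte each when encoded). */
    static boolean fits(int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append('a');
        }
        DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
        try {
            out.writeUTF(sb.toString());
            return true;
        } catch (UTFDataFormatException e) {
            // Thrown when the encoded form needs more than 65535 bytes.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("65535 chars fit: " + fits(65535));
        System.out.println("65536 chars fit: " + fits(65536));
    }
}
```

Characters outside ASCII take two or three bytes in this encoding, so the limit in chars can be lower than 65535.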

----
Robbies
-----------------------------
1.java IDE tool : JawaBeginer
2.Java Jar tool : JavaJar
http://www.pivotonic.com
-----------------------------
 
Wanderer
Posts: 18671
If you read the other posts after the one Robbie quoted (uncredited), you see there are corrections to the statement "Java has no limit". The JLS does specify that the String class must follow the String API. The String API has methods int length() and char[] toCharArray() which can only work if the number of chars in the String is less than or equal to Integer.MAX_VALUE, which is 2^31 - 1 = 2147483647. This would take about 4 GB of memory as a character array (2 bytes per char). This is the only hard limit imposed by the language. In practice we are usually limited by memory. Try running the following program using java -mx8G (setting max heap size to 8 gigabytes, which is a hideously large amount for most of us):
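
The listing that belongs here is not preserved in the thread. As a rough sketch only, assuming the program simply doubled a String and printed its length on each pass (the sizes reported in the replies are all powers of two), it might have looked something like this. StringGrowth and growTo are made-up names, not the author's.

```java
// Hypothetical reconstruction; the original listing is not available.
final class StringGrowth {

    /** Doubles a one-char string until it has at least targetLength chars; returns the final length. */
    static int growTo(int targetLength) {
        String s = "x";
        while (s.length() < targetLength) {
            s = s + s;                       // each pass doubles the length
            System.out.println(s.length());  // report how far we got
        }
        return s.length();
    }

    public static void main(String[] args) {
        try {
            // Asks for the theoretical maximum; in practice the heap runs out first.
            growTo(Integer.MAX_VALUE);
        } catch (OutOfMemoryError e) {
            System.out.println("Ran out of memory");
        }
    }
}
```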

Unless you actually have several gigabytes of RAM, this program will slow down considerably as the strings get successively larger - the excess memory is stored on disk, which takes much longer to access. Personally I witnessed it create a string of size 134217728 before I got tired of waiting for it. But if you've got the memory available, plus time to wait, then there's no reason you can't see much larger strings created - up to the stated limit.
 
Ranch Hand
Posts: 1365
I modified the program slightly (to use a StringBuffer) and it went pretty smoothly until 16777216 before disk crunching for a while. The next number was 33554432 (at which point I gave up--except Control-C wouldn't stop the VM!). I'm still hearing disk crunch. Hmmmm. We'll see what happens.
 
Ranch Hand
Posts: 1873
hi Jim,
I am confused by this part of your post:

"The String API has methods int length() and char[] toCharArray() which can only work if the number of chars in the String is less than or equal to Integer.MAX_VALUE, which is 2^31 - 1 = 2147483647."

I agree that String objects must follow the String API, but I don't think that can be the reason for a 4 GB limit on a String. Say I have a String object > 4 GB: then length() would truncate (cast) the length, which by now needs a long, to int and return a negative value (the high bit becomes 1, and all the fundamentals you know...).
That obviously won't be the true length of the string, but the API can't do much about it if the object size violates the return type in the API spec. Or maybe the JVM throws an exception once we exceed the limit that length() can return; I don't know.
The heap size limitation makes more sense to me.
I will have to test what happens when we have a string > 4 GB and call length() on it, but I don't have that much RAM, so I am not sure that's going to work for me...
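
For reference, the truncation described above is how a long-to-int narrowing cast behaves in Java in general. The sketch below shows only that cast; it says nothing about how String itself is implemented, and NarrowingDemo and narrow are names invented for the example.

```java
// Illustration of long-to-int narrowing only; not a model of String.length().
final class NarrowingDemo {

    /** Keeps the low 32 bits of value, as a plain (int) cast does. */
    static int narrow(long value) {
        return (int) value;
    }

    public static void main(String[] args) {
        long oneTooMany = Integer.MAX_VALUE + 1L;   // 2147483648, does not fit in an int
        System.out.println(narrow(oneTooMany));     // the high bit is set, so the result is negative
        System.out.println(narrow(1L << 32));       // all low 32 bits are zero
    }
}
```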

regards
maulin
 
Arron Zhang
Greenhorn
Posts: 15
I tested String lengths from 32k to 15M (my computer has 384M of memory).
The length can be any value less than 15M.
When the length is 15M, I get an exception:
Exception in thread "main" java.lang.OutOfMemoryError
 
Jim Yingst
Wanderer
Posts: 18671
Maulin- well, the API for length() doesn't say anything about taking the size of the String and casting it from long to int before returning the number. It says it returns the length, period. If it can't do that, it's unable to comply with its own API.
I know that there are a number of places in Java where longs get converted to ints, which may create negative numbers and assorted annoying effects. But each of these cases is actually documented somewhere - it would be exceedingly bad manners for them to throw in this type of conversion here without warning anyone.
Also, consider the toCharArray() method. Any array is limited to having an int number of elements - how else would you access the higher indices? So a String longer than Integer.MAX_VALUE couldn't return a char[] array that actually contains the complete contents of the String - another violation of the API. And what about many other String methods like lastIndexOf(char)? Again, it returns an int - which may not be able to hold the index of the correct answer.
So, what can a String do to avoid these violations? Well, it's always allowed to throw a RuntimeException or an Error like OutOfMemoryError the moment an attempt is made to create a string that's beyond the theoretical limit. As we've seen, this error is usually thrown well before reaching the theoretical limit. It's true that it might be nice if the String API actually documented what does occur in this case, but it's not required to do so for unchecked exceptions.
Arron - did you reset the max heap size? I mis-stated the option to use here - it's -Xmx rather than -mx, and G is not a recognized suffix. (I guess they figured no one would try setting the heap this big.) So to set a heap size of 4 GB, for example, you'd use
java -Xmx4096m MyClass
If you don't do this, you're getting a default heap size of 64M. This seems to explain your results - a String of 15M chars probably takes 30M bytes. And if you've got this in a StringBuffer and you add another 1M chars onto it, you're creating a new internal char[] array of length 16M, taking up 32M bytes. Both old and new char arrays need to be kept in memory at the same time, at least long enough to copy values from one to the other. So you're using 62M right there - it's not difficult to imagine that you've done something else slightly different which causes you to use a little more instead, crossing the 64M line.
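
The arithmetic above can be checked with a few lines of code, and Runtime.getRuntime().maxMemory() reports the heap ceiling the JVM is actually running with. This is a sketch under the thread's assumption of 2 bytes per char; HeapEstimate and bytesDuringGrowth are names made up for the example.

```java
// Back-of-the-envelope check; assumes 2 bytes per char as stated in the thread.
final class HeapEstimate {
    static final long BYTES_PER_CHAR = 2;
    static final long M = 1024L * 1024L;

    /** Bytes held while the old and the new char[] both exist during a copy. */
    static long bytesDuringGrowth(long oldChars, long newChars) {
        return (oldChars + newChars) * BYTES_PER_CHAR;
    }

    public static void main(String[] args) {
        long needed = bytesDuringGrowth(15 * M, 16 * M);
        long maxHeap = Runtime.getRuntime().maxMemory();   // upper bound the JVM will try to use
        System.out.println("needed during copy: " + needed / M + "M");
        System.out.println("max heap:           " + maxHeap / M + "M");
    }
}
```

Running it with and without an -Xmx setting shows whether the option took effect.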
Hint - run java using the -verbose:gc option as well, to get messages telling you how big the heap really is (and how much garbage collection is going on). Enjoy...
[ January 06, 2003: Message edited by: Jim Yingst ]
 
Arron Zhang
Greenhorn
Posts: 15
I see.
Thanks.
 
Maulin Vasavada
Ranch Hand
Posts: 1873
hi Jim,
Now it's clear. Thanks for the detailed explanation, it was good.
So now I have one more question to ask anyone who claims to know Java: "If I have 10GB of character data and want to process it using an array, how would you go about it?"
regards
maulin
 
Jim Yingst
Wanderer
Posts: 18671
Ummm... don't? As we've seen it's going to be very hard to get that much data in an array at once, unless your computer has a lot more RAM than mine does. So typically you'd analyse the type of "processing" required and try to figure out how much data needs to be in memory at one time. A common situation is reading data from a file, where you often only need to save & process one line at a time. When you're done processing a given line, reuse the same String variable to store the next line. The previous line can then be garbage collected. See this link for discussion of this sort of thing. If you find that the relationships in your data are more complex and you can't do the processing you need one line at a time, then it may be preferable to design a database schema to capture all the data relationships you are interested in, and then let the database do the work of searching for things. That's a much more complex topic - but it really depends what sort of data you have, and what sort of processing you need to do.
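
A minimal sketch of the one-line-at-a-time approach, assuming the "processing" is just counting characters. LineByLine and countChars are names invented for the example, and the file path is whatever you pass as the first argument.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Illustration only: the counting stands in for whatever per-line work is needed.
final class LineByLine {

    /** Sums the lengths of all lines, holding only one line in memory at a time. */
    static long countChars(Reader source) {
        try {
            BufferedReader reader = new BufferedReader(source);
            long total = 0;
            String line;                         // the same variable is reused for every line
            while ((line = reader.readLine()) != null) {
                total += line.length();          // "process" the current line
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is a hypothetical path to a large text file.
        try (Reader in = Files.newBufferedReader(Paths.get(args[0]))) {
            System.out.println(countChars(in));
        }
    }
}
```

Memory use here depends on the longest line, not on the size of the file.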
 
Ranch Hand
Posts: 898


Java has no limit - the JLS does not specify a limit.


But JVM specs do:

From Bill Venners' "The lean, mean, virtual machine: An introduction to the basic structure and functionality of the Java Virtual Machine":

The size of an address in the JVM is 32 bits. The JVM can, therefore, address up to 4 gigabytes (2 to the power of 32) of memory, with each memory location containing one byte. Each register in the JVM stores one 32-bit address. The stack, the garbage-collected heap, and the method area reside somewhere within the 4 gigabytes of addressable memory. The exact location of these memory areas is a decision of the implementor of each particular JVM.


From JVM Specs, 1.1:
While the Java Virtual Machines would appear to be limited by the bytecode definition to running on a 32-bit address space machine, it is possible to build a version of the Java Virtual Machine that automatically translates the bytecodes into a 64-bit form. A description of this transformation is beyond the scope of this specification.


I would like to understand whether heap can be more than RAM+"Total paging size"

For example, on my Windows XP machine:
Total paging size for all drives (in Windows XP it is under Control Panel – System – Advanced – Performance – Settings – Advanced) is 386 MB
RAM is 256 MB
So the total is roughly 640 MB.
Could all of you who "crunched" your hard disks with such patience, and first of all Arron Zhang, kindly post those figures for your PCs?
I propose to rename "Performance" forum to "JVM
[ January 14, 2003: Message edited by: yidanneuG ninaV ]
 
Jim Yingst
Wanderer
Posts: 18671
True - the JVM specs limit the JVM internal memory to 4 GB. This is more fundamental than the String API limits described above, since we can imagine a Java implementation which violates parts of the String API but otherwise works as expected - but even if a JVM contains more than 4 GB of memory, compiled bytecode will never access it unless that compiled code uses a new format with addresses longer than 32 bits. This is such a fundamental change that it would completely break compatibility with other Java implementations.
I would like to understand whether heap can be more than RAM+"Total paging size"
Probably it can't. However I tired of my test long before it came close to my total available paging size, so I can't confirm this at the moment.
 