
Convert InputStream to byte array

 
Monty Guppy
Ranch Hand
Posts: 49
Hi, can someone suggest the most efficient way to convert an InputStream to a byte[]? I want to avoid using StringBuffer, since for larger files it gives me an OutOfMemoryError. Any code help would be greatly appreciated. Thanks.
 
Peter Chase
Ranch Hand
Posts: 1970
You have ByteArrayOutputStream. Create a ByteArrayOutputStream, then use a loop to repeatedly read a chunk of the InputStream and write it to the ByteArrayOutputStream. When finished, call toByteArray() to get the byte[]. A disadvantage is that toByteArray() returns a copy of the internal byte array, not the internal array itself, which means that memory must be able to hold twice the number of bytes.
The internal byte array of ByteArrayOutputStream is a protected member. I have made a subclass that makes it publicly accessible via a getter method. This eliminates the doubling of memory consumption. However, you must be sure that the ByteArrayOutputStream is not used again once its array has been accessed by other code.
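A minimal sketch of both variants described above, not Peter's actual code. The names `StreamCopySketch`, `ExposedByteArrayOutputStream`, `readAll` and `readAsStream` are made up for illustration, and the 8 KB chunk size is an arbitrary choice.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class StreamCopySketch {

    /** Hypothetical subclass that exposes the protected internal array. */
    static final class ExposedByteArrayOutputStream extends ByteArrayOutputStream {
        /** Returns the live internal array; only the first size() bytes are valid. */
        synchronized byte[] internalBuffer() {
            return buf;
        }
    }

    /** Copies the stream into the sink one chunk at a time. */
    static void copy(InputStream in, ByteArrayOutputStream sink) throws IOException {
        byte[] chunk = new byte[8192]; // arbitrary chunk size
        int n;
        while ((n = in.read(chunk)) != -1) {
            sink.write(chunk, 0, n);
        }
    }

    /** Plain variant: toByteArray() returns a copy, so the data is briefly held twice. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        copy(in, sink);
        return sink.toByteArray();
    }

    /** No-copy variant: wraps the internal array. The sink must not be written to afterwards. */
    static ByteArrayInputStream readAsStream(InputStream in) throws IOException {
        ExposedByteArrayOutputStream sink = new ExposedByteArrayOutputStream();
        copy(in, sink);
        return new ByteArrayInputStream(sink.internalBuffer(), 0, sink.size());
    }
}
```

Note that the internal array is usually longer than size(), because ByteArrayOutputStream grows it in steps, so the no-copy variant avoids the second copy but not the slack at the end of the array.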
 
Monty Guppy
Ranch Hand
Posts: 49
Thanks for your response Peter. Following your suggestion I came up with:
InputStream in = xmlClob().getAsciiStream();
int c;
while ((c = in.read()) != -1) {
    byteArrayOutputStream.write((char) c);
}
ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(byteArrayOutputStream.toByteArray());
But as you mentioned, the memory constraint remains this way. If the ultimate goal is to convert the InputStream to a ByteArrayInputStream, is it possible to use a buffer in the above code to make it more memory efficient?
Thanks again.
 
jason adam
Chicken Farmer
Ranch Hand
Posts: 1932
Prateek, since you're getting a conversation going here, I'm going to close the other thread. Please refrain from multiple postings about the same issue.
And as I mentioned in the other post, there are buffered input and output streams; you might want to look into using those.
[ October 24, 2003: Message edited by: jason adam ]
 
Monty Guppy
Ranch Hand
Posts: 49
Jason,
I tried to use BufferedInputStream and BufferedOutputStream. For the 8MB of data returned by my query, this failed. Here is the code:
int c;
BufferedInputStream bufIn = new BufferedInputStream(clob.getAsciiStream(), 512);
ByteArrayOutputStream bAOut = new ByteArrayOutputStream();
BufferedOutputStream bufOut = new BufferedOutputStream(bAOut, 512);
while ((c = bufIn.read()) != -1) {
    bufOut.write((char) c);
}
Am I doing something wrong here? Any other ideas?
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
"this failed."
In what way? Did you get an error message? Was the data invalid? Something else?
Looks like the data is coming from a java.sql.Clob. Are you certain that an ascii stream will be appropriate? If the characters you're reading come from any non-English languages, you may need to use getCharacterStream() to read into a char[] array instead. (This will usually take more memory, but it should work correctly, which is probably more important.) If you're using a byte[] you need to know for sure what encoding is being used. If you don't, you may well get data that looks like corrupt gibberish. So find out what the encoding really is, or use getCharacterStream() instead.
If your problems are still with memory usage: first, are you sure you really need the whole object in memory at once? For many applications, you can just read some data, process it, write it to a file or socket or DB or whatever, then read some more data. Keep looping until you're done. If you can organize the processing so this is possible, do it. If not, if you really need all the data in memory at once, then for one thing you can save some space by knowing how big the byte[] or char[] array needs to be in the first place, and allocating it in advance. Then just read directly into the array, with no need for any other ByteArrayOutputStream or even BufferedInput/Output streams. (If you're using the preallocated array correctly, you're doing your own "buffering" at the final destination; the other buffers become redundant.) Here's what this can look like, using an ascii stream (where 1 byte = 1 character):
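The code posted at this point is missing from this copy of the thread. What follows is only a minimal sketch of the preallocated-array idea described above, not Jim's original code. The class and method names are hypothetical; with a Clob, the stream would presumably come from clob.getAsciiStream() and the length from clob.length(), assuming one byte per character.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class PreallocatedBytesSketch {

    /**
     * Reads exactly length bytes from in into a single array allocated up front.
     * For a Clob, in would be clob.getAsciiStream() and length would be
     * (int) clob.length() (an assumption: 1 byte per character).
     */
    static byte[] readFully(InputStream in, int length) throws IOException {
        byte[] array = new byte[length];
        int offset = 0;
        // read() may return fewer bytes than requested, so loop until the array is full
        while (offset < length) {
            int n = in.read(array, offset, length - offset);
            if (n == -1) {
                throw new EOFException("stream ended after " + offset + " of " + length + " bytes");
            }
            offset += n;
        }
        return array;
    }
}
```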

On completion, array should be filled with the data you read, and no extra objects were allocated. Or if you need to use character streams and chars:
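This code block is also missing from this copy of the thread. A minimal sketch of the character-based equivalent, again under assumptions and not Jim's original: the names are hypothetical, and with a Clob the Reader would presumably come from clob.getCharacterStream() with the length taken from clob.length().

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.Reader;

final class PreallocatedCharsSketch {

    /**
     * Reads exactly length chars from reader into a single array allocated up front.
     * For a Clob, reader would be clob.getCharacterStream() and length would be
     * (int) clob.length().
     */
    static char[] readFully(Reader reader, int length) throws IOException {
        char[] array = new char[length];
        int offset = 0;
        // read() may return fewer chars than requested, so loop until the array is full
        while (offset < length) {
            int n = reader.read(array, offset, length - offset);
            if (n == -1) {
                throw new EOFException("reader ended after " + offset + " of " + length + " chars");
            }
            offset += n;
        }
        return array;
    }
}
```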

This will take twice as much memory, but hey, at least it will be correct.
[ October 24, 2003: Message edited by: Jim Yingst ]
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Come to think of it, if you do end up needing chars rather than bytes, you can just use
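The one-liner posted here is missing from this copy of the thread. A plausible candidate, offered as a guess rather than as what Jim wrote, is Clob.getSubString, which returns a range of the Clob as a String. A minimal sketch, with a hypothetical class and method name:

```java
import java.sql.Clob;
import java.sql.SQLException;

final class ClobToStringSketch {

    /** Returns the whole Clob as one String (assumes it fits in memory). */
    static String readAll(Clob clob) throws SQLException {
        long length = clob.length();
        if (length == 0) {
            return "";
        }
        if (length > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("Clob too large for a single String");
        }
        // Clob positions are 1-based
        return clob.getSubString(1, (int) length);
    }
}
```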

Should be approximately equivalent to the char[] method.
[ October 27, 2003: Message edited by: Jim Yingst ]
 
Monty Guppy
Ranch Hand
Posts: 49
Thanks a bunch Jim for your suggestions.
I had to rush my release last week, and it did solve most of my problems, though with some really huge data (>5MB). The original problem with the buffered streams (which I forgot to mention in the last post) was a java.lang.OutOfMemoryError.
To answer your other question about whether I need all the data in memory at once: I am reasonably (not 100%) sure that I do. I pass the XML returned by my Clob, in the form of a ByteArrayInputStream (BAIS), to my SAX parser to generate a CSV document. The CSV document is built as the BAIS is being parsed:
SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
saxParser.parse(byteArrayInputStream, this);
public void startDocument() {------}
public void endDocument() {------}
public void startElement(String uri, String name, String qName, Attributes atts) {------}
public void endElement(String uri, String name, String qName) {---}
public void characters(char ch[], int start, int length) {----}
Thanks again.
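Since SAXParser.parse accepts any InputStream, one option that may be worth trying is to hand the Clob's stream to the parser directly, so the document never has to sit in a byte[] at all. Below is a minimal sketch under stated assumptions: the XML layout (a root element containing row elements whose children are the fields), the class name `CsvSaxSketch` and the method `toCsv` are all hypothetical, CSV quoting is not handled, and a real handler would probably write each line to a Writer instead of collecting it in a StringBuilder.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class CsvSaxSketch extends DefaultHandler {
    private final StringBuilder csv = new StringBuilder();  // output; a Writer in real use
    private final StringBuilder text = new StringBuilder(); // text of the current element
    private int depth = 0;             // 1 = root, 2 = row, 3 = field (assumed layout)
    private boolean firstField = true;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        depth++;
        text.setLength(0);
        if (depth == 2) {
            firstField = true; // a new row starts
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (depth == 3) {          // end of a field: emit its value
            if (!firstField) {
                csv.append(',');
            }
            csv.append(text.toString().trim());
            firstField = false;
        } else if (depth == 2) {   // end of a row: terminate the line
            csv.append('\n');
        }
        text.setLength(0);
        depth--;
    }

    /** Parses the stream as it is read; xml could be clob.getAsciiStream(). */
    static String toCsv(InputStream xml)
            throws ParserConfigurationException, SAXException, IOException {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        CsvSaxSketch handler = new CsvSaxSketch();
        parser.parse(xml, handler);
        return handler.csv.toString();
    }
}
```

With this shape, memory use depends on the size of one row rather than the whole document, provided the CSV output is written out as it is produced.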
 