
Large File Upload

 
Harry Anan
Greenhorn
Posts: 3
Dear Experts,

I have a requirement where a large file (100-200 MB) is uploaded from the client to a content management system. I am using a servlet with the Apache Commons FileUpload API. Apache FileUpload has two ways of handling files:

1) Non-Streaming
2) Streaming

Currently I use the non-streaming approach, where the servlet stores the file in a temp location and then uploads it into the content management system. This is taking a lot of time, so I am trying to implement the streaming API.

The content management API supports streaming through two methods:
a) SetContent - takes the file's ByteArrayOutputStream as input. This throws an OutOfMemoryError because the file is so large.
b) AppendContent - takes the file's ByteArrayOutputStream as input. This method can be called multiple times to upload the large file, but I don't know how to do this. Apache FileUpload gives me an InputStream of the file, and I need to split that into chunks and append them into the content management system.

Can someone guide me on how to convert the InputStream into 4 KB ByteArrayOutputStream chunks so that I can use the AppendContent method of the content management API?

Thanks in advance
 
Paul Clapham
Sheriff
Posts: 21137
You mean, like, writing a simple loop which reads bytes from the InputStream and writes them to the ByteArrayOutputStream? Or is there something else besides just copying the data which you are asking about?
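(In code, the simple copy loop being described here would look something like the sketch below; the class and variable names are just placeholders, not anything from the thread.)

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class StreamCopy {
    // Reads up to 4096 bytes at a time from 'in' and writes them into a ByteArrayOutputStream.
    // Note: this buffers the entire stream in memory, so for a 100+ MB upload you would
    // append per chunk instead (see the corrected sketch further down the thread).
    public static ByteArrayOutputStream copy(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            out.write(buffer, 0, bytesRead);
        }
        return out;
    }
}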
 
Harry Anan
Greenhorn
Posts: 3
Yes. Reading 4096 bytes from the InputStream in a loop, creating a ByteArrayOutputStream out of those 4096 bytes, and appending that into the content management system. I am using read(byte[] b, int off, int len), but it looks like an infinite loop: no errors in the servlet even after 20 minutes. This is the snippet from my servlet - please let me know if I missed something.





Also, is this going to take a lot of CPU time because of reading 100+ MB in 4 KB chunks?

Thanks a lot
 
Paul Clapham
Sheriff
Posts: 21137
Harry Anan wrote:... but it looks like an infinite loop: no errors in the servlet even after 20 minutes. This is the snippet from my servlet - please let me know if I missed something.

Also, is this going to take a lot of CPU time because of reading 100+ MB in 4 KB chunks?


It "looks like" infinite loop? You haven't put debugging statements in there to see what's really going on? Better still you should be debugging this code in a standalone Java class, rather than using a servlet container as your test base.

And yes, it's going to take a lot of CPU time to read 100 MB of data. And more to the point, it's going to take a lot of elapsed time in your case because the data is coming slowly over the network. You should expect your code which processes the upload to run faster than the data arrives, so don't waste your time worrying about CPU time.

As for the code, I don't understand why you create a new ByteArrayOutputStream for each chunk of data you read. You could be creating a new ByteArrayOutputStream for each byte in the worst case. That code should be outside the loop which fills the buffer. And you're always writing 4096 bytes from the buffer even if you didn't read 4096 bytes from the buffer. This will give you extra junk at the end of the file in most cases. (In your code you break out of the loop, so you throw away the last buffer, but that should be fixed if you put the writing outside the loop.)
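(For later readers: a minimal sketch of the corrected loop Paul describes, reusing one ByteArrayOutputStream and writing only the bytes actually read. The ContentService interface and its appendContent call are stand-ins for whatever the content management API really provides, not a real library.)

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class ChunkedAppend {

    // Stand-in for the content management API mentioned in the thread.
    interface ContentService {
        void appendContent(ByteArrayOutputStream chunk);
    }

    public static void appendInChunks(InputStream in, ContentService cms) throws IOException {
        byte[] buffer = new byte[4096];
        ByteArrayOutputStream chunk = new ByteArrayOutputStream(4096); // created once, reused
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            chunk.reset();                      // reuse the same stream instance
            chunk.write(buffer, 0, bytesRead);  // write only the bytes actually read
            cms.appendContent(chunk);           // append this chunk to the content management system
        }
    }
}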
 
Harry Anan
Greenhorn
Posts: 3
Paul Clapham wrote: I don't understand why you create a new ByteArrayOutputStream for each chunk of data you read. You could be creating a new ByteArrayOutputStream for each byte in the worst case. That code should be outside the loop which fills the buffer.


Bullseye!!

Thanks a lot for the correction, Paul. After reusing the same stream instance, I am able to upload a 150 MB file in 3 minutes, which is simply great.





 
Nguyen Ninh
Greenhorn
Posts: 4
Harry Anan, I ran into the same problem, but I don't know how to resolve it (using the FileUpload API, content management API...).
So, could you give me a code snippet?
Thanks.
 
venkata Silla
Greenhorn
Posts: 15
Nguyen Ninh wrote: Harry Anan, I ran into the same problem, but I don't know how to resolve it (using the FileUpload API, content management API...).
So, could you give me a code snippet?
Thanks.



Hi Ninh,

I have the same requirement to use the Apache FileUpload API to upload large files (in MB) over HTTP. Could you please suggest any references or source code? Thanks a lot.
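(Not an official answer, but for anyone still looking for a snippet: a sketch of the servlet side using the Commons FileUpload streaming API, feeding each uploaded file into the chunked append shown earlier. The ContentService/appendContent pieces remain placeholders for the content management API, and error handling is omitted.)

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

import javax.servlet.http.HttpServletRequest;

import org.apache.commons.fileupload.FileItemIterator;
import org.apache.commons.fileupload.FileItemStream;
import org.apache.commons.fileupload.FileUploadException;
import org.apache.commons.fileupload.servlet.ServletFileUpload;

public class StreamingUploadHelper {

    // Stand-in for the content management API from the thread.
    interface ContentService {
        void appendContent(ByteArrayOutputStream chunk);
    }

    public void processUpload(HttpServletRequest request, ContentService cms)
            throws IOException, FileUploadException {
        ServletFileUpload upload = new ServletFileUpload();      // streaming mode, no item factory
        FileItemIterator iter = upload.getItemIterator(request);
        while (iter.hasNext()) {
            FileItemStream item = iter.next();
            if (!item.isFormField()) {                           // only process actual file parts
                InputStream in = item.openStream();
                try {
                    byte[] buffer = new byte[4096];
                    ByteArrayOutputStream chunk = new ByteArrayOutputStream(4096);
                    int bytesRead;
                    while ((bytesRead = in.read(buffer)) != -1) {
                        chunk.reset();
                        chunk.write(buffer, 0, bytesRead);
                        cms.appendContent(chunk);                // append each 4 KB (or smaller) chunk
                    }
                } finally {
                    in.close();
                }
            }
        }
    }
}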
 