DataInputStream reading less than buffer size

 
Anand M Kulkarni
Greenhorn
Posts: 1
I have a binary file on the Hadoop Distributed File System that I want to read. I am using FSDataInputStream (which extends DataInputStream). I have a buffer of length "len", and I call readBytes = stream.read(buffer) to read "len" bytes from the file into the buffer.
But the actual number of bytes read (readBytes) is less than the buffer size (len), even though I know that there are "len" bytes present in the file.
So why does FSDataInputStream read fewer bytes than I ask it to read? Any ideas?
 
Paul Clapham
Sheriff
Posts: 22185
From the API documentation for DataInputStream.read(byte[] b):

Reads some number of bytes from the contained input stream and stores them into the buffer array b. The number of bytes actually read is returned as an integer.


You'll notice that it doesn't promise that the buffer array will be filled up. In fact it clearly implies that it might not be filled up. So your program should not ever assume that the buffer will be full after this method is called.

As for why the buffer might not be filled up -- no doubt there's a reason, but what good will it do you to know that reason? Once you find it out, you will still have to write the same code you would have to write if you didn't know the reason.
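For illustration, here is a minimal sketch of the kind of code that copes with short reads: keep calling read() until the buffer is full or the stream ends. The class name ReadLoop, the helper readUpTo, and the anonymous stream that hands back at most 100 bytes per call are all hypothetical stand-ins, not anything from Hadoop or from the original poster's program.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadLoop {
    /**
     * Reads until the buffer is full or the stream ends.
     * Returns the number of bytes actually stored in the buffer.
     */
    public static int readUpTo(InputStream in, byte[] buffer) throws IOException {
        int total = 0;
        while (total < buffer.length) {
            // Ask only for the space that is still free in the buffer.
            int n = in.read(buffer, total, buffer.length - total);
            if (n == -1) {
                break; // end of stream reached before the buffer was filled
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for a stream that delivers short reads:
        // it never returns more than 100 bytes per call.
        InputStream in = new ByteArrayInputStream(new byte[1000]) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 100));
            }
        };
        byte[] buffer = new byte[1000];
        System.out.println(readUpTo(in, buffer)); // prints 1000
    }
}
```

The return value still has to be checked: if it is smaller than the buffer length, the stream really did end early.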
 
Mike Simmons
Ranch Hand
Posts: 3090
If only there were a readFully() method that was part of the DataInputStream API. That would solve this problem.

Oh, wait - there is.

Never mind.

There are various reasons why a normal read() method might return with fewer bytes than you are expecting. Maybe there's a hardware buffer somewhere that's smaller than the buffer you're using. Maybe the file is fragmented and it's going to take several milliseconds for the disk to get to the next fragment. Maybe you're reading a networked drive and the network is slow. But as Paul says, it doesn't really matter. You just have to assume that it may give you an incomplete read, like it says.
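A rough sketch of the readFully() approach, assuming the stream is a plain DataInputStream (FSDataInputStream inherits the method, per the original question). The class name ReadFullyDemo and the helper fill are made up for the example, and a ByteArrayInputStream stands in for the real file.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class ReadFullyDemo {
    /**
     * Fills the whole buffer, or returns false if the stream ended first.
     * readFully() blocks until buffer.length bytes have been read and
     * throws EOFException if the stream runs out before that.
     */
    public static boolean fill(DataInputStream in, byte[] buffer) throws IOException {
        try {
            in.readFully(buffer);
            return true;
        } catch (EOFException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[8]; // stand-in for the file contents

        DataInputStream enough = new DataInputStream(new ByteArrayInputStream(data));
        System.out.println(fill(enough, new byte[8]));   // prints true

        DataInputStream tooShort = new DataInputStream(new ByteArrayInputStream(data));
        System.out.println(fill(tooShort, new byte[16])); // prints false
    }
}
```

Note that when readFully() throws EOFException, the buffer may be partly filled, so this only suits cases where you know how many bytes should be there.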
 
Seetharaman Venkatasamy
Ranch Hand
Posts: 5575
Welcome to JavaRanch, Anand M Kulkarni.
 