
Legacy Data File

 
William Jefferson
Greenhorn
Posts: 3
I've been steadily learning Java with the goal of replacing a legacy Visual Basic app with Java. Now I'm at the good part: file I/O. But my historical VB files have integers stored as integer fields. So far, everything I've read about file input in Java makes me think all data is stored as character strings, usually separated by a delimiter.

First, is that true? Care to elaborate for a newbie?

Second, is there any way to read an old-style integer field from a file in Java?

I assume I can use VB to convert this old data into a Java-readable format. These are what I have always called "flat" files. I read the file once and build an array index based on customer name, display these names and allow the user to jump from customer to customer. There are multiple data records per customer, but only one customer name record per customer. In VB I had a fixed-length record and could seek on record number. Am I still going to be able to do this in Java? Again, care to elaborate for a newbie?

thanks much


 
Steve Luke
Bartender
Posts: 4181
22
IntelliJ IDE Java Python
William Jefferson wrote: I've been steadily learning Java with the goal of replacing a legacy Visual Basic app with Java. Now I'm at the good part: file I/O. But my historical VB files have integers stored as integer fields. So far, everything I've read about file input in Java makes me think all data is stored as character strings, usually separated by a delimiter.

First, is that true? Care to elaborate for a newbie?


No, that isn't true. All data is read in as a sequence of bytes. The most common tools interpret those bytes as text because so many file reads are for text. The tools designed around text, strings, and character arrays are usually called 'Reader', like BufferedReader or FileReader. The base input tools are called 'InputStream' and give you bytes. One input stream implementation is DataInputStream, which has methods for reading ints, floats, chars, etc. from a stream. Note, though, that an Integer in VB and an int in Java might be different: the byte order or word order, as well as the byte count, could differ and give you unexpected results. DataInputStream always reads multi-byte values high byte first (big-endian), so check how your VB files were actually written.

That is one of the reasons why using text to pass data between applications is so popular: text encodings have standard definitions, so if you write and read with the same encoding, the data you read will be the same as the data you wrote.

Second, is there any way to read an old-style integer field from a file in Java?
See above. Yes, either by:
1) Using the DataInputStream if your data is compatible
2) Using any type of 'InputStream' to read the bytes in and putting the bytes together in the correct order
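
A minimal sketch of the two options above. It assumes the legacy file stores a 2-byte integer with the low byte first (little-endian), which is an assumption to verify against the real VB files. The class name LegacyIntReader and its methods are made up for illustration, and ByteBuffer from java.nio is used here as one convenient way to "put the bytes together in the correct order".

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class LegacyIntReader {
    // Option 1: DataInputStream.readShort() expects the high byte first (big-endian).
    static short readBigEndianShort(byte[] raw) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            return in.readShort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Option 2: take the raw bytes and state the byte order explicitly.
    static short readLittleEndianShort(byte[] raw) {
        return ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).getShort();
    }

    public static void main(String[] args) {
        // 12345 is 0x3039; stored low byte first it is the byte pair 0x39, 0x30.
        byte[] raw = {0x39, 0x30};
        System.out.println(readLittleEndianShort(raw)); // 12345
        System.out.println(readBigEndianShort(raw));    // 14640, the "unexpected result"
    }
}
```

If the two methods disagree on your own data, the value that matches what the VB app shows tells you which byte order the file uses.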


I assume I can use VB to convert this old data into a Java-readable format. These are what I have always called "flat" files. I read the file once and build an array index based on customer name, display these names and allow the user to jump from customer to customer. There are multiple data records per customer, but only one customer name record per customer. In VB I had a fixed-length record and could seek on record number. Am I still going to be able to do this in Java? Again, care to elaborate for a newbie?


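Sure, look at the different classes available in the java.io package. There are tools that let you skip or seek within a file: every InputStream has skip(), and RandomAccessFile has seek() for jumping to an arbitrary byte position. Choose one that has the functionality you need.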
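
As a rough sketch of seeking by record number with a fixed-length record, here is one way to do it with java.io.RandomAccessFile. The 8-byte record layout (two ints), the class name RecordFile, and the sample values are all invented for the example; a real legacy record would have its own layout and length. Note that record numbers here start at 0, whereas VB random-access files count from 1.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

class RecordFile {
    // Hypothetical layout: a 4-byte customer id followed by a 4-byte amount.
    static final int RECORD_LENGTH = 8;

    static void writeRecord(RandomAccessFile file, long recordNumber, int id, int amount)
            throws IOException {
        file.seek(recordNumber * RECORD_LENGTH); // byte offset of the record
        file.writeInt(id);
        file.writeInt(amount);
    }

    static int[] readRecord(RandomAccessFile file, long recordNumber) throws IOException {
        file.seek(recordNumber * RECORD_LENGTH);
        return new int[] {file.readInt(), file.readInt()};
    }

    // Writes three records to a temporary file, then jumps straight to the second one.
    static int[] demo() {
        try {
            File temp = File.createTempFile("records", ".dat");
            try (RandomAccessFile file = new RandomAccessFile(temp, "rw")) {
                writeRecord(file, 0, 101, 1000);
                writeRecord(file, 1, 102, 2500);
                writeRecord(file, 2, 103, 400);
                return readRecord(file, 1);
            } finally {
                temp.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] record = demo();
        System.out.println(record[0] + " " + record[1]); // 102 2500
    }
}
```

RandomAccessFile reads and writes ints high byte first, the same as DataInputStream, so data written by VB may still need its bytes reordered after the seek.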

You might also look into the java.nio package. It is a little more complicated, but there are some great tools and good tutorials out there as well.
 
William Jefferson
Greenhorn
Posts: 3

Thanks for the reply, Steve. At this point my ignorance is overwhelming. Guess my next step is to write some integers to a file and then look at them. When I read about DataInputStream, it still seems as though the actual data storage on the hard drive is going to be numbers stored one digit per byte, as if they were characters in a text field. Because it says you can have commas in the number. You can't store commas in numbers the way the old-time file systems were written. The old systems were hot on space savings and could store a 5-digit integer in two bytes on the hard drive.

I can't convert my old files until I'm sure of how data is stored in Java. So I guess I'll play with file i-o for a while.

thanks again

other comments still welcome ;-)



 
Steve Luke
Bartender
Posts: 4181
William Jefferson wrote:
Thanks for the reply, Steve. At this point my ignorance is overwhelming. Guess my next step is to write some integers to a file and then look at them. When I read about DataInputStream, it still seems as though the actual data storage on the hard drive is going to be numbers stored one digit per byte, as if they were characters in a text field.


Yes, all data on a disk (in Java or any other language) is written as bytes, but that does not mean one digit per byte as if it were text. You read the bytes in and convert them into your data structure. For Java, an int is 4 consecutive bytes in a specific byte order, a long is 8 bytes, and so on. Data in a computer is always stored in binary: each binary digit is called a bit, and 8 bits make a byte. By convention, data is stored in bytes, and the language and platform define how many bytes represent a number and in what order. If you wrote anything to a file in any language, you wrote bytes to the file.
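
One way to see the difference for yourself without opening a hex editor is to write the same number both ways into memory and count the bytes. This is only an illustrative sketch; the class name StorageSizes and its helper methods are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class StorageSizes {
    // Binary form: DataOutputStream.writeInt always emits exactly 4 bytes, high byte first.
    static byte[] asBinaryInt(int value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Text form: one byte per digit character when encoded as ASCII.
    static byte[] asText(int value) {
        return Integer.toString(value).getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(asBinaryInt(12345).length); // 4
        System.out.println(asText(12345).length);      // 5
    }
}
```

The binary form of 12345 is the four bytes 00 00 30 39, while the text form is the five character codes for '1' through '5'.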


Because it says you can have commas in the number.


Really? I looked through the API for DataInputStream (text search) and found no reference to commas. Can you point to where you saw that?

You can't store commas in numbers the way the old timey file systems were written. The old systems were hot on space savings and could store a 5 digit integer into two bytes on the hard drive.


That's correct. Two bytes give you 65536 different values, and since Java uses signed data types, that is a range of values from -32768 to 32767. A two-byte integer type in Java is called a short, and you can get it out of a DataInputStream using readShort(). If your data is not in the appropriate order for a Java short (high byte first), you would read two bytes in from an InputStream and put them together like (short)((upperByte << 8) | (lowerByte & 0xff)).
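
Here is the expression above wrapped in a small runnable sketch. It assumes the file stores the low byte first, which would need checking against the actual VB data; the class name LittleEndianShorts and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class LittleEndianShorts {
    // Reads two bytes, low byte first, and combines them into a signed 16-bit value.
    static short readShortLowByteFirst(InputStream in) throws IOException {
        int lowerByte = in.read();  // read() returns 0..255, or -1 at end of stream
        int upperByte = in.read();
        if (lowerByte < 0 || upperByte < 0) {
            throw new EOFException("fewer than two bytes left");
        }
        return (short) ((upperByte << 8) | (lowerByte & 0xff));
    }

    // Convenience wrapper for trying the method on an in-memory byte array.
    static short decode(byte[] raw) {
        try (InputStream in = new ByteArrayInputStream(raw)) {
            return readShortLowByteFirst(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(new byte[] {0x39, 0x30}));               // 12345
        System.out.println(decode(new byte[] {(byte) 0xFE, (byte) 0xFF})); // -2
    }
}
```

The cast to short is what turns the bit pattern 0xFFFE into -2 instead of 65534, matching the signed range described above.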




 
William Jefferson
Greenhorn
Posts: 3
William Jefferson wrote:

Because it says you can have commas in the number.


Steve Luke wrote:
Really? I looked through the API for DataInputStream (text search) and found no reference to commas. Can you point to where you saw that?


Under Scanner in the Oracle Java tutorial trail. Yes, I now see that is for character streams and not data streams. I've been reading too much and not coding enough. Time to change that, but your replies have certainly saved me many hours of trial and error, especially since I didn't even see the Data Streams topic until I went back today looking for the "commas" quote.

thanks for your patience and helpful explanations
 