All files are composed of an ordered sequence of bytes. There is no such thing as a "text file" unless you provide a definition of one beforehand, but that can be said for anything. Some other APIs implicitly provide that definition; for example, java.io.Reader, which hands you character data, with a character encoding defining how the underlying bytes are converted. Maybe this fits your specific definition of "text file"? java.io.InputStream knows only of ordered bytes.
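To make the distinction concrete, here is a minimal sketch (not the only way to do it) that reads the same data once as bytes and once as characters. The class name BytesVersusChars, the helper methods, and the sample string are made up for illustration, and UTF-8 is an assumed encoding; the stream and reader classes are the standard JDK ones.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class BytesVersusChars {
    // InputStream view: just an ordered sequence of bytes, no notion of text.
    static int countBytes(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    // Reader view: the charset is the "definition" that turns bytes into chars.
    static int countChars(InputStream in, Charset charset) throws IOException {
        int count = 0;
        Reader reader = new InputStreamReader(in, charset);
        while (reader.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample file: "caf" plus e-acute, written as UTF-8.
        Path file = Files.createTempFile("sample", ".txt");
        Files.write(file, "caf\u00e9".getBytes(StandardCharsets.UTF_8));

        try (InputStream in = Files.newInputStream(file)) {
            System.out.println("bytes: " + countBytes(in)); // prints "bytes: 5"
        }
        try (InputStream in = Files.newInputStream(file)) {
            System.out.println("chars: " + countChars(in, StandardCharsets.UTF_8)); // prints "chars: 4"
        }
        Files.delete(file);
    }
}
```

The file is the same in both cases; only the second read imposes an interpretation on it.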
posted 14 years ago
Thanks for the response.
I would like to know exactly how the ordered sequence of bytes is read from a file. Does anyone know of any resources that provide this information?
Most OSs these days use the eight-bit byte as the basic unit of file structure. Ultimately, files contain bytes, and it's up to software to interpret what those bytes mean.
A significant portion of the world's computing is done using ASCII, a 7-bit code in which one byte corresponds to one character. Java is more sophisticated in that it uses a 16-bit char, which can represent most of the world's alphabets directly (rarer characters take a pair of chars). But the OS still delivers bytes. It's up to Java to decide how to convert those bytes into characters. It does so using a character encoding. There are many possible encodings, and different ones are used in different parts of the world. The "UTF-8" encoding is a common one. For plain ASCII text it's exactly the same as ASCII: each byte below 128 is promoted to a character. Bytes with the high bit set are the special ones; they only appear as part of multi-byte sequences, two to four bytes long, which encode all the other characters.
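As a rough illustration of the above, the following sketch shows how many bytes UTF-8 needs for a few sample characters, and how the same two bytes decode differently under two encodings. The class name EncodingDemo, the helper methods, and the sample code points are chosen purely for the example.

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Number of bytes UTF-8 uses to encode a single Unicode code point.
    static int utf8Length(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of 16-bit Java chars needed to hold the same code point.
    static int javaChars(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        // 'A', e-acute, the euro sign, and an emoji outside the 16-bit range.
        int[] samples = {0x41, 0xE9, 0x20AC, 0x1F600};
        for (int cp : samples) {
            System.out.printf("U+%04X: %d UTF-8 byte(s), %d Java char(s)%n",
                    cp, utf8Length(cp), javaChars(cp));
        }

        // The same two bytes, interpreted under two different encodings.
        byte[] bytes = {(byte) 0xC3, (byte) 0xA9};
        String asUtf8 = new String(bytes, StandardCharsets.UTF_8);        // one char: e-acute
        String asLatin1 = new String(bytes, StandardCharsets.ISO_8859_1); // two unrelated chars
        System.out.println(asUtf8.length() + " vs " + asLatin1.length()); // prints "1 vs 2"
    }
}
```

The bytes never change; the encoding you pick decides what characters come out.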
Anyway: the JDK Javadocs themselves contain a lot of information about character encodings. Read the pages for java.lang.Character, java.io.Reader, java.io.Writer, and the other pages those refer to.