How to read a whole file at once as a String instead of as bytes

 
Greenhorn
Posts: 25
Hi everyone,

I have a project in which I have to develop some algorithms and then apply them to files. Since the project is too big (more than 2000 lines of code), I cannot post all of the code here. The algorithms work fine and I get the results I want, but I am writing because I read the file as an array of bytes. Let's say I have a file containing the text "Hello World". What I get back is the ASCII codes of those words. What I actually want is the words themselves rather than their ASCII representation. Below is one of my classes, where I think I need to make a change.



In my main method I use this file like this:



Every method I then call from main operates on the byte array of the file. I want my methods to operate on the file's characters (decoded with a charset) rather than on its bytes. I also want to read the whole file at once, not word by word.
 
Marshal
Posts: 79177
Go through the Java™ Tutorials, where you will find that you should not use classes called XXXStream for reading text files. You should use classes called XXXReader, or (simpler) a Scanner. You can read the individual lines as Strings and put them into a List.
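
As a rough illustration of the Scanner suggestion, here is a minimal sketch that reads every line of a text file into a List. The class name ScannerLines, the method readLines and the choice of UTF-8 are assumptions made for the example, not part of the original poster's code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper class, named only for this example.
class ScannerLines {

    // Reads each line of the file as a String, decoding the bytes as UTF-8 (assumed encoding).
    static List<String> readLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(file, StandardCharsets.UTF_8.name())) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Create a small throwaway file so the example is self-contained.
        Path file = Files.createTempFile("scanner-demo", ".txt");
        Files.write(file, "Hello World\nsecond line\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readLines(file)); // prints [Hello World, second line]
        Files.delete(file);
    }
}
```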
 
Rancher
Posts: 4801

Campbell Ritchie wrote:You can read the individual lines as Strings and put them into a List.



Funnily enough, Files has a method called readAllLines that does just that, though its documentation says it is not intended for reading large files.
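
A short sketch of what that could look like, alongside a way to get the whole file as a single String. The class and method names are made up for the example and UTF-8 is an assumed encoding. On Java 11 or later, Files.readString(path) is a shorter alternative to the readAllBytes plus new String combination used here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper class, named only for this example.
class ReadWholeFile {

    // All lines at once, one String per line (line terminators are dropped).
    static List<String> allLines(Path file) throws IOException {
        return Files.readAllLines(file, StandardCharsets.UTF_8);
    }

    // The whole file as one String, line terminators included.
    static String allText(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("whole-file-demo", ".txt");
        Files.write(file, "Hello World\nsecond line\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(allLines(file));
        System.out.println(allText(file));
        Files.delete(file);
    }
}
```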
 
Saloon Keeper
Posts: 15510
For bigger files, you should use BufferedReader.lines().
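
A minimal sketch of that idea: the lines are streamed lazily from the reader, so the whole file does not have to be held in memory at once. The class LazyLines, the method countNonBlankLines and the UTF-8 encoding are assumptions chosen for the illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, named only for this example.
class LazyLines {

    // Streams the file line by line and counts the lines that are not blank.
    static long countNonBlankLines(Path file) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return reader.lines()
                         .filter(line -> !line.trim().isEmpty())
                         .count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("lazy-lines-demo", ".txt");
        Files.write(file, "first\n\nsecond\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(countNonBlankLines(file)); // prints 2
        Files.delete(file);
    }
}
```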
 
johnsoan smith
Greenhorn
Posts: 25
I appreciate your replies a lot. I know how to read a file using a character set, but the problem is that I first have to read it as bytes, because all my operations are done at the byte level. For example, one of my methods generates a random byte array and combines it with the bytes of the file. So I can't change my file class to a String-based one; what I need is a way to perform the operations on the file's bytes and get the result as characters. I hope that is clear.
 
Stephan van Hulst
Saloon Keeper
Posts: 15510
There are two solutions. The first is to use a DataInputStream. This will allow you to read various Java primitives, byte arrays and strings in modified UTF-8 (the format written by DataOutputStream.writeUTF). The problem with this approach is that it's not easy to deal with strings in a different encoding.

The alternative is to open a stream, and open an unbuffered reader on that stream as well. You can read byte arrays from the stream, and read text from the reader. It's important that you don't close the reader before you're done with the stream, and that the reader is NOT buffered:

The problem with both of these approaches is that it's difficult to deal with line-endings.

It's best to treat a file as if it was only text or only binary. If it's binary, use only (Data)InputStream. If it's text, use only Reader.
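
To make the first (DataInputStream) option concrete, here is a small sketch that writes a string followed by a length-prefixed block of raw bytes, then reads both back. It works in memory so it runs on its own; the class name, the method names and the layout (string, then an int length, then the bytes) are assumptions for the example, not the original poster's file format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical helper class, named only for this example.
class DataStreamDemo {

    // Writes the text with writeUTF, then the payload length, then the payload itself.
    static byte[] write(String text, byte[] payload) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(text);            // 2-byte length header + modified UTF-8 bytes
            out.writeInt(payload.length);  // 4-byte length of the binary block
            out.write(payload);
        }
        return buffer.toByteArray();
    }

    // Reads the fields back in exactly the order they were written.
    static String readBack(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String text = in.readUTF();
            byte[] payload = new byte[in.readInt()];
            in.readFully(payload);
            return text + " + " + payload.length + " raw bytes";
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = write("Hello World", new byte[] {1, 2, 3});
        System.out.println(readBack(data)); // prints Hello World + 3 raw bytes
    }
}
```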
 
Campbell Ritchie
Marshal
Posts: 79177
You can easily turn a String into a byte[] because String has a method, getBytes(), which does that.
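
A sketch of the round trip, assuming the byte-level operation is something reversible like XOR with a random key. That is only a guess at what the original poster's algorithm does; the class ByteRoundTrip and the method xor are hypothetical names. The point is that the work happens on bytes and the result is turned back into text with new String(bytes, charset).

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;

// Hypothetical helper class, named only for this example.
class ByteRoundTrip {

    // XORs each data byte with the key byte at the same position (the key repeats if shorter).
    // XOR is an assumed stand-in for the poster's byte-level operation.
    static byte[] xor(byte[] data, byte[] key) {
        byte[] result = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            result[i] = (byte) (data[i] ^ key[i % key.length]);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] original = "Hello World".getBytes(StandardCharsets.UTF_8);

        byte[] key = new byte[original.length];
        new SecureRandom().nextBytes(key);

        byte[] scrambled = xor(original, key);
        byte[] restored = xor(scrambled, key);

        // Printing the array shows numeric byte values such as 72 for 'H'.
        System.out.println(Arrays.toString(original));
        // Decoding the bytes with the same charset gives the text back.
        System.out.println(new String(restored, StandardCharsets.UTF_8)); // prints Hello World
    }
}
```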
 
Campbell Ritchie
Marshal
Posts: 79177
That isn't as easy as it sounds; the lengths vary, getting longer if I try currency symbols, German towns with ü and ß etc. Obviously the String is translated into its encoding (default=UTF-8) and you get different byte lengths for different characters.
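
The varying lengths are easy to see with a few sample strings. This is a small sketch with made-up names (EncodedLengths, utf8Length); the non-ASCII characters are written as Unicode escapes so the example does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, named only for this example.
class EncodedLengths {

    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "Hello",          // plain ASCII
            "M\u00fcnchen",   // contains u-umlaut
            "Stra\u00dfe",    // contains sharp s
            "\u20ac"          // euro sign
        };
        for (String s : samples) {
            System.out.println(s.length() + " chars, " + utf8Length(s) + " UTF-8 bytes");
        }
        // The same text can have a different byte length in another charset.
        System.out.println("M\u00fcnchen".getBytes(StandardCharsets.ISO_8859_1).length
                + " ISO-8859-1 bytes");
    }
}
```

With UTF-8, ASCII characters take one byte each, ü and ß take two, and the euro sign takes three, so the byte count only matches the character count for pure ASCII text.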
 
Saloon Keeper
Posts: 10705

johnsoan smith wrote:I appreciate your replies a lot. I know how to read a file using a character set, but the problem is that I first have to read it as bytes, because all my operations are done at the byte level. For example, one of my methods generates a random byte array and combines it with the bytes of the file. So I can't change my file class to a String-based one; what I need is a way to perform the operations on the file's bytes and get the result as characters. I hope that is clear.


Sorry, I can't fathom why you'd mix binary data in with UTF-8 strings. How are you writing this file in the first place? How do you know where the bytes containing chars end and the random bytes begin?
 
Carey Brown
Saloon Keeper
Posts: 10705


Seems that you'd really need a homogeneous file type, either all bytes or all characters. If you go with the all-bytes approach, you could convert your strings to bytes before writing them out and precede them with a 2- or 4-byte header that tells you how many bytes make up the converted string. If you go with the all-chars approach, you could convert your random bytes to chars using base64 (or hex) before writing them out.
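
Both options can be sketched in a few lines. Everything here (the class HomogeneousFormats, the method names, the 4-byte big-endian header and UTF-8) is an assumption chosen for illustration, not a prescribed file format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class, named only for this example.
class HomogeneousFormats {

    // All-bytes approach: a 4-byte length header followed by the UTF-8 bytes of the text.
    static byte[] lengthPrefixed(String text) {
        byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + encoded.length)
                         .putInt(encoded.length)
                         .put(encoded)
                         .array();
    }

    // Reads one length-prefixed string from the buffer's current position.
    static String readLengthPrefixed(ByteBuffer in) {
        byte[] encoded = new byte[in.getInt()];
        in.get(encoded);
        return new String(encoded, StandardCharsets.UTF_8);
    }

    // All-chars approach: arbitrary bytes become printable text via Base64.
    static String toText(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] record = lengthPrefixed("Hi");
        System.out.println(Arrays.toString(record));                  // prints [0, 0, 0, 2, 72, 105]
        System.out.println(readLengthPrefixed(ByteBuffer.wrap(record))); // prints Hi

        String text = toText(new byte[] {1, 2, 3});
        System.out.println(text);                                     // prints AQID
        System.out.println(Arrays.toString(fromText(text)));          // prints [1, 2, 3]
    }
}
```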
 
Bartender
Posts: 10780

johnsoan smith wrote:I appreciate your replies a lot. I know how to read a file using a character set, but the problem is that I first have to read it as bytes, because all my operations are done at the byte level.


Byte level or character level? They are NOT the same, although many people assume they are because ASCII text is stored as one byte per character.

So the real question here is: Are you reading text, or is this something like a JPEG image ... or is it some hybrid format that only you and the people who wrote it understand?

If it's text, then the best way to read it is with a BufferedReader as described by others; and if the text is in the form of lines, you should read it a line at a time.
That doesn't mean that you can't process the content character-by-character if you need to.
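
For instance, a line-at-a-time loop can still look at each character. This is only a sketch: the class LineByLine and the method countLetters are hypothetical, and counting letters stands in for whatever per-character processing is actually needed. It takes any Reader so it can be tried with a StringReader as well as a file reader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper class, named only for this example.
class LineByLine {

    // Reads the source one line at a time and inspects every character of each line.
    static int countLetters(Reader source) throws IOException {
        int letters = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                for (int i = 0; i < line.length(); i++) {
                    if (Character.isLetter(line.charAt(i))) {
                        letters++;
                    }
                }
            }
        }
        return letters;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countLetters(new StringReader("Hello World\nabc 123"))); // prints 13
    }
}
```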

Winston