Files.lines vs BufferedReader vs Files.newBufferedReader: java.nio.charset.MalformedInputException

 
Ranch Hand
I have a program that parses the files in a directory. The files are supposed to be UTF-8 encoded. However, if a file happens to have a different encoding, the program should not crash but should skip that file.
I started out using Files.lines to read the lines, but this does not work as required.
A much simplified version:


The file is ANSI encoded. This code crashes; "End" is never printed.



Why is the exception not caught?

When I change the code to "pre-Java 8"



the program reads and prints all the lines without throwing an exception.

When I obtain the BufferedReader from Files, the program cannot read the file.
The exception is caught. "End" is printed.



Is there an explanation for the different behavior of the BufferedReaders?
Is there any way to make the first version work?

I am using java version "1.8.0_131".

Thanks,

Hans
 
Rancher

Markus Schmider wrote:


Why is the exception not caught?



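If you look at the first line of that stack trace, what is actually thrown is an UncheckedIOException, which wraps the root-cause MalformedInputException. The stream returned by Files.lines cannot throw a checked IOException while it is being consumed, so the error is wrapped in an unchecked one.
You'd have to catch UncheckedIOException to handle this.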
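
A minimal sketch of what that could look like, assuming the goal is to skip a file that is not valid UTF-8. The class name LenientLineReader and the helper readLinesOrSkip are hypothetical and are not taken from the original program:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LenientLineReader {

    // Hypothetical helper: returns the lines of a UTF-8 file,
    // or an empty list if the file cannot be opened or decoded.
    static List<String> readLinesOrSkip(Path file) {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            // Decoding happens lazily, while the stream is consumed here.
            return lines.collect(Collectors.toList());
        } catch (UncheckedIOException e) {
            // Thrown during consumption; the cause is the original IOException,
            // for example a MalformedInputException.
            System.err.println("Skipping " + file + ": " + e.getCause());
            return Collections.emptyList();
        } catch (IOException e) {
            // Thrown by Files.lines itself, for example if the file cannot be opened.
            System.err.println("Skipping " + file + ": " + e);
            return Collections.emptyList();
        }
    }

    public static void main(String[] args) {
        // Each argument is treated as a path to one file.
        for (String arg : args) {
            readLinesOrSkip(Paths.get(arg)).forEach(System.out::println);
        }
        System.out.println("End");
    }
}
```

With this shape, a file that is not valid UTF-8 is reported and skipped, and "End" is still printed.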
 
Sheriff
I think the issue lies in the encoding used. The methods in Files use UTF-8 if no charset is specified explicitly. The "old" I/O code uses the system default, which depends on your operating system and, on Windows, is usually not UTF-8.

Can you try the following two bits of code and post the results?


The first uses your existing Files.lines code but uses the system charset instead of UTF-8.
The second uses your existing BufferedReader code but uses UTF-8 instead of the system charset. Note that I had to use FileInputStream + InputStreamReader instead of FileReader for that.
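
A minimal sketch of the two variants described above, for reference. The class name CharsetComparison and both method names are hypothetical, and the file name is passed in as a parameter because the original one is not known:

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CharsetComparison {

    // Variant 1: Files.lines, but decoding with the system default charset
    // instead of UTF-8.
    static List<String> linesWithDefaultCharset(String fileName) throws IOException {
        try (Stream<String> lines =
                 Files.lines(Paths.get(fileName), Charset.defaultCharset())) {
            return lines.collect(Collectors.toList());
        }
    }

    // Variant 2: classic BufferedReader, but decoding explicitly as UTF-8.
    // FileReader (in Java 8) has no charset parameter, hence
    // FileInputStream + InputStreamReader.
    static List<String> bufferedReaderWithUtf8(String fileName) throws IOException {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                 new InputStreamReader(new FileInputStream(fileName),
                                       StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.add(line);
            }
        }
        return result;
    }
}
```

One thing to watch for when comparing the results: as far as I know, an InputStreamReader created this way does not throw on malformed bytes but substitutes the replacement character U+FFFD, so the second variant may print garbled characters rather than fail.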
 