
How to read such Symbols in io stream/ reader along with string?

 
Maki Jav
Ranch Hand
Posts: 470
Hi,

The file contains names like Buk Leather®, Ryker LumaTech™

How can I read these, store them in String objects, and show them correctly instead of getting a "?"


Thanks
 
Tim Moores
Saloon Keeper
Posts: 3967
That's a matter of specifying the correct encoding when reading the file, and whenever you want to display it afterwards. What encoding is the file in?
 
Stephan van Hulst
Saloon Keeper
Posts: 7933
How do you read and display the string?
 
Maki Jav
Ranch Hand
Posts: 470
It is just a plain text file with a .txt extension.
 
Campbell Ritchie
Marshal
Posts: 56227
Maki Jav wrote:It is just a plain text file . . .
Read this and you will learn that there is no such thing as a plain text file.
Also, what were you displaying the text on? A Windows® terminal has its own encoding and may display ? where a Swing component (e.g. JOptionPane) will display the correct character.
 
Stephan van Hulst
Saloon Keeper
Posts: 7933
Campbell Ritchie wrote:a Swing component (e.g. JOptionPane) will display the correct character.

Only when using a font that supports the character. For ® and ™ that shouldn't be a problem though.
 
Maki Jav
Ranch Hand
Posts: 470
They were getting the correct symbols in the web application running on Windows/Tomcat, and the problem on the Linux/Tomcat setup...

The names come from their files...

That setup is not under my control.
 
Stephan van Hulst
Saloon Keeper
Posts: 7933
You're still not telling us how you're reading the file, and how you're displaying the string.
 
Maki Jav
Ranch Hand
Posts: 470
Uploading file from text files to the server. The server, running Spring/Apache Commons FileUpload, gets that as a stream and saves it to the database. The items are then read from the database.
 
Maki Jav
Ranch Hand
Posts: 470
uploading data*
 
Maki Jav
Ranch Hand
Posts: 470
The developers here are trying to replace the BufferedReader used in a CSV reader with an InputStream for reading the data on the server side, assuming that it will then be able to read the text as it is in the files.
 
Dave Tolls
Ranch Foreman
Posts: 3011
And what is displaying the output from the database?
What charset does the database use?
 
Maki Jav
Ranch Hand
Posts: 470
The database is Cassandra.

Displayed text on page is:

Buk Leather?
Ryker LumaTech?


 
Dave Tolls
Ranch Foreman
Posts: 3011
OK, I'll try and step through this.

The file has a character encoding.
Assuming your code has read in the text from the file correctly (ie with the correct encoding) you then have the bit where it gets stored in the DB.
The Database has its own encoding.
On the assumption these match (since you say it displays on the web), there is then the step of when it is read out of the database and displayed.
The tool used to display the text itself uses an encoding.
At that point the display method uses a font, and that font has to have the glyph matching the codepoint.

So there are lots of places this could go wrong and you will need to identify which one of them is causing the issue.

This is why people are asking how (as in using what technology) you are displaying the text that is looking incorrect.
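
The chain described above can be tried out in isolation. The following is a minimal sketch, not the poster's code: the class and method names are made up for illustration, and it only imitates a single step where the writer and the reader of the bytes disagree about the charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingMismatchDemo {
    // "Buk Leather®", written with a Unicode escape so that the encoding of
    // this source file itself cannot interfere with the experiment.
    static final String NAME = "Buk Leather\u00AE";

    // Hypothetical helper: encode the text with one charset and decode the
    // resulting bytes with another, like a step where two sides disagree.
    static String roundTrip(String text, Charset writeAs, Charset readAs) {
        return new String(text.getBytes(writeAs), readAs);
    }

    public static void main(String[] args) {
        // Same charset on both sides: the text survives.
        System.out.println(roundTrip(NAME, StandardCharsets.ISO_8859_1,
                StandardCharsets.ISO_8859_1).equals(NAME));

        // A charset that has no ® at all: the encoder substitutes '?'.
        System.out.println(roundTrip(NAME, StandardCharsets.US_ASCII,
                StandardCharsets.US_ASCII).endsWith("?"));

        // ISO-8859-1 bytes decoded as UTF-8: the lone byte 0xAE is malformed
        // UTF-8 and becomes the replacement character U+FFFD.
        System.out.println(roundTrip(NAME, StandardCharsets.ISO_8859_1,
                StandardCharsets.UTF_8).endsWith("\uFFFD"));

        // UTF-8 bytes decoded as ISO-8859-1: ® turns into the two characters "Â®".
        System.out.println(roundTrip(NAME, StandardCharsets.UTF_8,
                StandardCharsets.ISO_8859_1).endsWith("\u00C2\u00AE"));
    }
}
```

Each of the four checks prints true. Comparing the symptom on the page against these cases can help narrow down which step to inspect, although a later step (database, page encoding, font) may alter the symptom again.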
 
Tim Moores
Saloon Keeper
Posts: 3967
You seem to have missed this:
Tim Moores wrote:That's a matter of specifying the correct encoding when reading the file, and whenever you want to display it afterwards. What encoding is the file in?
 
Maki Jav
Ranch Hand
Posts: 470
I hope that this answers your questions to some extent. The whole setup is the same, but their OS is Linux-based.
unicode.PNG
[Thumbnail for unicode.PNG]
 
Maki Jav
Ranch Hand
Posts: 470
They have reported that they are getting something like the name in the picture below. I think their file encoding is not the same as ours.

Thanks


thiers.PNG
[Thumbnail for thiers.PNG]
 
Stephan van Hulst
Saloon Keeper
Posts: 7933
You really should take your time to understand what an encoding/charset is, so you can verify each step of the way that Dave described for yourself.

Have you read the article that Campbell linked to?
 
Maki Jav
Ranch Hand
Posts: 470
I am in office working, developing, so I read in bird-eye view. I am working on another issue right now but want to help my colleagues.

The problem is that we cannot tell our clients to use UTF-8 or such-and-such an encoding for files when uploading.

Thanks.

 
Maki Jav
Ranch Hand
Posts: 470
The two different images I posted show the different effects.

Thanks
 
Tim Moores
Saloon Keeper
Posts: 3967
Well, you will need to take the time to understand the problem, and also to understand possible solutions. It all seems to hinge on the encoding, so as long as you don't know that, you're not in a position to change your code to deal with it properly.

There are libraries out there that try to guess the encoding from a given file, but that's not 100% accurate - much better to agree on a file format with your client, and then to use that appropriately.
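
Short of a guessing library, one thing that can be checked with the JDK alone is whether a file's bytes are even valid in a candidate encoding. The sketch below swaps in that simpler technique (strict decoding); it is not an encoding detector, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictDecodeCheck {
    // Returns true if the bytes decode in the given charset without any
    // malformed or unmappable input. REPORT makes the decoder throw instead
    // of silently substituting a replacement character.
    static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Stand-in for an uploaded file saved as ISO-8859-1 ("Buk Leather®").
        byte[] latin1 = "Buk Leather\u00AE".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(decodesCleanly(latin1, StandardCharsets.UTF_8));      // false
        System.out.println(decodesCleanly(latin1, StandardCharsets.ISO_8859_1)); // true
    }
}
```

A failed check rules an encoding out; a passed check proves little, because ISO-8859-1 accepts every possible byte sequence. That is one reason an agreed encoding is more reliable than any check of this kind.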
 
Dave Tolls
Ranch Foreman
Posts: 3011
Get a copy of one of the files they upload and run it through the same system...that includes using Linux, preferably the same one as they use.
 
Maki Jav
Ranch Hand
Posts: 470
Tim,

On your advice I logged in to their server and uploaded the very same file of mine, the one whose name I got displayed correctly with (R).

Well,

The same problem as the one they reported occurred.


Thanks
dev-server.png
[Thumbnail for dev-server.png]
 
Maki Jav
Ranch Hand
Posts: 470
Now I, sitting in Karachi, send data to a server running the same code over in the United States, and phoosh....
:'(
 
Maki Jav
Ranch Hand
Posts: 470
Why is editing of one's messages not available here any longer?
 
Tim Moores
Saloon Keeper
Posts: 3967
You seem to have missed this:
Tim Moores wrote:It all seems to hinge on the encoding, so as long as you don't know that, you're not in a position to change your code to deal with it properly.

... much better to agree on a file encoding with your client, and then to use that appropriately.

Please take the time to read and consider the advice you are getting.
 
Campbell Ritchie
Marshal
Posts: 56227
Maki Jav wrote:I am in office working, developing, so I read in bird-eye view.
In which case, reading the article thoroughly should count as part of your work. As people have said, if you don't understand the encoding problem there is no chance of you solving the rest of the problem.
Maki Jav wrote:I am working on another issue right now but want to help my colleagues.
Don't try doing two things at the same time. Consider whether you should continue with this problem, or whether to get one of your colleagues to sign in here and continue the discussion.
Maki Jav wrote:. . . we . . . cannot tell our clients to use UTF-8 or such-and-such an encoding for files when uploading. . . .
No, you can't. But you can ask them to specify an encoding for you to use, and insist they notify you whenever they change encodings. It is very easy to change the encoding you are using, as long as you know.
 
Maki Jav
Ranch Hand
Posts: 470


The code base is the same; we develop it and hand it over to the QA team.

The Cassandra encoding must be the same, but I am not sure of it.

The file that I upload to my local machine and to the QA server in the US is the same. Hence the same encoding.

The result is different.

The only difference is...

QA in the US has Linux and we have Windows.

I think that it is either Cassandra or the OS default encoding.

Thanks




 
Stephan van Hulst
Saloon Keeper
Posts: 7933
The code that reads the files probably uses default encoding. Can you show us the related snippet of code?
 
Maki Jav
Ranch Hand
Posts: 470

It is uploaded in a Spring application using the Apache Commons FileUpload utility, which is registered as a bean with the @Bean annotation in the class

ProductSpringMvcConfig extends WebMvcConfigurerAdapter implements AsyncConfigurer{}


The CSV code has the following imports....

import au.com.bytecode.opencsv.CSVReader;
import au.com.bytecode.opencsv.bean.HeaderColumnNameTranslateMappingStrategy;

The CSVReader library is used for reading the CSV files uploaded as a stream. After I developed it, the other developers specified an encoding like this:




This was not effective, so they removed the encoding "ISO-8859-1".

They did not consult me; otherwise I would have advised making changes in the multipart bean in the SpringMVCConfig file.



So is UTF-8 not mapping characters correctly on Linux (I don't know which flavour)?

Thanks,




 
Tim Moores
Saloon Keeper
Posts: 3967
You need to start by finding out what encoding the file is actually in. Then you can make the necessary changes in the code to always use that encoding.

If the code doesn't specify any encoding then the platform default encoding is used, which will vary between operating systems, possibly vary between different machines running the same OS, and possibly even between different JVMs running on the same machine.
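
As a hedged illustration of the difference, the sketch below reads an upload stream through a Reader built with an explicitly named charset, so the result no longer depends on the machine. The class name, the helper, and the sample bytes are assumptions for illustration; ISO-8859-1 is used only because it is the encoding mentioned in this thread.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ExplicitCharsetRead {
    // Reads the first line of the stream, decoding the bytes with the charset
    // the caller names rather than with the platform default.
    static String firstLine(InputStream in, Charset charset) {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, charset))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // This value is what new InputStreamReader(in) would silently use;
        // it can differ between a Windows desktop and a Linux server.
        System.out.println("Platform default: " + Charset.defaultCharset());

        // Stand-in for an uploaded CSV line "Buk Leather®,42" saved as ISO-8859-1.
        byte[] upload = "Buk Leather\u00AE,42\n".getBytes(StandardCharsets.ISO_8859_1);
        String line = firstLine(new ByteArrayInputStream(upload),
                                StandardCharsets.ISO_8859_1);
        System.out.println(line.equals("Buk Leather\u00AE,42")); // true on any platform
    }
}
```

A Reader constructed this way is the kind of object that could be handed to a CSV parser that accepts a java.io.Reader, so the parsing code itself would not need to know about the encoding.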
 
Maki Jav
Ranch Hand
Posts: 470


Please read my posts; it should be clear that the encoding is given. The file had the same encoding, i.e. ISO-8859-1; only when the same file is uploaded to the server on the Linux machine does it run into trouble.

So do they need to start the JVM, i.e. Tomcat, with some encoding?
 
Maki Jav
Ranch Hand
Posts: 470
Can someone just tell me which encoding will allow ® © etc.?

Thanks
 
Tim Moores
Saloon Keeper
Posts: 3967
I don't think I understand. If the encoding is ISO-8859, why does the code specify UTF-8? But in any case, don't ever rely on the platform default encoding being the correct one.
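
On the question of which encodings can represent these symbols at all, the JDK can be asked directly. This is a small sketch with a hypothetical class name; it says what each charset is able to encode, not which one the uploaded file actually uses.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class SymbolCoverage {
    // True if every character of the text has a representation in the charset.
    static boolean canEncode(Charset charset, String text) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String registered = "\u00AE"; // ®
        String copyright  = "\u00A9"; // ©
        String trademark  = "\u2122"; // ™

        Charset[] candidates = {
            StandardCharsets.US_ASCII,
            StandardCharsets.ISO_8859_1,
            StandardCharsets.UTF_8
        };
        for (Charset cs : candidates) {
            // Only the charset name and booleans are printed, so the output
            // does not depend on the console's own encoding.
            System.out.println(cs.name()
                + " registered=" + canEncode(cs, registered)
                + " copyright=" + canEncode(cs, copyright)
                + " trademark=" + canEncode(cs, trademark));
        }
    }
}
```

US-ASCII covers none of the three, ISO-8859-1 covers ® and © but not ™, and UTF-8 covers all of them. So a file that really contains ™ cannot be pure ISO-8859-1, which is worth checking before settling on that encoding.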
 
Stephan van Hulst
Saloon Keeper
Posts: 7933
I'm sorry, I don't mean to be rude, but if you or your colleagues understood exactly what encodings are and how they are used, you would be able to debug this problem without too much trouble. I really recommend that all of you study them a little bit. This is going to help you a lot in the long run.

Show us exactly the Spring controller action that receives the file. You can fiddle around with Spring charset configuration all day, but if you treat your input as a stream of bytes that needs to be read manually, it's going to be useless.
 