File Upload

 
Ranch Hand
Posts: 52


On Linux, when I upload a file named "这是一个文本.txt" with Spring Boot,
the output is:
???.txt
???.txt

but on Windows it works fine. Why?
HELP!
 
Saloon Keeper
Posts: 6377
158
Android Mac OS X Firefox Browser VI Editor Tomcat Server Safari
Where is that output shown? Somewhere that's actually capable of displaying Unicode?
 
Marshal
Posts: 25436
65
Eclipse IDE Firefox Browser MySQL Database


This ugly code is more or less guaranteed to fail. Here's what it does:

1. Take the String named "originalFilename" and convert it to bytes, using ISO-8859-1. Note that ISO-8859-1 doesn't have representations for non-European scripts so you have already damaged the data.

2. Take those bytes and convert them to a String, assuming that they were encoded in UTF-8. Note that they weren't actually encoded in UTF-8, they were encoded in ISO-8859-1. So any code points which differ between those encodings are going to be damaged by this step.



The same applies to this. It will work nicely if request.getCharacterEncoding() is UTF-8, in which case it does nothing and you don't need that code. If it isn't UTF-8 then it's going to damage the data as outlined above.

So don't do any of lines 4 to 13. It can't do any good and it can only do harm.
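The two steps described above can be reproduced outside a web app. This is only a minimal sketch: the class name RoundTripDemo and the method roundTrip are made up for illustration, and the file name is written with Unicode escapes so the example does not depend on the source file's own encoding.

```java
import java.nio.charset.StandardCharsets;

public class RoundTripDemo {

    // Hypothetical helper reproducing the two steps: encode as ISO-8859-1,
    // then decode the resulting bytes as if they were UTF-8.
    static String roundTrip(String original) {
        // Step 1: characters that ISO-8859-1 cannot represent become '?' (0x3F).
        byte[] latin1Bytes = original.getBytes(StandardCharsets.ISO_8859_1);
        // Step 2: the bytes are decoded as UTF-8, but the damage is already done.
        return new String(latin1Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The six Chinese characters of the file name from the question, plus ".txt".
        String name = "\u8fd9\u662f\u4e00\u4e2a\u6587\u672c.txt";
        System.out.println(roundTrip(name)); // prints "??????.txt"
    }
}
```

Each Chinese character is replaced by a single question mark in step 1, and nothing in step 2 can bring it back. A plain ASCII name passes through unchanged, which is why this kind of code can appear to work until a non-Latin file name shows up.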

I suppose you are using line 14 to debug the process? That line itself is part of the problem, most likely. It's going to use the system's default encoding to write originalFilename to some log file somewhere; that encoding may or may not support Chinese characters. Likewise the text editor you use to look at the log file may use the wrong encoding. So there are plenty of things which can go wrong with that.
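To see how the output encoding alone can produce question marks, here is a rough sketch using only the JDK. The class ConsoleEncodingDemo and the helper printWith are hypothetical names. The sketch assumes the Linux JVM ends up with an ASCII-like default charset (as can happen when no UTF-8 locale is set), which is a guess about the poster's environment and not something the thread confirms.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ConsoleEncodingDemo {

    // Hypothetical helper: prints text through a PrintStream that uses the
    // given charset and returns the bytes that were actually written.
    static byte[] printWith(String text, String charsetName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            PrintStream out = new PrintStream(buffer, true, charsetName);
            out.print(text);
            out.flush();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // Two Chinese characters followed by ".txt".
        String name = "\u6587\u672c.txt";

        // The charset the JVM picked up from its environment.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // With an ASCII stream the Chinese characters are written as '?'.
        byte[] ascii = printWith(name, "US-ASCII");
        System.out.println(new String(ascii, StandardCharsets.US_ASCII));

        // With a UTF-8 stream each Chinese character is written as three bytes.
        byte[] utf8 = printWith(name, "UTF-8");
        System.out.println("UTF-8 byte count: " + utf8.length);
    }
}
```

Here the String itself is intact in both cases; only the bytes written differ. So question marks in a console or log file do not by themselves prove that the upload code received a damaged file name.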

If you're finding that the code works differently in different environments, that would be because the default character encoding for requests is different in those environments. I strongly recommend you configure all your servlet-handling environments to use UTF-8 for their default request and response encodings and get rid of the ugly hacks which don't work.
 
xin yi
Ranch Hand
Posts: 52

Paul Clapham wrote:
It will work nicely if request.getCharacterEncoding() is UTF-8, in which case it does nothing and you don't need that code. If it isn't UTF-8 then it's going to damage the data as outlined above.



Yes, request.getCharacterEncoding() is UTF-8. In theory it should work, but when the file name contains Chinese characters, it still turns into ???. I have tried many approaches, but they all failed...
 
xin yi
Ranch Hand
Posts: 52

Tim Moores wrote:Where is that output shown? Somewhere that's actually capable of displaying Unicode?


request.getCharacterEncoding() is UTF-8.
 
Paul Clapham
Marshal
Posts: 25436
65
Eclipse IDE Firefox Browser MySQL Database

xin yi wrote:

Tim Moores wrote:Where is that output shown? Somewhere that's actually capable of displaying Unicode?


request.getCharacterEncoding() is UTF-8.



No. request.getCharacterEncoding() tells the system what encoding to use for the request. The request is the input to that code. The output is what you wrote with System.out near the end. But that's only debugging output, right? I expect you're using the file name in some real situation. In which case you should consider whether that is working correctly.

But if it's really the case that your System.out data is actually important, I would suggest you stop doing it that way. This code is in a web app, right? And System.out is not a standard way to do debugging output. So there's no point in trying to manipulate System.out (and whatever you're using to read wherever System.out writes to) so that you can read the file name.
 
xin yi
Ranch Hand
Posts: 52
In reality, I use log4j's logger.info for output. Can I concat to you with wechat?
 
Rancher
Posts: 4546
47
What are you using to read the log file?
 
Paul Clapham
Marshal
Posts: 25436
65
Eclipse IDE Firefox Browser MySQL Database

xin yi wrote: In reality, I use log4j's logger.info for output.



That's a better approach. And have you specified a file encoding for log4j to use? Again, I would suggest UTF-8. And like Dave suggests, to look at the logs you should use a text editor which knows that the files are encoded in UTF-8.
 
Rancher
Posts: 458
7
Android Tomcat Server Java

xin yi wrote:Can I concat to you with wechat?  


It should be "Can I contact you via WeChat?"
 
Marshal
Posts: 68904
275
  • Likes 1

Randy Tong wrote:. . . It should be can I contact you via wechat ?

The answer to both questions is no. You will get much better advice if you keep the discussion where more experienced eyes can see it.
 
xin yi
Ranch Hand
Posts: 52

Randy Tong wrote:
It should be can I contact you via wechat ?



Uh haha
 