File.list() returns Unicode replacement character ('\ufffd') in filename

 
Rob Marshall
Ranch Hand
Posts: 30
Hi,

I originally thought this was a problem with converting a string in XML, but since I get the filenames I'm trying to escape from File.list(), I decided to look there. As it turns out, it's actually File.list() that is putting the Unicode replacement character ('\ufffd') in the filename. Is there a way to get File.list() to recognize characters in the range 128-255, e.g. the yen sign, the pound sign, etc.?

Thanks,

Rob

P.S. Is there a way to delete a topic? I don't need the other one...
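
One common way for '\ufffd' to show up is a mismatch between the bytes stored for the name and the charset used to decode them. This is only an assumption about what happened here; the thread never confirms it. The sketch below does not touch the file system. It takes the name "¥.txt" as ISO-8859-1 bytes (yen sign = 0xA5) and decodes those bytes as UTF-8, which is roughly what happens when a JVM running in a UTF-8 locale lists a name written under a single-byte locale. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; none of these names come from the thread.
final class ReplacementCharDemo {

    // Decode raw name bytes as UTF-8. This String constructor does not throw
    // on malformed input; it substitutes U+FFFD for each bad sequence.
    static String decodeAsUtf8(byte[] rawName) {
        return new String(rawName, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The yen sign followed by ".txt", encoded as ISO-8859-1: the yen
        // sign becomes the single byte 0xA5.
        byte[] latin1Name = "\u00A5.txt".getBytes(StandardCharsets.ISO_8859_1);

        // 0xA5 on its own is not a valid UTF-8 sequence, so it is replaced.
        String wrong = decodeAsUtf8(latin1Name);
        // Decoding with the charset that produced the bytes gives the yen sign back.
        String right = new String(latin1Name, StandardCharsets.ISO_8859_1);

        System.out.println(wrong.charAt(0) == '\uFFFD'); // prints true
        System.out.println(right.charAt(0) == '\u00A5'); // prints true
    }
}
```

If that is the cause, the usual things to look at on Linux are the locale the JVM is started with (LANG / LC_ALL) and the encoding that was in effect when the files were created, since the JVM typically derives its filename encoding from the locale.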
 
Rob Marshall
Ranch Hand
Posts: 30
I'm apparently completely confused... I deleted all the offending files, wrote a little Java program to create new ones, and they're working fine. The same program does odd things on OS X, but since the application runs on Linux, that doesn't really matter. It's probably just a little-endian/big-endian thing.

Rob
 
Campbell Ritchie
Marshal
Posts: 76870
Don't know. For what it is worth, this is what the Unicode chart says about U+FFFD:

FFFD  REPLACEMENT CHARACTER
• used to replace an incoming character whose value is unknown or unrepresentable in Unicode
• compare the use of 001A as a control character to indicate the substitute function

and you can find the details here.
 
Rob Marshall
Ranch Hand
Posts: 30
Yeah, I saw that. It just makes me a little nervous, since I'm not sure how those characters got into the filenames in the first place. I will probably have to check for bad characters in the filenames, log an error, and then just move on. I'm not sure what else to do.

Thanks,

Rob
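
The "check, log, and move on" approach could look something like the sketch below. It treats any name containing '\ufffd' as bad, logs a warning with java.util.logging, and keeps the rest. The class name, method names, and log message are placeholders, not anything from the actual application.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical helper; names and log wording are illustrative only.
final class FileNameCheck {
    private static final Logger LOG = Logger.getLogger(FileNameCheck.class.getName());
    private static final char REPLACEMENT = '\uFFFD';

    // True if the name contains the Unicode replacement character.
    static boolean hasReplacementChar(String name) {
        return name.indexOf(REPLACEMENT) >= 0;
    }

    // Returns the names that decoded cleanly; logs and skips the others.
    static List<String> usableNames(String[] names) {
        List<String> ok = new ArrayList<>();
        if (names == null) {
            // File.list() returns null when the path is not a directory
            // or an I/O error occurs.
            return ok;
        }
        for (String name : names) {
            if (hasReplacementChar(name)) {
                LOG.warning("Skipping file with undecodable name: " + name);
            } else {
                ok.add(name);
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        // Directory to scan: first argument, or the current directory.
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (String name : usableNames(dir.list())) {
            System.out.println(name);
        }
    }
}
```

One caveat: a name that already contains '\ufffd' generally cannot be mapped back to the original bytes, so opening or renaming such a file through the same String is likely to fail. Skipping it and logging is about as much as this check can do.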
 
Campbell Ritchie
Marshal
Posts: 76870
I don't think you can tell how those characters got there. It might be that the original file name included a character which the OS didn't recognise, and it was replaced with U+FFFD. But who knows?
 