
Determine a file's encoding

 
Mark Vedder
Ranch Hand
Posts: 624
Is there a method somewhere in the J2SE API that takes a file and returns the charset that file is encoded in? I've searched through the J2SE API documentation and haven't found anything. (Sorry if this question seems too basic, but I'm only now learning about charsets and file encoding in more detail. So far I have had just a passing understanding of them, since none of my previous projects required any I/O beyond logging and properties files.)

Thanks,
Mark
 
Dmitry Melnik
Ranch Hand
Posts: 328
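I am not aware of any API like this, but if it's a Unicode file, it may start with a byte order mark (BOM) that identifies the encoding: EF BB BF for UTF-8, FE FF for UTF-16 big-endian, and FF FE for UTF-16 little-endian (which is what Windows tools usually write when they say "Unicode"). The BOM is optional, so a file without one can still be Unicode. If it's not Unicode (1 byte per symbol), nothing in the file names the encoding, and recognizing it becomes a cryptanalytic task: you can only guess from the byte patterns.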
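
The marker bytes at the start of a Unicode file are a byte order mark (BOM), and checking for one takes only a few lines. Here is a minimal sketch, assuming Java 7 or later. The class and method names (BomSniffer, detectBom) are made up for illustration and are not part of the J2SE API. It only recognizes the standard Unicode BOMs and returns null when there is none, which does not mean the file isn't Unicode.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not a J2SE class: guesses a charset from a leading BOM.
class BomSniffer {

    // Unsigned value of b[i], or -1 if the array has no byte at index i.
    private static int at(byte[] b, int i) {
        return i < b.length ? (b[i] & 0xFF) : -1;
    }

    /** Returns the charset name implied by a leading BOM, or null if no BOM is found. */
    public static String detectBom(byte[] head) {
        int b0 = at(head, 0), b1 = at(head, 1), b2 = at(head, 2), b3 = at(head, 3);

        // Check UTF-32 before UTF-16, because FF FE 00 00 also begins with the
        // UTF-16LE BOM. (It could in theory be UTF-16LE text starting with
        // U+0000; this sketch assumes UTF-32LE in that case.)
        if (b0 == 0x00 && b1 == 0x00 && b2 == 0xFE && b3 == 0xFF) return "UTF-32BE";
        if (b0 == 0xFF && b1 == 0xFE && b2 == 0x00 && b3 == 0x00) return "UTF-32LE";
        if (b0 == 0xEF && b1 == 0xBB && b2 == 0xBF) return "UTF-8";
        if (b0 == 0xFE && b1 == 0xFF) return "UTF-16BE";
        if (b0 == 0xFF && b1 == 0xFE) return "UTF-16LE";
        return null; // no BOM: the encoding has to be guessed some other way
    }

    /** Reads up to the first four bytes of the file and checks them for a BOM. */
    public static String detectBom(Path file) throws IOException {
        byte[] head = new byte[4];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            // read() may return fewer bytes than requested, so loop until
            // the buffer is full or the stream ends.
            while (filled < head.length
                    && (n = in.read(head, filled, head.length - filled)) != -1) {
                filled += n;
            }
        }
        return detectBom(Arrays.copyOf(head, filled));
    }
}
```

Calling BomSniffer.detectBom(Paths.get("some-file.txt")) (the file name is just a placeholder) gives a charset name you can pass to Charset.forName, or null. For files without a BOM you are back to guessing from the content, which the standard library does not do for you.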
 
Mark Vedder
Ranch Hand
Posts: 624
Thanks for the info...
 