
file bin or ASCII

 
anil bisht
Ranch Hand
Posts: 81
Hi,
I need to check whether a file is binary or ASCII. I need to process the file only if it is ASCII. Is there an API I can use to check the file type?
I also need to list files by modification date. Is there a way to get the list ordered by modification date? By default the ordering is by file name.
TIA
Anil
 
Tim Holloway
Saloon Keeper
Posts: 18359
I'm afraid that Java can't help you there. Those are both OS-dependent questions, and Java is intended to provide OS-independent programs. Not all OSes (MS-DOS, for example) store distinct create, reference and modification dates (or times) like Unix does, and in many OSes the distinction between binary and text files can only be made by reading the file and looking to see if it contains "non-text" bytes. That is further complicated by the fact that what constitutes a "text" byte is codeset-dependent.
There are two ways around this. First, if there's a utility program in your OS that will display that information, you can exec() it and dissect the output. The second way is to use JNI to make a call to OS-specific native code.
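For the modification-date half of the question, a pure-Java route may be worth trying before exec() or JNI: java.io.File has a lastModified() method, so a directory listing can be sorted on it. The following is only a sketch, assuming Java 8 or later; the class and method names (FilesByDate, listByModificationDate) are made up for illustration, and the precision of the timestamp still depends on the OS and file system.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper; the class and method names are illustrative only.
final class FilesByDate {

    /** Returns the entries of dir sorted oldest-first by last-modified time. */
    static File[] listByModificationDate(File dir) {
        File[] files = dir.listFiles();
        if (files == null) {
            // listFiles() returns null if dir is not a directory or an I/O error occurs.
            return new File[0];
        }
        // lastModified() is milliseconds since the epoch, or 0 if it cannot be read.
        Arrays.sort(files, Comparator.comparingLong(File::lastModified));
        return files;
    }
}
```

Calling FilesByDate.listByModificationDate(new File(".")) would give the current directory's entries oldest first; add .reversed() to the comparator if newest-first is wanted.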
 
Steve Deadsea
Ranch Hand
Posts: 125
It is fairly easy to tell if the file is an ASCII file. To do this you just read through the file, and if you encounter any byte that is greater than 127, it is not ASCII. Also, if you encounter any byte that is less than 32 and is not one of \r, \n, \t or \f, it is not likely to be a text file.
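A minimal sketch of that byte scan, for anyone who wants something to start from. The class name AsciiCheck and the method name looksLikeAsciiText are hypothetical, and the rule is exactly the one described above, so an empty file counts as text and byte 127 (DEL) is let through.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; names are illustrative only.
final class AsciiCheck {

    /** True if no byte is above 127 and every byte below 32 is \r, \n, \t or \f. */
    static boolean looksLikeAsciiText(InputStream in) throws IOException {
        int b;
        while ((b = in.read()) != -1) {      // read() yields 0..255, or -1 at end of stream
            if (b > 127) {
                return false;                // outside the 7-bit ASCII range
            }
            if (b < 32 && b != '\r' && b != '\n' && b != '\t' && b != '\f') {
                return false;                // control byte that plain text rarely contains
            }
        }
        return true;
    }

    /** Convenience overload that opens the file and scans it to the end. */
    static boolean looksLikeAsciiText(Path file) throws IOException {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            return looksLikeAsciiText(in);
        }
    }
}
```

The scan stops at the first offending byte, so large binary files are usually rejected after reading very little of them.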
 
anil bisht
Ranch Hand
Posts: 81
That won't work, as my file also contains some French characters...
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Well then, it's neither binary nor ASCII - it's a text file using some encoding which supports French characters. (True) ASCII does not. However, most common European-language encodings are based on ASCII - they just add additional characters after 127, which are not part of ASCII.
To differentiate text from binary files (assuming there's no obvious difference in file extensions), you can do something much like Steve's suggestion - you just have to find out what encoding is actually used in the file. One common possibility is Cp1252 (Windows Latin-1). Look at the table of characters and identify a group of codes which are not possible in a valid text document on your system. (Usually 0x00-0x08, 0x0B, 0x0C, and 0x0E-0x1F is a good starting point.) If any of these characters is present in a file, it's probably binary, not text. You'll want to test carefully on many of your files to be more sure though. Good luck...
[ October 19, 2002: Message edited by: Jim Yingst ]
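The control-code heuristic from the post above can be sketched roughly as follows. It assumes a single-byte encoding such as Cp1252, so the check runs on raw bytes and lets everything from 0x20 upward through, accented characters included. The class name BinarySniffer and the method name looksBinary are hypothetical, and the suspicious set is just the suggested starting point (0x00-0x08, 0x0B, 0x0C, 0x0E-0x1F); adjust it after testing on your own files.

```java
// Hypothetical helper; names are illustrative only.
final class BinarySniffer {

    // Lookup table: SUSPICIOUS[b] is true for byte values unlikely in a text file.
    private static final boolean[] SUSPICIOUS = new boolean[256];

    static {
        for (int b = 0x00; b <= 0x08; b++) {
            SUSPICIOUS[b] = true;
        }
        SUSPICIOUS[0x0B] = true;   // vertical tab
        SUSPICIOUS[0x0C] = true;   // form feed
        for (int b = 0x0E; b <= 0x1F; b++) {
            SUSPICIOUS[b] = true;
        }
        // Tab (0x09), LF (0x0A), CR (0x0D) and everything from 0x20 up stay allowed.
    }

    /** True if any byte falls in the suspicious control-code set. */
    static boolean looksBinary(byte[] data) {
        for (byte raw : data) {
            if (SUSPICIOUS[raw & 0xFF]) {   // mask to get the unsigned value 0..255
                return true;
            }
        }
        return false;
    }
}
```

One way to use it is BinarySniffer.looksBinary(Files.readAllBytes(path)) with java.nio.file.Files; for very large files it is probably enough to check only the first few kilobytes.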
 