
.csv

 
eswar kumar
Ranch Hand
Posts: 98
In my application I'm trying to read data from a .csv file. If a cell has no value, what does the field hold when I read it: "", " ", or null?
StringTokenizer st = new StringTokenizer("Read,Line,Here", ",", false);

while (st.hasMoreTokens()) {
    String insertString = st.nextToken();
    if (insertString.equals("")) {
        insertString = null;
        System.out.println("next token is null " + insertString);
    }
}
It prints nothing. Thanks in advance.
 
Ali Gohar
Ranch Hand
Posts: 572
StringTokenizer st = new StringTokenizer("Read,Line,Here", ",", false);
while (st.hasMoreTokens()) {
    String insertString = st.nextToken();
    if (insertString.equals("")) {
        insertString = null;
        System.out.println("next token is null " + insertString);
    }
}
Of course this prints nothing: insertString will never be equal to "".
Try printing the value of insertString on each pass. It will print "Read", then "Line", then "Here", and that's it.
 
eswar kumar
Ranch Hand
Posts: 98
No, not like that. Suppose the line is
StringTokenizer st = new StringTokenizer("Read,Line,,Here,,", ",", false);
If there is a ,, in between, what will it show?
In a .csv file, if somebody forgot to enter a value in a cell, what do I get when reading that cell: "", " ", or the value "null"? That is what I want to know.
 
Stefan Wagner
Ranch Hand
Posts: 1923
Linux Postgres Database Scala
If you want to store nothing in a csv-file column, you put nothing between the commas: ,,
If you want to store a blank (' '), you put a blank between the commas: , ,
It might be displayed oddly, but I guess you know what I mean.
Whether a single blank makes any sense in your context is a different question; a name=" " doesn't make much sense.
 
Tom Blough
Ranch Hand
Posts: 263
Also be advised that if you put nothing in the csv file (i.e. adfda,,aada), StringTokenizer will return TWO tokens, not THREE. The empty field will not be returned.
StringTokenizer is not the best tool for dealing with .csv files.
You can do a little better if you have the tokenizer return the delimiters as well; then you can test whether you got two delimiters in a row and set the corresponding field to null. However, you may have to write your own parser.
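
The token count claimed above is easy to check. The following is a minimal sketch; the class name EmptyFieldDemo and the helper tokens are made up for illustration, and only java.util.StringTokenizer itself comes from the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class: shows that StringTokenizer silently skips empty fields.
final class EmptyFieldDemo {
    // Collects every token StringTokenizer yields for a comma-delimited line.
    static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, ",", false);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("adfda,,aada"));       // [adfda, aada]
        System.out.println(tokens("Read,Line,,Here,,")); // [Read, Line, Here]
    }
}
```

Because runs of delimiters are collapsed, the test insertString.equals("") from the original question can never be true with this constructor.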
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
csv allows this:
"one", "two, three", "four"
If you parse on commas you'll get four tokens when there are really only three fields and the middle field has a comma in it. So you might try:

I'm not sure if there are real rules about csv, but it might also allow something like this for quotes in strings:
"He said ""Hello"" to her"
Trickier yet. It might be time to get real good at regular expressions!
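
As an alternative to regular expressions, a small hand-written scanner can cope with both cases above. This is only a sketch under stated assumptions: the class QuotedCsvLine and its parse method are hypothetical names, the separator is assumed to be a comma, a doubled quote inside a quoted field is assumed to mean one literal quote, and whitespace outside quotes is kept as data (so the spaces after the commas in the first example would end up in the fields unless you trim them).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: parses ONE line of comma-separated text with optional double quotes.
final class QuotedCsvLine {
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        cur.append('"'); // doubled quote inside quotes: one literal quote
                        i++;
                    } else {
                        inQuotes = false; // closing quote
                    }
                } else {
                    cur.append(c); // commas inside quotes are ordinary data
                }
            } else if (c == '"') {
                inQuotes = true; // opening quote
            } else if (c == ',') {
                fields.add(cur.toString()); // separator outside quotes ends the field
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString()); // last field (empty string if the line ends with a comma)
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parse("\"one\",\"two, three\",\"four\"").size()); // 3
        System.out.println(parse("\"He said \"\"Hello\"\" to her\"").get(0)); // He said "Hello" to her
    }
}
```

Empty fields come back as empty strings, so a line such as a,,b yields three fields rather than two.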
 
Stefan Wagner
Ranch Hand
Posts: 1923
Linux Postgres Database Scala
Originally posted by Tom Blough:
Also be advised that if you put nothing in the csv file (i.e. adfda,,aada), StringTokenizer will return TWO tokens, not THREE. The empty field will not be returned.

You may tell the Tokenizer to return the separators (',') as well, if the last argument is 'true':

(look at the javadocs, perhaps it's quite the opposite).
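
For the record, the three-argument constructor StringTokenizer(String str, String delim, boolean returnDelims) does return the delimiters as tokens when the last argument is true. A minimal sketch of the "two delimiters in a row" idea follows; the class name DelimAwareTokenizer and the method fields are hypothetical, and quoted fields are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper: uses returnDelims=true to detect empty fields and map them to null.
final class DelimAwareTokenizer {
    static List<String> fields(String line) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, ",", true);
        boolean lastWasDelim = true; // the start of the line counts as "just after a delimiter"
        while (st.hasMoreTokens()) {
            String tok = st.nextToken();
            if (tok.equals(",")) {
                if (lastWasDelim) {
                    out.add(null); // nothing between two delimiters: empty field
                }
                lastWasDelim = true;
            } else {
                out.add(tok);
                lastWasDelim = false;
            }
        }
        if (lastWasDelim) {
            out.add(null); // line ended right after a delimiter (or was empty)
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(fields("Read,Line,,Here,,")); // [Read, Line, null, Here, null, null]
    }
}
```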
Another trick is to convert the empty fields to a placeholder before parsing:

if you're sure that ###NULL### cannot occur as a regular value.
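
One way the placeholder idea could look is sketched below. The regular expression is my own choice, not taken from the thread: it matches every zero-width position that sits at the start of the line or right after a comma and is followed by a comma or the end of the line, so runs like ,,, and trailing commas are covered too. The names PlaceholderTokenizer, markEmptyFields and fields are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper: fills empty fields with a marker, tokenizes, then maps the marker to null.
final class PlaceholderTokenizer {
    // Assumption: this marker never occurs as real data in the file.
    static final String NULL_MARKER = "###NULL###";

    // Inserts the marker at every empty-field position (no quote handling).
    static String markEmptyFields(String line) {
        return line.replaceAll("(?<![^,])(?=,|$)", NULL_MARKER);
    }

    static List<String> fields(String line) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(markEmptyFields(line), ",");
        while (st.hasMoreTokens()) {
            String tok = st.nextToken();
            out.add(tok.equals(NULL_MARKER) ? null : tok);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(markEmptyFields("Read,Line,,Here,,"));
        // Read,Line,###NULL###,Here,###NULL###,###NULL###
    }
}
```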
But neither will solve the (1,"green, yellow, blue","Brasil") problem.
CSV isn't an officially defined format.
Some use ',' as the separator, some use ';' and some use a tab.
Some put quotation marks around every value ("a","17","b") and some use quotation marks only for strings.
You always have to investigate or configure the source of your csv, or define your own format when writing the csv output yourself.
 
clio katz
Ranch Hand
Posts: 101
StringTokenizer is deprecated, no? Try split. It may eat, at most, one of your tokens (the last one) ... but I'm not sure (1) whether you're ready for the NullPointerException you'll get for blank fields, or (2) what 'blank' means to you ... your example seems to have a sole double quote in the 'blank' field ... this is *not* a 'valid' .CSV format. Perhaps you wrote two single quotes?
Other suggestions: (1) serialize, or (2) if you don't want to serialize and/or need a .CSV file, check what constitutes "valid" .CSV (there's no standard, but you could, for example, support Excel's .CSV conventions). Finally, (3) still want .CSV? Try a canned library like ostermillerutils (http://ostermiller.org/utils/CSV.html)
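
For anyone trying split on the example line, here is a short sketch of what String.split actually does. With the one-argument form, all trailing empty strings are dropped; passing a negative limit keeps them. Empty fields come back as "" rather than null, so calling equals("") on them is safe. The class name SplitDemo and the method fields are hypothetical, and quoted fields are still not handled.

```java
// Hypothetical demo class: compares split(",") with split(",", -1) on a line with empty fields.
final class SplitDemo {
    // Keeps every field, including trailing empty ones (no quote handling).
    static String[] fields(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String line = "Read,Line,,Here,,";
        System.out.println(line.split(",").length); // 4: trailing empty fields are dropped
        System.out.println(fields(line).length);    // 6: negative limit keeps them
        System.out.println(fields(line)[2].equals("")); // true: empty field is "", not null
    }
}
```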
 
Stefan Wagner
Ranch Hand
Posts: 1923
Linux Postgres Database Scala
StringTokenizer is deprecated, no?

No.
 
Tim West
Ranch Hand
Posts: 539
From the 1.4.2 API for StringTokenizer:

StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.

That said, it isn't yet formally @deprecated.

-Tim
 
Stefan Wagner
Ranch Hand
Posts: 1923
Linux Postgres Database Scala
It seems you're right.
Do you know why its usage is discouraged? And why is it not deprecated?
If I know a class from intensive usage, I will not reread the docs until I get a 'deprecated' warning. There could at least be a bold, red warning in the docs, so that it is seen quickly when looking up a detail.
 
Tim West
Ranch Hand
Posts: 539
I'm only guessing, but I'd say:
  • its use is discouraged because (1) the regexp-based splitting is much more powerful and is even simpler *, and (2) because the regexp stuff is so apt as a replacement, StringTokenizer may not be present in future versions of the API.
  • it's not flagged @deprecated because it's so widely used that legacy code all over the place would compile with zillions of warnings, and they don't want to force people to turn off deprecation warnings just for this one class.

Everything above is pure speculation, but that's my $0.02.
Cheers,
--Tim
* well, I'd contend String.split is easier if you assume basic regexp knowledge.
     
Richard Rodger
Greenhorn
Posts: 17
Don't forget that as well as handling commas inside quotes, you'll need to handle newlines too.

"foo,bar", "baz
bat"

should resolve to a single record: ["foo,bar","baz\nbat"]
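
The practical consequence is that reading the file line by line is not enough: records have to be split only at newlines that fall outside quotes. Below is a minimal sketch of that step, assuming double quotes as the quote character and plain \n line endings (a \r from Windows line endings would be left in the record). The names CsvRecordSplitter and records are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits raw CSV text into records, ignoring newlines inside quoted fields.
final class CsvRecordSplitter {
    static List<String> records(String text) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '"') {
                // A doubled quote ("") toggles twice, so the state is unchanged afterwards.
                inQuotes = !inQuotes;
            }
            if (c == '\n' && !inQuotes) {
                out.add(cur.toString()); // newline outside quotes ends the record
                cur.setLength(0);
            } else {
                cur.append(c); // everything else, including quoted newlines, stays in the record
            }
        }
        if (cur.length() > 0) {
            out.add(cur.toString()); // final record without a trailing newline
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "\"foo,bar\", \"baz\nbat\"\nx,y\n";
        System.out.println(records(text).size()); // 2
    }
}
```

Each record returned here still contains its quotes, so it can then be handed to whatever field parser you use.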
     