
read from csv: how to handle null?

 
Peter Primrose
Ranch Hand
Posts: 755
Hi all,

I have a CSV file with data in it. Here is an example of the data (imagine every word is in a different cell):

Country ID Capital Comment
USA 44 WAS na
JAPAN TOKYO na

The problem here is the null value in the ID of JAPAN (it's empty). When I read the next value after JAPAN (expecting the ID), I get the capital, TOKYO. Hmm... I'm expecting null?!

my code looks like this:


Can anyone guide me on how to get a null?
thanks
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
Look at the alternative constructor with the boolean returnDelims. When that is false it returns only the tokens:
[code]
JAPAN
TOKYO
NA
[/code]
Set to true, it returns the delimiters as tokens too: JAPAN, then two comma tokens in a row, then TOKYO, another comma, and NA.

Now you can detect two commas in a row.
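To see the returnDelims behaviour concretely, here is a minimal sketch. It assumes the JAPAN row arrives as the comma-separated line "JAPAN,,TOKYO,na"; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ReturnDelimsDemo {
    // Collects every token StringTokenizer yields, delimiters included.
    static List<String> tokens(String line) {
        // Third argument true = return the "," delimiters as tokens too.
        StringTokenizer st = new StringTokenizer(line, ",", true);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed raw form of the JAPAN row from the question.
        for (String token : tokens("JAPAN,,TOKYO,na")) {
            System.out.println(token);
        }
    }
}
```

This prints six tokens, one per line: JAPAN, two commas, TOKYO, a comma, and na. The two adjacent comma tokens are what mark the empty ID cell.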

BUT! You will find that CSV is trickier than it looks. For example, a string token might have a comma inside it. Then the string is probably wrapped in quotes:

RICH GUYS, 1, "GATES, BILL"

And the token might have a quote in it, so the quote gets an escape character. And the token might have escape characters in it, so they are escaped, too.

StringTokenizer is not going to do the whole job. There are enough other quirks that it would be a good idea to Google for a proven CSV parser.
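To give a feel for why quoting complicates things, here is a rough sketch of a quote-aware splitter. It assumes one common convention (fields wrapped in double quotes, with a literal quote written as two quotes in a row); real files may use other escape rules, and the names here are hypothetical. It is an illustration only, not a replacement for a tested library.

```java
import java.util.ArrayList;
import java.util.List;

class QuotedCsvSketch {
    // Splits one line on commas, except commas inside double-quoted fields.
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote = literal quote (assumed convention)
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // commas in here are ordinary data
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field has no trailing comma
        return fields;
    }
}
```

With this sketch, the line RICH GUYS,1,"GATES, BILL" comes back as three fields, and an empty cell comes back as an empty string rather than null. It does not handle quoted fields that span several lines, which is one more reason to prefer an established parser.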
[ July 24, 2006: Message edited by: Stan James ]
 
Peter Primrose
Ranch Hand
Posts: 755
ok...this is good.

so I did this:
StringTokenizer st = new StringTokenizer(line,",", true);

and I am getting the commas (,) as you described. Now, what you are saying is this: if you find 2 consecutive commas - the cell is null.
EXCELLENT! THANKS
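For anyone following along, the "two consecutive commas means a null cell" idea could look roughly like this. It is only a sketch: the class and method names are made up, and it also treats a leading or trailing comma as an empty cell, which is an assumption about what the file means.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class NullCellReader {
    // Turns one CSV line into cells, with null standing in for an empty cell.
    static List<String> cells(String line) {
        StringTokenizer st = new StringTokenizer(line, ",", true);
        List<String> cells = new ArrayList<>();
        boolean lastWasComma = true; // start of line behaves like "just saw a comma"
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            if (token.equals(",")) {
                if (lastWasComma) {
                    cells.add(null); // nothing between the previous comma and this one
                }
                lastWasComma = true;
            } else {
                cells.add(token);
                lastWasComma = false;
            }
        }
        if (lastWasComma) {
            cells.add(null); // trailing comma (or empty line) means one last empty cell
        }
        return cells;
    }
}
```

Calling cells("JAPAN,,TOKYO,na") gives a list of four entries with null in the ID position, so the columns stay lined up with the USA row.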

You also mentioned, and you were right, a problem with "GATES, BILL".

What if I receive the file as is (I can't change it) and the cell contains the value GATES, BILL (not "GATES, BILL")?

How would you recommend reading the value as GATES BILL (no comma)?
thanks Stan
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
I don't know if CSV is a real standard or just a common convention, but I'd think a cell containing a comma that isn't escaped or quoted or something is just plain wrong. For my own work with CSV (not in Java) I used Excel's save-as-CSV to generate test cases and to serve as the standard. Again, finding an established library would be the best bet here, even if it's maybe not quite as educational and fun as writing your own.
 