
Complex tokenizing question

 
Ranch Hand
Posts: 529
C++ Java Ubuntu
Hi,
Okay, it might not be complex for some, but it is for me.

Let's say I have a comma-delimited String like this:
1,0,5,"Hello","Hello, my name is Barry."

What is the best way to split this String into an array while preserving the comma in the last String? If I use StringTokenizer with a comma as my delimiter, I get this:
1
0
5
"Hello"
"Hello
my name is Barry."

But of course that's not what I want. I want only 5 elements in the array, with the last one being "Hello, my name is Barry." Also, what if I had multiple commas in an element?

Parsing is definitely not my strong point, I will admit, so if anyone could give me a little nudge I would be very grateful.

many thanks,

Barry
 
Bartender
Posts: 10336
Hibernate Eclipse IDE Java
You have to change your delimiter. There's no other easy way round this that I can think of: how could you explain to a program that "blah blah, blah blah" should be understood as a single sentence rather than two tokens?

If you can't change your delimiter, your only hope is that Strings are enclosed in quote marks, in which case you would be able to distinguish ignorable commas from delimiters. You could use the split method of String with a suitable regular expression, or step through the sentence character by character, keeping a note of when you are inside a quoted String and when you are outside it.
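For the regular-expression route, here is a rough sketch of one way it might look. It assumes every quote in the line is balanced and that quotes are never escaped inside a field; the class name RegexCsvSplit and its split method are made up for illustration. The pattern splits on a comma only when an even number of quote characters follows it, which means the comma sits outside any quoted String.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class RegexCsvSplit {

    // A comma followed by an even number of double quotes up to the end of
    // the line, i.e. a comma that is not inside a quoted String.
    // Assumes balanced quotes and no escaped quotes.
    private static final Pattern COMMA_OUTSIDE_QUOTES =
            Pattern.compile(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

    static String[] split(String line) {
        // Limit -1 keeps trailing empty fields instead of dropping them.
        return COMMA_OUTSIDE_QUOTES.split(line, -1);
    }

    public static void main(String[] args) {
        String line = "1,0,5,\"Hello\",\"Hello, my name is Barry.\"";
        for (String field : split(line)) {
            System.out.println(field);
        }
    }
}
```

With the example line from the question this prints five lines, the last being "Hello, my name is Barry." with its quote marks still attached. The lookahead rescans the rest of the line for every comma, so it is probably only sensible for short lines.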
[ March 10, 2005: Message edited by: Paul Sturrock ]
 
Barry Andrews
Ranch Hand
Posts: 529
C++ Java Ubuntu
"step through the sentence character by character, keeping a note of when you are inside a quoted String and when you are outside it"

Which is exactly what I ended up doing. Just thought there was a magical way, but I guess not. Thanks for the reply!
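For anyone finding this thread later, the character-by-character idea could be sketched roughly as follows. This is not the code actually used here, just a minimal illustration; the class name QuoteAwareSplitter and its split method are hypothetical, and it assumes quotes are balanced and kept as part of the field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, written only to illustrate the approach.
class QuoteAwareSplitter {

    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false; // true while we are between a pair of quotes

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes; // entering or leaving a quoted String
                current.append(c);    // keep the quote mark in the field
            } else if (c == ',' && !inQuotes) {
                // A comma outside quotes ends the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                // Ordinary character, or a comma inside quotes.
                current.append(c);
            }
        }
        fields.add(current.toString()); // the last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        String line = "1,0,5,\"Hello\",\"Hello, my name is Barry.\"";
        for (String field : split(line)) {
            System.out.println(field);
        }
    }
}
```

Because the flag only flips on quote marks, any number of commas inside one quoted element stays in that element, which covers the "multiple commas" case from the first post.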
 
author
Posts: 14112
Doesn't sound like a performance question. Moving to Java in General (interm.)...
 