Parsing given data

 
Ranch Hand
Hello,
I have a String with the following content:


I need to parse the String so that the result looks like the following:
Output
LANGUAGE
L
system_preferences
HiddenWords2
Unauthorized reproduction or distribution of this program, or any portion of it,
eSOMS 'warning' line 2
null
9
Unauthorized reproduction or distribution of this program, or any portion of it,

The problem with splitting the String is that I cannot find a proper delimiter in it.
Each value sits between single quotes (') and the values are separated by commas (,), but a value can itself contain either of these characters.

NOTE: This is an extract of an .sql file.

Please help.
Thanks,
Manju Krishna
 
Ranch Hand
I don't know whether my logic is appropriate for this problem, but here is my algorithm:

1. First replace every ' in your string with "" (the empty string), using the replace method of the String class.
2. Then split the result on the comma (,), using the split method of the String class. That's all.
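The two steps above could be sketched roughly as follows. The class name NaiveSplit and the sample input are made up for illustration, and the sketch assumes that no value contains a comma or a quote of its own:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: strip every single quote, then split on commas.
// Only safe when no value contains ',' or ''' itself.
final class NaiveSplit {
    static List<String> parse(String line) {
        // Step 1: replace each single quote with the empty string.
        String unquoted = line.replace("'", "");
        // Step 2: split on the comma (trailing empty strings are dropped by split).
        return Arrays.asList(unquoted.split(","));
    }
}
```

For example, NaiveSplit.parse("'LANGUAGE','L',null,9") yields the four values LANGUAGE, L, null and 9. Note that null comes back as the text "null", not as a Java null reference.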

hth
 
Manju Krishna
Ranch Hand
The first step takes care of the single quotes. But splitting the terms on the comma (,) does not work, because a single term can contain a comma itself.
 
Seetharaman Venkatasamy
Ranch Hand
Oh, I didn't notice that. Then how about the logic below?
I hope you can work out the algorithm from the code below.
 
Manju Krishna
Ranch Hand
That solves the (,) problem. But what about this string?
eSOMS 'warning' line 2

I got stuck in this vicious circle as well.

 
Seetharaman Venkatasamy
Ranch Hand

manju wrote: That solves the (,) problem. But how about this string ?
eSOMS 'warning' line 2


I thought it was a spelling mistake.

Hmm, yes. Then:

1. You need to get the values by splitting on the comma (,).

2. Then remove the first and last occurrences of the single quote from each of the split values.

You need to write your own logic for this; that is the only way.
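A minimal sketch of these two steps might look like this. The names SplitThenStrip and stripOuterQuotes are hypothetical, as is the sample line. Only the outermost pair of quotes is removed, so inner quotes such as those in eSOMS 'warning' line 2 survive, but a comma inside a value will still split that value in two:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: split on commas first, then drop only the
// first and last single quote of each piece.
final class SplitThenStrip {
    static String stripOuterQuotes(String s) {
        if (s.length() >= 2 && s.startsWith("'") && s.endsWith("'")) {
            return s.substring(1, s.length() - 1);
        }
        return s; // unquoted values such as null or 9 are returned unchanged
    }

    static List<String> parse(String line) {
        List<String> values = new ArrayList<>();
        for (String piece : line.split(",")) {
            values.add(stripOuterQuotes(piece));
        }
        return values;
    }
}
```

With the made-up input 'HiddenWords2','eSOMS 'warning' line 2',null,9 this gives HiddenWords2, eSOMS 'warning' line 2, null and 9.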
 
Ranch Hand
It looks like you have tokens of two forms, either wrapped with single quotes, or without them. Each token ends with either a comma, or with the blank (at the end of the string), and there are no blank characters between tokens. If there are no further special cases to take care of, you can then write two methods:

- hasMoreTokens: returns false if the current character is the blank, else true

- nextToken: checks the two cases.
If the current char is a single quote, search for the sequence ', (quote followed by comma), return the substring from current char up to the index of that sequence. And advance current char to the next one after the ', sequence.
In the other case (current char is not a single quote), simply return the substring starting from current char until comma.
In either case, watch if you are at the end of the string (no comma found, only blank).
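The two methods described above could be sketched along these lines. This is only an illustration under some assumptions: the class name QuoteAwareTokenizer is made up, the end of the string is detected by position instead of a trailing blank, the surrounding quotes are stripped from the returned token, and no quoted value contains the sequence ', inside its own text:

```java
// Hypothetical tokenizer: quoted tokens end at the sequence "',",
// unquoted tokens end at the next comma.
final class QuoteAwareTokenizer {
    private final String input;
    private int pos = 0; // index of the first character of the next token

    QuoteAwareTokenizer(String input) {
        this.input = input;
    }

    boolean hasMoreTokens() {
        return pos < input.length();
    }

    String nextToken() {
        if (input.charAt(pos) == '\'') {
            // Quoted token: look for the closing quote followed by a comma.
            int end = input.indexOf("',", pos + 1);
            if (end >= 0) {
                String token = input.substring(pos + 1, end);
                pos = end + 2; // skip the closing quote and the comma
                return token;
            }
            // Last token: the closing quote, if present, is the final character.
            boolean closed = input.length() > pos + 1 && input.endsWith("'");
            int last = closed ? input.length() - 1 : input.length();
            String token = input.substring(pos + 1, last);
            pos = input.length();
            return token;
        }
        // Unquoted token: read up to the next comma or the end of the string.
        int end = input.indexOf(',', pos);
        if (end < 0) {
            end = input.length();
        }
        String token = input.substring(pos, end);
        pos = Math.min(end + 1, input.length());
        return token;
    }
}
```

Calling hasMoreTokens and nextToken in a loop on an assumed line such as 'LANGUAGE','Unauthorized reproduction, or any portion of it,','eSOMS 'warning' line 2',null,9 returns five tokens, with the embedded commas and inner quotes kept inside their values.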
 
Bartender
Seems to me that what you're looking for is a well-rounded CSV parser.
While there's no formal specification of the CSV format, a decent parser should be able to handle the input.
You might still have to replace the ' occurrences with " occurrences first, though, otherwise the parser might get pissy if it adheres strictly to RFC 4180.
Still, that's better than trying to reinvent the wheel from scratch.
 
Ranch Hand
Well, it is not beautiful, but it does the job.
 
Manju Krishna
Ranch Hand
Thanks Rene, this works.

Although it processes the lines a number of times, the desired output is obtained. I can minimise the calls to this function by using it only for data that needs this special processing.

Thanks a lot to each of you for your input.

 