
StringTokenizer

 
Toulouse Laurent
Greenhorn
Posts: 8
I am using StringTokenizer to read in a file.
I store my tokens in a two-dimensional array. However, the tab-delimited text file I am reading does not always have one word per column. In other words, some entries look like:
"New York"
I want to treat this as one column, not two. Unfortunately, StringTokenizer treats the two words as separate tokens (i.e., two columns) in my case.
Is there a nextColumn method, or some other way to treat the example above as one token?
Thanks.
 
Michael Ernest
High Plains Drifter
Sheriff
Posts: 7292
Netbeans IDE VI Editor
The class generates tokens by using some value as a delimiter between them. By default that value is whitespace (space, tab, newline, carriage return, or form feed). There are pretty much two, and only two, ways around this:
1) Change the delimiter. Instead of using spaces to separate columns of information, use an offbeat character like a pipe, colon, comma, or anything else that won't be confused with a data word.
2) Hard-code your program so that when it sees 'New' it runs a subroutine to find 'York' or whatever else might follow it that constitutes a single word. Naturally, this approach is exhaustive (also known as "dictionary-driven") and only as effective as the dictionary is thorough.
Just about everyone goes with route 1) 99% of the time.
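Route 1) can be sketched roughly as follows. This is a minimal illustration, not code from the original poster's program: the class name DelimiterDemo, the helper tokenize, and the pipe-delimited sample line are all made up for the example. It compares the default whitespace delimiters with an explicit pipe delimiter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; the names here are illustrative only.
class DelimiterDemo {
    // Collects every token StringTokenizer produces for the given delimiter set.
    // Passing null for delimiters uses the default whitespace set (" \t\n\r\f").
    static List<String> tokenize(String line, String delimiters) {
        StringTokenizer tokenizer = (delimiters == null)
                ? new StringTokenizer(line)
                : new StringTokenizer(line, delimiters);
        List<String> tokens = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "apple|New York|42"; // made-up pipe-delimited record

        // Default delimiters split on the space inside "New York".
        System.out.println(tokenize(line, null)); // [apple|New, York|42]

        // With "|" as the only delimiter, "New York" stays in one token.
        System.out.println(tokenize(line, "|"));  // [apple, New York, 42]
    }
}
```

Note that the second constructor argument is a set of delimiter characters, not a single separator string, so "|," would split on either a pipe or a comma.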
------------------
Michael Ernest, co-author of: The Complete Java 2 Certification Study Guide
 
Toulouse Laurent
Greenhorn
Posts: 8
Thanks for the reply. My .txt file is tab-delimited, so I set my StringTokenizer's delimiter to "\t". It now works great, treating every column as a token. So if column 1 is "apple" and column 2 is "New York", each call to nextToken() returns the whole column as a single token.
The code looks like this:
StringTokenizer tokenizer = new StringTokenizer(line, "\t");
Thanks Again.
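For anyone landing on this thread later, here is a rough sketch of how the tab delimiter might fit with the two-dimensional array mentioned in the first post. The class name TabTableDemo, the method toTable, and the sample lines are assumptions made for illustration; the original program was not posted, and a real version would read the lines from the file instead of a hard-coded list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; the names here are illustrative only.
class TabTableDemo {
    // Splits each line on tabs only, so a cell such as "New York" stays whole.
    static String[][] toTable(List<String> lines) {
        String[][] table = new String[lines.size()][];
        for (int row = 0; row < lines.size(); row++) {
            StringTokenizer tokenizer = new StringTokenizer(lines.get(row), "\t");
            // countTokens() reports how many tokens remain, i.e. the column count here.
            String[] cells = new String[tokenizer.countTokens()];
            for (int col = 0; col < cells.length; col++) {
                cells[col] = tokenizer.nextToken();
            }
            table[row] = cells;
        }
        return table;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for lines read from the file.
        List<String> lines = Arrays.asList("apple\tNew York", "pear\tLos Angeles");
        for (String[] row : toTable(lines)) {
            System.out.println(Arrays.toString(row));
        }
        // prints:
        // [apple, New York]
        // [pear, Los Angeles]
    }
}
```

One caveat: StringTokenizer treats consecutive delimiters as a single separator, so a row with an empty column (two tabs in a row) comes back with fewer cells and the later columns shift left. If the file can contain empty columns, line.split("\t", -1) keeps them as empty strings.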
 