Tokenize a string

 
Ranch Hand
I'm trying to get a driver to accept user input and then print each word and each delimiter on its own line, leaving out the spaces, like this:

User Input:
"The quick, brown!fox jumps+over"

Output:
1. The
2. quick
3. ,
4. brown
5. !
6. fox
7. jumps
8. +
9. over

The code I have works with a hardcoded string, except that it leaves out all of the delimiter characters such as , and +. When I try it with user input, it doesn't output anything at all.
What I need help with:
1) Understanding why user input doesn't produce output
2) Understanding why the delimiter characters aren't being output at all

Any other suggestions that might make this a bit cleaner are appreciated. I'm also trying to make the top line that prompts for input red and the rest of the text blue. After the user enters a string and the output is displayed, I want a top line that says something simple like "Output". I know this can be done by putting the code for the red font where I want it, following that with the code for the blue font, and then doing the same again on the output screen. I just didn't know if there's a cleaner way to do that without repeating the code. It's not a must, just something I'm curious about as a way to keep things cleaner and simpler.
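On the color question, one common way to avoid repeating the color-switching code is to wrap it in a small helper method. The sketch below is only an illustration: the class name ConsoleText and the method WriteLineIn are made up for this example, and it assumes a console application that uses Console.ForegroundColor.

```csharp
using System;

public static class ConsoleText
{
    // Hypothetical helper: writes one line in the given color,
    // then restores whatever color was active before the call.
    public static void WriteLineIn(ConsoleColor color, string text)
    {
        ConsoleColor previous = Console.ForegroundColor;
        Console.ForegroundColor = color;
        Console.WriteLine(text);
        Console.ForegroundColor = previous;
    }

    public static void Main()
    {
        // Everything that is not a heading is printed in blue.
        Console.ForegroundColor = ConsoleColor.Blue;

        WriteLineIn(ConsoleColor.Red, "Enter a string to tokenize:");
        Console.WriteLine("(input and tokens would appear here in blue)");
        WriteLineIn(ConsoleColor.Red, "Output");

        Console.ResetColor();
    }
}
```

Because the helper restores the previous color itself, the calling code only has to set blue once.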


Driver.cs



Tools.cs
 
Bartender
I'm not sure if I would write my own tokenizer for this. It's much easier to use a regular expression to solve this problem.

You can use Regex.Split() to do this. The trick is to use a regular expression that matches the (possibly zero-width) white space around delimiters. That means you need to use lookaheads and lookbehinds.
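As a rough sketch of that idea (not necessarily the exact expression meant above), the pattern below splits on runs of white space and on the zero-width boundaries next to a delimiter. It assumes a "delimiter" is any character that is neither a word character nor white space; the names RegexTokenizer and Tokenize are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexTokenizer
{
    // Three alternatives, tried in order:
    //   \s+                      one or more white-space characters
    //   (?<=[^\w\s])(?=\S)       zero-width spot right after a delimiter
    //   (?<=\S)(?=[^\w\s])       zero-width spot right before a delimiter
    private const string Pattern = @"\s+|(?<=[^\w\s])(?=\S)|(?<=\S)(?=[^\w\s])";

    public static List<string> Tokenize(string input)
    {
        // Leading or trailing white space can yield empty strings, so drop them.
        return Regex.Split(input ?? string.Empty, Pattern)
                    .Where(token => token.Length > 0)
                    .ToList();
    }

    public static void Main()
    {
        List<string> tokens = Tokenize("The quick, brown!fox jumps+over");
        for (int i = 0; i < tokens.Count; i++)
        {
            Console.WriteLine($"{i + 1}. {tokens[i]}");
        }
    }
}
```

The lookarounds consume no characters, so the delimiters survive the split as tokens of their own, while the white space matched by \s+ is discarded.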
 
Nick Smithson
Ranch Hand
That was my first thought, but we can't use regex since it hasn't been covered yet, and we were told Split won't work for this because:

The Split method in the String class will not work for this purpose because it discards the delimiters it finds. You will need to write your own similar method using methods and properties of the String class such as Substring, IndexOf, IndexOfAny, IsNullOrEmpty, Empty, PadLeft, PadRight, Remove, Trim, and so forth.
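For illustration, here is a minimal sketch of what such a method could look like using only IndexOfAny, Substring and IsNullOrEmpty. It is not the assignment's reference solution: the class name StringTokenizer, the method name Tokenize and the delimiter set are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

public static class StringTokenizer
{
    // Assumed delimiter set; the space is treated as a delimiter that is never output.
    private static readonly char[] Delimiters = { ' ', ',', '!', '+', '.', '?', ';', ':' };

    public static List<string> Tokenize(string input)
    {
        var tokens = new List<string>();
        if (string.IsNullOrEmpty(input))
        {
            return tokens;
        }

        int start = 0;
        while (start < input.Length)
        {
            // Find the next delimiter at or after the current position.
            int hit = input.IndexOfAny(Delimiters, start);
            if (hit < 0)
            {
                // No more delimiters: the rest of the string is the last word.
                tokens.Add(input.Substring(start));
                break;
            }
            if (hit > start)
            {
                // The text between the previous position and the delimiter is a word.
                tokens.Add(input.Substring(start, hit - start));
            }
            if (input[hit] != ' ')
            {
                // Unlike Split, keep the delimiter itself (but not spaces).
                tokens.Add(input[hit].ToString());
            }
            start = hit + 1;
        }
        return tokens;
    }

    public static void Main()
    {
        List<string> tokens = Tokenize("The quick, brown!fox jumps+over");
        for (int i = 0; i < tokens.Count; i++)
        {
            Console.WriteLine($"{i + 1}. {tokens[i]}");
        }
    }
}
```

The key difference from Split is the explicit step that adds input[hit] to the list before moving past it.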

 
Bartender
If you are supposed to make your own then how about this:

You use a StringBuilder to build each token. You can get a char array from the string, then iterate over it. If the char is not a delimiter, you append it to the StringBuilder. If the char is a delimiter, you put the string from the StringBuilder into the list (if it isn't empty), clear the builder or make a new one, add the delimiter to the list, and then continue to the next char. Spaces end the current token the same way but are not added to the list. After the loop, add whatever is left in the builder as the last token.
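A minimal sketch of that loop might look like the following. The names BuilderTokenizer and Tokenize and the delimiter string ",!+" are placeholders chosen for this example, and white space is assumed to end a token without being output.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class BuilderTokenizer
{
    public static List<string> Tokenize(string input, string delimiters)
    {
        var tokens = new List<string>();
        var current = new StringBuilder();

        foreach (char c in input ?? string.Empty)
        {
            bool isSpace = char.IsWhiteSpace(c);
            bool isDelimiter = delimiters.IndexOf(c) >= 0;

            if (!isSpace && !isDelimiter)
            {
                // Ordinary character: keep building the current word.
                current.Append(c);
                continue;
            }

            // A space or delimiter ends the current word, if there is one.
            if (current.Length > 0)
            {
                tokens.Add(current.ToString());
                current.Clear();
            }

            // Delimiters become tokens of their own; spaces are dropped.
            if (isDelimiter)
            {
                tokens.Add(c.ToString());
            }
        }

        // Flush the final word, which has no delimiter after it.
        if (current.Length > 0)
        {
            tokens.Add(current.ToString());
        }
        return tokens;
    }

    public static void Main()
    {
        List<string> tokens = Tokenize("The quick, brown!fox jumps+over", ",!+");
        for (int i = 0; i < tokens.Count; i++)
        {
            Console.WriteLine($"{i + 1}. {tokens[i]}");
        }
    }
}
```

Forgetting the final flush is an easy mistake with this approach: without it, the last word ("over" in the example input) would never reach the list.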
 