identifying white spaces

 
Greenhorn
Posts: 3
Hi there
I'm trying to parse a string that has carriage returns as delimiters. To do this I'm using a StringTokenizer with "\n" as the delimiter string.
However, there may be several returns (or newlines, "\n") in a row before the next token. Or there could be any combination of newlines, tabs, and spaces before the next piece of text that we want as our token.
So, my question: is there a way to identify any whitespace (by whitespace I mean anything other than text, i.e. newlines, tabs, spaces) using a single escape sequence? For example, "\n" identifies a newline, but is there something like "\w" that would take care of all whitespace, including newlines, tabs and spaces?
Thanks
Regards
Desmond Lee
 
Ranch Hand
Posts: 97
Hi Desmond,
I'm aware of the static Character methods isSpaceChar() and isWhitespace(), but I don't know of an escape sequence that would filter out non-text characters.
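As a rough illustration of the Character route, here is a minimal sketch that walks the string and uses Character.isWhitespace() to skip any run of spaces, tabs, and newlines between tokens. The class name WhitespaceSplit and the method tokens() are made up for this example, and it assumes a token is any run of non-whitespace characters, which may not be exactly what you need.

```java
import java.util.ArrayList;
import java.util.List;

public class WhitespaceSplit {
    // Hypothetical helper: collects runs of non-whitespace characters,
    // skipping any mix of spaces, tabs, and newlines between them.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                // A whitespace character ends the token in progress, if any.
                if (current.length() > 0) {
                    out.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        // Flush the last token when the text does not end in whitespace.
        if (current.length() > 0) {
            out.add(current.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [alpha, beta, gamma]
        System.out.println(tokens("alpha\n\n\t beta \r\n gamma"));
    }
}
```

Several newlines in a row produce no empty tokens here, because a token is only emitted when at least one non-whitespace character has been collected.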
I wonder if you might need to use StreamTokenizer. It allows you to identify number, word (including single letter), end of line and end of file. You could for example use a switch statement based on the token, to identify numbers, words, end of line, end of file, and use a default to label everything else as spaces, tabs etc.
This way, you'd get all of your strings, characters, numbers and the rest wouldn't matter. It would however mean that single characters such as "$", "%" etc would end up in the default case.
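A minimal sketch of that switch idea, assuming the text is already in a String and that default StreamTokenizer settings are acceptable. The class name StreamTokenizerSketch and the method classify() are invented for illustration. One thing to be aware of: with the default settings the tokenizer skips spaces and tabs itself, so in practice the default case mostly catches single characters such as "$" and "%".

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class StreamTokenizerSketch {
    // Hypothetical helper: labels each token as a number, word,
    // end of line, or "other" (any remaining single character).
    static List<String> classify(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.eolIsSignificant(true); // report line ends as TT_EOL tokens
        List<String> out = new ArrayList<>();
        try {
            int type;
            while ((type = st.nextToken()) != StreamTokenizer.TT_EOF) {
                switch (type) {
                    case StreamTokenizer.TT_NUMBER:
                        out.add("NUMBER " + st.nval); // nval is a double
                        break;
                    case StreamTokenizer.TT_WORD:
                        out.add("WORD " + st.sval);
                        break;
                    case StreamTokenizer.TT_EOL:
                        out.add("EOL");
                        break;
                    default:
                        // Ordinary characters come back as their char value.
                        out.add("OTHER " + (char) type);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [WORD total, NUMBER 42.0, EOL, EOL, OTHER $, WORD done]
        System.out.println(classify("total 42\n\n$ done"));
    }
}
```

If the repeated EOL entries are not wanted, leaving out the eolIsSignificant(true) call makes the tokenizer treat newlines as ordinary whitespace and skip them.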
If this sounds related to what you're after, let me know, as I have an example in code.
cheerio
rowan
 
Desmond Lee
Greenhorn
Posts: 3
Hi Rowan
If you could put up the example that would be awesome....
Thanks for your help
Regards
Desmond
 
Rowan Brownlee
Ranch Hand
Posts: 97
righto - eg. follows...


 