Ranch Hand
Posts: 30
I have a few questions about StreamTokenizer.

1. Is it possible to make StreamTokenizer not distinguish between numbers and words, that is, to treat numbers as words for nextToken purposes?

2. Is it possible to have spaces treated as word characters, so that nextToken reads many words until a stopping ordinary character is reached?

3. In one of my programs I used

Yet StreamTokenizer doesn't read the character '-' as a word. What did I do wrong?

4. I am using StreamTokenizer because I am reading a stream. It seems so much simpler to convert the stream into a String and use StringTokenizer. Is StreamTokenizer really worth all the extra effort? Is it a faster, more efficient approach?

Thank you for taking time to answer my questions.
Ranch Hand
Posts: 1646
I've never used StreamTokenizer before. The following answers are from reading the JavaDocs and the source. I recommend going there first because 1) you'll get your answers faster and 2) you'll learn more.
  • The constructors for StreamTokenizer set it up to recognize numbers by default (along with words, whitespace, and comments). You can change this by calling resetSyntax() on it and then setting up your requirements yourself. Here's the constructor, but you should look at the source.
  • Do you mean that you want five spaces to return five single-space tokens? As far as I can tell from the source, calling ordinaryChar(' ') would do that, because ordinary characters are returned one at a time (check the source to be sure). If you instead want spaces to stay inside a token, call resetSyntax() and then wordChars(' ', ' '). The tokenizer returns a run of word characters as a single token and uses non-word characters to find its boundaries.
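  • This is because '-' is a number character: parseNumbers(), which the constructors call, gives the digits, '.', and '-' the numeric attribute, and wordChars() adds the word attribute without clearing the numeric one. Once you reset the syntax, it will be normal again and you can make it a word character.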
  • The benefit of StreamTokenizer is that you don't have to read in the entire stream into memory at once. If you're parsing a relatively small String (or don't mind using lots of RAM), StringTokenizer should do fine. This is similar to the difference between SAX and DOM for XML parsing.
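
To make the points above concrete, here is a minimal sketch rather than a definitive setup. It assumes the goal is to get digits, '-', '.', and spaces back as part of ordinary words, with ',' acting as the stopping ordinary character. The class name WordOnlyTokenizer, the helper tokens(), and the choice of ',' are made up for illustration; only the StreamTokenizer calls come from the JDK.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class WordOnlyTokenizer {

    // Hypothetical helper: reads tokens one at a time from any Reader,
    // so the whole input never has to be held in memory as a String.
    static List<String> tokens(Reader in) throws IOException {
        StreamTokenizer st = new StreamTokenizer(in);

        st.resetSyntax();               // all characters become ordinary; number parsing is gone
        st.wordChars('a', 'z');
        st.wordChars('A', 'Z');
        st.wordChars('0', '9');         // digits are now plain word characters
        st.wordChars('-', '-');         // no longer a numeric character after the reset
        st.wordChars('.', '.');
        st.wordChars(' ', ' ');         // spaces stay inside a word instead of ending it
        st.whitespaceChars('\n', '\n'); // line breaks still separate tokens
        st.whitespaceChars('\r', '\r');
        // ',' is left ordinary (assumption: it is the stopping character)

        List<String> out = new ArrayList<>();
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_WORD) {
                out.add(st.sval);                          // a run of word characters
            } else {
                out.add(String.valueOf((char) st.ttype));  // a single ordinary character
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Expected tokens: "abc 123", ",", "x-y 4.5"
        System.out.println(tokens(new StringReader("abc 123,x-y 4.5")));
    }
}
```

If only '-' needs fixing and the other defaults should stay, calling ordinaryChar('-') before wordChars('-', '-') appears to be enough, since ordinaryChar() clears the existing attributes for that one character. Check that against the source for your JDK version.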

[ February 05, 2005: Message edited by: David Harkness ]