
How can we count the number of words in a given String?

 
Jalli Venkat
Greenhorn
Posts: 27
Hi Friends,
I want pseudo code and logic for counting the number of words in a given String.
E.g.: String str="welcome to the javaranch big mouse saloon";

This is my string. I want to count the number of words in that String.
 
Paul Sturrock
Bartender
Posts: 10336
Eclipse IDE Hibernate Java
Not an advanced question. Moving...
 
Jesper de Jong
Java Cowboy
Sheriff
Posts: 16028
87
Android IntelliJ IDE Java Scala Spring
You can use class java.util.StringTokenizer to break a sentence into separate words, and a loop to count the number of words.

Look up StringTokenizer in the Java API documentation and try using it in your program.
 
Karan Rajan
Greenhorn
Posts: 12
A quick and easy way is to use StringTokenizer with a single space " " as the delimiter. Calling countTokens() will then give you the number of words.

e.g.
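
A minimal sketch of this approach, using the example string from the first post. The class name TokenizerWordCount and the helper countWords are made up for illustration.

```java
import java.util.StringTokenizer;

class TokenizerWordCount {
    // Counts the tokens separated by the space character.
    // Runs of several spaces are treated as one separator by StringTokenizer.
    static int countWords(String text) {
        StringTokenizer tokenizer = new StringTokenizer(text, " ");
        return tokenizer.countTokens();
    }

    public static void main(String[] args) {
        String str = "welcome to the javaranch big mouse saloon";
        System.out.println(countWords(str)); // prints 7
    }
}
```

Note that with " " as the only delimiter, tabs and line breaks are not treated as separators.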

 
Keith Lynn
Ranch Hand
Posts: 2409
Starting with JDK 1.4, the String class has a method called split() that will split the String into an array of tokens using a regex as delimiter.
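
As a rough sketch of the split() approach: the regex "\\s+" and the trim() call are one reasonable choice rather than the only one, and the class name SplitWordCount is hypothetical.

```java
class SplitWordCount {
    // Splits on runs of whitespace after removing leading and trailing whitespace.
    static int countWords(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            // Splitting an empty string returns an array holding one empty token,
            // so this case is handled separately.
            return 0;
        }
        return trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        String str = "welcome to the javaranch big mouse saloon";
        System.out.println(countWords(str)); // prints 7
    }
}
```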
 
Luciano Mantuaneli
Greenhorn
Posts: 13
Does it have to be pseudo code? I know a solution that uses features too specific to the Java API, and I can't see how to express it in pseudo code... Let me try:

If you want to see the code, let us know!
 
Luciano Mantuaneli
Greenhorn
Posts: 13
Originally posted by Keith Lynn:
Starting with JDK 1.4, the String class has a method called split() that will split the String into an array of tokens using a regex as delimiter.


This approach can be very tricky: depending on your regex, you may get distorted counts.
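
To illustrate the kind of distortion meant here, a small sketch with made-up inputs; the class name SplitPitfalls and the helper tokens are hypothetical.

```java
class SplitPitfalls {
    // Returns how many array elements split() produces for the given regex.
    static int tokens(String text, String regex) {
        return text.split(regex).length;
    }

    public static void main(String[] args) {
        String doubled = "big  mouse"; // two spaces between the words
        String leading = " big mouse"; // one leading space

        // A single-space regex yields an empty token between the two spaces.
        System.out.println(tokens(doubled, " "));    // prints 3
        // Matching runs of whitespace avoids that.
        System.out.println(tokens(doubled, "\\s+")); // prints 2
        // A leading separator still produces an empty first token.
        System.out.println(tokens(leading, "\\s+")); // prints 3
    }
}
```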
[ December 04, 2006: Message edited by: Luciano Mantuaneli ]
 
Paul Clapham
Sheriff
Posts: 22505
43
Eclipse IDE Firefox Browser MySQL Database
If you're just talking about simple English strings like the example (and this is a beginner's homework assignment) then the StringTokenizer and String.split() methods will work just fine. But in real life, especially in languages like Japanese and Thai, extracting words from strings is a rather difficult task.
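
For text where whitespace is not a reliable separator, one option in the standard library is java.text.BreakIterator. The sketch below is only a starting point: the class name and the "segment contains a letter or digit" rule are assumptions made for illustration, and how well the segmentation works for Japanese or Thai depends on the JDK's locale data.

```java
import java.text.BreakIterator;
import java.util.Locale;

class BreakIteratorWordCount {
    // Counts the segments between word boundaries that contain a letter or digit,
    // so that spaces and punctuation are not counted as words.
    static int countWords(String text, Locale locale) {
        BreakIterator boundaries = BreakIterator.getWordInstance(locale);
        boundaries.setText(text);
        int count = 0;
        int start = boundaries.first();
        for (int end = boundaries.next(); end != BreakIterator.DONE;
                start = end, end = boundaries.next()) {
            if (containsLetterOrDigit(text, start, end)) {
                count++;
            }
        }
        return count;
    }

    private static boolean containsLetterOrDigit(String text, int start, int end) {
        for (int i = start; i < end; i++) {
            if (Character.isLetterOrDigit(text.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String str = "welcome to the javaranch big mouse saloon";
        System.out.println(countWords(str, Locale.ENGLISH)); // prints 7
    }
}
```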
 
Kaydell Leavitt
Ranch Hand
Posts: 690
Eclipse IDE Firefox Browser Mac OS X
What if the String has other white-space in it, such as a non-breaking space, a tab, a carriage-return, a linefeed, or a formfeed?

Are there other kinds of whitespace?

-- Kaydell
 
Jesper de Jong
Java Cowboy
Sheriff
Posts: 16028
87
Android IntelliJ IDE Java Scala Spring
Originally posted by Luciano Mantuaneli:
If you want to see the code, let us know!

Luciano, please don't post the complete solution - that way Jalli will not learn anything. Have you read the description of the beginner forum? It says:

We're all here to learn, so when responding to others, please focus on helping them discover their own solutions, instead of simply providing answers.
 
Luciano Mantuaneli
Greenhorn
Posts: 13
Originally posted by Jesper Young:

Luciano, please don't post the complete solution - that way Jalli will not learn anything. Have you read the description of the beginner forum? It says:


Got it!

 
Keith Lynn
Ranch Hand
Posts: 2409
Originally posted by Kaydell Leavitt:
What if the String has other white-space in it, such as a non-breaking space, a tab, a carriage-return, a linefeed, or a formfeed?

Are there other kinds of whitespace?

-- Kaydell


\s matches a whitespace character: [ \t\n\x0B\f\r].

You can find a complete list in the API docs for java.util.regex.Pattern
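
A quick way to check which characters \s accepts is to test them one at a time. This is only a sketch; the class name WhitespaceClassDemo and the helper isRegexWhitespace are hypothetical.

```java
import java.util.regex.Pattern;

class WhitespaceClassDemo {
    private static final Pattern WHITESPACE = Pattern.compile("\\s");

    // True if the single character is matched by the default whitespace class.
    static boolean isRegexWhitespace(char c) {
        return WHITESPACE.matcher(String.valueOf(c)).matches();
    }

    public static void main(String[] args) {
        // Space, tab, linefeed, vertical tab (U+000B), formfeed, carriage return.
        char[] listed = {' ', '\t', '\n', '\u000B', '\f', '\r'};
        for (char c : listed) {
            System.out.println((int) c + " -> " + isRegexWhitespace(c)); // true for each
        }
        System.out.println(isRegexWhitespace('x')); // prints false
    }
}
```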
 
Kaydell Leavitt
Ranch Hand
Posts: 690
Eclipse IDE Firefox Browser Mac OS X
I didn't see non-breaking space in the list.
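
Indeed, by default \s does not match the non-breaking space (U+00A0). A small sketch of the difference, assuming you want to treat it as a separator by adding it to the character class yourself; the class and method names are hypothetical.

```java
class NonBreakingSpaceDemo {
    // Splits on the default whitespace class only.
    static int countWithPlainWhitespace(String text) {
        return text.split("\\s+").length;
    }

    // Splits on whitespace or the non-breaking space (U+00A0).
    static int countIncludingNbsp(String text) {
        return text.split("[\\s\\u00A0]+").length;
    }

    public static void main(String[] args) {
        // "big" and "mouse" are joined by a non-breaking space.
        String text = "big\u00A0mouse saloon";
        System.out.println(countWithPlainWhitespace(text)); // prints 2
        System.out.println(countIncludingNbsp(text));       // prints 3
        // Character.isWhitespace excludes it, Character.isSpaceChar accepts it.
        System.out.println(Character.isWhitespace('\u00A0')); // prints false
        System.out.println(Character.isSpaceChar('\u00A0'));  // prints true
    }
}
```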

-- Kaydell
 
Sanjit Kumar
Ranch Hand
Posts: 35
Word counting can be done the easy way with the Java API, or you can write your own algorithm: walk through the characters and increment the counter whenever the next character is a space. You should also check for other cases, such as more than one space in a row, or a string that starts with a space or ends with more than one space.
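
A sketch of such a hand-written loop. It is a variation on the idea above rather than a literal transcription: it counts the places where a word starts instead of counting spaces, which takes care of repeated, leading, and trailing spaces. The class name ManualWordCount is hypothetical.

```java
class ManualWordCount {
    // Counts transitions from whitespace (or the start of the string)
    // to a non-whitespace character.
    static int countWords(String text) {
        int count = 0;
        boolean insideWord = false;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isWhitespace(text.charAt(i))) {
                insideWord = false;
            } else if (!insideWord) {
                insideWord = true;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWords("welcome to the javaranch big mouse saloon")); // prints 7
        System.out.println(countWords("  big   mouse  ")); // prints 2
    }
}
```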
 
Jesse Crockett
Ranch Hand
Posts: 129
A simple way to count tokens of strings:



[ December 09, 2006: Message edited by: Jesse Crockett ]
 
Jeroen T Wenting
Ranch Hand
Posts: 1847
From the Javadoc of StringTokenizer:
StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.


It is effectively deprecated (though not officially marked as such) and should really not be used in new code.
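
For the java.util.regex route that the Javadoc mentions, one possible sketch is to count matches of "\\S+" (runs of non-whitespace characters). The pattern and the class name RegexWordCount are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexWordCount {
    private static final Pattern WORD = Pattern.compile("\\S+");

    // Counts runs of non-whitespace characters, so leading, trailing,
    // and repeated whitespace need no special handling.
    static int countWords(String text) {
        Matcher matcher = WORD.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWords("welcome to the javaranch big mouse saloon")); // prints 7
    }
}
```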
 