I want to create a small application that takes a file as a parameter and performs the following tasks:
counts the number of characters.
counts the number of white spaces.
counts the number of lines.
counts the number of words.
searches for a specific word.
The problem is that I am not sure about these algorithms.
Would you mind giving me some tips for writing these methods?
(For example, how can I tell that a line has ended, or that a word has ended?)
Here is some code:
Any corrections to the previous code?
I think there is something wrong with how I count the spaces and chars. What do you think?
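For the "search for a specific word" task, here is one rough sketch. It assumes a word is simply a run of non-whitespace characters and that the match is case-sensitive, so "the," (with the comma) would not match "the". The class and method names are made up for illustration and are not from the original code.

```java
// Hypothetical helper: counts how often a target word appears in a text.
final class WordSearchSketch {

    // Splits on runs of whitespace and compares each token exactly.
    static int countOccurrences(String text, String target) {
        int hits = 0;
        for (String token : text.trim().split("\\s+")) {
            if (token.equals(target)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String sample = "the cat and the hat";
        System.out.println(countOccurrences(sample, "the"));
    }
}
```

If the spec wants punctuation ignored or a case-insensitive match, the comparison step is the place to change.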
Once you decide, detecting them will be similar to your current checks. Use the Character helper methods as you have so far (and maybe new ones) to check for words.
This part seems like the crux of the assignment, so I don't want to hint too far. Try a few ways and post again if you don't get it.
What does the program specification say you should count: spaces or whitespace? Read the JavaDocs for both methods and see which matches the spec and use it.
Originally posted by John Todd:
In order to count spaces, should I use
isWhitespace() or isSpaceChar()?
How could a Character be able to tell you anything about words (other than single-letter words like "I" and "a")? Think about how you detect English words (try defining one for starters) in the context of a text file. Then translate that into a sequence of logic steps and finally code.
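As one possible translation of that idea into code, the sketch below assumes the simplest definition: a word is a maximal run of non-whitespace characters. That definition is an assumption, not something the assignment states, and it treats "it," or "don't" as single words. The class and method names are hypothetical.

```java
// Hypothetical sketch: a word starts whenever a non-whitespace character
// follows whitespace (or the start of the text).
final class WordCountSketch {

    static int countWords(String text) {
        int words = 0;
        boolean inWord = false; // true while we are inside a word
        for (int i = 0; i < text.length(); i++) {
            boolean ws = Character.isWhitespace(text.charAt(i));
            if (!ws && !inWord) {
                words++; // first character of a new word
            }
            inWord = !ws;
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(countWords("I found it,  I found\thow\n"));
    }
}
```

Counting the transition into a word, rather than every space, keeps repeated spaces and trailing newlines from inflating the total.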
Which method in the Character class counts the words?
Every system may vary. Typically, whitespace is considered to include space " ", horizontal tab "\t" and newline "\n". However, the JavaDoc for Character.isWhitespace(char) says
Originally posted by John Todd:
May I ask what the difference is between the space char and whitespace?
What that tells me is that you should never read JavaDocs before coffee. No wait, what that tells me is that the three I mentioned above are included in that much wider definition. But I would bet you that when your program is tested, only the three I mentioned will be considered (maybe carriage return if tested on a Mac). Regardless, using Character.isWhitespace(char) will count them all correctly.
A character is considered to be a Java whitespace character if and only if it satisfies one of the following criteria:
It is a Unicode space separator (category "Zs"), but is not a no-break space (\u00A0 or \uFEFF).
It is a Unicode line separator (category "Zl").
It is a Unicode paragraph separator (category "Zp").
It is \u0009, HORIZONTAL TABULATION.
It is \u000A, LINE FEED.
It is \u000B, VERTICAL TABULATION.
It is \u000C, FORM FEED.
It is \u000D, CARRIAGE RETURN.
It is \u001C, FILE SEPARATOR.
It is \u001D, GROUP SEPARATOR.
It is \u001E, RECORD SEPARATOR.
It is \u001F, UNIT SEPARATOR.
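To see how the two methods differ in practice, a small experiment like the one below may help. It is only a sketch with made-up class and method names; the sample string contains a plain space, a tab, a newline, and a no-break space.

```java
// Hypothetical demo comparing Character.isWhitespace and Character.isSpaceChar.
final class WhitespaceCheck {

    // Counts characters that Java considers whitespace (space, tab, newline, ...).
    static int countWhitespace(String text) {
        int n = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isWhitespace(text.charAt(i))) {
                n++;
            }
        }
        return n;
    }

    // Counts characters that Unicode classifies as space characters.
    static int countSpaceChars(String text) {
        int n = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isSpaceChar(text.charAt(i))) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // space, tab, newline, then a no-break space
        String sample = "a b\tc\nd\u00A0e";
        System.out.println(countWhitespace(sample)); // space, tab, newline
        System.out.println(countSpaceChars(sample)); // space, no-break space
    }
}
```

Tab and newline count for isWhitespace but not for isSpaceChar, while the no-break space is the other way around, so for ordinary text files isWhitespace is usually the closer match to "whitespace" in an assignment.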
The real trick is how to define and detect an "English word." How many words do the following sentences have?
I found it, I found how to count the words.
But I have a question.
Please look at this code:
I am increasing the number of characters every time I encounter a Unicode space or a Java space, which of course will cause the sum to produce a wrong result.
If I want to count the number of chars in a file, which method should I use:
isWhitespace or isSpaceChar?
What is the difference between a char and a letter?
Note where the various counter increment steps take place. This will solve your problem with over-counting total characters. In fact, Stan pointed this out in an earlier reply.
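A single-pass loop along these lines shows the idea: the total character counter is incremented exactly once per character, outside any whitespace test, and the other counters are bumped only inside their own checks. This is a sketch, not the code from the thread; the class name, field names, and the rule that an unterminated last line still counts as a line are all assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical sketch of a one-pass counter for chars, whitespace, lines, words.
final class FileStatsSketch {
    int chars;
    int whitespace;
    int lines;
    int words;

    static FileStatsSketch count(String text) {
        FileStatsSketch s = new FileStatsSketch();
        boolean inWord = false;
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            s.chars++; // once per character, whatever kind it is
            boolean ws = Character.isWhitespace(ch);
            if (ws) {
                s.whitespace++;
            } else if (!inWord) {
                s.words++; // first character of a new word
            }
            inWord = !ws;
            if (ch == '\n') {
                s.lines++; // a newline ends a line
            }
        }
        // Assumption: a final line without a trailing newline still counts.
        if (!text.isEmpty() && text.charAt(text.length() - 1) != '\n') {
            s.lines++;
        }
        return s;
    }

    public static void main(String[] args) throws IOException {
        // Reads the file named on the command line, or falls back to a sample.
        String text = args.length > 0
                ? new String(Files.readAllBytes(Paths.get(args[0])))
                : "one two\nthree";
        FileStatsSketch s = count(text);
        System.out.println(s.chars + " chars, " + s.whitespace + " whitespace, "
                + s.lines + " lines, " + s.words + " words");
    }
}
```

Because the character counter sits at the top of the loop, no character can be counted twice no matter how many of the other tests it passes.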
[ November 08, 2004: Message edited by: David Harkness ]