
Word Counting Loops

 
Charlie Green
Greenhorn
Posts: 6
I have a program that keeps track of how many times each word occurs in a string. I also need to keep track of the word that comes directly after each counted word, and of how many times it follows that particular word.

Example: Hello, name is Bob. Bob my name is. Will you tell me your name please?

If I search for the word name, I need the output: name - 3, is - 2, please - 1. (Not necessarily in that format; this is just an example.)

I read a text file in with a BufferedReader and put the text into a string, converted to all lower-case letters.

I have code that uses a regex to strip the punctuation and then splits the string at each space.

I put the resulting words into an array and then into a HashMap that counts the occurrences of each word.
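For reference, the steps described above can be sketched roughly as follows. This is only a minimal sketch, not the code from the original post: the class name WordCountSketch and the method names tokenize and countWords are made up for illustration, the text is passed in as a String rather than read from a file, and the regex assumes plain ASCII words (so "don't" becomes "dont").

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical class, for illustration only.
class WordCountSketch {

    // Lower-cases the text, removes everything except a-z, 0-9 and whitespace,
    // then splits on runs of whitespace.
    static String[] tokenize(String text) {
        String cleaned = text.toLowerCase(Locale.ROOT)
                             .replaceAll("[^a-z0-9\\s]", "")
                             .trim();
        return cleaned.isEmpty() ? new String[0] : cleaned.split("\\s+");
    }

    // Counts how often each word occurs in the token array.
    static Map<String, Integer> countWords(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum); // insert 1 or add 1 to the existing count
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = "Hello, name is Bob. Bob my name is. Will you tell me your name please?";
        Map<String, Integer> counts = countWords(tokenize(text));
        System.out.println(counts.get("name")); // 3
        System.out.println(counts.get("bob"));  // 2
    }
}
```

Note that a map from word to count has no notion of word order, which is why the following-word information has to be captured while walking the array, before it is collapsed into counts.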



How can I modify this HashMap so that it also keeps track of the words that follow each word?
 
Bear Bibeault
Author and ninkuma
Marshal
Posts: 66306
A Map of all words to found words doesn't seem all that useful.

A Map of found words to word count, on the other hand...
 
Charlie Green
Greenhorn
Posts: 6
So I guess I should have mentioned that I store the words and their respective counts in MongoDB, so it's a little more than just searching for a word in a string. I first need to store each word and its count as a document, then store the words that follow it and their counts as a list in a subdocument of that word, and then search the database to get the information. I can do that for the general words of the string; my problem is keeping track of the words that follow, as mentioned above.

I didn't mention this before because I really don't want the answer given to me outright, just a general direction for capturing the words that follow.
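To make the target shape concrete, here is a rough sketch of the nested structure described above, built with plain java.util collections rather than the MongoDB driver. The field names "word", "count" and "following" and the class name WordDocumentSketch are assumptions for illustration; the thread does not say what the real documents look like.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class, for illustration only.
class WordDocumentSketch {

    // Builds a nested map shaped like: one entry for the word, one for its count,
    // and a list of sub-entries for the words that follow it.
    // The keys "word", "count" and "following" are assumed names.
    static Map<String, Object> toDocument(String word, int count, Map<String, Integer> followers) {
        List<Map<String, Object>> following = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : followers.entrySet()) {
            Map<String, Object> sub = new LinkedHashMap<>();
            sub.put("word", entry.getKey());
            sub.put("count", entry.getValue());
            following.add(sub);
        }
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("word", word);
        doc.put("count", count);
        doc.put("following", following);
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Integer> followers = new LinkedHashMap<>();
        followers.put("is", 2);
        followers.put("please", 1);
        System.out.println(toDocument("name", 3, followers));
    }
}
```

Seen this way, the part that is still missing is the followers map for each word; once that exists in memory, turning it into a subdocument is mostly a matter of mapping it field by field.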
 
Paul Mrozik
Ranch Hand
Posts: 117
Hi Charlie,

Okay, general direction: you could modify the Word class by adding fields and replacing the current constructor.
 
Charlie Green
Greenhorn
Posts: 6
Maybe not that general. Here's the entire word class:

 
Paul Mrozik
Ranch Hand
Posts: 117
I would personally do something like this:


Make sure you override the hashCode() and equals() methods as well.
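As an illustration of that advice, a Word class along these lines might look roughly like the sketch below. This is not the code from the original post; the field and method names (text, followers, addFollower, getFollowers) are assumptions, and equality is assumed to depend on the word's text only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch of a Word class; names are assumed, not taken from the thread.
final class Word {
    private final String text;
    // Counts of the words seen directly after this word.
    private final Map<String, Integer> followers = new HashMap<>();

    Word(String text) {
        this.text = Objects.requireNonNull(text, "text");
    }

    String getText() {
        return text;
    }

    // Records one more occurrence of 'next' directly after this word.
    void addFollower(String next) {
        followers.merge(next, 1, Integer::sum);
    }

    Map<String, Integer> getFollowers() {
        return followers;
    }

    // Two Word objects are equal when their text is equal, so that a Word
    // behaves predictably as a HashMap key.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Word)) return false;
        return text.equals(((Word) other).text);
    }

    // Must be consistent with equals: equal words produce equal hash codes.
    @Override
    public int hashCode() {
        return text.hashCode();
    }
}
```

Without the two overrides, new Word("name") created twice would count as two different keys in a HashMap, and the counts would never accumulate.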

 
Charlie Green
Greenhorn
Posts: 6
So let's say I do something like this for the constructor:



What would the getFrequentWords() method need to look like at this point so that it counts the following words as well:

 
Paul Mrozik
Ranch Hand
Posts: 117
Charlie, please take a look at Bear's suggestion:

Bear Bibeault wrote: A Map of all words to found words doesn't seem all that useful.


You want to map words to word count so, for example, a HashMap<Word, Integer> would work well here.

I would also remove the getFrequentWords() method from the Word class; it doesn't belong there. Create a separate class called WordCounter or something similar that takes a character stream as an argument.

And finally, as far as the for loop is concerned, I'd change it to a classic indexed for loop and, while iterating, add the word at [i+1] (after checking that i+1 is still within bounds) to the internal HashMap of the Word instance.
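A rough sketch of how those three suggestions could fit together is shown below. It is one possible reading, not a definitive implementation: the nested Word class is a simplified stand-in, and the sketch indexes Word instances by their text in a Map<String, Word> instead of using a HashMap<Word, Integer>, purely so that the existing instance can be looked up and updated. The regex again assumes plain ASCII text.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical WordCounter; all names are assumptions for illustration.
class WordCounter {

    // Simplified stand-in for the Word class discussed in this thread.
    static final class Word {
        final String text;
        int count;
        final Map<String, Integer> followers = new HashMap<>();

        Word(String text) {
            this.text = text;
        }
    }

    private final Map<String, Word> words = new HashMap<>();

    // Reads the whole character stream, then counts words and their followers.
    WordCounter(Reader source) {
        StringBuilder all = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                all.append(line).append(' ');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String cleaned = all.toString().toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9\\s]", "").trim();
        String[] tokens = cleaned.isEmpty() ? new String[0] : cleaned.split("\\s+");

        // Classic indexed loop so that tokens[i + 1] can be inspected.
        for (int i = 0; i < tokens.length; i++) {
            Word word = words.computeIfAbsent(tokens[i], Word::new);
            word.count++;
            if (i + 1 < tokens.length) { // the last token has no follower
                word.followers.merge(tokens[i + 1], 1, Integer::sum);
            }
        }
    }

    // Returns the Word for the given text, or null if it never occurred.
    Word get(String text) {
        return words.get(text);
    }

    public static void main(String[] args) {
        WordCounter counter = new WordCounter(new StringReader(
                "Hello, name is Bob. Bob my name is. Will you tell me your name please?"));
        Word name = counter.get("name");
        System.out.println(name.count + " " + name.followers);
    }
}
```

Because the punctuation is stripped before the loop runs, this version also counts followers across sentence boundaries (for example "bob" directly after "bob" in the sample text). Whether that is wanted depends on the requirements.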

 
Tony Docherty
Bartender
Posts: 3271
Charlie Green wrote: Maybe not that general. Here's the entire word class: ...

Why does your Word class have an empty constructor? I can see no reason why you would ever want to create a Word object without passing in at least the word.
Also, your getFrequentWords() method doesn't use any instance fields or methods, so it should be static.

Charlie Green wrote: So let's say I do something like this for the constructor:

The problem with your code is that it only allows for one following word and doesn't record a count for it. Think about the code you already have for recording words and their frequencies, and how you could reuse that class.
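One way to read the reuse hint is sketched below: a small frequency-table class is used once for the overall counts and once per word for its followers. The class and method names (FrequencyTable, FollowerIndex, of, followCount) are hypothetical and not taken from the thread, and the factory method is static because it depends only on its argument.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: counts how often each word was added.
class FrequencyTable {
    private final Map<String, Integer> counts = new HashMap<>();

    void add(String word) {
        counts.merge(word, 1, Integer::sum);
    }

    int get(String word) {
        return counts.getOrDefault(word, 0);
    }
}

// Hypothetical index that reuses FrequencyTable for totals and for followers.
class FollowerIndex {
    private final FrequencyTable totals = new FrequencyTable();
    private final Map<String, FrequencyTable> followers = new HashMap<>();

    private FollowerIndex() {
    }

    // Static factory: uses no instance state, only the tokens passed in.
    static FollowerIndex of(String[] tokens) {
        FollowerIndex index = new FollowerIndex();
        for (int i = 0; i < tokens.length; i++) {
            index.totals.add(tokens[i]);
            if (i + 1 < tokens.length) { // skip the last token, which has no follower
                index.followers
                     .computeIfAbsent(tokens[i], key -> new FrequencyTable())
                     .add(tokens[i + 1]);
            }
        }
        return index;
    }

    // Total occurrences of the word.
    int count(String word) {
        return totals.get(word);
    }

    // How many times 'next' appeared directly after 'word'.
    int followCount(String word, String next) {
        FrequencyTable table = followers.get(word);
        return table == null ? 0 : table.get(next);
    }
}
```

The same counting logic serves both purposes, so any number of distinct following words can be recorded, each with its own count.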
 