
Java exercise help

 
henry dias
Greenhorn
Posts: 2
Find the 10 most common and 10 least common words in the text file.
Find the 10 most common and 10 least common bigrams in the file. (A bigram is two words that follow each other in the text. For example, "The red fox" contains 2 bigrams: "The red" and "red fox". See http://en.wikipedia.org/wiki/Bigram)
Find the longest phrase that appears at least twice.
("Longest phrase" here is measured by the number of words, NOT letters.)
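
To make the bigram definition concrete, here is a minimal sketch that pairs each word with the one after it. The class name `Bigrams` and the method `of` are made up for this illustration, and the input is assumed to be already split into words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class Bigrams {
    // Joins every word with its successor, so n words give n - 1 bigrams.
    static List<String> of(List<String> words) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i + 1 < words.size(); i++) {
            result.add(words.get(i) + " " + words.get(i + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [The red, red fox]
        System.out.println(of(Arrays.asList("The", "red", "fox")));
    }
}
```

The resulting strings can be counted the same way as single words.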



 
Henry Wong
author
Marshal
Posts: 21490

Please tell us what you have done so far. Please tell us *exactly* what issue you are running into. We can't help you if we don't know where you are stuck.

Henry
 
henry dias
Greenhorn
Posts: 2
The issue I am facing is finding the correct regular expression.

I am getting words such as "citizen.", "citizen," and "citizen;". I want to remove the , . ; after the words and then check for their repetition in the file.
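
For illustration, one possible way to strip the punctuation is sketched below. It uses the `\p{Punct}` character class from `java.util.regex` to remove punctuation at the start and end of each token. The class name `WordCleaner`, its method names, and the lower-casing step are assumptions made for this example, not part of the exercise.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class WordCleaner {
    // Removes leading and trailing punctuation, then lower-cases the token
    // (lower-casing is an assumption so that "Citizen" and "citizen" match).
    static String clean(String token) {
        return token.replaceAll("^\\p{Punct}+|\\p{Punct}+$", "")
                    .toLowerCase(Locale.ROOT);
    }

    // Splits a line on whitespace and keeps only the non-empty cleaned words.
    static List<String> words(String line) {
        List<String> result = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            String word = clean(token);
            if (!word.isEmpty()) {
                result.add(word);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [citizen, citizen, and, citizen]
        System.out.println(words("citizen. Citizen, and citizen;"));
    }
}
```

Punctuation inside a word, such as the apostrophe in "don't", is left alone because only the ends of the token are trimmed.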


 
Rob Spoor
Sheriff
Posts: 20661
Please UseCodeTags next time. I've added them for you this time.
 
Campbell Ritchie
Sheriff
Posts: 50171
Welcome to the Ranch

I do not think a regular expression will help you at all. Go through the Java™ Tutorials Collections section and you will find a counting application example.
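
As a rough sketch of the counting approach (not the tutorial's own code), a `Map<String, Integer>` can hold one counter per word, and the entries can then be sorted by count. The class name `WordCount`, the methods `count` and `top`, and the alphabetical tie-break are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WordCount {
    // Builds a word -> number of occurrences map.
    static Map<String, Integer> count(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Returns up to n entries: least frequent first when ascending is true,
    // most frequent first otherwise. Ties are broken alphabetically.
    static List<Map.Entry<String, Integer>> top(Map<String, Integer> counts,
                                                int n, boolean ascending) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        Comparator<Map.Entry<String, Integer>> byCount = Map.Entry.comparingByValue();
        Comparator<Map.Entry<String, Integer>> byKey = Map.Entry.comparingByKey();
        if (!ascending) {
            byCount = byCount.reversed();
        }
        entries.sort(byCount.thenComparing(byKey));
        return entries.subList(0, Math.min(n, entries.size()));
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(Arrays.asList("a", "b", "a", "c", "a", "b"));
        System.out.println(top(counts, 2, false)); // prints [a=3, b=2]
        System.out.println(top(counts, 1, true));  // prints [c=1]
    }
}
```

For the exercise, `top(counts, 10, false)` and `top(counts, 10, true)` would give the 10 most common and 10 least common entries respectively.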
 