word frequency program for unicode

 
Greenhorn
Posts: 1
I'm new to Java and am stuck on a problem:
I need to write a program that counts word frequencies in a file containing Urdu words (in Unicode).
I would appreciate any help getting started. I understand that the BufferedWriter, FileInputStream, and FileOutputStream classes would be very helpful.
I would also appreciate tips on handling whitespace, newlines, punctuation marks, and end of file.
 
Sheriff
Posts: 11343
"the shrink,"

Welcome to JavaRanch!

First, please revise your display name to meet the JavaRanch Naming Policy. To maintain the friendly atmosphere here at the ranch, we like folks to use real (or at least real-looking) names. You can edit your name here. Thank you for your prompt attention.

Now, with respect to your question, where exactly are you stuck? Can you post some of the code that you have so far?

-Marc
 
Ranch Hand
Posts: 135
Since Java strings are Unicode, the fact that the words are Urdu is largely irrelevant, provided the file is read with the correct character encoding (for example UTF-8).
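
To make the encoding point concrete, here is a minimal sketch of reading a whole text file into a String with an explicit charset. It assumes the file is saved as UTF-8, which the original question does not state; the class name ReadUrduFile, the method readAll, and the file name urdu.txt are placeholders made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ReadUrduFile {

    // Reads the file line by line, decoding bytes as UTF-8 (an assumption
    // about how the file was saved), and returns the text with '\n' line ends.
    public static String readAll(Path file) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            // readLine() returns null at end of file; there is no EOF character to test for
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // "urdu.txt" is a placeholder used when no path is given on the command line
        Path file = Paths.get(args.length > 0 ? args[0] : "urdu.txt");
        System.out.println(readAll(file));
    }
}
```

If the file turns out to be in a different encoding (UTF-16, say), only the charset argument should need to change.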

A common approach to counting words (or any kind of substrings) is:

1. Create a HashMap<String,Integer>.
2. For each word in the string: if it is not in the hash map, add it with a count of 1; otherwise increment its count.
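
The two steps above could be sketched roughly like this. The class and method names (WordFrequency, countWords) are invented for the example, and the rule used to split words is an assumption: anything that is not a Unicode letter (\p{L}) or combining mark (\p{M}) is treated as a separator, which covers spaces, newlines, and punctuation such as the Urdu full stop. Text that uses zero-width joiners inside words would need a different rule.

```java
import java.util.HashMap;
import java.util.Map;

public class WordFrequency {

    // Splits the text on runs of characters that are neither letters nor
    // combining marks, then tallies each remaining token in a HashMap.
    public static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.split("[^\\p{L}\\p{M}]+")) {
            if (word.isEmpty()) {
                continue; // split() yields an empty first token when the text starts with a separator
            }
            // merge() stores 1 for a new word, otherwise adds 1 to the existing count
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Urdu sample written with Unicode escapes so the source-file encoding does not matter
        String sample = "\u06CC\u06C1 \u06A9\u062A\u0627\u0628 \u06C1\u06D2\u06D4 "
                      + "\u06CC\u06C1 \u0642\u0644\u0645 \u06C1\u06D2\u06D4";
        for (Map.Entry<String, Integer> entry : countWords(sample).entrySet()) {
            System.out.println(entry.getKey() + " : " + entry.getValue());
        }
    }
}
```

A HashMap does not keep any particular order, so the words print in arbitrary order; swapping in a TreeMap would give sorted output if that matters.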
 
Ranch Hand
Posts: 3061
What have you done so far? At the very least, I assume you know to start with a main() method. Please post some code to illustrate what you have tried.

Layne
 