
word frequency program for unicode

I'm new to Java and am stuck on a problem:
I need to write a program that counts word frequencies in a file containing Urdu (Unicode) words.
I would appreciate any help getting started. I understand that the BufferedWriter, FileInputStream, and FileOutputStream classes would be very helpful.
I would also appreciate tips on handling whitespace, newlines, punctuation marks, and end-of-file.
"the shrink,"

Welcome to JavaRanch!

First, please revise your display name to meet the JavaRanch Naming Policy. To maintain the friendly atmosphere here at the ranch, we like folks to use real (or at least real-looking) names. You can edit your name here. Thank you for your prompt attention.

Now, with respect to your question, where exactly are you stuck? Can you post some of the code that you have so far?

-Marc
Since Java strings are Unicode, the fact that the words are Urdu is largely irrelevant, provided the file is read with the correct character encoding.

A common approach to counting words (or any kind of substrings) is:

1. Create a HashMap<String, Integer>.
2. For each word in the text, if it is not yet in the map, add it with a count of 1; otherwise just increment its count.
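The two steps above could be sketched roughly as follows. This is only one possible starting point, not a definitive solution. It assumes the file is encoded as UTF-8 and that its path is passed as the first command-line argument; the class name WordFrequency and the method countWords are made up for illustration. Words are taken to be runs of Unicode letters and combining marks, so whitespace, newlines, and punctuation (including the Urdu comma and full stop) all act as separators, and digits are not counted.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class name, chosen only for this sketch.
class WordFrequency {

    // Counts the words in one piece of text.
    // A "word" here is a run of letters (\p{L}) and combining marks (\p{M});
    // everything else (spaces, punctuation, digits) is treated as a separator.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.split("[^\\p{L}\\p{M}]+")) {
            if (word.isEmpty()) {
                continue; // a leading separator produces one empty token
            }
            // Adds the word with count 1 if absent, otherwise adds 1 to its count.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Integer> totals = new HashMap<>();
        // Assumption: the input file is UTF-8; change the charset if it is not.
        try (BufferedReader reader =
                 Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            String line;
            // readLine() returns null at end of file, so no special EOF handling is needed.
            while ((line = reader.readLine()) != null) {
                countWords(line).forEach((w, n) -> totals.merge(w, n, Integer::sum));
            }
        }
        totals.forEach((w, n) -> System.out.println(w + "\t" + n));
    }
}
```

Two caveats: whether the Urdu words display correctly on the console depends on the console's own encoding, and format characters such as the zero-width non-joiner are treated as separators by this pattern, so the regular expression may need adjusting for real text.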
What have you done so far? At the very least, I assume you know to start with a main() method. Please post some code to illustrate what you have tried.

Layne