Text Processor: Inputting the contents of a text file into a hashmap

 
Ranch Hand
Posts: 52
1
Before I post my issue I'd just like to say sorry for spamming with questions recently.

I've decided to implement my Text Processing program using Hash Tables, in order to output the number of times each word in a text file appears.

My issue seems to lie inside the while loop, when I'm trying to increment the integer by 1 if it comes across the same word again.

This is the error I'm seeing:

Exception in thread "main" java.lang.NullPointerException
       at TextProcessorUsingHashmaps.main(TextProcessorUsingHashmaps.java:36)



Which value is it referring to as being null, word or f?

Here is my code:

 
Marshal
Posts: 68110
258
  • Likes 2
You have nothing to say sorry about.
The correct version of the counting program is in the Java™ Tutorials (look for “Map Interface Basic Operations”). See if you can see the difference between what they have there and what you wrote. Then I shall challenge you to explain why you are suffering an NPE in line 36. (Thank you for providing the right line number, which made it easy for me to guess what is happening.)
Also have a look at the put() method and see whether it worries about k being null.

[Addition] Don't write spaces around the < and > when specifying types.
Find out about a much better way to close resources.
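The "much better way" is presumably try-with-resources, which closes the resource automatically when the block exits, even if an exception is thrown. A minimal sketch, assuming a hypothetical class CloseResourcesSketch and using a Reader in place of a real file so that it runs anywhere:

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Scanner;

public class CloseResourcesSketch {

    // Counts whitespace-separated tokens. The Scanner declared in the
    // try (...) header is closed automatically when the block exits,
    // so no explicit close() call or finally block is needed.
    static int countTokens(Reader source) {
        int count = 0;
        try (Scanner in = new Scanner(source)) {
            while (in.hasNext()) {
                in.next();
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // StringReader stands in for a file here (an assumption for the demo).
        System.out.println(countTokens(new StringReader("to be or not to be")));
    }
}
```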
 
Marshal
Posts: 6869
182
Eclipse IDE Postgres Database VI Editor Chrome Java Ubuntu
  • Likes 1
Here is a minor point:  Instead of writing
write
1) Using the interface (Map) as a type is better than using the implementation (HashMap).  It gives any future programmers (or you!) the flexibility of using a different implementation if necessary.
2) You can let the compiler infer the type by just typing "<>".
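The two declarations being compared did not survive in this copy of the thread. As an illustration of both points, here is a sketch with hypothetical names (MapDeclarationSketch, newCounts, counts) that are not taken from the original post:

```java
import java.util.HashMap;
import java.util.Map;

public class MapDeclarationSketch {

    // The return type is the Map interface; callers never depend on HashMap.
    static Map<String, Integer> newCounts() {
        // Before: HashMap<String, Integer> counts = new HashMap<String, Integer>();
        // After:  interface type on the left, diamond "<>" on the right.
        Map<String, Integer> counts = new HashMap<>();
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = newCounts();
        counts.put("word", 1);
        System.out.println(counts); // prints {word=1}
    }
}
```

Switching to a different implementation later (a TreeMap for sorted output, say) would then change only the right-hand side of the declaration.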
 
Charles Ormond
Ranch Hand
Posts: 52
1

Campbell Ritchie wrote:You have nothing to say sorry about.
The correct version of the counting program is in the Java™ Tutorials (look for “Map Interface Basic Operations”). See if you can see the difference between what they have there and what you wrote. Then I shall challenge you to explain why you are suffering an NPE in line 36. (Thank you for providing the right line number, which made it easy for me to guess what is happening.)
Also have a look at the put() method and see whether it worries about k being null.

[Addition] Don't write spaces around the < and > when specifying types.
Find out about a much better way to close resources.




Thanks Campbell Ritchie, I have implemented the code from the Tutorial you linked. The code simply returned "0 distinct words" so I think the issue is I had to "put" some words into the hashmap first before I ran the loop.

"message": "The method put(String, Integer) in the type Map<String,Integer> is not applicable for the arguments (String[], int)", line 23

As I'm using an array where a string should be. I understand the error but don't know how to correct the code:


 
Sheriff
Posts: 15043
252
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux

Charles Ormond wrote:so I think the issue is I had to "put" some words into the hashmap first before I ran the loop.


This issue was not that, so whatever you're doing to fix the non-issue is now giving you an issue. Simply don't do it.

The actual issue was that get() was returning a null because there was no such element in the map. Your code that checks for null on the next line already addresses that issue.
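The original code is not preserved here, so the following is only a sketch of the failure mode being described, not the poster's program. It assumes the count was unboxed straight from get(); the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class NullUnboxingSketch {

    // Shows the failure: get() returns null for a key that is not in the map,
    // and assigning that null to an int forces an unboxing that throws.
    static boolean unboxingMissingKeyThrows() {
        Map<String, Integer> counts = new HashMap<>();
        try {
            int f = counts.get("absent"); // NullPointerException on unboxing
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Null-safe version: keep the result as an Integer and test it first.
    static void increment(Map<String, Integer> counts, String word) {
        Integer f = counts.get(word);
        if (f == null) {
            counts.put(word, 1);      // first time this word is seen
        } else {
            counts.put(word, f + 1);  // seen before, bump the count
        }
    }

    public static void main(String[] args) {
        System.out.println("NPE on missing key: " + unboxingMissingKeyThrows());
        Map<String, Integer> counts = new HashMap<>();
        increment(counts, "pip");
        increment(counts, "pip");
        System.out.println(counts);
    }
}
```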
 
Charles Ormond
Ranch Hand
Posts: 52
1

Junilu Lacar wrote:

Charles Ormond wrote:so I think the issue is I had to "put" some words into the hashmap first before I ran the loop.


This issue was not that, so whatever you're doing to fix the non-issue is now giving you an issue. Simply don't do it.

The actual issue was that get() was returning a null because there was no such element in the map. Your code that checks for null on the next line already addresses that issue.



Ok. I've reverted my code. So what is the proper way to put elements into the map?


 
Junilu Lacar
Sheriff
Posts: 15043
252
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
After reverting your code, did you test it to see if it works the way you expect it to?
 
Junilu Lacar
Sheriff
Posts: 15043
252
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
Your code tells me you're still guessing at a solution for this problem. The code prompts for a file to be read, then sets up a Scanner with System.in as the source, then your loop references the args array, which contains the command line arguments. You need to decide on where your input is coming from.
 
Charles Ormond
Ranch Hand
Posts: 52
1

Junilu Lacar wrote:after reverting your code, did you test it to see if it works the way you expect it to?



Just outputs

0 distinct words:
{}

 
Charles Ormond
Ranch Hand
Posts: 52
1

I should have been clearer. To clarify, I would like the program to process just a text file for the time being. I will work on adding command line text later on. I've amended the code now to remove args as I didn't know that was just for the command line, and replaced it with an array taken from the text file.

This is now the result of running the program and inputting the filepath:

Please enter the filepath:
Test Text Document.txt
12 distinct words:
{prefix=][negative=1, suffix=][negative=1, string=\Q∞\E]=1, suffix=][NaN=1, valid=false][need=1, separator=\.][positive=1, string=\Q�\E][infinity=1, separator=\,][decimal=1, java.util.Scanner[delimiters=\p{javaWhitespace}+][position=0][match=1, closed=false][skipped=false][group=1, input=false][source=1, prefix=\Q-\E][positive=1}



There are several thousand words in the test text file.

This is the updated code:

 
Saloon Keeper
Posts: 6937
65
Eclipse IDE Firefox Browser MySQL Database VI Editor Java Windows
toString() returns the state of the Scanner object, not the contents of the file.
 
Junilu Lacar
Sheriff
Posts: 15043
252
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
Google for how to read lines from a file using java.util.Scanner
 
Charles Ormond
Ranch Hand
Posts: 52
1
Ok, I've removed the toString and replaced it with nextLine, but then this only records the frequency for 1 line of the text file like so:

9 distinct words:
{and=1, Pirrip,=1, name=1, Christian=1, being=1, My=1, father�s=1, family=1, my=1}



What do I need to do for it to process all of the lines?


 
Junilu Lacar
Sheriff
Posts: 15043
252
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
The key to figuring out what to do is to read the documentation, understand it, and experiment. Not sensing a lot of that on your part right now. We encourage people to ShowSomeEffort. If you show what kind of different things you tried and why you thought those things might work, that would go a long way to showing effort to learn. Don't just try one thing then give up. Try and try again.
 
Bartender
Posts: 21756
148
Android Eclipse IDE Tomcat Server Redhat Java Linux
We LOVE lots of questions. That's what we're here for!

Incidentally, this is an alternative (and probably better) way to deal with the situation of "incrementing" for the first time:



A variant of that would use if/else, but I like this version because you jam in a starting value and always use the increment code instead of doing 2 separate logic paths, which I consider to be more prone to error.
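The snippet this post refers to is missing from this copy of the thread. The following is a minimal sketch of the "jam in a starting value, then always increment" idea as described, with hypothetical names; it is not necessarily the code that was posted.

```java
import java.util.HashMap;
import java.util.Map;

public class SeedThenIncrementSketch {

    // Seed a missing key with 0, then run the same increment path every time.
    static void increment(Map<String, Integer> counts, String word) {
        if (!counts.containsKey(word)) {
            counts.put(word, 0);                 // starting value for a new word
        }
        counts.put(word, counts.get(word) + 1);  // single increment path
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : "my name my family".split(" ")) {
            increment(counts, w);
        }
        System.out.println(counts);
    }
}
```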
 
Carey Brown
Saloon Keeper
Posts: 6937
65
Eclipse IDE Firefox Browser MySQL Database VI Editor Java Windows

Tim Holloway wrote:Incidentally, this is an alternative (and probably better) way to deal with the situation of "incrementing" for the first time:
A variant of that would use if/else, but I like this version because you jam in a starting value and always use the increment code instead of doing 2 separate logic paths, which I consider to be more prone to error.


Hate to disagree with you but this is better:
Yours has two gets (I count containsKey() as a get) and 1 or 2 puts, whereas this has one get and one put.
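The snippet being praised here is also missing. One form that matches the description (one get, one put, with the conditional operator that later posts discuss) would look roughly like the sketch below; the names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class OneGetOnePutSketch {

    // One lookup and one store per word: the null returned for a missing key
    // is handled by the conditional operator instead of a separate containsKey().
    static void increment(Map<String, Integer> counts, String word) {
        Integer freq = counts.get(word);
        counts.put(word, freq == null ? 1 : freq + 1);
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : "my name my family".split(" ")) {
            increment(counts, w);
        }
        System.out.println(counts);
    }
}
```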
 
Charles Ormond
Ranch Hand
Posts: 52
1
Apologies if it comes across as though I'm not putting in effort and just want to be given answers, as I genuinely have been experimenting with different things but my arsenal of things I know how to try is limited and so I keep getting stuck.

To read all the lines into one string, rather than it just reading one line, I've looked into the readAllLines method which uses a filepath as a parameter; might this be the right way to go?

I've tried various ways of implementing the readAllLines method but none are working...

This is the current error:

Exception in thread "main" java.nio.charset.MalformedInputException: Input length = 1
       at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
       at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
       at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
       at java.io.InputStreamReader.read(InputStreamReader.java:184)
       at java.io.BufferedReader.fill(BufferedReader.java:161)
       at java.io.BufferedReader.readLine(BufferedReader.java:324)
       at java.io.BufferedReader.readLine(BufferedReader.java:389)
       at java.nio.file.Files.readAllLines(Files.java:3205)
       at java.nio.file.Files.readAllLines(Files.java:3242)
       at TextProcessorUsingHashmaps.main(TextProcessorUsingHashmaps.java:22)



And the current code:

 
Knute Snortum
Marshal
Posts: 6869
182
Eclipse IDE Postgres Database VI Editor Chrome Java Ubuntu
Files.readAllLines() returns a List<String>.  You're going to want to roll through this list and for each line, break it into words in a String[], then as you're doing, roll through the String[] and count the words.
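The two nested loops described above can be sketched as follows. The class name, the method name, and the choice to split on whitespace are assumptions for illustration; the counting step is one of several options discussed in this thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReadAllLinesSketch {

    // Outer loop walks the lines, inner loop walks the words of each line.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            String[] words = line.trim().split("\\s+");
            for (String word : words) {
                if (word.isEmpty()) {
                    continue; // a blank line splits into a single empty string
                }
                Integer freq = counts.get(word);
                counts.put(word, freq == null ? 1 : freq + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ReadAllLinesSketch <file>");
            return;
        }
        // readAllLines returns a List<String>, one element per line of the file.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        Map<String, Integer> counts = countWords(lines);
        System.out.println(counts.size() + " distinct words:");
        System.out.println(counts);
    }
}
```

Note that the one-argument Files.readAllLines(Path) decodes the file as UTF-8 and throws MalformedInputException on bytes that are not valid UTF-8. If the file is in another encoding, the overload that also takes a Charset may be what is needed.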
 
Junilu Lacar
Sheriff
Posts: 15043
252
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux

Charles Ormond wrote:I genuinely have been experimenting with different things but my arsenal of things I know how to try is limited and so I keep getting stuck.


There's no need for you to say sorry. However, I will apologize if I seem a little hard on you here but the way I see it, there are still more things you can do to try to help yourself.

For example,

To read all the lines into one string, rather than it just reading one line, I've looked into the readAllLines method which uses a filepath as a parameter; might this be the right way to go?


This makes me wonder what you didn't understand in the API documentation. The question "might this be the right way?" is something you can answer yourself through some experimentation.

This is what the API documentation for readAllLines() says (emphasis mine):

--------
public static List<String> readAllLines(Path path)
                                throws IOException

...
Returns:
the lines from the file as a List; whether the List is modifiable or not is implementation dependent and therefore not specified
---------

The documentation already says what Knute pointed out: that readAllLines() returns a List<String>. I hoped you'd show some code you tried that used a List<String> but instead, it looks like you tried to assign the value returned by readAllLines to a String (line 22). That won't even compile. See Fix All Compiler Errors Before Running The Application

I've tried various ways of implementing the readAllLines method but none are working...


See "It Doesn't Work" is Useless and Search First. The internet is full of examples. If you search for Files.readAllLines() examples you'll find plenty of code examples you can experiment with and learn from.

I understand that you're still trying to find your way around but I think you'd be doing yourself a big favor by learning to use the available documentation and examples already out there. As you've seen, there are plenty of people around here who are ready and willing to help but I think you can still do more to help yourself. Believe me, programming is a lot more fun when you figure things out yourself.
 
Tim Holloway
Bartender
Posts: 21756
148
Android Eclipse IDE Tomcat Server Redhat Java Linux
I must be missing something. Why not simply use the Scanner to read words directly from the file and update the tallies directly without all the intermediate work? Is there something the Scanner can't do here?
 
Junilu Lacar
Sheriff
Posts: 15043
252
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
No, you're not missing anything, Tim. Scanner.next() can be used to solve the wordcount problem. A search for using Scanner to count words in a file examples turns up examples that show exactly how it can be done.
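A minimal sketch of that approach, assuming whitespace-delimited words and hypothetical names (ScannerWordCountSketch, countWords). Scanner.next() skips line breaks and blank lines on its own, so no per-line handling is needed:

```java
import java.io.IOException;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

public class ScannerWordCountSketch {

    // Reads tokens straight from the Scanner and tallies them as it goes.
    static Map<String, Integer> countWords(Scanner in) {
        Map<String, Integer> counts = new HashMap<>();
        while (in.hasNext()) {
            String word = in.next();
            Integer freq = counts.get(word);
            counts.put(word, freq == null ? 1 : freq + 1);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ScannerWordCountSketch <file>");
            return;
        }
        // try-with-resources closes the Scanner (and the file) automatically.
        try (Scanner in = new Scanner(Paths.get(args[0]))) {
            Map<String, Integer> counts = countWords(in);
            System.out.println(counts.size() + " distinct words:");
            System.out.println(counts);
        }
    }
}
```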
 
Campbell Ritchie
Marshal
Posts: 68110
258

Junilu Lacar wrote:Google for how to read lines from a file using java.util.Scanner

I wouldn't if I wuz you, Guv. The chances of a Google search finding a correct solution are slim. In fact the first solution to appear on my search is incorrect.

The problem that solution has is that it uses while (myScanner.hasNextLine()) ... I have seen it before on this forum. Many text files end with a line end sequence; indeed some editors automatically add one at the end of the last line regardless. That means the last line is empty, and you will be trying to deal with an empty line, and that can cause problems, even exceptions. Also that solution uses legacy code (File rather than Path). But to its credit it does the correct exception handling, with try with resources rather than myScanner.close().
I would suggest:-
You will suffer problems if you have an empty line in the middle of your file.

[edit]Spelling correction i→if
 
Tim Holloway
Bartender
Posts: 21756
148
Android Eclipse IDE Tomcat Server Redhat Java Linux
OK, we long ago agreed that Sun's Scanner class is pretty horrible. But explain to me the issue about the last line. And is there a problem with something like this:
 
Junilu Lacar
Sheriff
Posts: 15043
252
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux

Campbell Ritchie wrote:

Junilu Lacar wrote:Google for how to read lines from a file using java.util.Scanner

I wouldn't if I wuz you, Guv. The chances of a Google search finding a correct solution are slim


Well, your mileage may vary. True, there are plenty of bad examples out there and it takes a while to get a feel for what's bad and what's good. While you're on that journey, however, I think it actually can help you learn by experimenting with several different solutions and seeing for yourself 1) whether the proposed solutions work (you should be able to explore this yourself) 2) if there are problems with the proposed solutions (asking the opinion of others can help you here), and 3) what can be done to address those problems (again, the opinion of more experienced people helps here).

I'm not suggesting copy-paste without understanding. If you really want to learn, making mistakes or experiencing other people's mistakes is as good a way to learn as any. In fact, the way most programmers learn best is by making a lot of mistakes and writing a lot of bad code before figuring out what good code looks like and how to write it.
 
Campbell Ritchie
Marshal
Posts: 68110
258
I agree that you learn lots more from your mistakes than from good code which works first time, but a search throws the inexperienced into potentially deep water where they can't see potential problems. I found examples using next() and examples where they get all confused about whether they should change the default delimiter from \n to \r\n. Those statements are plausible and beginners won't know whether they are errors or correct.
Some of the hits are so old that new File(...) was the up‑to‑date way to refer to a file when the posts were written.
 
Junilu Lacar
Sheriff
Posts: 15043
252
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
I still wouldn't discourage anyone from searching for examples. If you can't tell whether or not an example is good, that's when asking is perfectly appropriate. Personally, I'd prefer seeing more posts like this:

I did a Google search for "flim flam flux examples" and saw one here (URL) that looked promising. I admit I don't really know if it's a good example but I tried it and it seems to work. Can you guys take a look and see if that approach was valid or if not, suggest better ways to do it? If you have any links to examples you can recommend, that would be great, too. Thanks!

 
Campbell Ritchie
Marshal
Posts: 68110
258
If only we ever saw posts like that....
 
Saloon Keeper
Posts: 3027
414
Android Eclipse IDE Angular Framework MySQL Database TypeScript Redhat Java Linux

Carey Brown wrote:

Tim Holloway wrote:Incidentally, this is an alternative (and probably better) way to deal with the situation of "incrementing" for the first time:
A variant of that would use if/else, but I like this version because you jam in a starting value and always use the increment code instead of doing 2 separate logic paths, which I consider to be more prone to error.


Hate to disagree with you but this is better:
Yours has two gets (I count containsKey() as a get) and 1 or 2 puts, whereas this has one get and one put.


What about:
 
Carey Brown
Saloon Keeper
Posts: 6937
65
Eclipse IDE Firefox Browser MySQL Database VI Editor Java Windows
  • Likes 1

Ron McLeod wrote:

Carey Brown wrote:

Tim Holloway wrote:Incidentally, this is an alternative (and probably better) way to deal with the situation of "incrementing" for the first time:
A variant of that would use if/else, but I like this version because you jam in a starting value and always use the increment code instead of doing 2 separate logic paths, which I consider to be more prone to error.


Hate to disagree with you but this is better:
Yours has two gets (I count containsKey() as a get) and 1 or 2 puts, whereas this has one get and one put.


What about:


How does that handle put() of freq+1 ?
 
Ron McLeod
Saloon Keeper
Posts: 3027
414
Android Eclipse IDE Angular Framework MySQL Database TypeScript Redhat Java Linux

Carey Brown wrote:How does that handle put() of freq+1 ?


Whoops .. of course you're right ...
 
Tim Holloway
Bartender
Posts: 21756
148
Android Eclipse IDE Tomcat Server Redhat Java Linux
There are some who would argue that the ternary operator is an abomination that should be expunged from existence.

I wouldn't go that far, but it is slightly cryptic to the less skilled and fairly easy to subtly screw up.

Also, from the JavaDocs on Map:

JavaDocs wrote:
If this map permits null values, then a return value of null does not necessarily indicate that the map contains no mapping for the key; it's also possible that the map explicitly maps the key to null. The containsKey operation may be used to distinguish these two cases.



So not a good practice to develop, even though we know that a null value isn't appropriate for this exercise.

I like Ron's approach, as it embeds the "ifPresent" test in the logic that anticipates non-existing entries. Neither a ternary nor an explicit "if" is needed:


With straight-line code you can't "go" wrong.
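The snippet this post showed is not preserved. One straight-line form that needs neither a conditional operator nor an explicit "if" can be sketched with Map.putIfAbsent(); that this is the method the post used is an assumption, and the class name is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class PutIfAbsentSketch {

    // Two unconditional statements: seed the key if it is missing,
    // then increment whatever value is there.
    static void increment(Map<String, Integer> counts, String word) {
        counts.putIfAbsent(word, 0);             // no effect if the word already has a count
        counts.put(word, counts.get(word) + 1);  // get() can no longer return null here
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : "my name my family".split(" ")) {
            increment(counts, w);
        }
        System.out.println(counts);
    }
}
```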
 
Ron McLeod
Saloon Keeper
Posts: 3027
414
Android Eclipse IDE Angular Framework MySQL Database TypeScript Redhat Java Linux
  • Likes 2
This would be another option:
 
Tim Holloway
Bartender
Posts: 21756
148
Android Eclipse IDE Tomcat Server Redhat Java Linux

Ron McLeod wrote:This would be another option:



Better still. Now it's only one non-conditional statement. I didn't notice that one in my quick scan of the docs, but I was thinking of its equivalent for Properties and how nice one would be. Someone else obviously thought so, too!
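The one-statement option is not shown in this copy of the thread. The mention of an equivalent for Properties, and the later reference to getOrDefault(...), suggest it was Map.getOrDefault(), so this sketch assumes that; the class name is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class GetOrDefaultSketch {

    // A single statement: a missing key reads as 0, then the count is stored back.
    static void increment(Map<String, Integer> counts, String word) {
        counts.put(word, counts.getOrDefault(word, 0) + 1);
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : "my name my family".split(" ")) {
            increment(counts, w);
        }
        System.out.println(counts);
    }
}
```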
 
Carey Brown
Saloon Keeper
Posts: 6937
65
Eclipse IDE Firefox Browser MySQL Database VI Editor Java Windows
Rather than hijack this thread any further I started a new TOPIC under performance.
I put three approaches in a class, show the byte code breakdown, and have some nano-second timing for each of them. From fastest to slowest they were backwards to what I was expecting.
Purely from a performance standpoint, this was the best:
 
Campbell Ritchie
Marshal
Posts: 68110
258

Carey Brown wrote:. . . Hate to disagree with you but this is better . . .

I think all those solutions would work, though Carey is right: some are better than others. OP's original problem, which I presume he sorted after reading my hint about the Java™ Tutorials and giving him the line number, is that a newly “put” word doesn't have a “V” associated with it. So when you use get(), it is going to return null and you can't unbox nulls.
Is a method call like orElse(...), getOrDefault(...), etc. really a non‑selection statement? Would it be more accurate to say that the method call does the selection for you and hides the awkward details of if‑else and ?: from you?
I suspect many beginners are scared of ?:. It took me ages to learn how it works.
 
Campbell Ritchie
Marshal
Posts: 68110
258

Carey Brown wrote:. . . From fastest to slowest they were backwards to what I was expecting. . . .

Hahahahahahahahaha! Isn't that always the way.
 
Junilu Lacar
Sheriff
Posts: 15043
252
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
Did anyone mention using Map.merge()?
 
Campbell Ritchie
Marshal
Posts: 68110
258
No. Does Map#merge() do what we want? Yes, I think from looking at that link, it can do that. Please show us an example.
 
Bartender
Posts: 3776
154
For instance
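The posted example is missing from this copy. A minimal sketch of counting with Map.merge(), using hypothetical names, might look like this:

```java
import java.util.HashMap;
import java.util.Map;

public class MergeSketch {

    // merge() stores 1 when the key is absent; otherwise it combines the
    // existing count with 1 using Integer::sum and stores the result.
    static void increment(Map<String, Integer> counts, String word) {
        counts.merge(word, 1, Integer::sum);
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : "my name my family".split(" ")) {
            increment(counts, w);
        }
        System.out.println(counts);
    }
}
```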


Another way is using groupingBy. For instance:
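This example is also missing, so here is a sketch of the groupingBy idea with hypothetical names. Note that Collectors.counting() produces Long values, so the result is a Map<String, Long> rather than a Map<String, Integer>.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class GroupingBySketch {

    // Groups equal words together and counts the members of each group.
    static Map<String, Long> countWords(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("my", "name", "my", "family");
        System.out.println(countWords(words));
    }
}
```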
 