Java8 Find repeated words from list of list

 
Ranch Hand
Hi,

I am parsing paragraphs and collecting the words of each paragraph into a list. I want to find the repeated words of each paragraph.
I have written the code below to find the repeated words in a List of Lists.
I would like to know whether it needs any improvement.





Thanks,
Atul
 
Marshal
Please explain what you are looking for. Are you putting the words into a Map? What is your definition of a repeated word? Can you take a Stream from the List instead?
 
Atul More
Ranch Hand
Hi,

The logic I am trying to build here is:
1. Calculate the occurrences of each word across all the lists: add all the paragraph lists into a single list, then count how often each word occurs and store the counts in a map.
2. Then take each list and look up the occurrence count of each of its words in the map.
3. If the count is greater than 1, add that word to the result list.

This way I am trying to achieve the result.
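
The three steps above can be sketched roughly as follows. This is only an illustration of the described logic, not the originally posted code: the class name `RepeatedWords`, the helper `countWords`, and the sample data are assumptions, and the `distinct()` call is an addition so that a word is reported only once per paragraph.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class RepeatedWords {

    // Step 1: merge every paragraph into one stream and count each word.
    static Map<String, Long> countWords(List<List<String>> paragraphs) {
        return paragraphs.stream()
                .flatMap(List::stream)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Steps 2 and 3: for one paragraph, keep the words whose overall count exceeds 1.
    // distinct() is an assumption: it reports each word at most once per paragraph.
    static List<String> findRepeatedWords(List<String> paragraph, Map<String, Long> counts) {
        return paragraph.stream()
                .distinct()
                .filter(word -> counts.getOrDefault(word, 0L) > 1)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical sample data, chosen to match the "AAA"/"BBB" example in this thread.
        List<String> para1 = Arrays.asList("AAA", "BBB", "CCC");
        List<String> para2 = Arrays.asList("AAA", "BBB", "DDD");
        List<String> para3 = Arrays.asList("EEE");
        List<List<String>> all = Arrays.asList(para1, para2, para3);

        Map<String, Long> counts = countWords(all);
        for (List<String> para : all) {
            System.out.println(para + " -> " + findRepeatedWords(para, counts));
        }
    }
}
```

With this sample data, para1 and para2 both yield [AAA, BBB] and para3 yields an empty list. Note that under this logic a word that appears twice inside a single paragraph also gets a count of 2.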

Thanks,
Atul
 
Campbell Ritchie
Marshal
I am afraid that doesn't help. You need to be specific. What is your definition of a repeated word?
 
Atul More
Ranch Hand
Let's take the example that I mentioned in the code.
"AAA" and "BBB" occur multiple times:
in both the para1 and para2 lists. So for them the count would be 2.
These are the repeated words for me, and I am looking for those.
 
Campbell Ritchie
Marshal
We need a definition. Only when you have defined what you are going to do can you work out how to do it.
Do you mean words which occur more than once per sentence? Per paragraph? In the whole document?
How do you get the count to be 2?
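
To show why the definition matters, here is a minimal sketch contrasting two possible meanings of "repeated". The class name `RepeatDefinitions`, both method names, and the sample lists are hypothetical and are not taken from the posted code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;
import java.util.stream.Collectors;

class RepeatDefinitions {

    // Definition A: words that occur more than once inside a single paragraph.
    static Set<String> repeatedWithin(List<String> paragraph) {
        Map<String, Long> counts = paragraph.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        return counts.entrySet().stream()
                .filter(e -> e.getValue() > 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    // Definition B: words that occur in more than one paragraph.
    // Each paragraph contributes a word at most once, thanks to distinct().
    static Set<String> sharedAcross(List<List<String>> paragraphs) {
        Map<String, Long> paragraphCounts = paragraphs.stream()
                .flatMap(p -> p.stream().distinct())
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        return paragraphCounts.entrySet().stream()
                .filter(e -> e.getValue() > 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        // Hypothetical data: "CCC" repeats inside para1, "AAA" and "BBB" are shared.
        List<String> para1 = Arrays.asList("AAA", "BBB", "CCC", "CCC");
        List<String> para2 = Arrays.asList("AAA", "BBB", "DDD");

        System.out.println(repeatedWithin(para1));                       // [CCC]
        System.out.println(repeatedWithin(para2));                       // []
        System.out.println(sharedAcross(Arrays.asList(para1, para2)));   // [AAA, BBB]
    }
}
```

Under definition B, "AAA" gets a count of 2 because it appears in two paragraphs. Under definition A, neither "AAA" nor "BBB" is repeated at all.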
 
Bartender
Your class has only two methods and no members, so you might as well make these methods static. In that case you do not need an instance of your class.

Your parax lists are small here, so it does not make much difference, but for real paragraphs with many more words you might want to convert them to a HashSet to speed up the 'contains' part in 'findRepeatedWords'.
Another improvement might be to filter first for entry.getValue() > 1 and only then, in the next filter, check whether the string is present in the parax.
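
A rough sketch of these suggestions, assuming the word counts live in a Map<String, Long>; the class name `RepeatedWordsFast`, the helper `countWords`, and the sample data are made up for illustration, and the final `sorted()` is only there to give a stable order.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

class RepeatedWordsFast {

    // Static helper (hypothetical): count every word across all paragraphs.
    static Map<String, Long> countWords(List<List<String>> paragraphs) {
        return paragraphs.stream()
                .flatMap(List::stream)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Static as well: no instance state is needed.
    static List<String> findRepeatedWords(List<String> parax, Map<String, Long> counts) {
        // HashSet lookup is O(1) on average, versus a linear scan for List.contains.
        Set<String> words = new HashSet<>(parax);
        return counts.entrySet().stream()
                .filter(entry -> entry.getValue() > 1)           // cheap check first
                .filter(entry -> words.contains(entry.getKey())) // then the membership test
                .map(Map.Entry::getKey)
                .sorted()                                        // map order is unspecified
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical sample data.
        List<String> para1 = Arrays.asList("AAA", "BBB", "CCC");
        List<String> para2 = Arrays.asList("AAA", "BBB", "DDD");
        Map<String, Long> counts = countWords(Arrays.asList(para1, para2));

        System.out.println(findRepeatedWords(para1, counts)); // [AAA, BBB]
    }
}
```

Filtering on the count first means the set lookup runs only for words that are already known to be repeated.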
 