
Find most frequent String using a Predicate.

 
Tom Storm
Ranch Hand
Hi,

I have a number of Java Beans stored in a list. I would like to use a predicate to filter the list. I have a getter method in the beans that returns a string for each bean. I would like the predicate to find the most frequent String that is returned and then filter the objects to those that contain that String. I found a number of posts mentioning hashmaps but it seems very memory inefficient. Is it possible to do this?



Thanks,
TS
 
Junilu Lacar
Sheriff
Tom Storm wrote:I would like to use a predicate to filter the list. ... I would like the predicate to find the most frequent String that is returned and then filter the objects to those that contain that String... Is it possible to do this?

Your first goal, filtering the list, fits the purpose of a predicate, but your second goal, finding the most frequent String, does not. What you describe here is a typical map-reduce problem, so I suggest you look down that road instead.
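
The two-step split described above (count first, then filter) might be sketched as follows. This is only an illustration: the `Flight` bean, its `getCarrier` getter, and the helper names are hypothetical stand-ins for the poster's classes, which are not shown in the thread. It also assumes the getter never returns null, and ties between equally frequent Strings are resolved arbitrarily.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical bean; only the String getter matters for this sketch.
class Flight {
    private final String carrier;
    private final int delayMinutes;

    Flight(String carrier, int delayMinutes) {
        this.carrier = carrier;
        this.delayMinutes = delayMinutes;
    }

    String getCarrier() { return carrier; }
    int getDelayMinutes() { return delayMinutes; }
}

class MostFrequentFilter {

    // Step 1 (map-reduce): count how often each String occurs, then take the largest count.
    static <T> Optional<String> mostFrequent(List<T> items, Function<T, String> getter) {
        Map<String, Long> counts = items.stream()
                .collect(Collectors.groupingBy(getter, Collectors.counting()));
        return counts.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    // Step 2 (predicate): keep only the items whose getter returns the winning String.
    static <T> List<T> keepMostFrequent(List<T> items, Function<T, String> getter) {
        Optional<String> winner = mostFrequent(items, getter);
        if (!winner.isPresent()) {
            return Collections.emptyList(); // empty input, nothing to keep
        }
        Predicate<T> hasWinner = item -> winner.get().equals(getter.apply(item));
        return items.stream().filter(hasWinner).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Flight> flights = Arrays.asList(
                new Flight("AA", 5), new Flight("DL", 20), new Flight("AA", 15));
        System.out.println(mostFrequent(flights, Flight::getCarrier).orElse("none")); // AA
        System.out.println(keepMostFrequent(flights, Flight::getCarrier).size());     // 2
    }
}
```

The predicate only ever answers "does this bean match the winner?"; the counting happens once, before the filter runs.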
 
Winston Gutkowski
Bartender
Tom Storm wrote:I would like the predicate to find the most frequent String that is returned and then filter the objects to those that contain that String. I found a number of posts mentioning hashmaps but it seems very memory inefficient. Is it possible to do this?

Yes. But even if you can do it with a Stream, who knows whether it will be any more "memory efficient"? Certainly not me; and I'm pretty sure that Java offers no guarantees on that basis, nor indeed any that it will be optimally throughput efficient.

Streams are in their infancy in Java, which means that you may be taking on a "bleeding edge" technique that takes a while to work out all the logistical bugs. That's no reason not to use it - particularly if it does WHAT you want - just don't expect it to be necessarily the best (or most efficient) when it comes to either memory or throughput, even if it "looks good".

Personally, I like 'em. They've been long overdue; but I wouldn't make any claims about efficiency until at least Java 10.

Winston
 
Campbell Ritchie
Marshal
A map is by no means inefficient for memory. The cost is roughly four references per “K”: the backing array is sized to a power of 2 above the number of “K”s, and each entry holds three references.
With the usual load factor of 75%, the array is enlarged to the next power of 2 above 1⅓ times the number of entries.
So a 1000‑element Map will have a 2048‑element array of references to Map.Entry objects, plus 1000 Map.Entry objects, each with three references (one to the “K”, one to the “V” and one to the next Map.Entry).
You are looking at something in the region of 40kB to 80kB, depending on the size of the Map.Entry objects. The String objects need no additional space because they already exist.
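
The arithmetic above can be checked with a rough back-of-the-envelope sketch. The doubling rule (start at 16, double whenever the size exceeds 75% of the capacity) matches a default-constructed `java.util.HashMap`; the byte sizes passed in are assumptions for illustration only, since real figures depend on the JVM and its settings. The class and method names are hypothetical, and the estimate ignores the count values stored in the map.

```java
class HashMapFootprint {

    // Table length of a default-constructed HashMap after inserting `entries` items:
    // start at 16 and double whenever the size exceeds capacity * 0.75.
    static int tableLength(int entries) {
        int capacity = 16;
        while (entries > capacity * 0.75) {
            capacity *= 2;
        }
        return capacity;
    }

    // Rough estimate: one reference per table slot plus one entry object per mapping.
    // referenceBytes and entryBytes are assumed sizes, not measured ones.
    static long estimateBytes(int entries, int referenceBytes, int entryBytes) {
        return (long) tableLength(entries) * referenceBytes + (long) entries * entryBytes;
    }

    public static void main(String[] args) {
        System.out.println(tableLength(1000));          // 2048
        // Assumed 4-byte references and 32-byte entries.
        System.out.println(estimateBytes(1000, 4, 32)); // 40192
        // Assumed 8-byte references and 48-byte entries.
        System.out.println(estimateBytes(1000, 8, 48)); // 64384
    }
}
```

Under either set of assumed sizes the result stays inside the 40kB to 80kB range quoted above.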
 
Tom Storm
Ranch Hand
Thanks for the replies. I was able to use a HashMap to find the most frequent String. However, I'm running into an exception when I try to filter a list and assign the result to a new list. The code is below. Is it the code causing the error or the actual data?



The error I'm receiving is java.lang.NumberFormatException: For input string: ""0918"". I think the double quotes are the problem, but I don't know why they're there. The stack trace points to the Predicate departureDelayOver10Filter and the filtering of the existing list.

Thanks again,
TS

 
Carey Brown
Saloon Keeper
It doesn't like the quotes. The leading zero is harmless with Integer.parseInt, though Integer.decode would read it as octal and reject the 9.
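
Since the failing code is not shown in the thread, the following is only a small diagnostic sketch for checking which part of the input a parser objects to. The class and method names are hypothetical. It relies on standard behaviour: `Integer.parseInt` accepts leading zeros but rejects quote characters, while `Integer.decode` treats a leading `0` as an octal prefix.

```java
class ParseDemo {

    // True if Integer.parseInt accepts the text as a base-10 integer.
    static boolean parses(String text) {
        try {
            Integer.parseInt(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // True if Integer.decode accepts the text (a leading 0 means octal here).
    static boolean decodes(String text) {
        try {
            Integer.decode(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parses("0918"));     // true: the leading zero is fine
        System.out.println(parses("\"0918\"")); // false: the quote characters are rejected
        System.out.println(decodes("0918"));    // false: 9 is not a valid octal digit
    }
}
```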
 
Tom Storm
Ranch Hand
Hi there,

Do you know of any method of removing the quotes or why they are there? The dataset doesn't have these quotes.
Thanks,
TS
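
One way to remove the quotes before parsing is sketched below. The helper names are hypothetical. As for where the quotes come from, one possible cause (an assumption, since the file-reading code is not shown) is a CSV file whose fields are wrapped in quotes: spreadsheet viewers hide them, but a plain split on commas leaves them in each field.

```java
class QuoteStripper {

    // Removes one pair of surrounding double quotes, if present, after trimming whitespace.
    static String stripQuotes(String raw) {
        String trimmed = raw.trim();
        if (trimmed.length() >= 2 && trimmed.startsWith("\"") && trimmed.endsWith("\"")) {
            return trimmed.substring(1, trimmed.length() - 1);
        }
        return trimmed;
    }

    // Parses a possibly quoted numeric field; still throws NumberFormatException
    // if the unquoted text is not a base-10 integer.
    static int parseField(String raw) {
        return Integer.parseInt(stripQuotes(raw));
    }

    public static void main(String[] args) {
        System.out.println(parseField("\"0918\"")); // 918
        System.out.println(parseField("0918"));     // 918
    }
}
```

Printing the raw field with visible delimiters, for example `System.out.println("[" + field + "]")`, is a quick way to confirm whether the quotes are really in the data being read.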
 