Performing a group by Statement on a Java Collection

 
Binesh Gunaratne
Greenhorn
Posts: 4
I've got a very simple Java List of objects.

The object would be something like:
...
VehicleID,
VehicleName,
ModelName,
...

List<Vehicles> vehicleList = new ArrayList<Vehicles>();

I want to do a "group by" and count on this Java List on, say, "VehicleName" and retrieve a count for each distinct value, e.g.:

...
VehicleName: xyzzzz : count: 2
VehicleName: xyzzyyy : count: 10
...

I was just looking for the optimal way of doing this. I know that you could iterate through the List with an Iterator, place the values into another List and increment the count on each later occurrence. I could also use a Map... but these ways seem a little inefficient. Are there any better ways to perform a group by on a List? I noticed that there's a Collections.frequency(...), however this can only be used if I know beforehand which element I'm looking for...

Thanks.

This is my first time posting on this forum, so please be nice.



 
Campbell Ritchie
Marshal
Posts: 56592
Welcome to the Ranch

GROUP BY is an SQL instruction, and databases tend to have lots of indices. When the RDBMS sees you have a GROUP BY, it can use an index on the vehicle name column, possibly one maintained as rows are INSERTed, so as to expedite searching. I can't think of a built-in way to do this in Java™. Java™ is a 3rd-generation general-purpose applications language, whereas SQL is a 4th-generation language for databases, i.e. it has specific optimisations and higher-level commands for database use. GROUP BY is one of those high-level commands, which Java™ hasn't got.

I am afraid you are going to have to iterate your List, or put all the objects sought into a second List and sort that List with a Comparator<Vehicle>. There is nothing inefficient about using Maps; there is an example of counting in the Java™ Tutorials section about Maps. Methods like put() in the HashMap class run in amortised constant time, so counting the whole List takes linear time. The difference is that a database will index the entries before you search, using memory and processor time, whereas you build the Map only at the point where you need the counts. Swings and roundabouts.
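
The Map approach described above could look something like the following. This is only a sketch: the nested Vehicle class and its getVehicleName() method are hypothetical stand-ins for the class in the original question, which was not shown in full.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class VehicleCounter {

    // Hypothetical stand-in for the poster's class; only the name field matters here.
    public static class Vehicle {
        private final String vehicleName;

        public Vehicle(String vehicleName) {
            this.vehicleName = vehicleName;
        }

        public String getVehicleName() {
            return vehicleName;
        }
    }

    // One pass over the List; each get/put on a HashMap is amortised constant time.
    public static Map<String, Integer> countByName(List<Vehicle> vehicles) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (Vehicle v : vehicles) {
            String name = v.getVehicleName();
            Integer current = counts.get(name);
            // First sighting starts the count at 1; later sightings increment it.
            counts.put(name, current == null ? 1 : current + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Vehicle> vehicleList = new ArrayList<Vehicle>();
        vehicleList.add(new Vehicle("xyzzzz"));
        vehicleList.add(new Vehicle("xyzzyyy"));
        vehicleList.add(new Vehicle("xyzzzz"));

        // HashMap iteration order is unspecified, so the lines may print in any order.
        for (Map.Entry<String, Integer> e : countByName(vehicleList).entrySet()) {
            System.out.println("VehicleName: " + e.getKey() + " : count: " + e.getValue());
        }
    }
}
```

If the output needs to be sorted by name, swapping the HashMap for a TreeMap would do that, at the cost of logarithmic rather than constant time per put().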
 
Matthew Brown
Bartender
Posts: 4568
That's not to say Java couldn't have been designed to do this. For example, C# provides this functionality in its collections framework via LINQ (a SQL-like syntax for querying various data sources, including collections). But while it would be more concise code, it wouldn't necessarily be more efficient than doing it the long-hand way.
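
For comparison, on Java 8 or later the streams API offers something close to the LINQ style. The sketch below assumes such a JDK is available; the nested Vehicle class and getVehicleName() are hypothetical stand-ins for the class in the question. Under the hood the collector still fills a Map in a single pass, so it is more concise rather than faster.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupByStreams {

    // Hypothetical stand-in for the poster's class.
    public static class Vehicle {
        private final String vehicleName;

        public Vehicle(String vehicleName) {
            this.vehicleName = vehicleName;
        }

        public String getVehicleName() {
            return vehicleName;
        }
    }

    // Roughly: SELECT vehicleName, COUNT(*) ... GROUP BY vehicleName
    public static Map<String, Long> countByName(List<Vehicle> vehicles) {
        return vehicles.stream()
                .collect(Collectors.groupingBy(Vehicle::getVehicleName, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Vehicle> vehicleList = Arrays.asList(
                new Vehicle("xyzzzz"), new Vehicle("xyzzyyy"), new Vehicle("xyzzzz"));

        // The order of the printed entries is not guaranteed.
        countByName(vehicleList).forEach((name, count) ->
                System.out.println("VehicleName: " + name + " : count: " + count));
    }
}
```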
 
Binesh Gunaratne
Greenhorn
Posts: 4
Thank you so much for your responses. I guess this strays from my initial question, but after doing some googling I found a 3rd-party library that does let you do a group by on a collection; what's more, it even lets you do other cool stuff like retrieving sub-sets: http://code.google.com/p/lambdaj/. I just thought I would share it with everyone; hopefully someone will find it useful.

I have not dug into the source code yet, but I did do a simple speed test of my own implementation (List and Map) against the group by provided by lambdaj, and my own solution came out on top. I've added the two bits of code at the bottom for anyone's comments, as there might be scalability or robustness issues that I have not noticed. So far the long-hand way seems to be winning.

My Code:


Lambdaj code:




Thanks again for everyone's replies!



 