• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
Sheriffs:
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
Bartenders:
  • Mikalai Zaikin

Picking a random Key and Value from a Map?

 
Sheriff
Posts: 5555
326
IntelliJ IDE Python Java Linux
  • Likes 1
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Here's a seemingly trivial coding task that I just can't seem to find an elegant solution for. Perhaps it's a Friday afternoon 'dumbness' thing but I'm just not seeing it.

I have a Map that represents a set of names for a group:
With content that looks something like this:
What I'd like is an elegant way to extract a random Key Value pair from that Map. For example:
There is no requirement for any strong randomness or any even distribution of the chosen values. A rough approximation is all that is required as I'm just generating some test data for a performance benchmark I'm writing for a new feature I've written at work.

All help greatly appreciated. Cows waiting
 
Saloon Keeper
Posts: 15489
363
  • Likes 1
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
First some questions:

  • Do you need an entry, or is just the value also okay?
  • If an entry, do changes to it need to be reflected in the map?
  • Are entries ever removed from the map?
  • Can we depend on a specific map implementation?
  •  
    Ranch Hand
    Posts: 86
    18
    • Likes 1
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator
    Not an elegant solution, but I would simply iterate over the map to a random value between 0 and the count of possibilities and return the key/value pair at this random position.

     
    Bartender
    Posts: 5465
    212
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator
    I like this version:
     
    Saloon Keeper
    Posts: 10687
    85
    Eclipse IDE Firefox Browser MySQL Database VI Editor Java Windows ChatGPT
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator
    Or....
     
    Tim Cooke
    Sheriff
    Posts: 5555
    326
    IntelliJ IDE Python Java Linux
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator
    Lots to think about so far. Thanks very much.

    To answer your questions Stephan:

    Stephan van Hulst wrote:

  • Do you need an entry, or is just the value also okay?
  • Just the String value of some Map key and the String value of one of the Map value Set.

    Stephan van Hulst wrote:

  • If an entry, do changes to it need to be reflected in the map?
  • No. The map mustn't change.

    Stephan van Hulst wrote:

  • Are entries ever removed from the map?
  • No

    Stephan van Hulst wrote:

  • Can we depend on a specific map implementation?
  • Yes, any you choose.
     
    Stephan van Hulst
    Saloon Keeper
    Posts: 15489
    363
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator
    My preferred way would be to implement a custom Map that delegates all calls to a backing Map, but also maintains a List of keys in parallel. Extend it with a method getRandomEntry() that picks a random key from the list, uses it to get the associated value, and returns a new AbstractMap.SimpleEntry.
     
    Stephan van Hulst
    Saloon Keeper
    Posts: 15489
    363
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator
    This may seem a little overkill, but you can also use it to add other methods you might be missing. It has the same properties as a LinkedHashMap because it's based on one, including the time and space complexities of all methods.

     
    Stephan van Hulst
    Saloon Keeper
    Posts: 15489
    363
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator

    Stephan van Hulst wrote:It has the same properties as a LinkedHashMap because it's based on one, including the time and space complexities of all methods.


    Except the put(), remove() and related methods (e.g. putIfAbsent()); those have the time complexity of ArrayList.contains() and ArrayList.remove() respectively.
     
    author & internet detective
    Posts: 41860
    908
    Eclipse IDE VI Editor Java
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator
    I'd create two helper objects. First, I'd convert the Map<String, Set<String>> to a Map<String, List<String>>. An easy conversion and makes the problem easier. I'd also create a List of the keys.

    Then I'd just use Random to get two indexes and be good to go.


     
    Carey Brown
    Saloon Keeper
    Posts: 10687
    85
    Eclipse IDE Firefox Browser MySQL Database VI Editor Java Windows ChatGPT
    • Likes 1
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator
    Jeanne, although it probably doesn't make a difference in this particular exercise, I'm uncomfortable with the statistical distribution of your random choices by calling nextInt() twice. For example, I've modified the loading of groups in your code.
    Using the randomKeyIndex you have a roughly 1 in 2 chance of selecting either "Gp1" or "Grp2". So let's say it's Grp1, then with randomValueIndex each String in the list has a 1 in 10 chance of being selected, which gives us a 1 in 20 chance that selection "a" would be chosen. Conversely, If Grp2 is selected there's a 1 in 1 chance that "K" will be chosen, or 1 in 2 overall chance. This means that the random distribution over all items in all the lists would not be equal.
     
    Stephan van Hulst
    Saloon Keeper
    Posts: 15489
    363
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator
    In most solutions the core idea is to use a separate list of keys. I guess the preferred solution depends on whether speed or memory usage is more important:

    1) Keep track of a list of keys for the entire lifetime of the map: O(1) random entry lookup time, but doubles the memory used to hold keys for the duration of the map's life.
    2) Generate a list of keys when a random entry is needed: O(n) random entry lookup time, only doubles memory used to hold keys for the duration of the lookup.
     
    Piet Souris
    Bartender
    Posts: 5465
    212
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator

    Stephan van Hulst wrote:In most solutions the core idea is to use a separate list of keys. I guess the preferred solution depends on whether speed or memory usage is more important (...)


    For me, elegance of the solution is much more important. If creating many extra objects is a problem, only then is it time to look at something less elegant or something more efficient. (admitted: elegance is pretty subjective, I know).

    Mind you, as Carey already pointed out, if you pick out keys at random, let the chance of picking a key be dependant on the length of the value in some way.
     
    Jeanne Boyarsky
    author & internet detective
    Posts: 41860
    908
    Eclipse IDE VI Editor Java
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator
    Carey,
    That is true. I assumed the sample data was represented and each key had the same number of values.
     
    Carey Brown
    Saloon Keeper
    Posts: 10687
    85
    Eclipse IDE Firefox Browser MySQL Database VI Editor Java Windows ChatGPT
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator

    Carey Brown wrote:Using the randomKeyIndex you have a roughly 1 in 2 chance of selecting either "Gp1" or "Grp2". So let's say it's Grp1, then with randomValueIndex each String in the list has a 1 in 10 chance of being selected, which gives us a 1 in 20 chance that selection "a" would be chosen. Conversely, If Grp2 is selected there's a 1 in 1 chance that "K" will be chosen, or 1 in 2 overall chance. This means that the random distribution over all items in all the lists would not be equal.


    This was my rational for using my "explode()" approach; it solves this issue. However, I really like Piet's Java-8 approach and will modify it here to work lilke explode().

     
    Tim Cooke
    Sheriff
    Posts: 5555
    326
    IntelliJ IDE Python Java Linux
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator
    Hello everyone, sorry for the delay in reply here I posted this Friday just as I was packing up work for the week and it's been really good weather over the weekend so have spent most of it outside in the sun away from the computer. I'm back at work now so am ready to put some of these suggestions into action.

    Henry wrote:Can we also give out cows too? Some of the solutions are cool

    Absolutely! Distribute your cows as you wish. The solutions are cool.

    Jeanne wrote:I assumed the sample data was represented and each key had the same number of values.

    Ah sorry Jeanne, my sample data does suggest that but it's not representative in that way. Each key can have one to many values.

    Lots of effort has gone into helping me over the last few days and I appreciate it very much. Cow time!!!
     
    Tim Cooke
    Sheriff
    Posts: 5555
    326
    IntelliJ IDE Python Java Linux
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator
    I've had some more time to mull this one over and have come up with a solution that works well enough for me.

    The solutions presented here are really cool, especially the Java 8 Stream ones inspired by Piet. I have settled on a slightly different approach though. The dataset shown is a set of 'valid' data fields that can make up part of a customer Order object and for a test I'm writing I want to generate lots and lots of customer Orders that contain valid data. It doesn't matter what data it is, just that it's valid and not all the same. For this I want to select a random Key and Value from the valid data.

    Given that the "getRandom" method is the heavily used on I wanted to keep that really simple so have gone with a two stage solution.

    1) Flatten Map<String, Set<String>> to List<List<String, String>>, where each List item is a List containing exactly 2 items (Essentially like Piet's Pair without the dedicated Pair class, coz I'm lazy).
    This turns data structured like this:Into data structured like this:
    2) From the flat List structure I can select a Random index very cheaply many times over.

    That's one million Orders generated with random, but valid, data.
     
    Piet Souris
    Bartender
    Posts: 5465
    212
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator
    And a cow for you for this also lovely solution.

    But I must confess to feeling a bit embarrassed at the moment. I like all these flatmap java 8 streams, find them very elegant, but yesterday something sprang to mind.
    I was following the Coursera course of fuctional programming with Scala, by Martin Odersky, slightly over three years ago (fantastic course, by the way! Wholeheartedly recommended),
    and in lesson six the problem was to get Pair(i, j)'s with 1 <= j <= i < n, for which i + j was prime. This is part of the slide:

    ""By reassembling the pieces we get:
    (1 until n) flatMap (i =>
                             (1 until i) map (j => (i, j)))
                             filter ( pair => isPrime(pair._1 + pair._2)
    )

    This works, but makes most people's head hurt. Is there a simpler way?


    And next we were introduced to the powerfull scala for comprehensions. Now, I don't know about my fellow students, but I cheered loudly. These for's were so much easier than these headache causing incomprehensible flatmap filter business...
    So yesterday I dusted off my eclipse with scala, started it and came up with this:

    And now I don't know anymore what is elegant or not... My WorldView has collapsed.
     
    Author
    Posts: 282
    11
    Scala IntelliJ IDE Netbeans IDE Python Java Ubuntu Linux
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator
    This clearly went to sleep a while ago, and has many very imaginative answers. I also don't know what you deem "elegant", but I didn't notice the "obvious" (I could have missed it, given so many) among them (which I hope has the merit of being readily understood by junior programmers, something I think we tend to undervalue in the corporate world!). Herewith, and hoping I didn't misunderstand the requirements:

    reply
      Bookmark Topic Watch Topic
    • New Topic