Picking a random Key and Value from a Map?

Here's a seemingly trivial coding task that I just can't seem to find an elegant solution for. Perhaps it's a Friday afternoon 'dumbness' thing but I'm just not seeing it.

I have a Map that represents a set of names for a group:

With content that looks something like this:

What I'd like is an elegant way to extract a random Key Value pair from that Map. For example:

There is no requirement for any strong randomness or any even distribution of the chosen values. A rough approximation is all that is required as I'm just generating some test data for a performance benchmark I'm writing for a new feature I've written at work.

All help greatly appreciated. Cows waiting

I have a Map that represents a set of names for a group:

With content that looks something like this:

What I'd like is an elegant way to extract a random Key Value pair from that Map. For example:

There is no requirement for any strong randomness or any even distribution of the chosen values. A rough approximation is all that is required as I'm just generating some test data for a performance benchmark I'm writing for a new feature I've written at work.

All help greatly appreciated. Cows waiting

First some questions:

Do you need an entry, or is just the value also okay? If an entry, do changes to it need to be reflected in the map? Are entries ever removed from the map? Can we depend on a specific map implementation?

Not an elegant solution, but I would simply iterate over the map to a random value between 0 and the count of possibilities and return the key/value pair at this random position.

I like this version:

Or....

Lots to think about so far. Thanks very much.

To answer your questions Stephan:

To answer your questions Stephan:

Just the String value of some Map key and the String value of one of the Map value Set.Stephan van Hulst wrote:

Do you need an entry, or is just the value also okay?

No. The map mustn't change.Stephan van Hulst wrote:

If an entry, do changes to it need to be reflected in the map?

NoStephan van Hulst wrote:

Are entries ever removed from the map?

Yes, any you choose.Stephan van Hulst wrote:

Can we depend on a specific map implementation?

My preferred way would be to implement a custom `Map` that delegates all calls to a backing `Map`, but also maintains a `List` of keys in parallel. Extend it with a method `getRandomEntry()` that picks a random key from the list, uses it to get the associated value, and returns a new `AbstractMap.SimpleEntry`.

This may seem a little overkill, but you can also use it to add other methods you might be missing. It has the same properties as a `LinkedHashMap` because it's based on one, including the time and space complexities of all methods.

Stephan van Hulst wrote:It has the same properties as a

LinkedHashMapbecause it's based on one, including the time and space complexities of all methods.

Except the

I'd create two helper objects. First, I'd convert the Map<String, Set<String>> to a Map<String, List<String>>. An easy conversion and makes the problem easier. I'd also create a List of the keys.

Then I'd just use Random to get two indexes and be good to go.

Then I'd just use Random to get two indexes and be good to go.

Jeanne, although it probably doesn't make a difference in this particular exercise, I'm uncomfortable with the statistical distribution of your random choices by calling nextInt() twice. For example, I've modified the loading of groups in your code.

Using the randomKeyIndex you have a roughly 1 in 2 chance of selecting either "Gp1" or "Grp2". So let's say it's Grp1, then with randomValueIndex each String in the list has a 1 in 10 chance of being selected, which gives us a 1 in 20 chance that selection "a" would be chosen. Conversely, If Grp2 is selected there's a 1 in 1 chance that "K" will be chosen, or 1 in 2 overall chance. This means that the random distribution over all items in all the lists would not be equal.

Using the randomKeyIndex you have a roughly 1 in 2 chance of selecting either "Gp1" or "Grp2". So let's say it's Grp1, then with randomValueIndex each String in the list has a 1 in 10 chance of being selected, which gives us a 1 in 20 chance that selection "a" would be chosen. Conversely, If Grp2 is selected there's a 1 in 1 chance that "K" will be chosen, or 1 in 2 overall chance. This means that the random distribution over all items in all the lists would not be equal.

In most solutions the core idea is to use a separate list of keys. I guess the preferred solution depends on whether speed or memory usage is more important:

1) Keep track of a list of keys for the entire lifetime of the map:`O(1)` random entry lookup time, but doubles the memory used to hold keys for the duration of the map's life.

2) Generate a list of keys when a random entry is needed:`O(n)` random entry lookup time, only doubles memory used to hold keys for the duration of the lookup.

1) Keep track of a list of keys for the entire lifetime of the map:

2) Generate a list of keys when a random entry is needed:

Stephan van Hulst wrote:In most solutions the core idea is to use a separate list of keys. I guess the preferred solution depends on whether speed or memory usage is more important (...)

For me, elegance of the solution is much more important. If creating many extra objects is a problem, only then is it time to look at something less elegant or something more efficient. (admitted: elegance is pretty subjective, I know).

Mind you, as Carey already pointed out, if you pick out keys at random, let the chance of picking a key be dependant on the length of the value in some way.

Carey,

That is true. I assumed the sample data was represented and each key had the same number of values.

That is true. I assumed the sample data was represented and each key had the same number of values.

Carey Brown wrote:Using the randomKeyIndex you have a roughly 1 in 2 chance of selecting either "Gp1" or "Grp2". So let's say it's Grp1, then with randomValueIndex each String in the list has a 1 in 10 chance of being selected, which gives us a 1 in 20 chance that selection "a" would be chosen. Conversely, If Grp2 is selected there's a 1 in 1 chance that "K" will be chosen, or 1 in 2 overall chance. This means that the random distribution over all items in all the lists would not be equal.

This was my rational for using my "explode()" approach; it solves this issue. However, I really like Piet's Java-8 approach and will modify it here to work lilke explode().

Hello everyone, sorry for the delay in reply here I posted this Friday just as I was packing up work for the week and it's been really good weather over the weekend so have spent most of it outside in the sun away from the computer. I'm back at work now so am ready to put some of these suggestions into action.

Lots of effort has gone into helping me over the last few days and I appreciate it very much. Cow time!!!

Absolutely! Distribute your cows as you wish. The solutions are cool.Henry wrote:Can we also give out cows too? Some of the solutions are cool

Ah sorry Jeanne, my sample data does suggest that but it's not representative in that way. Each key can have one to many values.Jeanne wrote:I assumed the sample data was represented and each key had the same number of values.

Lots of effort has gone into helping me over the last few days and I appreciate it very much. Cow time!!!

I've had some more time to mull this one over and have come up with a solution that works well enough for me.

The solutions presented here are really cool, especially the Java 8 Stream ones inspired by Piet. I have settled on a slightly different approach though. The dataset shown is a set of 'valid' data fields that can make up part of a customer Order object and for a test I'm writing I want to generate lots and lots of customer Orders that contain valid data. It doesn't matter what data it is, just that it's valid and not all the same. For this I want to select a random Key and Value from the valid data.

Given that the "getRandom" method is the heavily used on I wanted to keep that really simple so have gone with a two stage solution.

1) Flatten Map<String, Set<String>> to List<List<String, String>>, where each List item is a List containing exactly 2 items (Essentially like Piet's Pair without the dedicated Pair class, coz I'm lazy).

This turns data structured like this:Into data structured like this:

2) From the flat List structure I can select a Random index very cheaply many times over.

That's one million Orders generated with random, but valid, data.

The solutions presented here are really cool, especially the Java 8 Stream ones inspired by Piet. I have settled on a slightly different approach though. The dataset shown is a set of 'valid' data fields that can make up part of a customer Order object and for a test I'm writing I want to generate lots and lots of customer Orders that contain valid data. It doesn't matter what data it is, just that it's valid and not all the same. For this I want to select a random Key and Value from the valid data.

Given that the "getRandom" method is the heavily used on I wanted to keep that really simple so have gone with a two stage solution.

1) Flatten Map<String, Set<String>> to List<List<String, String>>, where each List item is a List containing exactly 2 items (Essentially like Piet's Pair without the dedicated Pair class, coz I'm lazy).

This turns data structured like this:Into data structured like this:

2) From the flat List structure I can select a Random index very cheaply many times over.

That's one million Orders generated with random, but valid, data.

And a cow for you for this also lovely solution.

But I must confess to feeling a bit embarrassed at the moment. I like all these flatmap java 8 streams, find them very elegant, but yesterday something sprang to mind.

I was following the Coursera course of fuctional programming with Scala, by Martin Odersky, slightly over three years ago (fantastic course, by the way! Wholeheartedly recommended),

and in lesson six the problem was to get Pair(i, j)'s with 1 <= j <= i < n, for which i + j was prime. This is part of the slide:

And next we were introduced to the powerfull scala for comprehensions. Now, I don't know about my fellow students, but I cheered loudly. These for's were so much easier than these headache causing incomprehensible flatmap filter business...

So yesterday I dusted off my eclipse with scala, started it and came up with this:

And now I don't know anymore what is elegant or not... My WorldView has collapsed.

But I must confess to feeling a bit embarrassed at the moment. I like all these flatmap java 8 streams, find them very elegant, but yesterday something sprang to mind.

I was following the Coursera course of fuctional programming with Scala, by Martin Odersky, slightly over three years ago (fantastic course, by the way! Wholeheartedly recommended),

and in lesson six the problem was to get Pair(i, j)'s with 1 <= j <= i < n, for which i + j was prime. This is part of the slide:

""By reassembling the pieces we get:

(1 until n) flatMap (i =>

(1 until i) map (j => (i, j)))

filter ( pair => isPrime(pair._1 + pair._2)

)

This works, but makes most people's head hurt. Is there a simpler way?

And next we were introduced to the powerfull scala for comprehensions. Now, I don't know about my fellow students, but I cheered loudly. These for's were so much easier than these headache causing incomprehensible flatmap filter business...

So yesterday I dusted off my eclipse with scala, started it and came up with this:

And now I don't know anymore what is elegant or not... My WorldView has collapsed.

This clearly went to sleep a while ago, and has many very imaginative answers. I also don't know what you deem "elegant", but I didn't notice the "obvious" (I could have missed it, given so many) among them (which I hope has the merit of being readily understood by junior programmers, something I think we tend to undervalue in the corporate world!). Herewith, and hoping I didn't misunderstand the requirements:

Wink, wink, nudge, nudge, say no more ... https://richsoil.com/cards

This thread has been viewed 1673 times.

All times above are in ranch (not your local) time.

The current ranch time is

Mar 20, 2018 17:17:41.

The current ranch time is

Mar 20, 2018 17:17:41.