Stefan, regarding your Accumulator code - nice!
I see you're still building a CDF, i.e. a map from observation to cumulative probability, rather than the inverse function: Map<T, BigDecimal> rather than Map<BigDecimal, T>. I think the inverse is what Piet actually needs, as previously noted, but I'll go with your interpretation here.
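To make the distinction concrete, here's a tiny sketch of the two shapes (the class and method names are invented for illustration, not taken from your code). The inverse form is the one you can sample from directly:

```java
import java.math.BigDecimal;
import java.util.NavigableMap;
import java.util.TreeMap;

class CdfShapes {
    // Forward form: observation -> cumulative probability.
    static NavigableMap<String, BigDecimal> forward() {
        NavigableMap<String, BigDecimal> cdf = new TreeMap<>();
        cdf.put("a", new BigDecimal("0.25"));
        cdf.put("b", new BigDecimal("0.75"));
        cdf.put("c", BigDecimal.ONE);
        return cdf;
    }

    // Inverse form: cumulative probability -> observation. This is the
    // shape you can sample from with ceilingEntry(u), u uniform in (0,1].
    static <T> NavigableMap<BigDecimal, T> invert(NavigableMap<T, BigDecimal> cdf) {
        NavigableMap<BigDecimal, T> inverse = new TreeMap<>();
        cdf.forEach((obs, p) -> inverse.put(p, obs));
        return inverse;
    }
}
```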
Also I've already noted my feelings on BigDecimal for this problem, but here I'll accept it and move on.
I guess there is a possible benefit in being able to pass in a MathContext, at least for some applications.
I found one small optimization to make in the combine() method, always merging the smaller map into the bigger one:
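Something along these lines, assuming a counts field of type Map<T, BigDecimal> - the class shape and names here are my guesses, not your actual fields:

```java
import java.math.BigDecimal;
import java.util.Map;
import java.util.TreeMap;

// Sketch only: the field and method names are assumptions about
// Stefan's Accumulator, not the real code.
class Accumulator<T> {
    private final Map<T, BigDecimal> counts = new TreeMap<>();

    void accumulate(T obs) {
        counts.merge(obs, BigDecimal.ONE, BigDecimal::add);
    }

    BigDecimal count(T obs) {
        return counts.getOrDefault(obs, BigDecimal.ZERO);
    }

    Accumulator<T> combine(Accumulator<T> other) {
        // Always merge the smaller map into the bigger one, so the
        // merge cost is proportional to the smaller side.
        Accumulator<T> big = counts.size() >= other.counts.size() ? this : other;
        Accumulator<T> small = (big == this) ? other : this;
        small.counts.forEach((k, v) -> big.counts.merge(k, v, BigDecimal::add));
        return big;
    }
}
```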
As for the general design... I see you're aggressively reusing the same Map instance throughout. That can work, but I feel it imposes a lot of costs as well: all the counting is done by adding BigDecimal.ONE for every single observation, when a long would be much faster, and every access pays a log(N) TreeMap lookup, even though you only need the sorted nature of the map after the counting has been done. I'm thinking it's better to let Collectors.groupingBy() and counting() do most of that work. If BigDecimal is desired, it is really only needed for the division, and that can be done in a downstream transformation. And if we really want to reuse the map rather than recopying it, we can still do that too, with a little... ummm... questionable code.
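A sketch of what I mean - counting with long, sorting once via the TreeMap factory, and saving BigDecimal for the single division per key. The method shape is my invention, not a drop-in replacement for your Accumulator:

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class CdfBuilder {
    // Count with long via groupingBy/counting; the TreeMap factory gives
    // us sorted keys exactly once, after counting is done. BigDecimal
    // appears only in the one division per distinct key.
    static <T extends Comparable<T>> TreeMap<T, BigDecimal> cdf(
            List<T> observations, MathContext mc) {
        TreeMap<T, Long> counts = observations.stream()
                .collect(Collectors.groupingBy(
                        t -> t, TreeMap::new, Collectors.counting()));
        BigDecimal total = BigDecimal.valueOf(observations.size());
        TreeMap<T, BigDecimal> result = new TreeMap<>();
        BigDecimal cumulative = BigDecimal.ZERO;
        for (Map.Entry<T, Long> e : counts.entrySet()) {
            cumulative = cumulative.add(
                    BigDecimal.valueOf(e.getValue()).divide(total, mc));
            result.put(e.getKey(), cumulative);
        }
        return result;
    }
}
```

Note that this is also where passing in a MathContext pays off: it is used in exactly one place.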
Of course, if you reverse the map as I think Piet intended, then you might as well make at least one new Map along the way, since you need to map on different keys. But here we're assuming that is not what is needed.