Flatmap and Arrays

 
Bartender
So I've got this example from a Streams Master Class working:



I saw the example flashed on the screen but put it away while I worked out that code.

I really appreciate having var as the type of my stream as it evolves, changing type a few times along the way during construction.

It is still hurting my brain, because it wants the flatMap() lambda to look something like line -> line.split().stream(). But split() returns an array, and arrays have no stream() method, so we need to wrap the call to split() in the static helper method Arrays.stream(). I guess that shows how great it is to be able to convert literally every Collection directly to a Stream.
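
For readers following along, here is a minimal sketch of the kind of pipeline being described. The class name FlatMapWords, the method name words and the sample input are made up for illustration; only flatMap(), String.split() and Arrays.stream() come from the discussion.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapWords {
    // Splits each line on whitespace and flattens all the words into one list.
    static List<String> words(List<String> lines) {
        return lines.stream()
                // split() returns a String[]; Arrays.stream() turns it into a Stream<String>
                .flatMap(line -> Arrays.stream(line.split("\\s+")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [the, quick, brown, fox, jumps]
        System.out.println(words(List.of("the quick brown", "fox jumps")));
    }
}
```

Reading the lambda from the inside out (split first, then wrap in a stream) is exactly the "inside out" flow mentioned in this post.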

I think it is a bad idea to use Arrays with flatMap() until one is pretty comfortable with it, certainly not a first or second example of it.

I am going to try to get used to flatMap() with no arrays in the picture and come back to this after that is so familiar as to be boring.

As I look at it, it is starting to be more okay with me, but it is still flowing inside out a bit instead of linearly as I read it...
 
Saloon Keeper
I remember doing a Scala course, about eight years ago. There we were introduced to flatMap, but it was deemed so incomprehensible that immediately we got the much-easier for-comprehensions. But later I practised with java's flatMap, and in the end it boiled down to two simple rules (for me, that is)

Why would it matter if that Stream comes from Arrays.stream?
 
Marshal
You are correct. Lines 2‑4 create a plain simple Stream<String>.
If you look up Stream#flatMap(), you find it takes a Function as its parameter. The consumer side of that Function's method takes a T or one of its supertypes. The method returns a type not actually specified, but it is a subtype of a Stream containing elements which are a subtype of R. In full, the parameter type is Function<? super T, ? extends Stream<? extends R>>.
Remember that for the purposes of generics wildcards, every type is a subtype and a supertype of itself.
What we are doing (both our code snippets do more or less the same thing) is splitting the text into an array, and then creating a Stream to process the components of that array. It is that Stream that is flattened into part of a new Stream.
Line 6's meaning should be obvious (otherwise see Stream#collect()).
 
Marshal
As an alternative to:
    flatMap(s -> Arrays.stream(s.split("\\s+")))
you could use:
    flatMap(Pattern.compile("\\s+")::splitAsStream)

It keeps things streamy, and avoids the need to create short-lived Arrays (which might (?) have some benefit when the number of words in each incoming stream element is large).
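
As a quick check that the two forms are interchangeable for ordinary input, here is a small sketch. The class name SplitForms and the sample lines are hypothetical; the two flatMap() arguments are the ones quoted above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class SplitForms {
    // Form 1: split into a short-lived array, then stream the array.
    static List<String> viaArray(List<String> lines) {
        return lines.stream()
                .flatMap(s -> Arrays.stream(s.split("\\s+")))
                .collect(Collectors.toList());
    }

    // Form 2: let the compiled Pattern produce the stream directly.
    static List<String> viaPattern(List<String> lines) {
        return lines.stream()
                .flatMap(Pattern.compile("\\s+")::splitAsStream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = List.of("to be or", "not to be");
        System.out.println(viaArray(lines));   // prints [to, be, or, not, to, be]
        System.out.println(viaPattern(lines)); // prints [to, be, or, not, to, be]
    }
}
```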
 
Campbell Ritchie
Marshal
I never knew you could use () anywhere in a method reference.
 
Marshal
Any expression can be put before the ::, as long as its value is something you can call the method on.
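
A small illustration of that rule, with made-up names (BoundReceiverDemo, knownColour, upperLength): in both methods the part before the :: is an ordinary expression whose value becomes the receiver of the referenced method.

```java
import java.util.Set;
import java.util.function.Predicate;
import java.util.function.Supplier;

class BoundReceiverDemo {
    // The receiver is the Set returned by the call to Set.of(...).
    static Predicate<String> knownColour() {
        return Set.of("red", "green", "blue")::contains;
    }

    // The receiver is the String returned by toUpperCase().
    static Supplier<Integer> upperLength() {
        return "hello".toUpperCase()::length;
    }

    public static void main(String[] args) {
        System.out.println(knownColour().test("red"));  // prints true
        System.out.println(knownColour().test("pink")); // prints false
        System.out.println(upperLength().get());        // prints 5
    }
}
```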
 
Campbell Ritchie
Marshal
So it is the returned value from a method call that Ron put to the left of the ::
 
Rob Spoor
Marshal
Correct: the compiled Pattern.
 
Jesse Silverman
Bartender
This may stray back into the "Premature-Optimization vs. Plain-Old Efficient Clear Coding" territory, but does highlight some things I (and possibly others) could possibly benefit from being clearer on.

The first is Regex-specific:

At least for some complex Regexes, (the spell-checker disapproves of Regexen as I edit this), the compilation of the Regex itself could potentially be non-trivial.
In those cases, whichever they are, my desire to pull computations outside of loops that can happily move outside causes me to want to use the longer-form:



versus the clearly shorter, and possibly clearer:



Does this ever make a difference when lotsAndLotsOfStrings is appropriately named? Should it be remembered as a possibly useful optimization step, at least when "\\s+" is replaced by something that might be more expensive to compile? Or is it the kind of thing that is never going to make a measurable difference, so that even thinking about it is already eyelash-counting mode? That is the kind of thing I'd usually ask here.  I don't find the first code terribly harder to write or to read than the second, and I remember writing some monster Regexes that I could imagine taking non-trivial time to compile, tho not usually for splits.  I do see a lot of tutorial code where, if the compilation is both non-trivial and not pulled out of the loop by the compiler, it would at least be worth measuring sometimes. An explicit compile step doesn't really seem that painful to me anyway, or I'd be using Python instead of Java.

Regexes aren't in scope for the 819, so after getting the basics down for interviews and emergency use, they were on my "come back to later" list, including whether there is ever a meaningful cost of avoiding an explicit compile step on Regexes that get applied many times.

So the second part goes towards understanding Streams better as well as touching on the (possibly-Hyper) optimization discussed above:

flatMap(Pattern.compile("\\s+")::splitAsStream)

What I see there, if I am not mistaken, is an "instance method reference bound to a particular object instance".
This means that the Pattern.compile("\\s+") gets executed exactly once, creating an anonymous Pattern object that doesn't get garbage collected because it is bound to the method reference, and then that anonymous method reference gets passed into flatMap() for it to do all its flatMapping as the stream is processed.

I am trying to be cognizant of Leveraging Lazy Evaluation, for example, why it is always better to use
.orElseGet( ExpensiveSupplier )
rather than
.orElse( ExpensiveMethodCall() ) when processing our Optionals.
The answer there is that if ExpensiveMethodCall() is properly named and the Optional in question is hardly ever empty, then the latter will still call it every single time we access the Optional, which adds up. With the former, the supplier passed to orElseGet() almost never results in actual code execution, because the Optional is very rarely empty.
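
The difference can be made visible by counting calls. This is only a sketch: OrElseDemo, expensiveDefault() and the counter are hypothetical stand-ins for the ExpensiveMethodCall()/ExpensiveSupplier pair mentioned above.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

class OrElseDemo {
    static final AtomicInteger CALLS = new AtomicInteger();

    // Hypothetical stand-in for an expensive computation; counts its invocations.
    static String expensiveDefault() {
        CALLS.incrementAndGet();
        return "default";
    }

    // orElse(): the argument is an ordinary expression, evaluated before orElse() runs.
    static int callsWithOrElse(Optional<String> value) {
        CALLS.set(0);
        value.orElse(expensiveDefault());
        return CALLS.get();
    }

    // orElseGet(): the supplier is only invoked if the Optional is empty.
    static int callsWithOrElseGet(Optional<String> value) {
        CALLS.set(0);
        value.orElseGet(OrElseDemo::expensiveDefault);
        return CALLS.get();
    }

    public static void main(String[] args) {
        System.out.println(callsWithOrElse(Optional.of("present")));    // prints 1
        System.out.println(callsWithOrElseGet(Optional.of("present"))); // prints 0
        System.out.println(callsWithOrElseGet(Optional.empty()));       // prints 1
    }
}
```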

Now we get to High Concept.  I think most beginners and most people not used to Java Streams think of the stages in the Stream as executing by actually calling each step seen in the source, one by one.
I think what is actually happening is that Java builds up a pipeline, connected to source and terminal operation, and then the terminal operation sucks thru that straw, causing whatever the source representation of our Stream turns into to do their magic.

So am I correct in seeing that Ron's re-write has the (possibly almost nothing) benefit of calling Pattern.compile() only once per execution of the entire Stream, creating an instance method bound to a particular instance of Pattern, which survives until the Stream terminates?

It may not make a difference in performance, but I am trying to have a more realistic idea of "What exactly is going on when we create and then use a Stream?"
 
Campbell Ritchie
Marshal

Jesse Silverman wrote:. . . Regexen . . .

I keep thinking regex is a Latin word, so its plural should be regices

. . .. . .

Leave the optimisation to the compiler and runtime. I think it likely that all the different kinds of code you showed use Pattern#compile(...) somewhere behind the scenes. I expect that Arrays.stream(myText.split("\\s+")) and Pattern.compile("\\s+").splitAsStream(myText) take about the same time to run.
It sounds like one of those things you don't really need to know. Surely you only really need to know whether your car uses petrol (=gas) or diesel when you go to the pumps.
 
Jesse Silverman
Bartender

Campbell Ritchie wrote:...
It sounds like one of those things you don't really need to know. Surely you only really need to know whether your car uses petrol (=gas) or diesel when you go to the pumps.



Thanks.  Not to take the analogy too far, but when looking at Hybrid Cars, there were some which would get lower mileage when driven by normal people (e.g. the Prius) but would get insane mileage when more knowledgeable (or less mentally balanced) drivers drove them in ways that paid attention to the implementation of the Prius, example:
https://www.torquenews.com/8113/what-it-means-hypermile-toyota-prius

Other vehicles were deemed to likely result in higher average mileage when driven by "normal drivers".

One of my best friends always takes analogies too far, and spends too much time talking about the analogies rather than what we were using them to illustrate.
I think the topic of Hypermiling might be an appropriate analogy here, but will leave further discussion of it out of this thread.

I will forget about this for now in reference to Regices, because they are out of scope for what I am studying.

Whether or not I am correct that in Ron's formulation, Pattern.compile() will only be called once per execution of the Stream may still be relevant to understanding Stream behavior, however.
 
Master Rancher
I disagree with "leave optimisation to the compiler and runtime" -  at least in this case.  I agree it's good to develop a sense of when things execute, and how many times they will execute.  Knowing which things execute many times inside a loop and which execute once beforehand is useful, and not immediately obvious as you first learn streams.  And there are times when using streams can hide inefficiencies in code.  It's not unreasonable to improve one's understanding of how the code really works.

Jesse, you are correct to think that the Pattern.compile will only execute once, before any of the looping.  And this isn't really a stream-specific thing; it depends more on how Java works in general, and how normal Java code differs from lambda expressions.  In general when you see a method call with arguments, the order of execution is that the arguments are evaluated first, from left to right, and then the method is executed.  That remains true.  The catch is that if code is in a lambda expression or is referenced by a method reference, "evaluating" that code doesn't execute it, but simply generates a reference to something that can execute it later.  So in this case:

    flatMap(Pattern.compile("\\s+")::splitAsStream)

the flatMap() method can't do anything until its argument is evaluated.  Within that argument, the Pattern.compile() part is code that executes right away, to return a Pattern.  Then that is combined with the ::splitAsStream method reference to be passed into the flatMap() method, which can then execute.  The point is, flatMap() can't possibly do anything until after Pattern.compile() has executed.  So, the code is already pretty well optimized the way we would want it to be.
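
To see the "executes once" behaviour directly, one can wrap Pattern.compile() in a counting helper. The sketch below assumes a hypothetical helper countingCompile() and class CompileCountDemo; it contrasts the bound method reference with a lambda that compiles inside its body.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class CompileCountDemo {
    static final AtomicInteger COMPILES = new AtomicInteger();

    // Hypothetical helper: compiles a pattern and counts how often it is called.
    static Pattern countingCompile(String regex) {
        COMPILES.incrementAndGet();
        return Pattern.compile(regex);
    }

    // The receiver expression is evaluated once, when flatMap()'s argument is evaluated.
    static int compilesWithMethodRef(List<String> lines) {
        COMPILES.set(0);
        lines.stream()
                .flatMap(countingCompile("\\s+")::splitAsStream)
                .collect(Collectors.toList());
        return COMPILES.get();
    }

    // The lambda body runs once per element, so the pattern is compiled once per element.
    static int compilesWithLambda(List<String> lines) {
        COMPILES.set(0);
        lines.stream()
                .flatMap(line -> countingCompile("\\s+").splitAsStream(line))
                .collect(Collectors.toList());
        return COMPILES.get();
    }

    public static void main(String[] args) {
        List<String> lines = List.of("a b", "c d e", "f");
        System.out.println(compilesWithMethodRef(lines)); // prints 1
        System.out.println(compilesWithLambda(lines));    // prints 3
    }
}
```

Note that the method-reference version compiles the pattern even for an empty source, because the compile happens while the pipeline is being built rather than while elements flow through it.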
 
Jesse Silverman
Bartender
Mike:

I regret I have only one Thumb in here to Up that.

My concern (and I understand why they do it) is that when explaining Stream stuff to newbies, many/most/almost all(?) tutorials make it look like the Stream represents a Loop, and it sort of does, and sort of doesn't.

So in Ron's example, the stage that internally gets built from:

.flatMap(Pattern.compile("\\s+")::splitAsStream)

runs exactly once for each element in the original stream, with the "parameter" actually being the next element in the Stream at that moment, but that is very different than having the .flatMap() call itself with the parameter seen in the source getting evaluated more than once per use of the Stream.

I think a lot of beginners don't realize that, I know I do now, and I'm not sure at what moment I first "got it".

If I review any of many "Intro to Streams" presentations, it seems easy to fall into that misunderstanding.

One of the biggest concepts to get, as you point out, is "Lazy Evaluation".
Instead of writing imperative code where we blindly call some functionality and pass its result to a method (or something else), we pass the functionality in and say "Well, if you should need to call it, there it is."
I get that this is by no means limited to Streams.

I used function pointers for years in C and C++, so despite many people saying "This is nothing like that!" they are both ways of saying "Here's some code if you decide you need to call it.  Have a nice day!"

Thanks again.
 
Saloon Keeper

Campbell Ritchie wrote:You are correct. Lines 2‑4 create a plain simple Stream<String>.
If you look up Stream#flatMap(), you find it takes a Function as its parameter. The consumer side of that Function's method takes a T or one of its supertypes. The method returns a type not actually specified, but it is a subtype of a Stream containing elements which are a subtype of R.
Remember that for the purposes of generics wildcards, every type is a subtype and a supertype of itself.
What we are doing (both our code snippets do more or less the same thing) is splitting the text into an array, and then creating a Stream to process the components of that array. It is that Stream that is flattened into part of a new Stream.
Line 6's meaning should be obvious (otherwise see Stream#collect()).

Campbell,
how is this different than using:

Thanks to all the contributors to this thread, I've thoroughly enjoyed it and have learned from it.
 
Piet Souris
Saloon Keeper
The only difference is that in Campbell's version the result is an ArrayList, while in your version you can't be sure. If you do want a certain result type, you could use
Collectors.toCollection(LinkedList::new)
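
A sketch of the three collecting styles side by side. It assumes the version being asked about used Collectors.toList(); the class and method names are made up. The Collectors.toList() documentation makes no guarantee about the type of List returned, which is the "can't be sure" part.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CollectForms {
    // Three-argument collect(): we supply the container, so it is an ArrayList.
    static List<String> threeArg() {
        List<String> result = Stream.of("a", "b")
                .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
        return result;
    }

    // Collectors.toList(): some List; the concrete type is unspecified.
    static List<String> viaToList() {
        return Stream.of("a", "b").collect(Collectors.toList());
    }

    // Collectors.toCollection(): we choose the concrete collection type.
    static List<String> viaToCollection() {
        LinkedList<String> result = Stream.of("a", "b")
                .collect(Collectors.toCollection(LinkedList::new));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(threeArg() instanceof ArrayList);         // prints true
        System.out.println(viaToCollection() instanceof LinkedList); // prints true
        System.out.println(viaToList());                             // prints [a, b]
    }
}
```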
 
Mike Simmons
Master Rancher

Jesse Silverman wrote:My concern (and I understand why they do it) is that when explaining Stream stuff to newbies, many/most/almost all(?) tutorials make it look like the Stream represents a Loop, and it sort of does, and sort of doesn't.


Yeah - to a first approximation, it's a loop.  But more accurately, it's like a little worker robot that you can give a series of commands in advance (intermediate operations) that describe what it will do in a loop, eventually.  Then you finally get around to saying "now, do it" (calling a terminal operation), and it runs the loop per the rules you have set up.  It's like you're stacking up a series of commands that will be replayed on each element in the loop... and then you run it.
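
The "commands in advance, then run" idea can be checked with a counter. In this sketch (LazyPipelineDemo and its method are hypothetical names), the map() step is registered first, but it does not touch any element until the terminal collect() is called.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyPipelineDemo {
    // Returns how many elements map() had processed before and after the terminal operation.
    static int[] countsBeforeAndAfter() {
        AtomicInteger mapped = new AtomicInteger();

        // Building the pipeline: nothing is processed yet.
        Stream<String> pipeline = Stream.of("a", "b", "c")
                .map(s -> {
                    mapped.incrementAndGet();
                    return s.toUpperCase();
                });
        int before = mapped.get();

        // The terminal operation pulls every element through the pipeline.
        List<String> result = pipeline.collect(Collectors.toList());
        int after = mapped.get();

        return new int[] {before, after, result.size()};
    }

    public static void main(String[] args) {
        int[] counts = countsBeforeAndAfter();
        System.out.println(counts[0]); // prints 0
        System.out.println(counts[1]); // prints 3
    }
}
```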

Jesse Silverman wrote:I used function pointers for years in C and C++, so despite many people saying "This is nothing like that!" they are both ways of saying "Here's some code if you decide you need to call it.  Have a nice day!"


Yup, it's very much like that, actually.  But with more compile-time checking of type safety.
 
Campbell Ritchie
Marshal

Carey Brown wrote:. . . Campbell,
how is this different than using:
. . .

Piet has already explained it. It is also in the API for the java.util.stream package

That package link about ⅔ way down wrote:pulling the mapping operation out of the accumulator function, we could express it more succinctly as:


    List<String> strings = stream.map(Object::toString)
                                 .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);

Note that Stream#toList() does something slightly different again. Also that Object::toString is another example of consumer‑super, or ...<? super T>...
 
Rob Spoor
Marshal
Regarding the regex performance discussion: for a single string, there is no real difference between calling Pattern.compile(regex).split(s) or s.split(regex), as the latter delegates to the former. (There is some optimized path that's followed under certain conditions*, but the default is to delegate.)
However, if you do this for multiple strings, it makes sense to reuse the Pattern and go for the first form. After all, why create the same Pattern instance over and over? Since Pattern is immutable, I even turn them into private static final fields if the regex is always going to be the same.
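
A minimal sketch of the private static final approach, with hypothetical names (WordSplitter, WHITESPACE, words). The pattern is compiled once when the class is initialised and then reused for every string.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class WordSplitter {
    // Compiled once; Pattern instances are immutable and safe to share between threads.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static List<String> words(List<String> lines) {
        return lines.stream()
                .flatMap(WHITESPACE::splitAsStream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [alpha, beta, gamma]
        System.out.println(words(List.of("alpha beta", "gamma")));
    }
}
```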


* This comment from the String source code explains these conditions:
 
Jesse Silverman
Bartender

Rob Spoor wrote:...
However, if you do this for multiple strings, it makes sense to reuse the Pattern and go for the first form. After all, why create the same Pattern instance over and over? Since Pattern is immutable, I even turn them into private static final fields if the regex is always going to be the same.



Thanks Rob!

I realized something.
For many people, learning just the easiest, simplest convenience way of doing something is always the quickest path to basic competence.

For those who can't stop from wondering if the slightly more complex/slightly longer way isn't actually buying something, it at least makes sense to learn "no difference"/"usually small difference but sometimes not so small" because they are going to waste time wondering whether the other way is better otherwise.  I don't really consider this "premature optimization" because if you reach for the more efficient one out of habit it barely takes any longer to write or to read, just less time to run.

I'm stuck in the latter camp, as this example shows.

I could imagine that the Java team could have decided that there are few enough Regexes in most programs, and they are so frequently re-used that they could implement "Pattern Pooling" like with String literals or Integer.valueOf(5), and it would save each compiled result of Pattern.compile(something) in a pool.  Had they done that, there would be "no difference".  I don't think they did tho.  I will probably remember all this when I come back to "learning more about Java Regexes".
 
Jesse Silverman
Bartender
Just wanted to say, since my current focus is on Streams, that the abstractions of:

Stream<T> myStream = Stream.of(  ... );

and

Stream<T> myStream = Stream.empty( );

work just GREAT for me, because there's no angst about whether the stream is "read-only" or "fixed-size" or "mutable" like I get when I see a List<T> reference.

The abstraction of Stream in Java feels less leaky than that of List, ironically since they do sound wet.
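
The List<T> angst can be shown concretely. In this sketch (ListFlavours, canAdd and canSet are made-up names), three values that all have the static type List<String> react differently to the same calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListFlavours {
    // Returns true if add() succeeds, false if the list rejects it.
    static boolean canAdd(List<String> list) {
        try {
            list.add("x");
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    // Returns true if set() succeeds, false if the list rejects it.
    static boolean canSet(List<String> list) {
        try {
            list.set(0, "x");
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> mutable = new ArrayList<>(List.of("a", "b"));
        List<String> fixedSize = Arrays.asList("a", "b");
        List<String> readOnly = List.of("a", "b");

        System.out.println(canAdd(mutable) + " " + canSet(mutable));     // prints true true
        System.out.println(canAdd(fixedSize) + " " + canSet(fixedSize)); // prints false true
        System.out.println(canAdd(readOnly) + " " + canSet(readOnly));   // prints false false
    }
}
```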
 
Mike Simmons
Master Rancher
If it helps, I think in many cases we don't really need List - often Iterable is all we need.  That's a good replacement for about 70% of the Lists I see in code.  About half the remainder could be replaced with "Collection".  I think we just like the fact that the word "List" is so short, we tend to use it as a type even when it isn't the most appropriate.
 
Jesse Silverman
Bartender

Mike Simmons wrote:If it helps, I think in many cases we don't really need List - often Iterable is all we need.  That's a good replacement for about 70% of the Lists I see in code.  About half the remainder could be replaced with "Collection".  I think we just like the fact that the word "List" is so short, we tend to use it as a type even when it isn't the most appropriate.



I'm not well versed enough in Streams to have my own opinion yet, but I recall seeing Great Java Experts clash at conventions over how often returning a Stream is the right thing to do in an API.

So, the potential choices are several: returning Stream<T>, returning Collection<T>, returning a very specific collection (e.g. ArrayList<T>), returning a collections interface (e.g. Map<K, V>, List<T>), returning an Iterable<T>...

This thread which has gotten so wildly away from its start gave me a lot to think about in terms of the pros and cons of those choices, earlier in my Java journey than I intended to think about them much, mostly  because I now dislike List<T> as "Yeah, sure, it is a List<T>, whatever THAT means in this case!"
 
Campbell Ritchie
Marshal
MS's point sounds like another example of PECS. The producer is the method returning a List, which declares a more specific type, and the receiving code (=consumer) decides which supertype of List it is actually going to use.
 
Rob Spoor
Marshal

Campbell Ritchie wrote:MS's point sounds like another example of PECS.


For those that don't know this acronym: Producer Extends, Consumer Super. The term is often used with functional interfaces. For instance, you'll often see Function<? super T, ? extends R>, where ? super T is the consumer part (the function consumes input) and ? extends R is the producer part (the function produces a result).
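
A small sketch of what those wildcards buy you. The helper mapAll() and the class PecsDemo are hypothetical; the point is that a Function<Object, Integer> is accepted where a function from String to Number is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class PecsDemo {
    // Hypothetical helper: the function consumes T (so "? super T")
    // and produces R (so "? extends R").
    static <T, R> List<R> mapAll(List<? extends T> input,
                                 Function<? super T, ? extends R> f) {
        List<R> out = new ArrayList<>();
        for (T item : input) {
            out.add(f.apply(item));
        }
        return out;
    }

    static List<Number> lengths() {
        // Accepts any Object and returns an Integer: more general input, more specific output.
        Function<Object, Integer> textLength = o -> o.toString().length();
        return PecsDemo.<String, Number>mapAll(List.of("a", "bcd"), textLength);
    }

    public static void main(String[] args) {
        System.out.println(lengths()); // prints [1, 3]
    }
}
```

Without the wildcards, mapAll() would demand exactly a Function<String, Number>, and the more general textLength function would be rejected.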
 
Campbell Ritchie
Marshal

Rob Spoor wrote:

Campbell Ritchie wrote:. . . PECS.

For those that don't know this acronym: . . .

Reference: Joshua Bloch, Effective Java Pearson:Addison‑Wesley 2/e (2008) page 136 or 3/e (2017) page 141. Post where I tried to explain PECS and that wasn't what the question was about in the first place.
 
Jesse Silverman
Bartender

Campbell Ritchie wrote:MS's point sounds like another example of PECS. The producer is the method returning a List, which declares a more specific type, and the receiving code (=consumer) decides which supertype of List it is actually going to use.



Fair enough, and I think I got that, but as:
? super R
always includes every choice between Object and R inclusive, there is still quite a lot of room for design decisions.
In this particular thread, I became disenchanted with the choice of  List<T> as so ambiguous as to be...disenchanting, realizing for the first time that it is dang near ArrayList<T> and that Java's lack of constant references at the language level made distinctions between mutable and immutable collection objects more important to code readability/manageability.
 
Campbell Ritchie
Marshal

Jesse Silverman wrote:. . . ? super R
always includes every choice between Object and R inclusive . . .

That may include Serializable and Cloneable or other interfaces, too.
 
Jesse Silverman
Bartender

Campbell Ritchie wrote:

Jesse Silverman wrote:. . . ? super R
always includes every choice between Object and R inclusive . . .

That may include Serializable and Cloneable or other interfaces, too.



Or as Mike pointed out, Iterable is often a good choice.

+100 for people who like TMTOWTDI
https://en.wikipedia.org/wiki/There%27s_more_than_one_way_to_do_it

Remember that I am a Perl 5 Refugee, so "Decisions, decisions, DECISIONS!"
I note with great bemusement that they even have two ways to spell the acronym representing how great and wonderful a multiplicity of choices is:
TMTOWTDI or TIMTOWTDI

By contrast, I was emboldened to think that Python would be easy, because part of the Zen of Python is, "There should be one-- and preferably only one --obvious way to do it."

Unfortunately, that has Gone Out the Window in Modern Python 3, however, in discussions with the Python maintainers, one said "It should still be true that there should be one-- and preferably only one -- obvious BEST way to do it."  Meaning that there will still often be one BEST way, for those who strive to write great code, tho Python will still eat dang near anything and run it....
 
Saloon Keeper
As much as I *want* to use Iterable in my parameter lists, it's often too limiting. I rarely write for-loops anymore, and for-loops are the only thing that Iterable is good for.

Either I want to perform a defensive copy of a sequence of elements in which case it's more efficient to accept a Collection, or I want to perform a sequence of operations on each element once, in which case I will accept a Stream. There is no point in accepting a Collection if the very first thing you're going to do is call stream() on it.

Maybe Iterable would have had some merit if arrays implemented it.

Here too, I think C# took the better approach: IEnumerable is the root interface of all types that represent a sequence of elements that can be consumed (at least) once. It's like Iterable and Stream rolled into one, with the extra benefit that the copy constructors of all collection types accept IEnumerable instead of ICollection.
 
Mike Simmons
Master Rancher
It's true that Iterable is a bit less useful now than it was before streams came out, because it's harder to stream an Iterable.  I'm not sure why they put a stream() method on Collection but not on Iterable.  It's possible to convert it to a Stream ourselves of course:
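
A sketch of such a conversion, likely along the lines of what is meant here, using StreamSupport.stream() and Iterable.spliterator() from the standard library. The class name IterableStreams, the method names and the sample Iterable are made up.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class IterableStreams {
    // Wraps any Iterable in a sequential Stream via its spliterator.
    static <T> Stream<T> stream(Iterable<T> iterable) {
        return StreamSupport.stream(iterable.spliterator(), false);
    }

    static List<Integer> doubled() {
        // An Iterable that is deliberately not a Collection.
        Iterable<Integer> numbers = () -> List.of(1, 2, 3).iterator();
        return stream(numbers)
                .map(n -> n * 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(doubled()); // prints [2, 4, 6]
    }
}
```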

But it would be nice to have a standard method right there where we need it.  And having an overrideable instance method makes it easier to automatically pick up the optimized implementations in various subclasses.  Though to be fair, many/most Collections subclasses put their optimizations in the spliterator() method, which we do pick up with this code.

One big thing I like about Iterable is that it doesn't force you to have everything in the Collection in memory at once.  You could be processing lines in a file, or even referencing an infinite sequence of some sort, and don't have to keep it all in memory.  Streams allow this too, of course, and have more convenience methods for generating infinite sequences.  But it's annoying that they don't make it easier to go directly from Iterable to Stream, without Collection.

Stephan van Hulst wrote:There is no point in accepting a Collection if the very first thing you're going to do is call stream() on it.


I disagree a bit there - if you write a method that accepts a Stream, you don't know offhand if it's ordered or multithreaded.  You can test, but that's more work.  If you accept a List and call a few methods yourself like stream() or parallelStream(), you can easily guide it where you need it to go.  I guess if you always write good thread-safe code, that's best - but I think coders who accept a Stream may find themselves surprised when it doesn't always behave as they expect.

Stephan van Hulst wrote:Maybe Iterable would have had some merit if arrays implemented it.


That would definitely be nice.  Along the way, they could also give us proper hashCode(), equals() and toString() methods.

I definitely agree with your points about C# and IEnumerable.
 
Rob Spoor
Marshal

Mike Simmons wrote:I'm not sure why they put a stream() method on Collection but not on Iterable.


There's been a discussion about that in the OpenJDK core-libs-dev mailing list. Here's the start of it all: https://mail.openjdk.java.net/pipermail/core-libs-dev/2021-August/080545.html

Mike Simmons wrote:But it's annoying that they don't make it easier to go directly from Iterable to Stream, without Collection.


But it's already so easy, it's a one-liner. You've already written it. You'd just like it to be even shorter.

Stephan van Hulst wrote:Maybe Iterable would have had some merit if arrays implemented it.


Mike Simmons wrote:That would definitely be nice.  Along the way, they could also give us proper hashCode(), equals() and toString() methods.


Although I'd like it too, I doubt it's ever going to happen. They probably think it'll break too much stuff. They have a point of course, as well-known behaviour (equals uses ==) will change.

Of course it's easy enough to use Arrays.equals, Arrays.hashCode and Arrays.toString, as well as JUnit's assertArrayEquals, but it'd at least be nice to get proper toString support. That would allow arrays to be passed to message formats / log statements without having to explicitly convert them to strings first.
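
As a small illustration of those helpers (the class ArrayHelpers and its describe() method are hypothetical names, and the hash in the comment is only an example of the shape of the output):

```java
import java.util.Arrays;

final class ArrayHelpers {
    // Explicit conversion is needed for a readable message; formatting the
    // int[] directly with %s would print something like [I@1b6d3586.
    static String describe(int[] values) {
        return String.format("values=%s", Arrays.toString(values));
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        int[] b = {1, 2, 3};

        System.out.println(Arrays.equals(a, b));                      // prints true
        System.out.println(Arrays.hashCode(a) == Arrays.hashCode(b)); // prints true
        System.out.println(describe(a));                              // prints values=[1, 2, 3]
    }
}
```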
 
Mike Simmons
Master Rancher

Rob Spoor wrote:

Mike Simmons wrote:But it's annoying that they don't make it easier to go directly from Iterable to Stream, without Collection.


But it's already so easy, it's a one-liner. You've already written it. You'd just like it to be even shorter.


True.  But I want it easier for everyone else, too.  Especially since spliterator() and StreamSupport aren't generally something people need to know much about, unless they're implementing and optimizing their own streamable collection.  People learning streams have enough to worry about without those areas.

I've also gotten very used to Stream's fluent interface style, with one instance method call leading to another, so it sticks out like a sore thumb when I have to bring in a static method like Arrays.stream() instead, disrupting the flow.  Yeah, that's mostly aesthetics, but it also influences ease of use: a fluent API makes it easier for newbies to find the right method options in their IDE via autocomplete.
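
To make the "disrupted flow" concrete, here is a sketch with invented names (FlatMapWords, wordsNested, wordsLinear) showing the same flatMap() over split() written two ways. Both are just ways to write it, not the version from the master class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class FlatMapWords {
    // split() returns a String[], and arrays have no stream() method, so the
    // static Arrays.stream() has to wrap the call: it reads inside out.
    static List<String> wordsNested(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .collect(Collectors.toList());
    }

    // Same result in two steps: map each line to its array first, then
    // flatten with a method reference, which reads more linearly.
    static List<String> wordsLinear(List<String> lines) {
        return lines.stream()
                .map(line -> line.split(" "))
                .flatMap(Arrays::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = List.of("to be", "or not");
        System.out.println(wordsNested(lines)); // prints [to, be, or, not]
        System.out.println(wordsLinear(lines)); // prints [to, be, or, not]
    }
}
```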

Thanks for the discussion link though - I will look into it.
 
Mike Simmons
Master Rancher

Rob Spoor wrote:

Mike Simmons wrote:Along the way, they could also give us proper hashCode(), equals() and toString() methods.


Although I'd like it too, I doubt it's ever going to happen. They probably think it'll break too much stuff. They have a point of course, as well-known behaviour (equals uses ==) will change.


Can you elaborate on the last point?  If equals uses ==, how is that changed by adding good equals() and other methods to arrays?
 
Jesse Silverman
Bartender

Mike Simmons wrote:

But I want it easier for everyone else, too.  Especially since spliterator() and StreamSupport aren't generally something people need to know much about, unless they're implementing and optimizing their own streamable collection.  People learning streams have enough to worry about without those areas.

I've also gotten very used to Stream's fluent interface style, with one instance method call leading to another, so it sticks out like a sore thumb when I have to bring in a static method like Arrays.stream() instead, disrupting the flow.



Agreed and Agreed!

StreamSupport Javadoc wrote:This class is mostly for library writers presenting stream views of data structures; most static stream methods intended for end users are in the various Stream classes.



I am in that weird position where I am still catching up on something (Streams) that was new a while ago, but now is just "Basic Java, Man!"
It can go really deep, clearly way beyond "Basic Java", and here I am (mostly) letting the OCJP 819 guide me (to the extent I can tell which parts are in scope from the Jeanne and Scott book), though some things that are clearly out of scope for that are just too interesting to ignore.  I'm trying.  Spliterator and StreamSupport are out of scope for the first two chapters on functional programming in their book; I'll see whether they come back to them in the chapter on multi-threading.
 
Rob Spoor
Marshal

Mike Simmons wrote:

Rob Spoor wrote:

Mike Simmons wrote:Along the way, they could also give us proper hashCode(), equals() and toString() methods.


Although I'd like it too, I doubt it's ever going to happen. They probably think it'll break too much stuff. They have a point of course, as well-known behaviour (equals uses ==) will change.


Can you elaborate on the last point?  If equals uses ==, how is that changed by adding good equals() and other methods to arrays?


Because some code may depend on array equality meaning identity equality. If equals starts checking for element equality, that code will break.

I don't see any valid reason not to override toString though. I doubt anybody needs toString to return [I@12345678 instead of [1, 2, 3, 4, 5].
 
Mike Simmons
Master Rancher
@Rob - Ah, I see.  I guess my main reaction to that is that anyone doing such a thing is just wrong.  Seriously, if I saw code comparing arrays using .equals(), I would assume it was a mistake and they should have used Arrays.equals().  Alternately, if they really wanted to check identity, they should have used == which would be clearer, faster, and would not have any possibility of throwing an NPE.
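
A quick sketch of the three comparisons side by side; the class ArrayEquality and its two helper methods are hypothetical names chosen for the example.

```java
import java.util.Arrays;

final class ArrayEquality {
    // Identity check: true only when both references point to the same array.
    // Safe with null on either side.
    static boolean sameObject(int[] a, int[] b) {
        return a == b;
    }

    // Element-by-element check; Arrays.equals also accepts null arguments.
    static boolean sameContents(int[] a, int[] b) {
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        int[] b = {1, 2, 3};

        System.out.println(a.equals(b));           // prints false (inherited Object.equals, i.e. identity)
        System.out.println(sameObject(a, b));      // prints false
        System.out.println(sameContents(a, b));    // prints true
        System.out.println(sameContents(a, null)); // prints false, no NPE

        int[] missing = null;
        // missing.equals(a) would throw a NullPointerException here,
        // whereas missing == a simply evaluates to false.
        System.out.println(sameObject(missing, a)); // prints false
    }
}
```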

I do get that Oracle have a higher bar than I do for not changing behavior of existing code.  But that doesn't mean I have to like it.
 
Campbell Ritchie
Marshal

Rob Spoor wrote:. . . I don't see any valid reason not to override toString though. I doubt anybody needs toString to return [I@12345678 instead of [1, 2, 3, 4, 5].

There were quite a lot of dubious design decisions in the early days of Java®. I think that is why Arrays was introduced only two major versions later: to remedy what I see as deficiencies in the design of arrays. Once the possibly dubious decision was made not to override equals(), hashCode(), and toString() on arrays, it became difficult to do so later without, if not breaking old code, at least messing it about.
After all, they did override clone() on arrays.
 
Rob Spoor
Marshal
Overriding clone() was easy to do. In Object, it's protected, which means that unless a class makes it explicitly public, it can't be called. The other methods are already always there. They probably forgot to override all Object methods in the first versions of Java, which meant we got stuck with the default public implementations (equals, hashCode, toString). If that's the case, clone() couldn't be called until they made it public, which they then did in the correct way. It's even one of the few clone() implementations that uses covariant returns.
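
A minimal sketch of what that covariant return means in practice (ArrayClone and copyOf are made-up names): clone() on an int[] is public, returns int[] and declares no checked exception, so neither a cast nor a try/catch is needed.

```java
import java.util.Arrays;

final class ArrayClone {
    // Array clone() is public and returns the array's own type.
    static int[] copyOf(int[] source) {
        return source.clone();
    }

    public static void main(String[] args) {
        int[] original = {1, 2, 3};
        int[] copy = copyOf(original);

        System.out.println(copy != original);              // prints true: a new array
        System.out.println(Arrays.equals(copy, original)); // prints true: same elements

        // The copy is shallow: cloning an int[][] copies the outer array only,
        // so both outer arrays share the same row objects.
        int[][] grid = {{1, 2}, {3, 4}};
        int[][] gridCopy = grid.clone();
        System.out.println(gridCopy[0] == grid[0]);        // prints true
    }
}
```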

The reason why covariant returns weren't used for existing classes probably has to do with binary compatibility. For source compatibility, it's OK to change the return type of a method, as long as the new return type is assignable to the original return type. For binary compatibility, the return type matters, as it's part of the binary method signature. You can see that in JNI, where the method signature is <method_name>(<arg_types>)<return_type>, for example doSomething(Ljava/lang/String;II)V for a method void doSomething(String, int, int).
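
One way to see such a descriptor without writing any JNI is java.lang.invoke.MethodType, which can print the descriptor string for a given return type and parameter list. The class Descriptors and its two methods are hypothetical names for this sketch; note that class names inside a descriptor are written with slashes, not dots.

```java
import java.lang.invoke.MethodType;

final class Descriptors {
    // Descriptor for a method shaped like: void doSomething(String, int, int)
    static String doSomethingDescriptor() {
        return MethodType.methodType(void.class, String.class, int.class, int.class)
                .toMethodDescriptorString();
    }

    // Descriptor for a no-argument method with the given return type.
    static String noArgDescriptor(Class<?> returnType) {
        return MethodType.methodType(returnType).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(doSomethingDescriptor());       // prints (Ljava/lang/String;II)V

        // Narrowing the return type from Object to String changes the descriptor,
        // which is why already-compiled callers would no longer link to it.
        System.out.println(noArgDescriptor(Object.class)); // prints ()Ljava/lang/Object;
        System.out.println(noArgDescriptor(String.class)); // prints ()Ljava/lang/String;
    }
}
```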
 