Why is a stream traversable only once?

 
Biniman Idugboe (Ranch Hand)
So, I see statements such as

a stream can be traversed only once and the stream is said to have been consumed after it has been traversed


I do not understand exactly what that statement means. What does "consumed" actually mean in this context: expired, fallen out of scope, disabled, or simply gone?
What are the reasons for making a stream traversable only once?


 
Marc (Greenhorn)
I guess "expired" is the closest. The stream has been used once; see it as a one-way trip: once you take the flight, your boarding pass expires. Reusing a stream may cause an IllegalStateException.
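
A small sketch of what that looks like in practice (the exact exception message can vary between JDK versions):

    import java.util.List;
    import java.util.stream.Stream;

    public class StreamReuseDemo {
        public static void main(String[] args) {
            Stream<String> names = List.of("Fred", "Wilma", "Barney").stream();

            names.forEach(System.out::println);   // first traversal: fine

            // A second traversal of the same Stream object typically fails with
            // java.lang.IllegalStateException: stream has already been operated upon or closed
            names.forEach(System.out::println);
        }
    }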
 
Stephan van Hulst (Saloon Keeper)
The reason is that you can create streams from things that by definition can only be used once, such as an Iterator or a BufferedReader. You can think of a Stream as being consumed in the same way as a BufferedReader that has read a text file to its end. Once you reach the end of the file, the BufferedReader doesn't stop existing, but it becomes useless because you can't get anything more out of it. If you want to read the file again, you have to create a new reader. The same goes for streams: if you want to process the source of the stream twice, you have to create two separate streams.
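
For example, to run two different pipelines over the same list, you simply ask the list for a fresh stream each time; a minimal sketch:

    import java.util.List;

    public class TwoTraversals {
        public static void main(String[] args) {
            List<Integer> numbers = List.of(1, 2, 3, 4, 5);

            // First pass: one stream
            int sum = numbers.stream().mapToInt(Integer::intValue).sum();

            // Second pass: a brand-new stream from the same source
            long evens = numbers.stream().filter(n -> n % 2 == 0).count();

            System.out.println("sum = " + sum + ", evens = " + evens);
        }
    }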
 
Biniman Idugboe (Ranch Hand)
Thanks Marc and Stephan. So a new stream has to be created in order to repeat the same operation; whatever happened to reusability, then? Isn't reusability supposed to be an important aspect of Java programming?
 
Campbell Ritchie (Marshal)
You are confusing reuse of code with reuse of an object. You can reuse code by having one method which is called from different locations. What you have here is an object which is specifically designed to be used in a certain fashion: you go from the first element to the last element. Stream objects are intentionally designed not to go back to the first element. That means they can be created simply, requiring little memory for the elements they are processing. In fact a simple sequential stream might only need enough memory for one element. Once that element has been finished with, the memory location is filled by the reference to the next element. There is often no need to keep a record of how many elements have been processed. But in order to take advantage of that tiny memory footprint, it is necessary to close each stream after it has been used.
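
A rough illustration of that small footprint, as a sketch: the half-billion candidate values below are never all in memory at once; each one is produced, tested and forgotten before the next is generated.

    import java.util.stream.IntStream;

    public class TinyFootprint {
        public static void main(String[] args) {
            // Counts multiples of 97 up to 500,000,000 without ever storing the candidates:
            // each int is generated, tested by the filter, and discarded before the next one.
            long count = IntStream.rangeClosed(1, 500_000_000)
                                  .filter(n -> n % 97 == 0)
                                  .count();
            System.out.println(count);
        }
    }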

Duplicating discussion in our Java8 forum.
 
Biniman Idugboe (Ranch Hand)

In fact a simple sequential stream might only need enough memory for one element. Once that element has been finished with, the memory location is filled by the reference to the next element. There is often no need to keep a record of how many elements have been processed.


That clarifies it all for me.  Thanks.
 
Biniman Idugboe (Ranch Hand)
By the way, can I mark my own post as helpful?
 
Stephan van Hulst (Saloon Keeper)
What do you mean? If you want to give it a thumbs up, I have just done that for you.

If you want to keep it around for future reference, you can bookmark this topic.
 
ras oscar (Ranch Hand)
And to complete the thought, if for some reason you wanted to use the contents of the stream again, you could store each element in an internal array or list as it goes past.

And to complete the thought further, you could wrap the stream creation in another class that provides a rewind method, which closes the existing stream and opens a new one starting back at index 0.
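
One common way to get that effect is to hold on to a recipe for creating the stream rather than the stream itself. A minimal sketch, using a hypothetical wrapper class that is not part of the JDK:

    import java.util.List;
    import java.util.function.Supplier;
    import java.util.stream.Stream;

    public class ReplayDemo {

        // Hypothetical wrapper: instead of "rewinding" a stream, it simply
        // creates a fresh stream from the same recipe whenever one is needed.
        static class ReplayableSource<T> {
            private final Supplier<Stream<T>> recipe;

            ReplayableSource(Supplier<Stream<T>> recipe) {
                this.recipe = recipe;
            }

            Stream<T> freshStream() {
                return recipe.get();   // every call starts again "at index 0"
            }
        }

        public static void main(String[] args) {
            ReplayableSource<String> source =
                    new ReplayableSource<>(() -> List.of("a", "b", "c").stream());

            source.freshStream().forEach(System.out::println);   // first traversal
            source.freshStream().forEach(System.out::println);   // works again: a brand-new stream
        }
    }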
 
Tim Holloway (Saloon Keeper)

ras oscar wrote: And to complete the thought, if for some reason you wanted to use the contents of the stream again, you could store each element in an internal array or list as it goes past.

And to complete the thought further, you could wrap the stream creation in another class that provides a rewind method, which closes the existing stream and opens a new one starting back at index 0.



Actually, InputStream has mark() and reset() methods. If the stream object returns true from markSupported(), you can mark your position and later reset to it, so that you re-read from the marked point.

Marking is similar to the seek() functions that many filesystems offer, except that it is more limited. Its primary intent is to allow for re-scanning from a set point (something that parsers like to do, for example).

Not all streams support marking, since sometimes the data is one-shot. A classic example would be an old-time punched-card reader: once a card was read, you couldn't reverse the mechanism and "un-read" the card. A more modern example is a stream over data arriving from a network connection.

Streams are different from files, which, alas, are not the same thing as java.io.File (a class that is actually more about directories and paths than about file I/O). Files may be opened for random access. Streams, by definition, are accessed linearly, often with some pre-processing of the data along the way; LineNumberInputStream, for example, watches for newline indicators and maintains a line-number counter.
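
A minimal sketch of mark() and reset(); BufferedInputStream is one of the stream classes that does support marking:

    import java.io.BufferedInputStream;
    import java.io.ByteArrayInputStream;
    import java.io.IOException;
    import java.io.InputStream;

    public class MarkResetDemo {
        public static void main(String[] args) throws IOException {
            InputStream in = new BufferedInputStream(
                    new ByteArrayInputStream("hello".getBytes()));

            if (in.markSupported()) {
                in.mark(16);                              // remember this position (valid for up to 16 bytes)
                System.out.println((char) in.read());     // h
                System.out.println((char) in.read());     // e
                in.reset();                               // jump back to the marked position
                System.out.println((char) in.read());     // h again
            }
        }
    }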
 
Campbell Ritchie (Marshal)

Biniman Idugboe wrote: . . . Thanks.

That's a pleasure.

I thought you were asking about java.util.stream.Stream and similar objects, rather than InputStream, which does have a limited capability for resetting from a previous index.
Ras Oscar made a good point about storing data. If you have a source whose data cannot be read again, as Tim H said, I suggest you consider whether it is worth collecting everything into a collection of some sort. You will probably find a List<String> easier to use than an array for keyboard input.
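
For example, a one-shot source such as the keyboard can be drained into a List once, and that List can then be streamed as many times as you like. A sketch (it reads until the first empty line):

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.util.ArrayList;
    import java.util.List;

    public class CollectThenReuse {
        public static void main(String[] args) throws IOException {
            BufferedReader keyboard = new BufferedReader(new InputStreamReader(System.in));

            // Read the one-shot source exactly once, storing the lines.
            List<String> lines = new ArrayList<>();
            for (String line = keyboard.readLine(); line != null && !line.isEmpty(); line = keyboard.readLine()) {
                lines.add(line);
            }

            // Now the List can be streamed as many times as needed.
            long withSpaces = lines.stream().filter(l -> l.contains(" ")).count();
            lines.stream().map(String::toUpperCase).forEach(System.out::println);
            System.out.println(withSpaces + " line(s) contained a space.");
        }
    }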
 
Biniman Idugboe (Ranch Hand)
All points are noted.
 
Prasad Saya (Rancher)

a stream can be traversed only once and the stream is said to have been consumed after it has been traversed



Streams (java.util.stream.Stream) are meant for data transformation. Collections, on the other hand, are about storing data and giving access to it. Stream operations like filter/map/reduce are the most common data-transformation operations (functional-style programming, in other words).

I think data transformation is one of the reasons why a stream can be traversed only once. The next traversal means another transformation, a new process and a new result; so you make a new stream from the same data store.
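
A tiny illustration of that filter/map/reduce style, as a sketch with arbitrary numbers:

    import java.util.List;

    public class FilterMapReduce {
        public static void main(String[] args) {
            List<Integer> values = List.of(3, 8, 12, 5, 20);

            int sumOfBigSquares = values.stream()
                    .filter(v -> v > 5)            // filter: keep only values greater than 5
                    .map(v -> v * v)               // map: transform each value to its square
                    .reduce(0, Integer::sum);      // reduce: combine everything into one result

            System.out.println(sumOfBigSquares);   // 64 + 144 + 400 = 608
        }
    }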
 
Tim Holloway (Saloon Keeper)

Prasad Saya wrote:
Streams (java.util.stream.Stream) are meant for data transformation. Collections, on the other hand, are about storing data and giving access to it.



I'm going to be pedantic, since that's what I do.  

Streams are meant for serial data transfer. Don't forget that there are both input streams and output streams, too.

Your second assertion weakens the first, since if you stream into a collection, you're not necessarily transforming the data, although you are consuming the stream and producing (or adding to) the collection. The nit here is that the data itself may not have been transformed, just its representation, as it were. The data is not the container, whether the container be stream, collection or persistent storage.
 
Prasad Saya (Rancher)

Your second assertion weakens the first, since if you stream into a collection, you're not necessarily transforming the data, although you are consuming the stream and producing (or adding to) the collection.



I may create a collection (through a stream process), but that would be data transformed from a source collection to a target collection; that is what I mean. Otherwise it would just be copying a collection.

For example, I may want to collect the names of all the ranchers with more than, say, 20 cows. The rancher collection is transformed: filtered (where cows > 20), mapped (taking only the names, not the other attributes) and reduced (collected into a collection of choice). That's transformed data.

Also, note that transformation of data is just one of the primary features of streams. An advantage of using streams is that the code shows how the data is transformed, rather than the mechanics of looping over it. This makes for self-documenting, easily readable (and hence maintainable) code.
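
A sketch of exactly that pipeline; the Rancher type and its data are invented for illustration (records need Java 16 or later):

    import java.util.List;
    import java.util.stream.Collectors;

    public class RancherNames {

        record Rancher(String name, int cows) { }

        public static void main(String[] args) {
            List<Rancher> ranchers = List.of(
                    new Rancher("Ann", 35),
                    new Rancher("Bob", 12),
                    new Rancher("Cid", 27));

            List<String> bigHerdOwners = ranchers.stream()
                    .filter(r -> r.cows() > 20)         // filtered: more than 20 cows
                    .map(Rancher::name)                 // mapped: keep only the name
                    .collect(Collectors.toList());      // reduced/collected: gathered into a new List

            System.out.println(bigHerdOwners);          // [Ann, Cid]
        }
    }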
 
Tim Holloway (Saloon Keeper)

Prasad Saya wrote: I may create a collection (through a stream process), but that would be data transformed from a source collection to a target collection. . . .

Don't overlook what I said about the difference between transforming the collection versus transforming the data in the collection.

For that matter, if I read a collection to obtain a collection of database keys so that I can do things with the records indicated by those keys, can you really say I've "transformed" the keys? Consumed, yes, but I think most people would balk at considering that a transformation.
 
Prasad Saya (Rancher)

Don't overlook what I said about the difference between transforming the collection versus transforming the data in the collection.

For that matter, if I read a collection to obtain a collection of database keys so that I can do things with the records indicated by those keys, can you really say I've "transformed" the keys? Consumed, yes, but I think most people would balk at considering that a transformation.



It's still consuming the data for transformation. Yes, one may use the keys to process the data, but the data is also part of that consumption; it must be part of the stream data.

I don't know yet if a stream can be made of keys only and would access an external source to fetch the related data and then transform it. That would be a case of causing a side effect, and according to the Stream API javadocs such a procedure is discouraged. Also, one would be introducing statefulness into the stream operations.
 
Tim Holloway (Saloon Keeper)

Prasad Saya wrote: It's still consuming the data for transformation. . . .

No, there's no transformation being done in this scenario. The data here is an input (possibly one of many inputs) to an unrelated process. In the example case, this process consumes keys and uses them to determine which records from an entirely different (and non-stream) data source will be processed.

Say I have a file with the loan numbers of delinquent loans. I want to mark those loans for follow-up. I read a stream based on the file, and the process that marks the loans uses this stream to look up and mark the loan records. The file is only read; no transformation, conversion or even storage of the data in the file/stream is being done. The data being transformed here is all in the database.

You seem to be thinking of some very intelligent stream processes. That's fine, but the core definition of a stream says nothing about what ultimately should happen to the stream's contents. A basic stream returns data completely un-transformed and leaves it to other processes to decide what transformations, if any, might be done with that data. The greater does not imply the lesser; just because some streams transform doesn't mean that all streams transform.
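
A bare-bones sketch of that scenario; the file name and the markForFollowUp() method are invented stand-ins for the real file and database update:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.stream.Stream;

    public class MarkDelinquentLoans {
        public static void main(String[] args) throws IOException {
            // Hypothetical input: a file of delinquent loan numbers, one per line.
            Path delinquentFile = Paths.get("delinquent-loans.txt");

            try (Stream<String> loanNumbers = Files.lines(delinquentFile)) {
                // Nothing in the stream is transformed; each loan number is simply handed to
                // some other process (a made-up method here) that updates the database record.
                loanNumbers.forEach(MarkDelinquentLoans::markForFollowUp);
            }
        }

        // Stand-in for the real database update.
        private static void markForFollowUp(String loanNumber) {
            System.out.println("Would mark loan " + loanNumber + " for follow-up");
        }
    }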
 
ras oscar (Ranch Hand)
I find this conversation interesting.

Would it be correct to say:

1. Streams as presented as library functions in Java are meant to be used where the data is time sensitive, or in flux, or subject to change primarily outside the application, and the appropriate procedure for updating the data is to re-read the source.
Example: a program that reads the current price for 12 stocks listed on NASDAQ

2.  Collections classes as presented in the Java library functions are appropriate where data transformations will be  limited to operations within the Java application.
Example: reading a file containing an array of  values for a Java program that records all prime numbers from zero to 500,000
 
Campbell Ritchie (Marshal)

Prasad Saya wrote: . . . It's still consuming the data for transformation. . . .

Let's not get into a war of words, but the idea behind the Streams I thought the OP was asking about is that they leave the source of the data unchanged. And that includes all the data in a collection.

I don't know yet if a stream can be made of keys only and would access an external source . . . .

That is conceivable. But you would have to use a very awkward intermediate operation to map the key to the data read from the database. You would have a similarly awkward terminal operation to write the new data back, which is a side‑effect. Remember that a few terminal operations, e.g. forEach(), are allowed to have side‑effects, and that reading information from a store of information (I think) doesn't count as a side‑effect.
I also think that isn't how databases and java.util.stream.Stream were intended to be used.
 
Campbell Ritchie (Marshal)

ras oscar wrote:. . . . 1. Streams as presented as library functions in Java are meant to be used where the data is time sensitive, or in flux, or subject to change . . .

No. Although it is possible to use a Stream like that, they are suitable for any collection. Please look in the documentation and see whether it says anything about the source collection being modified by other code while it is being read by a Stream.

2.  Collections classes as presented in the Java library functions are appropriate where data transformations will be  limited to operations within the Java application.

Not sure I understand that statement.

Example: reading a file containing an array of  values for a Java program that records all prime numbers from zero to 500,000

That is a bad example. You would be more likely to take all the natural numbers ≤ 500,000 and use another method to find whether each one is prime, maybe by checking your array. An example follows below, assuming primes is a BitSet which has already been filled in to record whether each number is or isn't prime; I would use a Sieve of Eratosthenes to fill that BitSet or array. In that sketch, the filter() call selects values without any transformation, boxed() transforms them to a different, but related, type, and collect() gathers them into a new List without further transformation. That list will be biased towards small numbers, because primes make up a larger proportion of the small numbers; it says in The Curious Incident of the Dog in the Night‑Time that the number of primes ≤ n is approximately n/log n.
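
A sketch of that pipeline, assuming primes has already been filled; the sieve() helper shown here is just one way of doing that:

    import java.util.BitSet;
    import java.util.List;
    import java.util.stream.Collectors;
    import java.util.stream.IntStream;

    public class PrimeList {
        public static void main(String[] args) {
            // Assumption: bit n of primes is set exactly when n is prime.
            BitSet primes = sieve(500_000);

            List<Integer> primeList = IntStream.rangeClosed(0, 500_000)
                    .filter(primes::get)            // select primes; the values themselves are unchanged
                    .boxed()                        // transform each int to the related type Integer
                    .collect(Collectors.toList());  // gather into a new List, no further transformation

            System.out.println(primeList.size() + " primes found");
        }

        // Simple Sieve of Eratosthenes.
        private static BitSet sieve(int limit) {
            BitSet isPrime = new BitSet(limit + 1);
            isPrime.set(2, limit + 1);              // assume everything from 2 upwards is prime...
            for (int i = 2; (long) i * i <= limit; i++) {
                if (isPrime.get(i)) {
                    for (int j = i * i; j <= limit; j += i) {
                        isPrime.clear(j);           // ...then cross out the multiples
                    }
                }
            }
            return isPrime;
        }
    }
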
Look here and scroll down to “non‑interference” to find out about modifying Collections whilst iterating them with a Stream.
 
Prasad Saya (Rancher)

Tim Holloway wrote:

Prasad Saya wrote: It's still consuming the data for transformation. . . .

No, there's no transformation being done in this scenario. The data here is an input (possibly one of many inputs) to an unrelated process. In the example case, this process consumes keys and uses them to determine which records from an entirely different (and non-stream) data source will be processed.

Say I have a file with the loan numbers of delinquent loans. I want to mark those loans for follow-up. I read a stream based on the file, and the process that marks the loans uses this stream to look up and mark the loan records. The file is only read; no transformation, conversion or even storage of the data in the file/stream is being done. The data being transformed here is all in the database.

You seem to be thinking of some very intelligent stream processes. That's fine, but the core definition of a stream says nothing about what ultimately should happen to the stream's contents. A basic stream returns data completely un-transformed and leaves it to other processes to decide what transformations, if any, might be done with that data. The greater does not imply the lesser; just because some streams transform doesn't mean that all streams transform.



Referring to this from the above quote:

Say I have a file with the loan numbers of delinquent loans. I want to mark those loans for follow-up. I read a stream based on the file, and the process that marks the loans uses this stream to look up and mark the loan records. The file is only read; no transformation, conversion or even storage of the data in the file/stream is being done. The data being transformed here is all in the database.



Yes, one can run this process. But what happens if the database data gets modified while this process is in progress? Some other process might update or delete some of the loan data while the marking is still under way.

In such a case, why should we use a stream? One can just as well read the file with delinquent loans and update the loans database.
 
Campbell Ritchie (Marshal)

Prasad Saya wrote: . . . Referring to this from the above quote . . .

Please only quote the part of the previous post you are addressing, i.e. that part and not the whole post. Quoting everything simply makes the posts longer without adding new information; quoting a small part makes it clear what you are addressing.

. . . what happens if the database data gets modified while this process is in progress? . . .

Have a look at the post about non‑interference I gave you earlier. See whether that helps. I don't know what would happen if the database is altered concurrently. That sounds like a policy decision you would have to make. Are you locking the database? Are new payments being recorded so loans lose their delinquent status? Don't know. I think you are going to have to test that.

One can just as well read the file with delinquent loans and update the loans database.

Surely you would update the delinquency file from the database?
When Tim is going on about the greater not implying the lesser, I shall go on about hard cases making bad law. I would suggest we try a straightforward example. I can't envisage many people using a database and Stream together as you are suggesting. Let's try something simpler.