This looks like a very interesting book. Looking at the table of contents, I was wondering if you'd considered using a streaming approach to analyse not only real-time (or close to real-time) data, but also very large, finite stored data. This seems to be the approach behind the Apache Beam project, founded by the guys at Google (I don't work for them, by the way, I am just a researcher). Basically, the idea is that you could use the same abstractions from streaming, such as windowing, to analyse very large, but finite, data. The conventional dichotomy between batch and stream thus ceases to exist, since all big data, finite or infinite, can be treated as a stream.
Fascinating stuff, and it really expands the scope of streaming methods and techniques beyond near-real-time or streaming data.
Congratulations on the publication of your book. I have added it to my "to read" list!
You are certainly on to something here. The approach with Apache Beam is fantastic and something I firmly believe we need as a community. It lays the foundation for how we think about and talk about streaming systems regardless of the underlying stream processing engine. You are correct about the approach to finite (historically called batch) and infinite data streams. When you think about this, you come to the conclusion that a batch is really a dataset with a finite start/end time, and thus you can treat it like a stream.
When you start to think about things this way, then you start to think streaming first. If you look at Apache Flink you will see a very similar approach -- from day one Flink had the perspective that everything is a stream and batch is just a stream with a fixed start / stop time.
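To make the "batch is just a bounded stream" idea concrete, here is a minimal sketch in plain Python. It is not Beam or Flink code; the function name `fixed_window_counts`, the `(timestamp, key)` event shape, and the sample data are all made up for illustration, and it assumes events arrive in timestamp order (real engines handle out-of-order data with watermarks). The point is only that the same windowing function works on a stored list and on an endless generator.

```python
import itertools
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event shape for this sketch: (event_time_seconds, key).
Event = Tuple[int, str]


def fixed_window_counts(
    events: Iterable[Event], size: int
) -> Iterator[Tuple[int, Dict[str, int]]]:
    """Yield (window_start, counts_per_key) for each fixed window.

    Assumes events arrive in non-decreasing timestamp order.
    """
    current_start = None
    counts: Dict[str, int] = defaultdict(int)
    for ts, key in events:
        start = ts - ts % size  # start of the window this event falls into
        if current_start is not None and start != current_start:
            # Time moved past the current window, so emit it and start fresh.
            yield current_start, dict(counts)
            counts = defaultdict(int)
        current_start = start
        counts[key] += 1
    # Only reached for a bounded source: flush the last open window.
    if current_start is not None:
        yield current_start, dict(counts)


# Bounded ("batch") source: a stored, finite list of events.
stored = [(0, "a"), (3, "b"), (12, "a"), (14, "a"), (25, "b")]
batch_result = list(fixed_window_counts(stored, size=10))


# Unbounded source: an endless generator, one event every 4 seconds.
def endless() -> Iterator[Event]:
    for t in itertools.count(0, 4):
        yield t, "tick"


# Same function, same windowing; we just stop reading after two windows.
first_two = list(itertools.islice(fixed_window_counts(endless(), size=10), 2))
```

The only difference between the two cases is whether the final flush is ever reached: the stored list ends, so its last window is emitted at the end of input, while the endless source keeps emitting windows as event time advances.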
Thank you for the kind words and for adding my book to your "to read" list.