Congratulations on writing and publishing your book!
Q1] I noticed that the MEAP started around the middle of 2013. Is there any component of RabbitMQ that has changed drastically in the last 2 years, and if so, does the book cover those latest changes?
Q2] I see a lot of big data systems adopting Kafka as their messaging solution. If you're familiar with Kafka, how does RabbitMQ compare to it in your opinion? More generally, how does RabbitMQ fare in big data systems, and what are some of its well-known bottlenecks?
Re question #1 - The book is up to date with RabbitMQ v3.5.2, and I have been working through the MEAP to keep it current. One of the more fun aspects of writing a book on RabbitMQ is that the team has watched what I've had to say and has directly addressed problems I've discussed in the book, requiring me to go back and revise those sections. For example, when I first started, RabbitMQ used TCP back pressure to throttle publishers when needed, and I had written a whole section on how to detect and deal with this issue. Within about six months of the MEAP release of the chapter covering the topic, they added a mechanism that tells the client when the broker is applying back pressure and when it can safely publish again.
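To illustrate that pattern, here is a minimal stdlib sketch of a publisher that gates its publishes on block/unblock notifications from the broker. The `GatedPublisher` class and its simulated callbacks are hypothetical, written only to show the shape of the technique; in a real client library the callbacks would be registered with the connection (pika, for instance, exposes `add_on_connection_blocked_callback` and `add_on_connection_unblocked_callback` for this).

```python
import threading

# Hypothetical sketch: pause publishing while the broker reports it is
# blocked, and resume when it says it is unblocked. In a real client the
# on_blocked/on_unblocked methods would be wired up as connection callbacks.


class GatedPublisher:
    """Gates publishes on simulated broker blocked/unblocked notifications."""

    def __init__(self):
        # Set when publishing is allowed; cleared while the broker is blocked
        self._can_publish = threading.Event()
        self._can_publish.set()
        self.published = []

    def on_blocked(self, reason):
        # Broker signaled it is applying back pressure (e.g. a resource alarm)
        self._can_publish.clear()

    def on_unblocked(self):
        # Broker signaled it can accept publishes again
        self._can_publish.set()

    def publish(self, message, timeout=1.0):
        # Block until the broker is accepting publishes, up to `timeout`
        if not self._can_publish.wait(timeout):
            raise TimeoutError("broker still blocked; dropping or retrying")
        self.published.append(message)


pub = GatedPublisher()
pub.publish("first")              # broker healthy, goes straight through
pub.on_blocked("resource alarm")  # broker applies back pressure
pub.on_unblocked()                # alarm clears
pub.publish("second")             # resumes normally
```

The point of the design is that the publisher no longer has to infer back pressure from stalled TCP writes; it reacts to an explicit signal and can decide whether to wait, buffer, or shed load.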
Re question #2 - I am aware of Kafka, though I have not used it in production. From what I know of it, RabbitMQ is much more flexible. I'm a big fan of using the right tool for the job, and I'm sure I'll come across a project where I can put Kafka through its paces some day. I've worked in multiple big-data environments and have had RabbitMQ at the heart of some very large event-based architectures. In my day job, for example, I feed Avro-serialized records through RabbitMQ for a large event-based system. The records are then collected into Avro container files and shipped to S3 for warehousing and EMR processing. I'm currently experimenting with connecting RabbitMQ directly to Apache Spark for the next step in our big-data processing.
I'd say the biggest bottleneck shows up when users decide they want to persist messages to disk as part of the publishing contract with RabbitMQ, but do not consider that the resulting behavior is exactly that of a write-heavy OLTP database server. If you want persisted messages, you need to provision your servers with enough IOPS to sustain the same write throughput you would expect from a database doing an equivalent volume of writes.
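As a rough back-of-envelope illustration of that sizing argument (all numbers below are hypothetical, and the one-write-per-message model is a deliberate simplification, ignoring batching and journal coalescing):

```python
# Back-of-envelope disk sizing for persistent-message throughput.
# Assumes, simplistically, one disk write per persisted message,
# rounded up to whole I/O operations of io_size_bytes each.


def required_iops(messages_per_sec, avg_message_bytes, io_size_bytes=4096):
    """Estimate the disk IOPS needed if every message is persisted."""
    # Ceiling division: how many io_size_bytes writes one message needs
    ios_per_message = max(1, -(-avg_message_bytes // io_size_bytes))
    return messages_per_sec * ios_per_message


# 20,000 msg/s of 2 KB messages -> each fits in a single 4 KB write
print(required_iops(20_000, 2_048))   # 20000 IOPS

# Same rate with 10 KB messages -> three 4 KB writes per message
print(required_iops(20_000, 10_240))  # 60000 IOPS
```

Even this crude model makes the point: a publish rate that a broker handles easily in memory can demand database-class storage the moment persistence is part of the contract.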
In both the publisher and consumer chapters I directly address throughput and performance, providing a map of feature use to performance gain or loss. Those chapters have been useful for me and my team in deciding which features we need and want to use for our projects.