
Python for Data Analysis

Book Review Team
Bartender

Joined: Feb 15, 2002
Posts: 943
Author/s: Wes McKinney
Publisher: O'Reilly Media
Category: Other
Review by: Deepak Bala
Rating: 7 horseshoes

I managed to catch an early release version of this book. It resembles another O'Reilly book, 'Mining the Social Web' (MTSW), although the examples in this book are not restricted to social web sites.

The author takes you through a couple of Python libraries before getting you to work on examples. The chapters on these libraries serve as references when you want to revisit technique X on data structure Y. The illustrations include analyzing bit.ly data, Twitter tweets and so on. To use any of these libraries you MUST know Python.

What I like:

* Chapters explaining library functions serve as a good reference to come back to later.
* The author has taken time to point out pitfalls while using the libraries. This includes tips on how to use them practically (say, reading a huge CSV file in batches and aggregating the result).
* Pragmatic examples based on real data that is out there, such as the USDA food database and baby names.
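
To give a flavour of the batch-reading tip mentioned above, here is a minimal sketch. It is not taken from the book: the function name `sum_by_key`, the column names and the sample data are all made up for illustration, and it assumes pandas is installed. It relies on the `chunksize` argument of `pandas.read_csv`, which returns the file in pieces instead of one big DataFrame.

```python
import io

import pandas as pd


def sum_by_key(csv_source, key, value, chunksize=100_000):
    """Sum `value` per `key`, reading the CSV a chunk at a time.

    Hypothetical helper for illustration; only one chunk is held
    in memory at any point, plus the running totals.
    """
    totals = None
    # With chunksize set, read_csv yields DataFrames of up to `chunksize` rows.
    for chunk in pd.read_csv(csv_source, chunksize=chunksize):
        partial = chunk.groupby(key)[value].sum()
        if totals is None:
            totals = partial
        else:
            # fill_value=0 keeps keys that appear in only one of the two Series.
            totals = totals.add(partial, fill_value=0)
    return totals


# Tiny in-memory stand-in for a huge file (made-up data).
sample = io.StringIO("name,count\nanna,3\nben,2\nanna,4\ncara,1\nben,5\n")
result = sum_by_key(sample, key="name", value="count", chunksize=2)
print(result.sort_index())
```

In real use you would pass a file path instead of the `StringIO` object and pick a chunk size that fits comfortably in memory. This only works directly for aggregates that can be combined from partial results, such as sums and counts.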

Areas for improvement:

* I really wish the illustrations and library usage were married together. The pandas library takes some getting used to for a newcomer. When you read 10 pages of instructions on how to use the library, you forget what you read on page 1. It would have been much easier had the author explained them side by side.

* I don't think the introduction to Python needed to be placed towards the end of the book. It seemed like a weird place to put it. If you want to introduce users to Python before they use pandas and NumPy, it would be better to make that the first chapter and tell those who already know Python to skip it. IMHO if you do not know Python, following some of the examples will be harder since you need to learn Python AND libraries like pandas.


Overall a nice book. I would recommend reading this book and then following it up with MTSW. I don't think the author of MTSW used pandas in any of the use cases, so the knowledge learned here should help (I've not read MTSW yet, and searching the book for 'pandas' turned up no results).
