Text Processing in Python by David Mertz

Author/s : David Mertz
Publisher : Addison-Wesley
Category : Other
Review by : Margarita Isayeva
Rating : 8 horseshoes
This book provides a thorough overview of techniques and of standard and non-standard modules for the various tasks that fall under the "text processing" umbrella. The ideal reader is already familiar with Python or experienced in other languages. For the latter group, there is an appendix with a short introduction to Python basics.
The text is evenly divided into five chapters, 70-100 pages each.
Chapter 1 starts with a discussion of functional programming and higher-order functions, followed by an overview of the Python features and data types that matter for text processing. Relevant (even if only remotely) modules in the standard library are listed, and the most important of them are illustrated with examples. Chapter 2 shows how standard Python functions, including the most important one, the string module, can be used to solve problems (example: counting the number of words in a given text). Chapter 3 offers a short introduction to regular expressions, followed by several example Python programs, usually about a page long (one of the problems to solve: detecting duplicate words). Chapter 4 starts with a light introduction to parsing, grammars and state machines. The author advises on when to use them and when not to, then proceeds to an overview of the standard library. The non-standard mx.TextTools, SimpleParse and PLY libraries are compared and their functionality is described in more detail. Chapter 5 is devoted to assorted tasks, from working with e-mail to parsing HTML and XML, and consists mostly of overviews of standard and third-party libraries.
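To give a feel for the two sample problems mentioned above (counting words and detecting duplicate words), here is a minimal sketch in present-day Python. It is not taken from the book; the names count_words, find_duplicate_words and DUPLICATE_WORD are made up for this illustration, and "word" is assumed to mean a whitespace-separated token.

```python
import re


def count_words(text):
    """Count whitespace-separated tokens in text."""
    return len(text.split())


# A word, some whitespace, then the same word again (hypothetical pattern,
# chosen for this sketch). \1 refers back to the first captured word.
DUPLICATE_WORD = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)


def find_duplicate_words(text):
    """Return every word that is immediately repeated, ignoring case."""
    return [match.group(1) for match in DUPLICATE_WORD.finditer(text)]


sample = "This is is a test of the the duplicate detector."
print(count_words(sample))           # 10
print(find_duplicate_words(sample))  # ['is', 'the']
```

The word boundaries (\b) on both sides are what keep "This is" from being reported as a repeated "is".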
The overall approach is somewhat concept-oriented, and there are questions and problems to solve at the end of the chapters, since students are one segment of the book's target audience. Practitioners will appreciate the book as a solid reference on the available Python text-processing tools.
