
RE Book - When to use and When to not use

 
Gregg Bolinger
Ranch Hand
Posts: 15304
Max, does your book cover suggestions on when to use REs and when they might over-complicate things (if they would)?
Also, does your book give any examples of the above, to say: this is how it would be done without, now look how much easier it was to do using REs?
 
Max Habibi
town drunk
( and author)
Sheriff
Posts: 4118
Absolutely,
I couldn't agree with this sentiment more. Let me quote myself from page 163.

When you have a hammer, everything starts to look a nail. It's important to be aware that not all text-parsing problems require a regex solution. For example, say you need to break a coma-delimited String into it's various components. Of course, it's easy enough to write a regex that does this. However, you don't need regular expressions for this problem: the StringTokenizer is enough.
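As an illustration of the point above (a minimal sketch, not code from the book), a comma-delimited String can be broken up with StringTokenizer and no regex at all. The class name CommaSplit and the method tokens are hypothetical, and the generics syntax is newer than the 1.4-era JDK discussed in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class CommaSplit {
    // Splits a comma-delimited string without using a regex.
    // Note: StringTokenizer skips empty fields, so "a,,b" yields [a, b].
    static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, ",");
        while (st.hasMoreTokens()) {
            out.add(st.nextToken().trim()); // trim stray spaces around each field
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("alpha, beta,gamma")); // prints [alpha, beta, gamma]
    }
}
```

If empty fields matter, this approach is probably the wrong fit, since StringTokenizer silently drops them.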
Along the same lines, you don't have to limit yourself exclusively to a regex or a traditional Java solution. You can mix and match. For example, say you needed to parse a log file and identify the type and frequency of the Exceptions in it. You could probably write a regex that does this for you in a single line-I'm not smart enough or patient enough to do this, but there are probably plenty of people who are.
However, I would content that this is probably isn't the correct approach in Java. by the time you're done writing, testing, and documenting the regex, the other programmers on your team will be trembling in fear at the thought of maintaining your code. It's probably easier to take a programmatic solution that takes advantage of regex, as opposed to writing a pure regex solution. Such is presented in Listing 4-9.
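The book's Listing 4-9 is not reproduced in this thread. As a rough sketch of the general idea only (a small regex plus ordinary Java, not the author's code), one might count exception types in log lines as below. The class name ExceptionCounter, the method count, and the pattern itself are assumptions made for illustration.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExceptionCounter {
    // Deliberately small regex: an optional dotted package prefix followed by
    // a name ending in "Exception". Everything else is left to plain Java.
    private static final Pattern EXCEPTION =
        Pattern.compile("\\b(?:\\w+\\.)*\\w*Exception\\b");

    // Returns exception name -> number of occurrences, sorted by name.
    static Map<String, Integer> count(Iterable<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher m = EXCEPTION.matcher(line);
            while (m.find()) {
                counts.merge(m.group(), 1, Integer::sum); // tally in Java, not in the regex
            }
        }
        return counts;
    }
}
```

The regex only recognizes a name; the counting, sorting, and any later reporting stay in Java, where teammates can read and change them.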

M
[ April 13, 2004: Message edited by: Max Habibi ]
 
Avi Abrami
Ranch Hand
Posts: 1141
Max,
According to my reckoning, you have five grammatical and spelling mistakes in the (book) excerpt that you posted. That's an average of almost two mistakes per paragraph. Did you post the excerpt from memory, or is this the actual text as it appears in the book? Pardon me, but that implies (to me) a low level of professionalism on the part of the book's publishers. Is there an errata page where people can draw attention to mistakes found in the book? Perhaps we can hope that the next printing of the book will contain less mistakes.
Good Luck,
Avi.
 
Jeroen Wenting
Ranch Hand
Posts: 5093
I count 2 errors myself (writing it's where it should be its and coma where comma is intended), some superfluous (but technically correct) punctuation, and a few places where I'd have used different expressions or syntax to convey the same message in fewer words.
All are quite common mistakes and differences of opinion which no automated checker would find and human proofreaders might well not find either.
What counts is that the intent of the author is communicated clearly, and it is. Of course, in future pressings of the book such things should be corrected as found.
 
Tim West
Ranch Hand
Posts: 539
Three others are:
  • The missing word 'like' in the first sentence: "When you have a hammer, everything starts to look [like] a nail".
  • Use of 'content' in the last paragraph where 'contend' is intended: "I would content that this is probably isn't the correct approach in Java"
  • No capitalisation on 'by', the first word of the second sentence in the final paragraph.


That said, I agree with Jeroen's sentiments and wouldn't want to detract from what sounds like a great book on the basis of this.
-- Tim
(On a side note, I think my post is probably the result of being brought up by an English teacher :-S)
     
    Max Habibi
    town drunk
    ( and author)
    Sheriff
    Posts: 4118
    Originally posted by Avi Abrami:
    Max,
    According to my reckoning, you have five grammatical and spelling mistakes in the (book) excerpt that you posted. That's an average of almost two mistakes per paragraph. Did you post the excerpt from memory, or is this the actual text as it appears in the book? Pardon me, but that implies (to me) a low level of professionalism on the part of the book's publishers. Is there an errata page where people can draw attention to mistakes found in the book? Perhaps we can hope that the next printing of the book will contain less mistakes.
    Good Luck,
    Avi.


    I typed it while looking @ the book, so there are probably errors. But the book itself isn't completely free of errors: so far, I'm not aware of one that is. However, different books speak to different audiences, and you may find the tone, content, or general sense doesn't communicate well to you.
    But yes, there will be an errata sheet. Mebbe I can get some of you to proof it
    M
    [ April 14, 2004: Message edited by: Max Habibi ]
     
    Jim Yingst
    Wanderer
    Sheriff
    Posts: 18671
    Already checked. I counted seven anomalies in Max's excerpt, ranging from outright errors to stylistic oddities. All were correct in the book. Don't let Max's typing prevent you from buying the book.
    [AA]: Perhaps we can hope that the next printing of the book will contain less mistakes.
    Or fewer mistakes, as the case may be. But actually this isn't a problem.
    I did notice one oddity in the book while looking up the above excerpt, nearby on p. 161:

    Max, is there some situation in which the matcher() method might return null? The null check seems superfluous to me. I see the same thing on 155 too - but not on many other pages using matcher(). Wuzzup?
    [ April 14, 2004: Message edited by: Jim Yingst ]
     
    Max Habibi
    town drunk
    ( and author)
    Sheriff
    Posts: 4118
    hmmm...looks like someone stole Paul's book
    To answer your question: as far as I could ascertain, there is no requirement that the Pattern.matcher method actually return a Matcher object in the case of unsuccessful calls (that is, where the pattern doesn't match the candidate in any form: think of soliciting a value from a Map when your key isn't in the Map).
    As a matter of fact, the earliest 1.4 JDKs had a habit of not doing this. It's only in 1.4.1+ that they started to change that behavior. The reason I left it in the code is that there was some talk of the IBM JVM choosing this implementation, though I believe they've steered away at this time.
    I actually had an explanation about this in the book, but decided to drop it in the editing phase. I didn't want to burden readers with information they probably wouldn't need. OTOH, I wanted to have a check in place, just in case the architects of the regex stuff changed their mind. I'll probably add a note about this to the errata.
    All best,
    M
     
    Jim Yingst
    Wanderer
    Sheriff
    Posts: 18671
    looks like someone stole Paul's book
    Yes and no. He got it, was happy to see it, and promptly put his name in it. Of course I pointed out my name was already in it. But this afternoon as I was about to leave work I saw it lying on his bookshelf unattended, and so I... ummm... borrowed it. You know, while waiting for USPS to get their act together. :roll:
    To answer your question: as far as I could ascertain, there is no requirement that the Pattern.matcher method actually return a Matcher object in the case of unsuccessful calls (that is, where the pattern doesn't match the candidate in any form: think of soliciting a value from a Map when your key isn't in the Map).
    The difference being that the API for get() in Map says clearly: "Returns: the value to which this map maps the specified key, or null if the map contains no mapping for this key." Very well-defined. Whereas Pattern's matcher() says: "Returns: A new matcher for this pattern." Also very clear and well-defined, IMO. How can the latter method possibly return null, and still implement the API? Of course we've been known to disagree on interpretation of specs once or twice in the past.
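The contrast between the two contracts can be checked with a small sketch. This reflects behavior on current JDKs and says nothing about the early 1.4 betas discussed in this thread; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherContract {
    // Pattern.matcher() hands back a Matcher even when the input cannot match;
    // the mismatch only shows up when find() or matches() is called.
    static boolean matcherIsNonNullOnMismatch() {
        Matcher m = Pattern.compile("\\d+").matcher("no digits here");
        return m != null && !m.find();
    }

    // Map.get(), by contrast, is documented to return null for a missing key.
    static boolean mapReturnsNullOnMiss() {
        Map<String, String> map = new HashMap<>();
        return map.get("missing") == null;
    }

    public static void main(String[] args) {
        System.out.println(matcherIsNonNullOnMismatch());
        System.out.println(mapReturnsNullOnMiss());
    }
}
```

So the question to ask of a Matcher is whether find() or matches() succeeded, not whether the reference is null.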
    As a matter of fact, the earliest 1.4 JDKs had a habit of not doing this. It's only in 1.4.1+ that they started to change that behavior.
    Copied from src.zip in JDK 1.4.0:

    This is unchanged in 1.4.2 and 1.5 beta. I don't have a copy of a 1.4 beta handy; perhaps those were more buggy. Though there don't seem to be any relevant bug reports in Sun's database.
    The reason I left it in the code is that there was some talk of the IBM JVM choosing this implementation, though I believe they've steered away at this time.
    I should hope so, given the API.
    I actually had an explanation about this in the book, but decided to drop it in the editing phase. I didn't want to burden readers with information they probably wouldn't need. OTOH, I wanted to have a check in place, just in case the architects of the regex stuff changed their mind. I'll probably add a note about this to the errata.
    You've also got plenty of other code in the book that does not perform this check. Middle of p. 165 for example. Though it's back again at the top of p. 167.
    [ April 14, 2004: Message edited by: Jim Yingst ]
     
    Max Habibi
    town drunk
    ( and author)
    Sheriff
    Posts: 4118
    Originally posted by Jim Yingst:
    The difference being that the API for get() in Map says clearly: "Returns: the value to which this map maps the specified key, or null if the map contains no mapping for this key." Very well-defined. Whereas Pattern's matcher() says: "Returns: A new matcher for this pattern." Also very clear and well-defined, IMO. How can the latter method possibly return null, and still implement the API? Of course we've been known to disagree on interpretation of specs once or twice in the past.

    I know: I'm still waiting for your counter-example, where the Thread pauses. Maybe if you read in a Gig?

    As a matter of fact, the earliest 1.4 JDKs had a habit of not doing this. It's only in 1.4.1+ that they started to change that behavior.
    Copied from src.zip in JDK 1.4.0:

    This is unchanged in 1.4.2 and 1.5 beta. I don't have a copy of a 1.4 beta handy; perhaps those were more buggy. Though there don't seem to be any relevant bug reports in Sun's database.

    I can see where the constructor of Matcher might throw an IndexOutOfBoundsException, but that wouldn't produce a null. Maybe I'm wrong about the version number, and it was in the Beta. I remember poking around about this, because I was also surprised about the behavior: that's actually why the null check is in there, and the reason I started looking into this. But you're probably right: the null check seems superfluous. I'll reconfirm, and potentially add it to the errata. Thanks for bringing it up, though I wish you hadn't caught me flat-footed during the promo.
    All best,
    M
     
    Jim Yingst
    Wanderer
    Sheriff
    Posts: 18671
    I'm still waiting for your counter-example, where the Thread pauses. Maybe if you read in a Gig?
    I remember that at one point I did drop quite a few of the subtopics from my replies, just to focus on one particular issue that seemed fairly clear-cut to me (the API allowing incomplete reads). My thought was to just get that one out of the way before returning to the other issues - but then we ended in that terminal deadlock, neither able to convince the other. Maybe after the promo I'll look again at the earlier parts of that thread to see if there are points worth reopening.
    Thanks for bringing it up, though I wish you hadn't caught me flat-footed during the promo
    Sorry 'bout that - wasn't the original intent.
    Returning to the original topic of this thread - I agree wholeheartedly with what Max wrote in his book on the topic. Many people, after learning about regexes, seem to enjoy crafting complex regexes that do all the processing they need in a single line. There's a certain fun in doing this, similar to solving puzzles or participating in an obfuscated Perl contest. But it's usually a bad idea from a software engineering perspective: no one else wants to maintain it. The original coder probably won't want to maintain it either, a few weeks later when he revisits the code and doesn't remember exactly what he was thinking. There are a lot of long regexes that would benefit from being split into two or more (typically outer and inner expressions, rather than first and last). Or one short expression and some other Java processing.
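To make the outer/inner split concrete, here is a minimal sketch under assumed input: a line containing one bracketed section of key=value pairs. The input format, the class name TwoStepParse, and both patterns are invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwoStepParse {
    // Outer expression: isolate the text between the first pair of brackets.
    private static final Pattern OUTER = Pattern.compile("\\[([^\\]]*)\\]");
    // Inner expression: pull key=value pairs out of that isolated text only.
    private static final Pattern INNER = Pattern.compile("(\\w+)=(\\w+)");

    static Map<String, String> parse(String line) {
        Map<String, String> out = new LinkedHashMap<>();
        Matcher outer = OUTER.matcher(line);
        if (outer.find()) {
            Matcher inner = INNER.matcher(outer.group(1));
            while (inner.find()) {
                out.put(inner.group(1), inner.group(2));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Only pairs inside the brackets are returned; status=200 is ignored.
        System.out.println(parse("status=200 [user=alice, id=42]"));
    }
}
```

Each pattern stays short enough to read at a glance, and either one can be changed or tested without touching the other.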
     