
ParseInt() vs Regex Matcher/Pattern class

 
Sally Jenkins
Greenhorn
Posts: 17
Java
I am asking the user to enter a string.
I would like to "check" if all characters entered are integers.

Two suggestions were to use:

Integer.parseInt() or Regular Expression (Regex) Pattern class/Matcher class

My two questions:

  • Is one method more efficient than the other?
  • Should I be concerned about Exceptions?

    Campbell Ritchie
    Marshal
    Posts: 55711
    163
    Don't use parseInt for this, because it throws an Exception on invalid input, and that means you are using Exceptions instead of control statements.
    If you want to see whether a String matches an integer or not, you can create a regex to match an integer, or search for a ready‑made regex elsewhere. Note that the two approaches match different things. An integer regex will probably match numbers like 12345678901011121314151617181920. That is a valid integer but not a valid int, because ints only range from −2³¹ to 2³¹ − 1.

    Another way to do it is to pass the String to a Scanner constructor and test whether it has a next int. But beware: what will happen if you pass this String:-
    "123 Sally Jenkins"?
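To make the question above concrete, here is a minimal sketch of the Scanner check. The class and method names are made up for illustration. With the default whitespace delimiter, hasNextInt() only looks at the first token, so trailing text goes unnoticed.

```java
import java.util.Scanner;

// Hypothetical helper, for illustration only.
final class ScannerPrefixDemo {
    // Reports whether the FIRST whitespace-delimited token parses as an int.
    static boolean firstTokenIsInt(String input) {
        try (Scanner scanner = new Scanner(input)) {
            return scanner.hasNextInt();
        }
    }

    public static void main(String[] args) {
        // "123" is the first token, so the trailing name is never examined.
        System.out.println(firstTokenIsInt("123 Sally Jenkins")); // true
        System.out.println(firstTokenIsInt("Sally 123"));         // false
    }
}
```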
     
    Carey Brown
    Bartender
    Posts: 2994
    46
    Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
    Sally Jenkins wrote:I would like to "check" if all characters entered are integers.

    Do you mean "integers" or "digits"?
    Integer.parseInt() will accept negative integers. If you are looking for "digits" then this won't work. If you use regex, then you get to decide if a leading (-) is acceptable or not.
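The difference can be sketched as follows. The two patterns are assumptions chosen for illustration, not something prescribed in this thread: one accepts digits only, the other also allows a single leading minus sign.

```java
// Hypothetical demo class, for illustration only.
final class SignVersusDigits {
    // Digits only: no sign, no spaces, at least one character.
    static boolean isDigitsOnly(String s) {
        return s.matches("[0-9]+");
    }

    // Same, but an optional leading '-' is allowed.
    static boolean isOptionallyNegative(String s) {
        return s.matches("-?[0-9]+");
    }

    public static void main(String[] args) {
        System.out.println(Integer.parseInt("-235"));         // -235, parseInt accepts the sign
        System.out.println(isDigitsOnly("-235"));             // false
        System.out.println(isOptionallyNegative("-235"));     // true
    }
}
```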
     
    Mike Simmons
    Ranch Hand
    Posts: 3090
    14
    I also would use a Pattern here - however, I think this business of avoiding using exceptions is over-emphasized. If one is new to regular expressions, then using Integer.parseInt() may be a safer way to do it. If we were designing Java's libraries ourselves, then sure, throwing an exception here may not be the best way for parseInt to behave. They could have done better. But, they didn't, and methods like parseInt() are still useful. Occasionally using exceptions for control is one of the realities we have to deal with in Java. Especially when that's the model chosen by the most widely-available library for a given task.

    In the case of Integer.parseInt(), sure, the regex is not that hard to work out. For people familiar with regexes. However I note that if the question had been about parsing a floating-point value, it's suddenly a lot easier to make mistakes in the regex, and much more reason to go with Double.parseDouble(). And as Carey notes, even the Integer version has room for errors if you're not careful.

    Now, the parse method approach can have other downsides, e.g. what if you want to disallow negatives, or what if you need more digits than an int can store? (Or more rarely, what if you need more than even a long can store?) Stuff to consider...
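The range question can be explored with a small sketch like the one below. It uses BigInteger, which has no upper limit, to report the narrowest type that can hold a given decimal string. The class and method names are invented for this example.

```java
import java.math.BigInteger;

// Hypothetical helper, for illustration only.
final class RangeDemo {
    // Returns "int", "long" or "BigInteger": the narrowest type that can hold the value.
    // Throws NumberFormatException if the text is not a decimal integer at all.
    static String narrowestType(String text) {
        BigInteger value = new BigInteger(text);
        // bitLength() excludes the sign bit, so 31 bits fit an int and 63 bits fit a long.
        if (value.bitLength() <= 31) return "int";
        if (value.bitLength() <= 63) return "long";
        return "BigInteger";
    }

    public static void main(String[] args) {
        System.out.println(narrowestType("2147483647"));                       // int
        System.out.println(narrowestType("2147483648"));                       // long
        System.out.println(narrowestType("12345678901011121314151617181920")); // BigInteger
    }
}
```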

    Sally: I would say, for reliability, go with whichever approach you understand better. And for your own benefit, then go back and try the other approach as well, to make sure you understand how that one works. Both approaches are well worth knowing about, in the long run.
     
    Campbell Ritchie
    Marshal
    Posts: 55711
    163
    I would probably still use hasNextXXX, but that uses both regexes and exceptions in the background.
     
    Mike Simmons
    Ranch Hand
    Posts: 3090
    14
    Yes, on reflection that probably is best. The regexes have already been written and debugged by other people, and the exception handling is hidden from view, so people can pretend it isn't there. The problem you noted about what if there's more input after the number - well, that can be resolved by checking hasNext() or next() afterwards, and reacting appropriately if there's something there that shouldn't be there. Likewise, you can choose the method calls to match your requirements, e.g. using nextInt() or nextLong() or nextBigInteger(), depending on what range you need to have, and you can also add checks for other things, like is the number negative (assuming it shouldn't be), etc.
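A minimal sketch of that idea, assuming the requirement is "exactly one non-negative int and nothing else". The names are hypothetical, and the result for input containing grouping separators depends on the default locale.

```java
import java.util.Scanner;

// Hypothetical helper, for illustration only.
final class WholeInputInt {
    static boolean isSingleNonNegativeInt(String input) {
        try (Scanner scanner = new Scanner(input)) {
            if (!scanner.hasNextInt()) {
                return false;               // first token is not an int (or is out of int range)
            }
            int value = scanner.nextInt();
            // Reject negatives, and reject anything left over after the number.
            return value >= 0 && !scanner.hasNext();
        }
    }

    public static void main(String[] args) {
        System.out.println(isSingleNonNegativeInt("2356"));              // true
        System.out.println(isSingleNonNegativeInt("123 Sally Jenkins")); // false
        System.out.println(isSingleNonNegativeInt("-5"));                // false
    }
}
```

Swapping hasNextInt()/nextInt() for hasNextLong()/nextLong() or hasNextBigInteger()/nextBigInteger() widens the accepted range.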
     
    Rajdeep Biswas
    Ranch Hand
    Posts: 231
    1
    Eclipse IDE Java Opera
    Regex is also a way, but for what the OP mentioned, I think the simplest way will be to use Integer.parseInt(). If an exception is thrown, it's not a number. If not, it's cool! Assuming this functionality will go in some utility method, just handle the NumberFormatException and return a boolean appropriately.
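Such a utility method might look like the sketch below. The class and method names are assumptions for illustration. Note that it answers "is this a valid int?", so it also accepts signed values such as -235.

```java
// Hypothetical utility class, for illustration only.
final class IntParsing {
    // Returns true if Integer.parseInt would accept the text, false otherwise.
    static boolean isParsableInt(String text) {
        try {
            Integer.parseInt(text);
            return true;
        } catch (NumberFormatException e) {
            // Thrown for null, empty, non-numeric, or out-of-range input.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isParsableInt("2356")); // true
        System.out.println(isParsableInt("23j5")); // false
    }
}
```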
     
    Campbell Ritchie
    Marshal
    Posts: 55711
    163
    Mike Simmons wrote:. . . The regexes have already been written and debugged . . . and the exception handling is hidden from view, so . . . pretend it isn't there. The problem you noted about what if there's more input after the number . . .
    Good point, that the regexes have been validated by twelve years' regular use. If you get the text for the int with scanner1#next() and pass it to scanner2's constructor, that should reduce the risks of passing further input; it will be necessary however to change the delimiter if your input looks like this:-
    "123+234−345×456÷567"
    You can of course use hasNextInt on the input directly.
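Changing the delimiter could be sketched like this. The character class is an assumption that covers only the four operator symbols in the sample string; they are written as Unicode escapes so the source file stays plain ASCII.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class, for illustration only.
final class OperatorDelimiterDemo {
    static List<Integer> readInts(String expression) {
        List<Integer> numbers = new ArrayList<>();
        try (Scanner scanner = new Scanner(expression)) {
            // Delimiters: plus, minus sign U+2212, multiplication sign U+00D7, division sign U+00F7.
            scanner.useDelimiter("[+\u2212\u00D7\u00F7]");
            while (scanner.hasNextInt()) {
                numbers.add(scanner.nextInt());
            }
        }
        return numbers;
    }

    public static void main(String[] args) {
        // Same text as the sample input above, with the operators escaped.
        System.out.println(readInts("123+234\u2212345\u00D7456\u00F7567")); // [123, 234, 345, 456, 567]
    }
}
```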
     
    Campbell Ritchie
    Marshal
    Posts: 55711
    163
    Rajdeep Biswas wrote:Regex is also a way, but for what the OP mentioned, I think the simplest way will be to use Integer.parseInt(). If an exception is thrown, it's not a number. If not, it's cool! Assuming this functionality will go in some utility method, just handle the NumberFormatException and return a boolean appropriately.
    Exception handling is a poor substitute for control structures; in the case of hasNextInt, there is only a risk of an exception (which is hidden from view) if the regex matches an integer and (probably) if it has ten digits. That is because the only range matched by a regex where one can be sure the text represents an integer and one is not sure whether the text represents an int is where there are ten digits. One can refine that by checking that the first digit is a 2.
     
    Dave Tolls
    Rancher
    Posts: 2914
    36
    Another drawback with hasNextInt is that it will allow '1,000' for example.
    And the OP specifically said that 'all characters entered are integers'.

    That's fairly specific and implies (to me) that this isn't really a number, but possibly some id of some sort...so you don't want '-' or ',' or '.' or anything else that might be a valid thing to appear in an integer, but not in something where you are checking for digits.
     
    Carey Brown
    Bartender
    Posts: 2994
    46
    Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
    OP appears to have left the building.
     
    Sally Jenkins
    Greenhorn
    Posts: 17
    Java
    Not actually left the building, more trying to do research in order to follow most of the conversation.

    To respond to some of the questions, I want to allow the user to enter a number, let's say 2356

    Then I need a way to make sure that each character is a numerical character. In other words, I want an error message to appear should the user enter 23j5.

    So, from what I understand, both Regex and parseInt are viable solutions.

    What I do NOT understand is the idea of exceptions. (I thought I did, but it is clear that I do not.)

    Would anyone mind defining the term EXCEPTION.

    The books I am reading, along with the online research, have not clarified the confusion.

    Mike, I appreciate your comment in regard to reliability. You are correct, I can always try another approach.
     
    Dave Tolls
    Rancher
    Posts: 2914
    36
    Sally Jenkins wrote:
    To respond to some of the questions, I want to allow the user to enter a number, let's say 2356

    Then I need a way to make sure that each character is a numerical character. In other words, I want an error message to appear should the user enter 23j5.

    So, from what I understand, both Regex and parseInt are viable solutions.


    Only if -235 is allowable, or +235 for that matter.
    Otherwise (and I suspect this is going to be the case) parseInt is not a viable solution.
     
    Sally Jenkins
    Greenhorn
    Posts: 17
    Java
    Then it appears that parseInt() might present a problem.

    Another solution besides regex was that I can check each character of the string using if statements.

    Is that an efficient approach? (see example below)


     
    Dave Tolls
    Rancher
    Posts: 2914
    36
    I'd go for the regex.
    It's a fairly simple one I expect. Something involving [0-9] or whatever shortcut.
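A minimal sketch of such a regex check using Pattern and Matcher, with the OP's two sample inputs. The names are hypothetical. The pattern is compiled once and reused, which avoids recompiling it on every call.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class DigitsOnly {
    // One or more ASCII digits; compiled once and shared.
    private static final Pattern DIGITS = Pattern.compile("[0-9]+");

    static boolean isAllDigits(String input) {
        // matches() requires the WHOLE input to fit the pattern.
        return input != null && DIGITS.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String input : new String[] {"2356", "23j5"}) {
            if (isAllDigits(input)) {
                System.out.println(input + " is valid");
            } else {
                System.out.println("Error: \"" + input + "\" contains a non-digit character");
            }
        }
    }
}
```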
     
    Rajdeep Biswas
    Ranch Hand
    Posts: 231
    1
    Eclipse IDE Java Opera
    For a definition of exception, head to Google.
    With a regex, you simply match the input against a given pattern without worrying about exceptions, since you are not actually parsing it to a number. With parseInt or parseLong, you actually try to parse the given String input to a number (needed when your application requires the input as a number, since external input always arrives as a String), and in case of error the method throws an exception that you should handle.
    For example, head to http://regexr.com/3dmdj and mouseover the pattern.
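As a rough illustration of what "throws an exception that you should handle" means in practice: when parseInt cannot do its job, it abandons the normal flow and control jumps to the matching catch block. The class and method names below are invented for the example.

```java
// Hypothetical demo class, for illustration only.
final class ExceptionDemo {
    static String describe(String input) {
        try {
            int value = Integer.parseInt(input); // for "23j5" this line throws...
            return "parsed " + value;            // ...so this line is skipped
        } catch (NumberFormatException e) {
            // ...and execution continues here instead.
            return "error: " + input + " is not a valid int";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("2356")); // parsed 2356
        System.out.println(describe("23j5")); // error: 23j5 is not a valid int
    }
}
```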
     
    Carey Brown
    Bartender
    Posts: 2994
    46
    Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
    Sally Jenkins wrote:Then it appears that parseInt() might present a problem.

    Another solution besides regex was that I can check each character of the string using if statements.

    Is that an efficient approach? (see example below)



    Sounds like a regex is the way to go for your requirements. A simple regex like "\\d+" should suffice.

    Regarding your above code, it would be better written like:
    Edit: mine would need a tweak because it doesn't handle the case of a null or empty string.
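For reference, a character-by-character check along the lines discussed could be sketched as below. This is not the code originally posted in this thread; the names are hypothetical, and it treats null and empty input as invalid.

```java
// Hypothetical helper, for illustration only.
final class CharLoopCheck {
    static boolean isAllDigits(String input) {
        if (input == null || input.isEmpty()) {
            return false;                  // nothing entered counts as invalid
        }
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c < '0' || c > '9') {
                return false;              // stop at the first non-digit
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("2356")); // true
        System.out.println(isAllDigits("23j5")); // false
    }
}
```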
     
    Campbell Ritchie
    Marshal
    Posts: 55711
    163
    Both parseInt() and hasNextInt() would find negative numbers. You would appear to be looking for natural numbers rather than all integers. Is there a maximum value for your number? Do you include 0?

    I did some searching for regular expressions for natural numbers and found a few hits: 1 2. Since they use things like "0"|[1-9][0-9]* there is no limit to the length of the text and numbers of any size can be matched. You will probably want boundary markers with ^ and $ too. Why not simply copy those regexes and acknowledge where you got them from?
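A sketch of the natural-number pattern quoted above, with hypothetical names. It uses Matcher.matches(), which already requires the whole input to match, so the ^ and $ markers are only needed when using find() instead. Whether 0 and leading zeros should be accepted is a requirement decision; this version accepts 0 and rejects leading zeros.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class NaturalNumber {
    // Either a lone 0, or a non-zero digit followed by any number of digits.
    private static final Pattern NATURAL = Pattern.compile("0|[1-9][0-9]*");

    static boolean isNatural(String input) {
        return input != null && NATURAL.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isNatural("2356"));                             // true
        System.out.println(isNatural("007"));                              // false
        System.out.println(isNatural("12345678901011121314151617181920")); // true, no length limit
    }
}
```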
     
    Dave Tolls
    Rancher
    Posts: 2914
    36
    hasNextInt() (and consequently nextInt() as well) is even worse.
    It allows separators, so '1,000' is a valid int in my locale.
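The separator behaviour can be reproduced with a sketch like the following. Locale.US is an assumption made so the result does not depend on the machine's default locale; the class and method names are made up.

```java
import java.util.Locale;
import java.util.Scanner;

// Hypothetical demo class, for illustration only.
final class GroupingSeparatorDemo {
    // Asks a Scanner configured for the given locale whether the text is an int.
    static boolean scannerAcceptsAsInt(String input, Locale locale) {
        try (Scanner scanner = new Scanner(input).useLocale(locale)) {
            return scanner.hasNextInt();
        }
    }

    public static void main(String[] args) {
        // In the US locale ',' is the grouping separator, so Scanner accepts it.
        System.out.println(scannerAcceptsAsInt("1,000", Locale.US)); // true
        // A digits-only regex rejects the same text.
        System.out.println("1,000".matches("[0-9]+"));               // false
    }
}
```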
     
    Campbell Ritchie
    Marshal
    Posts: 55711
    163
    Dave Tolls wrote:. . . '1,000' is a valid int in my locale.
    I forgot about that; I did know however that the format used by nextInt is different from integer literals in the code.
     
    Sally Jenkins
    Greenhorn
    Posts: 17
    Java
    First I would like to thank all of you for your responses. The discussion may have been over my head at times, however it is nice to have a challenge. One of the reasons I requested a vernacular definition of an exception was that after 13 hours of "googling" to write one small section of code, I wasn't quite able to understand how it all fit in the big picture of the program. (And yes, the book I have helped some, but I still had to google.) So, I really want to extend an extra thanks for those who were willing to break things down so that I could understand.

    Yes, I am looking for natural numbers and yes it appears that the regex was the best way to go. Negative numbers were not really a problem, in fact they helped me to determine an error of inequality. Both parseInt() and hasNextInt() as you suspected caused problems. So Regex is definitely the winner!
     