Not sure if there is some mistake in SCJP6 for regex

 
Ranch Hand
Posts: 1087
Java Windows
I came across the description below. Somehow I feel it is wrong with reference to the occurrence at index 11.

0[xX][0-9a-fA-F]

The preceding expression could be stated: "Find a set of characters in which the first character is a "0", the second character is either an "x" or an "X", and the third character is either a digit from "0" to "9", a letter from "a" to "f" or an uppercase letter from "A" to "F" ". Using the preceding expression, and the following data,

source: "12 0x 0x12 0Xf 0xg"
index:   012345678901234567

regex would return 6 and 11. (Note: 0x and 0xg are not valid hex numbers.)

As a second step, let's think about an easier problem. What if we just wanted regex to find occurrences of integers? Integers can be one or more digits long, so it would be great if we could say "one or more" in an expression. There is a set of regex constructs called quantifiers that let us specify concepts such as "one or more." In fact, the quantifier that represents "one or more" is the "+" character. We'll see the others shortly.
 
Bartender
Posts: 10780
71
Hibernate Eclipse IDE Ubuntu

Vishal Hegde wrote: I came across the description below. Somehow I feel it is wrong with reference to the occurrence at index 11.


And why is that? Assuming the writer is talking about Matcher.find(), it seems perfectly reasonable to me.

Winston
 
Bartender
Posts: 4568
9
Why don't you think there's a match at 11? Looks like one to me.
 
Vishal Hegde
Ranch Hand
Posts: 1087
Java Windows

Matthew Brown wrote:Why don't you think there's a match at 11? Looks like one to me.



Sorry, not 11 but position 6. It said the first character should be 0, the second should be x or X, and the third should be either 0-9, a-f or A-F. But index 6 shows 0x12: the first two characters, 0 and x, are correct, but how can the third value, 12, be correct? Shouldn't the value be in the range 0-9 only?
 
Java Cowboy
Posts: 16084
88
Android Scala IntelliJ IDE Spring Java
So? The 1 is in the range 0-9, isn't it? The 2 isn't being looked at at all.
 
Vishal Hegde
Ranch Hand
Posts: 1087
Java Windows

Jesper de Jong wrote:So? The 1 is in the range 0-9, isn't it? The 2 isn't being looked at at all.



Correct me if I am wrong. 0[xX][0-9a-fA-F] means that the first character should be 0, the second should be either x or X, and the third should be 0-9, a-f or A-F.

I see 1 and then 2, so I assume the regex should be something like 0[xX][0-9][0-9]; to my mind, that is what would correctly match 0x12.
 
Bartender
Posts: 1166
17
Netbeans IDE Java Linux

Vishal Hegde wrote:

Jesper de Jong wrote:So? The 1 is in the range 0-9, isn't it? The 2 isn't being looked at at all.



Correct me if I am wrong. 0[xX][0-9a-fA-F] means that the first character should be 0, the second should be either x or X, and the third should be 0-9, a-f or A-F.

I see 1 and then 2, so I assume the regex should be something like 0[xX][0-9][0-9]; to my mind, that is what would correctly match 0x12.



Matcher.find() has to be used for this to make sense in the first place. Your version requires two decimal digits after the x before it matches, whereas the original regex requires only one hex digit. Importantly, the original does not preclude a second, third or fourth hex digit (or any number of them) following the one it matched; it simply does not look at them. If you want to force exactly two hex digits, then you need something like 0[xX][0-9A-Fa-f]{2}[^0-9A-Fa-f], which matches two hex digits followed by a character that is not a hex digit.
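As a rough sketch of the "exactly two hex digits" idea, the snippet below compares the regex above with a variant that uses a negative lookahead, (?![0-9A-Fa-f]). The lookahead variant is not from this thread; it is shown because the trailing [^0-9A-Fa-f] consumes one extra character and cannot match at the very end of the input. The class name, helper method and sample string are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExactTwoHexDigitsDemo {
    // Collects "start:text" for every match that find() reports.
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.start() + ":" + m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String input = "0x12 0x123 0xAB"; // hypothetical sample data

        // Trailing negated class: the match includes the following space,
        // and "0xAB" at the end of the input is missed.
        System.out.println(findAll("0[xX][0-9A-Fa-f]{2}[^0-9A-Fa-f]", input));

        // Negative lookahead: checks the next character without consuming it,
        // so "0xAB" at the end of the input is found as well.
        System.out.println(findAll("0[xX][0-9A-Fa-f]{2}(?![0-9A-Fa-f])", input));
    }
}
```

In both cases "0x123" is rejected, because a third hex digit follows the first two.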
 
Jesper de Jong
Java Cowboy
Posts: 16084
88
Android Scala IntelliJ IDE Spring Java
The regex 0[xX][0-9a-fA-F] only matches three characters. If you call find() to find matches of this regex in the string "12 0x 0x12 0Xf 0xg", it's going to look where in that string there are characters that match the regex.

It finds a match at position 6, because the characters "0x1" match. It doesn't matter what comes after "0x1". It also finds a match at position 11, because "0Xf" also matches.

Note that the regex matcher specifically does not split the text into tokens separated by spaces, which is what you seem to assume. Whether there is a digit, a space, or some other character after the match, doesn't matter.
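For anyone who wants to check this, here is a minimal sketch using the regex and the string from the book. The class and method names are only illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HexFindDemo {
    // Collects "start:text" for every match that find() reports.
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.start() + ":" + m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // prints [6:0x1, 11:0Xf]
        System.out.println(findAll("0[xX][0-9a-fA-F]", "12 0x 0x12 0Xf 0xg"));
    }
}
```

The match at position 6 is the three characters "0x1"; the "2" that follows is never part of it.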
 
Vishal Hegde
Ranch Hand
Posts: 1087
Java Windows
Thanks Jesper, your post cleared my doubt.
I have a few questions, though. What are tokens in Java? And suppose the regex is [0-9]: it will not take a complete number such as '22', right? It will only match within the range 0-9?
 
Jesper de Jong
Java Cowboy
Posts: 16084
88
Android Scala IntelliJ IDE Spring Java
The regex [0-9] matches a single digit which can be 0, 1, 2, 3, 4, 5, 6, 7, 8 or 9. If you use find() on the string "22" with the regex [0-9], it will find two matches, the "2" at position 0 and the "2" at position 1.

A regex such as [0-9], which matches a single character, is not going to find "22", which is two characters, as a match.

The word "tokens" as I used it doesn't have a special meaning in Java in general.
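A small sketch of the difference, assuming the same find() loop as before: [0-9] on "22" reports two one-character matches, while [0-9]+ (the "one or more" quantifier mentioned in the book passage) reports a single match covering both digits. The class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitQuantifierDemo {
    // Collects "start:text" for every match that find() reports.
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.start() + ":" + m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(findAll("[0-9]", "22"));  // prints [0:2, 1:2]
        System.out.println(findAll("[0-9]+", "22")); // prints [0:22]
    }
}
```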
 
lowercase baba
Posts: 13082
67
Chrome Java Linux
To elaborate further, you can think of it as starting at each and every position in the target string and then applying the pattern to see if it matches. So, with a regex of 0[xX][0-9a-fA-F] and a target string of "12 0x 0x12 0Xf 0xg", you apply the regex 18 times, once per starting position.

start at position 0 - the character '1'. Does the pattern match if I start here? '1' does not match '0', so don't return this.
start at position 1 - the character '2'. Does the pattern match if I start here? '2' does not match '0', so don't return this.
start at position 2 - the character ' '. Does the pattern match if I start here? ' ' does not match '0', so don't return this.
start at position 3 - the character '0'. Does the pattern match if I start here? '0' does match '0'. Does 'x' match [xX] (x or X)? Yes. Does ' ' match [0-9a-fA-F]? no.
etc...
start at position 6 - the character '0'. Does the pattern match if I start here? '0' does match '0'. Does 'x' match [xX] (x or X)? Yes. Does '1' match [0-9a-fA-F]? Yes. Since that is the end of the pattern, I should return position '6' as a match.
start at position 7 - the character 'x'. Does the pattern match if I start here? 'x' does not match '0', so don't return this.
etc.
start at position 11 - the character '0'. Does the pattern match if I start here? '0' does match '0'. Does 'X' match [xX] (x or X)? Yes. Does 'f' match [0-9a-fA-F]? Yes. Since that is the end of the pattern, I should return position '11' as a match.
etc...
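The walk-through above can be sketched in code by trying the pattern at every start index with Matcher.region() and Matcher.lookingAt(). This is only an illustration of the mental model, not how find() is implemented: after a successful match, find() resumes at the end of that match rather than at the next index, so the two approaches can differ for patterns whose matches overlap. For this regex and string they give the same positions. The class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PositionScanDemo {
    // Tries the pattern at every start index and records the ones that match.
    static List<Integer> matchingStarts(String regex, String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        for (int i = 0; i < input.length(); i++) {
            m.region(i, input.length()); // restrict the matcher to input[i..end)
            if (m.lookingAt()) {         // must match starting exactly at index i
                starts.add(i);
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        // prints [6, 11]
        System.out.println(matchingStarts("0[xX][0-9a-fA-F]", "12 0x 0x12 0Xf 0xg"));
    }
}
```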
 