
Usage of Regular expressions

 
Ranch Hand
Posts: 4982
Hi Max,
I am wondering whether RE is really that useful for string matching.
While I was working on the SCJD, I read your SCJD book, in which you explain how RE can help in searching data records.
I tried to adopt the method, but it seems to me that there are some limitations, or maybe I am too new to RE. The case I encountered is a record (a row of data) that contains more than one data element, separated by delimiters, in the following format:
NAME:ADDRESS:TEL:ZIP:COUNTRY
If I use RE and, for example, the NAME in row 1 has the same value as the ADDRESS in row 2, a search for that NAME may return both rows 1 and 2, because row 2 also contains the value.
I know this problem would not arise if the data were in a database, but in a case like this, can I tell RE which field (or position) to match against?
Nick
 
Ranch Hand
Posts: 5093
If you want to match only the part before the first separator (colon in your example) you could either create an RE that stops matching on reaching a colon or (which is what I'd probably do) use split(":") on the String and match only on the indexed item you are interested in.
String splitS[] = "NAME:ADDRESS:TEL:ZIP:COUNTRY".split(":") would yield a String array containing ["NAME","ADDRESS","TEL","ZIP","COUNTRY"], so to match the name against the string you are looking for, you would just match splitS[0].
If you also wanted to match the telephone number separately you could do that by matching splitS[2] instead.
Your RE (which you may not need, you might be able to use contains() or equals() depending on what you're matching for) would be a lot cleaner and simpler this way.
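As a minimal sketch of the split approach described above: the class name FieldSplitSearch, the helper findByField, and the two sample records are made up for illustration, and the field order is assumed to be NAME:ADDRESS:TEL:ZIP:COUNTRY as in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FieldSplitSearch {
    // Field positions for the assumed NAME:ADDRESS:TEL:ZIP:COUNTRY layout.
    static final int NAME = 0;
    static final int TEL = 2;

    /** Returns the records whose field at fieldIndex equals wanted exactly. */
    static List<String> findByField(List<String> records, int fieldIndex, String wanted) {
        List<String> hits = new ArrayList<>();
        for (String record : records) {
            // A limit of -1 keeps trailing empty fields, e.g. "a:b:" gives ["a", "b", ""].
            String[] fields = record.split(":", -1);
            if (fieldIndex < fields.length && fields[fieldIndex].equals(wanted)) {
                hits.add(record);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: "Smith" is a NAME in row 1 and part of an ADDRESS in row 2.
        List<String> records = Arrays.asList(
            "Smith:12 Main St:555-0100:10001:US",
            "Jones:Smith Road:555-0199:94105:US");
        System.out.println(findByField(records, NAME, "Smith"));
    }
}
```

Because the comparison is made against one field only, the "Smith Road" address in the second sample record does not produce a hit when searching by NAME.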
 
town drunk
( and author)
Posts: 4118

Originally posted by Nicholas Cheung:
Hi Max,
I know this problem would not arise if the data were in a database, but in a case like this, can I tell RE which field (or position) to match against?
Nick


Hi Nick,
Glad to hear from you again. To answer your question: yes, there's a mechanism just for this. Actually, there are four, and they're under the general topic of lookarounds. They're not that difficult, but they're not completely trivial either, because they match position rather than existence. And yes, I go over them in the book.
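For reference, the four lookarounds in java.util.regex are positive lookahead (?=X), negative lookahead (?!X), positive lookbehind (?<=X), and negative lookbehind (?<!X). The sketch below only illustrates them on a made-up string; LookaroundDemo and firstMatch are hypothetical names, not code from the book.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    /** Returns the start index of the first match, or -1 if there is none. */
    static int firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        String record = "NEW:OLD:NEW"; // made-up record with the same value in two fields

        // Positive lookahead: "NEW" only where a colon follows.
        System.out.println(firstMatch("NEW(?=:)", record));   // 0
        // Negative lookahead: "NEW" only where no colon follows.
        System.out.println(firstMatch("NEW(?!:)", record));   // 8
        // Positive lookbehind: "NEW" only where a colon precedes.
        System.out.println(firstMatch("(?<=:)NEW", record));  // 8
        // Negative lookbehind: "NEW" only where no colon precedes.
        System.out.println(firstMatch("(?<!:)NEW", record));  // 0
    }
}
```

In each case the colon is tested but never becomes part of the match, which is what matching a position rather than existence amounts to.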
If you provide a few example records, we can work backwards together, and try to see how they come to be. Deal?
All best,
M
[ April 14, 2004: Message edited by: Max Habibi ]
 
Nicholas Cheung
Ranch Hand
Posts: 4982
Hi Max,
Assume the following is the record format:
NAME:LOCATION:FIELD_1:FIELD_2:...:FIELD_N
For example, we have the following records:

In this case, if we use the most generic search (checking whether the string "NAME_1" occurs anywhere in a record), both records will be returned.
However, I may only want to find the records whose NAME is "NAME_1", in which case only the 1st record should be returned. Note also that NAME does not have a fixed length.
So how can RE support this kind of search?
Thanks Max.
Nick
 
Greenhorn
Posts: 27

Originally posted by Nicholas Cheung:
Assume the following is the record format:
NAME:LOCATION:FIELD_1:FIELD_2:...:FIELD_N
For example, we have the following records:

If you are certain that NAME always occurs as the first field, then why don't you use a (^) anchor in your regex, which will anchor the pattern so that it matches only at the beginning of the string?
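A rough sketch of the anchor idea, assuming colon-separated records with NAME first. The class and method names are hypothetical, and the sample records stand in for the ones in the question, which are not shown here.

```java
import java.util.regex.Pattern;

class AnchoredNameSearch {
    /** Builds a pattern that matches only when the first field equals name. */
    static Pattern firstFieldEquals(String name) {
        // ^ anchors the match at the start of the input.
        // Pattern.quote makes the name a literal, so regex metacharacters in it are harmless.
        // (?::|$) requires the field to end at a colon or at the end of the input.
        return Pattern.compile("^" + Pattern.quote(name) + "(?::|$)");
    }

    static boolean nameMatches(String record, String name) {
        return firstFieldEquals(name).matcher(record).find();
    }

    public static void main(String[] args) {
        // Hypothetical records: NAME_1 is the NAME of the first and the LOCATION of the second.
        System.out.println(nameMatches("NAME_1:LOCATION_1:FIELD_1", "NAME_1")); // true
        System.out.println(nameMatches("NAME_2:NAME_1:FIELD_1", "NAME_1"));     // false
        System.out.println(nameMatches("NAME_10:LOCATION_3:FIELD_1", "NAME_1")); // false
    }
}
```

The trailing (?::|$) is a design choice rather than a requirement of the anchor: without it, a search for NAME_1 would also accept a record whose name merely starts with NAME_1.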

 
Ranch Hand
Posts: 8934
Is Java's RE based on Unix RE?
 
Wanderer
Posts: 18671
Yes, specifically it's very close to Perl (which incorporated almost everything available in other Unix tools and added to it). Java's regex is a little different, as described in the Pattern API (see "Comparison to Perl 5"). Possessive quantifiers are the most useful feature added by Java - expect them to appear in future versions of Perl and other languages.
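To illustrate what a possessive quantifier does, here is a small sketch (PossessiveDemo is a made-up name). A possessive quantifier such as *+ takes as much as it can and never gives anything back, whereas the ordinary greedy * will backtrack to let the rest of the pattern match.

```java
import java.util.regex.Pattern;

class PossessiveDemo {
    /** True if the whole input matches the regex. */
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Greedy: \w* first consumes "abc1", then backtracks one character so \d can match "1".
        System.out.println(matches("\\w*\\d", "abc1"));  // true
        // Possessive: \w*+ consumes "abc1" and refuses to backtrack, so \d has nothing left.
        System.out.println(matches("\\w*+\\d", "abc1")); // false
    }
}
```

The benefit in practice is that a pattern which cannot match fails quickly instead of trying every way of backtracking.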
[ April 15, 2004: Message edited by: Jim Yingst ]
 
Ranch Hand
Posts: 396
Max,
You are the author of the SCJD book! The chapter on threads in the book is the best on threads that I have read anywhere. Great work indeed.
Thanks,
Vasu
 
Nicholas Cheung
Ranch Hand
Posts: 4982
Hi Tarun,
This is just an example. What I am really wondering is whether there is a convenient way to search inside a text file regardless of position.
i.e. if what I want to compare is the 3rd field, not the 1st, the ^ anchor alone will not be useful.
I thought about this before, and in the end I used a scoring scheme to do the generic search in the SCJD assignment. However, I would really like to know how RE can achieve this.
Nick
 
Max Habibi
town drunk
( and author)
Posts: 4118

Originally posted by vasu maj:
Max,
You are the author of the SCJD book! The chapter on threads in the book is the best on threads that I have read anywhere. Great work indeed.
Thanks,
Vasu


Thanks Vasu
M
 
Max Habibi
town drunk
( and author)
Posts: 4118
Hi Nick,
There are probably 3 million ways to do this, but the following sounds like the sort of solution you wanted.
step 1. create a tmp.records file on your c: drive, consisting of the following

FIELD_R1_1:FIELD_R1_2:NEW_YORK:FIELD_R1_4:FIELD_R1_5:FIELD_R1_6:FIELD_R1_7:FIELD_R1_8
FIELD_R2_1:NEW_YORK:FIELD_R2_3:FIELD_R2_4:FIELD_R2_5:FIELD_R2_6:FIELD_R2_7:FIELD_R2_8
FIELD_R3_1:FIELD_R3_2:FIELD_R3_3:FIELD_R3_4:NEW_YORK:FIELD_R3_6:FIELD_R3_7:FIELD_R3_8
FIELD_R4_1:FIELD_R4_2:FIELD_R4_3:NEW_YORK:FIELD_R4_5:FIELD_R4_6:FIELD_R4_7:FIELD_R4_8
NEW_YORK:FIELD_R5_2:FIELD_R5_3:FIELD_R5_4:FIELD_R5_5:FIELD_R5_6:FIELD_R5_7:FIELD_R5_8
FIELD_R6_1:FIELD_R6_2:FIELD_R6_3:FIELD_R6_4:FIELD_R6_5:FIELD_R6_6:FIELD_R6_7:NEW_YORK

then the code...
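The listing that went with this step is not shown here. As a stand-in only, and not the original code, here is one possible sketch of a regex that tests a specific field of each record. It assumes colon-separated fields and keeps the six records above in memory instead of reading tmp.records; RecordFieldSearch, fieldEquals, and search are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class RecordFieldSearch {
    /** Pattern for records whose field at the zero-based index equals value. */
    static Pattern fieldEquals(int index, String value) {
        // ^(?:[^:]*:){index} skips exactly `index` complete fields from the start.
        // (?=:|$) is a lookahead: the field must end right after the value.
        return Pattern.compile(
            "^(?:[^:]*:){" + index + "}" + Pattern.quote(value) + "(?=:|$)");
    }

    /** Returns the records that have value in the field at the given index. */
    static List<String> search(List<String> records, int index, String value) {
        Pattern p = fieldEquals(index, value);
        List<String> hits = new ArrayList<>();
        for (String record : records) {
            if (p.matcher(record).find()) {
                hits.add(record);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList(
            "FIELD_R1_1:FIELD_R1_2:NEW_YORK:FIELD_R1_4:FIELD_R1_5:FIELD_R1_6:FIELD_R1_7:FIELD_R1_8",
            "FIELD_R2_1:NEW_YORK:FIELD_R2_3:FIELD_R2_4:FIELD_R2_5:FIELD_R2_6:FIELD_R2_7:FIELD_R2_8",
            "FIELD_R3_1:FIELD_R3_2:FIELD_R3_3:FIELD_R3_4:NEW_YORK:FIELD_R3_6:FIELD_R3_7:FIELD_R3_8",
            "FIELD_R4_1:FIELD_R4_2:FIELD_R4_3:NEW_YORK:FIELD_R4_5:FIELD_R4_6:FIELD_R4_7:FIELD_R4_8",
            "NEW_YORK:FIELD_R5_2:FIELD_R5_3:FIELD_R5_4:FIELD_R5_5:FIELD_R5_6:FIELD_R5_7:FIELD_R5_8",
            "FIELD_R6_1:FIELD_R6_2:FIELD_R6_3:FIELD_R6_4:FIELD_R6_5:FIELD_R6_6:FIELD_R6_7:NEW_YORK");
        // Only the first record has NEW_YORK as its 3rd field (index 2).
        System.out.println(search(records, 2, "NEW_YORK"));
    }
}
```

A plain substring search for NEW_YORK would return all six records; fixing the number of skipped fields is what narrows the result to one.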

HTH,
M
[ April 16, 2004: Message edited by: Max Habibi ]
 
Nicholas Cheung
Ranch Hand
Posts: 4982
Thanks a lot, Max.
Your example is exactly what I wanted to know.
In fact, I feel Java RE is quite similar to Perl's RE syntax.
Is that really the approach Sun took?
BTW, as Vasu said, your SCJD book is really great!
I used your book as the basis of my SCJD assignment.
However, I bought it at 70% off in a bookstore
that was holding a moving clearance sale.
I hope you don't mind.
In fact, as with Kathy and Bert, I hope you guys can write
more books on the cert. areas, so that we can prepare for
our exams more easily.
Nick
[ April 16, 2004: Message edited by: Max Habibi ]
 
Max Habibi
town drunk
( and author)
Posts: 4118

Originally posted by Nicholas Cheung:
Thanks a lot, Max.
Your example is exactly what I wanted to know.
In fact, I feel Java RE is quite similar to Perl's RE syntax.
Is that really the approach Sun took?

That's the way it seems to me too

BTW, as Vasu said, your SCJD book is really great!
I used your book as the basis of my SCJD assignment.
However, I bought it at 70% off in a bookstore
that was holding a moving clearance sale.
I hope you don't mind.

Are you kidding? I love the fact that people are finding the book useful.

In fact, as with Kathy and Bert, I hope you guys can write
more books on the cert. areas, so that we can prepare for
our exams more easily.
Nick

Funny you should say that: stay tuned
M

 