
criteriaFind with java.util.regex

 
Surya Gangadharan
Greenhorn
Posts: 9
Hi,
Has anyone implemented criteriaFind with regex? If so, any pointers on how the algorithm should work?
Thanks
 
lev grevnin
Ranch Hand
Posts: 35
I actually implemented criteriaFind with regex very successfully. I know you posted this message a couple of days ago. I tried answering it back then, but I had forgotten some details of my method, because I wrote it 2 months ago and since then I've been working on a GUI. Tonight I will try to document my criteriaFind and shed some light on the subject, so I'll post a reply either tonight or tomorrow night.
-lev
 
Surya Gangadharan
Greenhorn
Posts: 9
Thanks, I am waiting for your reply.
 
lev grevnin
Ranch Hand
Posts: 35
Ok, several things upfront:
** You WILL have to learn precisely how regex works in Java. Here is a good tutorial link:
the regex tutorial from Sun.
** Note that there are some subtle differences between regular expressions in different languages, so there is no universal standard.
** Pay particular attention to when you should use Greedy, Reluctant and Possessive quantifiers. It's critically important: it means the difference
between a working regular expression and a regular expression which looks like it SHOULD work, yet DOESN'T.
The most likely reason is how you use the quantifiers, so learn about them in detail.
** You will need a test harness to try out your regular expressions in an easy way. Luckily, the tutorial has one for you to download.
It's a good one.
Ok, here are some ideas about how I did it, which can help you get started once you learn enough about how regex works in Java:
(NOTE: if you don't understand something, don't get stuck, just keep reading. I try to reiterate some important points, so you'll
see me talk about them again.)
criteriaFind(String criteria) {
//PHASE NUMBER 1 (better to put it in a separate //function: String [] func1(String criteria)):
//separate the criteria into its constituent //elements with "any character" regex prepended //and postfixed to them;
//for example: "criteria=Origin //airport='ABC',Destination airport='DFG'" yeilds //an array of strings:
//.*?Origin airport='ABC'.*?
//.*?Destination airport='DFG'.*?
//To get this result, you need to come up with a //regex which matches a correct criteria input //string. If the criteria is wrong, your
//regex (if constructed properly) will NOT match //it. Note, you will need to use the group() //function after that
//in the Matcher class to obtain all the //key='value' pairs. As soon as you get a pair, //prepend it and postfix it with .*? - a "any //number
//of any characters" match (we will see why we //need it later). So eventually, you will be able //to get all of these .*?fieldname='value'.*?
//(or .*?key='value'.*? as i called it before) //and store it in an array of strings. Then //func1() returns all of them in a String[].
//Basically func1() is using regex (designed by //you) to construct a bunch or regex's (those //stored in a String[] returned by func1())
//we will use to do more matching later.
//ok, let's review func1() in brief one more time //(i understand, it's very tricky):
//
// 1) Create a regex which represents a correct //input for criteria.
// 2) Try to match your criteria against this //regex. If it doesn't match, return new //String[0];
// This is the most heavyweight regex in this //whole deal. Carefully think it through (hint: //think of the most general correct value
// for your criteria).
// 3) Ok, now the criteria is correct (no weird characters, or wrong format). So use another regex //(a much much smaller one) to pick out
// all the fieldname='value' pairs, prepend //and postfix each one of them with .*? and put it //in the next available String slot in your
// String[]. This procedure will use the //group() method in the Matcher class.
//
//do the actual record matching here
synchronized (this) { //to make sure noone //attempts to change the reccount by adding or //deleting records.
for (every record in the database) {
//PHASE NUMBER 2: (String func2(DataInfo rec))
/*puts rec into key1='value1'key2='value2'key3='value3' ..... format and returns it as one large String (let's call it str)*/
for (every one of those .*?key='value'.*? pairs we obtained above) {[
/*match it with str. If match is not found break out of this loop to go to the next record. */
/*If all the patterns matched a given record, that means all the key='value' pairs in the original criteria were present in this record, so grab this record and put it in some storage (a Vector, maybe) */
}
}
/*create DataInfo[] out of that storage (the one containing all the matched records) and return it. */
}
}
hope this helps
-lev
 
lev grevnin
Ranch Hand
Posts: 35
My comments ( //'s) are all over the place. It got misformatted, or something - i don't know what happened. Hope they won't impede you from reading my response.
 
Surya Gangadharan
Greenhorn
Posts: 9
thank you so much.
I have actually gone through the whole Java tutorial from Sun and got a bit of an idea about regex from that. I will now code with the help of your ideas.
thanks again
 
Poorna Lakki
Greenhorn
Posts: 11
Originally posted by Surya Gangadharan:
thank you so much.
I have actually gone through the whole Java tutorial from Sun and got a bit of an idea about regex from that. I will now code with the help of your ideas.
thanks again

Do we need to use regular expressions? I mean, the requirements specifically say that the match needs to be exact. Does using regex carry any weight in the exam? Any comments would be of great help.
Thanks,-Poorna Lakki
 
lev grevnin
Ranch Hand
Posts: 35
Poorna Lakki,
You can use whatever you want to do the matching in criteriaFind. I used regex because it automatically checks for many different cases and makes sure your input string is correct. The matching is done in fewer lines of code (although the regular expressions themselves can be quite cryptic).
-lev
 
Surya Gangadharan
Greenhorn
Posts: 9
Hi lev,
Is there any way you can remove the // and repost? It is very confusing trying to make sense of what you have written with the // between the words.
thanks
 
Poorna Lakki
Greenhorn
Posts: 11
Originally posted by lev grevnin:
Poorna Lakki,
You can use whatever you want to do the matching in criteriaFind. I used regex because it automatically checks for many different cases and makes sure your input string is correct. The matching is done in fewer lines of code (although the regular expressions themselves can be quite cryptic).
-lev

Lev,
If the input string is wrong or badly formatted, shouldn't we return null? For example, if the column names do not match, or the column values do not match "exactly" as in the table, I was thinking of returning null.
I just started my assignment, so this is the first iteration of criteriaFind(). I will probably go with regex by the time I complete the assignment.
Thanks,-Poorna Lakki
 
lev grevnin
Ranch Hand
Posts: 35
Ok, several things upfront:
** You WILL have to learn precisely how regex works in Java. Here is a good tutorial link:
the regex tutorial from Sun.
** Note that there are some subtle differences between regular expressions in different languages, so there is no universal standard.
** Pay particular attention to when you should use Greedy, Reluctant and Possessive quantifiers. It's critically important: it means the difference
between a working regular expression and a regular expression which looks like it SHOULD work, yet DOESN'T.
The most likely reason is how you use the quantifiers, so learn about them in detail.
** You will need a test harness to try out your regular expressions in an easy way. Luckily, the tutorial has one for you to download.
It's a good one.
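To make the quantifier point concrete, here is a small sketch of how the three kinds behave on a criteria-like string. The class name QuantifierDemo and the helper firstMatch are made up for this illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    /** Returns the first substring of input matched by regex, or null if there is none. */
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "Origin airport='ABC',Destination airport='DFG'";

        // Greedy: .* grabs as much as it can, then backs off to the LAST quote.
        System.out.println(firstMatch("'.*'", input));  // 'ABC',Destination airport='DFG'

        // Reluctant: .*? grabs as little as it can and stops at the NEXT quote.
        System.out.println(firstMatch("'.*?'", input)); // 'ABC'

        // Possessive: .*+ grabs everything and never gives anything back,
        // so the closing quote can never be matched.
        System.out.println(firstMatch("'.*+'", input)); // null
    }
}
```

The greedy version is the classic "looks like it should work" trap: it swallows both values as one.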
Ok, here are some ideas about how I did it, which can help you get started once you learn enough about how regex works in Java:
(NOTE: if you don't understand something, don't get stuck, just keep reading. I try to reiterate some important points, so you'll
see me talk about them again.)
criteriaFind(String criteria) {
/* PHASE NUMBER 1 (better to put it in a separate function: String[] func1(String criteria)):
Separate the criteria into its constituent elements, with an "any character" regex prepended and appended to each of them. For example, criteria = "Origin airport='ABC',Destination airport='DFG'" yields an array of strings:
.*?Origin airport='ABC'.*?
.*?Destination airport='DFG'.*?
To get this result, you need to come up with a regex which matches a correct criteria input string. If the criteria is wrong, your
regex (if constructed properly) will NOT match it. Note that after that you will need to use the group() method of the Matcher class to obtain all the key='value' pairs. As soon as you get a pair, prepend and append .*? to it, an "any number of any characters" match (we will see why we need it later). Eventually you will get all of these .*?fieldname='value'.*? strings (or .*?key='value'.*? as I called them before) and store them in an array of strings. Then func1() returns all of them in a String[].
Basically, func1() uses a regex (designed by you) to construct a bunch of regexes (those stored in the String[] returned by func1()), which we will use to do more matching later.
Ok, let's review func1() in brief one more time (I understand, it's very tricky):
1) Create a regex which represents a correct input for criteria.
2) Try to match your criteria against this regex. If it doesn't match, return new String[0];
This is the most heavyweight regex in this whole deal. Carefully think it through (hint: think of the most general correct value for your criteria).
3) Ok, now the criteria is correct (no weird characters or wrong format). So use another regex (a much, much smaller one) to pick out all the fieldname='value' pairs, prepend and append .*? to each one of them, and put it in the next available String slot in your String[]. This procedure will use the group() method of the Matcher class.
*/
//do the actual record matching here
synchronized (this) {
//needs to be synchronized to make sure no one attempts to change the reccount by adding or deleting records.
for (every record in the database) {
/*PHASE NUMBER 2: (String func2(DataInfo rec)) puts rec into key1='value1'key2='value2'key3='value3' ..... format and returns it as one large String (let's call it str)*/
for (every one of those .*?key='value'.*? pairs we obtained above) {
/*match it with str. If a match is not found, break out of this loop to go to the next record. If all the patterns matched a given record, that means all the key='value' pairs in the original criteria were present in this record, so grab this record and put it in some storage (a Vector, maybe)*/
}
}
/*create DataInfo[] out of that storage (the one containing all the matched records) and return it.*/
}
}
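For anyone who wants to see the two phases as compilable code, here is a rough sketch of the outline above. It is not lev's actual implementation: the class name CriteriaFinder, the two regexes, and the use of a Map in place of DataInfo are all assumptions made for illustration, and it uses generics and Pattern.quote, which are newer than J2SE 1.4.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CriteriaFinder {
    // Assumed format of one pair: the name has no quote, '=' or comma; the value has no quote.
    private static final String PAIR = "[^'=,]+='[^']*'";
    // Steps 1 and 2: the "heavyweight" regex, one or more pairs separated by commas.
    private static final Pattern VALID = Pattern.compile(PAIR + "(," + PAIR + ")*");
    // Step 3: the much smaller regex that picks out one pair at a time.
    private static final Pattern ONE_PAIR = Pattern.compile(PAIR);

    /** Phase 1 (func1): one .*?name='value'.*? pattern per pair, or an empty list if malformed. */
    static List<Pattern> toPatterns(String criteria) {
        List<Pattern> patterns = new ArrayList<>();
        if (!VALID.matcher(criteria).matches()) {
            return patterns;
        }
        Matcher m = ONE_PAIR.matcher(criteria);
        while (m.find()) {
            // Quote the pair so that characters such as '.' in a value are taken literally.
            patterns.add(Pattern.compile(".*?" + Pattern.quote(m.group()) + ".*?"));
        }
        return patterns;
    }

    /** Phase 2 (func2): flatten a record into name1='value1'name2='value2'... */
    static String flatten(Map<String, String> record) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : record.entrySet()) {
            sb.append(e.getKey()).append("='").append(e.getValue()).append('\'');
        }
        return sb.toString();
    }

    /** Returns every record that matches all patterns; an empty list means "no match". */
    static List<Map<String, String>> criteriaFind(String criteria, List<Map<String, String>> records) {
        List<Map<String, String>> found = new ArrayList<>();
        List<Pattern> patterns = toPatterns(criteria);
        if (patterns.isEmpty()) {
            return found;
        }
        synchronized (records) { // stands in for synchronized (this) in the outline
            for (Map<String, String> record : records) {
                String str = flatten(record);
                boolean allMatched = true;
                for (Pattern p : patterns) {
                    if (!p.matcher(str).matches()) {
                        allMatched = false;
                        break; // go to the next record
                    }
                }
                if (allMatched) {
                    found.add(record);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Map<String, String> r1 = new LinkedHashMap<>();
        r1.put("Origin airport", "ABC");
        r1.put("Destination airport", "DFG");
        Map<String, String> r2 = new LinkedHashMap<>();
        r2.put("Origin airport", "ABC");
        r2.put("Destination airport", "JFK");
        List<Map<String, String>> records = new ArrayList<>();
        records.add(r1);
        records.add(r2);
        System.out.println(
            criteriaFind("Origin airport='ABC',Destination airport='DFG'", records).size()); // 1
    }
}
```

One caveat with the plain .*? prefix: a criteria name that happens to be the tail end of a longer field name would also match, so a stricter version would anchor each pattern on the start of the string or the preceding quote.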
 
lev grevnin
Ranch Hand
Posts: 35
Poorna Lakki,
You can return whatever you want for a "non-match". I arranged for a return of an empty DataInfo[], whose length (length = 0) indicates that there were no records to match the input string.
-lev
 
Mario Zott
Greenhorn
Posts: 21
Do you think the judges will punish me for something simple like this piece of code?
 
lev grevnin
Ranch Hand
Posts: 35
hmm, looks pretty simple but i have no idea what you're trying to do here. Can you please add some clarifying comments?
-l
 
Max Habibi
town drunk
( and author)
Sheriff
Posts: 4118
Just to add my two cents here. Another option I've seen people take on this is to actually override the toString method of the record object itself, so that it provides a regex-friendly representation of the record. Then the only thing left to do is parse the user's input.
M, author
The Sun Certified Java Developer Exam with J2SE 1.4
 
Mario Zott
Greenhorn
Posts: 21
clarifying comments to:


sample criteria:
carrier='some carrier',origin='JFK',destination='JFO',another field='another value',

splitting the criteria with "'," results in:
carrier='some carrier
origin='JFK
destination='JFO
another field='another value
splitting these results with "='" results in:
carrier
some carrier
origin
JFK
destination
JFO
another field
another value
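The two splitting steps above can be reproduced with String.split. This is only a sketch; the class name SplitDemo and the method splitCriteria are invented for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

class SplitDemo {
    /** Splits the criteria on "'," and then splits each piece on "='". */
    static List<String> splitCriteria(String criteria) {
        List<String> parts = new ArrayList<>();
        // String.split drops trailing empty strings, so the final "'," adds nothing.
        for (String pair : criteria.split("',")) {
            for (String piece : pair.split("='")) {
                parts.add(piece);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        String criteria =
            "carrier='some carrier',origin='JFK',destination='JFO',another field='another value',";
        // Prints the eight names and values listed above, one per line.
        for (String part : splitCriteria(criteria)) {
            System.out.println(part);
        }
    }
}
```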
Looks good, but if either the field name or the field content contains one of the split delimiters, this solution won't work. But what solution would handle it?
I haven't found one, and I don't like a solution that isn't 100%.
I think the only way to allow special chars like ' or = as part of a field name or field content is to use an escape character, which I use in my latest version of the parser (implemented with StreamTokenizer).
Now my query string has to look like this if special chars are used:
carrier='Austrian\'s Airline',description='the coolest\, funniest and fastest airline'
I would be happy to hear some comments....
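As an illustration of the escape-character idea, here is a minimal hand-written parser. It is not the StreamTokenizer version mentioned above, just a plain character loop; the class name EscapedCriteriaParser is invented, and it assumes that escapes only occur inside the quoted values, not in field names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class EscapedCriteriaParser {
    /**
     * Parses name='value' pairs separated by commas. Inside a quoted value a
     * backslash makes the next character literal, so \' and \, do not end the value.
     * Throws IllegalArgumentException if the input is malformed.
     */
    static Map<String, String> parse(String criteria) {
        Map<String, String> result = new LinkedHashMap<>();
        int n = criteria.length();
        int i = 0;
        while (i < n) {
            int eq = criteria.indexOf("='", i);
            if (eq < 0) {
                throw new IllegalArgumentException("expected name='value' at index " + i);
            }
            String name = criteria.substring(i, eq);
            StringBuilder value = new StringBuilder();
            int j = eq + 2; // first character after the opening quote
            boolean closed = false;
            while (j < n && !closed) {
                char c = criteria.charAt(j);
                if (c == '\\' && j + 1 < n) {
                    value.append(criteria.charAt(j + 1)); // keep the escaped character as-is
                    j += 2;
                } else if (c == '\'') {
                    closed = true; // unescaped quote ends the value
                    j++;
                } else {
                    value.append(c);
                    j++;
                }
            }
            if (!closed) {
                throw new IllegalArgumentException("unterminated value for " + name);
            }
            result.put(name, value.toString());
            if (j < n) {
                if (criteria.charAt(j) != ',') {
                    throw new IllegalArgumentException("expected ',' at index " + j);
                }
                j++; // skip the separator; a trailing comma is tolerated
            }
            i = j;
        }
        return result;
    }

    public static void main(String[] args) {
        String q = "carrier='Austrian\\'s Airline',"
                 + "description='the coolest\\, funniest and fastest airline'";
        System.out.println(parse(q).get("carrier")); // Austrian's Airline
    }
}
```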
 
Surya Gangadharan
Greenhorn
Posts: 9
Hi Max,
It is a good idea to change the toString method to display regex friendly string. Can you elaborate on that? Do you mean spitting out a string just like "'Carrier=<value>','Origin=<value>'"?
Please let me know.
Thanks
Surya
 
Max Habibi
town drunk
( and author)
Sheriff
Posts: 4118
Well, I don't want to give too much away, but you could have it provide a representation of a given record with no \' or \" characters, maybe even have them in a group. Further, you could have it delimit a given field from another field by using the '|' operator. Just some food for thought.
All best,
M, author
The Sun Certified Java Developer Exam with J2SE 1.4
[ March 08, 2003: Message edited by: Max Habibi ]
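One possible reading of this hint, and it is only a guess at what was meant, is that toString returns a regex group of name=value alternatives separated by |, so the record itself becomes the pattern and each parsed criterion is matched against it. The class FlightRecord, its methods with and matchesAll, and the carrier name in the usage line are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

class FlightRecord {
    private final Map<String, String> fields = new LinkedHashMap<>();

    /** Adds a field and returns this record, so calls can be chained. */
    FlightRecord with(String name, String value) {
        fields.put(name, value);
        return this;
    }

    /** Returns a regex group with one quoted name=value alternative per field. */
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("(");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (sb.length() > 1) {
                sb.append('|');
            }
            // Pattern.quote keeps special characters in names or values literal.
            sb.append(Pattern.quote(e.getKey() + "=" + e.getValue()));
        }
        return sb.append(')').toString();
    }

    /** True if every name=value criterion equals one of this record's alternatives. */
    boolean matchesAll(String... criteria) {
        Pattern p = Pattern.compile(toString());
        for (String criterion : criteria) {
            if (!p.matcher(criterion).matches()) {
                return false;
            }
        }
        return true;
    }
}
```

With this shape, new FlightRecord().with("carrier", "SpeedyAir").with("origin", "JFK").matchesAll("origin=JFK") returns true, and the only work left is turning the user's input into name=value strings.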
 
Jane Weil
Greenhorn
Posts: 9
Lev,
After I read the regex tutorial, I implemented criteriaFind with regex. The regex I came up with is the following:
</snip>
Because the values are strings, I used .* Is that right?
[ March 19, 2003: Message edited by: Max Habibi ]
 
Peter den Haan
author
Ranch Hand
Posts: 3252
Jane, note that the Data class you got is completely generic: there is no FBN-specific code in there whatsoever. Don't you think this strongly suggests that a fully generic criteriaFind() implementation is in order?
- Peter
 
Jane Weil
Greenhorn
Posts: 9
Peter,
I actually get those fields from FieldInfo. I just gave the example in the post.
Thanks a lot for the kind reminder.
 
Max Habibi
town drunk
( and author)
Sheriff
Posts: 4118
Jane, per my email to you, please don't give out specific answers on this forum. I've modified your post accordingly.
M
 