
find method, regex and predef characters

 
rinke hoekstra
Ranch Hand
Posts: 152
Hi all,

I have been thinking about the find method of UrlyBird 1.1.1, and I wonder what choices others have made.

First of all, the interface comment on the find method talks about matching records where a given field starts with the given criterion. The GUI instructions, however, talk about returning exact matches. So my idea was to use regular expressions to meet the specification of the find method, and then filter the results in the GUI so that only the exact matches remain. I would make that filter case insensitive, to be a bit more user friendly.
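The GUI-side filtering step could look roughly like the sketch below. It assumes records come back from find as String arrays and that field values may carry padding that should be trimmed before comparing; the class name, method name and sample data are hypothetical, not taken from the assignment.

```java
import java.util.ArrayList;
import java.util.List;

final class ExactMatchFilter {
    // Keeps only the records whose field at fieldIndex equals the wanted
    // value, ignoring case. Trimming is an assumption about padded fields.
    static List<String[]> keepExactMatches(List<String[]> found, int fieldIndex, String wanted) {
        List<String[]> exact = new ArrayList<>();
        for (String[] record : found) {
            if (record[fieldIndex].trim().equalsIgnoreCase(wanted.trim())) {
                exact.add(record);
            }
        }
        return exact;
    }

    public static void main(String[] args) {
        List<String[]> found = new ArrayList<>();
        found.add(new String[] {"Palace", "Smallville"});       // hypothetical record
        found.add(new String[] {"Palace Hotel", "Smallville"}); // prefix match only
        System.out.println(keepExactMatches(found, 0, "palace").size()); // prints 1
    }
}
```

The "starts with" search returns both records for the criterion "Palace"; the filter then drops "Palace Hotel".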

While working this out, however, I ran into problems with the price field. In all records the price field starts with a dollar sign, and $ is a predefined character in regular expressions: it is an anchor for the end of the line. As a result, price fields are never matched. If the criterion starts with a $, the $ is read as the anchor rather than as a literal dollar sign, and if the criterion does not start with a $, it cannot match a value that does.
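A minimal sketch of the problem, assuming the criterion is compiled as-is and tested with Matcher.lookingAt() (the thread does not say how the matching was actually written). The class name, method name and sample price are hypothetical.

```java
import java.util.regex.Pattern;

final class DollarAnchorDemo {
    // True when the regex matches at the start of the field value.
    static boolean startsWithRegex(String criteria, String fieldValue) {
        return Pattern.compile(criteria).matcher(fieldValue).lookingAt();
    }

    public static void main(String[] args) {
        String price = "$150.00"; // hypothetical price field value
        System.out.println(startsWithRegex("$150", price));   // false: $ is the end anchor
        System.out.println(startsWithRegex("150", price));    // false: the value starts with $
        System.out.println(startsWithRegex("\\$150", price)); // true: escaped $ is a literal
    }
}
```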

So we cannot just allow that $ in the regular expression string for the criteria; we should at least escape it, or do something else. But then I realized that searching this field doesn't make sense anyway, at least not according to the specs of this method, which treat the contents as strings and match on what they start with. Nobody would want to search this field by typing a $ sign first. What users would really want is to find all records with a price up to a given maximum. But that goes beyond the scope of the assignment, as it is not asked for, and I think one of the basic principles of this assignment is: don't do fancy things which weren't asked for.

So, what is the best way? I am leaning towards ignoring the price field and excluding it from the criteria search (that is, letting any content in the price field match the criteria by definition). The GUI asks for only two fields to be searchable anyway (though the comments on find suggest that all fields are searchable).
But then, how far to take this? Ignore the size field as well? And the date field too? I wonder what others decided on this.

And then: $ is not the only predefined regex character. What if the user passes any of the others to the find method, like ^ or \? Just ignore the problem? Or allow it and let users take advantage of the feature if they are smart enough? Or filter those characters out? Or throw an IllegalArgumentException if the criteria contain any of them?
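One further option, not listed above, is to keep regex matching but treat the whole criterion as literal text with Pattern.quote, so that no character has a special meaning. The sketch below also shows that some raw criteria are not even valid patterns. Class and method names are hypothetical.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class LiteralCriteria {
    // Quotes the criterion so that $, ^, \ and the like are matched literally.
    static boolean startsWithLiteral(String criteria, String fieldValue) {
        return Pattern.compile(Pattern.quote(criteria)).matcher(fieldValue).lookingAt();
    }

    // True when the raw, unquoted criterion cannot be compiled as a regex.
    static boolean isInvalidRegex(String criteria) {
        try {
            Pattern.compile(criteria);
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(startsWithLiteral("$150", "$150.00")); // true
        System.out.println(startsWithLiteral(".", "abc"));        // false: . is literal here
        System.out.println(isInvalidRegex("["));                  // true: unclosed character class
    }
}
```

Note that PatternSyntaxException is a subclass of IllegalArgumentException, so passing unquoted user input to Pattern.compile can already throw one for malformed input.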

Or just avoid the problem altogether by not using regex and using something like String.startsWith instead?
 
Lucy Hummel
Ranch Hand
Posts: 232
Hi rinke,

I would not add a special search for one attribute, as you suggested for the price attribute. I think each attribute has to be handled the same way.

What will you do if the currency symbol changes? You would have to change the software, and that does not sound good to me.

I would not support any of the search symbols you listed. Just keep it simple and document that these search symbols are not supported.

This kind of discussion can already be found in our forum, though. Just use the search feature.
 
rinke hoekstra
Ranch Hand
Posts: 152
Hi Hummel,

Lucy Hummel wrote: "I would not make a special search on an attribute as you suggested for the price attribute. I think each attribute has to be handled the same way."


I have to disagree here. My point is that if you don't treat this field in a special way, it will not be searchable. Every price field in every record currently starts with a $, and the find method should return "starts with" matches, so one would have to type a "$" in the criteria to find anything in the price field. But that does not work, because "$" means something else in a regular expression. So unless you treat this field specially (at the very least by escaping the $ sign), the functionality is broken for the price field.


Lucy Hummel wrote: "What will you do if the currency symbol changes? You have to change the software, I think that sounds not that good."


I think you misunderstood me here. I would not have to change the software, because $ is the only currency symbol which is a predefined regular expression character. Symbols such as € or £ are ordinary characters in a regex, so if the currency symbol changes, nothing breaks.


Lucy Hummel wrote: "I would not support any search symbols as you listed. Just keep it simple and add that this search symbols are a development issue."


OK, we agree here, except probably for $.
 
Lucy Hummel
Ranch Hand
Posts: 232
Hi rinke,

Sorry, can you explain to me why you want to use regex?

As far as I understand, you are looking for values that start with a certain prefix.

If that is the case, I would use java.lang.String#startsWith().

Below is a JUnit test case illustrating what I suggest.
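The test case itself is not preserved in this copy of the thread. As a rough stand-in, here is a plain Java sketch (not JUnit) of prefix matching with String.startsWith. It assumes a record and its criteria are parallel String arrays and that a null criterion matches any value; both conventions, and all names, are assumptions for illustration.

```java
final class PrefixMatcher {
    // True when every non-null criterion is a prefix of the corresponding field.
    // Assumes criteria.length <= record.length. The comparison is case sensitive.
    static boolean matches(String[] record, String[] criteria) {
        for (int i = 0; i < criteria.length; i++) {
            if (criteria[i] != null && !record[i].startsWith(criteria[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] record = {"Palace", "$150.00"}; // hypothetical name and price fields
        System.out.println(matches(record, new String[] {null, "$1"}));  // true: $ is just a character
        System.out.println(matches(record, new String[] {"Pal", null})); // true
        System.out.println(matches(record, new String[] {null, "150"})); // false: value starts with $
    }
}
```

With startsWith there are no special characters at all, so the $ in the price field needs no escaping.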
 
rinke hoekstra
Ranch Hand
Posts: 152
Hi Hummel,

You are perfectly right: I was overcomplicating things. The functionality of the find method is lousy for numeric values and prices, but nothing better is asked for, so nothing better should be implemented.

And there is no need for regular expressions in this method. I have rewritten it so that it no longer uses them.

For anyone who did use regular expressions in find:

Be aware that this leaves your price field unsearchable (because of the dollar sign, a criterion will never match any price value) unless you take special action, for example escaping the $.

Thanks, Rinke
 