
RegularExpression Pattern

 
Ricky Murphy
Ranch Hand
Posts: 31
Hello All:

I am looking for a quick way to get the book names out of this string as an array or List. I was thinking about using regular expressions (the Pattern API), but I was unable to get it to work. Could anyone please point me in the right direction?

The incoming string looks like:

#BookName01, author01, year01#BookName02, author02, year02#BookName3, author03,year03...

As you can see, each section is separated by #, and inside each section the items are separated by commas (,).

What I want to get is an array or list of book names (BookName01, BookName02, BookName03...), not a comma-delimited string.

Thank you very much!!!

-Rick
 
Stevi Deter
Ranch Hand
Posts: 265
Hibernate Java Spring
Rick,

My first question is why you want to use a regular expression instead of String.split(), which for this example would lead to easy-to-read code.

I would recommend the Java Tutorial on Regular Expressions, which will give you everything you need to figure out how to capture a group from your regular expression and do whatever you want with it.
 
Campbell Ritchie
Marshal
Posts: 56581
172
Stevi Deter is correct to refer you to the regular expressions tutorial. But String.split does take a regular expression as a parameter; you probably want to split twice, once on # then later on ", ". Go through the tutorial quoted and see whether you need to escape either of those expressions.
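
To illustrate the escaping question: by default neither # nor a comma has a special meaning in a Java regex, so both can be passed to split() unchanged. The sketch below contrasts that with a delimiter that does need care. The class name SplitEscaping and the pipe-delimited sample string are made up for illustration; Pattern.quote is the standard library call for treating a delimiter as literal text.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitEscaping {
    // Splits on a delimiter that is always treated as literal text,
    // never as regex syntax.
    static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // '#' is not a regex metacharacter by default, so no escaping is needed.
        System.out.println(Arrays.toString("a#b#c".split("#")));

        // '|' is the regex alternation operator, so it is quoted first.
        System.out.println(Arrays.toString(splitLiteral("a|b|c", "|")));
    }
}
```

Quoting is harmless for delimiters that need no escaping, so it can be applied whenever the delimiter is not known in advance.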
 
Stevi Deter
Ranch Hand
Posts: 265
Hibernate Java Spring
Campbell,

Just to follow up, you're absolutely correct, String.split(String regex) is taking a regex, and I should have stated that since I knew it!

Just to explain my train of thought more clearly -- in a simple case like this, using String.split() on the "#", then splitting the resulting Strings on the "," will likely result in more readable code than using Pattern.compile(regex) and a Matcher and then looping through the groups.
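
A minimal sketch of the two-step split described above, assuming the input format from the first post. The class and method names (BookNamesBySplit, bookNames) are hypothetical, and this is only one of several reasonable ways to write it.

```java
import java.util.ArrayList;
import java.util.List;

class BookNamesBySplit {
    // Returns the first comma-separated field of every '#'-delimited section.
    static List<String> bookNames(String input) {
        List<String> names = new ArrayList<String>();
        for (String section : input.split("#")) {
            if (section.trim().isEmpty()) {
                continue; // the leading '#' produces an empty first element
            }
            // Limit 2: only the text before the first comma is of interest.
            names.add(section.split(",", 2)[0].trim());
        }
        return names;
    }

    public static void main(String[] args) {
        String input = "#BookName01, author01, year01#BookName02, author02, year02";
        System.out.println(bookNames(input)); // prints [BookName01, BookName02]
    }
}
```

No intermediate arrays need to be kept around: each section is split, its first field is stored, and the rest is discarded.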

Then again, it's always good to learn to use regular expressions effectively, if that's the real goal here.
 
Ricky Murphy
Ranch Hand
Posts: 31
Hi there. Thank you for the replies. Indeed, I first looked at String.split and did what you mentioned. The problem is that after the first split on #, I got 3 long strings (three sections in my example), each containing 3 items separated by ",". I then tried to process them one by one by creating three arrays, etc. You see what I am saying? I thought that was too tedious, and that it might be easier to handle this with a regex and Pattern. Is there a better way to do this quickly?

I then went on to use the regex approach (String.split uses regex internally, I believe). However, before my post I wasn't able to figure out a quick way (i.e. one step) to pick the string between # and the first ",". What I ended up doing was using a regex with Pattern/Matcher to break my long string up at each word that follows a #, something like the following to get my book list...

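String regexp = "\\b#\\w*"; // a word that begins with a #
// Note: \b only matches here when a word character directly precedes the #,
// so a # at the very start of the input is not matched.
Pattern p = Pattern.compile(regexp);
Matcher m = p.matcher(existingOrg);
List<String> existingList = new ArrayList<String>();

while (m.find())
{
    // "/" should be "#" to match the example input (see the follow-up posts)
    String s = m.group().replaceAll("/", "");
    existingList.add(s);
    System.out.println("the book is: " + s);
}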

Better suggestions?

Thank you.

-Rick
 
Henry Wong
author
Sheriff
Posts: 23295
125
C++ Chrome Eclipse IDE Firefox Browser Java jQuery Linux VI Editor Windows
However, before my post I wasn't able to figure out a quick way (i.e. one step) to pick the string between # and the first ",".


Here you go...



Henry
 
Campbell Ritchie
Marshal
Posts: 56581
172
A bit late, BUT:

Your first split() on # produces a String[] array. You could then create a String[][] array: split the first member of your String[] on "," to get a String[] array, which becomes the first member of the String[][] array, and so on for the remaining members.
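
A sketch of that String[][] idea, assuming the input format from the first post. BookTable and parse are hypothetical names, and stripping the leading # before splitting is a choice made here for convenience, not something prescribed in the thread.

```java
class BookTable {
    // Splits the input into one String[] per book: {name, author, year}.
    static String[][] parse(String input) {
        // Drop the leading '#' so split() does not yield an empty first section.
        String body = input.startsWith("#") ? input.substring(1) : input;
        String[] sections = body.split("#");
        String[][] table = new String[sections.length][];
        for (int i = 0; i < sections.length; i++) {
            // "\\s*,\\s*" swallows optional whitespace around each comma.
            table[i] = sections[i].trim().split("\\s*,\\s*");
        }
        return table;
    }

    public static void main(String[] args) {
        String[][] table = parse("#BookName01, author01, year01#BookName02, author02, year02");
        for (String[] row : table) {
            System.out.println(row[0]); // the book name is the first field
        }
    }
}
```

This keeps the author and year available as row[1] and row[2], which is useful only if they are needed later; for book names alone it does more work than necessary.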
 
Ricky Murphy
Ranch Hand
Posts: 31
Originally posted by Henry Wong:


Here you go...



Henry


Eh... I think this code is essentially the same as mine, but with a different regexp and the while loop condensed into one line. However, there is one issue: this regexp will yield both "/" and "," in the string. I put the code into my program and verified that. Therefore, you may need replace(), as I did, to remove both "/" and ",". Is there any way to show only the content in between the "/" and "," (excluding "/" and ",")?

Thank you.

-Rick
 
Ricky Murphy
Ranch Hand
Posts: 31
Originally posted by Campbell Ritchie:
A bit late, BUT:

Your first split() on # produces a String[] array. You could then create a String[][] array: split the first member of your String[] on "," to get a String[] array, which becomes the first member of the String[][] array, and so on for the remaining members.


I see, Campbell. You wanted to use a multi-dimensional array, which is a different approach. I will be sure to give it a try as well to compare the performance. Thank you!

-Rick
 
Henry Wong
author
Sheriff
Posts: 23295
125
C++ Chrome Eclipse IDE Firefox Browser Java jQuery Linux VI Editor Windows
Is there any way to show only the content in between the "/" and "," (excluding "/" and ",")?


First of all, why "/"? Your example uses "#".

Second, take a look at the code again -- mine doesn't need to replace anything. It uses groups to extract the content in-between. (notice the group(1) call)
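
For readers without the snippet in front of them, here is a minimal sketch of a group-based extraction. It is an illustration under assumptions, not necessarily the code posted earlier: the pattern #\s*([^,#]+) and the names BookNamesByGroup, BOOK and bookNames are chosen here for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BookNamesByGroup {
    // '#', optional spaces, then group 1 = everything up to the next ',' or '#'.
    private static final Pattern BOOK = Pattern.compile("#\\s*([^,#]+)");

    static List<String> bookNames(String input) {
        List<String> names = new ArrayList<String>();
        Matcher m = BOOK.matcher(input);
        while (m.find()) {
            // group() would include the '#'; group(1) is only the captured part.
            names.add(m.group(1).trim());
        }
        return names;
    }

    public static void main(String[] args) {
        String input = "#BookName01, author01, year01#BookName02, author02, year02";
        System.out.println(bookNames(input)); // prints [BookName01, BookName02]
    }
}
```

Because the delimiters sit outside the parentheses, they are matched but never captured, so no replace() call is needed afterwards.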

Henry
[ March 26, 2008: Message edited by: Henry Wong ]
 
Campbell Ritchie
Marshal
Posts: 56581
172
Originally posted by Ricky Murphy:
multi-dimensional array
-Rick
[Pedantic mode]There is no such thing as a multi-dimensional array in Java. What I suggested was an array of arrays.[/Pedantic mode]
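
A small sketch of what the distinction means in practice: each row is a separate array object, so rows can differ in length or even be shared. The class name ArrayOfArrays and the sizes used are arbitrary choices for the example.

```java
class ArrayOfArrays {
    // Builds an int[][] whose rows are independent array objects.
    static int[][] jagged() {
        int[][] rows = new int[3][]; // three row references, all null so far
        rows[0] = new int[1];
        rows[1] = new int[4];
        rows[2] = rows[1];           // two slots can refer to the same row
        return rows;
    }

    public static void main(String[] args) {
        int[][] rows = jagged();
        for (int[] row : rows) {
            System.out.println(row.length);
        }
    }
}
```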
 
Ricky Murphy
Ranch Hand
Posts: 31
Originally posted by Campbell Ritchie:
[Pedantic mode]There is no such thing as a multi-dimensional array in Java. What I suggested was an array of arrays.[/Pedantic mode]


Yes, Campbell, you are right. There is no "multi-dimensional" array in Java; "array of arrays" is the right term.
 
Ricky Murphy
Ranch Hand
Posts: 31
Originally posted by Henry Wong:


First of all, why "/"? Your example uses "#".

Second, take a look at the code again -- mine doesn't need to replace anything. It uses groups to extract the content in-between. (notice the group(1) call)

Thanks, Henry. First of all, it should be #, as I stated in my first post. What happened was that after the weekend, we started to deal with "/" in a similar case. When I posted my code on Tuesday, I changed the regex portion to comply with my original post, but I forgot to change the "/" sign in the while loop.

Second, yes, you are right: group(1) did it. This is my first time using regex/Pattern/Matcher, and my original understanding of the group() method was off. I did some further reading today and now understand how to use a number with group(). Learned a lot. Thank you.


 