
sed regular expression

 
Pat Denton
Greenhorn
Posts: 17
I have a value aaa.c11234.001.xml and I need to use the sed command to pull out the c11234 and 001 as individual variables. So far, I am getting the 001 with the following command in my script:

var1=`echo aaa.c11234.001.xml | sed -e 's/^.*\.\(.*\)\..*$/\1/'`

But I am having issues with getting the c11234 part. I can't seem to nail down the regular expression to do so. Any help would be greatly appreciated.
 
Ernest Friedman-Hill
author and iconoclast
Marshal
Posts: 24211
Well, one option (deliberately similar to yours)

var2=`echo aaa.c11234.001.xml | sed -e 's/[^.]*\\.\([^.]*\)\\..*$/\1/'`

The regexp is (with group 1 parenthesized) "Any number of not-a-dot characters, followed by a dot, (followed by any number of not-a-dot characters), followed by a dot, followed by anything."
 
Pat Denton
Greenhorn
Posts: 17
Thanks Ernest. That did the trick, and it helped me with some others I had to create as well. The best part was the worded explanation of the regex. That made the difference in my understanding it.
[ November 09, 2005: Message edited by: Pat Denton ]
 
Harald Kirsch
Ranch Hand
Posts: 37
Why use sed?

Assuming you use (ba)sh, try this:
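
A minimal sketch of this approach using POSIX parameter expansion. The variable names `f` and `rest` are assumptions for illustration, not from the original post:

```shell
f=aaa.c11234.001.xml

# '#' strips the shortest matching prefix, '%%' the longest matching suffix.
rest=${f#*.}        # drop "aaa."        -> c11234.001.xml
var2=${rest%%.*}    # drop ".001.xml"    -> c11234
rest=${rest#*.}     # drop "c11234."     -> 001.xml
var1=${rest%%.*}    # drop ".xml"        -> 001

echo "$var2 $var1"
```

No external process is started, which is the main attraction over piping through sed.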


For the details of this percent and hash business see:
http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_06_02
[ November 09, 2005: Message edited by: Harald Kirsch ]
 
Stefan Wagner
Ranch Hand
Posts: 1923
Regular expressions are often write-only code.
You can write them, but just try to read them!

One thing that helps a little is that you can combine multiple commands with semicolons: "sed 's1;s2'".
So you could snip away the text 'aaa.c' and '.xml' with two such commands:
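
A rough sketch of what that could look like; the variable name `trimmed` is an assumption, and the two patterns simply match the literal text mentioned above:

```shell
# Two substitutions in a single sed call, separated by a semicolon:
# the first removes the leading "aaa.c", the second the trailing ".xml".
trimmed=$(echo aaa.c11234.001.xml | sed 's/^aaa\.c//;s/\.xml$//')

echo "$trimmed"   # 11234.001
```

Each substitution stays short enough to read on its own, which is the point of splitting them.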

Replacing `foo` with $(foo) makes nesting easier. Backticks are discouraged in bash for that reason.
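
To illustrate the nesting point, a small sketch; the path is a made-up example and does not need to exist:

```shell
# Nested command substitution: the inner $(...) needs no escaping.
parent=$(basename "$(dirname /tmp/aaa/c11234.001.xml)")

# The backtick equivalent requires escaping the inner backticks:
#   parent=`basename \`dirname /tmp/aaa/c11234.001.xml\``

echo "$parent"   # aaa
```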
 
Ernest Friedman-Hill
author and iconoclast
Marshal
Posts: 24211
Feh. Regular expressions can be ugly, so comments certainly help. One thing in their favor: if the shell script ever gets turned into a Perl, Ruby, or Python program, or recoded in Java, the regexp can go along for the ride; the other techniques would require a total rewrite.
 