
split a string and differentiate the elements in string array

 
Krishna Chaitanya Reddy Balam
Greenhorn
Posts: 22
Hi all,
I am trying to split a string based on the HTML tags inside it. For example, given
String s = "I am <b> bold </b>";
I want to convert it to a String array, and I must be able to tell which elements of the array were inside the tags.
Is there any way to do this?
 
Campbell Ritchie
Marshal
Posts: 79239
Depends what you want to split on. There is a split() method in the String class which does what you want, but it takes a regular expression as its parameter.

If you are not familiar with regular expressions, try here to start you off.
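
As a rough illustration of what split() does with a regular expression, here is a minimal sketch. The class name SplitOnTagsDemo and the pattern are my own assumptions, not anything from the String API beyond split() itself. Note that the result keeps the text but loses the information about which piece was inside the tags.

```java
import java.util.Arrays;

class SplitOnTagsDemo {
    // Splits on anything that looks like an opening or closing tag (e.g. <b> or </b>),
    // swallowing the whitespace around the tag as part of the delimiter.
    static String[] splitOnTags(String input) {
        return input.split("\\s*</?[a-zA-Z][^>]*>\\s*");
    }

    public static void main(String[] args) {
        // Trailing empty strings are dropped by split(), so only the text pieces remain.
        System.out.println(Arrays.toString(splitOnTags("I am <b> bold </b>")));
    }
}
```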
 
Krishna Chaitanya Reddy Balam
Greenhorn
Posts: 22
But will I be able to tell which string in the array was between the bold tags?
 
Campbell Ritchie
Marshal
Posts: 79239
You can probably design a regular expression which will match <b> and </b> tags, so you should be able to do that, yes.
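
One possible way to do that, sketched under my own assumptions (the class BoldMarker and the "[b]" prefix are made up for illustration): match only the <b> and </b> tags with a Matcher, and remember whether the last tag seen was an opening one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoldMarker {
    // Matches only the tags themselves, not the text between them.
    private static final Pattern BOLD_TAG = Pattern.compile("</?b>", Pattern.CASE_INSENSITIVE);

    // Returns the text pieces in order; pieces found between <b> and </b> get a "[b]" prefix.
    static List<String> mark(String input) {
        List<String> pieces = new ArrayList<>();
        Matcher m = BOLD_TAG.matcher(input);
        boolean inBold = false;
        int last = 0;
        while (m.find()) {
            addPiece(pieces, input.substring(last, m.start()), inBold);
            inBold = !m.group().startsWith("</"); // an opening tag switches bold on, a closing tag off
            last = m.end();
        }
        addPiece(pieces, input.substring(last), inBold);
        return pieces;
    }

    private static void addPiece(List<String> pieces, String text, boolean inBold) {
        String trimmed = text.trim();
        if (!trimmed.isEmpty()) {
            pieces.add(inBold ? "[b]" + trimmed : trimmed);
        }
    }

    public static void main(String[] args) {
        System.out.println(mark("I am <b> bold </b>"));
    }
}
```

This only handles a single kind of tag and does not cope with nesting.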
 
Krishna Chaitanya Reddy Balam
Greenhorn
Posts: 22
I think this is the pattern for <b> tags
<b\b[^>]*>(.*?)</b>
and
this is for general HTML tags
<([A-Z][A-Z0-9]*)\b[^>]*>.*?</\1>

but if I write something like
String s4 = "I am <b>bold</b> and I am <i>italic</i> and I am <b><i>bold italic</i></b>";
Pattern htmlTag = Pattern.compile("<([A-Z][A-Z0-9]*)\\b[^>]*>.*?</\\1>", Pattern.CASE_INSENSITIVE);
int length = s4.length();
Matcher matcher = htmlTag.matcher(s4);
String result = matcher.find() ? matcher.group() : null; // group() is only valid after a successful find()

I need to get the output into a String array

like
String[] sa;
and sa should contain {"I am", "bold", "and I am", "italic", "and I am", "bold italic"}
I know I can get this, but after storing it in the String array I need to tell that sa[1] was between bold tags, sa[3] was between italic tags and sa[5] was between both bold and italic tags.

Is there any way to do this?
Right now I am parsing the string character by character, but I need something more generic, as nested tags are difficult to handle with character-by-character logic.
Please help.
 
Campbell Ritchie
Marshal
Posts: 79239
Difficult to be sure just looking at the code, but you appear to be matching everything from a <b> tag to the next </b> tag. I think you want to match only the <b> and </b>.
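
A minimal sketch of that idea, assuming well-formed input: the pattern matches only the tags, and a Deque records which tags are currently open, so each piece of text can carry the list of tags around it. TaggedSplitter and Segment are hypothetical names of mine, and this is in no way a real HTML parser.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TaggedSplitter {
    // A piece of text plus the tags that were open around it (outermost first).
    static final class Segment {
        final String text;
        final List<String> tags;

        Segment(String text, List<String> tags) {
            this.text = text;
            this.tags = tags;
        }

        @Override
        public String toString() {
            return text + " " + tags;
        }
    }

    // Group 1 is "/" for a closing tag and empty for an opening tag; group 2 is the tag name.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([a-zA-Z][a-zA-Z0-9]*)\\b[^>]*>");

    static List<Segment> split(String input) {
        List<Segment> segments = new ArrayList<>();
        Deque<String> open = new ArrayDeque<>(); // tags currently open, outermost first
        Matcher m = TAG.matcher(input);
        int last = 0;
        while (m.find()) {
            addSegment(segments, input.substring(last, m.start()), open);
            String name = m.group(2).toLowerCase();
            if (m.group(1).isEmpty()) {
                open.addLast(name);               // opening tag
            } else {
                open.removeLastOccurrence(name);  // closing tag; ignored if it was never opened
            }
            last = m.end();
        }
        addSegment(segments, input.substring(last), open);
        return segments;
    }

    private static void addSegment(List<Segment> out, String raw, Deque<String> open) {
        String text = raw.trim();
        if (!text.isEmpty()) {
            out.add(new Segment(text, new ArrayList<>(open))); // copy, so later tags don't alter it
        }
    }

    public static void main(String[] args) {
        String s4 = "I am <b>bold</b> and I am <i>italic</i> and I am <b><i>bold italic</i></b>";
        for (Segment segment : split(s4)) {
            System.out.println(segment);
        }
    }
}
```

With the test string from the question this yields six segments in the order asked for, and the tags list of each segment says whether it was plain, bold, italic or both. Attributes, comments, self-closing tags and malformed markup are not handled.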

You might do well to Google for HTML parsers, as well, if you are looking for more than one kind of tag. Why spend hours and hours re-inventing the wheel?
 
Krishna Chaitanya Reddy Balam
Greenhorn
Posts: 22
I did and nothing helps.
 
Campbell Ritchie
Marshal
Posts: 79239
Sorry to hear that.
List of HTML parsers here.

Try myString.split("</.+>") or myString.split("<b>").

[Campbell@queeg applications]$ java BoldSplitter
I am
bold</b> and I am <i>italic</i> and I am
<i>bold italic</i></b>
[Campbell@queeg applications]$
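
The source of BoldSplitter is not shown, so the following is only a guess at something similar: a small sketch (SplitComparison is a made-up name) that runs the two suggested patterns, plus a reluctant variant, against the test string from the earlier post. Be aware that .+ is greedy, so "</.+>" matches from the first "</" all the way to the last ">" in the string.

```java
import java.util.Arrays;

class SplitComparison {
    static final String S4 =
            "I am <b>bold</b> and I am <i>italic</i> and I am <b><i>bold italic</i></b>";

    // Greedy: the single match runs from the first closing tag to the end of the string.
    static String[] greedy() {
        return S4.split("</.+>");
    }

    // Reluctant: .+? stops at the first '>', so each closing tag is its own delimiter.
    static String[] reluctant() {
        return S4.split("</.+?>");
    }

    // Splits on the literal opening bold tag only.
    static String[] openingBold() {
        return S4.split("<b>");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(greedy()));
        System.out.println(Arrays.toString(reluctant()));
        System.out.println(Arrays.toString(openingBold()));
    }
}
```

None of these tells you which piece was inside which tag; they only show where the string gets cut.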

 