
Searching the XML string content using Regular Expression

 
Greenhorn
Posts: 16
Hi,

I have XML content as a string, and I'm using the W3C DOM to read values from it.

The XML is very large, with attributes and elements similar to the sample below:

<Shares>
<bookDetails bookName="How to Learn English" bookAuthor="English Writer">
<Chapter chapterName="From Alphabetes" chapterPage="23"/>
</bookDetails>
<company>
<name>test</name>
<address>test address</address>
<contact>test contact</contact>
<C02>10.5</C02>
</company>
</Shares>

I've written a method that accepts the root element and a name to search for. It returns the value of the attribute or element whose name matches the given name.

I use XPathAPI.selectNodeList to retrieve the value, with the XPath expressions below to look for the name both as an attribute and as an element:

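// elements that carry an attribute with the given name
String xpath = "//*[@" + inAttr + "]";
// text content of elements with the given name
String xpathElement = "//" + inAttr + "/text()";

NodeList attrNodes = XPathAPI.selectNodeList(root, xpath);
NodeList elementNodes = XPathAPI.selectNodeList(root, xpathElement);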
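
For comparison, the two lookups above can be folded into one query with the JDK's own javax.xml.xpath API. This is only a sketch, not the code from my project: the class name XPathLookup and the helpers findValues and parse are made up for illustration, and it assumes the search name is a plain XML name with no namespace prefix. Whether it runs any faster than XPathAPI would have to be measured.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for this example only.
class XPathLookup {

    // Parses an XML string into a W3C DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Collects the values of every attribute named `name` and the text of
    // every element named `name`, using a single union expression.
    // Assumption: `name` is a valid, unprefixed XML name (it is concatenated
    // into the expression without any escaping).
    static List<String> findValues(Document doc, String name) throws Exception {
        XPath xp = XPathFactory.newInstance().newXPath();
        String expr = "//@" + name + " | //" + name + "/text()";
        NodeList nodes = (NodeList) xp.evaluate(expr, doc, XPathConstants.NODESET);

        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            // getNodeValue() returns the value for both Attr and Text nodes.
            values.add(nodes.item(i).getNodeValue());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<company><name>test</name><C02>10.5</C02></company>");
        System.out.println(findValues(doc, "C02"));
    }
}
```

Note that "//@name" selects the attribute nodes themselves, so there is no second step to read the attribute from the owning element.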

Sample input and output:

Input: bookName

Output: How to Learn English

Input: address

Output: test address

Input: C02 (element names can contain digits too)

Output: 10.5

Problem: XPathAPI.selectNodeList() performs poorly, and each lookup takes too long.

I'm planning to use regular expressions (Pattern, Matcher) to search the XML string and extract the values instead.

Can anyone suggest a regular expression, with a code snippet, that retrieves the value of an element or an attribute whose name matches the given name?

Thanks,
Kathir
 
Bartender
Posts: 5167
kathir, on another occasion and forum you wrote, after you had been referred to this JavaRanch FAQ page:


Sorry about that. i'll make sure that i don't cross post the same matter.



So why weren't you forthright this time around?
http://www.java-forums.org/new-java/41822-searching-xml-string-content-using-regular-expression.html
 
Marshal
Posts: 25671
If you're asking for a regex that can extract data from an arbitrary XML document, no such thing exists. XML allows arbitrarily nested elements, which puts it a level of complexity above what regular expressions can reliably match.
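
If the real goal is speed and not regex as such, a streaming parser may be worth trying, since it reads the string once without building a DOM tree. Here is a rough sketch using StAX (javax.xml.stream, part of the JDK). The class name StaxLookup and the method findFirst are invented for this example. It assumes you only want the first match, that names have no namespace prefix, and that a matching element contains text only.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, named for this example only.
class StaxLookup {

    // Returns the value of the first attribute named `name`, or the text of
    // the first element named `name`, whichever comes first in the document.
    // Returns null when nothing matches.
    static String findFirst(String xml, String name) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                // A null namespace means the namespace is not checked.
                String attr = reader.getAttributeValue(null, name);
                if (attr != null) {
                    return attr;
                }
                if (name.equals(reader.getLocalName())) {
                    // Assumption: the element holds text only. getElementText()
                    // throws XMLStreamException if it meets a child element.
                    return reader.getElementText();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<company><name>test</name><C02>10.5</C02></company>";
        System.out.println(findFirst(xml, "C02"));
    }
}
```

Because the method returns as soon as it finds a match, it never reads the rest of a large document, which a DOM-based lookup cannot avoid.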