java parsing using regular expression

 
Greenhorn
Posts: 7
http://www.csc.liv.ac.uk/teaching/modules/year3s1/comp304.html

I need to parse this page and extract the member of staff listed on it using a regular expression, via java.util.regex.

I only need the regular expression; I have already written the rest of the code.

//////
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Spider {
    public static void main(String[] argv) {
        try {
            URL url = new URL("http://www.csc.liv.ac.uk/teaching/modules/year3s1/comp304.html");
            URLConnection urlConnection = url.openConnection();
            StringBuilder source = new StringBuilder();
            // Read all HTML source from the given URL, line by line.
            // (DataInputStream.readLine() is deprecated, so use a BufferedReader.)
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(urlConnection.getInputStream()))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    source.append(' ').append(line);
                }
            }

            // Replace every run of whitespace with a single space.
            String html = source.toString().replaceAll("\\s+", " ");

            // Here is where I have to define a regular expression that finds
            // the name of the author on the page. Please reply with the regular
            // expression needed in Pattern.compile for the link
            // http://www.csc.liv.ac.uk/teaching/modules/year3s1/comp304.html
            Pattern p = Pattern.compile("");
            // Match the pattern against the HTML source.
            Matcher m = p.matcher(html);
            // Print capturing group 1 of every match (the pattern must define that group).
            while (m.find()) {
                System.out.println(m.group(1));
            }
        } catch (Exception e) {
            System.out.println(e);
        }
    }
}

/////
 
Bartender
Posts: 9626
Please Do Your Own Homework
I am certain Dr K Atkinson would not want us to give you the answer.
 
Sheriff
Posts: 22787
Also, please Use Code Tags.
 
shan rast
Greenhorn
Posts: 7

Rob Prime wrote:Also, please Use Code Tags.




Thanks, I got it, but I have a new problem. I have posted a new topic; please give a solution for that.
 
Marshal
Posts: 79468

shan rast wrote: Thanks, I got it, but I have a new problem. I have posted a new topic; please give a solution for that.

Please tell us how you solved the problem, so others can learn from your experience.
 
shan rast
Greenhorn
Posts: 7

Campbell Ritchie wrote:

shan rast wrote: Thanks, I got it, but I have a new problem. I have posted a new topic; please give a solution for that.

Please tell us how you solved the problem, so others can learn from your experience.



Pattern p1 = Pattern.compile("<h2[^>]*>" + ".*?<a[^>]*>" + "([^<]+)" + "</a[^>]*>");
This is the pattern. Now you just need to read group 1, the only capturing group in the pattern, which gives the name of the author.
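
For readers who want to try the pattern without fetching the page, here is a minimal sketch. The class name `StaffNameExtractor`, the method `extractNames`, and the sample HTML are all hypothetical; the real page's markup is not shown in this thread, so the sample only assumes a name that appears as link text somewhere after an `<h2>` heading. The `<a[^>]*>` part is a reconstruction of a fragment that was lost when the post was rendered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StaffNameExtractor {
    // An <h2> tag, then (lazily) anything up to the next <a ...> tag,
    // then the link text as capturing group 1, then the closing </a>.
    private static final Pattern NAME_IN_LINK = Pattern.compile(
            "<h2[^>]*>" + ".*?<a[^>]*>" + "([^<]+)" + "</a[^>]*>");

    // Returns the text of group 1 for every match found in the given HTML.
    static List<String> extractNames(String html) {
        List<String> names = new ArrayList<>();
        Matcher m = NAME_IN_LINK.matcher(html);
        while (m.find()) {
            names.add(m.group(1).trim());
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical markup, already collapsed onto one line as in the Spider code.
        String sampleHtml = "<h2>Lecturer</h2> <p><a href=\"/~someone/\">Dr A Example</a></p>";
        System.out.println(extractNames(sampleHtml));
    }
}
```

Because `.` does not match line terminators by default, the pattern relies on the whitespace having been collapsed to single spaces first, as the original code does with `replaceAll("\\s+", " ")`.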
 
shan rast
Greenhorn
Posts: 7


I have parsed the data and now I need to store it in an .xml file, that is, write it out as XML.
That is my second problem: I wrote code for it, but it is not working. Could you please give me a link to a tutorial? I need it urgently.
 
author
Posts: 3285
Please Ease Up. There are many tutorials on writing XML documents using Java; have you tried a Google search?
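
As one possible starting point, here is a minimal sketch that uses the JDK's built-in DOM and Transformer APIs. The class name `StaffXmlWriter`, the method `toXml`, and the element names `staff` and `member` are assumptions made for illustration; the thread never says what XML layout was required or what the poster's non-working code looked like.

```java
import java.io.StringWriter;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class StaffXmlWriter {
    // Builds <staff><member>...</member>...</staff> and returns it as a string.
    static String toXml(List<String> names) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("staff"); // hypothetical element name
            doc.appendChild(root);
            for (String name : names) {
                Element member = doc.createElement("member"); // hypothetical element name
                member.setTextContent(name); // characters such as & are escaped on output
                root.appendChild(member);
            }
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("Could not build the XML document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(List.of("Dr A Example")));
    }
}
```

To write to a file instead of a string, the `StreamResult` can wrap a `java.io.File`. Building the document through the DOM API rather than concatenating strings means the serializer handles escaping of special characters in the names.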
 