
How to trace for more URLs in a webpage

 
Atul Oberoi
Greenhorn
Posts: 19
Hello Everyone,

I am using HttpURLConnection in a stand-alone application to connect to a website. After connecting, I store the content of the first page in a file. I want to find all the URLs in that page so that I can make further connections and fetch more content. Does anyone know how to do this?
I pass a user ID and password to get the first page, as it is a secure website.

Thanks in advance!

Regards,

Atul.
 
Joe Ess
Bartender
Posts: 9406
javax.swing.text.html.parser.DocumentParser reads a stream and invokes methods on a callback class to inform it of the tags it finds. You just have to write a callback to handle anchor tags.
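As a rough illustration of that callback approach, here is a minimal sketch. It makes one substitution: it uses javax.swing.text.html.parser.ParserDelegator rather than DocumentParser directly, because ParserDelegator has a no-argument constructor and supplies the default DTD itself. The class name LinkExtractor, the method extractLinks, and the sample HTML are made up for the example. In the original scenario the Reader would wrap the stream from the HttpURLConnection or the saved file.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class, not part of the JDK.
public class LinkExtractor {

    /** Collects the href value of every <a> tag in the HTML read from the given reader. */
    public static List<String> extractLinks(Reader html) throws IOException {
        final List<String> links = new ArrayList<>();

        // The parser calls handleStartTag once for each opening tag it encounters.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        links.add(href.toString());
                    }
                }
            }
        };

        // The third argument (ignoreCharSet = true) tells the parser not to
        // abort when the page declares its own character set.
        new ParserDelegator().parse(html, callback, true);
        return links;
    }

    public static void main(String[] args) throws IOException {
        // Sample input assumed for the demonstration only.
        String page = "<html><body>"
                + "<a href=\"/a.html\">A</a> "
                + "<a href=\"http://example.com/b\">B</a>"
                + "</body></html>";
        for (String link : extractLinks(new StringReader(page))) {
            System.out.println(link);
        }
    }
}
```

The hrefs come back exactly as written in the page, so relative links such as /a.html would still need to be resolved against the page's address (for example with new URL(baseUrl, href)) before opening a further connection.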
 
Atul Oberoi
Greenhorn
Posts: 19
Does this mean I have to use Swing in my program?
 
Ulf Dittmer
Rancher
Posts: 42970
No. This class can be used without the application using Swing for its UI.
 