
How to Extract all URLs

 
Hardik Trivedi
Ranch Hand
Posts: 252
Hi,

I need to fetch the list of all URLs on a website.

For example, if I enter a live URL such as www.xyz.com into my web application, it should extract all the URLs of that site, e.g.:

www.xyz.com/home
www.xyz.com/contactUs
www.xyz.com/Login/?trial=1

Something like that.

Is there any way to do this in Java? Or are there any .jar libraries available for it?

Please help.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
What you are describing is commonly called a web crawler. Search engine indexers work the same way: fetch a page, collect the links on it, then fetch those pages in turn.

A Google search for "java web crawler" turns up lots of examples.
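To give a rough idea of what such a crawler involves, here is a minimal sketch using only the standard library. It is not a production crawler: the class name SimpleCrawler, the fetcher parameter, and the maxPages limit are all made up for this illustration, and the regex-based link extraction is a shortcut that a real HTML parser would handle more reliably. It walks the site breadth-first, resolves each href against the page it was found on, and keeps only links on the same host as the start URL.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class, not taken from any library.
class SimpleCrawler {

    // Naive href matcher (assumption: quoted attribute values); stops at '#' so fragments are dropped.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*[\"']([^\"'#]+)", Pattern.CASE_INSENSITIVE);

    /** Returns the links found in html, resolved to absolute form against baseUrl. */
    public static Set<String> extractLinks(String baseUrl, String html) {
        Set<String> links = new LinkedHashSet<>();
        URI base = URI.create(baseUrl);
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            try {
                links.add(base.resolve(m.group(1).trim()).toString());
            } catch (IllegalArgumentException e) {
                // Skip hrefs that are not valid URIs (for example, ones containing spaces).
            }
        }
        return links;
    }

    /**
     * Breadth-first crawl starting at startUrl. The fetcher maps a URL to its
     * HTML (or null if the page cannot be loaded). At most maxPages pages are
     * fetched, and only links on the start URL's host are followed.
     */
    public static Set<String> crawl(String startUrl, Function<String, String> fetcher, int maxPages) {
        String host = URI.create(startUrl).getHost();
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        seen.add(startUrl);
        queue.add(startUrl);
        int fetched = 0;
        while (!queue.isEmpty() && fetched < maxPages) {
            String url = queue.poll();
            String html = fetcher.apply(url);
            fetched++;
            if (html == null) {
                continue; // page could not be loaded; nothing to extract
            }
            for (String link : extractLinks(url, html)) {
                String linkHost = URI.create(link).getHost();
                // seen.add returns false for URLs already discovered, which prevents loops.
                if (host != null && host.equalsIgnoreCase(linkHost) && seen.add(link)) {
                    queue.add(link);
                }
            }
        }
        return seen;
    }
}
```

The fetcher is passed in so the crawl logic can be tried against canned pages first. For a live site it would perform an HTTP GET, for example with java.net.http.HttpClient on Java 11 or later. Start from a URL with a trailing slash (http://www.xyz.com/ rather than http://www.xyz.com), since java.net.URI resolves relative links poorly against an empty path. A polite crawler should also respect robots.txt and pause between requests.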

Bill
 