
Why can't this web page be crawled?

 
wei liu
I was trying to use the open-source web crawler "WebSPHINX" to crawl all the subpages of the URL "http://www.kompass.com/guide/extraction-industries/services/GSENWW510109.html". However, I failed to get any of its subpages. Judging from the URL, I believe it is not a dynamic web page. Could anyone tell me why the subpages of this URL cannot be crawled?

I have also tried other ready-to-use crawlers (such as JoBo, Teleport Pro, and Arale) on this URL, without success.

Any comments are welcome!

Thanks in advance.

P.S. You can download WebSPHINX here: "http://www.cs.cmu.edu/~rcm/websphinx/"
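
One way to narrow the problem down, as a minimal sketch only: save the page source and check whether it contains ordinary static links at all. The class `LinkCheck` and method `extractLinks` below are hypothetical names, not part of WebSPHINX, and the regular expression is a rough approximation that handles only quoted `href` values. If a check like this finds few or no links in the saved HTML, the links may be generated by scripts; if it finds plenty, the cause may lie elsewhere (for example, how the server responds to the crawler's requests). The post itself does not establish either cause.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper; not part of WebSPHINX or any other crawler.
public class LinkCheck {
    // Matches <a ... href="..."> or <a ... href='...'> (quoted values only).
    private static final Pattern HREF = Pattern.compile(
        "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the href values of all anchor tags found in the given HTML text.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(2)); // group 2 is the text between the quotes
        }
        return links;
    }

    public static void main(String[] args) {
        // Sample input for illustration; in practice, pass in the saved page source.
        String html = "<a href=\"/guide/a.html\">A</a> <A HREF='b.html'>B</A>";
        for (String link : extractLinks(html)) {
            System.out.println(link);
        }
    }
}
```

A regular expression is used here only to keep the sketch dependency-free; a real HTML parser would be more robust against unusual markup.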
 