
Writing a spider

 
Dale DeMott
Ranch Hand
Posts: 515
Okay, so I have an application that needs to be spidered. The requirements are:
1) The spider needs to be able to fill out a form field before spidering.
2) It needs to handle JavaScript.
3) It needs to start at a specified location after the form field has been filled out.
I was thinking about writing this using HttpUnit. Has anyone written a spider with it? Does anyone have other ideas, or know of programs that I might be able to use? Any suggestions would be appreciated.
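To make the third requirement concrete, here is a minimal sketch of the crawl loop itself, assuming the form has already been submitted and the pages can be fetched as HTML strings. It uses only the JDK rather than HttpUnit; the names `SpiderSketch`, `extractLinks`, `crawl`, and the `fetchPage` callback are hypothetical, and the regex-based link extraction is a simplification that does not execute JavaScript and will miss links built by scripts.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: a breadth-first spider that stays on the start URL's host.
class SpiderSketch {
    // Naive href matcher (assumption: quoted attribute values, fragment dropped).
    // Badly formed HTML usually needs a tolerant parser instead.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*[\"']([^\"'#]+)", Pattern.CASE_INSENSITIVE);

    /** Returns the URLs linked from html, resolved against the page's own URL. */
    static List<String> extractLinks(URI base, String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            try {
                links.add(base.resolve(m.group(1).trim()).toString());
            } catch (IllegalArgumentException e) {
                // Skip hrefs that are not valid URIs.
            }
        }
        return links;
    }

    /**
     * Crawls breadth-first from start. fetchPage maps a URL to its HTML
     * (or null if the page cannot be fetched); maxPages caps the crawl.
     */
    static Set<String> crawl(String start, Function<String, String> fetchPage, int maxPages) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        String host = URI.create(start).getHost();
        while (!queue.isEmpty() && visited.size() < maxPages) {
            String url = queue.poll();
            if (!visited.add(url)) {
                continue; // already crawled
            }
            String html = fetchPage.apply(url);
            if (html == null) {
                continue;
            }
            for (String link : extractLinks(URI.create(url), html)) {
                // Follow only links on the same host; javascript: and mailto: have no host.
                String linkHost = URI.create(link).getHost();
                if (host != null && host.equals(linkHost) && !visited.contains(link)) {
                    queue.add(link);
                }
            }
        }
        return visited;
    }
}
```

Passing the fetch step in as a function keeps the loop independent of whichever HTTP library ends up doing the form submission and page retrieval.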
Regards,
Dale DeMott
 
Cindy Glass
"The Hood"
Sheriff
Posts: 8521
I guess that we are not into creepy crawling critters in Intermediate.
Let's move this to Advanced and see if they can offer some advice.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13074
What is implied by "needs to handle JavaScript"?
Do you mean that it needs to parse out forms, etc., that have JavaScript mixed into the HTML, or that it has to execute the JavaScript?
I just used HttpClient (from the Jakarta Commons toolkit) to create a load tester that faked responding to a form. I had to use JTidy to get a parsed DOM representation of the page because the HTML was not well-formed.
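As a rough illustration of faking a form response, the sketch below posts URL-encoded fields and keeps the session cookies for later requests. It is not the load tester's code: it swaps in the JDK's own `java.net.http.HttpClient` (Java 11+) for the Jakarta Commons HttpClient, leaves out the JTidy parsing step, and the class name, method names, and field names are all hypothetical.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch: submit a form the way a browser would, without a browser.
class FormPostSketch {
    /** Encodes fields as application/x-www-form-urlencoded, in map order. */
    static String encodeForm(Map<String, String> fields) {
        return fields.entrySet().stream()
            .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8) + "="
                    + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
    }

    /** Builds a client that stores cookies, so a login or session survives later requests. */
    static HttpClient newSessionClient() {
        return HttpClient.newBuilder()
            .cookieHandler(new CookieManager())
            .followRedirects(HttpClient.Redirect.NORMAL)
            .build();
    }

    /** POSTs the fields to actionUrl (the form's action attribute) and returns the response body. */
    static String submitForm(HttpClient client, String actionUrl, Map<String, String> fields)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(actionUrl))
            .header("Content-Type", "application/x-www-form-urlencoded")
            .POST(HttpRequest.BodyPublishers.ofString(encodeForm(fields)))
            .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) {
        // Field names are made up for illustration; no network call is made here.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("user", "dale");
        fields.put("query", "a b&c");
        System.out.println(encodeForm(fields)); // prints user=dale&query=a+b%26c
    }
}
```

Reusing the same client for every later request is what lets pages behind the form be fetched with the session the form submission established.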
Bill
 