
Java program that interacts with the web

 
Eric Klytzmany
Greenhorn
Posts: 3
Hi,

I am trying to create a Java application that will interact with websites. For example, my application may have to navigate to a certain website, extract the text on the page, compute results, fill in a form, and submit it. Can anyone tell me the best way to go about building such a system? Would I have to write the components that speak HTTP or HTTPS myself, or do APIs for this already exist?

I came across the HtmlUnit API, which is primarily used for testing, and Java browsers like Lobo and JRex that seem to have an API too. How do these compare?

Thanks!
Eric
 
Costi Ciudatu
Ranch Hand
Posts: 74
Apache HttpClient is the first thing to check: http://hc.apache.org/httpcomponents-client/index.html
Also, for HTML processing you have http://htmlparser.sourceforge.net/
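
To make the two steps concrete (talk HTTP, then pull text out of the HTML), here is a minimal sketch. To stay dependency-free it assumes Java 11+ and swaps in the JDK's own java.net.http.HttpClient and the Swing HTML parser instead of the two libraries linked above. The class name WebFormSketch, the helper names, the example.invalid URL, and the form field names are all made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.stream.Collectors;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class; none of these names come from the thread.
class WebFormSketch {

    /** Downloads a page and returns its body as a string (real network I/O). */
    static String fetch(HttpClient client, String url)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    /** Collects the text runs of an HTML document, joined with a space. */
    static String extractText(String html) {
        StringBuilder text = new StringBuilder();
        HTMLEditorKit.ParserCallback collector = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                if (text.length() > 0) {
                    text.append(' ');
                }
                text.append(data);
            }
        };
        try {
            new ParserDelegator().parse(new StringReader(html), collector, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    /** Encodes fields as application/x-www-form-urlencoded, in map order. */
    static String encodeForm(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    /** Builds (but does not send) a POST request that submits the form. */
    static HttpRequest buildFormPost(String url, Map<String, String> fields) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encodeForm(fields)))
                .build();
    }
}
```

A run would roughly be: fetch the page, pass the body to extractText, compute whatever is needed, then send the request from buildFormPost with the same HttpClient. Real forms often also need cookies and hidden fields, which this sketch ignores.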
 
Ulf Dittmer
Rancher
Posts: 42968
The premier library for this is jWebUnit, IMO. No need to deal with HTTP or HTML on a low level; that's all been done before. Don't be put off that it's billed as a "unit testing tool" - it works just fine as a general-purpose web access library.
 
Campbell Ritchie
Sheriff
Posts: 50175
And welcome to JavaRanch
 
Eric Klytzmany
Greenhorn
Posts: 3
Thanks for the replies. I will check them out and get back.
One more thing: is anyone aware of a similar API that also supports interaction with applets? I ask because many of the sites I need to work with may deliver their content as applets. I know extracting text from an applet is going to be tough, but is it even possible? And what about interacting with the applet, such as clicking buttons?

 
Rob Spoor
Sheriff
Posts: 20661
You may also check out this thread as it is about roughly the same subject.
 
Ulf Dittmer
Rancher
Posts: 42968
That's tough. From within the same JVM, the java.awt.Robot class could be used to control a GUI to a certain extent, but from a different JVM that would be much harder. Going out on a limb, I'd say it's impossible to do in the general case where you don't know the applet beforehand. And even if the applet GUI is known, extracting text that was painted on the screen amounts to OCR; I foresee numerous hard problems that way.
 
Eric Klytzmany
Greenhorn
Posts: 3
Hmm, OK, here is another idea. Presumably all the data the applet displays arrives over a socket connection made by the browser, right? So if I wrote the browser myself (or used an API that acts as a mock browser), I would have access to the data flowing in and out of the applet. If so, that data should follow a definite pattern and could be extracted, unless it is encrypted.

Is this even possible, and has anyone attempted it?
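
For what it's worth, the "sit in the middle and watch the bytes" part of this idea can be sketched as a small logging TCP relay. This is only an illustration under a big assumption that the thread does not establish: that the applet's connection can actually be routed through the relay. The class name LoggingRelaySketch and its method names are hypothetical, and the sketch handles one plain (unencrypted) connection only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical helper class; assumes traffic can be pointed at the relay.
class LoggingRelaySketch {

    /** Copies everything from in to out and appends a copy of each chunk to log. */
    static void pump(InputStream in, OutputStream out, ByteArrayOutputStream log) {
        byte[] buffer = new byte[4096];
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                out.flush();
                log.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Accepts one client, connects to the real server, and relays both
     * directions, recording each direction in its own log.
     */
    static void relayOnce(ServerSocket listener, String serverHost, int serverPort,
                          ByteArrayOutputStream sent, ByteArrayOutputStream received)
            throws IOException, InterruptedException {
        try (Socket client = listener.accept();
             Socket server = new Socket(serverHost, serverPort)) {
            InputStream fromServer = server.getInputStream();
            OutputStream toClient = client.getOutputStream();
            // Server-to-client traffic is copied on a second thread.
            Thread downstream = new Thread(() -> pump(fromServer, toClient, received));
            downstream.start();
            // Client-to-server traffic is copied on the calling thread.
            pump(client.getInputStream(), server.getOutputStream(), sent);
            server.shutdownOutput(); // tell the server the client has finished sending
            downstream.join();
        }
    }
}
```

After relayOnce returns, the two logs hold the raw bytes of each direction. Finding the "definite pattern" in them is still a separate reverse-engineering job, and as noted above it does not work if the data is encrypted.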
 