
Reading HTML Source Code for a URL

 
omkar wadkar
I am trying to build an HTML page-scraping application. To do so, I want to read the HTML source for a given URL.
I am using the following code:


The problem is that when I pass http://www.nextag.com/camera as a search parameter, it does not show the entire source.
The <head> and <title> elements are missing. I am wondering why it behaves this way when I put a search parameter along with the URL.




 
Devaka Cooray
That is because http://www.nextag.com/camera redirects to http://www.nextag.com/camera/search-html
 
Rob Spoor
You may want to turn on redirect following for the URLConnection.
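
A minimal sketch of what that could look like, assuming the page is served over plain HTTP and is readable as UTF-8. The class name PageFetcher and the method fetch are made up for illustration and are not the original poster's code. HttpURLConnection normally follows redirects by default, so the call below mainly makes the setting explicit; as far as I know it does not follow a redirect that switches protocol (for example http to https).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class PageFetcher {

    /** Reads the response body for the given address, following HTTP redirects. */
    public static String fetch(String address) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        // Explicitly ask this connection to follow 3xx redirects.
        connection.setInstanceFollowRedirects(true);
        // Timeouts are arbitrary values chosen for this sketch.
        connection.setConnectTimeout(10_000);
        connection.setReadTimeout(10_000);

        StringBuilder html = new StringBuilder();
        // Assumption: the page is encoded as UTF-8.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                html.append(line).append('\n');
            }
        } finally {
            connection.disconnect();
        }
        return html.toString();
    }

    public static void main(String[] args) throws IOException {
        // Falls back to the URL from this thread when no argument is given.
        String address = args.length > 0 ? args[0] : "http://www.nextag.com/camera";
        System.out.println(fetch(address));
    }
}
```

If the source still looks incomplete, printing connection.getResponseCode() and connection.getURL() after the request is one way to check whether a redirect was actually followed.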
 
omkar wadkar
Thank you, guys.

But my application takes a search parameter as input,
so I will have to append the search string to the URL.
Any suggestions?
 
Rob Spoor
The URLEncoder.encode method transforms a value so it can be used in a URL. For instance, a space becomes + (other encoders use %20), & becomes %26, etc.
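
As a rough illustration of the encoding step, here is a small sketch. The class SearchUrlBuilder, the base URL, and the parameter name "q" are all hypothetical; the actual query format the target site expects is not stated in this thread. It uses the URLEncoder.encode(String, Charset) overload, which needs Java 10 or later; on older versions the encode(String, String) overload with "UTF-8" does the same job.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class SearchUrlBuilder {

    /** Appends one encoded query parameter to a base URL that has no query string yet. */
    public static String buildSearchUrl(String baseUrl, String paramName, String searchTerm) {
        // Encode both the name and the value so reserved characters cannot break the URL.
        return baseUrl + "?"
                + URLEncoder.encode(paramName, StandardCharsets.UTF_8)
                + "="
                + URLEncoder.encode(searchTerm, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "http://example.com/search" and "q" are placeholders, not the real site's API.
        System.out.println(buildSearchUrl("http://example.com/search", "q", "digital camera & lens"));
        // prints http://example.com/search?q=digital+camera+%26+lens
    }
}
```

Only the parameter name and value are encoded here, never the whole URL; encoding the full address would also escape the ":" and "/" characters that the URL needs.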
 