
how to extract search engine results

 
amitha reddy
Greenhorn
Posts: 1
Hi, I am new to Java.
I want to extract URLs from a search engine such as Google.
I tried it, but I am getting only one URL.
How do I get all the URLs?
Here is my code:
import java.net.*;
import java.io.*;
import java.util.*;

class Googly2
{
    public static void main(String[] args)
    {
        try {
            // Reading the keyword from text file
            DataInputStream din = new DataInputStream(new BufferedInputStream(new FileInputStream("C:\\Documents and Settings\\Administrator\\Desktop\\keywordfile.txt")));
            String str;
            while ((str = din.readLine()) != null)
            {
                System.out.println("\nKeyword :" + str);
                // Creating URL
                URL url = new URL("http://www.google.com/sponsoredlinks?hl=en&lr=&q=" + str);
                URLConnection conn = url.openConnection();
                conn.setRequestProperty("User-Agent", "");
                conn.connect();
                // Reading the page
                BufferedReader in =
                    new BufferedReader(new InputStreamReader(conn.getInputStream()));
                String line;
                String lines;
                String result = "";
                String urlResult = "";

                int c = 0;
                while ((line = in.readLine()) != null)
                {
                    if (line.indexOf("return ss") != -1)
                    {
                        c++;
                        System.out.println("C :" + c);
                    }
                    if (line.indexOf("return ss") != -1)
                    {
                        System.out.println("C :" + c);
                        urlResult = line.substring(line.indexOf("return ss"), line.indexOf("onMouseOut"));
                    }
                }

                System.out.println("\nURL : " + urlResult);
            }
        } catch (Exception e)
        {
            e.printStackTrace();
        }
    }
}
 
David Harkness
Ranch Hand
Posts: 1646
The if tests in your while loop are duplicates. Perhaps you want this:

Also, in the else block you might want to make sure the line also contains "onMouseOut" or you'll get an exception.
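
One way to act on this advice, offered as a minimal sketch and not as the code from the reply: the posted loop overwrites urlResult on every match and prints it only after the loop ends, so only the last match survives. Collecting each match in a list, and calling substring only when both markers are present, avoids both problems. The class name UrlCollector, the method collectUrls, and the sample lines are hypothetical; the markers "return ss" and "onMouseOut" are taken from the question. The sketch works on lines already read from the page, so it runs without a network connection, and it captures at most one match per line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UrlCollector {
    // Markers copied from the question's code.
    private static final String START = "return ss";
    private static final String END = "onMouseOut";

    // Returns one extracted fragment per line that contains both markers,
    // with the end marker appearing after the start marker.
    static List<String> collectUrls(List<String> lines) {
        List<String> results = new ArrayList<>();
        for (String line : lines) {
            int start = line.indexOf(START);
            if (start == -1) {
                continue; // no start marker on this line
            }
            int end = line.indexOf(END, start);
            if (end == -1) {
                continue; // guard: substring would throw without the end marker
            }
            results.add(line.substring(start, end));
        }
        return results;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines standing in for the downloaded page.
        List<String> page = Arrays.asList(
            "<a onMouseOver=\"return ss('http://example.com/a')\" onMouseOut=\"cs()\">A</a>",
            "<p>no link here</p>",
            "<a onMouseOver=\"return ss('http://example.com/b')\">missing end marker</a>",
            "<a onMouseOver=\"return ss('http://example.com/c')\" onMouseOut=\"cs()\">C</a>");
        for (String url : collectUrls(page)) {
            System.out.println("URL : " + url);
        }
    }
}
```

In the original program, the same list could be filled inside the inner while loop and printed once the page has been read.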
 