
Searching a string for a specific character sequence

 
James Palmer
Ranch Hand
Posts: 36
Hi, I'm wondering if anyone could help me out here.
I have HTML source code in a string called urlpath.
Somewhere in that source there is a tag like this: <meta name=keywords content="key,HTML,cat,dog">
How do I find this tag in the string and pull out its content value? I want to be able to do this for any HTML source code I have.
So basically I want to end up with: aString = "key,HTML,cat,dog"
I'm really struggling, so any help would be greatly appreciated.
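
One way to approach the question above is a regular expression with a capture group around the content value. What follows is only a sketch under assumptions that are not in the original post: the class name MetaKeywords and the method find are made up for illustration, the name attribute is assumed to come before content, and the content value is assumed to be double-quoted. Real pages vary a lot, so a proper HTML parser is probably the safer choice for arbitrary HTML.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is not from the thread.
final class MetaKeywords {
    // Matches <meta name=keywords content="..."> case-insensitively, with or
    // without quotes around the name value. Assumes name precedes content and
    // that the content value is double-quoted.
    private static final Pattern KEYWORDS = Pattern.compile(
        "<meta\\s+name\\s*=\\s*[\"']?keywords[\"']?\\s+content\\s*=\\s*\"([^\"]*)\"",
        Pattern.CASE_INSENSITIVE);

    // Returns the text between the quotes of content="...", if the tag is present.
    static Optional<String> find(String html) {
        Matcher m = KEYWORDS.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String urlpath = "<html><head>"
            + "<meta name=keywords content=\"key,HTML,cat,dog\">"
            + "</head></html>";
        String aString = find(urlpath).orElse("(no keywords tag found)");
        System.out.println(aString); // prints key,HTML,cat,dog
    }
}
```

The capture group, group(1), holds only the text between the quotes, so no follow-up substring or replaceAll calls are needed.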
 
Nathaniel Stoddard
Ranch Hand
Posts: 1258
James,
Cross-posting is evil. Please don't do it.
 
Max Habibi
town drunk
( and author)
Sheriff
Posts: 4118
Originally posted by James Palmer:
Hi, I'm wondering if anyone could help me out here.
I have HTML source code in a string called urlpath.
Somewhere in that source there is a tag like this: <meta name=keywords content="key,HTML,cat,dog">
How do I find this tag in the string and pull out its content value? I want to be able to do this for any HTML source code I have.
So basically I want to end up with: aString = "key,HTML,cat,dog"
I'm really struggling, so any help would be greatly appreciated.


HTH,
M
 
James Palmer
Ranch Hand
Posts: 36
Hi, I have got this so far:
public static void main( String[] args)
{
    StringBuffer stringKeywords = new StringBuffer();
    URL urlpath = new URL( "http://www.o2.co.uk");
    URLConnection objecturl = urlpath.openConnection();
    // now read the webpage
    BufferedReader in = new BufferedReader(new InputStreamReader(objecturl.getInputStream()));
    String keyWords = "";
    try
    {
        BufferedReader in = null;

        //read the data
        while(in.read() !=-1) //buffread
        {
            keyWords = in.readLine(); //buffread

            //condition to get keywords
            if(keyWords.indexOf("\"keywords\"")>0 && keyWords.indexOf("meta")>=0)
            {
                keyWords= keyWords.substring(keyWords.indexOf("\"keywords\"")+10,keyWords.lastIndexOf("\"")+1);
                keyWords = keyWords.replaceAll("content\\s*\\=","");
                keyWords = keyWords.replaceAll("\"*\"*","");
                keyWords = keyWords.replaceAll("\\d","");
                keyWords = keyWords.replaceAll("\\n","");
                //print output
                System.out.println( keyWords.toString());
            }
        }
    }
    catch( Exception exp)
    {
        System.out.println("\nError in method getKeywords() "+exp+"\n");
    }

It's not compiling. Any suggestions, please?
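
A few things in the posted code stand out as likely culprits. The variable in is declared twice, and the second declaration sets it to null. The URL and openConnection() calls throw checked exceptions but sit outside the try block. The method's closing brace and the enclosing class are missing. Calling in.read() before each readLine() would also eat the first character of every line. And the test for "keywords" with quotes around it would never match the unquoted name=keywords form from the first post. Below is a minimal sketch of one possible fix, not a definitive version. The class name KeywordReader and the method readKeywords are made up for illustration, the search is case-sensitive like the original, and it assumes the whole meta tag sits on one line with a double-quoted content value.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical class name; not from the original post.
final class KeywordReader {
    // Returns the content="..." value from the first line that contains both
    // "<meta" and "keywords", or null if no such line is found.
    static String readKeywords(Reader source) throws IOException {
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            // readLine() returns null at end of stream, so no extra read() call is needed.
            while ((line = in.readLine()) != null) {
                int keywords = line.indexOf("keywords");
                if (!line.contains("<meta") || keywords < 0) {
                    continue;
                }
                int content = line.indexOf("content", keywords);
                if (content < 0) {
                    continue;
                }
                // Take whatever sits between the first pair of double quotes after "content".
                int open = line.indexOf('"', content);
                int close = line.indexOf('"', open + 1);
                if (open >= 0 && close > open) {
                    return line.substring(open + 1, close);
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Same URL as in the post; what it serves today is unknown.
        URLConnection connection = URI.create("http://www.o2.co.uk").toURL().openConnection();
        String keyWords = readKeywords(new InputStreamReader(connection.getInputStream()));
        System.out.println(keyWords != null ? keyWords : "(no keywords tag found)");
    }
}
```

Passing a Reader into readKeywords keeps the parsing separate from the network call, so it can be tried against a StringReader holding a saved page before pointing it at a live site. The replaceAll("\\d","") step from the original is left out on purpose, since it would strip digits from the keywords themselves.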
 