trying to remove javascript contents with script tags?

 
Ranch Hand
Posts: 55
I'm trying to remove the JavaScript inside the script tags in an HTML file. I have no problem removing the script tags together with everything inside them. However, I'd like to keep the script tags and remove only the JavaScript between them. I tried taking group(1), which is the contents, and running replace(group(1), ""), but it did not work consistently. matcher.replaceAll works very well, but the darn script tags are part of my match. I thought putting the script tags in non-capturing groups would help, so that I could then call matcher.replaceAll, but they are still in the match.

Any ideas?
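
For reference, a non-capturing group still belongs to the overall match, and replaceAll replaces the whole match, so the tags disappear either way. One possible way around this is to capture the opening and closing tags and write them back with $1 and $2 in the replacement string. The following is only a sketch: the class name, method name, and the exact pattern are assumptions of mine, not code from this thread, and a regex like this will not cope with every HTML edge case.

```java
import java.util.regex.Pattern;

class ScriptBodyStripper {
    // Group 1 is the opening tag, group 2 is the closing tag.
    // The script body between them is matched lazily and is not captured.
    // DOTALL lets '.' cross line breaks; CASE_INSENSITIVE also accepts <SCRIPT>.
    private static final Pattern SCRIPT = Pattern.compile(
            "(<script\\b[^>]*>).*?(</script\\s*>)",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Hypothetical helper: keeps the tags, drops whatever was between them.
    static String stripScriptBodies(String html) {
        // "$1$2" puts both tags back and leaves out the body.
        return SCRIPT.matcher(html).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        String html = "<p>a</p><script type=\"text/javascript\">alert(1);</script>"
                + "<script>var x;\n</script>";
        System.out.println(stripScriptBodies(html));
    }
}
```

Because replaceAll visits every match, this handles any number of script blocks in one call.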



 
Ranch Hand
Posts: 1183
I think there is no need to use regular expressions in this case. Unnecessary overhead.

 
steve labar
Ranch Hand
Posts: 55
What if there are multiple scripts in the file? This would only get the first occurrence. So you think using Java regex is costly time-wise?

 
Sebastian Janisch
Ranch Hand
Posts: 1183
The regex engine is pretty heavyweight, so I tend to avoid it whenever possible.

As for your question, yes, it only strips out the first occurrence.

But you can simply loop over it until sb.indexOf("<script") is -1.
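
A rough sketch of such a loop is below. It is an illustration under assumptions rather than the code posted earlier: the class and method names are made up, and tag names are assumed to be lowercase. Since the goal here is to keep the script tags, sb.indexOf("<script") would never become -1 on its own, so this version passes a start offset to indexOf and moves it past each block it has already cleaned.

```java
class ScriptBodyLoop {
    // Hypothetical helper: empties every <script>...</script> body, keeps the tags.
    // Assumes lowercase tag names and that every opening tag has a closing tag.
    static String stripScriptBodies(String html) {
        StringBuilder sb = new StringBuilder(html);
        int from = 0; // where the next search starts
        while (true) {
            int open = sb.indexOf("<script", from);
            if (open < 0) break;                  // no more script blocks
            int openEnd = sb.indexOf(">", open);  // end of the opening tag
            if (openEnd < 0) break;               // malformed: tag never closes
            int close = sb.indexOf("</script", openEnd + 1);
            if (close < 0) break;                 // malformed: no closing tag
            sb.delete(openEnd + 1, close);        // remove only the body
            // The closing tag now starts at openEnd + 1; continue after it.
            from = openEnd + 1 + "</script".length();
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripScriptBodies(
                "<script>a</script>x<script src=\"y\">b</script>"));
    }
}
```

Note that a self-closing tag such as <script src="y"/> would confuse this sketch, because it would pair that tag with the next closing tag it finds.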
 