Hide Servlet Response from appearing in Google Search

 
Ranch Hand
Posts: 405
I have a servlet that generates a PDF report, and the report contents now appear in Google search results. Is there any way I can prevent this?
 
ujjwal soni
Ranch Hand
Posts: 405
Update:

I added two meta tags, which I hope will work. I am now waiting for Google to reindex.

 
Ranch Hand
Posts: 33
Hi, you have to use a JSP page that generates the PDF report with the dynamic fields, so that the data will never be displayed on Google.
 
ujjwal soni
Ranch Hand
Posts: 405
Thanks for your reply. I am generating a dynamic PDF using a servlet; I do not use JSP here. Do you think that generating the PDF in a JSP would stop Google from indexing it?

I store the PDF as a BLOB, and the servlet serves it by reading the BLOB.

 
Saloon Keeper
Posts: 6208
If something is accessible without restriction, then it's liable to get indexed. How it was generated does not matter. A drawback of a PDF is that you can't add a NOINDEX meta tag to it. You can set up a robots.txt file for your site, though.

Note that both these approaches rely on the spider cooperating. Google does so, but other spiders may not. If you want to be sure that your information is safe, don't make it publicly available.
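
Since a PDF cannot carry a meta tag, a related option that is not mentioned in this thread is the X-Robots-Tag HTTP response header, which Google documents as the counterpart of the robots meta tag for non-HTML files. The sketch below only builds the headers; the class and method names are hypothetical, and it assumes the servlet copies each entry onto the response with response.setHeader(name, value). Like robots.txt, this relies on the crawler cooperating, so it is not a security measure.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: the names are illustrative, not taken from the thread.
final class PdfResponseHeaders {

    // Builds the headers for a PDF response. When indexing is not allowed,
    // an X-Robots-Tag header asks cooperating crawlers not to index the file.
    static Map<String, String> forPdf(boolean allowIndexing) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Content-Type", "application/pdf");
        if (!allowIndexing) {
            headers.put("X-Robots-Tag", "noindex, nofollow");
        }
        return headers;
    }
}
```

In a servlet's doGet, something like PdfResponseHeaders.forPdf(false).forEach(response::setHeader) would apply the headers before the PDF bytes are written.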
 
ujjwal soni
Ranch Hand
Posts: 405
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Thanks for your reply, Tim. I will password-protect the PDFs now.
 
Sheriff
Posts: 6564
Another thing you can do, besides setting up a robots.txt, is to check the User-Agent header to see where the request is coming from. However, it should be noted that some badly behaved spiders present themselves as Firefox or IE. If you need to prevent all spiders from grabbing your sensitive data, you should never expose it on a publicly accessible page.
 
Tim Moores
Saloon Keeper
Posts: 6208

Quote: "checking the user-agent header"

That can be spoofed, just like the Referer header, so it can't be relied upon for anything that matters (like security).
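
To make the limitation concrete, here is a minimal sketch of a User-Agent check. The class name, method name, and token list are hypothetical and deliberately incomplete; in a servlet the input would come from request.getHeader("User-Agent"). Any client that sends a browser-like string passes straight through, which is why this can at most reduce polite crawler traffic.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper: illustrates why a User-Agent check is only a hint.
final class CrawlerCheck {

    // Assumed, incomplete list of substrings that cooperating crawlers announce.
    private static final List<String> CRAWLER_TOKENS = List.of("googlebot", "bingbot", "slurp");

    // Returns true only if the client *claims* to be one of the listed crawlers.
    static boolean looksLikeCrawler(String userAgent) {
        if (userAgent == null) {
            return false;
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        return CRAWLER_TOKENS.stream().anyMatch(ua::contains);
    }
}
```

A spider that identifies itself honestly is detected, while the same spider sending a Firefox User-Agent string is treated as an ordinary browser.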
 
ujjwal soni
Ranch Hand
Posts: 405
It's better to password-protect the PDFs, then. In my case, there are two types of PDFs: public ones, which are searchable in Google, and private ones, which must not be indexed. So I created two servlets, one that serves the public PDFs and another for the private ones. The private PDF servlet is now protected by SSO login, so Google won't be able to crawl it.

I am going to test this tomorrow morning on the live server, but it is currently working fine on my test system.

Thanks for the help.

 
Ranch Hand
Posts: 143
Just following on from what everyone else has said: if something is accessible without restriction, then Google will find and index it. I work in SEO professionally and can honestly say that if you don't want Google to see something, then it must be password protected. Personally, I prefer to put all sensitive information in a /secure/ directory that requires authentication before anything in it can be accessed, as opposed to placing a password on the document itself, as has been described above.

All of the mechanisms discussed above (robots.txt, noindex, etc.) are guidelines, which Google may choose to ignore, and often does.
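
The /secure/ approach from this post can be sketched as a single decision that a servlet filter would make for every request. Only the /secure/ prefix comes from the post; the class name, method name, and the idea of passing in an "authenticated" flag (for example, derived from the session) are assumptions for illustration.

```java
// Hypothetical policy: the logic a servlet filter could apply per request.
final class SecurePathPolicy {

    // Prefix taken from the post; everything under it requires a login.
    static final String SECURE_PREFIX = "/secure/";

    // Returns the HTTP status to answer with: 200 to let the request
    // continue, 401 to demand authentication first.
    static int statusFor(String requestPath, boolean authenticated) {
        boolean protectedPath = requestPath != null && requestPath.startsWith(SECURE_PREFIX);
        if (protectedPath && !authenticated) {
            return 401;
        }
        return 200;
    }
}
```

In practice the same rule is often expressed declaratively, with a security-constraint on /secure/* in web.xml, so that the container enforces it rather than application code.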

Thanks
Michael
 