
Only read a URL page if it has been modified (condition to use)?

 
Greenhorn
Posts: 11
Hi,

I want to read certain content from a web page only if the page has been modified.
What exact condition should I put in my if block?
This is the basic version I have so far, using ifModifiedSince:



Thanks in advance!
 
Saloon Keeper
Posts: 7585
You would need to store the time when you last downloaded the page, so that you can set that as IfModifiedSince value the next time you check the URL.
 
lowercase baba
Posts: 13089
That is the ultimate question... modified since when? In the past hour? The past week? The past year? Since the last time you read it?

Regardless of what the answer is, you have to either a) keep track of the previous time, or b) have a way to query for the previous time. You also need to decide what to do if you don't have a previous time - I assume it would be 'read the page', but that's really your decision.
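
For option a), one possibility is a small file that holds the time of the last successful read. This is only a sketch under assumptions: the class name LastCheckStore and the idea of using a plain text file are made up for illustration. Returning 0 when there is no previous time fits the documented default of URLConnection.setIfModifiedSince, where 0 means the fetch always occurs, so "no previous time" ends up as "read the page".

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not from the original post): remembers when the page was last read.
class LastCheckStore {

    // Returns the stored time in epoch milliseconds, or 0 if there is no previous time.
    static long load(Path file) throws IOException {
        if (!Files.exists(file)) {
            return 0L;
        }
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
        return text.isEmpty() ? 0L : Long.parseLong(text);
    }

    // Overwrites the stored time with the given epoch milliseconds.
    static void save(Path file, long epochMillis) throws IOException {
        Files.write(file, Long.toString(epochMillis).getBytes(StandardCharsets.UTF_8));
    }
}
```

The value from load can be passed straight to setIfModifiedSince, and save would be called with System.currentTimeMillis() after each successful read.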
 
Subu me
Greenhorn
Posts: 11

fred rosenberger wrote: That is the ultimate question... modified since when? In the past hour? The past week? The past year? Since the last time you read it?

Regardless of what the answer is, you have to either a) keep track of the previous time, or b) have a way to query for the previous time. You also need to decide what to do if you don't have a previous time - I assume it would be 'read the page', but that's really your decision.



I just edited the code, and ifModifiedSince is now set to 24 hours ago.
So please tell me what condition to use to check whether the page has changed in the last 24 hours.
I will be running the code once every 24 hours.



Thanks!
 
Tim Moores
Saloon Keeper
Posts: 7585
There would be no condition - you would open the connection in either case. But if the resource hasn't been modified, the response code (which you can get at by casting the connection to an HttpURLConnection) would indicate that, possibly using the HTTP_NOT_MODIFIED code (304).
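
A minimal sketch of that cast, assuming an http or https URL (for which openConnection() returns an HttpURLConnection). The class and method names are made up for illustration; setIfModifiedSince, getResponseCode and HTTP_NOT_MODIFIED are the standard java.net members.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper class, for illustration only.
class ConditionalGet {

    // Sends a request with an If-Modified-Since header and returns the HTTP status code.
    static int statusSince(URL url, long sinceEpochMillis) throws IOException {
        // The cast is valid for http and https URLs.
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            conn.setIfModifiedSince(sinceEpochMillis);
            return conn.getResponseCode(); // this call sends the request
        } finally {
            conn.disconnect();
        }
    }

    // True when the server answered 304 Not Modified.
    static boolean isNotModified(int status) {
        return status == HttpURLConnection.HTTP_NOT_MODIFIED;
    }
}
```

The if block in the calling code would then test the status, for example `if (!ConditionalGet.isNotModified(status)) { ... read the page ... }`.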
 
Subu me
Greenhorn
Posts: 11

Tim Moores wrote: But if the resource hasn't been modified, the response code (which you can get at by casting the connection to an HttpURLConnection) would indicate that, possibly using the HTTP_NOT_MODIFIED code (304).



I am not familiar with what you are describing.
How do I use HttpURLConnection?
Can you please explain how and where I can include it in the code, using HTTP_NOT_MODIFIED?

I am reading certain content from the page, so I need a check that runs before the code that does the reading.

Appreciate any help!
 
Marshal
Posts: 4501
Here are a couple of examples to show how If-Modified-Since works. When you send your request, include a timestamp in the If-Modified-Since header. If the page has not been modified since then, you will get a response with status code 304, indicating that the page has not been modified. If the page has been modified since then, you will get a response with status code 200 and a Last-Modified header indicating the last time it was modified (in addition to the contents of the page).
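
The same behaviour can be tried from Java without depending on any real site. The sketch below is an illustration only: it starts a toy server on a local port using the JDK's built-in com.sun.net.httpserver.HttpServer, and that server pretends its page was last modified on an arbitrarily chosen date. All class, method and constant names are assumptions made up for this example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

class IfModifiedSinceDemo {

    // Pretend the page was last modified at this instant (an arbitrary choice).
    static final Instant LAST_MODIFIED = Instant.parse("2020-01-15T00:00:00Z");

    // Starts a toy server on a free local port that honours If-Modified-Since.
    static HttpServer startServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            String header = exchange.getRequestHeaders().getFirst("If-Modified-Since");
            boolean unchanged = false;
            if (header != null) {
                Instant since = ZonedDateTime
                        .parse(header, DateTimeFormatter.RFC_1123_DATE_TIME).toInstant();
                unchanged = !LAST_MODIFIED.isAfter(since);
            }
            if (unchanged) {
                exchange.sendResponseHeaders(304, -1); // 304 carries no body
            } else {
                byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().set("Last-Modified",
                        DateTimeFormatter.RFC_1123_DATE_TIME
                                .format(LAST_MODIFIED.atZone(ZoneOffset.UTC)));
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
            }
            exchange.close();
        });
        server.start();
        return server;
    }

    // Sends a conditional GET and returns the status code.
    static int status(URL url, long sinceEpochMillis) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setIfModifiedSince(sinceEpochMillis);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startServer();
        try {
            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/");
            long before = Instant.parse("2019-06-01T00:00:00Z").toEpochMilli();
            long after = Instant.parse("2021-06-01T00:00:00Z").toEpochMilli();
            System.out.println("If-Modified-Since 2019: " + status(url, before));
            System.out.println("If-Modified-Since 2021: " + status(url, after));
        } finally {
            server.stop(0);
        }
    }
}
```

With a timestamp before the pretend modification date the server should answer 200, and with a later timestamp it should answer 304.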

Page has not been modified



Page has been modified


Not all sites and pages support this, and the page you are checking, https://get.adobe.com/air/, appears not to.

Response for get.adobe.com/air/
 
Subu me
Greenhorn
Posts: 11

Not all sites and pages support this, and the page you are checking, https://get.adobe.com/air/, appears not to.

Response for get.adobe.com/air/



Thanks for such a detailed answer!
The AIR page was just an example. I will be dealing with multiple web pages.
So how do I check whether a page supports ifModifiedSince? Do I need to get hold of some return value and check it?
And if a page does support it, how can I get hold of the 304 or 200 return value so that I can use it in my if block and go on to read the page?

Thanks again for your detailed explanation!

 
Tim Moores
Saloon Keeper
Posts: 7585
If you go back to my 2nd post, you'll find that I mentioned casting the connection object to another class - did that make sense? The method to get the response code is called (no surprise there) getResponseCode.
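
As a rough sketch of how getResponseCode could feed the if block, and of one heuristic for pages that ignore the header: a 304 means not modified; a 200 that comes with a Last-Modified header can be compared against the timestamp that was sent; a 200 without that header gives no information either way. The class, enum and method names are assumptions made up for illustration, and the heuristic is not a guaranteed test of server support.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper class, for illustration only.
class PageCheck {

    enum Result { NOT_MODIFIED, MODIFIED, UNKNOWN }

    // Interprets the response to a conditional GET.
    // lastModified is the value of URLConnection.getLastModified(),
    // which is 0 when the server sent no Last-Modified header.
    static Result classify(int status, long lastModified, long sinceEpochMillis) {
        if (status == HttpURLConnection.HTTP_NOT_MODIFIED) {
            return Result.NOT_MODIFIED;
        }
        if (status == HttpURLConnection.HTTP_OK && lastModified != 0) {
            // The server may have ignored If-Modified-Since, so compare the dates ourselves.
            return lastModified > sinceEpochMillis ? Result.MODIFIED : Result.NOT_MODIFIED;
        }
        // 200 without Last-Modified, or any other status: cannot tell.
        return Result.UNKNOWN;
    }

    // Performs the request and classifies the answer.
    static Result check(URL url, long sinceEpochMillis) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setIfModifiedSince(sinceEpochMillis);
        try {
            return classify(conn.getResponseCode(), conn.getLastModified(), sinceEpochMillis);
        } finally {
            conn.disconnect();
        }
    }
}
```

What to do with UNKNOWN is a design decision; treating it as "read the page" is the cautious choice.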
 
Subu me
Greenhorn
Posts: 11

Tim Moores wrote:If you go back to my 2nd post, you'll find that I mentioned casting the connection object to another class - did that make sense? The method to get the response code is called (no surprise there) getResponseCode.



With just these 2 lines I can get hold of the response code:

But the AIR page always returns 200, since the site doesn't support ifModifiedSince.
So every time I would get a status of 200, even if the page has not been modified. Correct? (Please correct me if I am wrong here.)

Hence I am looking for a way to know whether a site supports ifModifiedSince. In my code above I set ifModifiedSince to 24 hours earlier, and I would run the script every 24 hours.

Could anyone tweak my code above into a working example, using any site you like? Or maybe give sample code that works on some web page and produces a notification or output if the page has changed in the last 24 hours, or in whatever way you propose.
I am just beginning with Java :-) so please bear with me if I learn a bit slowly.

Please let me know if I need to take a different approach to get this done.
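
One possible different approach, which nobody in this thread proposed and which is offered only as a sketch: for sites that always answer 200, download the page each time, compute a fingerprint of its content, and compare it with the fingerprint stored on the previous run. The class and method names are made up for illustration. Note that pages with dynamic parts (timestamps, session tokens, ads) may produce a different fingerprint on every request even when the content of interest is unchanged.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, for illustration only.
class ContentFingerprint {

    // Returns the SHA-256 digest of the content as a lowercase hex string.
    static String sha256Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // True when there is no previous fingerprint or the content no longer matches it.
    static boolean hasChanged(String previousHash, byte[] currentContent) {
        return previousHash == null || !previousHash.equals(sha256Hex(currentContent));
    }
}
```

The previous fingerprint would have to be stored between runs, for example in a small file next to the script.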

Appreciate all your help here!!
Regards.
 