
Web page size calculation

 
Eswar Varanasi
Greenhorn
Posts: 20
Hi All,
I have a question: how can I find the size of a web page using Java? Is there an API for that? I want to know the size of each part of the response, e.g. CSS, images, content, headers, etc. Would HttpClient from Apache serve the purpose?

Any assistance is greatly appreciated.

thanks,
Ravi.
 
Pat Farrell
Rancher
Posts: 4678
I don't understand what you are asking for.

It's fairly easy in Java to call a library/API that emulates a browser, so you can point your pseudo-browser at a web page, get the information, and have your way with it. Counting the bytes would be trivial.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
I don't think it is all that simple. You would have to recognize all the ways that the base HTML page can refer to resources, from CSS to images and imported JavaScript.

I actually tried to write a general web page analyzer that captured all the parts separately - HttpClient is certainly the way to go.

Bill
 
Eswar Varanasi
Greenhorn
Posts: 20
Yes, HttpClient gives us the basic functionality to get the response and print it. But how can we get the low-level values of the response? Say the response is encrypted or gzipped: I want the complete page size as it is transferred over the network, not the size after decompressing the compressed file.

Any suggestion is greatly appreciated.

thanks,
Ravi
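
A minimal sketch of the distinction asked about above, assuming the server honors an explicit `Accept-Encoding: gzip` request. It uses the JDK's `java.net.HttpURLConnection` rather than Apache HttpClient, because that class hands back the body without decompressing it, so the raw byte count is the body size as received. The class name `WireSize` and its methods are hypothetical. Header bytes, chunked-transfer framing and TLS overhead are not included in either number.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.zip.GZIPInputStream;

final class WireSize {

    /** Reads the stream to its end and returns everything it delivered. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    /** Returns {body bytes as received, body bytes after gzip decoding}. */
    static long[] sizes(byte[] rawBody, String contentEncoding) throws IOException {
        long received = rawBody.length;
        long decoded = received; // same size when the body is not gzipped
        if ("gzip".equalsIgnoreCase(contentEncoding)) {
            try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(rawBody))) {
                decoded = readAll(in).length;
            }
        }
        return new long[] {received, decoded};
    }

    /** Fetches one URL while asking for gzip, then measures its body both ways. */
    static long[] measure(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestProperty("Accept-Encoding", "gzip");
        try (InputStream in = conn.getInputStream()) {
            return sizes(readAll(in), conn.getContentEncoding());
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling `WireSize.measure(...)` with a page URL would return the two sizes for that one response; the first element is the figure closest to "size according to the network" for the body alone.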
 
Eswar Varanasi
Greenhorn
Posts: 20
Hi, is there any way to capture the page size from a web page request?
 
Akhilesh Trivedi
Ranch Hand
Posts: 1608
Originally posted by Ravi Shankar:
Hi, is there any way to capture the page size from a web page request?


Do you want to do this programmatically, or do you simply want to check performance? We sometimes use HttpAnalyzer.
 
Dave Wingate
Ranch Hand
Posts: 262
If you don't need to do the analysis programmatically, then I can recommend the Firebug tool.
 
Eswar Varanasi
Greenhorn
Posts: 20
I'm thinking of doing it programmatically from Java. Is there any API available?

What about JCAPS or JNI? Any idea whether these would serve the purpose?
 
Ulf Dittmer
Rancher
Posts: 42970
Both JCAPS and JNI have no relation to accessing web pages. A library like HttpClient or jWebUnit will help, but keep in mind what William wrote - you need to account for CSS, JavaScript, images and IFrames. Furthermore, tags that source those resources can also be created dynamically by JavaScript, so simply parsing the HTML won't do. Maybe these aren't important for the pages you need to access, but it's something to think about for a general solution.
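
As a rough illustration of the static part of that problem, here is a sketch that pulls resource URLs out of HTML text with regular expressions and resolves them against the page's address. The class name `ResourceLinks` is hypothetical, and the limitations are the ones described above: it only sees `src` on img, script and iframe tags and `href` on link tags that are present in the markup, so anything added by JavaScript, or referenced via `url(...)` inside a stylesheet, is missed. A real HTML parser would be more robust than these patterns.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ResourceLinks {

    // src attribute on <img>, <script> and <iframe> tags.
    private static final Pattern SRC = Pattern.compile(
            "<(?:img|script|iframe)\\b[^>]*?\\bsrc\\s*=\\s*[\"']([^\"']+)[\"']",
            Pattern.CASE_INSENSITIVE);

    // href attribute on <link> tags (stylesheets, icons); <a href> is ignored.
    private static final Pattern LINK_HREF = Pattern.compile(
            "<link\\b[^>]*?\\bhref\\s*=\\s*[\"']([^\"']+)[\"']",
            Pattern.CASE_INSENSITIVE);

    /** Returns the absolute URIs of resources referenced by static markup. */
    static Set<URI> find(String html, URI pageUri) {
        Set<URI> found = new LinkedHashSet<>();
        for (Pattern p : new Pattern[] {SRC, LINK_HREF}) {
            Matcher m = p.matcher(html);
            while (m.find()) {
                try {
                    found.add(pageUri.resolve(m.group(1).trim()));
                } catch (IllegalArgumentException e) {
                    // Attribute value is not a valid URI; skip it in this sketch.
                }
            }
        }
        return found;
    }
}
```

Each URI in the returned set can then be fetched and measured separately.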
 
Eswar Varanasi
Greenhorn
Posts: 20
Hi William Brogden,
I'm about to develop a web page size calculation tool, but it is not clear to me how HttpClient lets us read the sizes of the HTML page elements, e.g. CSS, JS, images, etc. Can you share your view on this?

Any assistance is greatly appreciated.

Thanks,
Ravi
 
Norm Radder
Rancher
Posts: 1734
Try using a proxy server with a browser.
The browser would pull in all the files referenced by an HTML page, and the proxy server could measure the file sizes as they passed through.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
I'm about to develop a web page size calculation tool, but it is not clear to me how HttpClient lets us read the sizes of the HTML page elements, e.g. CSS, JS, images, etc. Can you share your view on this?


The reason to use HttpClient for these requests is that it will help you correctly handle cookies and headers just like a browser would.

For each URL to a page element, such as CSS, you would open an HttpConnection and read the resulting InputStream into a buffer - when the InputStream ends you have a buffer with the total content of that resource.

I would suggest that you first play around with the Firebug add-on for the Firefox browser just to get an idea of how complex the pages you are looking at really are.

Bill
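
The read-until-the-stream-ends step described above can be sketched as follows. This is only an illustration under assumptions: it uses the JDK's `URL.openStream()` instead of HttpClient, so it does none of the cookie or header handling mentioned, and the class name `ResourceSizes` is hypothetical. It counts bytes as they are read rather than keeping the whole resource in a buffer, which is enough when only the size is needed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

final class ResourceSizes {

    /** Reads one resource to the end of its stream and returns the byte count. */
    static long sizeOf(URI resource) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        try (InputStream in = resource.toURL().openStream()) {
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
        }
        return total;
    }

    /** Measures each resource in turn, keeping the order they were given in. */
    static Map<URI, Long> sizesOf(Collection<URI> resources) throws IOException {
        Map<URI, Long> sizes = new LinkedHashMap<>();
        for (URI resource : resources) {
            sizes.put(resource, sizeOf(resource));
        }
        return sizes;
    }

    /** Adds up the per-resource sizes into one page total. */
    static long total(Map<URI, Long> sizes) {
        long sum = 0;
        for (long size : sizes.values()) {
            sum += size;
        }
        return sum;
    }
}
```

The per-resource map keeps CSS, scripts and images separate, and `total` gives the combined body size for the page.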
 
Eswar Varanasi
Greenhorn
Posts: 20
Hi, I've started using HttpClient to get the response as a string, and I'm trying to extract the names of the other files it refers to: CSS, JS, images, iframes and background images (URLs).

Using string operations, I got these names and paths, and I'm calculating the size of each one individually. Do you have any thoughts on this approach?

The sizes of cookies and session data still need to be added.

thanks
[ September 04, 2008: Message edited by: Ravi ]
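
On adding cookie and header sizes, one possible approach is to estimate the header bytes from the header map that `java.net.URLConnection.getHeaderFields()` returns, where the status line is stored under a `null` key. This is a sketch under assumptions: the class name `HeaderSize` is hypothetical, it assumes HTTP/1.x framing of `Name: value` plus CRLF per header, and the result is an estimate because the original spacing on the wire is not preserved. Cookies set by the server arrive as `Set-Cookie` headers, so they are included in the count.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;

final class HeaderSize {

    /** Estimates the size in bytes of an HTTP/1.x response header section. */
    static long estimate(Map<String, List<String>> headers) {
        long total = 2; // the empty line (CRLF) that ends the header section
        for (Map.Entry<String, List<String>> entry : headers.entrySet()) {
            for (String value : entry.getValue()) {
                if (entry.getKey() == null) {
                    total += bytes(value) + 2; // status line + CRLF
                } else {
                    // "Name" + ": " + "value" + CRLF
                    total += bytes(entry.getKey()) + 2 + bytes(value) + 2;
                }
            }
        }
        return total;
    }

    private static int bytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }
}
```

Adding this estimate to the body size of each response gives a figure closer to what was actually transferred for that response.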
 