Accessing similar URLs

 
Ranch Hand
Posts: 32
I have a program that must read the results from a set of similar URLs. The URLs look like this:
http://bla.bla.com/db/query?sql=select%20*%20from%20ARCHIVE1
http://bla.bla.com/db/query?sql=select%20*%20from%20ARCHIVE2
http://bla.bla.com/db/query?sql=select%20*%20from%20ARCHIVE3
http://bla.bla.com/db/query?sql=select%20*%20from%20ARCHIVE4
I must read from 2,000-3,000 URLs such as these. At present, my code creates a URL, gets an InputStream from it, and wraps that in a BufferedReader from which it reads. Like this:
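
A minimal sketch of that approach. The class and method names (`OpenStreamSketch`, `readAll`, `fetch`) are assumptions for illustration, not the code from the post:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class OpenStreamSketch {

    // Reads every line from the stream, closes it, and returns the text
    // with each line terminated by '\n'.
    static String readAll(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // One URL object and one openStream() call per query.
    static String fetch(String address) {
        try {
            return readAll(URI.create(address).toURL().openStream());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```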

Each call to url.openStream() takes close to 1 second. So, I have 2 questions:
1) Is this the most efficient way to turn a URL into something I can read from?
2) Since these URLs all access the same host (bla.bla.com), is there a better way to "navigate" from one URL to the next?

TIA
- Rolf.
 
author and iconoclast
Posts: 24203
Explicitly get the URLConnection object associated with the URL using getConnection(), get the input stream from that object instead of directly from the URL, and be sure to call close() on the URLConnection itself after the request. It may be counterintuitive, but this is what is necessary to tell Java it's OK to reuse the connection.
 
Rolf Johansson
Ranch Hand
Posts: 32


Quote: "Explicitly get the URLConnection object associated with the URL using getConnection()"

Do you mean openConnection()?

Quote: "and be sure to call close() on the URLConnection itself after the request."

I do not see a close() method in the URLConnection class. Perhaps I am misunderstanding something?


So, something like this? Assume arrayOfURLs is filled with the URLs to read from, such as:
"http://bla.bla.com/db/query?sql=select%20*%20from%20ARCHIVE1";
"http://bla.bla.com/db/query?sql=select%20*%20from%20ARCHIVE2";
"http://bla.bla.com/db/query?sql=select%20*%20from%20ARCHIVE3";

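A minimal sketch of such a loop, assuming `openConnection()` is the intended call and that closing the input stream stands in for the `close()` that `URLConnection` does not have. The names `ReuseConnectionSketch`, `drain`, and `fetchAll` are hypothetical:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ReuseConnectionSketch {

    // Reads the response body to the end, then closes the stream.
    // Assumption: a fully read and closed stream is what allows the JDK's
    // HTTP handler to keep the underlying connection alive for reuse.
    static String drain(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Fetches each address in turn through an explicit URLConnection.
    static List<String> fetchAll(String[] arrayOfURLs) {
        List<String> results = new ArrayList<>();
        for (String address : arrayOfURLs) {
            try {
                URLConnection connection =
                        URI.create(address).toURL().openConnection();
                results.add(drain(connection.getInputStream()));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return results;
    }
}
```

`HttpURLConnection` does have a `disconnect()` method, but calling it may close the underlying socket, so it is left out here on the assumption that the goal is to keep the connection to bla.bla.com open between requests.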
 
(instanceof Sidekick)
Posts: 8791
I'll be very interested to see how you solve this. I have a program that does something similar and wouldn't mind speeding up the connects.

I run a configurable number of URLs concurrently in a thread pool. With only 4 or so I'm pegging the CPU most of the time (XP, 3.1 GHz, 2 GB memory), but if I keep adding a few more I can push my network throughput up a tiny bit, to about 80% of what my cable company promises. After that there is not much gain or loss of throughput, but if I could cut out that second or two between files maybe I could light the wire on fire. XP is able to give me just enough CPU to play FreeCell while this is going on.
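
A rough sketch of that kind of setup, assuming a fixed-size pool from `java.util.concurrent`. The `fetcher` parameter and the names `PooledFetchSketch` and `fetchAll` are hypothetical, and this is not the actual program described above:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class PooledFetchSketch {

    // Applies fetcher to every address using at most poolSize worker threads.
    // Results are returned in the same order as the input list.
    static List<String> fetchAll(List<String> addresses, int poolSize,
                                 Function<String, String> fetcher) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String address : addresses) {
                Callable<String> task = () -> fetcher.apply(address);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks until that task is done
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Passing the fetch step in as a function keeps `poolSize` the only knob to turn when looking for the point where throughput stops improving.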

BTW: I use patterns and auto-incrementing numbers to work through the files instead of a list of names. Let me know if you're interested in how that works.
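
One way this could look, as a guess at the idea rather than the actual implementation: a fixed prefix plus a counter. The names `UrlPatternSketch` and `expand` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class UrlPatternSketch {

    // Builds prefix + n for every n from first to last, inclusive.
    // Plain concatenation is used instead of String.format because the
    // percent-encoded parts of a URL (such as %20) would be misread as
    // format specifiers.
    static List<String> expand(String prefix, int first, int last) {
        List<String> urls = new ArrayList<>();
        for (int n = first; n <= last; n++) {
            urls.add(prefix + n);
        }
        return urls;
    }
}
```

For example, `expand("http://bla.bla.com/db/query?sql=select%20*%20from%20ARCHIVE", 1, 4)` yields the four URLs listed in the first post.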
 