First, you should know that there are already some free Java crawlers out there that you could use and customize.
I also developed a "downloader" years ago, when I didn't have Internet access and Teleport was not an option.
You should start by thinking the design through thoroughly, as this is not as simple as it looks; a spider has many aspects.
Here are some thoughts:
- Use multiple download threads and have a manager for them.
- Keep a reference table that tracks each file's status (downloaded, downloading, parsing, etc.) - see the first sketch below.
- Build the links once all the downloads have finished.

As for the URLs problem, there is no class that gives you all the links on a page, but you can use regular expressions (second sketch below). Also note that new URL(host, any_file) gives you a correct absolute URL, no matter whether the file is relative to the host or points to an outside site.
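To make the first two points concrete, here is a minimal sketch of what I mean by a manager plus a reference table; all the names (CrawlManager, QUEUED and so on) are made up for illustration. Workers block in next() until the manager hands them a URL, and the status map doubles as the "have we seen this yet" check:

import java.util.*;

public class CrawlManager {
    static final Integer QUEUED      = new Integer(0);
    static final Integer DOWNLOADING = new Integer(1);
    static final Integer DOWNLOADED  = new Integer(2);

    private final LinkedList queue = new LinkedList(); // URLs waiting to be fetched
    private final Map status = new HashMap();          // URL -> state (the reference table)

    // schedule a URL, skipping any we have already seen
    public synchronized void add(String url) {
        if (!status.containsKey(url)) {
            status.put(url, QUEUED);
            queue.add(url);
            notify(); // wake one waiting worker
        }
    }

    // a worker blocks here until a URL is available
    synchronized String next() throws InterruptedException {
        while (queue.isEmpty()) wait();
        String url = (String) queue.removeFirst();
        status.put(url, DOWNLOADING);
        return url;
    }

    synchronized void markDone(String url) {
        status.put(url, DOWNLOADED);
    }

    // start the pool of download threads
    public void start(int nThreads) {
        for (int i = 0; i < nThreads; i++) {
            new Thread(new Runnable() {
                public void run() {
                    try {
                        while (true) {
                            String url = next();
                            // ... fetch the page, save it, add() any new links ...
                            markDone(url);
                        }
                    } catch (InterruptedException e) {
                        // interrupted: let the thread exit
                    }
                }
            }).start();
        }
    }
}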
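And here is a rough sketch of the regular-expression idea (java.util.regex ships with JDK 1.4; on older JDKs you would need a third-party regex package). LinkExtractor is my own name, and the pattern is deliberately naive - it only catches double-quoted href attributes:

import java.net.*;
import java.util.regex.*;

public class LinkExtractor {
    // naive pattern: only matches href="..." with double quotes
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    public static void printLinks(URL page, String html)
            throws MalformedURLException {
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            // URL(context, spec) resolves the href against the page,
            // so the result is absolute either way
            URL absolute = new URL(page, m.group(1));
            System.out.println(absolute);
        }
    }
}

In a real spider you would feed each resolved URL back into the manager's add() method instead of printing it.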
Also, if you want a challenge - and a feature that I don't know of any spider that offers - figure out links that are built using JavaScript.
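I have never built that part myself, but as a crude starting point you could at least scan the text inside script blocks for quoted strings that look like URLs; something along these lines (ScriptLinkGuesser is a made-up name, and this will not catch links assembled from concatenated pieces - for those you would need to actually interpret the JavaScript):

import java.util.regex.*;

public class ScriptLinkGuesser {
    // quoted strings that begin with http(s):// or a slash
    private static final Pattern JS_URL =
        Pattern.compile("[\"']((?:https?://|/)[^\"'\\s]+)[\"']");

    public static void findCandidates(String scriptText) {
        Matcher m = JS_URL.matcher(scriptText);
        while (m.find())
            System.out.println("possible link: " + m.group(1));
    }
}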