posted 8 years ago
I just completed the Web Crawler exercise (at "127.0.0.1:3999/concurrency/10") and with it the whole Go Tour. But I'm still wondering about one thing. The exercise was to create a web crawler that explored every URL on a page, and for each such URL every URL on the page it referred to, and so on recursively. (Well, not quite forever; there was a depth limit built in.) My code that accomplished it was:
But note that in order to get it to work I had to put a call to "time.Sleep(time.Second)" in my main function. Without that line, the main function would return, terminating the program, before very many calls to "Crawl()" had executed. Is there some way in Go to tell the main function to wait and stay alive until all currently executing goroutines (lightweight threads) have finished?
One way I was thinking I could implement that would be to add an integer "Count" field to my "SafeMap" struct, increment it before each call to "go Crawl(sm, u, depth-1, fetcher)", decrement it at the end of the "Crawl()" function, and then have my main function loop until that "Count" was back to zero. That seems kind of drastic, though. Anybody have any better ideas?