posted 14 years ago
Thanks for the hint about EXPLAIN.
First I start with a link (taken from the DB or passed as a command-line parameter) and check whether the corresponding table exists.
If it doesn't, I create the table,
set the link as busy,
download the link and set it as downloaded.
Then I extract all links from the downloaded HTML and insert them.
Last, I set the processed field to true so I can exclude those links from my next query.
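
The steps above can be sketched roughly as follows. This is only an illustration, not the original code: it uses Python's built-in sqlite3 module in place of PostgreSQL so that it runs on its own, and the table name `links`, its columns, and the helpers `process_link` and `fetch` are all hypothetical. In PostgreSQL the existence check would query `information_schema.tables` rather than `sqlite_master`.

```python
import re
import sqlite3


def table_exists(conn, name):
    # SQLite keeps its catalog in sqlite_master (assumption: SQLite stand-in).
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


def ensure_table(conn):
    # Create the (hypothetical) links table only if it is missing.
    if not table_exists(conn, "links"):
        conn.execute(
            "CREATE TABLE links ("
            " url TEXT PRIMARY KEY,"
            " busy INTEGER NOT NULL DEFAULT 0,"
            " downloaded INTEGER NOT NULL DEFAULT 0,"
            " processed INTEGER NOT NULL DEFAULT 0)"
        )


def extract_links(html):
    # Deliberately naive: a real crawler would use an HTML parser.
    return re.findall(r'href="([^"]+)"', html)


def process_link(conn, url, fetch):
    """Run one crawl step; `fetch` is any callable mapping a URL to HTML."""
    ensure_table(conn)
    conn.execute("INSERT OR IGNORE INTO links (url) VALUES (?)", (url,))
    conn.execute("UPDATE links SET busy = 1 WHERE url = ?", (url,))
    html = fetch(url)
    conn.execute("UPDATE links SET downloaded = 1 WHERE url = ?", (url,))
    found = extract_links(html)
    # One batched call instead of one statement per extracted link.
    conn.executemany(
        "INSERT OR IGNORE INTO links (url) VALUES (?)",
        [(u,) for u in found],
    )
    conn.execute("UPDATE links SET processed = 1 WHERE url = ?", (url,))
    conn.commit()
    return found
```

The next query can then pick up work with something like `SELECT url FROM links WHERE processed = 0`. Committing once per page instead of once per statement is a design choice in this sketch, not something stated in the post.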
I just read that it's possible to enable time tracking in the PG logs, a bit like a verbose mode.
I'll try this; maybe I can find the bottleneck that way.
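
On the server side, PostgreSQL settings such as `log_min_duration_statement` write statement durations to the log. As a complement, here is a small client-side sketch for timing each crawl step to see where the time goes. The names `timed`, `timings` and `slowest_step` are made up for illustration, and the `time.sleep` calls merely stand in for real work.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per named step.
timings = defaultdict(float)


@contextmanager
def timed(step):
    # Measure the enclosed block and add its duration to the step's total.
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[step] += time.perf_counter() - start


def slowest_step():
    # Name of the step with the largest accumulated time.
    return max(timings, key=timings.get)


# Usage: wrap each phase of the crawl loop.
with timed("download"):
    time.sleep(0.05)   # placeholder for the HTTP request
with timed("insert links"):
    time.sleep(0.01)   # placeholder for the INSERT statements

for step, seconds in sorted(timings.items(), key=lambda kv: -kv[1]):
    print(f"{step}: {seconds:.3f}s")
```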
Edit: I forgot to mention that I create an index after creating the table.
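
The post does not say which column the index is on, so the following is only a guess at the shape of that step: it assumes a hypothetical `links` table and indexes the `processed` flag, since that is the field the next query filters on. As above, sqlite3 stands in for PostgreSQL to keep the sketch self-contained.

```python
import sqlite3


def create_links_table(conn):
    # Hypothetical schema; the real table layout is not given in the post.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS links ("
        " url TEXT PRIMARY KEY,"
        " processed INTEGER NOT NULL DEFAULT 0)"
    )
    # Index the column used in the WHERE clause of the "next link" query.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_links_processed ON links (processed)"
    )
    conn.commit()


def index_names(conn, table):
    # List all indexes SQLite knows about for the given table.
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = ?",
        (table,),
    ).fetchall()
    return [r[0] for r in rows]
```

Whether such an index helps depends on how selective the flag is; if most rows are already processed, an index on a boolean column can still narrow the scan, but every insert and update also pays for maintaining it.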