My problem: I have a table with approximately 5 million rows. I need to read the data in a text column of this table, tokenize it, write the tokenized text back to the table, and write the first word of that data to another column in the same table.
My solution: This only has to run once, so I'm not as concerned with elegance and speed as long as it works. My plan was to retrieve data into a result set, process it, write the data back to the database, then fetch more data into the result set, and keep going until I've processed the whole table. I have to use a third-party tokenizer written in Java, so performing this on the server with stored procedures and SQL isn't an option.
I get the same error on the same row number every time, which leads me to believe it's a memory problem, but I'm not sure what the problem is or how to fix it. I can run this code against a table with only 500 rows and it executes fine. This is the error I get:
java.io.EOFException
MESSAGE: Can not read response from server. Expected to read 65 bytes, read 47 bytes before connection was unexpectedly lost.
I use two Connection objects, one for reading and one for writing. My understanding is that, since I'm streaming a row at a time, I have to read all of the rows completely before I can use the read connection for any other result set.
This is how I get the result set (it's inside a try/catch block):
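The actual snippet isn't shown above, so this is only a sketch of how a streaming read is typically set up with MySQL Connector/J (which the error message suggests is in use). The table and column names are placeholders, not the poster's real schema; `setFetchSize(Integer.MIN_VALUE)` is the Connector/J-specific hint that makes the driver stream rows one at a time instead of buffering all 5 million in memory:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class StreamingReader {

    // Builds the full-table scan query; table/column names are
    // placeholders, not the poster's actual schema.
    static String scanQuery(String table, String idCol, String textCol) {
        return "SELECT " + idCol + ", " + textCol + " FROM " + table;
    }

    // Opens a forward-only, read-only result set and asks Connector/J
    // to stream rows one at a time. Integer.MIN_VALUE as the fetch
    // size is MySQL-specific; other drivers use a positive batch size.
    static ResultSet openStreaming(Connection readConn, String sql)
            throws SQLException {
        Statement stmt = readConn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        stmt.setFetchSize(Integer.MIN_VALUE);
        return stmt.executeQuery(sql);
    }
}
```

With a streaming result set like this, the read connection is tied up until the result set is fully consumed or closed, which matches the two-connection setup described above.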
This is how I process it:
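Again, the original loop isn't shown, so this is a sketch of the read-tokenize-write pattern the question describes, with a whitespace split standing in for the third-party tokenizer and placeholder table/column names. The `UPDATE` statement is prepared once and reused for every row, with only the parameter values changing:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class RowProcessor {

    // Stand-in for the third-party tokenizer: splits on whitespace.
    static String[] tokenize(String text) {
        return text.trim().split("\\s+");
    }

    // First token of the text ("" when the text is blank).
    static String firstWord(String text) {
        String[] tokens = tokenize(text);
        return tokens.length > 0 ? tokens[0] : "";
    }

    // Reads each row from the streaming result set and writes the
    // tokenized text plus its first word back on a second connection,
    // reusing one PreparedStatement for every row.
    static void process(ResultSet rs, Connection writeConn)
            throws SQLException {
        PreparedStatement update = writeConn.prepareStatement(
                "UPDATE my_table SET text_col = ?, first_word = ? WHERE id = ?");
        while (rs.next()) {
            String joined = String.join(" ", tokenize(rs.getString("text_col")));
            update.setString(1, joined);
            update.setString(2, firstWord(joined));
            update.setLong(3, rs.getLong("id"));
            update.executeUpdate();
        }
        update.close();
    }
}
```

Reusing one PreparedStatement this way is fine and is normally cheaper than re-preparing it per row, so that is unlikely to be the cause of the EOFException.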
Do I need to instantiate a new PreparedStatement every time I write to the table? I'm reusing the same statement and simply setting new values in it each time.
I'm the only person using the DB, so it isn't that someone else is trying to access the same data I'm accessing. The Java code runs on my desktop; I do not have access to an application server.
I'm hoping someone can spot my error or has a better solution to my problem.