I was asked the following question during an interview:
How would you design an application that needs to process a file with 1000 records (rows) using multi-threading? In this context, "processing" means parsing some type of delimited file, such as CSV, extracting each row, and writing it to the database.
I asked the interviewer whether I need to implement this using PL/SQL or Java, and he said "it's up to you."
So, I said I would first create 10 threads, and each thread would be responsible for reading rows 1 - 100, 101 - 200, and so on...
I would put a try-catch block in my code where the processing occurs, in case there was an error processing these records.
In the event that one of my batches failed, I would still maintain sequence on the database by using a key column for the records which I insert.
I answered this question without a solid understanding of how multi-threading works, and I would appreciate it if anyone could expand on my answer and tell me how I could have answered this question better. Thank you.
I would just use features built into Java to do that task. Set up a ThreadPoolExecutor with 10 threads, then read the rows from the file and submit them to the ThreadPoolExecutor.
Of course, that would only be to process the rows once they were read. Reading a file whose lines no doubt have different lengths is not a suitable task for more than one thread, because a thread cannot find where line 101 starts without first scanning everything before it. And if the only processing done on a row is to add it to a database, I would measure my multi-threaded solution against the obvious single-threaded solution to see if it made any difference.
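To make that concrete, here is a minimal sketch of the idea, not a definitive implementation. One thread reads the file line by line, and each row is handed to a fixed pool of worker threads. The class name RowLoader, the load method, and the sink parameter (standing in for whatever writes a row to the database) are my own assumptions for illustration. The parsing is a naive split on commas and does not handle quoted fields.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical helper: reads rows on the calling thread, processes them on a pool.
final class RowLoader {

    // Returns the number of rows whose processing threw an exception.
    // "sink" is a placeholder for the real database insert.
    static int load(BufferedReader reader, int threads, Consumer<String[]> sink)
            throws IOException, InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<?>> pending = new ArrayList<>();
        try {
            String line;
            // Only this thread touches the file, so line boundaries are never an issue.
            while ((line = reader.readLine()) != null) {
                final String row = line;
                // Naive CSV parsing (assumption): no quoted or escaped commas.
                pending.add(pool.submit(() -> sink.accept(row.split(",", -1))));
            }
        } finally {
            // Stop accepting new tasks; tasks already submitted still run.
            pool.shutdown();
        }

        int failed = 0;
        for (Future<?> result : pending) {
            try {
                result.get(); // waits for the task and rethrows its failure, if any
            } catch (ExecutionException e) {
                failed++;     // one bad row does not stop the others
            }
        }
        return failed;
    }
}
```

You would call it with something like RowLoader.load(Files.newBufferedReader(path), 10, row -> insertRow(row)), where insertRow is your own database code. Counting failures per row plays the role of the try-catch block in the question. Timing this against a plain loop that calls the same sink is the measurement I would do before keeping the threads.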