The application I'm considering is one that detects bit-rot on backup drives. There would be one row per file containing the full path, file date, file size, MD5, last verified date, and maybe one or two others. Verification would probably happen every few days. Verification can be halted partway through (probably the reason I was leaning toward a DB instead of a flat file). Possible indexes: by full path, by verified date.
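For concreteness, here is a minimal sketch of the kind of table I have in mind, using SQLite from Python; the table and column names are placeholders I picked, not anything settled.

```python
import sqlite3

# Sketch of the one-row-per-file table (names are my own guesses).
conn = sqlite3.connect("bitrot.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS file_checksum (
    full_path     TEXT PRIMARY KEY,   -- the by-full-path index comes free with the PRIMARY KEY
    file_mtime    INTEGER,            -- date of the file
    file_size     INTEGER,            -- size of the file
    md5           TEXT,               -- hex digest
    last_verified INTEGER             -- last verified date (epoch seconds)
);
CREATE INDEX IF NOT EXISTS idx_last_verified ON file_checksum (last_verified);
""")
conn.commit()
conn.close()
```

The index on last_verified is what would let a halted run resume by picking up the least-recently-verified rows first.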
Another route I'm considering is having a $CHECKSUM root directory on the backup drive, and under it a directory tree that mirrors the tree of the backup drive, with one flat file per directory containing the checksums for all the backup files in the corresponding directory. This way the checksum files would typically be small and manageable. Traversing a tree of checksum files would be no more difficult than traversing a regular directory tree. This approach might be lighter weight than using a DB. I was originally thinking that if the checksums could be kept in a single table along with the associated indexes, it would be simpler. As one big DB table, random access would be necessary; with multiple small flat files, random access might not be necessary. Is this a KISS problem?
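A rough sketch of how the per-directory checksum files could be generated; the mount point, the CHECKSUMS.md5 file name, and the helper names are assumptions, only the $CHECKSUM root comes from the layout described above.

```python
import hashlib
from pathlib import Path

BACKUP_ROOT = Path("/mnt/backup")           # hypothetical mount point
CHECKSUM_ROOT = BACKUP_ROOT / "$CHECKSUM"   # root of the mirror tree described above

def md5_of(path: Path) -> str:
    """MD5 a file in chunks so large backup files don't have to fit in memory."""
    h = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def write_dir_checksums(directory: Path) -> None:
    """Write one flat checksum file for the files directly inside one backup directory."""
    out_dir = CHECKSUM_ROOT / directory.relative_to(BACKUP_ROOT)
    out_dir.mkdir(parents=True, exist_ok=True)
    lines = [f"{md5_of(f)}  {f.name}\n" for f in sorted(directory.iterdir()) if f.is_file()]
    (out_dir / "CHECKSUMS.md5").write_text("".join(lines))
```

The "digest, two spaces, file name" line format is the same one md5sum emits, so each per-directory file could also be verified with md5sum -c if the tool ever went away.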
P.S.
My current backup drive holds 16 TB, of which 8 TB is filled with 1,613,018 files across 236,504 directories.