We have large files which do not fit in memory. There are B+-tree indexes, but finding a record requires a seek within a big file. One seek is not equivalent to reading one 4 KiB block: a seek takes about 10 milliseconds, while sequential read speed is about 50 MB/s, so one seek costs as much time as reading 500 kB. Even more: a mechanical disk is far better at reading continuous data than at seeking.
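To make the trade-off concrete, here is a minimal back-of-the-envelope sketch in Python, using the 10 ms and 50 MB/s figures assumed above:

    # Cost model for a mechanical disk (figures are the assumptions above).
    SEEK_TIME_S = 0.010            # ~10 ms per seek
    READ_SPEED_BPS = 50_000_000    # ~50 MB/s sequential read

    # Bytes we could have read sequentially in the time one seek takes:
    seek_equivalent_bytes = SEEK_TIME_S * READ_SPEED_BPS
    print(seek_equivalent_bytes)   # 500000.0 -> one seek "costs" ~500 kB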
For example: the Bitcoin blockchain has 100 GB of data, 200 million transactions, and 500 million inputs + 500 million outputs = one billion scripts.
Assume we do not use a database system but work directly on the file system with our own algorithms. The 200 million transactions are indexed; now we must find the addresses of the inputs. Each input carries no address, only a link to a previous transaction. Done naively, that is 500 million seeks, which takes as long as reading 250 TB of data (!) instead of "only" the 100 GB of data we actually have.
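Continuing the same sketch with the Bitcoin numbers (the day count is a derived figure, not from the text above):

    SEEKS = 500_000_000                  # one seek per input lookup
    SEEK_EQUIVALENT_BYTES = 500_000      # from the cost model above

    print(SEEKS * SEEK_EQUIVALENT_BYTES / 1e12)  # 250.0 -> 250 TB equivalent
    print(SEEKS * 0.010 / 86_400)                # ~57.9 days of pure seeking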
Which ideas could decrease the seek count? Maybe delay the use of each found transaction link, accumulating, say, 1000 lookups that lie near one another and then serving them together?
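A minimal sketch of that batching idea, assuming fixed-size records and a caller-supplied handle callback (both are assumptions for illustration): accumulate lookups, sort each batch by file offset, and serve it in one forward sweep.

    RECORD_SIZE = 64    # assumed fixed-size records, for the sketch
    BATCH_SIZE = 1000   # the batch size suggested above

    def flush_batch(f, pending, handle):
        """Serve accumulated lookups in one ascending sweep of the file,
        so the disk head only moves forward instead of jumping randomly."""
        pending.sort(key=lambda item: item[0])       # sort by file offset
        for offset, input_id in pending:
            f.seek(offset)                           # short forward seeks only
            handle(input_id, f.read(RECORD_SIZE))
        pending.clear()

    def lookup_all(f, links, handle):
        """links yields (input_id, offset_of_referenced_transaction) pairs."""
        pending = []
        for input_id, offset in links:
            pending.append((offset, input_id))
            if len(pending) >= BATCH_SIZE:
                flush_batch(f, pending, handle)
        if pending:                                  # serve the tail
            flush_batch(f, pending, handle)

Taken to the extreme, the same idea becomes an external sort: dump all 500 million (offset, input_id) pairs to disk, sort them by offset, and then a few sequential passes over the 100 GB file answer every lookup with almost no random seeks.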
How can we profit from a memory cache? Memory might hold 1%, 10%, or 50% of the big file. A naive seek does not distinguish whether the wanted record is already among the records in memory.
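One hedged sketch of exploiting that: keep hot blocks in an explicit LRU map and touch the disk only on a miss, so a lookup whose block is already cached costs no seek at all (the block size, the capacity, and the single-block simplification are assumptions):

    from collections import OrderedDict

    BLOCK_SIZE = 4096                        # assumed block granularity

    class BlockCache:
        """Tiny LRU cache over file blocks: a stand-in for the OS page
        cache that lets us see which lookups avoid the disk entirely."""
        def __init__(self, f, capacity_blocks):
            self.f = f
            self.capacity = capacity_blocks  # e.g. 1%, 10%, 50% of the file
            self.blocks = OrderedDict()      # block_no -> bytes

        def read(self, offset, size):
            # Simplification: assumes the read fits inside one block.
            block_no = offset // BLOCK_SIZE
            if block_no in self.blocks:
                self.blocks.move_to_end(block_no)    # hit: no seek needed
            else:
                self.f.seek(block_no * BLOCK_SIZE)   # miss: one real seek
                self.blocks[block_no] = self.f.read(BLOCK_SIZE)
                if len(self.blocks) > self.capacity:
                    self.blocks.popitem(last=False)  # evict least recent
            start = offset - block_no * BLOCK_SIZE
            return self.blocks[block_no][start:start + size]

If the access pattern is skewed (for example, recently created outputs tend to be spent soon after), the hit rate can be far higher than the cached fraction of the file, so ordering lookups to reuse still-cached blocks matters as much as the batching above.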