
Implementing a file cache

 
Mark Mescher
Ranch Hand
Posts: 34
Hi,
I want to implement a file cache. I have a number of directories, and for every file in them I want to store the filename together with its last-modified date and an MD5 checksum.
In a background thread I check every few minutes whether a file has changed and, if so, refresh its MD5 hash. So far that is no big problem.
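For readers following along, here is a minimal sketch of the kind of background check described above. It assumes a single scheduler thread and a ConcurrentHashMap keyed by path. The class and method names (FileCacheRefresher, refresh, start) are hypothetical and are not taken from the poster's code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: all names here are illustrative assumptions.
class FileCacheRefresher {

    // Cached state per file: last-modified time in millis and the MD5 digest.
    static final class Entry {
        final long lastModified;
        final byte[] md5;

        Entry(long lastModified, byte[] md5) {
            this.lastModified = lastModified;
            this.md5 = md5;
        }
    }

    private final Map<Path, Entry> cache = new ConcurrentHashMap<>();

    // Re-hashes a file only when its timestamp differs from the cached one.
    // Returns true if the hash was (re)computed.
    boolean refresh(Path file) throws IOException {
        long modified = Files.getLastModifiedTime(file).toMillis();
        Entry old = cache.get(file);
        if (old != null && old.lastModified == modified) {
            return false;
        }
        cache.put(file, new Entry(modified, md5(file)));
        return true;
    }

    // Streams the file through MessageDigest so large files are not loaded whole.
    static byte[] md5(Path file) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                digest.update(buffer, 0, n);
            }
        }
        return digest.digest();
    }

    // Checks every file once per period on one background thread.
    ScheduledExecutorService start(Iterable<Path> files, long periodMinutes) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> {
            for (Path file : files) {
                try {
                    refresh(file);
                } catch (IOException e) {
                    // Skip files that cannot be read in this round.
                }
            }
        }, 0, periodMinutes, TimeUnit.MINUTES);
        return scheduler;
    }
}
```

Comparing timestamps first means the expensive MD5 pass only runs for files that appear to have changed. The caller would keep the returned scheduler and call shutdown() on it when the cache is no longer needed.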
I want to know whether this solution performs well. In theory the cache could hold a large number of files. So far I store the information for each file in a simple data object and put all of these objects into a Collection.
Is there a way to minimize the memory this mechanism needs?
Bye
Mark
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
A few obvious points that you probably already cover: store the long value of lastModified rather than a String or Date object, and store the byte[] of the MD5 digest rather than a String representation.
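As a rough illustration of that advice, the sketch below keeps the timestamp as a primitive long and the digest as a 16-byte array, and only builds the hex String on demand. CompactEntry and its methods are hypothetical names chosen for this example.

```java
import java.util.Arrays;

// Hypothetical sketch of a compact cache entry; names are illustrative.
class CompactEntry {
    private final long lastModified; // millis since epoch, as returned by File.lastModified()
    private final byte[] md5;        // raw 16-byte digest instead of a 32-character hex String

    CompactEntry(long lastModified, byte[] md5) {
        if (md5.length != 16) {
            throw new IllegalArgumentException("an MD5 digest is 16 bytes long");
        }
        this.lastModified = lastModified;
        this.md5 = md5.clone(); // defensive copy so callers cannot mutate the entry
    }

    long lastModified() {
        return lastModified;
    }

    // Compares digests byte by byte without creating any String.
    boolean sameDigest(byte[] other) {
        return Arrays.equals(md5, other);
    }

    // Builds the hex form only when it is actually needed, e.g. for logging.
    String md5Hex() {
        StringBuilder sb = new StringBuilder(32);
        for (byte b : md5) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Converts an existing hex String (such as one already stored) into raw bytes.
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }
}
```

The saving comes from avoiding one Date object and one String object (with its backing character data) per file; the exact number of bytes saved depends on the JVM.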

However the question of performance will depend on how this cache is used.
How many Threads will be accessing the cache at one time?
How is lookup done - by filename only?
Do you have to cache all the files or only the most active?
What's the ratio of lookups to file changes?
What is more important, minimizing memory, speed of lookup, multi-thread capability?

I recently did some work for a client that was very similar to what you are after.
Bill
 
Damanjit Kaur
Ranch Hand
Posts: 346
Theoretically it is possible that there are a big number of files in the cache. Since now I save Information in a simple DataObject, all together put into a Collection.


Well, I don't know exactly how a file cache works, but here is an idea to reduce the number of files in the cache:

Maybe you can start by storing all the files in the directory with the information you already store, plus one more piece of information: how often each file is edited. Initially this will be the same (0) for all files.
In the background thread, whenever a file is edited, its frequency is incremented. The collection could be, say, a linked list in which the most frequently edited files come first and the least frequently edited files come later.

Files whose edit frequency is still 0 can be deleted from the list after some time, and files that are edited but not in the list can be added to it.
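One possible reading of this idea, sketched with a HashMap of edit counts in place of a linked list (a substitution made here to keep the example short). EditFrequencyTracker and its method names are hypothetical, and the class is not thread-safe as written.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of frequency-based tracking; names are illustrative.
class EditFrequencyTracker {
    private final Map<String, Integer> editCounts = new HashMap<>();

    // Starts tracking a file with an edit count of 0.
    void track(String fileName) {
        editCounts.putIfAbsent(fileName, 0);
    }

    // Increments the count; a file not yet in the map is added with count 1.
    void recordEdit(String fileName) {
        editCounts.merge(fileName, 1, Integer::sum);
    }

    // Returns file names ordered from most edited to least edited.
    List<String> byFrequency() {
        List<String> names = new ArrayList<>(editCounts.keySet());
        names.sort(Comparator.comparingInt((String n) -> editCounts.get(n)).reversed());
        return names;
    }

    // Drops every file that has never been edited; returns how many were removed.
    int evictNeverEdited() {
        int before = editCounts.size();
        editCounts.values().removeIf(count -> count == 0);
        return before - editCounts.size();
    }
}
```

The background thread would call recordEdit whenever it detects a change and call evictNeverEdited after whatever grace period is chosen. Files removed this way are no longer cached, so a later lookup for them would have to compute the checksum on demand.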
 
Mark Mescher
Ranch Hand
Posts: 34
Thanks for your hints. So far I store the date as a Date and the MD5 as a String.
I will optimize this point. For me it is very important to reduce the memory needed, because I will have to manage very large file systems...
Perhaps for very, very large file systems I will implement a database solution to save memory...
Bye
Mark
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
Don't say bye yet - this is an interesting area - let us know what you come up with. The performance forum would be a good place.
Bill
 