I am working on a utility that reads a directory (folder) recursively and displays the names of all the files in that directory and its subdirectories. I am working on Windows 7. Let's say I want to read all the files contained in C: and display their file names, etc.
Right now my approach is to use FileUtils.listFiles, provided in org.apache.commons.io.FileUtils, to get all the files from the folders. This approach is very slow. (The folder that I want to read has ~20000 files located recursively in it.)
Could you guys please suggest alternatives to my approach that I could use to improve the performance?
Too difficult a question for “beginning”, so I shall move you to our performance forum.
What does “slow” mean? How long does it take to read all the files? How are you timing it? Is it listing the files that is slow, or displaying the names? How do you manage to get as few as 20000 files in C:? I would have thought you would find more like 2000000 searching recursively. If you are using an Apache library, you are probably beyond the stage of needing the Java™ Tutorials, but I shall give you the link anyway.
I am basically trying to store the absolute path of each file in a db table. I am timing it using simple System.currentTimeMillis() calls before and after the method. What is slow is accessing the files in the file system: I mean the loop that gets the file details recursively and puts them in a List (which I later save in a db).
C: was only an example; the actual scenario is a folder with lots of subfolders that contain around 20K files. I traverse this base folder and all its subfolders and keep adding the absolute file paths to the List. This traversal of the base directory consumes a lot of time.
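For comparison, here is a minimal sketch of the same traversal written against java.nio.file.Files.walkFileTree, which is one commonly suggested alternative. It assumes Java 7 or later; the class name FileLister and the method name listFiles are made up for illustration and are not part of any library. The visitor receives each file's basic attributes along with its path, so it may avoid a separate attribute lookup per file, but whether it is actually faster on this folder would have to be measured.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named for illustration only.
final class FileLister {

    // Walks the tree under root and returns the absolute path of every regular file.
    static List<String> listFiles(Path root) throws IOException {
        final List<String> paths = new ArrayList<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                // The attributes are passed in by the walker, so no extra lookup is made here.
                if (attrs.isRegularFile()) {
                    paths.add(file.toAbsolutePath().toString());
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Skip entries that cannot be read instead of aborting the whole walk.
                return FileVisitResult.CONTINUE;
            }
        });
        return paths;
    }
}
```

Calling FileLister.listFiles(Paths.get("...")) with the base folder returns the List of absolute paths that is later saved to the db. Unreadable entries are silently skipped in this sketch; logging them may be preferable in a real utility.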
How long does it take? How much of that is traversing the file structure, how much storing the names of the files and how much inserting the data into the database? Is the database on your own computer or at a distant location?
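One way to answer these questions is to time each phase on its own rather than the whole method. The following is a minimal sketch, assuming Java 8 or later for the lambda syntax shown in the usage note; PhaseTimer and timed are hypothetical names chosen for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class, named for illustration only.
final class PhaseTimer {

    // Runs one phase, prints how long it took, and returns the phase's result.
    static <T> T timed(String label, Callable<T> phase) throws Exception {
        long start = System.nanoTime();
        T result = phase.call();
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println(label + " took " + elapsedMs + " ms");
        return result;
    }
}
```

Usage would look something like `List<String> paths = PhaseTimer.timed("traversal", () -> collectPaths(root));` followed by `PhaseTimer.timed("db insert", () -> saveToDatabase(paths));`, where collectPaths and saveToDatabase are placeholders for the existing traversal and database code. A single run can be misleading because the operating system caches directory data, so the first traversal of a folder is often slower than later ones.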