
Deleting a folder or dir which contains more than 10K subfolders

 
Balakrishna Reddy
Greenhorn
Posts: 7
Eclipse IDE Java Tomcat Server
Hi,

I have to delete a folder (directory) that contains more than 10K subfolders. Can anyone please suggest a solution?

I think File.listFiles() is what takes so long (about 2 hours).

Is there any other way to list the files?
 
K Abhijit
Ranch Hand
Posts: 88
Would you please tell us more about the requirement?
Is it any 10K folders, or only those with a specific name or property?
 
Rob Spoor
Sheriff
Posts: 20819
68
Chrome Eclipse IDE Java Windows
I don't think that matters much.

Balakrishna, in Java 7, with java.nio.file.Path, you can get a DirectoryStream. This seems to be more stream-based -- you can read a few entries at a time instead of everything in one array. The bad news is that Java 7 isn't finished yet...
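A minimal sketch of the DirectoryStream approach mentioned above, assuming a Java 7 or later runtime. The class name DirectoryStreamSketch and the countSubdirectories helper are made up for illustration; Files.newDirectoryStream is the standard java.nio.file call. Entries are handed out one at a time while iterating, rather than being collected into one big array first.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class DirectoryStreamSketch {

    /** Counts the sub-directories directly under root, reading entries lazily. */
    static int countSubdirectories(Path root) throws IOException {
        int count = 0;
        // try-with-resources closes the stream (and its OS handle) when done
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(root)) {
            for (Path entry : entries) {
                if (Files.isDirectory(entry)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: pass the root folder as the first argument
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(countSubdirectories(root) + " sub-directories under " + root);
    }
}
```

The loop body is where an age check and delete would go; counting is used here only to keep the sketch harmless to run.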
 
Balakrishna Reddy
Greenhorn
Posts: 7
Eclipse IDE Java Tomcat Server
Hi,

Thank you so much for spending your valuable time on my question.

I am using Java 1.6, and my requirement is:

My application processes around 1000 files per minute, and these files are stored in folders laid out as root/date/xx/xx/files.xxx.
A scheduler runs every day; if a 'date' directory is older than 'X' days, it has to delete that directory.

This is what I am doing at the moment:

1. Get the list of all files from the root directory.
2. For each folder, check whether its creation time is older than X days.
3. If yes, delete it; otherwise leave it.
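The three steps above could look roughly like this with plain java.io.File, which is all Java 6 offers. This is only a sketch under assumptions: the class and method names are invented, and because java.io.File has no creation-time accessor, lastModified() is used as a stand-in for the "created" time.

```java
import java.io.File;

class OldFolderCleaner {

    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    /** Step 2: true if the folder was last modified more than maxAgeDays before nowMillis. */
    static boolean isOlderThan(File dir, int maxAgeDays, long nowMillis) {
        return nowMillis - dir.lastModified() > maxAgeDays * MILLIS_PER_DAY;
    }

    /** Step 3: deletes a file, or a directory and everything below it. */
    static boolean deleteRecursively(File f) {
        File[] children = f.listFiles(); // null when f is not a directory
        if (children != null) {
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        return f.delete();
    }

    /** Steps 1 to 3 combined; returns the number of folders removed. */
    static int purge(File root, int maxAgeDays) {
        File[] dateDirs = root.listFiles(); // step 1: the call reported as slow
        if (dateDirs == null) {
            return 0;
        }
        long now = System.currentTimeMillis();
        int removed = 0;
        for (File dir : dateDirs) {
            if (dir.isDirectory() && isOlderThan(dir, maxAgeDays, now)
                    && deleteRecursively(dir)) {
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: OldFolderCleaner <root> <maxAgeDays>");
            return;
        }
        int removed = purge(new File(args[0]), Integer.parseInt(args[1]));
        System.out.println("removed " + removed + " folder(s)");
    }
}
```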

Thanks & Regards
Balakrishna Reddy
 
Paul Clapham
Sheriff
Posts: 21876
36
Eclipse IDE Firefox Browser MySQL Database
Yes. And as Rob implied (but didn't say), it's a known problem that reading the list of files from a directory with a very large number of files is very slow in Java at the moment. Hence the new feature that Rob mentioned.
 
Balakrishna Reddy
Greenhorn
Posts: 7
Eclipse IDE Java Tomcat Server
I can't use Java 1.7. Is there any other solution for this?

What do you think about using Runtime to execute the DIR command? Does that make sense?
 
Rob Spoor
Sheriff
Posts: 20819
68
Chrome Eclipse IDE Java Windows
If you want to use Runtime (after reading Michael C. Daconta's article on JavaWorld), you can just run CMD /C RMDIR /S /Q followed by the folder path and be done with it.
The CMD /C part is required because RMDIR (like DIR) is a built-in command of the command prompt, not a tool on its own. CMD /C RMDIR launches the command prompt (CMD), then calls RMDIR in it (because of /C). The /S and /Q are flags for removing the contents as well (/S) and for suppressing confirmation (/Q).

Type in CMD /? and RMDIR /? in a command window for more information on these flags.
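A rough sketch of how that command might be launched from Java on Windows. Note the assumptions: it uses ProcessBuilder rather than Runtime.exec directly, because ProcessBuilder can merge stderr into stdout so that a single loop drains the output (undrained output is one of the classic ways an external process hangs). The class name, the buildCommand helper and the example path in the usage text are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;

class RmdirViaCmd {

    /** Builds the argument list; kept separate so it can be checked without running anything. */
    static String[] buildCommand(String folder) {
        return new String[] {"CMD", "/C", "RMDIR", "/S", "/Q", folder};
    }

    /** Runs the command (Windows only) and returns the process exit code. */
    static int removeFolder(String folder) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(buildCommand(folder));
        pb.redirectErrorStream(true); // merge stderr into stdout
        Process p = pb.start();
        InputStream out = p.getInputStream();
        try {
            byte[] buf = new byte[4096];
            while (out.read(buf) != -1) {
                // discard the output; reading it keeps the child from blocking
            }
        } finally {
            out.close();
        }
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: RmdirViaCmd <folder>   e.g. C:\\data\\root\\10012010");
            return;
        }
        System.out.println("exit code: " + removeFolder(args[0]));
    }
}
```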
 
Balakrishna Reddy
Greenhorn
Posts: 7
Eclipse IDE Java Tomcat Server
Thanks Rob,

but is this the only way to improve performance? Is there no other way using Java 1.6?
 
Rob Spoor
Sheriff
Posts: 20819
68
Chrome Eclipse IDE Java Windows
Java 6 only has one way for looping through the children of a folder, and that's using File (list, listFiles). Since that's not an option because of slowness you'll have to think outside the API.
 
Balakrishna Reddy
Greenhorn
Posts: 7
Eclipse IDE Java Tomcat Server
Hi Rob,

Will Lucene help me?
 
Rob Spoor
Sheriff
Posts: 20819
68
Chrome Eclipse IDE Java Windows
I couldn't tell you, I haven't used it yet.
 
K Abhijit
Ranch Hand
Posts: 88
Does it take the same time to read or stream a single directory or file?
If that is faster, we can work around the performance issue with a different algorithm.

This has a lot of limitations and hacks, though,
and it may not actually fit your problem...

Say today is 19th Oct and we need to delete all files up to the 9th (X = 10, i.e. Today - 10 days).

When the scheduler starts,
it reads the last deleted date from a log file. If there is none, it starts from Today-19 (say 1st Oct).
If the directories are created using a specific format (date/xx/xx/files.xxx) that the scheduler can construct, then it should construct the directory path and delete it directly. That is, the scheduler tries to fetch directory 10012010 (Today-19 = 1st Oct)/xx/xx/files.xxx and deletes it if it exists,
then tries the next date, and so on until Today-10.
Before exiting, it writes the last deleted date to the log file.
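The idea above can be sketched as follows: build the folder names from dates instead of listing the root. Assumptions are flagged in the comments: the MMddyyyy pattern is inferred from the "10012010" example, the class and method names are invented, and reading and writing the last deleted date in the log file is left out to keep it short.

```java
import java.io.File;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.List;
import java.util.Locale;

class DatePathPurger {

    // Assumed folder-name pattern, inferred from the "10012010" example (1st Oct 2010)
    static final String PATTERN = "MMddyyyy";

    /** Folder paths for each day from 'from' up to and including 'today minus keepDays'. */
    static List<File> candidates(File root, Calendar from, Calendar today, int keepDays) {
        Calendar last = (Calendar) today.clone();
        last.add(Calendar.DAY_OF_MONTH, -keepDays);
        SimpleDateFormat fmt = new SimpleDateFormat(PATTERN, Locale.US);
        List<File> result = new ArrayList<File>();
        Calendar day = (Calendar) from.clone();
        while (!day.after(last)) {
            result.add(new File(root, fmt.format(day.getTime())));
            day.add(Calendar.DAY_OF_MONTH, 1);
        }
        return result;
    }

    /** Deletes a file, or a directory and everything below it. */
    static boolean deleteRecursively(File f) {
        File[] children = f.listFiles(); // null when f is not a directory
        if (children != null) {
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        return f.delete();
    }

    /** Deletes every candidate that exists; the root itself is never listed. */
    static int purge(List<File> candidates) {
        int removed = 0;
        for (File dir : candidates) {
            if (dir.isDirectory() && deleteRecursively(dir)) {
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        File root = new File(args.length > 0 ? args[0] : ".");
        Calendar today = Calendar.getInstance();
        Calendar from = (Calendar) today.clone();
        from.add(Calendar.DAY_OF_MONTH, -19); // fallback start when no last deleted date is known
        // Dry run: only prints the paths that purge() would try to delete
        for (File dir : candidates(root, from, today, 10)) {
            System.out.println("would delete if present: " + dir);
        }
    }
}
```

Deleting each date folder still walks its contents, so this only avoids the expensive listing of the root.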

I'm not sure how much this would help you.
 
Balakrishna Reddy
Greenhorn
Posts: 7
Eclipse IDE Java Tomcat Server
I am able to check how many days old a folder is. The problem is that listing the parent directory takes a long time (since it has more than 10K folders), and so does deleting the folders, each of which contains more than 20 MB of data.
 
Shaaf Shah
Greenhorn
Posts: 2

Provided the creation process is also under your control:

Why wouldn't you be able to create an index file (you could look at Lucene as well, but then you have to ask Lucene to do the indexing too)
that has an entry deleted from it when a directory is deleted,
and an entry added to it when a directory is created?

This way you don't have to traverse anything; you just get the date folders listed in the index file.
Thinking out loud, you could perhaps use a regular expression to go through the index file, which could make it faster.

And since you have X days in the equation, you could simply delete anything older than that.
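To make the index-file idea a bit more concrete, here is a very small sketch, assuming a plain text file with one folder name per line. Everything in it (the FolderIndex class, its methods, the file layout) is invented for illustration, and it is not safe for several processes writing at the same time.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.List;

class FolderIndex {

    private final File indexFile;

    FolderIndex(File indexFile) {
        this.indexFile = indexFile;
    }

    /** Called by whatever creates a folder: appends its name to the index. */
    void add(String folderName) throws IOException {
        PrintWriter out = new PrintWriter(new FileWriter(indexFile, true));
        try {
            out.println(folderName);
        } finally {
            out.close();
        }
    }

    /** Reads all indexed folder names; no directory listing is involved. */
    List<String> readAll() throws IOException {
        List<String> names = new ArrayList<String>();
        if (!indexFile.exists()) {
            return names;
        }
        BufferedReader in = new BufferedReader(new FileReader(indexFile));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                String name = line.trim();
                if (name.length() > 0) {
                    names.add(name);
                }
            }
        } finally {
            in.close();
        }
        return names;
    }

    /** Called after a folder has been deleted: rewrites the index without that entry. */
    void remove(String folderName) throws IOException {
        List<String> names = readAll();
        names.remove(folderName);
        PrintWriter out = new PrintWriter(new FileWriter(indexFile, false));
        try {
            for (String name : names) {
                out.println(name);
            }
        } finally {
            out.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage with a throwaway index file
        FolderIndex index = new FolderIndex(File.createTempFile("folder-index", ".txt"));
        index.add("10012010");
        index.add("10022010");
        index.remove("10012010");
        System.out.println(index.readAll());
    }
}
```

The scheduler would read the index, pick the entries older than X days, delete those folders and then remove their entries.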
 