
File copy: Java Program or OS Command, which is faster?

 
Sivaraman Lakshmanan
Ranch Hand
Posts: 231
Hi all,
I am trying to copy a 100 MB file. I am free to choose the method: I can use the OS "copy" command or write a Java program to copy the file. As a Java programmer I wanted to try it in Java, so I wrote a small file copy program that uses buffered streams. Copying a 10 MB test file took 2 minutes and 28 seconds, while the OS copy command took 1 minute.
So does this mean that file operations done in Java are slower than the OS copy, or is there a more efficient way to write a copy program in Java?

Thanks in Advance

Regards,
Sivaraman.L
 
Ilja Preuss
author
Sheriff
Posts: 14112
There is a more efficient way, using the nio package: http://javaalmanac.com/egs/java.nio/File2File.html

Notice that the exception handling is kind of sloppy and not something you should adopt...
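
For readers who cannot reach the linked example, here is a minimal sketch of a channel-based copy. It is not the code from that page; the class and method names are made up for illustration, and it assumes Java 7 or later so that try-with-resources can close the streams even when the copy fails.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

// Hypothetical helper, not taken from the linked article.
final class ChannelCopy {
    // Copies src to dst with FileChannel.transferTo, which lets the JVM
    // delegate the transfer to the operating system where that is supported.
    static void copy(String src, String dst) throws IOException {
        // try-with-resources (Java 7+) closes everything even on failure.
        try (FileInputStream in = new FileInputStream(src);
             FileOutputStream out = new FileOutputStream(dst);
             FileChannel source = in.getChannel();
             FileChannel target = out.getChannel()) {
            long size = source.size();
            long position = 0;
            // transferTo may move fewer bytes than requested, so loop.
            while (position < size) {
                position += source.transferTo(position, size - position, target);
            }
        }
    }
}
```

A call such as ChannelCopy.copy("in.dat", "out.dat") either completes or throws an IOException for the caller to handle.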
 
Vlado Zajac
Ranch Hand
Posts: 245
Buffered streams are unnecessary - they need two buffers (one for input and one for output).

What you can do without nio:
Create a buffer (a byte[] of size 1<<16 (=64KB) or so) and read/write the whole buffer at once.
The optimal buffer size cannot be determined without measuring the speed with different sizes (64KB is just my guess).

Using nio should still be faster.
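
The single-buffer approach described above might look like the following sketch. The class name and the bufferSize parameter are illustrative, and try-with-resources assumes Java 7 or later; on an older JVM the streams would be closed in a finally block instead.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper illustrating a copy through one shared byte[] buffer.
final class BufferCopy {
    // Copies src to dst and returns the number of bytes copied.
    static long copy(String src, String dst, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try (InputStream in = new FileInputStream(src);
             OutputStream out = new FileOutputStream(dst)) {
            int n;
            // read() fills as much of the buffer as it can; -1 means end of file.
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }
}
```

To compare buffer sizes, one could call BufferCopy.copy(src, dst, 1 << 16) with several different sizes and time each run with System.nanoTime().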


[ November 29, 2005: Message edited by: Vlado Zajac ]
 
Sivaraman Lakshmanan
Ranch Hand
Posts: 231
Hi all,
Thanks for all your replies. As you suggested, I used FileChannel from the java.nio.channels package and measured the time to copy a file. It is much faster than the normal IO file copy, but even so FileChannel is slower than the OS "COPY" command. Does this mean that copying a file from a programming language will always be slower than the OS way?
Please clarify

Regards,
Sivaraman.L
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
I would expect the OS way to always be faster since the OS can use its internal file buffers directly. Any program is probably going to be working with indirect access to OS file buffers.
Bill
 
Rajagopal Manohar
Ranch Hand
Posts: 183
Originally posted by Sivaraman Lakshmanan:
Hi all,
Thanks for all your replies. As you suggested, I used FileChannel from the java.nio.channels package and measured the time to copy a file. It is much faster than the normal IO file copy, but even so FileChannel is slower than the OS "COPY" command. Does this mean that copying a file from a programming language will always be slower than the OS way?
Please clarify

Regards,
Sivaraman.L


I guess the OS copy is the way to go. My reasons:

All this buffering and nio will help you only when you have to copy one huge file, and even then only to an extent.

The moment you have to copy 100 MB of data scattered over hundreds if not thousands of files, these tricks don't work and you are left to grapple with the crappy IO performance of Java.

So it's best to stick with OS calls when you can.

I am just venting my frustration.

-Rajagopal
 
Stefan Wagner
Ranch Hand
Posts: 1923
Linux Postgres Database Scala
But copying with Java should give much easier access to errors.
Imagine copying 1000 files with the OS and getting interrupted by an error while processing file 397 or 721, or whichever it was.
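
As a rough illustration of that point, a Java batch copy can record exactly which file failed and why, and then carry on with the rest. This is only a sketch: the class and method names are invented here, and it relies on java.nio.file.Files.copy, which assumes Java 7 or later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper showing per-file error reporting during a batch copy.
final class BatchCopy {
    // Copies each source file into targetDir. Returns the files that failed,
    // mapped to the exception that stopped them; the remaining files are
    // still attempted after a failure.
    static Map<Path, IOException> copyAll(List<Path> sources, Path targetDir) {
        Map<Path, IOException> failures = new LinkedHashMap<>();
        for (Path source : sources) {
            try {
                Files.copy(source, targetDir.resolve(source.getFileName()),
                           StandardCopyOption.REPLACE_EXISTING);
            } catch (IOException e) {
                // Remember which file failed instead of aborting the batch.
                failures.put(source, e);
            }
        }
        return failures;
    }
}
```

The returned map can then be logged or retried, which is harder to do when a single OS command aborts somewhere in the middle.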
 
Jesper de Jong
Java Cowboy
Sheriff
Posts: 15860
Android IntelliJ IDE Java Scala Spring
Originally posted by William Brogden:
I would expect the OS way to always be faster since the OS can use its internal file buffers directly. Any program is probably going to be working with indirect access to OS file buffers.
Bill

Java doesn't necessarily work with indirect buffers. You can also allocate direct buffers. Read the documentation of the NIO API (package java.nio):

Direct vs. non-direct buffers

A byte buffer is either direct or non-direct. Given a direct byte buffer, the Java virtual machine will make a best effort to perform native I/O operations directly upon it. That is, it will attempt to avoid copying the buffer's content to (or from) an intermediate buffer before (or after) each invocation of one of the underlying operating system's native I/O operations.
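
A copy loop that uses a direct buffer could be sketched as follows. The class name and the 64KB buffer size are assumptions for illustration, and FileChannel.open requires Java 7 or later; whether this beats a heap buffer on a given system would have to be measured.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper illustrating a copy through a direct ByteBuffer.
final class DirectBufferCopy {
    static void copy(Path src, Path dst) throws IOException {
        // allocateDirect asks for memory outside the Java heap, so the JVM
        // can try to avoid an extra intermediate copy on each read/write.
        ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 16);
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            while (in.read(buffer) != -1) {
                buffer.flip();                 // switch from filling to draining
                while (buffer.hasRemaining()) {
                    out.write(buffer);         // write may be partial, so loop
                }
                buffer.clear();                // ready for the next read
            }
        }
    }
}
```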
 
Rajagopal Manohar
Ranch Hand
Posts: 183
Originally posted by Stefan Wagner:
But copying with Java should give much easier access to errors.
Imagine copying 1000 files with the OS and getting interrupted by an error while processing file 397 or 721, or whichever it was.


I agree, but given a choice between this and 10 minutes less time to do that same task I will choose the OS call way (especially when the program is called to run often).

-Rajagopal
 
Roger Chung-Wee
Ranch Hand
Posts: 1683
Are you doing this as a short-term exercise or is it something which will be put into production? If it is the latter, then the file copies should be done programmatically or by scripting. I would write it in Java and run it with an Ant script.
 
Rajagopal Manohar
Ranch Hand
Posts: 183
Originally posted by Roger Chung-Wee:
Are you doing this as a short-term exercise or is it something which will be put into production? If it is the latter, then the file copies should be done programmatically or by scripting. I would write it in Java and run it with an Ant script.


Well, it is the latter. It has been in production for a month now and there have been no problems yet.

But I must admit we can be a little less panicky about exceptions because we are setting up the development environment for the developers (that's why we are moving gigs of data around, and time is money). Imagine hundreds of developers spending 2 hours every week to set their workspaces up.

If some unforeseen error occurs, no problem: it is not a show stopper. They can do it the old, albeit longer, way and report the error, and we'll fix it.

PS: I would not recommend Ant for copying anything beyond medium quantities of data, as we had no control over the buffer size the last time I looked at it.

Make sense?

-Rajagopal
 
Roger Chung-Wee
Ranch Hand
Posts: 1683
Can't the projects be checked into something like a CVS repository hosted on a server? The developers can then check out what project(s) they need.
 
Ilja Preuss
author
Sheriff
Posts: 14112
I agree with Roger - this sounds like it should be handled by a Version control system. (Though I would recommend SVN over CVS... )
 
Rajagopal Manohar
Ranch Hand
Posts: 183
Originally posted by Ilja Preuss:
I agree with Roger - this sounds like it should be handled by a Version control system. (Though I would recommend SVN over CVS... )


Well, it's a bit more complicated than that. Actually we do have a version control system (StarTeam) and we do interface with it to get a lot of files.

Apart from that, we need to get a lot of built dependencies that lie outside the workspace.

Also, there are n releases, for which we have an algorithm to get the latest dependencies, as some dependencies may not be part of that release (not to speak of the n component teams, each of which wants different projects and dependencies).

All config files have 2 versions, developer (Windows) and production (Unix), which need to be handled, etc.

Some files, like Test Servers, are not part of VC, and of course, to top it off, we have to build on an existing AutoBuild process.

To sum it all up, apart from the n files from StarTeam (VC), we also have to get a larger number of files from outside it.

-Rajagopal
 
Ilja Preuss
author
Sheriff
Posts: 14112
Frankly, this sounds like a nightmare to me, something that needs to be fixed at a more fundamental level. But of course I'm not there and could be totally wrong.

Having said that, I'd guess that using the OS command probably is the way to go.
 