Warning: copying very large files (larger than available memory) on Windows will bring your computer to a screeching halt.
Let’s say you have a 60 gigabyte file on Computer A that you wish to copy to Computer B. Both Computer A and Computer B have 16 gigabytes of memory. Assuming that you have the network and file sharing permissions set correctly, you can issue this command on Computer B:
copy /v \\ComputerA\Data\Filename.bin C:\Data\Filename.bin
As you would expect, that command reaches across the network and begins copying the file from Computer A to the local drive on Computer B.
What you don’t expect is for the command to bring Computer A, and possibly Computer B, to a screeching halt. It takes a while, but after 20 or 30 gigabytes of the file have been copied, Computer A stops responding. It doesn’t gradually get slower. No, at some point it just stops responding to keyboard and mouse input. Every program starts running as though you’re emulating a Pentium on a 2 MHz 6502, using a cassette tape as virtual memory.
Why does this happen? I’m so glad you asked. It happens because Windows is caching the reads. It’s reading ahead, copying data from the disk into memory as fast as it can, and then dribbling it out across the network as needed. When the cache has consumed all unused memory, it starts chewing on memory that’s used by other programs, somehow forcing the operating system to page executing code and active data out to virtual memory in favor of the cache. Then, the system starts thrashing: swapping things in and out of virtual memory.
It’s a well-known problem with Windows. As I understand it, it comes from the way that the COPY and XCOPY commands (as well as the copy operation in Windows Explorer) are implemented. Those commands use the CopyFile or CopyFileEx API functions, which “take advantage” of disk caching. The suggested workaround is to use a program that creates an empty file and then calls the ReadFile and WriteFile functions to read and write the file in smaller blocks.
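For what it’s worth, here’s a minimal sketch in C of what that workaround might look like. This is my own illustration, not the code the article suggests: the 4 MB block size, the FILE_FLAG_SEQUENTIAL_SCAN hint (which only asks the cache manager to be polite), and the bare-bones error handling are all assumptions. Tools like ESEUTIL reportedly go further and open the files with FILE_FLAG_NO_BUFFERING, which bypasses the cache entirely but requires sector-aligned buffers and transfer sizes, so I’ve left that out here.

    /*
     * chunked_copy.c -- sketch of the "read and write in smaller blocks"
     * workaround: copy a file by hand with ReadFile/WriteFile instead of
     * calling CopyFile/CopyFileEx. Block size and flags are illustrative.
     * Build with a Windows C compiler, e.g.  cl chunked_copy.c
     */
    #include <windows.h>
    #include <stdio.h>

    #define BLOCK_SIZE (4 * 1024 * 1024)   /* 4 MB per read/write -- an assumption */

    static int copy_in_blocks(const wchar_t *src, const wchar_t *dst)
    {
        HANDLE in, out;
        BYTE *buf;
        DWORD got, put;
        int ok = 0;

        /* FILE_FLAG_SEQUENTIAL_SCAN merely hints that we read front to back;
           FILE_FLAG_NO_BUFFERING would avoid the cache but needs aligned I/O. */
        in = CreateFileW(src, GENERIC_READ, FILE_SHARE_READ, NULL,
                         OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
        if (in == INVALID_HANDLE_VALUE)
            return 0;

        out = CreateFileW(dst, GENERIC_WRITE, 0, NULL,
                          CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
        if (out == INVALID_HANDLE_VALUE) {
            CloseHandle(in);
            return 0;
        }

        buf = (BYTE *)VirtualAlloc(NULL, BLOCK_SIZE, MEM_COMMIT, PAGE_READWRITE);
        if (buf != NULL) {
            ok = 1;
            /* Read a block, write a block, until ReadFile reports end of file. */
            for (;;) {
                if (!ReadFile(in, buf, BLOCK_SIZE, &got, NULL)) { ok = 0; break; }
                if (got == 0)
                    break;          /* end of file */
                if (!WriteFile(out, buf, got, &put, NULL) || put != got) {
                    ok = 0;
                    break;
                }
            }
            VirtualFree(buf, 0, MEM_RELEASE);
        } else {
            ok = 0;
        }

        CloseHandle(out);
        CloseHandle(in);
        return ok;
    }

    int wmain(int argc, wchar_t **argv)
    {
        if (argc != 3) {
            fwprintf(stderr, L"usage: chunked_copy <source> <destination>\n");
            return 1;
        }
        return copy_in_blocks(argv[1], argv[2]) ? 0 : 2;
    }

You’d run it much like COPY itself, e.g. chunked_copy \\ComputerA\Data\Filename.bin C:\Data\Filename.bin. The point isn’t that this is good code; it’s that you apparently have to write it yourself to keep a plain file copy from eating the machine.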
That’s idiotic. There may be very good reasons to prefer CopyFileEx over ReadFile/WriteFile, but whatever advantages that function has are completely negated if using it causes Windows to cache stupidly and prevent other programs from running. It seems to me that either CopyFileEx should be made a little smarter about caching, or COPY, XCOPY, and whatever other parts of Windows use it should be rewritten. There is no excuse for a file copy to consume all memory in the system.
I find it interesting that the TechNet article I linked above recommends using a different program (ESEUTIL, which apparently is part of Exchange) to copy large files.
This problem has been known for a very long time. Can anybody give me a good reason why it hasn’t been addressed? Is there some benefit to having the system commands implemented this way?
Update, October 16
It might be that Microsoft doesn’t consider this a high priority. In my opinion, it should be given the highest possible priority because it enables what is in effect a denial-of-service attack. Copying a large file across the network will cause the source machine to become unresponsive. As bad as that is for desktop machines, it’s much worse for servers. Imagine finding that your Web site is unresponsive because you decided to copy a large transaction file from the server.