Python: how to quickly copy files

To copy files using the shutil.copyfile () file, as compared to a regular bracket, right-click> right-click to paste using Windows File Explorer or Mac Finder at least 3 times. Is there a faster alternative to shutil.copyfile () in Python? What can be done to speed up the process of copying files? (The destination of the files is on the network drive ... if that matters ...).

EDITED LATER:

Here is what I got:

def copyWithSubprocess(cmd): proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE) win=mac=False if sys.platform.startswith("darwin"):mac=True elif sys.platform.startswith("win"):win=True cmd=None if mac: cmd=['cp', source, dest] elif win: cmd=['xcopy', source, dest, '/K/O/X'] if cmd: copyWithSubprocess(cmd) 
+6
source share
4 answers

The fastest version without re-optimizing the code I have with the following code:

 class CTError(Exception): def __init__(self, errors): self.errors = errors try: O_BINARY = os.O_BINARY except: O_BINARY = 0 READ_FLAGS = os.O_RDONLY | O_BINARY WRITE_FLAGS = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_BINARY BUFFER_SIZE = 128*1024 def copyfile(src, dst): try: fin = os.open(src, READ_FLAGS) stat = os.fstat(fin) fout = os.open(dst, WRITE_FLAGS, stat.st_mode) for x in iter(lambda: os.read(fin, BUFFER_SIZE), ""): os.write(fout, x) finally: try: os.close(fin) except: pass try: os.close(fout) except: pass def copytree(src, dst, symlinks=False, ignore=[]): names = os.listdir(src) if not os.path.exists(dst): os.makedirs(dst) errors = [] for name in names: if name in ignore: continue srcname = os.path.join(src, name) dstname = os.path.join(dst, name) try: if symlinks and os.path.islink(srcname): linkto = os.readlink(srcname) os.symlink(linkto, dstname) elif os.path.isdir(srcname): copytree(srcname, dstname, symlinks, ignore) else: copyfile(srcname, dstname) # XXX What about devices, sockets etc.? except (IOError, os.error), why: errors.append((srcname, dstname, str(why))) except CTError, err: errors.extend(err.errors) if errors: raise CTError(errors) 

This code is a bit slower than native linux "cp -rf".

Compared to shutil, the gain for local storage up to tmfps is about 2x-3x and about 6x for NFS for local storage.

After profiling, I noticed that shutil.copy does a lot of fstat syscals, which are pretty heavyweight. If you want to optimize further, I would suggest making one fstat for src and reusing the values. Honestly, I didn’t go any further, since I got almost the same numbers as my own linux copy tool, and optimization for a few milliseconds hundrends was not my goal.

+9
source

You can simply use the OS you are copying to for Windows:

 from subprocess import call call(["xcopy", "c:\\file.txt", "n:\\folder\\", "/K/O/X"]) 

/ K - copies attributes. Xcopy typically resets read-only attributes.

/ O - copies file ownership and ACL information.

/ X - copies file audit settings (implies / O).

+2
source
 import sys import subprocess def copyWithSubprocess(cmd): proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE) cmd=None if sys.platform.startswith("darwin"): cmd=['cp', source, dest] elif sys.platform.startswith("win"): cmd=['xcopy', source, dest, '/K/O/X'] if cmd: copyWithSubprocess(cmd) 
+1
source

This is just an assumption, but ... your timing is wrong ... that is, when you copy a file, it opens the file and reads all this in memory, so when you insert only create a file and unload the contents of your memory

in python

 copied_file = open("some_file").read() 

is equivalent to copy ctrl + c

then

 with open("new_file","wb") as f: f.write(copied_file) 

is equivalent to ctrl + v paste (so it's time for equiprobability ....)

if you want it to be more scalable for big data (but it won't be as fast as ctrl + v / ctrl + c

 with open(infile,"rb") as fin,open(outfile,"wb") as fout: fout.writelines(iter(fin.readline,'')) 
0
source

All Articles