Copy-on-write filesystems have the nice property that it is possible (not to say easy) to “clone” a file in \(\mathcal{O}(1)\) time, as opposed to the classical \(\mathcal{O}(n)\) copy. The new file refers to the old blocks, and only blocks that are later changed get copied. This differs from a hard link, where a write changes the original blocks for both names. Cloning saves both time and space, and can be very beneficial in a lot of situations.

Since Linux standardised the FICLONE ioctl call, reflink cloning of files has become filesystem agnostic. Filesystems that currently support it include btrfs, XFS, and OCFS2.
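To make the mechanism concrete, here is a minimal sketch of issuing FICLONE directly from Python's standard library, without any helper package. The request number 0x40049409 is an assumption: it is what the kernel's _IOW(0x94, 9, int) encoding yields on x86 and ARM, and it can differ on other architectures. The function name clone_or_copy and its fallback to shutil.copyfile are hypothetical, chosen only for illustration.

```python
import fcntl
import shutil

# Assumption: value of FICLONE, i.e. _IOW(0x94, 9, int), on x86 and ARM.
FICLONE = 0x40049409


def clone_or_copy(src_path, dst_path):
    """Try to reflink src_path to dst_path; fall back to a plain copy.

    Returns True if the file was cloned, False if it was copied.
    """
    try:
        with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
            # The ioctl is issued on the destination; the argument is the
            # source file descriptor.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
        return True
    except OSError:
        # Raised, for example, when the filesystem has no reflink support
        # or when source and destination are on different filesystems.
        shutil.copyfile(src_path, dst_path)
        return False
```

Calling clone_or_copy("large_file.img", "copy_of_file.img") should return True on a reflink-capable filesystem and False elsewhere. Catching every OSError keeps the sketch short; real code would probably want to inspect errno before deciding to fall back.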

The (currently) sad part of the story is language support: Python's shutil.copyfile does not (yet) make use of this new call [1].

Hereby, I introduce a reflink library for Python, which uses cffi to make the call:

from reflink import reflink
reflink("large_file.img", "copy_of_file.img")

I hope someone finds this useful, and that it may gain ReFS and APFS support one day.

In the future, I would also like to incorporate the FICLONERANGE call, which “clones” a byte range of a file instead of the whole file.
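As a rough idea of what range cloning involves, the sketch below packs the kernel's struct file_clone_range (a signed 64-bit source descriptor followed by three unsigned 64-bit fields: source offset, length, destination offset) and passes it to the ioctl. The request number 0x4020940D is an assumption, derived from _IOW(0x94, 13, struct file_clone_range) on x86 and ARM. The helper names pack_clone_range and clone_range are hypothetical and are not part of the reflink library.

```python
import fcntl
import struct

# Assumption: value of FICLONERANGE on x86 and ARM.
FICLONERANGE = 0x4020940D


def pack_clone_range(src_fd, src_offset, src_length, dest_offset):
    """Pack struct file_clone_range: one s64 followed by three u64 fields."""
    return struct.pack("=qQQQ", src_fd, src_offset, src_length, dest_offset)


def clone_range(src_path, dst_path, src_offset, length, dest_offset):
    """Clone `length` bytes of src_path into an existing dst_path.

    Raises OSError if the filesystem cannot share the requested range.
    """
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        arg = pack_clone_range(src.fileno(), src_offset, length, dest_offset)
        fcntl.ioctl(dst.fileno(), FICLONERANGE, arg)
```

Filesystems generally expect the offsets and the length to be aligned to their block size, so a call such as clone_range("large_file.img", "other.img", 0, 4096, 0) is more likely to succeed than one with arbitrary byte positions.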

  1. Please note that Python is unlikely to change shutil.copyfile to actually use FICLONE, as this would change the semantics of the call. One might be writing an application that is supposed to make a backup (against corruption) of a file, while the call would not actually produce a physically independent copy: the clone shares its blocks with the original, so damage to those blocks affects both.