Pathlib Basics
Python 3 provides a class called pathlib which allows users to work with paths as objects, with the operations like recursively listing directories or accessing items that you would expect out of the command line or Powershell. The official documentation is here. Eventually I want to understand it well because I think it’ll reduce the friction between what I want to do on a filesystem (point a function at a directory, access an image) and writing the code to do it.
To bootstrap my mental model I’m going to imagine a filesystem as a filing cabinet whose drawers contain folders.
Pure Paths
Pure Paths are the parent class of Path objects and “provide purely computational operations without I/O”. In other words, you can do operations like concatenate two Pure Paths or extract the components of a Pure Path, but it doesn’t contain any functionality to actually access the filesystem. I think of a Pure Path as the labels of the filing cabinet, e.g. ‘the bengal folder in the cat drawer’. What’s in there? Pure Paths have no idea.
In a word, Pure Path
s know everything about paths (roots, suffixes, parents) except what’s actually in them.
The abstraction of Pure Path
s come with several quality-of-life improvements compared to working with paths as unadorned strings. For example, it can differentiate between operating systems, since Window and Posix filesystems have different syntax despite sharing the same abstract file system structure. Depending on the operating system, creating a Pure Pat
h will output a different class: a PurePosixPath
object on Posix systems and a PureWindowsPath
on Windows systems. They can be compared, and are compatible with operations which take a os.PathLike
object (in fact it’s a parent class).
They’re instantiated, intuitively, by strings, lists of strings, or other Path objects, e.g.
PurePath('hello', 'world')
PurePath(Path('foo'), Path('bar'))
Paths
In practice you’ll use Concrete Paths which are Pure Paths which can do IO operations. You can ask it how many folders there are in the Cats
drawer, for example. Being filesystem-aware also means that, conversely, you can create Path
objects which depend on the specifics of your filesystem, e.g. the Path.home()
method which outputs a Path
object representing your home directory (the equivalent of ~
) or Path.cwd()
which does the intuitive thing.
Some useful methods:
Path.exists()
:Self → Bool
. Whether the path points to an existing file or directory. Similar arePath.is_dir()
andPath.is_file()
.Path.expanduser()
:Self → Path
. Expands~
constructions inside aPath
object).Path.iterdir(): Self → Iter[Path]
. Yields a generator containing Path objects of the directory contents.Path.mkdir(mode=0o777, parents=False, exist_ok=False): Self → Null
. Creates a directory forSelf
.Path.resolve(): Self → Path
. Make the path absolute, resolving any symlinks (most importantly.
and..
. )
Examples: List the directories in your current directory:
p = Path.getcwd()
[x for x in p.iterdir() if x.is_dir()]
Create a hello
directory inside your home directory:
p = Path('~', 'hello')
p.expanduser().mkdir()
Any basic operation you’re used to doing on the command line you can likely do with Path
objects.