from pathlib import Path, PureWindowsPath
import itertools
path = Path(".").resolve()
path
PosixPath('/workdir/archives/blog/en/content/2020/10/python-pathlib.Path')

No Trailing Slashes

A path object always drops trailing slashes. Path can also be used to manipulate URLs, which is convenient, but note that it collapses repeated slashes as well, so the // after the scheme becomes a single /.

Path("https://github.com/dclong/dsutil//")
PosixPath('https:/github.com/dclong/dsutil')
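Because repeated slashes are collapsed, passing a full URL through Path mangles the // after the scheme, as the output above shows. A minimal sketch of one way around this, assuming only the standard library: split the URL with urllib.parse and apply PurePosixPath to the path component alone. The helper name url_parent is hypothetical and not part of pathlib.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit, urlunsplit


def url_parent(url: str) -> str:
    """Return the URL with its last path segment removed (hypothetical helper)."""
    parts = urlsplit(url)
    # Only the path component goes through pathlib; scheme and host are untouched.
    parent = PurePosixPath(parts.path).parent
    return urlunsplit(parts._replace(path=str(parent)))


print(url_parent("https://github.com/dclong/dsutil//"))
```

With the URL from the example above, this prints https://github.com/dclong and keeps the scheme intact.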

Path.absolute

Generally speaking, Path.resolve is preferred over Path.absolute, because resolve also collapses .. components and follows symbolic links.

path.absolute()
PosixPath('/app/archives/blog/misc/content')
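The difference between the two methods is easiest to see on a path that contains a .. component. The snippet below is a small sketch; the directory names are made up and do not need to exist.

```python
from pathlib import Path

p = Path("some_dir/../other_dir")  # hypothetical relative path

# absolute() only prepends the current working directory.
print(p.absolute())
# resolve() additionally collapses ".." and follows symlinks.
print(p.resolve())
```

The first result still contains the .. component, while the second one ends in other_dir directly under the current working directory.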

Path.anchor

path.anchor
'/'

Path.as_posix

path.as_posix()
'/app/archives/blog/misc/content'

Path.as_uri

path.as_uri()
'file:///app/archives/blog/misc/content'

Path.chmod(mode)

Unlike the mode option of Path.mkdir, the mode passed to Path.chmod is the final mode of the file. It is not affected by the current umask.

help(path.chmod)
Help on method chmod in module pathlib:

chmod(mode) method of pathlib.PosixPath instance
    Change the permissions of the path, like os.chmod().

Path.cwd

Path.cwd is a class method that returns the current working directory.

Path.cwd()
PosixPath('/workdir/archives/docker')

Path.drive

path.drive
''

Path.exists

path.exists()
True

Path.expanduser

Path.home() is preferred to Path('~').expanduser().

Path("~").expanduser()
PosixPath('/root')
Path("~/archives").expanduser()
PosixPath('/root/archives')

Path.glob

  1. pathlib.Path.glob returns a generator (rather than a list).

Find all Jupyter/Lab notebooks in the current directory.

path = Path()
path.glob(pattern="*.ipynb")
<generator object Path.glob at 0x123918c10>

Find all CSS files in the current directory.

list(path.glob(pattern="*.css"))
[]

Find all Jupyter/Lab notebook files under the current directory and its subdirectories.

nbs = Path().glob("**/*.ipynb")
len(list(nbs))
402
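Since glob returns a generator, its result can be consumed only once. The sketch below illustrates this on a throwaway directory (the file names are made up) and also shows Path.rglob, which behaves like glob with "**/" prepended to the pattern.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    for name in ("a.ipynb", "sub/b.ipynb", "c.css"):
        (root / name).touch()

    matches = root.glob("**/*.ipynb")
    first_pass = sorted(p.name for p in matches)
    second_pass = list(matches)  # the generator is already exhausted

    # rglob("*.ipynb") is shorthand for glob("**/*.ipynb").
    recursive = sorted(p.name for p in root.rglob("*.ipynb"))

print(first_pass, second_pass, recursive)
```

If the matches are needed more than once, wrap the call in list() right away.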

Path.home

Both Path.home() (preferred) and Path('~').expanduser() return a new Path object representing the user’s home directory, the same location as os.path.expanduser('~'). However, Path('~').resolve() does not expand the tilde: it treats ~ as a literal file name relative to the current directory.

Path.home()
PosixPath('/root')
Path("~").expanduser()
PosixPath('/root')
Path("~").resolve()
PosixPath('/app/archives/blog/misc/content/~')
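The three spellings can be compared directly. This is a small sketch that assumes the home directory can be determined in the current environment (for example, HOME is set on Linux).

```python
import os
from pathlib import Path

home = Path.home()
same_via_expanduser = Path("~").expanduser() == home
same_via_os = Path(os.path.expanduser("~")) == home

# resolve() treats "~" as an ordinary file name relative to the current directory.
literal_tilde = Path("~").resolve()

print(same_via_expanduser, same_via_os, literal_tilde.name)
```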

Path.iterdir

Iterate over the contents of the directory.

The code below shows the first 5 files/folders under path.

[p for p in itertools.islice(path.iterdir(), 5)]
[PosixPath('2018-07-21-conda-build-issue.markdown'), PosixPath('2018-10-29-monitoring-and-alerting-tools.markdown'), PosixPath('2019-02-10-unit-testing-debugging-python.markdown'), PosixPath('2017-01-15-chinese-locale.markdown'), PosixPath('2012-05-17-java-difference-abstract-interface.markdown')]

Path.mkdir(mode=0o777, parents=False, exist_ok=False)

  1. The option parents=True creates missing parent directories as needed. The option exist_ok=True suppresses the FileExistsError that is otherwise raised when the directory already exists. path.mkdir(parents=True, exist_ok=True) is equivalent to the shell command mkdir -p path.

  2. By default, the mode option has the value 0o777. However, this doesn’t mean that a created directory will have the permission 777 by default. The mode option is combined with the process umask to decide the permission of the created directory. To give a created directory the permission 777, you can set the umask to 0 first.

     :::python
     import os
     mask = os.umask(0)  # clear the umask; returns the previous value
     try:
         Path("/opt/spark/warehouse").mkdir(parents=True, exist_ok=True)
     finally:
         os.umask(mask)  # restore the previous umask

    Another way is to set the permission explicitly with Path.chmod (which is not affected by the current umask) after creating the directory.
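The interaction between mode, umask, and Path.chmod can be checked on a temporary directory. This is a minimal sketch for POSIX systems; the directory name warehouse and the umask value 0o022 are only assumptions for illustration.

```python
import os
import stat
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "warehouse"  # hypothetical directory name

    old_mask = os.umask(0o022)  # assume a common default umask
    try:
        target.mkdir(mode=0o777)  # requested mode is filtered by the umask
    finally:
        os.umask(old_mask)  # always restore the previous umask
    after_mkdir = stat.S_IMODE(target.stat().st_mode) & 0o777

    target.chmod(0o777)  # chmod ignores the umask
    after_chmod = stat.S_IMODE(target.stat().st_mode) & 0o777

print(oct(after_mkdir), oct(after_chmod))
```

On a typical Linux system this should print 0o755 followed by 0o777: mkdir applies the umask to the requested mode, while chmod sets the mode exactly as given.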

Path.name

path.name
'content'
Path("/root/abc.txt").name
'abc.txt'

Path.parent

path.parent
PosixPath('/app/archives/blog/misc')

Notice that the parent of a root directory (/, C:, etc.) is the root itself.

path = Path("/")
path
PosixPath('/')
path.parent
PosixPath('/')
path.parent is path
True
PureWindowsPath("C:").parent
PureWindowsPath('C:')

Path.parts

path.parts
('/', 'app', 'archives', 'blog', 'misc', 'content')

Path.resolve

path.resolve()
PosixPath('/app/archives/blog/misc/content')

Path.relative_to

Path("/app/archives/blog").relative_to("/app")
PosixPath('archives/blog')
Path("/app/archives/blog").relative_to(Path("/app"))
PosixPath('archives/blog')
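Path.relative_to raises a ValueError when the path is not located under the given base. The sketch below uses PurePosixPath so that it behaves the same on every OS; the /opt/spark path is just an illustrative value.

```python
from pathlib import PurePosixPath

base = PurePosixPath("/app")
inside = PurePosixPath("/app/archives/blog")
outside = PurePosixPath("/opt/spark")

print(inside.relative_to(base))

try:
    outside.relative_to(base)
except ValueError as err:
    # Raised because /opt/spark is not located under /app.
    print(f"ValueError: {err}")

# is_relative_to (Python 3.9+) is a non-raising check.
print(inside.is_relative_to(base), outside.is_relative_to(base))
```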

Path.rename(target)

On Windows, if target exists, FileExistsError will be raised. The behavior of Path.rename on Linux is as below (assume the user has permissions):

  • If target is an existing file, it is overwritten silently.

  • If target is an existing empty directory, it is overwritten silently.

  • If target is an existing non-empty directory, an OSError (Errno 39) is raised.

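If you want to move a directory onto an existing target regardless, you can call the function shutil.copytree(src, dst, dirs_exist_ok=True) (Python 3.8+) and then remove the source directory. Note that this merges src into dst: files with the same relative path are overwritten, while other files already in dst are kept.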

!rm -rf test1 && mkdir -p test1 && touch test1/1.txt && ls test1/
1.txt
!rm -rf test2 && mkdir -p test2 && touch test2/2.txt && ls test2/
2.txt
Path("test1").rename("test2")
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[13], line 1
----> 1 Path("test1").rename("test2")

File /usr/lib/python3.10/pathlib.py:1234, in Path.rename(self, target)
   1224 def rename(self, target):
   1225     """
   1226     Rename this path to the target path.
   1227 
   (...)
   1232     Returns the new Path instance pointing to the target path.
   1233     """
-> 1234     self._accessor.rename(self, target)
   1235     return self.__class__(target)

OSError: [Errno 39] Directory not empty: 'test1' -> 'test2'
!ls test2/
2.txt
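The copytree-based workaround mentioned above can be sketched as follows. The helper name merge_move is hypothetical, and the sketch merges the source into the target rather than replacing the target wholesale.

```python
import shutil
import tempfile
from pathlib import Path


def merge_move(src: Path, dst: Path) -> None:
    """Copy src into dst (merging with existing content), then delete src."""
    # dirs_exist_ok requires Python 3.8+.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    shutil.rmtree(src)


with tempfile.TemporaryDirectory() as tmp:
    test1, test2 = Path(tmp, "test1"), Path(tmp, "test2")
    test1.mkdir()
    test2.mkdir()
    (test1 / "1.txt").touch()
    (test2 / "2.txt").touch()

    merge_move(test1, test2)
    remaining = sorted(p.name for p in test2.iterdir())
    src_exists = test1.exists()

print(remaining, src_exists)
```

After the call, test2 contains both 1.txt and 2.txt and test1 no longer exists. If the old content of the target must disappear entirely, remove the target first (for example with shutil.rmtree) and then use Path.rename.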

Path.stem

path.stem
'content'
Path("/root/abc.txt").stem
'abc'

Path.symlink_to(target, target_is_directory=False)

Make this path a symbolic link to target. Under Windows, target_is_directory must be True (default False) if the link’s target is a directory. Under POSIX, the value of target_is_directory is ignored. It is suggested that you always set target_is_directory to True (regardless of OS) if the link’s target is a directory.

Notice that a FileExistsError is raised if the current path already exists. In that case, first remove it with Path.unlink and then create the symbolic link again with Path.symlink_to.

import tempfile

Path("/tmp/_12345").symlink_to(path, target_is_directory=True)
!ls /tmp/_12345 | head -n 5
2009-11-01-format-data-in-sas.markdown
2009-11-01-general-tips-for-sas.markdown
2009-11-01-macro-in-sas.markdown
2010-11-20-clustering-in-r.markdown
2010-11-20-general-tips-for-latex.markdown
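The unlink-then-link pattern described above can be sketched like this. The names data and current are hypothetical, and the example assumes a platform where the user is allowed to create symbolic links.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp, "data")   # hypothetical directory to link to
    link = Path(tmp, "current")  # hypothetical link location
    target.mkdir()

    for _ in range(2):
        # Without the unlink, the second iteration would raise FileExistsError.
        link.unlink(missing_ok=True)  # missing_ok requires Python 3.8+
        link.symlink_to(target, target_is_directory=True)

    is_link = link.is_symlink()
    points_to_target = link.resolve() == target.resolve()

print(is_link, points_to_target)
```

Path.unlink removes only the link itself, never the directory it points to.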

Path.__str__

path.__str__()
'/app/archives/blog/misc/content'
str(path)
'/app/archives/blog/misc/content'

Path.with_name

Return a new path with the name changed. If the original path doesn’t have a name, ValueError is raised.

path = Path("/root/abc.txt")
path
PosixPath('/root/abc.txt')
path.with_name("ABC.txt")
PosixPath('/root/ABC.txt')
path.with_name(path.name.replace("abc", "ABC"))
PosixPath('/root/ABC.txt')

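Another way is to manipulate the underlying string of the path directly. Note that this yields a str rather than a Path and replaces every occurrence of the substring, including any in parent directories.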

str(path).replace("abc", "ABC")
'/root/ABC.txt'
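The two approaches differ when the substring also occurs in a parent directory. A small sketch, using a made-up path and PurePosixPath so that no file system access is needed:

```python
from pathlib import PurePosixPath

nested = PurePosixPath("/abc/abc.txt")  # hypothetical path

# str.replace touches every match, including the parent directory.
via_str = str(nested).replace("abc", "ABC")
# with_name only changes the final component and returns a path object.
via_with_name = nested.with_name(nested.name.replace("abc", "ABC"))
print(via_str, via_with_name)

try:
    PurePosixPath("/").with_name("ABC.txt")
except ValueError as err:
    # The root has an empty name, so there is nothing to replace.
    print(f"ValueError: {err}")
```

Here the string version yields /ABC/ABC.txt, whereas with_name yields /abc/ABC.txt.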

Path.with_suffix

Return a new path with the suffix changed. If the original path doesn’t have a suffix, the new suffix is appended instead. If the suffix is an empty string, the original suffix is removed.

path = Path("/root/abc.txt")
path
PosixPath('/root/abc.txt')

Change the file extension to .pdf.

path.with_suffix(".pdf")
PosixPath('/root/abc.pdf')

Remove the file extension.

path.with_suffix("")
PosixPath('/root/abc')
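For file names with more than one extension, with_suffix only replaces the last one. The sketch below uses a made-up archive name and also shows that the new suffix must start with a dot.

```python
from pathlib import PurePosixPath

archive = PurePosixPath("/root/archive.tar.gz")  # hypothetical file name

print(archive.suffix)    # the last suffix only
print(archive.suffixes)  # every suffix, in order
replaced = archive.with_suffix(".zip")  # only ".gz" is replaced
print(replaced)

try:
    archive.with_suffix("zip")  # missing leading dot
except ValueError as err:
    print(f"ValueError: {err}")
```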

Examples of Using pathlib.Path

Rename files in the current directory.

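# Snapshot the listing first so the renames don't disturb the iteration.
for path in list(Path(".").iterdir()):
    if path.suffix == ".txt":
        # with_name keeps the renamed file in its original directory.
        path.rename(path.with_name(path.name.replace("1m", "100k")))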