
Iterate All Descendant Files in a Directory in Python

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Using pathlib.Path.glob

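This is the simplest way to iterate over all descendants (both files and directories) of a directory in Python. The recursive pattern "**/*" matches every path below the starting directory; filter the results with is_dir() or is_file() to keep only directories or only files.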

from pathlib import Path
paths = Path("cn").glob("**/*")
[path for path in paths if path.is_dir()][:20]
[PosixPath('cn/__pycache__'), PosixPath('cn/output'), PosixPath('cn/content'), PosixPath('cn/output/tag'), PosixPath('cn/output/author'), PosixPath('cn/output/feeds'), PosixPath('cn/output/.git'), PosixPath('cn/output/category'), PosixPath('cn/output/theme'), PosixPath('cn/output/drafts'), PosixPath('cn/output/blog'), PosixPath('cn/output/.git/logs'), PosixPath('cn/output/.git/objects'), PosixPath('cn/output/.git/info'), PosixPath('cn/output/.git/hooks'), PosixPath('cn/output/.git/branches'), PosixPath('cn/output/.git/refs'), PosixPath('cn/output/.git/logs/refs'), PosixPath('cn/output/.git/logs/refs/remotes'), PosixPath('cn/output/.git/logs/refs/heads')]
paths = Path("misc").glob("**/*")
[path for path in paths if path.is_file()][:20]
[PosixPath('misc/pconf.py'), PosixPath('misc/pages/shopping.markdown'), PosixPath('misc/pages/links.markdown'), PosixPath('misc/pages/tools.markdown'), PosixPath('misc/pages/stat.markdown'), PosixPath('misc/pages/job.markdown'), PosixPath('misc/pages/news.markdown'), PosixPath('misc/pages/forum.markdown'), PosixPath('misc/pages/learning.markdown'), PosixPath('misc/__pycache__/pconf.cpython-37.pyc'), PosixPath('misc/output/index45.html'), PosixPath('misc/output/index146.html'), PosixPath('misc/output/index150.html'), PosixPath('misc/output/index56.html'), PosixPath('misc/output/index29.html'), PosixPath('misc/output/index65.html'), PosixPath('misc/output/index160.html'), PosixPath('misc/output/index151.html'), PosixPath('misc/output/index49.html'), PosixPath('misc/output/index60.html')]
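
The same idea can be wrapped in a small helper. The sketch below is only an illustration: the function name list_files and its pattern parameter are made up for this note. It relies on Path.rglob(pattern), which behaves like Path.glob("**/" + pattern).

```python
from pathlib import Path


def list_files(root, pattern="*"):
    """Return all descendant files of `root` whose names match `pattern`.

    `list_files` is a hypothetical helper name, not part of pathlib.
    """
    # rglob(pattern) is shorthand for glob("**/" + pattern);
    # is_file() drops the directories that the pattern also matches.
    return sorted(p for p in Path(root).rglob(pattern) if p.is_file())
```

For example, list_files("misc", "*.markdown") would return only the Markdown files under misc, assuming that directory exists.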

Using os.walk

import os

for subdir, dirs, files in os.walk("."):
    for file in files:
        filepath = os.path.join(subdir, file)
        if filepath.endswith(".csv"):
            print(subdir)
            print(dirs)
            print(filepath)
./f2.csv
[]
./f2.csv/part-00000-96dab35f-bfbb-4134-babe-14553e963d25-c000.csv
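
A variation on the loop above, as a rough sketch: os.walk lets you prune the traversal by editing the dirs list in place (this works with the default top-down order), which is handy for skipping directories such as .git. The function name find_files and the default skip_dirs values are assumptions chosen for illustration.

```python
import os


def find_files(root, suffix=".csv", skip_dirs=(".git", "__pycache__")):
    """Yield paths of files under `root` whose names end with `suffix`.

    `find_files` and its defaults are hypothetical, for illustration only.
    """
    for subdir, dirs, files in os.walk(root):
        # Assigning to dirs[:] modifies the list in place, so os.walk
        # will not descend into the directories that were removed.
        dirs[:] = [d for d in dirs if d not in skip_dirs]
        for file in files:
            if file.endswith(suffix):
                yield os.path.join(subdir, file)
```

Calling list(find_files(".")) collects the matching paths instead of printing them.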

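Alternatively, you can implement the traversal yourself with Path.iterdir(), which lists only the immediate children of a directory, by recursing into each subdirectory.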
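
A minimal sketch of a hand-written traversal built on Path.iterdir(). The function name iter_files is made up here, and skipping symlinked directories is a choice of this sketch (to avoid cycles), not something the standard library requires.

```python
from pathlib import Path


def iter_files(root):
    """Recursively yield every file below `root` using Path.iterdir().

    `iter_files` is a hypothetical name used for illustration.
    """
    for child in Path(root).iterdir():
        # Assumption of this sketch: do not follow symlinked directories,
        # so a link pointing back up the tree cannot cause endless recursion.
        if child.is_dir() and not child.is_symlink():
            yield from iter_files(child)
        elif child.is_file():
            yield child
```

Because it is a generator, files are produced lazily; wrap the call in list(...) or sorted(...) if you need them all at once.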