It’s a simple and common problem:
You have a directory structure with files in it. You want to find all the files with certain extensions. “jpg”, “jpeg”, “png” and “gif” for argument’s sake.
The language for this exercise is Python.
It’s important to use the standard library. A search for something like “python directory find recursive” will lead you very quickly to os.walk, which is exactly what you want for walking the directory structure.
But there is still the file extension to check. endswith is not appropriate because it’s case-sensitive. What you want is fnmatch (at least on platforms where filenames are case-insensitive, since that is where fnmatch ignores case). You know that either because you glanced at File and Directory Access when you found the standard library documentation, or because you searched for something like “python match filename.”
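A caveat on the case-sensitivity point, as a hedged sketch: fnmatch.fnmatch and fnmatch.filter normalize names with os.path.normcase, which leaves them untouched on POSIX systems, so “Holiday.JPG” only matches “*.jpg” on a platform like Windows. One way to get the same answer everywhere is to lowercase the name yourself and use fnmatch.fnmatchcase. The helper name has_image_extension and the IMAGE_PATTERNS tuple are made up for this illustration.

```python
import fnmatch

# Hypothetical constant for this sketch: the four extensions from the post.
IMAGE_PATTERNS = ('*.jpg', '*.jpeg', '*.png', '*.gif')

def has_image_extension(filename, patterns=IMAGE_PATTERNS):
    """Return True if filename matches any pattern, ignoring case."""
    lowered = filename.lower()
    # fnmatchcase never normalizes case, so the result is the same on every OS.
    return any(fnmatch.fnmatchcase(lowered, pattern) for pattern in patterns)

print(has_image_extension('Holiday.JPG'))  # True
print(has_image_extension('notes.txt'))    # False
```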
We’re ready to code. Nothing complex:
import fnmatch
import os

def image_files_1(directory):
    # Walk the tree and yield the full path of every matching file.
    for root, dirs, files in os.walk(directory):
        for extension in '*.jpeg', '*.jpg', '*.png', '*.gif':
            for fn in fnmatch.filter(files, extension):
                yield os.path.join(root, fn)
If this was a barrier to getting your job done, mission complete. But, dude, iterators and list comprehensions! And, when all you have is a hammer…
import itertools

def image_files_2(directory):
    return itertools.chain(*[[os.path.join(root, fn)
                              for fn in fnmatch.filter(files, '*.jpg') +
                                        fnmatch.filter(files, '*.jpeg') +
                                        fnmatch.filter(files, '*.png') +
                                        fnmatch.filter(files, '*.gif')]
                             for root, dirs, files in os.walk(directory)])
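For anyone squinting at that, the moving part is itertools.chain(*lists), which flattens a list of lists into a single iterator. A minimal sketch with made-up data (the paths are illustrative, not from a real directory):

```python
import itertools

# One inner list per directory, as the comprehension above would build.
per_directory = [['a/1.jpg', 'a/2.png'], [], ['b/3.gif']]

# Unpacking with * passes each inner list to chain as a separate argument.
flat = list(itertools.chain(*per_directory))
print(flat)  # ['a/1.jpg', 'a/2.png', 'b/3.gif']

# chain.from_iterable gives the same result without the argument unpacking.
same = list(itertools.chain.from_iterable(per_directory))
print(same == flat)  # True
```

Worth noticing: the list comprehension has to walk the entire tree before chain hands back the first path, so this version gives up the laziness the plain generator had.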
But, what about new file formats?
def image_files_3(directory, extensions):
    return itertools.chain(*[[os.path.join(root, fn)
                              for fn in sum([fnmatch.filter(files, '*.' + ext)
                                             for ext in extensions],
                                            [])]
                             for root, dirs, files in os.walk(directory)])
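The sum(..., []) call in there is list concatenation in disguise: starting from an empty list, it adds each inner list on in turn. A sketch with illustrative filenames, using only fnmatch.filter as in the code above:

```python
import fnmatch

# Made-up directory listing and extension list for this sketch.
files = ['a.jpg', 'b.png', 'notes.txt', 'c.gif']
extensions = ['jpg', 'png', 'gif']

# One list of matches per extension.
matches = [fnmatch.filter(files, '*.' + ext) for ext in extensions]
print(matches)  # [['a.jpg'], ['b.png'], ['c.gif']]

# sum with a [] start value concatenates: [] + ['a.jpg'] + ['b.png'] + ['c.gif']
flattened = sum(matches, [])
print(flattened)  # ['a.jpg', 'b.png', 'c.gif']
```

A call would look something like image_files_3('photos', ['jpg', 'jpeg', 'png', 'gif']), where 'photos' is a placeholder directory name.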
I am, appropriately, embarrassed that I wrote any of this.
Embarrassed enough to share.