This is the first article in a series that describes file find, grep and in-place search/substitute tools for Python.
What’s this all about?
As described in my previous article, I often find myself in a situation where I need to:
- find files (whose paths/names are to be filtered)
- find files and grep through their contents
- find files and modify their content in some way
However, I want this kind of functionality available while programming in Python, my favourite programming language.
In this article I am covering the first use case (searching for files in a directory tree).
    bbox33:scriptutil $ find .
    .
    ./a
    ./a/a.txt
    ./a/b
    ./a/b/b.txt
    ./a/b/c
    ./a/b/c/c.txt
    ./all.doc
    ./d
    ./d/d.txt
    ./d/e
    ./d/e/e.txt
    ./o
    ./o/o.txt
    ./o/p
    ./o/p/p.txt
    ./o/p/q
    ./o/p/q/q.txt
    ./o/p/q/r
    ./o/p/q/r/r.txt
    ./o/p/q/r/s
    ./o/p/q/r/s/s.txt
As a “warm-up exercise” we’ll start by looking at a few usage examples of the scriptutil.ffind() function.
     1 bbox33:scriptutil $ python2.5
     2 Python 2.5.1 (r251:54863, May 14 2007, 09:23:46)
     3 [GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin
     4 Type "help", "copyright", "credits" or "license" for more information.
     5 >>> import scriptutil as SU
     6 >>> import re
     7 >>> files = SU.ffind('.', namefs=(re.compile('[a-d]\\.txt$').search,))
     8 >>> files
     9 ['./a/a.txt', './a/b/b.txt', './a/b/c/c.txt', './d/d.txt']
    10 >>> SU.printr(files)
    11 ./a/a.txt
    12 ./a/b/b.txt
    13 ./a/b/c/c.txt
    14 ./d/d.txt
On line 7 (above) I am invoking the scriptutil.ffind() function and passing the following parameters to it:
- the path to the directory tree to be searched ('.')
- a tuple of functions (namefs) to use for filtering the files we want; in this instance I am passing just one function, which merely encapsulates a compiled regular expression
The function’s return value is stored in the files list, whose content is shown on line 9. On the next line the scriptutil.printr() helper function is invoked to pretty-print the find results (lines 11-14).
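To make the mechanics a little more concrete, here is a minimal sketch of how a function of this kind could be written on top of os.walk(). It is not the scriptutil source: the name find_files() is hypothetical, the sorting is my own choice to get a stable order, and passing the full path (rather than the bare file name) to each filter is an assumption inferred from the results shown above. The sketch targets a current Python 3 interpreter rather than the Python 2.5 session used in this article.

```python
import os
import re
import tempfile


def find_files(root, namefs=()):
    """Return the sorted file paths under root accepted by every filter.

    Hypothetical stand-in for a function like scriptutil.ffind();
    each filter receives the full path and returns a truthy/falsy value.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Keep the path only if all filters accept it.
            if all(f(path) for f in namefs):
                matches.append(path)
    return sorted(matches)


# Build a small throwaway tree so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    for rel in ("a/a.txt", "a/b/b.txt", "d/d.txt", "o/o.txt", "all.doc"):
        full = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(full), exist_ok=True)
        open(full, "w").close()

    found = find_files(root, namefs=(re.compile(r"[a-d]\.txt$").search,))
    # Show the results relative to the temporary root.
    found_rel = [os.path.relpath(p, root).replace(os.sep, "/") for p in found]

for path in found_rel:
    print(path)
```

The final loop plays the role of a one-path-per-line printer; how scriptutil.printr() formats its output beyond what is shown above is not something this sketch tries to reproduce.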
In the example below I am adding one more filter function to the namefs tuple (line 16). That second function effectively weeds out any file path that contains the letter ‘b’.
    15 >>> files = SU.ffind('.', namefs=(re.compile('[a-d]\\.txt$').search,
    16 ...                               lambda s: s.find('b') == -1))
    17 >>> files
    18 ['./a/a.txt', './d/d.txt']
    19 >>> SU.printr(files)
    20 ./a/a.txt
    21 ./d/d.txt
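The result above suggests that a path is kept only when every function in the tuple accepts it. The following sketch reproduces that behaviour on a plain list of paths taken from the directory listing, without touching the file system. The helper name accept() is hypothetical, and the "all filters must pass" rule is an assumption drawn from the output, not from the scriptutil source.

```python
import re

# A few paths copied from the directory listing earlier in the article.
paths = [
    "./a/a.txt", "./a/b/b.txt", "./a/b/c/c.txt", "./all.doc",
    "./d/d.txt", "./d/e/e.txt", "./o/o.txt",
]

namefs = (
    re.compile(r"[a-d]\.txt$").search,  # path ends in a.txt, b.txt, c.txt or d.txt
    lambda s: s.find("b") == -1,        # path does not contain the letter 'b'
)


def accept(path, filters):
    """Return True when every filter yields a truthy value for path."""
    return all(f(path) for f in filters)


selected = [p for p in paths if accept(p, namefs)]
print(selected)  # ['./a/a.txt', './d/d.txt']
```

Note that the second filter looks at the whole path, which is why ./a/b/c/c.txt is dropped even though its file name contains no ‘b’.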
Hint: when working with a source code tree I often use the following file name filter function to ignore any files internal to the Subversion version control system:
lambda s: s.find('.svn') == -1
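One thing to keep in mind is that a substring test also rejects paths that merely contain the text ‘.svn’ somewhere in a file name. If that matters, a slightly stricter variant can compare whole path components instead. The sketch below is only an illustration; the function name not_in_svn() is hypothetical and not part of scriptutil.

```python
import os


def not_in_svn(path):
    """Return True unless one of the path's components is exactly '.svn'."""
    parts = os.path.normpath(path).split(os.sep)
    return ".svn" not in parts


# The substring filter from the hint, for comparison.
substring_filter = lambda s: s.find(".svn") == -1

for p in ("./src/main.py", "./src/.svn/entries", "./doc/setup.svn.txt"):
    print(p, substring_filter(p), not_in_svn(p))
```

Both filters agree on the first two paths; they differ only on the third, where ‘.svn’ appears inside a file name rather than as a directory of its own.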
In the next article I will present the scriptutil.ffindgrep() function that not only helps you find files but also allows you to search inside them.