Declarative parsing of command line arguments in Python


You will probably agree with me that one of the most boring programming chores is the parsing of command line arguments. After programming one too many argument handling routines, I decided to write a utility that does the job for me.

The following example shows the use of the resulting Python class (CLAP):

 1  #!/usr/bin/env python
 3  # please note: the code below is just a usage example (modelled on a
 4  # hypothetical crypto utility)
 6  import sys, pprint
 7  from parseargs import CLAP
 9  class Crypto:
10      def __init__(self):
11          self.handleArgs()
13      def handleArgs(self):
14          # dictionary with command line args along with their types and defaults
15          args = {
16              ('-a', '-x', '--algo')      :   ('algo', str, None),
17              ('-c', '--crypt')           :   ('crypt', bool, None),
18              ('-d', '--decrypt')         :   ('decrypt', bool, None),
19              ('-e', '--echo', '--fyo')   :   ('echo', bool, None),
20              ('-l', '--lines')           :   ('lines', int, '25'),
21              ('-i', )                    :   ('input', str, None),
22              ('-o', )                    :   ('output', str, None),
23              ('-p', '--pager', '--pgr')  :   ('pager', str, '/usr/bin/less'),
24              ('-r', '--recipient')       :   ('recipient', str, None)
25          }
27          apu = CLAP(sys.argv[1:], args, min_args=2)
28          self.args = apu.check_args()
30  if __name__ == '__main__':
31      c = Crypto()
32      pp = pprint.PrettyPrinter(indent=4)
33      pp.pprint(c.args)

As you can see, the main action is on lines 15-28. I opted for a declarative approach, i.e. I wanted to be able to “declare” the expected command line arguments (along with their types and default values) and be done.

Here is a usage example:

mhr@playground2:~/src/published$ python -d -a blowfish --echo
{   'algo': 'blowfish',
    'decrypt': True,
    'echo': True,
    'lines': 25,
    'pager': '/usr/bin/less'}

The first three values were supplied on the command line whereas the last two stem from defaults declared in the client code (see lines 20 and 23 above).

Here is what happens in case of erroneous user input (e.g. supplying the value ‘abc’ for argument ‘lines’ which is of type integer):

mhr@playground2:~/src/published$ python -d -l abc
!! Invalid parameter value: invalid literal for int(): abc !!

In the invocation below an unsupported parameter (‘-z’) was passed:

mhr@playground2:~/src/published$ python -d -z
!! Error: option -z not recognized !!

The utility class

The utility class CLAP is reasonably straightforward and will be introduced below. To see it in full beauty click here :-)

  1  #!/usr/bin/env python
  2  """
  3  Utility class for handling of command line arguments, see the bottom of the
  4  file for an example showing how it should be used.
  5  """
  6  # Copyright: (c) 2006 Muharem Hrnjadovic
  7  # created: 21/11/2006 15:15:49
  9  __version__ = "$Id$"
 10  # $HeadURL $
 12  import sys, getopt, re
 13  import itertools as IT
 14  import operator as OP
 16  class CLAP(object):
 17      """A class that uses a declarative technique for command line
 18      argument parsing"""
 20      def __init__(self, argv, args, min_args = 0, help_string = None):
 21          """initialiser, just copies its arguments to attributes"""
 22          self.args = args
 23          self.min_args = min_args
 24          self.help_string = help_string
 25          # skip any leading arguments that don't start with a dash (since
 26          # this confuses the getopt utility)
 27          self.argv = list(IT.dropwhile(lambda s: not s.startswith('-'), argv))

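lines 20-27 (initialiser method): copies the parameters passed to it to attributes of the same name. The one exception is argv, which is first stripped of any leading arguments that do not start with a dash (line 27), since these would confuse getopt().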
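The stripping of leading non-dash arguments on line 27 can be tried in isolation. The following is a minimal sketch written for Python 3; the helper name strip_leading_positionals and the sample argument lists are made up for illustration and are not part of CLAP:

```python
import itertools

def strip_leading_positionals(argv):
    """Drop every argument before the first one that starts with '-'."""
    return list(itertools.dropwhile(lambda s: not s.startswith('-'), argv))

# A leading non-dash argument is removed ...
print(strip_leading_positionals(['encrypt', '-d', '-a', 'blowfish']))
# ['-d', '-a', 'blowfish']

# ... but anything after the first dash argument is left alone.
print(strip_leading_positionals(['-d', 'file.txt', '-a', 'blowfish']))
# ['-d', 'file.txt', '-a', 'blowfish']
```

Note that dropwhile() stops testing at the first argument that starts with a dash, so only the leading run is removed.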

 29      def check_args(self):
 30          if not self.argv or not self.args or len(self.argv) < self.min_args:
 31              sys.stderr.write("!! Error: not enough arguments or data " \
 32                               "for parsing !!\n")
 35          self.construct_getopt_data()
 36          try:
 37              opts, args = getopt.getopt(self.argv, self.shortflags,
 38                  self.longflags)
 39          except getopt.GetoptError, e:
 40              sys.stderr.write("!! Error: %s !!\n" % str(e))

lines 35-41: the data required for the getopt() function is put together (from the command line argument “declaration” supplied by the client code). Subsequently getopt() is invoked to perform the low level argument parsing.
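For readers who have not used getopt before, here is a small sketch of the low level call, written for Python 3 (hence "except ... as e" instead of the Python 2 syntax in the listing). The option specification and the parse() helper are hypothetical and much smaller than what CLAP generates:

```python
import getopt

# Hypothetical specification: -d/--decrypt are switches, -a/--algo take a value.
shortflags = 'da:'
longflags = ['decrypt', 'algo=']

def parse(argv):
    """Return getopt's (flag, value) pairs, or an error string on bad input."""
    try:
        opts, _rest = getopt.getopt(argv, shortflags, longflags)
    except getopt.GetoptError as e:
        return "!! Error: %s !!" % e
    return opts

print(parse(['-d', '--algo', 'blowfish']))
# [('-d', ''), ('--algo', 'blowfish')]
print(parse(['-z']))
# !! Error: option -z not recognized !!
```

Switches come back with an empty string as their value, which is why the class has to treat boolean arguments separately later on.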

 43          # holds arguments that were actually supplied on the command line
 44          suppliedd = {}
 45          # result dictionary
 46          resultd = {}
 48          # initialise args where appropriate
 49          try:
 50              for flags, (argn,typef,initv) in self.args.iteritems():
 51                  if initv is not None: resultd[argn] = typef(initv)
 52          except Exception, e:
 53              sys.stderr.write("!! Internal error: %s !!\n" % str(e))

lines 49-54: for any arguments that have default values, an attempt is made to initialise them with these. Please note how the code uses Python type functions to perform the initialisation (line 51).
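The same initialisation step can be sketched on its own in Python 3 (where iteritems() has become items()). The declaration below is a shortened copy of the one in the usage example; the function name initial_values is an assumption made for this sketch:

```python
# flags -> (argument name, type function, default value or None)
declared = {
    ('-l', '--lines'): ('lines', int, '25'),
    ('-p', '--pager', '--pgr'): ('pager', str, '/usr/bin/less'),
    ('-d', '--decrypt'): ('decrypt', bool, None),
}

def initial_values(declared):
    """Convert every declared default with the argument's own type function."""
    result = {}
    for _flags, (name, typef, initv) in declared.items():
        if initv is not None:
            result[name] = typef(initv)
    return result

print(initial_values(declared))
# {'lines': 25, 'pager': '/usr/bin/less'}
```

Since the default '25' passes through int(), a badly declared default such as 'abc' would fail here, before any user input is looked at. That is why the class reports it as an internal error.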

 56          # dictionary needed for matching against the command line flags
 57          matchd = dict([(arg, (OP.itemgetter(0)(v), OP.itemgetter(1)(v))) for \
 58                         args, v in self.args.iteritems() for arg in args])
 60          # check the arguments provided on the command line
 61          try:
 62              for opt, argv in opts:
 63                  if opt in matchd:
 64                      argn, typef = matchd[opt]
 65                      suppliedd[argn] = (typef == bool and True) or typef(argv)
 66          except Exception, e:
 67              sys.stderr.write("!! Invalid parameter value: %s !!\n" % str(e))

lines 57-68: given the command line argument declaration shown in the “Introduction” section, the matchd dictionary will have the following value:

{   '--algo': ('algo', <type 'str'>),
    '--crypt': ('crypt', <type 'bool'>),
    '--decrypt': ('decrypt', <type 'bool'>),
    '--echo': ('echo', <type 'bool'>),
    '--fyo': ('echo', <type 'bool'>),
    '--lines': ('lines', <type 'int'>),
    '--pager': ('pager', <type 'str'>),
    '--pgr': ('pager', <type 'str'>),
    '--recipient': ('recipient', <type 'str'>),
    '-a': ('algo', <type 'str'>),
    '-c': ('crypt', <type 'bool'>),
    '-d': ('decrypt', <type 'bool'>),
    '-e': ('echo', <type 'bool'>),
    '-i': ('input', <type 'str'>),
    '-l': ('lines', <type 'int'>),
    '-o': ('output', <type 'str'>),
    '-p': ('pager', <type 'str'>),
    '-r': ('recipient', <type 'str'>),
    '-x': ('algo', <type 'str'>)}

It is used to add the arguments that were actually supplied on the command line to the suppliedd dictionary. Please note again how Python type functions are used to convert (any non-boolean) command line arguments from strings to the desired type (line 65).
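A rough Python 3 equivalent of the lookup and conversion is sketched below. It uses a dict comprehension in place of the itemgetter calls and a conditional expression in place of the and/or idiom on line 65. The declaration is shortened, and the convert() helper and its sample input are assumptions made for this illustration:

```python
declared = {
    ('-a', '-x', '--algo'): ('algo', str, None),
    ('-d', '--decrypt'): ('decrypt', bool, None),
    ('-l', '--lines'): ('lines', int, '25'),
}

# Map every individual flag to (argument name, type function).
matchd = {flag: (name, typef)
          for flags, (name, typef, _default) in declared.items()
          for flag in flags}

def convert(opts):
    """Turn getopt's (flag, string) pairs into typed, named values."""
    supplied = {}
    for opt, value in opts:
        if opt in matchd:
            name, typef = matchd[opt]
            # Switches carry no value; their presence alone means True.
            supplied[name] = True if typef is bool else typef(value)
    return supplied

print(convert([('-d', ''), ('-x', 'blowfish'), ('--lines', '40')]))
# {'decrypt': True, 'algo': 'blowfish', 'lines': 40}
```

Calling convert([('-l', 'abc')]) raises a ValueError from int(), which corresponds to the "Invalid parameter value" message shown earlier.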

 70          # merge arguments (supplied on the command line) with the defaults
 71          resultd.update(suppliedd)
 73          return (resultd)

lines 71-73: last but not least, we merge the arguments that were actually supplied on the command line with the default values and return the result.
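The merge relies on nothing more than dict.update(), where values from the argument overwrite existing keys. A tiny sketch with made-up sample values:

```python
defaults = {'lines': 25, 'pager': '/usr/bin/less'}
supplied = {'decrypt': True, 'lines': 40}

result = dict(defaults)   # copy, so the defaults themselves stay untouched
result.update(supplied)   # supplied values win over defaults
print(result)
# {'lines': 40, 'pager': '/usr/bin/less', 'decrypt': True}
```

Arguments that have no default and were not supplied simply do not appear in the result, which matches the output of the usage example at the top.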

 75      def construct_getopt_data(self):
 76          # pair all flags with their respective types
 77          flags = [(arg, OP.itemgetter(1)(v)) for args, v in \
 78                                          self.args.iteritems() for arg in args]
 79          def ff(((argf, argt), fchar)):
 80              return argt == bool and argf.lstrip('-') or \
 81              "%s%s" % (argf.lstrip('-'), fchar)
 82          # single character flags
 83          self.shortflags = ''.join(map(ff, zip(filter(lambda t: len(t[0]) <= 2,
 84                                                       flags), IT.repeat(':'))))
 85          # multiple character flags
 86          self.longflags = map(ff, zip(filter(lambda t: len(t[0]) > 2, flags),
 87                                       IT.repeat('=')))

Again, based on the example above, the flags list will be as follows:

[   ('-l', <type 'int'>),
    ('--lines', <type 'int'>),
    ('-e', <type 'bool'>),
    ('--echo', <type 'bool'>),
    ('--fyo', <type 'bool'>),
    ('-p', <type 'str'>),
    ('--pager', <type 'str'>),
    ('--pgr', <type 'str'>),
    ('-a', <type 'str'>),
    ('-x', <type 'str'>),
    ('--algo', <type 'str'>),
    ('-r', <type 'str'>),
    ('--recipient', <type 'str'>),
    ('-o', <type 'str'>),
    ('-i', <type 'str'>),
    ('-d', <type 'bool'>),
    ('--decrypt', <type 'bool'>),
    ('-c', <type 'bool'>),
    ('--crypt', <type 'bool'>)]
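How such a list of (flag, type) pairs turns into getopt's option specifications can be sketched in Python 3 as follows. Tuple parameters like the one used by ff() on line 79 are no longer allowed in Python 3, so this sketch passes the pieces as separate parameters. The spec() helper and the shortened flags list are assumptions made for illustration:

```python
flags = [
    ('-l', int), ('--lines', int),
    ('-e', bool), ('--echo', bool),
    ('-a', str), ('--algo', str),
]

def spec(flag, typef, suffix):
    """Strip the dashes; append the suffix only if the flag takes a value."""
    name = flag.lstrip('-')
    return name if typef is bool else name + suffix

# Single character flags are joined into one string, ':' marks "takes a value".
shortflags = ''.join(spec(f, t, ':') for f, t in flags if len(f) <= 2)
# Multiple character flags form a list, '=' marks "takes a value".
longflags = [spec(f, t, '=') for f, t in flags if len(f) > 2]

print(shortflags)  # l:ea:
print(longflags)   # ['lines=', 'echo', 'algo=']
```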

The ensuing manipulations result in the following data (to be passed to getopt()):


[   'lines=',

 89      def help(self, exit_code=0):
 90          if self.help_string: sys.stderr.write(self.help_string)
 91          sys.exit(exit_code)

In case of an error, check_args() will invoke the help() function, which terminates the program after printing a help string (if one was supplied).


In case you liked the command line argument processing class introduced above, please feel free to download it from here and play with it. The colorised source code without any interspersed commentary can be viewed here.