Python Arguments: getopt

01 May 2012

Arguments allow you to pass parameters to your code. If you are familiar with the command line, you will know what I'm talking about. In today's blog post I will show how arguments can be passed to a Python script and parsed using getopt.

Arguments in Python

Suppose you have a Python script and you pass arguments to it like this:

./myscript.py arg1 arg2 arg3

Where do these arguments go? If you import "sys", you will find them in the "sys.argv" list. To demonstrate this, run this script:

#!/usr/bin/env python
import sys
arguments = sys.argv
for arg in arguments:
    print(arg)

Now save it as "example1" and make it executable. Run the script like this:

./example1 test1 test2 test3

The output will look like this:

./example1
test1
test2
test3

So our list looks like this:

  • Index 0: ./example1 (our script name, exactly as it was typed)
  • Index 1: test1
  • Index 2: test2
  • Index 3: test3

Our script name is also added to the list. This is not really useful to us, so we will only look at the list starting from index 1.
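
As a small sketch of that idea, the helper below (script_arguments is a name made up for this example) drops index 0 and keeps the rest. The list fake_argv is an assumption that stands in for what sys.argv would hold after running ./example1 test1 test2 test3:

```python
import sys


def script_arguments(argv):
    """Return everything after the script name (index 0)."""
    return argv[1:]


# Stand-in for sys.argv after running: ./example1 test1 test2 test3
fake_argv = ["./example1", "test1", "test2", "test3"]
print(script_arguments(fake_argv))  # ['test1', 'test2', 'test3']

# In a real script you would pass sys.argv itself.
print(script_arguments(sys.argv))
```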

Getopt

From the Python website: "The getopt module is a parser for command line options whose API is designed to be familiar to users of the C getopt() function. Users who are unfamiliar with the C getopt() function or who would like to write less code and get better help and error messages should consider using the argparse module instead."

If argparse sounds like the better fit for you, you will need to wait for next week's post, where I will explain it.

Here is a little script I wrote the other day using getopt. It parses a string or a file that contains HTML and looks for http links. I named it "find-links.py":

#!/usr/bin/env python
import sys
import getopt


def main():
    try:
        opts, operands = getopt.getopt(sys.argv[1:], 's:f:', ["file=", "string=", "help"])
        if len(opts) == 0:
            print("Please use the correct arguments, for usage type --help")
        else:
            for option, value in opts:
                if option == "-s" or option == "--string":
                    checkString(value)
                if option == "-f" or option == "--file":
                    checkFile(value)
                if option == "--help":
                    printHelp()
    except getopt.GetoptError as err:
        # Raised for an unknown option or an option that is missing its value
        print(str(err))
        print("Please use the correct arguments, for usage type --help")


def printHelp():
    print("We have multiple options:\n\t-s or --string: required value is a string\n\t-f or --file: required value is a file")


def checkString(value):
    if "http" in str(value):
        # Split on href=" and use -1 to take the part after the last href,
        # then cut it off at the closing quote.
        res = value.split('href="')[-1].split('"')[0]
        if 'http://' in res:
            print(res)


def checkFile(value):
    # The with statement closes the file for us when we are done
    with open(value, 'r') as html_file:
        for line in html_file:
            if 'http' in line:
                res = line.split('href="')[-1].split('"')[0]
                if 'http://' in res:
                    print(res)


if __name__ == "__main__":
    main()

Now you can pass an HTML file to this script and it will print the http links it finds in href attributes. (It's not really a good way to find URLs: it only catches the last href on each line, and you should use an HTML parser instead. This is just a quick script to demonstrate.) Back to getopt. From the API documentation:

getopt.getopt(args, options[, long_options])

Example:

opts,operands = getopt.getopt(sys.argv[1:],'s:f:',["file=","string=","help"])

The first argument to getopt is the list of arguments to parse:

sys.argv[1:]

[1:] means you pass everything starting at index 1 instead of index 0 (index 0was our script name). The second argument is the string of short options:

's:f:'

The ':' after a letter means that when this letter (option) is used, a value is mandatory. The one-letter options are 's' and 'f', so when executing the script you can pass arguments like this (the single quotes stop the shell from interpreting the '<', '>' and '"' characters):

./find-links.py -s '<a href="http://google.com">'
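
To see what getopt makes of a short option string, here is a minimal sketch. The list args is a made-up stand-in for sys.argv[1:], and page.html is just an example file name:

```python
import getopt

# Stand-in for sys.argv[1:] after running:
#   ./find-links.py -s hello -f page.html
args = ["-s", "hello", "-f", "page.html"]

# 's:f:' declares -s and -f, each with a mandatory value
opts, operands = getopt.getopt(args, "s:f:")
print(opts)      # [('-s', 'hello'), ('-f', 'page.html')]
print(operands)  # []

# The value may also be glued directly to the letter
glued_opts, _ = getopt.getopt(["-shello"], "s:")
print(glued_opts)  # [('-s', 'hello')]
```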

The third argument lists the long versions of the options. A trailing '=' means that a value is mandatory for that option. The list itself is optional and can be left out when calling getopt.getopt. In our script it looks like this:

["file=","string=","help"]

Long options can be passed like this:

./find-links.py --string '<a href="http://google.com">'

or:

./find-links.py --string='<a href="http://google.com">'
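
Both spellings end up as the same (option, value) pair. A minimal sketch, using made-up argument lists in place of sys.argv[1:] and an empty string for the short options:

```python
import getopt

long_options = ["file=", "string=", "help"]

# Value passed as a separate argument
spaced, _ = getopt.getopt(["--string", "abc"], "", long_options)
print(spaced)    # [('--string', 'abc')]

# Value attached with '='
attached, _ = getopt.getopt(["--string=abc"], "", long_options)
print(attached)  # [('--string', 'abc')]

# An option declared without '=' is paired with an empty string
flag, _ = getopt.getopt(["--help"], "", long_options)
print(flag)      # [('--help', '')]
```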

If you pass an option (a letter or word preceded by '-' or '--') that isn't declared in the short or long options, or you leave out a mandatory value, getopt raises a GetoptError exception. We catch this exception and print a help message.

try:
    ...
except getopt.GetoptError as err:
    print(str(err))
    print("Please use the correct arguments, for usage type --help")
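
To try the two error cases without touching the real script, something like the sketch below could be used. The helper try_parse is a name invented for this example; it returns None when getopt rejects the arguments:

```python
import getopt


def try_parse(args):
    """Parse args like find-links.py does; return None on a getopt error."""
    try:
        return getopt.getopt(args, "s:f:", ["file=", "string=", "help"])
    except getopt.GetoptError as err:
        print("error: %s" % err)
        return None


try_parse(["-x"])              # -x was never declared
try_parse(["-s"])              # -s needs a value but none was given
print(try_parse(["--help"]))   # a valid call is returned untouched
```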

The options that were actually used (-s, -f, --string, --file, --help) are stored in opts, a list of (option, value) pairs. An option that takes no value, such as --help, is paired with an empty string.

opts,operands =

Operands are the values left over once getopt stops parsing, which happens at the first argument that is neither an option nor the value of an option. Say we run a script that takes an option -s:

./somegetoptpythonscript -s string1 string2 string3

In this example string1 is the value of -s, and string2 and string3 are operands. Back in our script: if no options were given at all, we print a help message:

if len(opts) == 0:
    print("Please use the correct arguments, for usage type --help")

Otherwise we can start processing. For each option that was passed, we check which one it is and hand its value to the matching function.

for option, value in opts:
    if option == "-s" or option == "--string":
        checkString(value)
    if option == "-f" or option == "--file":
        checkFile(value)
    if option == "--help":
        printHelp()
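
The loop and the operands can be tried out together in one small sketch. This is not the script above: parse and the settings dict are names invented for this example, and instead of calling checkString or checkFile it only records what was passed:

```python
import getopt


def parse(args):
    """Return (settings, operands) for a find-links style command line."""
    opts, operands = getopt.getopt(args, "s:f:", ["file=", "string=", "help"])
    settings = {}
    for option, value in opts:
        if option in ("-s", "--string"):
            settings["string"] = value
        elif option in ("-f", "--file"):
            settings["file"] = value
        elif option == "--help":
            settings["help"] = True
    return settings, operands


# Same shape as: ./somegetoptpythonscript -s string1 string2 string3
print(parse(["-s", "string1", "string2", "string3"]))
# ({'string': 'string1'}, ['string2', 'string3'])
```

Note that getopt stops at the first operand, so an option placed after string2 would be treated as an operand too.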

Final word

That's how it works. Next week we will have a look at argparse (which is a lot better and easier to use, in my opinion).