Learn 2 Programming: learn python in 10 minutes

9. Control Flow

In the programs we have seen so far, there has always been a series of statements faithfully executed by Python in exact top-down order. What if you wanted to change the flow of how it works? For example, what if you want the program to make some decisions and do different things depending on different situations, such as printing 'Good Morning' or 'Good Evening' depending on the time of day?

As you might have guessed, this is achieved using control flow statements. There are three control flow statements in Python - if, for and while.

9.1. The if statement

The if statement is used to check a condition: if the condition is true, we run a block of statements (called the if-block), else we process another block of statements (called the else-block). The else clause is optional.

Example (save as if.py):

number = 23
guess = int(raw_input('Enter an integer : '))

if guess == number:
    # New block starts here
    print 'Congratulations, you guessed it.'
    print '(but you do not win any prizes!)'
    # New block ends here
elif guess < number:
    # Another block
    print 'No, it is a little higher than that'
    # You can do whatever you want in a block ...
else:
    print 'No, it is a little lower than that'
    # you must have guessed > number to reach here

print 'Done'
# This last statement is always executed,
# after the if statement is executed.

Output:

$ python if.py
Enter an integer : 50
No, it is a little lower than that
Done

$ python if.py
Enter an integer : 22
No, it is a little higher than that
Done

$ python if.py
Enter an integer : 23
Congratulations, you guessed it.
(but you do not win any prizes!)
Done

How It Works

In this program, we take guesses from the user and check if it is the number that we have. We set the variable number to any integer we want, say 23. Then, we take the user’s guess using the raw_input() function. Functions are just reusable pieces of programs. We’ll read more about them in the next chapter.

We supply a string to the built-in raw_input function, which prints it to the screen and waits for input from the user. Once we enter something and press the Enter key, the raw_input() function returns what we entered, as a string. We then convert this string to an integer using int and store it in the variable guess. Actually, int is a class, but all you need to know right now is that you can use it to convert a string to an integer (assuming the string contains a valid integer in the text).
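
The assumption that the string contains a valid integer can fail, for example when the user types letters. The following is only a sketch of one way to guard against that; the helper name parse_guess is hypothetical, and it relies on the try..except statement, which this chapter has not introduced yet. The print calls use parentheses around a single value so the lines behave the same on Python 2 and Python 3.

```python
def parse_guess(text):
    """Return text converted to an integer, or None if it is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for strings such as 'abc' or ''
        return None

print(parse_guess('23'))     # a valid integer string
print(parse_guess('abc'))    # not an integer, so None is returned
```

A program could call such a helper on the result of raw_input() and ask again whenever it gets None back.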

Next, we compare the guess of the user with the number we have chosen. If they are equal, we print a success message. Notice that we use indentation levels to tell Python which statements belong to which block. This is why indentation is so important in Python. I hope you are sticking to the "consistent indentation" rule. Are you?

Notice how the if statement contains a colon at the end - we are indicating to Python that a block of statements follows.

Then, we check if the guess is less than the number, and if so, we inform the user that they must guess a little higher than that. What we have used here is the elif clause, which actually combines two related if else-if else statements into one combined if-elif-else statement. This makes the program easier to read and reduces the amount of indentation required.

The elif and else statements must also have a colon at the end of the logical line, followed by their corresponding block of statements (with proper indentation, of course).

You can have another if statement inside the if-block of an if statement and so on - this is called a nested if statement.

Remember that the elif and else parts are optional. A minimal valid if statement is:

if True:
    print 'Yes, it is true'

After Python has finished executing the complete if statement along with the associated elif and else clauses, it moves on to the next statement in the block containing the if statement. In this case, it is the main block (where execution of the program starts), and the next statement is the print 'Done' statement. After this, Python sees the end of the program and simply finishes up.

Even though this is a very simple program, I have been pointing out a lot of things that you should notice. All these are pretty straightforward (and surprisingly simple for those of you from C/C++ backgrounds). You will need to become aware of all these things initially, but after some practice you will become comfortable with them, and it will all feel 'natural' to you.

Note

Note for C/C++ Programmers

There is no `switch` statement in Python. You can use an `if..elif..else` statement to do the same thing (and in some cases, use a dictionary to do it quickly).
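
To illustrate the dictionary approach mentioned in this note, here is a minimal sketch. The function name describe_day and the table of day names are made up for the example, and dictionaries themselves are covered in a later chapter.

```python
def describe_day(day_number):
    # Hypothetical lookup table standing in for a switch statement
    names = {
        1: 'Monday',
        2: 'Tuesday',
        3: 'Wednesday',
    }
    # get() returns its second argument when the key is missing,
    # which plays the role of the 'default' case
    return names.get(day_number, 'Unknown day')

result = describe_day(2)
fallback = describe_day(9)
```

Here result is 'Tuesday' and fallback is 'Unknown day', without any chain of elif clauses.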

9.2. The while Statement

The while statement allows you to repeatedly execute a block of statements as long as a condition is true. A while statement is an example of what is called a looping statement. A while statement can have an optional else clause.

Example (save as while.py):

number = 23
running = True

while running:
    guess = int(raw_input('Enter an integer : '))

    if guess == number:
        print 'Congratulations, you guessed it.'
        # this causes the while loop to stop
        running = False
    elif guess < number:
        print 'No, it is a little higher than that.'
    else:
        print 'No, it is a little lower than that.'
else:
    print 'The while loop is over.'
    # Do anything else you want to do here

print 'Done'

Output:

$ python while.py
Enter an integer : 50
No, it is a little lower than that.
Enter an integer : 22
No, it is a little higher than that.
Enter an integer : 23
Congratulations, you guessed it.
The while loop is over.
Done

How It Works

In this program, we are still playing the guessing game, but the advantage is that the user is allowed to keep guessing until they guess correctly - there is no need to repeatedly run the program for each guess, as we have done in the previous section. This aptly demonstrates the use of the while statement.

We move the raw_input and if statements to inside the while loop and set the variable running to True before the while loop. First, we check if the variable running is True and then proceed to execute the corresponding while-block. After this block is executed, the condition is again checked which in this case is the running variable. If it is true, we execute the while-block again, else we continue to execute the optional else-block and then continue to the next statement.

The else block is executed when the while loop condition becomes False - this may even be the first time that the condition is checked. If there is an else clause for a while loop, it is always executed unless you break out of the loop with a break statement.

True and False are the two values of the Boolean type, and you can consider them to be equivalent to the values 1 and 0 respectively.
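
The equivalence with 1 and 0 can be checked directly. This is just a small sketch; print is written with parentheses around a single value so that it behaves the same on Python 2 and Python 3.

```python
print(True == 1)     # True
print(False == 0)    # True

total = True + True  # Booleans can take part in arithmetic
print(total)         # 2

# isinstance shows that a Boolean is a kind of integer
print(isinstance(True, int))
```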

Note

Note for C/C++ Programmers

Remember that you can have an `else` clause for the `while` loop.

9.3. The for loop

The for..in statement is another looping statement which iterates over a sequence of objects i.e. goes through each item in a sequence. We will see more about sequences in detail in later chapters. What you need to know right now is that a sequence is just an ordered collection of items.

Example (save as for.py):

for i in range(1, 5):
    print i
else:
    print 'The for loop is over'

Output:

$ python for.py
1
2
3
4
The for loop is over

How It Works

In this program, we are printing a sequence of numbers. We generate this sequence of numbers using the built-in range function.

What we do here is supply it two numbers and range returns a sequence of numbers starting from the first number and up to the second number. For example, range(1,5) gives the sequence [1, 2, 3, 4]. By default, range takes a step count of 1. If we supply a third number to range, then that becomes the step count. For example, range(1,5,2) gives [1,3]. Remember that the range extends up to the second number i.e. it does not include the second number.

Note that in Python 2, range() builds the complete list of numbers at once. If you only need one number at a time, as a for loop does, you can use xrange() instead, which produces each number only when the loop requests the next item. Lists are explained in the data structures chapter.
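
The start, stop and step behaviour described above can be tried out as follows. This is only a sketch; each call is wrapped in list() so that the full sequence is shown whether or not range already returns a list, and the single-argument form range(5) is an extra not used elsewhere in this chapter.

```python
full = list(range(1, 5))        # [1, 2, 3, 4] - the second number is excluded
stepped = list(range(1, 5, 2))  # [1, 3] - the third number is the step count
from_zero = list(range(5))      # [0, 1, 2, 3, 4] - one argument means start at 0

print(full)
print(stepped)
print(from_zero)
```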

The for loop then iterates over this range - for i in range(1,5) is equivalent to for i in [1, 2, 3, 4] which is like assigning each number (or object) in the sequence to i, one at a time, and then executing the block of statements for each value of i. In this case, we just print the value in the block of statements.

Remember that the else part is optional. When included, it is always executed once after the for loop is over unless a break statement is encountered.

Remember that the for..in loop works for any sequence. Here, we have a list of numbers generated by the built-in range function, but in general we can use any kind of sequence of any kind of objects! We will explore this idea in detail in later chapters.
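
As a small preview of that idea, the sketch below loops over a string and over a list of strings. The variable names are arbitrary, and the append method used to collect results belongs to lists, which are explained in a later chapter.

```python
letters = []
for ch in 'abc':                      # a string is a sequence of characters
    letters.append(ch)

lengths = []
for word in ['if', 'for', 'while']:   # a list of strings
    lengths.append(len(word))

print(letters)   # ['a', 'b', 'c']
print(lengths)   # [2, 3, 5]
```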

Note

Note for C/C++/Java/C# Programmers

The Python for loop is radically different from the C/C++ for loop. C# programmers will note that the for loop in Python is similar to the foreach loop in C#. Java programmers will note that the same is similar to for (int i : IntArray) in Java 1.5.

In C/C++, if you want to write `for (int i = 0; i < 5; i++)`, then in Python you write just `for i in range(0,5)`. As you can see, the for loop is simpler, more expressive and less error prone in Python.

9.4. The break Statement

The break statement is used to break out of a loop statement i.e. stop the execution of a looping statement, even if the loop condition has not become False or the sequence of items has not been completely iterated over.

An important note is that if you break out of a for or while loop, any corresponding loop else block is not executed.

Example (save as break.py):

while True:
    s = raw_input('Enter something : ')
    if s == 'quit':
        break
    print 'Length of the string is', len(s)
print 'Done'

Output:

$ python break.py
Enter something : Programming is fun
Length of the string is 18
Enter something : When the work is done
Length of the string is 21
Enter something : if you wanna make your work also fun:
Length of the string is 37
Enter something : use Python!
Length of the string is 11
Enter something : quit
Done

How It Works

In this program, we repeatedly take the user’s input and print the length of each input. We provide a special condition to stop the program by checking if the user input is 'quit'. We stop the program by breaking out of the loop and reaching the end of the program.

The length of the input string can be found out using the built-in len function.

Remember that the break statement can be used with the for loop as well.
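
The following sketch shows break inside a for loop, together with the rule that the loop's else block is skipped when break is used. The function name first_multiple_of_seven is made up for this illustration.

```python
def first_multiple_of_seven(numbers):
    for n in numbers:
        if n % 7 == 0:
            result = n
            break              # leaves the loop; the else block is skipped
    else:
        # reached only when the loop finished without hitting break
        result = None
    return result

found = first_multiple_of_seven([3, 14, 21])
missing = first_multiple_of_seven([1, 2, 3])
```

Here found is 14, because the loop stops at the first match, and missing is None, because the loop ran to completion and the else block was executed.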

Swaroop’s Poetic Python

The input I have used here is a mini poem I have written:

Programming is fun
When the work is done
if you wanna make your work also fun:
    use Python!

9.5. The continue Statement

The continue statement is used to tell Python to skip the rest of the statements in the current loop block and to continue to the next iteration of the loop.

Example (save as continue.py):

while True:
    s = raw_input('Enter something : ')
    if s == 'quit':
        break
    if len(s) < 3:
        print 'Too small'
        continue
    print 'Input is of sufficient length'
    # Do other kinds of processing here...

Output:

$ python continue.py
Enter something : a
Too small
Enter something : 12
Too small
Enter something : abc
Input is of sufficient length
Enter something : quit

How It Works

In this program, we accept input from the user, but we process the input string only if it is at least 3 characters long. So, we use the built-in len function to get the length and if the length is less than 3, we skip the rest of the statements in the block by using the continue statement. Otherwise, the rest of the statements in the loop are executed, doing any kind of processing we want to do here.

Note that the continue statement works with the for loop as well.
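
For instance, a for loop can use continue to skip the items it is not interested in. This is a minimal sketch with an arbitrary variable name.

```python
odd_numbers = []
for i in range(1, 8):
    if i % 2 == 0:
        continue          # skip even numbers and move on to the next i
    odd_numbers.append(i)

print(odd_numbers)        # [1, 3, 5, 7]
```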

9.6. Summary

We have seen how to use the three control flow statements - if, while and for along with their associated break and continue statements. These are some of the most commonly used parts of Python and hence, becoming comfortable with them is essential.

Next, we will see how to create and use functions.

10. Functions

Functions are reusable pieces of programs. They allow you to give a name to a block of statements, allowing you to run that block using the specified name anywhere in your program and any number of times. This is known as calling the function. We have already used many built-in functions such as len and range.

The function concept is probably the most important building block of any non-trivial software (in any programming language), so we will explore various aspects of functions in this chapter.

Functions are defined using the def keyword. After this keyword comes an identifier name for the function, followed by a pair of parentheses which may enclose some names of variables, and by the final colon that ends the line. Next follows the block of statements that are part of this function. An example will show that this is actually very simple:

Example (save as function1.py):

def say_hello():
    # block belonging to the function
    print 'hello world'
# End of function

say_hello() # call the function
say_hello() # call the function again

Output:

$ python function1.py
hello world
hello world

How It Works

We define a function called say_hello using the syntax as explained above. This function takes no parameters and hence there are no variables declared in the parentheses. Parameters to functions are just input to the function so that we can pass in different values to it and get back corresponding results.

Notice that we can call the same function twice which means we do not have to write the same code again.

10.1. Function Parameters

A function can take parameters, which are values you supply to the function so that the function can do something utilising those values. These parameters are just like variables except that the values of these variables are defined when we call the function and are already assigned values when the function runs.

Parameters are specified within the pair of parentheses in the function definition, separated by commas. When we call the function, we supply the values in the same way. Note the terminology used - the names given in the function definition are called parameters whereas the values you supply in the function call are called arguments.

Example (save as function_param.py):

def print_max(a, b):
    if a > b:
        print a, 'is maximum'
    elif a == b:
        print a, 'is equal to', b
    else:
        print b, 'is maximum'

# directly pass literal values
print_max(3, 4)

x = 5
y = 7

# pass variables as arguments
print_max(x, y)

Output:

$ python function_param.py
4 is maximum
7 is maximum

How It Works

Here, we define a function called print_max that uses two parameters called a and b. We find out the greater number using a simple if..elif..else statement and then print the bigger number, or report that the two numbers are equal.

The first time we call the function print_max, we directly supply the numbers as arguments. In the second case, we call the function with variables as arguments. print_max(x, y) causes the value of argument x to be assigned to parameter a and the value of argument y to be assigned to parameter b. The print_max function works the same way in both cases.

10.2. Local Variables

When you declare variables inside a function definition, they are not related in any way to other variables with the same names used outside the function - i.e. variable names are local to the function. This is called the scope of the variable. All variables have the scope of the block they are declared in starting from the point of definition of the name.

Example (save as function_local.py):

x = 50

def func(x):
    print 'x is', x
    x = 2
    print 'Changed local x to', x

func(x)
print 'x is still', x

Output:

$ python function_local.py
x is 50
Changed local x to 2
x is still 50

How It Works

The first time that we print the value of the name x with the first line in the function’s body, Python uses the value passed in for the parameter x, which is the value of the x defined in the main block, above the function definition.

Next, we assign the value 2 to x. The name x is local to our function. So, when we change the value of x in the function, the x defined in the main block remains unaffected.

With the last print statement, we display the value of x as defined in the main block, thereby confirming that it is actually unaffected by the local assignment within the previously called function.

10.3. The global statement

If you want to assign a value to a name defined at the top level of the program (i.e. not inside any kind of scope such as functions or classes), then you have to tell Python that the name is not local, but it is global. We do this using the global statement. It is impossible to assign a value to a variable defined outside a function without the global statement.

You can use the values of such variables defined outside the function (assuming there is no variable with the same name within the function). However, this is not encouraged and should be avoided since it becomes unclear to the reader of the program as to where that variable’s definition is. Using the global statement makes it amply clear that the variable is defined in an outermost block.

Example (save as function_global.py):

x = 50

def func():
    global x

    print 'x is', x
    x = 2
    print 'Changed global x to', x

func()
print 'Value of x is', x

Output:

$ python function_global.py
x is 50
Changed global x to 2
Value of x is 2

How It Works

The global statement is used to declare that x is a global variable - hence, when we assign a value to x inside the function, that change is reflected when we use the value of x in the main block.

You can specify more than one global variable using the same global statement e.g. global x, y, z.
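
To see the difference the global statement makes, compare the two functions in this sketch. The names counter, set_without_global and increment_with_global are hypothetical, and the code is assumed to run at the top level of a program file.

```python
counter = 0

def set_without_global():
    counter = 1             # creates a new local name; the global is untouched
    return counter

def increment_with_global():
    global counter
    counter = counter + 1   # rebinds the name defined at the top level
    return counter

set_without_global()
after_local = counter       # still 0
increment_with_global()
after_global = counter      # now 1
```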

10.4. Default Argument Values

For some functions, you may want to make some parameters optional and use default values in case the user does not want to provide values for them. This is done with the help of default argument values. You can specify default argument values for parameters by appending to the parameter name in the function definition the assignment operator (=) followed by the default value.

Note that the default argument value should be a constant. More precisely, the default argument value should be immutable - this is explained in detail in later chapters. For now, just remember this.
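
A short sketch may help show why immutability matters here. Both function names are made up, and the lists involved are covered in a later chapter; the point is only that a mutable default value is created once and then shared between calls.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits 'bucket' reuses the same list object
    bucket.append(item)
    return bucket

def append_fresh(item, bucket=None):
    # None is immutable; a new list is built on each call that needs one
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

first = append_shared(1)
second = append_shared(2)     # the same list as 'first', now holding both items
fresh_a = append_fresh(1)
fresh_b = append_fresh(2)     # an independent list
```

After these calls, second is [1, 2] while fresh_b is just [2], which is usually what the caller expected.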

Example (save as function_default.py):

def say(message, times=1):
    print message * times

say('Hello')
say('World', 5)

Output:

$ python function_default.py
Hello
WorldWorldWorldWorldWorld

How It Works

The function named say is used to print a string as many times as specified. If we don’t supply a value, then by default, the string is printed just once. We achieve this by specifying a default argument value of 1 to the parameter times.

In the first usage of say, we supply only the string and it prints the string once. In the second usage of say, we supply both the string and an argument 5 stating that we want to say the string message 5 times.

Caution

Only those parameters which are at the end of the parameter list can be given default argument values i.e. you cannot have a parameter with a default argument value preceding a parameter without a default argument value in the function’s parameter list.

This is because the values are assigned to the parameters by position. For example, def func(a, b=5) is valid, but def func(a=5, b) is not valid.
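
One way to convince yourself of this rule without crashing a program is to ask Python to compile each definition as text. This is only a sketch: the helper is_valid_definition is hypothetical, and it uses the built-in compile function and the try..except statement, neither of which has been introduced yet.

```python
def is_valid_definition(source):
    """Return True if the source text compiles, False on a SyntaxError."""
    try:
        compile(source, '<example>', 'exec')
        return True
    except SyntaxError:
        return False

valid = is_valid_definition('def func(a, b=5): pass')
invalid = is_valid_definition('def func(a=5, b): pass')
```

Here valid is True and invalid is False, matching the rule stated above.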

10.5. Keyword Arguments

If you have some functions with many parameters and you want to specify only some of them, then you can give values for such parameters by naming them - this is called keyword arguments - we use the name (keyword) instead of the position (which we have been using all along) to specify the arguments to the function.

There are two advantages - one, using the function is easier since we do not need to worry about the order of the arguments. Two, we can give values to only those parameters to which we want to, provided that the other parameters have default argument values.

Example (save as function_keyword.py):

def func(a, b=5, c=10):
    print 'a is', a, 'and b is', b, 'and c is', c

func(3, 7)
func(25, c=24)
func(c=50, a=100)

Output:

$ python function_keyword.py
a is 3 and b is 7 and c is 10
a is 25 and b is 5 and c is 24
a is 100 and b is 5 and c is 50

How It Works

The function named func has one parameter without a default argument value, followed by two parameters with default argument values.

In the first usage, func(3, 7), the parameter a gets the value 3, the parameter b gets the value 7 and c gets the default value of 10.

In the second usage func(25, c=24), the parameter a gets the value of 25 due to the position of the argument. Then, the parameter c gets the value of 24 due to naming i.e. keyword arguments. The parameter b gets the default value of 5.

In the third usage func(c=50, a=100), we use keyword arguments for all specified values. Notice that we are specifying the value for parameter c before that for a even though a is defined before c in the function definition.

10.6. VarArgs parameters

Sometimes you might want to define a function that can take any number of parameters, i.e. a variable number of arguments. This can be achieved by using the stars (save as function_varargs.py):

def total(initial=5, *numbers, **keywords):
    count = initial
    for number in numbers:
        count += number
    for key in keywords:
        count += keywords[key]
    return count

print total(10, 1, 2, 3, vegetables=50, fruits=100)

Output:

$ python function_varargs.py
166

How It Works

When we declare a starred parameter such as *param, then all the positional arguments from that point till the end are collected as a tuple called 'param'.

Similarly, when we declare a double-starred parameter such as **param, then all the keyword arguments from that point till the end are collected as a dictionary called 'param'.

We will explore tuples and dictionaries in a later chapter.
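
Until then, the sketch below simply hands back what the starred parameters collected, so that you can inspect them. The function name collect is made up for this illustration.

```python
def collect(first, *numbers, **keywords):
    # numbers is a tuple of the extra positional arguments,
    # keywords is a dictionary of the keyword arguments
    return first, numbers, keywords

first, numbers, keywords = collect(10, 1, 2, 3, vegetables=50, fruits=100)
```

After this call, first is 10, numbers is the tuple (1, 2, 3) and keywords is a dictionary mapping 'vegetables' to 50 and 'fruits' to 100.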

10.7. The return statement

The return statement is used to return from a function i.e. break out of the function. We can optionally return a value from the function as well.

Example (save as function_return.py):

def maximum(x, y):
    if x > y:
        return x
    elif x == y:
        return 'The numbers are equal'
    else:
        return y

print maximum(2, 3)

Output:

$ python function_return.py
3

How It Works

The maximum function returns the maximum of the parameters, in this case the numbers supplied to the function. It uses a simple if..elif..else statement to find the greater value and then returns that value; if the two values are equal, it returns a message saying so instead.

Note that a return statement without a value is equivalent to return None. None is a special value in Python that represents nothingness. For example, a variable whose value is None is understood to have no meaningful value.

Every function implicitly contains a return None statement at the end unless you have written your own return statement. You can see this by running print some_function() where the function some_function does not use the return statement such as:

def some_function():
    pass

The pass statement is used in Python to indicate an empty block of statements.
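
Both cases, falling off the end of a function and a bare return, can be checked with a small sketch. The function early_exit is a made-up example.

```python
def some_function():
    pass

def early_exit(n):
    if n < 0:
        return          # bare return, the same as 'return None'
    return n * 2

result = some_function()    # None, from the implicit return
skipped = early_exit(-1)    # None, from the bare return
doubled = early_exit(3)     # 6
```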

Tip

There is a built-in function called `max` that already implements the 'find maximum' functionality, so use this built-in function whenever possible.
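
A few calls to the built-in max are sketched below. The variable names are arbitrary; max accepts either several separate values or a single sequence such as a list.

```python
largest_of_two = max(2, 3)            # 3
largest_in_list = max([4, 9, 1])      # 9
last_word = max('apple', 'pear')      # 'pear' - strings compare alphabetically
```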

10.8. DocStrings

Python has a nifty feature called documentation strings, usually referred to by the shorter name docstrings. DocStrings are an important tool that you should make use of since they help to document the program better and make it easier to understand. Amazingly, we can even get the docstring back from, say, a function, when the program is actually running!

Example (save as function_docstring.py):

def print_max(x, y):
    '''Prints the maximum of two numbers.

    The two values must be integers.'''
    # convert to integers, if possible
    x = int(x)
    y = int(y)

    if x > y:
        print x, 'is maximum'
    else:
        print y, 'is maximum'

print_max(3, 5)
print print_max.__doc__

Output:

$ python function_docstring.py
5 is maximum
Prints the maximum of two numbers.

    The two values must be integers.

How It Works

A string on the first logical line of a function is the docstring for that function. Note that DocStrings also apply to modules and classes which we will learn about in the respective chapters.

The convention followed for a docstring is a multi-line string where the first line starts with a capital letter and ends with a dot. Then the second line is blank followed by any detailed explanation starting from the third line. You are strongly advised to follow this convention for all your docstrings for all your non-trivial functions.

We can access the docstring of the print_max function using the __doc__ (notice the double underscores) attribute (name belonging to) of the function. Just remember that Python treats everything as an object and this includes functions. We’ll learn more about objects in the chapter on classes.

If you have used help() in Python, then you have already seen the usage of docstrings! What it does is just fetch the __doc__ attribute of that function and display it in a neat manner for you. You can try it out on the function above - just include help(print_max) in your program. Remember to press the q key to exit help.

Automated tools can retrieve the documentation from your program in this manner. Therefore, I strongly recommend that you use docstrings for any non-trivial function that you write. The pydoc command that comes with your Python distribution works similarly to help() using docstrings.
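
As a rough sketch of what such tools do, the standard inspect module offers getdoc, which fetches a docstring and tidies up its indentation. The function area below is a made-up example that follows the docstring convention described above.

```python
import inspect

def area(width, height):
    '''Return the area of a rectangle.

    Both values are expected to be numbers.'''
    return width * height

raw = area.__doc__            # the docstring exactly as written
tidy = inspect.getdoc(area)   # the same text with the indentation cleaned up
```

The third line of raw still begins with the four spaces of indentation from the source file, whereas in tidy those leading spaces have been removed.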

10.9. Summary

We have seen so many aspects of functions but note that we still haven’t covered all aspects of them. However, we have already covered most of what you’ll use regarding Python functions on an everyday basis.

Next, we will see how to use as well as create Python modules.

11. Modules

You have seen how you can reuse code in your program by defining functions once. What if you wanted to reuse a number of functions in other programs that you write? As you might have guessed, the answer is modules.

There are various methods of writing modules, but the simplest way is to create a file with a .py extension that contains functions and variables.

Another method is to write the modules in the native language in which the Python interpreter itself was written. For example, you can write modules in the C programming language and when compiled, they can be used from your Python code when using the standard Python interpreter.

A module can be imported by another program to make use of its functionality. This is how we can use the Python standard library as well. First, we will see how to use the standard library modules.

Example (save as module_using_sys.py):

import sys

print 'The command line arguments are:'
for i in sys.argv:
    print i

print '\n\nThe PYTHONPATH is', sys.path, '\n'

Output:

$ python module_using_sys.py we are arguments
The command line arguments are:
module_using_sys.py
we
are
arguments


The PYTHONPATH is ['/tmp/py',
# many entries here, not shown here
'/Library/Python/2.7/site-packages',
'/usr/local/lib/python2.7/site-packages']

How It Works

First, we import the sys module using the import statement. Basically, this translates to us telling Python that we want to use this module. The sys module contains functionality related to the Python interpreter and its environment i.e. the system.

When Python executes the import sys statement, it looks for the sys module. In this case, it is one of the built-in modules, and hence Python knows where to find it.

If it is not a compiled module, i.e. it is a module written in Python, then the Python interpreter searches for it in the directories listed in its sys.path variable. If the module is found, then the statements in the body of that module are run and the module is made available for you to use. Note that the initialization is done only the first time that we import a module.
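
That a module is initialized only once can be observed through sys.modules, the dictionary in which Python keeps the modules it has already imported. This is only a sketch; the name math_again is arbitrary.

```python
import sys
import math

cached = sys.modules['math']   # the module object stored by the first import
import math as math_again      # a second import reuses the stored module

same_object = math_again is math
```

Here same_object is True: the second import statement did not load or run the module again.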

The argv variable in the sys module is accessed using the dotted notation i.e. sys.argv. It clearly indicates that this name is part of the sys module. Another advantage of this approach is that the name does not clash with any argv variable used in your program.

The sys.argv variable is a list of strings (lists are explained in detail in a later chapter). Specifically, sys.argv contains the list of command line arguments i.e. the arguments passed to your program using the command line.

If you are using an IDE to write and run these programs, look for a way to specify command line arguments to the program in the menus.

Here, when we execute python module_using_sys.py we are arguments, we run the module module_using_sys.py with the python command and the other things that follow are arguments passed to the program. Python stores the command line arguments in the sys.argv variable for us to use.

Remember, the name of the script running is always the first argument in the sys.argv list. So, in this case we will have 'module_using_sys.py' as sys.argv[0], 'we' as sys.argv[1], 'are' as sys.argv[2] and 'arguments' as sys.argv[3]. Notice that Python starts counting from 0 and not 1.
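
The sketch below separates the script name from the remaining arguments. It works on an ordinary list so that it can be tried without the command line; the function name describe_arguments is hypothetical, and the slicing notation argv[1:] is explained in a later chapter.

```python
def describe_arguments(argv):
    # argv[0] is the script name; everything after it is a user argument
    script = argv[0]
    arguments = argv[1:]
    return script, arguments, len(arguments)

script, arguments, count = describe_arguments(
    ['module_using_sys.py', 'we', 'are', 'arguments'])
```

In a real program you would pass sys.argv to the function instead of the hand-written list.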

The sys.path contains the list of directory names where modules are imported from. Observe that the first string in sys.path is the directory containing the script being run ('/tmp/py' in the output above); when Python is started interactively, this entry is an empty string instead, which stands for the current directory. The directories named in the PYTHONPATH environment variable are also included in sys.path. This means that you can directly import modules located in the same directory as your program. Otherwise, you will have to place your module in one of the directories listed in sys.path.

Note that the current directory is the directory from which the program is launched. Run import os; print os.getcwd() to find out the current directory of your program.

11.1. Byte-compiled .pyc files

Importing a module is a relatively costly affair, so Python does some tricks to make it faster. One way is to create byte-compiled files with the extension .pyc which is an intermediate form that Python transforms the program into (remember the introduction section on how Python works?). This .pyc file is useful when you import the module the next time from a different program - it will be much faster since a portion of the processing required in importing a module is already done. Also, these byte-compiled files are platform-independent.

Note

These `.pyc` files are usually created in the same directory as the corresponding `.py` files. If Python does not have permission to write to files in that directory, then the `.pyc` files will not be created.

11.2. The from … import statement

If you want to directly import the argv variable into your program (to avoid typing the sys. every time for it), then you can use the from sys import argv statement.

In general, you should avoid using this statement and use the import statement instead since your program will avoid name clashes and will be more readable.

Example:

from math import sqrt
print "Square root of 16 is", sqrt(16)
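
The name clash mentioned above can be sketched as follows. The replacement sqrt function is a made-up example of a name in your own program that happens to collide with the imported one; print is written with parentheses around a single value so the lines behave the same on Python 2 and Python 3.

```python
import math
from math import sqrt

print(math.sqrt(16))   # 4.0 - the dotted name says where sqrt comes from
print(sqrt(16))        # 4.0 - shorter, but the origin is no longer visible

def sqrt(x):
    # A hypothetical function of our own with the same name
    return 'my own sqrt'

print(sqrt(16))        # my own sqrt - the imported name has been replaced
print(math.sqrt(16))   # 4.0 - the dotted form is unaffected
```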

11.3. A module’s `__name__`

Every module has a name and statements in a module can find out the name of their module. This is handy for the particular purpose of figuring out whether the module is being run standalone or being imported. As mentioned previously, when a module is imported for the first time, the code it contains gets executed. We can use this to make the module behave in different ways depending on whether it is being used by itself or being imported from another module. This can be achieved using the __name__ attribute of the module.

Example (save as module_using_name.py):

if __name__ == '__main__':
    print 'This program is being run by itself'
else:
    print 'I am being imported from another module'

Output:

$ python module_using_name.py
This program is being run by itself

$ python
>>> import module_using_name
I am being imported from another module
>>>

How It Works

Every Python module has its __name__ defined. If this is '__main__', that implies that the module is being run standalone by the user and we can take appropriate actions.
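A common way to put this check to use is to keep the reusable code in functions and call a starting function only when the file is run directly. The sketch below assumes two hypothetical function names, greet and main, which are not part of the example above.

```python
def greet(name):
    # Reusable piece that other modules can import and call
    return 'Hello, {0}'.format(name)


def main():
    # Meant to run only when this file is executed directly
    print(greet('world'))


if __name__ == '__main__':
    main()
```

When another program imports this file, greet is available to it but main is not called automatically.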

11.4. Making Your Own Modules

Creating your own modules is easy - you’ve been doing it all along! This is because every Python program is also a module. You just have to make sure it has a .py extension. The following example should make it clear.

Example (save as mymodule.py):

def say_hi():
    print 'Hi, this is mymodule speaking.'

__version__ = '0.1'

The above was a sample module. As you can see, there is nothing particularly special about it compared to our usual Python program. We will next see how to use this module in our other Python programs.

Remember that the module should be placed either in the same directory as the program from which we import it, or in one of the directories listed in sys.path.

Another module (save as mymodule_demo.py):

import mymodule

mymodule.say_hi()
print 'Version', mymodule.__version__

Output:

$ python mymodule_demo.py
Hi, this is mymodule speaking.
Version 0.1

How It Works

Notice that we use the same dotted notation to access members of the module. Python makes good reuse of the same notation to give the distinctive 'Pythonic' feel to it so that we don’t have to keep learning new ways to do things.

Here is a version utilising the from..import syntax (save as mymodule_demo2.py):

from mymodule import say_hi, __version__

say_hi()
print 'Version', __version__

The output of mymodule_demo2.py is same as the output of mymodule_demo.py.

Notice that if there was already a __version__ name declared in the module that imports mymodule, there would be a clash. This is also likely because it is common practice for each module to declare its version number using this name. Hence, it is always recommended to prefer the import statement even though it might make your program a little longer.

You could also use:

from mymodule import *

This will import all public names such as say_hi but would not import __version__ because it starts with double underscores.
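To see which names a star-import actually brings in, here is a small sketch. It builds a throwaway module in memory (demo_module is a made-up stand-in for mymodule.py, and _helper is an extra hypothetical name) and runs the import inside a separate dictionary so that the result can be inspected. It is written to run under both Python 2 and Python 3.

```python
import sys
import types

# Build a throwaway module in memory (a stand-in for mymodule.py)
demo = types.ModuleType('demo_module')
demo.say_hi = lambda: 'Hi'
demo.__version__ = '0.1'
demo._helper = 'internal'
sys.modules['demo_module'] = demo

# Run the star-import inside a separate namespace so we can inspect it
namespace = {}
exec('from demo_module import *', namespace)

# Ignore the __builtins__ entry that exec adds on its own
imported = sorted(n for n in namespace if n != '__builtins__')
print(imported)
```

Only say_hi ends up in the namespace. Names that begin with an underscore, including __version__, are skipped.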

Warning

Remember that you should avoid using import-star, i.e. from mymodule import *.

Zen of Python

One of Python’s guiding principles is that "Explicit is better than implicit". Run import this in Python to learn more, and see this StackOverflow discussion which lists examples for each of the principles.

11.5. The dir function

You can use the built-in dir function to list the identifiers that an object defines. For example, for a module, the identifiers include the functions, classes and variables defined in that module.

When you supply a module name to the dir() function, it returns the list of the names defined in that module. When no argument is supplied to it, it returns the list of names defined in the current module.

Example:

$ python
>>> import sys

# get names of attributes in sys module
>>> dir(sys)
['__displayhook__', '__doc__',
'argv', 'builtin_module_names',
'version', 'version_info']
# only few entries shown here

# get names of attributes for current module
>>> dir()
['__builtins__', '__doc__',
'__name__', '__package__', 'sys']

# create a new variable 'a'
>>> a = 5

>>> dir()
['__builtins__', '__doc__', '__name__', '__package__', 'a', 'sys']

# delete/remove a name
>>> del a

>>> dir()
['__builtins__', '__doc__', '__name__', '__package__', 'sys']

How It Works

First, we see the usage of dir on the imported sys module. We can see the huge list of attributes that it contains.

Next, we use the dir function without passing parameters to it. By default, it returns the list of attributes for the current module. Notice that the list of imported modules is also part of this list.

In order to observe the dir in action, we define a new variable a and assign it a value and then check dir and we observe that there is an additional value in the list of the same name. We remove the variable/attribute of the current module using the del statement and the change is reflected again in the output of the dir function.

A note on del - this statement is used to delete a variable/name and after the statement has run, in this case del a, you can no longer access the variable a - it is as if it never existed before at all.

Note that the dir() function works on any object. For example, run dir(len) to learn about the attributes of the built-in len function, or dir(str) for the attributes of the str class.

There is also a vars() function which can potentially give you the attributes and their values, but it will not work for all cases.
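As a rough illustration of both halves of that sentence, the sketch below calls vars() on the math module, where it works, and on a plain integer, where it does not. The choice of math and of the integer 5 is arbitrary.

```python
import math

# vars() on a module returns its namespace as a dictionary
math_names = vars(math)
print(math_names['pi'])

# Objects without a __dict__, such as integers, are a case where vars() fails
try:
    vars(5)
    vars_worked = True
except TypeError:
    vars_worked = False
print(vars_worked)
```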

11.6. Packages

By now, you must have started observing the hierarchy of organizing your programs. Variables usually go inside functions. Functions and global variables usually go inside modules. What if you wanted to organize modules? That’s where packages come into the picture.

Packages are just folders of modules with a special __init__.py file that indicates to Python that this folder is special because it contains Python modules.

Let’s say you want to create a package called 'world' with subpackages 'asia', 'africa', etc. and these subpackages in turn contain modules like 'india', 'madagascar', etc.

This is how you would structure the folders:

- <some folder present in the sys.path>/
    - world/
        - __init__.py
        - asia/
            - __init__.py
            - india/
                - __init__.py
                - foo.py
        - africa/
            - __init__.py
            - madagascar/
                - __init__.py
                - bar.py

Packages are just a convenience to hierarchically organize modules. You will see many instances of this in the standard library.
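To try the layout without creating folders by hand, the following sketch builds part of it inside a temporary directory and then imports the module at the bottom. The variable capital and its value are made up for the demonstration; only the folder and file names come from the structure shown above.

```python
import os
import sys
import tempfile

# Create world/asia/india/ inside a temporary folder
base = tempfile.mkdtemp()
package_dir = os.path.join(base, 'world', 'asia', 'india')
os.makedirs(package_dir)

# Every level of the hierarchy gets its own __init__.py
for folder in ('world',
               os.path.join('world', 'asia'),
               os.path.join('world', 'asia', 'india')):
    open(os.path.join(base, folder, '__init__.py'), 'w').close()

# foo.py is the module at the bottom of the hierarchy
with open(os.path.join(package_dir, 'foo.py'), 'w') as f:
    f.write("capital = 'New Delhi'\n")

# Make the temporary folder one of the places Python searches
sys.path.insert(0, base)

from world.asia.india import foo
print(foo.capital)
```

The dotted name in the import statement mirrors the folder hierarchy, one dot per level.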

11.7. Summary

Just like functions are reusable parts of programs, modules are reusable programs. Packages are another hierarchy to organize modules. The standard library that comes with Python is an example of such a set of packages and modules.

We have seen how to use these modules and create our own modules.

Next, we will learn about some interesting concepts called data structures.

12. Data Structures

Data structures are basically just that - they are structures which can hold some data together. In other words, they are used to store a collection of related data.

There are four built-in data structures in Python - list, tuple, dictionary and set. We will see how to use each of them and how they make life easier for us.

12.1. List

A list is a data structure that holds an ordered collection of items i.e. you can store a sequence of items in a list. This is easy to imagine if you can think of a shopping list where you have a list of items to buy, except that you probably have each item on a separate line in your shopping list whereas in Python you put commas in between them.

The list of items should be enclosed in square brackets so that Python understands that you are specifying a list. Once you have created a list, you can add, remove or search for items in the list. Since we can add and remove items, we say that a list is a mutable data type i.e. this type can be altered.

12.2. Quick Introduction To Objects And Classes

Although I’ve been generally delaying the discussion of objects and classes till now, a little explanation is needed right now so that you can understand lists better. We will explore this topic in detail in a later chapter.

A list is an example of usage of objects and classes. When we use a variable i and assign a value to it, say the integer 5, you can think of it as creating an object (i.e. instance) i of class (i.e. type) int. In fact, you can read help(int) to understand this better.

A class can also have methods i.e. functions defined for use with respect to that class only. You can use these pieces of functionality only when you have an object of that class. For example, Python provides an append method for the list class which allows you to add an item to the end of the list. For example, mylist.append('an item') will add that string to the list mylist. Note the use of dotted notation for accessing methods of the objects.

A class can also have fields which are nothing but variables defined for use with respect to that class only. You can use these variables/names only when you have an object of that class. Fields are also accessed by the dotted notation, for example, mylist.field.
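The three ideas above (objects, methods and fields) can be seen in a few lines. In this sketch the complex number is chosen only because it is a built-in type that has fields, real and imag. It is not something the text above relies on.

```python
# i is an object (instance) of the class int
i = 5
print(type(i).__name__)

# append is a method of the list class, reached through the dot
mylist = []
mylist.append('an item')
print(mylist)

# real is a field of complex-number objects, also reached through the dot
z = complex(3, 4)
print(z.real)
```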

Example (save as ds_using_list.py):

# This is my shopping list
shoplist = ['apple', 'mango', 'carrot', 'banana']

print 'I have', len(shoplist), 'items to purchase.'

print 'These items are:',
for item in shoplist:
    print item,

print '\nI also have to buy rice.'
shoplist.append('rice')
print 'My shopping list is now', shoplist

print 'I will sort my list now'
shoplist.sort()
print 'Sorted shopping list is', shoplist

print 'The first item I will buy is', shoplist[0]
olditem = shoplist[0]
del shoplist[0]
print 'I bought the', olditem
print 'My shopping list is now', shoplist

Output:

$ python ds_using_list.py
I have 4 items to purchase.
These items are: apple mango carrot banana
I also have to buy rice.
My shopping list is now ['apple', 'mango', 'carrot', 'banana', 'rice']
I will sort my list now
Sorted shopping list is ['apple', 'banana', 'carrot', 'mango', 'rice']
The first item I will buy is apple
I bought the apple
My shopping list is now ['banana', 'carrot', 'mango', 'rice']

How It Works

The variable shoplist is a shopping list for someone who is going to the market. In shoplist, we only store strings of the names of the items to buy but you can add any kind of object to a list including numbers and even other lists.

We have also used the for..in loop to iterate through the items of the list. By now, you must have realised that a list is also a sequence. The speciality of sequences will be discussed in a later section.

Notice the use of the trailing comma in the print statement to indicate that we want to end the output with a space instead of the usual line break. Think of the comma as telling Python that we have more items to print on the same line.

Next, we add an item to the list using the append method of the list object, as already discussed before. Then, we check that the item has been indeed added to the list by printing the contents of the list by simply passing the list to the print statement which prints it neatly.

Then, we sort the list by using the sort method of the list. It is important to understand that this method affects the list itself and does not return a modified list - this is different from the way strings work. This is what we mean by saying that lists are mutable and that strings are immutable.
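The difference is easy to observe directly. This short sketch, which uses a made-up three-item list, shows that sort changes the list and returns nothing useful, whereas a string method such as upper leaves the original string alone and hands back a new one.

```python
shoplist = ['mango', 'apple', 'carrot']

# sort() changes the list in place and returns None
result = shoplist.sort()
print(result)
print(shoplist)

# String methods leave the original untouched and return a new string
name = 'swaroop'
upper_name = name.upper()
print(name)
print(upper_name)
```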

Next, when we finish buying an item in the market, we want to remove it from the list. We achieve this by using the del statement. Here, we mention which item of the list we want to remove and the del statement removes it from the list for us. We specify that we want to remove the first item from the list and hence we use del shoplist[0] (remember that Python starts counting from 0).

If you want to know all the methods defined by the list object, see help(list) for details.

12.3. Tuple

Tuples are used to hold together multiple objects. Think of them as similar to lists, but without the extensive functionality that the list class gives you. One major feature of tuples is that they are immutable like strings i.e. you cannot modify tuples.

Tuples are defined by specifying items separated by commas within an optional pair of parentheses.

Tuples are usually used in cases where a statement or a user-defined function can safely assume that the collection of values i.e. the tuple of values used will not change.

Example (save as ds_using_tuple.py):

# I would recommend always using parentheses
# to indicate start and end of tuple
# even though parentheses are optional.
# Explicit is better than implicit.
zoo = ('python', 'elephant', 'penguin')
print 'Number of animals in the zoo is', len(zoo)

new_zoo = 'monkey', 'camel', zoo
print 'Number of cages in the new zoo is', len(new_zoo)
print 'All animals in new zoo are', new_zoo
print 'Animals brought from old zoo are', new_zoo[2]
print 'Last animal brought from old zoo is', new_zoo[2][2]
print 'Number of animals in the new zoo is', \
    len(new_zoo)-1+len(new_zoo[2])

Output:

$ python ds_using_tuple.py
Number of animals in the zoo is 3
Number of cages in the new zoo is 3
All animals in new zoo are ('monkey', 'camel', ('python', 'elephant', 'penguin'))
Animals brought from old zoo are ('python', 'elephant', 'penguin')
Last animal brought from old zoo is penguin
Number of animals in the new zoo is 5

How It Works

The variable zoo refers to a tuple of items. We see that the len function can be used to get the length of the tuple. This also indicates that a tuple is a sequence as well.

We are now shifting these animals to a new zoo since the old zoo is being closed. Therefore, the new_zoo tuple contains some animals which are already there along with the animals brought over from the old zoo. Back to reality, note that a tuple within a tuple does not lose its identity.

We can access the items in the tuple by specifying the item’s position within a pair of square brackets just like we did for lists. This is called the indexing operator. We access the third item in new_zoo by specifying new_zoo[2] and we access the third item within the third item in the new_zoo tuple by specifying new_zoo[2][2]. This is pretty simple once you’ve understood the idiom.

Note

Tuple with 0 or 1 items

An empty tuple is constructed by an empty pair of parentheses such as myempty = (). However, a tuple with a single item is not so simple. You have to specify it using a comma following the first (and only) item so that Python can differentiate between a tuple and a pair of parentheses surrounding the object in an expression i.e. you have to specify singleton = (2 , ) if you mean you want a tuple containing the item 2.
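A few lines make the difference visible. The name not_a_tuple is only illustrative.

```python
myempty = ()
singleton = (2, )
not_a_tuple = (2)   # just the integer 2 in parentheses

print(len(myempty))
print(len(singleton))
print(type(not_a_tuple).__name__)
```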

Note

Note for Perl programmers

A list within a list does not lose its identity i.e. lists are not flattened as in Perl. The same applies to a tuple within a tuple, or a tuple within a list, or a list within a tuple, etc. As far as Python is concerned, they are just objects stored using another object, that’s all.

12.4. Dictionary

A dictionary is like an address-book where you can find the address or contact details of a person by knowing only his/her name i.e. we associate keys (name) with values (details). Note that the key must be unique just like you cannot find out the correct information if you have two persons with the exact same name.

Note that you can use only immutable objects (like strings) for the keys of a dictionary but you can use either immutable or mutable objects for the values of the dictionary. This basically translates to say that you should use only simple objects for keys.
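Here is a small sketch of that rule, using a made-up dictionary of coordinates. A tuple of strings is accepted as a key, while a list with the same contents is rejected with a TypeError.

```python
# A tuple is immutable, so it can serve as a key
locations = {('12.97', '77.59'): 'Bangalore'}
print(locations[('12.97', '77.59')])

# A list is mutable, so Python refuses to use it as a key
try:
    bad = {['12.97', '77.59']: 'Bangalore'}
    list_key_allowed = True
except TypeError:
    list_key_allowed = False
print(list_key_allowed)
```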

Pairs of keys and values are specified in a dictionary by using the notation d = {key1 : value1, key2 : value2 }. Notice that the key-value pairs are separated by a colon and the pairs are separated themselves by commas and all this is enclosed in a pair of curly braces.

Remember that key-value pairs in a dictionary are not ordered in any manner. If you want a particular order, then you will have to sort them yourself before using it.
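One simple way to impose an order, sketched here with a small address-book of the kind used in this section, is to loop over the sorted keys instead of the dictionary itself.

```python
ab = {'Swaroop': 'swaroop@swaroopch.com',
      'Larry': 'larry@wall.org',
      'Matsumoto': 'matz@ruby-lang.org'}

# sorted() returns the keys as a new list in alphabetical order
ordered_names = sorted(ab)
for name in ordered_names:
    print('Contact {0} at {1}'.format(name, ab[name]))
```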

The dictionaries that you will be using are instances/objects of the dict class.

Example (save as ds_using_dict.py):

# 'ab' is short for 'a'ddress'b'ook

ab = {  'Swaroop'   : 'swaroop@swaroopch.com',
        'Larry'     : 'larry@wall.org',
        'Matsumoto' : 'matz@ruby-lang.org',
        'Spammer'   : 'spammer@hotmail.com'
    }

print "Swaroop's address is", ab['Swaroop']

# Deleting a key-value pair
del ab['Spammer']

print '\nThere are {} contacts in the address-book\n'.format(len(ab))

for name, address in ab.items():
    print 'Contact {} at {}'.format(name, address)

# Adding a key-value pair
ab['Guido'] = 'guido@python.org'

if 'Guido' in ab:
    print "\nGuido's address is", ab['Guido']

Output:

$ python ds_using_dict.py
Swaroop's address is swaroop@swaroopch.com

There are 3 contacts in the address-book

Contact Swaroop at swaroop@swaroopch.com
Contact Matsumoto at matz@ruby-lang.org
Contact Larry at larry@wall.org

Guido's address is guido@python.org

How It Works

We create the dictionary ab using the notation already discussed. We then access key-value pairs by specifying the key using the indexing operator as discussed in the context of lists and tuples. Observe the simple syntax.

We can delete key-value pairs using our old friend - the del statement. We simply specify the dictionary and the indexing operator for the key to be removed and pass it to the del statement. There is no need to know the value corresponding to the key for this operation.

Next, we access each key-value pair of the dictionary using the items method of the dictionary which returns a list of tuples where each tuple contains a pair of items - the key followed by the value. We retrieve this pair and assign it to the variables name and address correspondingly for each pair using the for..in loop and then print these values in the for-block.

We can add new key-value pairs by simply using the indexing operator to access a key and assign that value, as we have done for Guido in the above case.

We can check if a key exists in the dictionary using the in operator.

For the list of methods of the dict class, see help(dict).

Tip

Keyword Arguments and Dictionaries

If you have used keyword arguments in your functions, you have already used dictionaries! Just think about it - the key-value pair is specified by you in the parameter list of the function definition and when you access variables within your function, it is just a key access of a dictionary (which is called the symbol table in compiler design terminology).

12.5. Sequence

Lists, tuples and strings are examples of sequences, but what are sequences and what is so special about them?

The major features are membership tests (i.e. the in and not in expressions) and indexing operations, which allow us to fetch a particular item in the sequence directly.

The three types of sequences mentioned above - lists, tuples and strings, also have a slicing operation which allows us to retrieve a slice of the sequence i.e. a part of the sequence.

Example (save as ds_seq.py):

shoplist = ['apple', 'mango', 'carrot', 'banana']
name = 'swaroop'

# Indexing or 'Subscription' operation #
print 'Item 0 is', shoplist[0]
print 'Item 1 is', shoplist[1]
print 'Item 2 is', shoplist[2]
print 'Item 3 is', shoplist[3]
print 'Item -1 is', shoplist[-1]
print 'Item -2 is', shoplist[-2]
print 'Character 0 is', name[0]

# Slicing on a list #
print 'Item 1 to 3 is', shoplist[1:3]
print 'Item 2 to end is', shoplist[2:]
print 'Item 1 to -1 is', shoplist[1:-1]
print 'Item start to end is', shoplist[:]

# Slicing on a string #
print 'characters 1 to 3 is', name[1:3]
print 'characters 2 to end is', name[2:]
print 'characters 1 to -1 is', name[1:-1]
print 'characters start to end is', name[:]

Output:

$ python ds_seq.py
Item 0 is apple
Item 1 is mango
Item 2 is carrot
Item 3 is banana
Item -1 is banana
Item -2 is carrot
Character 0 is s
Item 1 to 3 is ['mango', 'carrot']
Item 2 to end is ['carrot', 'banana']
Item 1 to -1 is ['mango', 'carrot']
Item start to end is ['apple', 'mango', 'carrot', 'banana']
characters 1 to 3 is wa
characters 2 to end is aroop
characters 1 to -1 is waroo
characters start to end is swaroop

How It Works

First, we see how to use indexes to get individual items of a sequence. This is also referred to as the subscription operation. Whenever you specify a number to a sequence within square brackets as shown above, Python will fetch you the item corresponding to that position in the sequence. Remember that Python starts counting numbers from 0. Hence, shoplist[0] fetches the first item and shoplist[3] fetches the fourth item in the shoplist sequence.

The index can also be a negative number, in which case, the position is calculated from the end of the sequence. Therefore, shoplist[-1] refers to the last item in the sequence and shoplist[-2] fetches the second last item in the sequence.

The slicing operation is used by specifying the name of the sequence followed by an optional pair of numbers separated by a colon within square brackets. Note that this is very similar to the indexing operation you have been using till now. Remember the numbers are optional but the colon isn’t.

The first number (before the colon) in the slicing operation refers to the position from where the slice starts and the second number (after the colon) indicates where the slice will stop at. If the first number is not specified, Python will start at the beginning of the sequence. If the second number is left out, Python will stop at the end of the sequence. Note that the slice returned starts at the start position and will end just before the end position i.e. the start position is included but the end position is excluded from the sequence slice.

Thus, shoplist[1:3] returns a slice of the sequence starting at position 1, includes position 2 but stops at position 3 and therefore a slice of two items is returned. Similarly, shoplist[:] returns a copy of the whole sequence.

You can also do slicing with negative positions. Negative numbers are used for positions from the end of the sequence. For example, shoplist[:-1] will return a slice of the sequence which excludes the last item of the sequence but contains everything else.

You can also provide a third argument for the slice, which is the step for the slicing (by default, the step size is 1):

>>> shoplist = ['apple', 'mango', 'carrot', 'banana']
>>> shoplist[::1]
['apple', 'mango', 'carrot', 'banana']
>>> shoplist[::2]
['apple', 'carrot']
>>> shoplist[::3]
['apple', 'banana']
>>> shoplist[::-1]
['banana', 'carrot', 'mango', 'apple']

Notice that when the step is 2, we get the items with position 0, 2,… When the step size is 3, we get the items with position 0, 3, etc.

Try various combinations of such slice specifications using the Python interpreter interactively i.e. the prompt so that you can see the results immediately. The great thing about sequences is that you can access tuples, lists and strings all in the same way!
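As one such experiment, the sketch below applies the same two slices to a list, a tuple and a string holding the same four letters. The three variable names are made up for the comparison.

```python
as_list = ['a', 'b', 'c', 'd']
as_tuple = ('a', 'b', 'c', 'd')
as_string = 'abcd'

for seq in (as_list, as_tuple, as_string):
    # Positions 1 and 2, then the whole sequence reversed
    print(seq[1:3])
    print(seq[::-1])
```

Each slice comes back as the same type as the sequence it was taken from.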

12.6. Set

Sets are unordered collections of simple objects. These are used when the existence of an object in a collection is more important than the order or how many times it occurs.

Using sets, you can test for membership, whether it is a subset of another set, find the intersection between two sets, and so on.

>>> bri = set(['brazil', 'russia', 'india'])
>>> 'india' in bri
True
>>> 'usa' in bri
False
>>> bric = bri.copy()
>>> bric.add('china')
>>> bric.issuperset(bri)
True
>>> bri.remove('russia')
>>> bri & bric # OR bri.intersection(bric)
set(['brazil', 'india'])

How It Works

The example is pretty much self-explanatory because it involves basic set theory mathematics taught in school.
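One more small sketch may help with the "how many times it occurs" point: building a set from a list throws away the repeats. The visits list is invented for the illustration.

```python
visits = ['india', 'brazil', 'india', 'russia', 'india']

# Duplicates collapse into a single entry
countries = set(visits)
print(len(countries))

# A one-item set is a subset of the larger set
print(set(['india']).issubset(countries))
```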

12.7. References

When you create an object and assign it to a variable, the variable only refers to the object and does not represent the object itself! That is, the variable name points to that part of your computer’s memory where the object is stored. This is called binding the name to the object.

Generally, you don’t need to be worried about this, but there is a subtle effect due to references which you need to be aware of:

Example (save as ds_reference.py):

print 'Simple Assignment'
shoplist = ['apple', 'mango', 'carrot', 'banana']
# mylist is just another name pointing to the same object!
mylist = shoplist

# I purchased the first item, so I remove it from the list
del shoplist[0]

print 'shoplist is', shoplist
print 'mylist is', mylist
# Notice that both shoplist and mylist both print
# the same list without the 'apple' confirming that
# they point to the same object

print 'Copy by making a full slice'
# Make a copy by doing a full slice
mylist = shoplist[:]
# Remove first item
del mylist[0]

print 'shoplist is', shoplist
print 'mylist is', mylist
# Notice that now the two lists are different

Output:

$ python ds_reference.py
Simple Assignment
shoplist is ['mango', 'carrot', 'banana']
mylist is ['mango', 'carrot', 'banana']
Copy by making a full slice
shoplist is ['mango', 'carrot', 'banana']
mylist is ['carrot', 'banana']

How It Works

Most of the explanation is available in the comments.

Remember that if you want to make a copy of a list or such kinds of sequences or complex objects (not simple objects such as integers), then you have to use the slicing operation to make a copy. If you just assign the variable name to another name, both of them will refer to the same object and this could be trouble if you are not careful.
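One caveat worth seeing in code: a full slice copies only the outer list. The sketch below uses a made-up nested list and brings in copy.deepcopy from the standard library, which is not used elsewhere in this chapter, to show the contrast.

```python
import copy

shoplist = ['apple', ['mango', 'carrot']]

alias = shoplist                 # same object, no copy
sliced = shoplist[:]             # new outer list
deep = copy.deepcopy(shoplist)   # new outer and inner lists

print(alias is shoplist)
print(sliced is shoplist)
# The slice still shares the inner list with the original
print(sliced[1] is shoplist[1])
print(deep[1] is shoplist[1])
```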

Note

Note for Perl programmers

Remember that an assignment statement for lists does not create a copy. You have to use the slicing operation to make a copy of the sequence.

12.8. More About Strings

We have already discussed strings in detail earlier. What more can there be to know? Well, did you know that strings are also objects and have methods which do everything from checking part of a string to stripping spaces?

The strings that you use in your programs are all objects of the class str. Some useful methods of this class are demonstrated in the next example. For a complete list of such methods, see help(str).

Example (save as ds_str_methods.py):

# This is a string object
name = 'Swaroop'

if name.startswith('Swa'):
    print 'Yes, the string starts with "Swa"'

if 'a' in name:
    print 'Yes, it contains the string "a"'

if name.find('war') != -1:
    print 'Yes, it contains the string "war"'

delimiter = '_*_'
mylist = ['Brazil', 'Russia', 'India', 'China']
print delimiter.join(mylist)

Output:

$ python ds_str_methods.py
Yes, the string starts with "Swa"
Yes, it contains the string "a"
Yes, it contains the string "war"
Brazil_*_Russia_*_India_*_China

How It Works

Here, we see a lot of the string methods in action. The startswith method is used to find out whether the string starts with the given string. The in operator is used to check if a given string is a part of the string.

The find method is used to locate the position of the given substring within the string; find returns -1 if it is unsuccessful in finding the substring. The str class also has a neat method to join the items of a sequence with the string acting as a delimiter between each item of the sequence and returns a bigger string generated from this.
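The return values themselves are worth a look. This sketch prints what find gives back in both the successful and the unsuccessful case, and adds two methods not used in the example above, split and strip, purely as further illustrations of the str class.

```python
name = 'Swaroop'

# find returns the index where the substring starts, or -1
found_at = name.find('war')
missing = name.find('xyz')
print(found_at)
print(missing)

# join puts the delimiter between items; split undoes it
delimiter = '_*_'
joined = delimiter.join(['Brazil', 'Russia', 'India', 'China'])
parts = joined.split(delimiter)
print(parts)

# strip removes spaces from both ends
stripped = '  hello  '.strip()
print(stripped)
```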

12.9. Summary

We have explored the various built-in data structures of Python in detail. These data structures will be essential for writing programs of reasonable size.

Now that we have a lot of the basics of Python in place, we will next see how to design and write a real-world Python program.

13. Problem Solving

We have explored various parts of the Python language and now we will take a look at how all these parts fit together, by designing and writing a program which does something useful. The idea is to learn how to write a Python script on your own.

13.1. The Problem

The problem we want to solve is:

I want a program which creates a backup of all my important files.

Although this is a simple problem, there is not enough information for us to get started with the solution. A little more analysis is required. For example, how do we specify which files are to be backed up? How are they stored? Where are they stored?

After analyzing the problem properly, we design our program. We make a list of things about how our program should work. In this case, I have created the following list on how I want it to work. If you do the design, you may not come up with the same kind of analysis since every person has their own way of doing things, so that is perfectly okay.

1. The files and directories to be backed up are specified in a list.
2. The backup must be stored in a main backup directory.
3. The files are backed up into a zip file.
4. The name of the zip archive is the current date and time.
5. We use the standard zip command available by default in any standard GNU/Linux or Unix distribution. Note that you can use any archiving command you want as long as it has a command line interface.

Note

For Windows users

Windows users can install the zip command from the GnuWin32 project page and add C:\Program Files\GnuWin32\bin to the system PATH environment variable, similar to what we did for recognizing the python command itself.

13.2. The Solution

As the design of our program is now reasonably stable, we can write the code which is an implementation of our solution.

Save as backup_ver1.py:

import os
import time

# 1. The files and directories to be backed up are
# specified in a list.
# Example on Windows:
# source = ['"C:\\My Documents"', 'C:\\Code']
# Example on Mac OS X and Linux:
source = ['/Users/swa/notes']
# Notice we had to use double quotes inside the string
# for names with spaces in it.

# 2. The backup must be stored in a
# main backup directory
# Example on Windows:
# target_dir = 'E:\\Backup'
# Example on Mac OS X and Linux:
target_dir = '/Users/swa/backup'
# Remember to change this to the folder you will be using

# 3. The files are backed up into a zip file.
# 4. The name of the zip archive is the current date and time
target = target_dir + os.sep + \
         time.strftime('%Y%m%d%H%M%S') + '.zip'

# Create target directory if it is not present
if not os.path.exists(target_dir):
    os.mkdir(target_dir) # make directory

# 5. We use the zip command to put the files in a zip archive
zip_command = "zip -r {0} {1}".format(target,
                                      ' '.join(source))

# Run the backup
print "Zip command is:"
print zip_command
print "Running:"
if os.system(zip_command) == 0:
    print 'Successful backup to', target
else:
    print 'Backup FAILED'

Output:

$ python backup_ver1.py
Zip command is:
zip -r /Users/swa/backup/20140328084844.zip /Users/swa/notes
Running:
  adding: Users/swa/notes/ (stored 0%)
  adding: Users/swa/notes/blah1.txt (stored 0%)
  adding: Users/swa/notes/blah2.txt (stored 0%)
  adding: Users/swa/notes/blah3.txt (stored 0%)
Successful backup to /Users/swa/backup/20140328084844.zip

Now, we are in the testing phase where we test that our program works properly. If it doesn’t behave as expected, then we have to debug our program i.e. remove the bugs (errors) from the program.

If the above program does not work for you, copy the command printed after the 'Zip command is:' line in the output, paste it in the shell (on GNU/Linux and Mac OS X) or cmd (on Windows), see what the error is and try to fix it. Also check the zip command manual for what could be wrong. If this command succeeds, then the problem might be in the Python program itself, so check if it exactly matches the program written above.

How It Works

You will notice how we have converted our design into code in a step-by-step manner.

We make use of the os and time modules by first importing them. Then, we specify the files and directories to be backed up in the source list. The target directory is where we store all the backup files and this is specified in the target_dir variable. The name of the zip archive that we are going to create is the current date and time which we generate using the time.strftime() function. It will also have the .zip extension and will be stored in the target_dir directory.

Notice the use of the os.sep variable - this gives the directory separator according to your operating system i.e. it will be '/' in GNU/Linux, Unix and Mac OS X, and '\\' in Windows (the older Mac OS, before OS X, used ':'). Using os.sep instead of these characters directly will make our program portable and work across all of these systems.
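The standard library also offers os.path.join, which places the separator between the parts for you. The backup script does not use it. The sketch below is only an alternative way of building the same kind of path, with an illustrative relative directory.

```python
import os
import time

# Illustrative directory, built from its parts
target_dir = os.path.join('Users', 'swa', 'backup')
archive_name = time.strftime('%Y%m%d%H%M%S') + '.zip'

# os.path.join inserts os.sep between the parts for us
target = os.path.join(target_dir, archive_name)
print(target)
```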

The time.strftime() function takes a specification such as the one we have used in the above program. The %Y specification will be replaced by the year with the century. The %m specification will be replaced by the month as a decimal number between 01 and 12 and so on. The complete list of such specifications can be found in the documentation of the time module in the Python Library Reference.
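To see a few specifications at work, here is a minimal sketch. It formats a fixed moment (built with time.strptime) instead of the current time, purely so that the result is predictable; the variable names are my own.

```python
import time

# A fixed moment instead of "now", so the result is predictable.
moment = time.strptime('2014-03-28 08:48:44', '%Y-%m-%d %H:%M:%S')

# Year, month, day, hour, minute and second, all run together.
stamp = time.strftime('%Y%m%d%H%M%S', moment)
# Only the date part, and only the time part.
day = time.strftime('%Y%m%d', moment)
clock = time.strftime('%H%M%S', moment)

print(stamp)  # 20140328084844
print(day)    # 20140328
print(clock)  # 084844
```

When the second argument is left out, time.strftime() uses the current local time, which is what the backup script relies on.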

We create the name of the target zip file using the addition operator which concatenates the strings i.e. it joins the two strings together and returns a new one. Then, we create a string zip_command which contains the command that we are going to execute. You can check if this command works by running it in the shell (GNU/Linux terminal or DOS prompt).

The zip command that we are using has some options and parameters passed. The -r option specifies that the zip command should work recursively for directories i.e. it should include all the subdirectories and files. The option is followed by the name of the zip archive to create, followed by the list of files and directories to back up. We convert the source list into a string using the join method of strings which we have already seen how to use.

Then, we finally run the command using the os.system function which runs the command as if it was run from the system i.e. in the shell - it returns 0 if the command was successful, else it returns an error number.

Depending on the outcome of the command, we print the appropriate message that the backup has failed or succeeded.

That’s it, we have created a script to take a backup of our important files!

Note

Note to Windows Users

Instead of double backslash escape sequences, you can also use raw strings. For example, use `'C:\\Documents'` or `r'C:\Documents'`. However, do not use `'C:\Documents'` since you end up using an unknown escape sequence `\D`.
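A quick sketch to convince yourself that the two recommended spellings produce the very same string (the variable names are only for illustration):

```python
escaped = 'C:\\Documents'   # backslash written as an escape sequence
raw = r'C:\Documents'       # raw string: the backslash is kept as typed

print(escaped)          # C:\Documents
print(escaped == raw)   # True
print(len(raw))         # 12
```

Both strings contain a single backslash; only the way it is written in the source code differs.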

Now that we have a working backup script, we can use it whenever we want to take a backup of the files. This is called the operation phase or the deployment phase of the software.

The above program works properly, but (usually) first programs do not work exactly as you expect. For example, there might be problems if you have not designed the program properly or if you have made a mistake when typing the code, etc. Accordingly, you will have to go back to the design phase or you will have to debug your program.

13.3. Second Version

The first version of our script works. However, we can make some refinements to it so that it can work better on a daily basis. This is called the maintenance phase of the software.

One of the refinements I felt was useful is a better file-naming mechanism - using the time as the name of the file, inside a directory named after the current date, which in turn sits inside the main backup directory. The first advantage is that your backups are stored in a hierarchical manner and are therefore much easier to manage. The second advantage is that the filenames are much shorter. The third advantage is that separate directories will help you check if you have made a backup for each day, since the directory is created only if you have made a backup for that day.

Save as backup_ver2.py:

import os
import time

# 1. The files and directories to be backed up are
# specified in a list.
# Example on Windows:
# source = ['"C:\\My Documents"', 'C:\\Code']
# Example on Mac OS X and Linux:
source = ['/Users/swa/notes']
# Notice we had to use double quotes inside the string
# for names with spaces in it.

# 2. The backup must be stored in a
# main backup directory
# Example on Windows:
# target_dir = 'E:\\Backup'
# Example on Mac OS X and Linux:
target_dir = '/Users/swa/backup'
# Remember to change this to the folder you will be using

# Create target directory if it is not present
if not os.path.exists(target_dir):
    os.mkdir(target_dir) # make directory

# 3. The files are backed up into a zip file.
# 4. The current day is the name of the subdirectory
# in the main directory.
today = target_dir + os.sep + time.strftime('%Y%m%d')
# The current time is the name of the zip archive.
now = time.strftime('%H%M%S')

# The name of the zip file
target = today + os.sep + now + '.zip'

# Create the subdirectory if it isn't already there
if not os.path.exists(today):
    os.mkdir(today)
    print 'Successfully created directory', today

# 5. We use the zip command to put the files in a zip archive
zip_command = "zip -r {0} {1}".format(target,
                                      ' '.join(source))

# Run the backup
print "Zip command is:"
print zip_command
print "Running:"
if os.system(zip_command) == 0:
    print 'Successful backup to', target
else:
    print 'Backup FAILED'

Output:

$ python backup_ver2.py
Successfully created directory /Users/swa/backup/20140329
Zip command is:
zip -r /Users/swa/backup/20140329/073201.zip /Users/swa/notes
Running:
  adding: Users/swa/notes/ (stored 0%)
  adding: Users/swa/notes/blah1.txt (stored 0%)
  adding: Users/swa/notes/blah2.txt (stored 0%)
  adding: Users/swa/notes/blah3.txt (stored 0%)
Successful backup to /Users/swa/backup/20140329/073201.zip

How It Works

Most of the program remains the same. The changes are that we check if there is a directory with the current day as its name inside the main backup directory using the os.path.exists function. If it doesn’t exist, we create it using the os.mkdir function.
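The check-then-create step can be tried on its own. The following is a minimal sketch that works inside a throwaway directory made with tempfile.mkdtemp (my own choice, so that nothing in your real backup folder is touched); the day name is hypothetical.

```python
import os
import shutil
import tempfile

# A throwaway parent directory, so nothing real is touched.
parent = tempfile.mkdtemp()
today = parent + os.sep + '20140329'  # hypothetical day directory

existed_before = os.path.exists(today)
if not existed_before:
    os.mkdir(today)  # create it only when it is missing

exists_after = os.path.exists(today)
print(existed_before)  # False
print(exists_after)    # True

# Clean up the throwaway directory.
shutil.rmtree(parent)
```

Running the check a second time before the cleanup would skip os.mkdir, which is why the script can safely be run many times on the same day.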

13.4. Third Version

The second version works fine, but when there are lots of backups, I find it hard to tell what each backup was for! For example, I might have made some major changes to a program or presentation, and I want to associate those changes with the name of the zip archive. This can be easily achieved by attaching a user-supplied comment to the name of the zip archive.

Warning

The following program does not work. Do not be alarmed; please follow along because there’s a lesson in here.

Save as backup_ver3.py:

import os
import time

# 1. The files and directories to be backed up are
# specified in a list.
# Example on Windows:
# source = ['"C:\\My Documents"', 'C:\\Code']
# Example on Mac OS X and Linux:
source = ['/Users/swa/notes']
# Notice we had to use double quotes inside the string
# for names with spaces in it.

# 2. The backup must be stored in a
# main backup directory
# Example on Windows:
# target_dir = 'E:\\Backup'
# Example on Mac OS X and Linux:
target_dir = '/Users/swa/backup'
# Remember to change this to the folder you will be using

# Create target directory if it is not present
if not os.path.exists(target_dir):
    os.mkdir(target_dir) # make directory

# 3. The files are backed up into a zip file.
# 4. The current day is the name of the subdirectory
# in the main directory.
today = target_dir + os.sep + time.strftime('%Y%m%d')
# The current time is the name of the zip archive.
now = time.strftime('%H%M%S')

# Take a comment from the user to
# create the name of the zip file
comment = raw_input('Enter a comment --> ')
# Check if a comment was entered
if len(comment) == 0:
    target = today + os.sep + now + '.zip'
else:
    target = today + os.sep + now + '_' +
        comment.replace(' ', '_') + '.zip'

# Create the subdirectory if it isn't already there
if not os.path.exists(today):
    os.mkdir(today)
    print 'Successfully created directory', today

# 5. We use the zip command to put the files in a zip archive
zip_command = "zip -r {0} {1}".format(target,
                                      ' '.join(source))

# Run the backup
print "Zip command is:"
print zip_command
print "Running:"
if os.system(zip_command) == 0:
    print 'Successful backup to', target
else:
    print 'Backup FAILED'

Output:

$ python backup_ver3.py
  File "backup_ver3.py", line 39
    target = today + os.sep + now + '_' +
                                        ^
SyntaxError: invalid syntax

How This (does not) Work

This program does not work! Python says there is a syntax error which means that the script does not satisfy the structure that Python expects to see. When we observe the error given by Python, it also tells us the place where it detected the error as well. So we start debugging our program from that line.

On careful observation, we see that the single logical line has been split into two physical lines but we have not specified that these two physical lines belong together. Basically, Python has found the addition operator (+) without any operand in that logical line and hence it doesn’t know how to continue. Remember that we can specify that the logical line continues in the next physical line by the use of a backslash at the end of the physical line. So, we make this correction to our program. This correction of the program when we find errors is called bug fixing.
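Here is a tiny sketch of the two ways Python lets one logical line span several physical lines. The backslash form is the fix described above; the parenthesized form is the same implicit joining that the format(...) call in our script already relies on. The arithmetic is only a stand-in for the string concatenation in the backup script.

```python
# Explicit joining: a backslash at the end of the physical line.
total = 1 + 2 + \
    3

# Implicit joining: inside parentheses no backslash is needed.
same_total = (1 + 2 +
              3)

print(total)                # 6
print(total == same_total)  # True
```

Without either the backslash or the parentheses, the first physical line would end with a dangling + and Python would report the same SyntaxError we saw above.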

13.5. Fourth Version

Save as backup_ver4.py:

import os
import time

# 1. The files and directories to be backed up are
# specified in a list.
# Example on Windows:
# source = ['"C:\\My Documents"', 'C:\\Code']
# Example on Mac OS X and Linux:
source = ['/Users/swa/notes']
# Notice we had to use double quotes inside the string
# for names with spaces in it.

# 2. The backup must be stored in a
# main backup directory
# Example on Windows:
# target_dir = 'E:\\Backup'
# Example on Mac OS X and Linux:
target_dir = '/Users/swa/backup'
# Remember to change this to the folder you will be using

# Create target directory if it is not present
if not os.path.exists(target_dir):
    os.mkdir(target_dir) # make directory

# 3. The files are backed up into a zip file.
# 4. The current day is the name of the subdirectory
# in the main directory.
today = target_dir + os.sep + time.strftime('%Y%m%d')
# The current time is the name of the zip archive.
now = time.strftime('%H%M%S')

# Take a comment from the user to
# create the name of the zip file
comment = raw_input('Enter a comment --> ')
# Check if a comment was entered
if len(comment) == 0:
    target = today + os.sep + now + '.zip'
else:
    target = today + os.sep + now + '_' + \
        comment.replace(' ', '_') + '.zip'

# Create the subdirectory if it isn't already there
if not os.path.exists(today):
    os.mkdir(today)
    print 'Successfully created directory', today

# 5. We use the zip command to put the files in a zip archive
zip_command = "zip -r {0} {1}".format(target,
                                      ' '.join(source))

# Run the backup
print "Zip command is:"
print zip_command
print "Running:"
if os.system(zip_command) == 0:
    print 'Successful backup to', target
else:
    print 'Backup FAILED'

Output:

$ python backup_ver4.py
Enter a comment --> added new examples
Zip command is:
zip -r /Users/swa/backup/20140329/074122_added_new_examples.zip /Users/swa/notes
Running:
  adding: Users/swa/notes/ (stored 0%)
  adding: Users/swa/notes/blah1.txt (stored 0%)
  adding: Users/swa/notes/blah2.txt (stored 0%)
  adding: Users/swa/notes/blah3.txt (stored 0%)
Successful backup to /Users/swa/backup/20140329/074122_added_new_examples.zip

How It Works

This program now works! Let us go through the actual enhancements that we made in version 3. We take in the user’s comments using the raw_input function and then check if the user actually entered something by finding out the length of the input using the len function. If the user has just pressed enter without entering anything (maybe it was just a routine backup or no special changes were made), then we proceed as we have done before.

However, if a comment was supplied, then this is attached to the name of the zip archive just before the .zip extension. Notice that we are replacing spaces in the comment with underscores - this is because managing filenames without spaces is much easier.
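The naming logic is easy to try in isolation. In the sketch below it is pulled out into a helper called make_target, a name I have introduced only for this illustration (the script itself keeps the logic inline), and the directory and time values are made up.

```python
import os

def make_target(today, now, comment):
    """Build the archive name the way version 4 does.

    make_target is a helper introduced only for this sketch."""
    if len(comment) == 0:
        return today + os.sep + now + '.zip'
    return today + os.sep + now + '_' + \
        comment.replace(' ', '_') + '.zip'

# No comment: the name is just the time.
print(make_target('backup', '074122', ''))
# With a comment: spaces become underscores.
print(make_target('backup', '074122', 'added new examples'))
```

The separator in the printed names depends on your operating system because the helper uses os.sep, just like the script.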

13.6. More Refinements

The fourth version is a satisfactorily working script for most users, but there is always room for improvement. For example, you can include a verbosity level for the program where you can specify a -v option to make your program become more talkative or a -q to make it quiet.

Another possible enhancement would be to allow extra files and directories to be passed to the script at the command line. We can get these names from the sys.argv list and we can add them to our source list using the extend method provided by the list class.
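A minimal sketch of that enhancement might look like the following. The helper name add_sources, the script name and the extra directories are all hypothetical, and a hand-made list stands in for sys.argv so that the result does not depend on how you run the sketch.

```python
source = ['/Users/swa/notes']

def add_sources(source, argv):
    """Append every argument after the script name to source.

    add_sources is a hypothetical helper for this sketch."""
    source.extend(argv[1:])  # argv[0] is the script name itself
    return source

# In a real script you would import sys and pass sys.argv here.
fake_argv = ['backup_ver5.py', '/Users/swa/code', '/Users/swa/photos']
add_sources(source, fake_argv)

print(source)
```

Note that extend changes the list in place, so source now holds the original directory followed by the two extra ones.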

The most important refinement would be to not use the os.system way of creating archives and instead use the zipfile or tarfile built-in modules to create these archives. They are part of the standard library and are already available for you to use, without depending on an external zip program being installed on your computer.

However, I have been using the os.system way of creating a backup in the above examples purely for pedagogical purposes, so that the example is simple enough to be understood by everybody but real enough to be useful.

Can you try writing the fifth version that uses the zipfile module instead of the os.system call?
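If you would like a starting point for that exercise, here is one possible sketch of the core step only, not a complete fifth version. The function name zip_directory is made up, the sketch works on a throwaway directory instead of your real files, and storing paths relative to the source directory is my own design choice.

```python
import os
import shutil
import tempfile
import zipfile

def zip_directory(source_dir, target):
    """Write every file under source_dir into the zip archive target.

    zip_directory is a name made up for this sketch."""
    archive = zipfile.ZipFile(target, 'w', zipfile.ZIP_DEFLATED)
    try:
        for folder, subfolders, files in os.walk(source_dir):
            for name in files:
                path = os.path.join(folder, name)
                # Store paths relative to source_dir inside the archive.
                archive.write(path, os.path.relpath(path, source_dir))
    finally:
        archive.close()

# Try it on a throwaway directory holding one small file.
work = tempfile.mkdtemp()
notes = os.path.join(work, 'notes')
os.mkdir(notes)
with open(os.path.join(notes, 'blah1.txt'), 'w') as f:
    f.write('hello')

target = os.path.join(work, 'backup.zip')
zip_directory(notes, target)

# Read the archive back to see what it contains.
check = zipfile.ZipFile(target)
names = check.namelist()
check.close()
print(names)  # ['blah1.txt']

shutil.rmtree(work)
```

What remains for you is to combine this with the date-based directories and the user comment from the fourth version.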

13.7. The Software Development Process

We have now gone through the various phases in the process of writing software. These phases can be summarised as follows:

What (Analysis)
How (Design)
Do It (Implementation)
Test (Testing and Debugging)
Use (Operation or Deployment)
Maintain (Refinement)

A recommended way of writing programs is the procedure we have followed in creating the backup script: Do the analysis and design. Start implementing with a simple version. Test and debug it. Use it to ensure that it works as expected. Now, add any features that you want and continue to repeat the Do It-Test-Use cycle as many times as required.

Remember:

Software is grown, not built.

— Bill de hÓra

13.8. Summary

We have seen how to create our own Python programs/scripts and the various stages involved in writing such programs. You may find it useful to create your own program just like we did in this chapter so that you become comfortable with Python as well as problem-solving.

Next, we will discuss object-oriented programming.

14. Object Oriented Programming

In all the programs we wrote till now, we have designed our program around functions i.e. blocks of statements which manipulate data. This is called the procedure-oriented way of programming. There is another way of organizing your program which is to combine data and functionality and wrap it inside something called an object. This is called the object oriented programming paradigm. Most of the time you can use procedural programming, but when you are writing large programs or have a problem that is better suited to this method, you can use object oriented programming techniques.

Classes and objects are the two main aspects of object oriented programming. A class creates a new type where objects are instances of the class. An analogy is that you can have variables of type int which translates to saying that variables that store integers are variables which are instances (objects) of the int class.

Note

Note for Static Language Programmers

Note that even integers are treated as objects (of the int class). This is unlike C++ and Java (before version 1.5) where integers are primitive native types.

See help(int) for more details on the class.

C# and Java 1.5 programmers will find this similar to the boxing and unboxing concept.

Objects can store data using ordinary variables that belong to the object. Variables that belong to an object or class are referred to as fields. Objects can also have functionality by using functions that belong to a class. Such functions are called methods of the class. This terminology is important because it helps us to differentiate between functions and variables which are independent and those which belong to a class or object. Collectively, the fields and methods can be referred to as the attributes of that class.

Fields are of two types - they can belong to each instance/object of the class or they can belong to the class itself. They are called instance variables and class variables respectively.

A class is created using the class keyword. The fields and methods of the class are listed in an indented block.

14.1. The self

Methods of a class have only one specific difference from ordinary functions - they must have an extra first name that is added to the beginning of the parameter list. You do not give a value for this parameter when you call the method; Python provides it. This particular variable refers to the object itself, and by convention, it is given the name self.

Although you can give any name for this parameter, it is strongly recommended that you use the name self - any other name is definitely frowned upon. There are many advantages to using a standard name - any reader of your program will immediately recognize it and even specialized IDEs (Integrated Development Environments) can help you if you use self.

Note

Note for C++/Java/C# Programmers

The `self` in Python is equivalent to the `this` pointer in C++ and the `this` reference in Java and C#.

You must be wondering how Python gives the value for self and why you don’t need to give a value for it. An example will make this clear. Say you have a class called MyClass and an instance of this class called myobject. When you call a method of this object as myobject.method(arg1, arg2), this is automatically converted by Python into MyClass.method(myobject, arg1, arg2) - this is all the special self is about.

This also means that if you have a method which takes no arguments, then you still have to have one argument - the self.
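The conversion can be checked directly. In this minimal sketch the method simply hands back whatever it received, so that the two ways of calling it can be compared; the body of method is my own invention for the demonstration.

```python
class MyClass:
    def method(self, arg1, arg2):
        # self is the object the method was called on
        return (self, arg1, arg2)

myobject = MyClass()

# The usual call: Python supplies myobject as self.
a = myobject.method(1, 2)
# The equivalent explicit call through the class.
b = MyClass.method(myobject, 1, 2)

print(a == b)            # True
print(a[0] is myobject)  # True
```

Both calls receive exactly the same three values, the first of which is the object itself.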

14.2. Classes

The simplest class possible is shown in the following example (save as oop_simplestclass.py).

class Person:
    pass # An empty block

p = Person()
print(p)

Output:

$ python oop_simplestclass.py
<__main__.Person instance at 0x10171f518>

How It Works

We create a new class using the class statement and the name of the class. This is followed by an indented block of statements which form the body of the class. In this case, we have an empty block which is indicated using the pass statement.

Next, we create an object/instance of this class using the name of the class followed by a pair of parentheses. (We will learn more about instantiation in the next section). For our verification, we confirm the type of the variable by simply printing it. It tells us that we have an instance of the Person class in the main module.

Notice that the address of the computer memory where your object is stored is also printed. The address will have a different value on your computer since Python can store the object wherever it finds space.

14.3. Methods

We have already discussed that classes/objects can have methods just like functions except that we have an extra self variable. We will now see an example (save as oop_method.py).

class Person:
    def say_hi(self):
        print('Hello, how are you?')

p = Person()
p.say_hi()
# The previous 2 lines can also be written as
# Person().say_hi()

Output:

$ python oop_method.py
Hello, how are you?

How It Works

Here we see the self in action. Notice that the say_hi method takes no parameters but still has the self in the function definition.

14.4. The `__init__` method

There are many method names which have special significance in Python classes. We will see the significance of the __init__ method now.

The __init__ method is run as soon as an object of a class is instantiated. The method is useful to do any initialization you want to do with your object. Notice the double underscores both at the beginning and at the end of the name.

Example (save as oop_init.py):

class Person:
    def __init__(self, name):
        self.name = name
    def say_hi(self):
        print 'Hello, my name is', self.name

p = Person('Swaroop')
p.say_hi()
# The previous 2 lines can also be written as
# Person('Swaroop').say_hi()

Output:

$ python oop_init.py
Hello, my name is Swaroop

How It Works

Here, we define the __init__ method as taking a parameter name (along with the usual self). Here, we just create a new field also called name. Notice these are two different variables even though they are both called 'name'. There is no problem because the dotted notation self.name means that there is something called "name" that is part of the object called "self" and the other name is a local variable. Since we explicitly indicate which name we are referring to, there is no confusion.

Most importantly, notice that we do not explicitly call the __init__ method but pass the arguments in the parentheses following the class name when creating a new instance of the class. This is the special significance of this method.

Now, we are able to use the self.name field in our methods which is demonstrated in the say_hi method.

14.5. Class And Object Variables

We have already discussed the functionality part of classes and objects (i.e. methods), now let us learn about the data part. The data part, i.e. fields, are nothing but ordinary variables that are bound to the namespaces of the classes and objects. This means that these names are valid within the context of these classes and objects only. That’s why they are called name spaces.

There are two types of fields - class variables and object variables which are classified depending on whether the class or the object owns the variables respectively.

Class variables are shared - they can be accessed by all instances of that class. There is only one copy of the class variable and when any one object makes a change to a class variable, that change will be seen by all the other instances.

Object variables are owned by each individual object/instance of the class. In this case, each object has its own copy of the field i.e. they are not shared and are not related in any way to the field by the same name in a different instance. An example will make this easy to understand (save as oop_objvar.py):

class Robot:
    """Represents a robot, with a name."""

    # A class variable, counting the number of robots
    population = 0

    def __init__(self, name):
        """Initializes the data."""
        self.name = name
        print "(Initializing {})".format(self.name)

        # When this robot is created, it
        # adds to the population
        Robot.population += 1

    def die(self):
        """I am dying."""
        print "{} is being destroyed!".format(self.name)

        Robot.population -= 1

        if Robot.population == 0:
            print "{} was the last one.".format(self.name)
        else:
            print "There are still {:d} robots working.".format(
                Robot.population)

    def say_hi(self):
        """Greeting by the robot.

        Yeah, they can do that."""
        print "Greetings, my masters call me {}.".format(self.name)

    @classmethod
    def how_many(cls):
        """Prints the current population."""
        print "We have {:d} robots.".format(cls.population)


droid1 = Robot("R2-D2")
droid1.say_hi()
Robot.how_many()

droid2 = Robot("C-3PO")
droid2.say_hi()
Robot.how_many()

print "\nRobots can do some work here.\n"

print "Robots have finished their work. So let's destroy them."
droid1.die()
droid2.die()

Robot.how_many()

Output:

$ python oop_objvar.py
(Initializing R2-D2)
Greetings, my masters call me R2-D2.
We have 1 robots.
(Initializing C-3PO)
Greetings, my masters call me C-3PO.
We have 2 robots.

Robots can do some work here.

Robots have finished their work. So let's destroy them.
R2-D2 is being destroyed!
There are still 1 robots working.
C-3PO is being destroyed!
C-3PO was the last one.
We have 0 robots.

How It Works

This is a long example but helps demonstrate the nature of class and object variables. Here, population belongs to the Robot class and hence is a class variable. The name variable belongs to the object (it is assigned using self) and hence is an object variable.

Thus, we refer to the population class variable as Robot.population and not as self.population. We refer to the object variable name using self.name notation in the methods of that object. Remember this simple difference between class and object variables. Also note that an object variable with the same name as a class variable will hide the class variable!
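The hiding effect mentioned in the last sentence is easy to miss, so here is a small sketch of it using a stripped-down Robot class (only the class variable is kept; the values are arbitrary):

```python
class Robot:
    population = 0  # class variable

first = Robot()
second = Robot()

# Assigning through an instance creates an object variable
# that hides the class variable for that instance only.
first.population = 99

print(first.population)   # 99
print(second.population)  # 0
print(Robot.population)   # 0
```

The class variable itself was never changed, which is why the program above always writes Robot.population when it updates the count.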

Instead of Robot.population, we could have also used self.__class__.population because every object refers to its class via the self.__class__ attribute.

The how_many is actually a method that belongs to the class and not to the object. This means we can define it as either a classmethod or a staticmethod depending on whether we need to know which class we are part of. Since we refer to a class variable, let’s use classmethod.

We have marked the how_many method as a class method using a decorator.

Decorators can be imagined to be a shortcut to calling a wrapper function, so applying the @classmethod decorator is the same as calling:

how_many = classmethod(how_many)
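As a sketch of that equivalence, here is a cut-down Robot class in which the decorator line is left out and the wrapping is done by hand. The method returns its message instead of printing it, and the population value is fixed, both only to keep the demonstration short.

```python
class Robot:
    population = 2

    def how_many(cls):
        return "We have {0:d} robots.".format(cls.population)
    # Without the @classmethod line, wrap the function by hand.
    how_many = classmethod(how_many)

message = Robot.how_many()
print(message)  # We have 2 robots.
```

Calling the method on the class works exactly as it does in the decorated version; the decorator syntax is just the tidier way to write it.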

Observe that the __init__ method is used to initialize the Robot instance with a name. In this method, we increase the population count by 1 since we have one more robot being added. Also observe that the value of self.name is specific to each object which indicates the nature of object variables.

Remember that you must refer to the variables and methods of the same object using the self only. This is called an attribute reference.

In this program, we also see the use of docstrings for classes as well as methods. We can access the class docstring at runtime using Robot.__doc__ and the method docstring as Robot.say_hi.__doc__
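The attribute that holds a docstring is spelled with double underscores on both sides, __doc__. A minimal sketch with a shortened Robot class (the method body is left empty on purpose):

```python
class Robot:
    """Represents a robot, with a name."""

    def say_hi(self):
        """Greeting by the robot."""
        pass

print(Robot.__doc__)         # Represents a robot, with a name.
print(Robot.say_hi.__doc__)  # Greeting by the robot.
```

This is the same text that help(Robot) shows, which is one more reason to write docstrings.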

In the die method, we simply decrease the Robot.population count by 1.

All class members are public. One exception: If you use data members with names using the double underscore prefix such as __privatevar, Python uses name-mangling to effectively make it a private variable.

Thus, the convention followed is that any variable that is to be used only within the class or object should begin with an underscore and all other names are public and can be used by other classes/objects. Remember that this is only a convention and is not enforced by Python (except for the double underscore prefix).
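To make the difference between the single and the double underscore concrete, here is a small sketch. The Account class and its field names are made up for this illustration.

```python
class Account:
    """Account is a made-up class for this sketch."""

    def __init__(self):
        self._hint = 'convention only'   # single underscore: still public
        self.__secret = 'mangled'        # double underscore: name-mangled

    def reveal(self):
        return self.__secret  # works inside the class

acct = Account()
print(acct._hint)                 # convention only
print(acct.reveal())              # mangled
print(hasattr(acct, '__secret'))  # False
print(acct._Account__secret)      # mangled
```

As the last line shows, name-mangling only renames the field to include the class name; it discourages accidental access but does not make the data truly inaccessible.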

Note

Note for C++/Java/C# Programmers

All class members (including the data members) are public and all the methods are virtual in Python.

14.6. Inheritance

One of the major benefits of object oriented programming is reuse of code and one of the ways this is achieved is through the inheritance mechanism. Inheritance can be best imagined as implementing a type and subtype relationship between classes.

Suppose you want to write a program which has to keep track of the teachers and students in a college. They have some common characteristics such as name, age and address. They also have specific characteristics such as salary, courses and leaves for teachers, and marks and fees for students.

You can create two independent classes for each type and process them but adding a new common characteristic would mean adding to both of these independent classes. This quickly becomes unwieldy.

A better way would be to create a common class called SchoolMember and then have the teacher and student classes inherit from this class i.e. they will become sub-types of this type (class) and then we can add specific characteristics to these sub-types.

There are many advantages to this approach. If we add/change any functionality in SchoolMember, this is automatically reflected in the subtypes as well. For example, you can add a new ID card field for both teachers and students by simply adding it to the SchoolMember class. However, changes in the subtypes do not affect other subtypes. Another advantage is that you can refer to a teacher or student object as a SchoolMember object, which could be useful in some situations such as counting the number of school members. This is called polymorphism, where a sub-type can be substituted in any situation where a parent type is expected i.e. the object can be treated as an instance of the parent class.

Also observe that we reuse the code of the parent class and we do not need to repeat it in the different classes as we would have had to in case we had used independent classes.

The SchoolMember class in this situation is known as the base class or the superclass. The Teacher and Student classes are called the derived classes or subclasses.

We will now see this example as a program (save as oop_subclass.py):

class SchoolMember:
    '''Represents any school member.'''
    def __init__(self, name, age):
        self.name = name
        self.age = age
        print '(Initialized SchoolMember: {})'.format(self.name)

    def tell(self):
        '''Tell my details.'''
        print 'Name:"{}" Age:"{}"'.format(self.name, self.age),

class Teacher(SchoolMember):
    '''Represents a teacher.'''
    def __init__(self, name, age, salary):
        SchoolMember.__init__(self, name, age)
        self.salary = salary
        print '(Initialized Teacher: {})'.format(self.name)

    def tell(self):
        SchoolMember.tell(self)
        print 'Salary: "{:d}"'.format(self.salary)

class Student(SchoolMember):
    '''Represents a student.'''
    def __init__(self, name, age, marks):
        SchoolMember.__init__(self, name, age)
        self.marks = marks
        print '(Initialized Student: {})'.format(self.name)

    def tell(self):
        SchoolMember.tell(self)
        print 'Marks: "{:d}"'.format(self.marks)

t = Teacher('Mrs. Shrividya', 40, 30000)
s = Student('Swaroop', 25, 75)

# prints a blank line
print

members = [t, s]
for member in members:
    # Works for both Teachers and Students
    member.tell()

Output:

$ python oop_subclass.py
(Initialized SchoolMember: Mrs. Shrividya)
(Initialized Teacher: Mrs. Shrividya)
(Initialized SchoolMember: Swaroop)
(Initialized Student: Swaroop)

Name:"Mrs. Shrividya" Age:"40" Salary: "30000"
Name:"Swaroop" Age:"25" Marks: "75"

How It Works

To use inheritance, we specify the base class names in a tuple following the class name in the class definition. Next, we observe that the __init__ method of the base class is explicitly called using the self variable so that we can initialize the base class part of the object. This is very important to remember - if a subclass defines its own __init__ method, Python does not automatically call the constructor of the base class; you have to explicitly call it yourself.
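What happens when the explicit call is forgotten can be seen in this small sketch. The three class names and the ready field are invented for the demonstration.

```python
class Base:
    def __init__(self):
        self.ready = True

class Forgetful(Base):
    def __init__(self):
        pass  # Base.__init__ is never called

class Careful(Base):
    def __init__(self):
        Base.__init__(self)  # explicit call, as in the chapter

print(hasattr(Forgetful(), 'ready'))  # False
print(hasattr(Careful(), 'ready'))    # True
```

The Forgetful object never gets the ready field because nothing ran the code that creates it.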

We also observe that we can call methods of the base class by prefixing the class name to the method call and then pass in the self variable along with any arguments.

Notice that we can treat instances of Teacher or Student as just instances of the SchoolMember when we use the tell method of the SchoolMember class.

Also, observe that the tell method of the subtype is called and not the tell method of the SchoolMember class. One way to understand this is that Python always starts looking for methods in the actual type of the object, and in this case it finds the method there. If it could not find the method, it would look at the methods belonging to the base classes one by one in the order they are specified in the tuple in the class definition.

A note on terminology - if more than one class is listed in the inheritance tuple, then it is called multiple inheritance.
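A minimal sketch of multiple inheritance and of the lookup order described above, using three made-up classes:

```python
class Walker:
    def move(self):
        return 'walking'

class Swimmer:
    def move(self):
        return 'swimming'

    def dive(self):
        return 'diving'

# Two base classes: Walker is listed first, so it is searched first.
class Duck(Walker, Swimmer):
    pass

d = Duck()
print(d.move())  # walking
print(d.dive())  # diving
```

Duck defines nothing itself, so move is found in Walker (the first base class) and dive, which Walker lacks, is found in Swimmer.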

The trailing comma is used at the end of the print statement in the superclass’s tell() method to print a line and allow the next print to continue on the same line. This is a trick to make print not print a \n (newline) symbol at the end of the printing.

14.7. Summary

We have now explored the various aspects of classes and objects as well as the various terminologies associated with them. We have also seen the benefits and pitfalls of object-oriented programming. Python is highly object-oriented and understanding these concepts carefully will help you a lot in the long run.

Next, we will learn how to deal with input/output and how to access files in Python.

15. Input and Output

There will be situations where your program has to interact with the user. For example, you would want to take input from the user and then print some results back. We can achieve this using the raw_input() function and print statement respectively.

For output, we can also use the various methods of the str (string) class. For example, you can use the rjust method to get a string which is right justified to a specified width. See help(str) for more details.
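As a small illustration of rjust, here is a sketch with made-up values; the variable names price and padded are only placeholders.

```python
# rjust pads on the left with spaces so the result has the given width.
price = '42'
padded = price.rjust(6)
print('[' + padded + ']')   # [    42]

# An optional second argument sets the fill character.
print(price.rjust(6, '0'))  # 000042
```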

Another common type of input/output is dealing with files. The ability to create, read and write files is essential to many programs and we will explore this aspect in this chapter.

15.1. Input from user

Save this program as io_input.py:

def reverse(text):
    return text[::-1]

def is_palindrome(text):
    return text == reverse(text)

something = raw_input("Enter text: ")
if is_palindrome(something):
    print "Yes, it is a palindrome"
else:
    print "No, it is not a palindrome"

Output:

$ python io_input.py
Enter text: sir
No, it is not a palindrome

$ python io_input.py
Enter text: madam
Yes, it is a palindrome

$ python io_input.py
Enter text: racecar
Yes, it is a palindrome

How It Works

We use the slicing feature to reverse the text. We’ve already seen how we can make slices from sequences using the seq[a:b] code starting from position a to position b. We can also provide a third argument that determines the step by which the slicing is done. The default step is 1 because of which it returns a continuous part of the text. Giving a negative step, i.e., -1 will return the text in reverse.
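To see the effect of the step on its own, here is a short sketch using an arbitrary sample string:

```python
text = 'abcdef'

print(text[1:4])    # positions 1, 2 and 3: bcd
print(text[::2])    # every second character: ace
print(text[::-1])   # a negative step walks backwards: fedcba
```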

The raw_input() function takes a string as argument and displays it to the user. Then it waits for the user to type something and press the return key. Once the user has entered and pressed the return key, the raw_input() function will then return that text the user has entered.

We take that text and reverse it. If the original text and reversed text are equal, then the text is a palindrome.

15.1.1. Homework exercise

Checking whether a text is a palindrome should also ignore punctuation, spaces and case. For example, "Rise to vote, sir." is also a palindrome but our current program doesn’t say it is. Can you improve the above program to recognize this palindrome?

If you need a hint, the idea is that… [2]

15.2. Files

You can open and use files for reading or writing by creating an object of the file class and using its read, readline or write methods appropriately to read from or write to the file. The ability to read or write to the file depends on the mode you have specified for the file opening. Finally, when you are finished with the file, you call the close method to tell Python that you are done using the file.

Example (save as io_using_file.py):

poem = '''\
Programming is fun
When the work is done
if you wanna make your work also fun:
    use Python!
'''

# Open for 'w'riting
f = open('poem.txt', 'w')
# Write text to file
f.write(poem)
# Close the file
f.close()

# If no mode is specified,
# 'r'ead mode is assumed by default
f = open('poem.txt')
while True:
    line = f.readline()
    # Zero length indicates EOF
    if len(line) == 0:
        break
    # The `line` already has a newline
    # at the end of each line
    # since it is reading from a file.
    print line,
# close the file
f.close()

Output:

$ python io_using_file.py
Programming is fun
When the work is done
if you wanna make your work also fun:
    use Python!

How It Works

First, we open a file by using the built-in open function and specifying the name of the file and the mode in which we want to open the file. The mode can be a read mode ('r'), write mode ('w') or append mode ('a'). We can also specify whether we are reading, writing, or appending in text mode ('t') or binary mode ('b'). There are actually many more modes available and help(open) will give you more details about them. By default, open() considers the file to be a 't'ext file and opens it in 'r'ead mode.

In our example, we first open the file in write text mode and use the write method of the file object to write to the file and then we finally close the file.

Next, we open the same file again for reading. We don’t need to specify a mode because 'read text file' is the default mode. We read in each line of the file using the readline method in a loop. This method returns a complete line including the newline character at the end of the line. When an empty string is returned, it means that we have reached the end of the file and we 'break' out of the loop.

Finally, we close the file.

Now, check the contents of the poem.txt file to confirm that the program has indeed written to and read from that file.
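The example above uses only the write and read modes. The following sketch shows how the append mode differs from the write mode. It writes to a throwaway file in a temporary folder (the name notes.txt is just a placeholder) so that poem.txt is left alone.

```python
import os
import tempfile

# A throwaway file so the sketch does not touch poem.txt.
path = os.path.join(tempfile.mkdtemp(), 'notes.txt')

f = open(path, 'w')      # 'w' creates the file (or empties an existing one)
f.write('first line\n')
f.close()

f = open(path, 'a')      # 'a' keeps the existing content and adds to the end
f.write('second line\n')
f.close()

f = open(path)           # default mode: read, text
contents = f.read()
f.close()

print(contents)
```

If the second open had used 'w' instead of 'a', the file would contain only the second line.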

15.3. Pickle

Python provides a standard module called pickle using which you can store any plain Python object in a file and then get it back later. This is called storing the object persistently.

Example (save as io_pickle.py):

import pickle

# The name of the file where we will store the object
shoplistfile = 'shoplist.data'
# The list of things to buy
shoplist = ['apple', 'mango', 'carrot']

# Write to the file
f = open(shoplistfile, 'wb')
# Dump the object to a file
pickle.dump(shoplist, f)
f.close()

# Destroy the shoplist variable
del shoplist

# Read back from the storage
f = open(shoplistfile, 'rb')
# Load the object from the file
storedlist = pickle.load(f)
print storedlist

Output:

$ python io_pickle.py
['apple', 'mango', 'carrot']

How It Works

To store an object in a file, we have to first open the file in write binary mode and then call the dump function of the pickle module. This process is called pickling.

Next, we retrieve the object using the load function of the pickle module which returns the object. This process is called unpickling.

15.4. Unicode

So far, when we have been writing and using strings, or reading and writing to a file, we have used simple English characters only. If we want to be able to read and write other non-English languages, we need to use the unicode type, and it all starts with the character u:

>>> "hello world"
'hello world'
>>> type("hello world")
<type 'str'>
>>> u"hello world"
u'hello world'
>>> type(u"hello world")
<type 'unicode'>

We use the unicode type instead of strings to make sure that we handle non-English languages in our programs. However, when we read or write to a file or when we talk to other computers on the Internet, we need to convert our unicode strings into a format that can be sent and received, and that format is called "UTF-8". We can read and write in that format, using a simple keyword argument to our standard open function:

# encoding=utf-8
import io

f = io.open("abc.txt", "wt", encoding="utf-8")
f.write(u"Imagine non-English language here")
f.close()

text = io.open("abc.txt", encoding="utf-8").read()
print text

How It Works

You can ignore the import statement for now, we’ll explore that in detail in the modules chapter.

Whenever we write a program that uses Unicode literals like we have used above, we have to make sure that Python itself is told that our program uses UTF-8, so we put a # encoding=utf-8 comment at the top of our program.

We use io.open and provide the encoding argument to tell Python which encoding to use when writing to and reading from the file, and we pass in strings of the form u"" to make it clear that we are using Unicode strings.
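The conversion that io.open performs for us can also be done by hand with the encode and decode methods. This is only a sketch with a made-up sample string; it should behave the same on Python 2 and Python 3.

```python
text = u"caf\u00e9"              # a unicode string; \u00e9 is e with an acute accent

data = text.encode("utf-8")      # unicode -> bytes, ready to be written or sent
print(len(text))                 # 4 characters
print(len(data))                 # 5 bytes, because the accented letter takes two

restored = data.decode("utf-8")  # bytes -> unicode
print(restored == text)          # True
```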

You should learn more about this topic by reading:

15.5. Summary

We have discussed various types of input/output, about file handling, about the pickle module and about Unicode.

Next, we will explore the concept of exceptions.

16. Exceptions

Exceptions occur when exceptional situations occur in your program. For example, what if you are going to read a file and the file does not exist? Or what if you accidentally deleted it when the program was running? Such situations are handled using exceptions.

Similarly, what if your program had some invalid statements? This is handled by Python which raises its hands and tells you there is an error.

16.1. Errors

Consider a simple print statement. What if we misspelt print as Print? Note the capitalization. In this case, Python raises a syntax error.

>>> Print "Hello World"
  File "<stdin>", line 1
    Print "Hello World"
                      ^
SyntaxError: invalid syntax
>>> print "Hello World"
Hello World

Observe that a SyntaxError is raised and also the location where the error was detected is printed. This is what an error handler for this error does.

16.2. Exceptions

We will try to read input from the user. Press ctrl-d and see what happens.

>>> s = raw_input('Enter something --> ')
Enter something --> Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
EOFError

Python raises an error called EOFError which basically means it found an end of file symbol (which is represented by ctrl-d) when it did not expect to see it.

16.3. Handling Exceptions

We can handle exceptions using the try..except statement. We basically put our usual statements within the try-block and put all our error handlers in the except-block.

Example (save as exceptions_handle.py):

try:
    text = raw_input('Enter something --> ')
except EOFError:
    print 'Why did you do an EOF on me?'
except KeyboardInterrupt:
    print 'You cancelled the operation.'
else:
    print 'You entered {}'.format(text)

Output:

# Press ctrl + d
$ python exceptions_handle.py
Enter something --> Why did you do an EOF on me?

# Press ctrl + c
$ python exceptions_handle.py
Enter something --> ^CYou cancelled the operation.

$ python exceptions_handle.py
Enter something --> No exceptions
You entered No exceptions

How It Works

We put all the statements that might raise exceptions/errors inside the try block and then put handlers for the appropriate errors/exceptions in the except clause/block. The except clause can handle a single specified error or exception, or a parenthesized list of errors/exceptions. If no names of errors or exceptions are supplied, it will handle all errors and exceptions.
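As a sketch of the parenthesized form, here is a small helper that handles two kinds of error in one except clause. The function name to_int is hypothetical and not part of the example above.

```python
def to_int(value):
    '''Return value converted to an integer, or None if that is not possible.'''
    try:
        return int(value)
    # One except clause handling two kinds of error,
    # listed inside parentheses.
    except (ValueError, TypeError):
        return None


print(to_int('42'))          # 42
print(to_int('forty-two'))   # None, because int() raised ValueError
print(to_int(None))          # None, because int() raised TypeError
```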

Note that there has to be at least one except clause (or a finally clause, which we will see later in this chapter) associated with every try clause. Otherwise, what’s the point of having a try block?

If any error or exception is not handled, then the default Python handler is called which just stops the execution of the program and prints an error message. We have already seen this in action above.

You can also have an else clause associated with a try..except block. The else clause is executed if no exception occurs.

In the next example, we will also see how to get the exception object so that we can retrieve additional information.

16.4. Raising Exceptions

You can raise exceptions using the raise statement by providing the name of the error/exception and the exception object that is to be thrown.

The error or exception that you can raise should be a class which directly or indirectly must be a derived class of the Exception class.

Example (save as exceptions_raise.py):

class ShortInputException(Exception):
    '''A user-defined exception class.'''
    def __init__(self, length, atleast):
        Exception.__init__(self)
        self.length = length
        self.atleast = atleast

try:
    text = raw_input('Enter something --> ')
    if len(text) < 3:
        raise ShortInputException(len(text), 3)
    # Other work can continue as usual here
except EOFError:
    print 'Why did you do an EOF on me?'
except ShortInputException as ex:
    print ('ShortInputException: The input was ' + \
           '{0} long, expected at least {1}')\
          .format(ex.length, ex.atleast)
else:
    print 'No exception was raised.'

Output:

$ python exceptions_raise.py
Enter something --> a
ShortInputException: The input was 1 long, expected at least 3

$ python exceptions_raise.py
Enter something --> abc
No exception was raised.

How It Works

Here, we are creating our own exception type. This new exception type is called ShortInputException. It has two fields - length which is the length of the given input, and atleast which is the minimum length that the program was expecting.

In the except clause, we mention the class of error as well as the variable name (after the as keyword) that will hold the corresponding error/exception object. This is analogous to parameters and arguments in a function call. Within this particular except clause, we use the length and atleast fields of the exception object to print an appropriate message to the user.

16.5. Try … Finally

Suppose you are reading a file in your program. How do you ensure that the file object is closed properly whether or not an exception was raised? This can be done using the finally block.

Save this program as exceptions_finally.py:

import sys
import time

f = None
try:
    f = open("poem.txt")
    # Our usual file-reading idiom
    while True:
        line = f.readline()
        if len(line) == 0:
            break
        print line,
        sys.stdout.flush()
        print "Press ctrl+c now"
        # To make sure it runs for a while
        time.sleep(2)
except IOError:
    print "Could not find file poem.txt"
except KeyboardInterrupt:
    print "!! You cancelled the reading from the file."
finally:
    if f:
        f.close()
    print "(Cleaning up: Closed the file)"

Output:

$ python exceptions_finally.py
Programming is fun
Press ctrl+c now
^C!! You cancelled the reading from the file.
(Cleaning up: Closed the file)

How It Works

We do the usual file-reading stuff, but we have arbitrarily introduced sleeping for 2 seconds after printing each line using the time.sleep function so that the program runs slowly (Python is very fast by nature). When the program is still running, press ctrl + c to interrupt/cancel the program.

Observe that the KeyboardInterrupt exception is thrown and the program quits. However, before the program exits, the finally clause is executed and the file object is always closed.

Note that we use sys.stdout.flush() after print so that it prints to the screen immediately.

16.6. The with statement

Acquiring a resource in the try block and subsequently releasing the resource in the finally block is a common pattern. Hence, there is also a with statement that enables this to be done in a clean manner:

Save as exceptions_using_with.py:

with open("poem.txt") as f:
    for line in f:
        print line,

How It Works

The output should be the same as in the previous example. The difference here is that we are using the open function with the with statement - we leave the closing of the file to be done automatically by with open.

What happens behind the scenes is that there is a protocol used by the with statement. It fetches the object returned by the open function, let’s call it "thefile" in this case.

It always calls the thefile.__enter__ function before starting the block of code under it and always calls thefile.__exit__ after finishing the block of code.

So the code that we would have written in a finally block should be taken care of automatically by the __exit__ method. This is what helps us to avoid having to use explicit try..finally statements repeatedly.
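To make the protocol concrete, here is a minimal sketch of a class that can be used with the with statement. The Resource class is hypothetical; it only records the order in which things happen instead of managing a real file.

```python
class Resource(object):
    '''A hypothetical resource that records what happens to it.'''
    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('enter')
        return self          # this is what "as r" receives

    def __exit__(self, exc_type, exc_value, traceback):
        # Called when the block ends, with or without an exception.
        self.events.append('exit')
        return False         # do not swallow exceptions


res = Resource()
try:
    with res as r:
        r.events.append('inside')
        raise ValueError('something went wrong')
except ValueError:
    res.events.append('handled')

print(res.events)   # ['enter', 'inside', 'exit', 'handled']
```

Notice that 'exit' is recorded before 'handled': the cleanup runs even though the block ended with an exception, just like a finally block.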

More discussion on this topic is beyond the scope of this book, so please refer to PEP 343 for a comprehensive explanation.

16.7. Summary

We have discussed the usage of the try..except and try..finally statements. We have seen how to create our own exception types and how to raise exceptions as well.

Next, we will explore the Python Standard Library.

17. Standard Library

The Python Standard Library contains a huge number of useful modules and is part of every standard Python installation. It is important to become familiar with the Python Standard Library since many problems can be solved quickly if you are familiar with the range of things that these libraries can do.

We will explore some of the commonly used modules in this library. You can find complete details for all of the modules in the Python Standard Library in the 'Library Reference' section of the documentation that comes with your Python installation.

Let us explore a few useful modules.

Caution

If you find the topics in this chapter too advanced, you may skip this chapter. However, I highly recommend coming back to this chapter when you are more comfortable with programming using Python.

17.1. sys module

The sys module contains system-specific functionality. We have already seen that the sys.argv list contains the command-line arguments.

Suppose we want to check the version of Python being used. The sys module gives us that information.

$ python
>>> import sys
>>> sys.version_info
sys.version_info(major=2, minor=7, micro=6, releaselevel='final', serial=0)
>>> sys.version_info.major == 2
True

How It Works

The sys module has a version_info tuple that gives us the version information. The first entry is the major version. We can pull out this information to use it.

17.2. logging module

What if you wanted to have some debugging messages or important messages to be stored somewhere so that you can check whether your program has been running as you would expect it? How do you "store somewhere" these messages? This can be achieved using the logging module.

Save as stdlib_logging.py:

import os, platform, logging

if platform.platform().startswith('Windows'):
    logging_file = os.path.join(os.getenv('HOMEDRIVE'),
                                os.getenv('HOMEPATH'),
                                'test.log')
else:
    logging_file = os.path.join(os.getenv('HOME'),
                                'test.log')

print "Logging to", logging_file

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s : %(levelname)s : %(message)s',
    filename = logging_file,
    filemode = 'w',
)

logging.debug("Start of the program")
logging.info("Doing something")
logging.warning("Dying now")

Output:

$ python stdlib_logging.py
Logging to /Users/swa/test.log

$ cat /Users/swa/test.log
2014-03-29 09:27:36,660 : DEBUG : Start of the program
2014-03-29 09:27:36,660 : INFO : Doing something
2014-03-29 09:27:36,660 : WARNING : Dying now

If you do not have the cat command, then you can just open the test.log file in a text editor.

How It Works

We use three modules from the standard library - the os module for interacting with the operating system, the platform module for information about the platform (i.e. the operating system), and the logging module to log information.

First, we check which operating system we are using by checking the string returned by platform.platform() (for more information, see import platform; help(platform)). If it is Windows, we figure out the home drive, the home folder and the filename where we want to store the information. Putting these three parts together, we get the full location of the file. For other platforms, we need to know just the home folder of the user and we get the full location of the file.

We use the os.path.join() function to put these three parts of the location together. The reason to use a special function rather than just adding the strings together is because this function will ensure the full location matches the format expected by the operating system.
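A quick sketch of what os.path.join does, using placeholder folder names rather than a real home folder:

```python
import os

# 'home' and 'swa' are placeholder folder names.
location = os.path.join('home', 'swa', 'test.log')
print(location)

# The separator is chosen by the operating system,
# for example '/' on Linux and Mac OS X, '\' on Windows.
print(location == 'home' + os.sep + 'swa' + os.sep + 'test.log')   # True
```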

We configure the logging module to write all the messages in a particular format to the file we have specified.

Finally, we can put messages that are either meant for debugging, information, warning or even critical messages. Once the program has run, we can check this file and we will know what happened in the program, even though no information was displayed to the user running the program.

17.3. Module of the Week Series

There is much more to be explored in the standard library such as debugging, handling command line options, regular expressions and so on.

The best way to further explore the standard library is to read Doug Hellmann’s excellent Python Module of the Week series (also available as a book) and reading the Python documentation.

17.4. Summary

We have explored some of the functionality of a few modules in the Python Standard Library. It is highly recommended to browse through the Python Standard Library documentation to get an idea of all the modules that are available.

Next, we will cover various aspects of Python that will make our tour of Python more complete.

18. More

So far we have covered a majority of the various aspects of Python that you will use. In this chapter, we will cover some more aspects that will make our knowledge of Python more well-rounded.

18.1. Passing tuples around

Ever wished you could return two different values from a function? You can. All you have to do is use a tuple.

>>> def get_error_details():
...     return (2, 'details')
...
>>> errnum, errstr = get_error_details()
>>> errnum
2
>>> errstr
'details'

Notice that the usage of a, b = <some expression> interprets the result of the expression as a tuple with two values.

This also means the fastest way to swap two variables in Python is:

>>> a = 5; b = 8
>>> a, b
(5, 8)
>>> a, b = b, a
>>> a, b
(8, 5)

18.2. Special Methods

There are certain methods such as the __init__ and __del__ methods which have special significance in classes.

Special methods are used to mimic certain behaviors of built-in types. For example, if you want to use the x[key] indexing operation for your class (just like you use it for lists and tuples), then all you have to do is implement the __getitem__() method and your job is done. If you think about it, this is what Python does for the list class itself!

Some useful special methods are listed in the following table. If you want to know about all the special methods, see the manual.

__init__(self, …): This method is called just before the newly created object is returned for usage.
__del__(self): Called just before the object is destroyed (which has unpredictable timing, so avoid using this)
__str__(self): Called when we use the print statement or when str() is used.
__lt__(self, other): Called when the less than operator (<) is used. Similarly, there are special methods for all the operators (+, >, etc.)
__getitem__(self, key): Called when x[key] indexing operation is used.
__len__(self): Called when the built-in len() function is used for the sequence object.
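Here is a minimal sketch that puts several of these special methods to work. The Playlist class is hypothetical and exists only for illustration.

```python
class Playlist(object):
    '''A hypothetical container class, for illustration only.'''
    def __init__(self, *songs):
        self.songs = list(songs)

    def __getitem__(self, key):
        # Called for playlist[key]
        return self.songs[key]

    def __len__(self):
        # Called for len(playlist)
        return len(self.songs)

    def __str__(self):
        # Called for str(playlist) and when printing
        return 'Playlist of {0} songs'.format(len(self.songs))

    def __lt__(self, other):
        # Called for playlist < other
        return len(self) < len(other)


short = Playlist('a', 'b')
long_one = Playlist('a', 'b', 'c')

print(short[0])           # a
print(len(long_one))      # 3
print(str(short))         # Playlist of 2 songs
print(short < long_one)   # True
```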

18.3. Single Statement Blocks

We have seen that each block of statements is set apart from the rest by its own indentation level. Well, there is one caveat. If your block of statements contains only one single statement, then you can specify it on the same line of, say, a conditional statement or looping statement. The following example should make this clear:

>>> flag = True
>>> if flag: print 'Yes'
...
Yes

Notice that the single statement is used in-place and not as a separate block. Although you can use this to make your program smaller, I strongly recommend avoiding this short-cut method, except for error checking, mainly because it will be much easier to add an extra statement if you are using proper indentation.

18.4. Lambda Forms

A lambda expression is used to create new function objects. Essentially, the lambda takes its parameters followed by a single expression only; this expression becomes the body of the function and its value is returned by the new function.

Example (save as more_lambda.py):

points = [ { 'x' : 2, 'y' : 3 },
           { 'x' : 4, 'y' : 1 } ]
points.sort(key=lambda i : i['y'])
print points

Output:

$ python more_lambda.py
[{'y': 1, 'x': 4}, {'y': 3, 'x': 2}]

How It Works

Notice that the sort method of a list can take a key parameter which determines how the list is sorted (usually we know only about ascending or descending order). In our case, we want to do a custom sort, and for that we need to write a function but instead of writing a separate def block for a function that will get used in only this one place, we use a lambda expression to create a new function.
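For comparison, here is a sketch of the same sort written both ways. The function name get_y is hypothetical, and sorted() is used instead of the sort method so that the original list is left untouched.

```python
def get_y(point):
    return point['y']


points = [{'x': 2, 'y': 3}, {'x': 4, 'y': 1}]

# A separate def block ...
by_def = sorted(points, key=get_y)
# ... and the equivalent lambda, written in place.
by_lambda = sorted(points, key=lambda point: point['y'])

print(by_def == by_lambda)   # True
```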

18.5. List Comprehension

List comprehensions are used to derive a new list from an existing list. Suppose you have a list of numbers and you want to get a corresponding list with all the numbers multiplied by 2 only when the number itself is greater than 2. List comprehensions are ideal for such situations.

Example (save as more_list_comprehension.py):

listone = [2, 3, 4]
listtwo = [2*i for i in listone if i > 2]
print listtwo

Output:

$ python more_list_comprehension.py
[6, 8]

How It Works

Here, we derive a new list by specifying the manipulation to be done (2*i) when some condition is satisfied (if i > 2). Note that the original list remains unmodified.

The advantage of using list comprehensions is that it reduces the amount of boilerplate code required when we use loops to process each element of a list and store it in a new list.
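To see the boilerplate that is saved, here is a sketch of the same computation written as an ordinary loop:

```python
listone = [2, 3, 4]

# The loop that the list comprehension replaces
listtwo = []
for i in listone:
    if i > 2:
        listtwo.append(2 * i)

print(listtwo)                                        # [6, 8]
print(listtwo == [2 * i for i in listone if i > 2])   # True
```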

18.6. Receiving Tuples and Dictionaries in Functions

There is a special way of receiving parameters to a function as a tuple or a dictionary using the * or ** prefix respectively. This is useful when taking variable number of arguments in the function.

>>> def powersum(power, *args):
...     '''Return the sum of each argument raised to the specified power.'''
...     total = 0
...     for i in args:
...         total += pow(i, power)
...     return total
...
>>> powersum(2, 3, 4)
25
>>> powersum(2, 10)
100

Because we have a * prefix on the args variable, all extra arguments passed to the function are stored in args as a tuple. If a ** prefix had been used instead, the extra parameters would be considered to be key/value pairs of a dictionary.
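The ** prefix is not shown in the example above, so here is a small sketch of it. The function describe and its arguments are made up for illustration.

```python
def describe(name, **details):
    '''Return a one-line description built from the extra keyword arguments.'''
    # details is an ordinary dictionary, e.g. {'colour': 'red', 'price': 10}
    parts = ['{0}={1}'.format(key, details[key]) for key in sorted(details)]
    return name + ': ' + ', '.join(parts)


print(describe('apple', price=10, colour='red'))   # apple: colour=red, price=10
```

The keys are sorted before printing so that the output does not depend on the order in which the dictionary stores them.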

18.7. The assert statement

The assert statement is used to assert that something is true. For example, if you are very sure that you will have at least one element in a list you are using and want to check this, and raise an error if it is not true, then the assert statement is ideal in this situation. When the assert statement fails, an AssertionError is raised.

>>> mylist = ['item']
>>> assert len(mylist) >= 1
>>> mylist.pop()
'item'
>>> assert len(mylist) >= 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError

The assert statement should be used judiciously. Most of the time, it is better to catch exceptions, either handle the problem or display an error message to the user and then quit.
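One reason for this advice is that assert statements are skipped entirely when Python is run with the -O option, so they cannot be relied upon to check user input. A sketch of the alternative, using a hypothetical function first_item:

```python
def first_item(items):
    # Check the input and report the problem with an ordinary exception,
    # which the caller can catch.
    if len(items) < 1:
        raise ValueError('expected at least one item')
    return items[0]


try:
    first_item([])
except ValueError as ex:
    message = 'Error: {0}'.format(ex)
    print(message)   # Error: expected at least one item
```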

18.8. Decorators

Decorators are a shortcut to applying wrapper functions. This is helpful to "wrap" functionality with the same code over and over again. For example, I created a retry decorator for myself that I can just apply to any function and if any exception is thrown during a run, it is retried again, up to a maximum of 5 times and with a delay between each retry. This is especially useful for situations where you are trying to make a network call to a remote computer:

from time import sleep
from functools import wraps
import logging
logging.basicConfig()
log = logging.getLogger("retry")


def retry(f):
    @wraps(f)
    def wrapped_f(*args, **kwargs):
        MAX_ATTEMPTS = 5
        for attempt in range(1, MAX_ATTEMPTS + 1):
            try:
                return f(*args, **kwargs)
            except Exception:
                log.exception("Attempt %s/%s failed : %s",
                              attempt,
                              MAX_ATTEMPTS,
                              (args, kwargs))
                sleep(10 * attempt)
        log.critical("All %s attempts failed : %s",
                     MAX_ATTEMPTS,
                     (args, kwargs))
    return wrapped_f


counter = 0


@retry
def save_to_database(arg):
    print "Write to a database or make a network call or etc."
    print "This will be automatically retried if exception is thrown."
    global counter
    counter += 1
    # This will throw an exception in the first call
    # And will work fine in the second call (i.e. a retry)
    if counter < 2:
        raise ValueError(arg)


if __name__ == '__main__':
    save_to_database("Some bad value")

Output:

$ python more_decorator.py
Write to a database or make a network call or etc.
This will be automatically retried if exception is thrown.
ERROR:retry:Attempt 1/5 failed : (('Some bad value',), {})
Traceback (most recent call last):
  File "more_decorator.py", line 14, in wrapped_f
    return f(*args, **kwargs)
  File "more_decorator.py", line 39, in save_to_database
    raise ValueError(arg)
ValueError: Some bad value
Write to a database or make a network call or etc.
This will be automatically retried if exception is thrown.

How It Works
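One way to understand the @ line is that it is shorthand for passing the function through the wrapper and keeping the result under the same name. The following is only a minimal sketch of that idea, not the retry decorator itself; the names shout, greet, plain_greet and manual_greet are hypothetical.

```python
from functools import wraps


def shout(f):
    '''A hypothetical decorator that upper-cases the result of f.'''
    @wraps(f)                      # keep f's name and docstring
    def wrapped_f(*args, **kwargs):
        return f(*args, **kwargs).upper()
    return wrapped_f


@shout
def greet(name):
    return 'hello ' + name


def plain_greet(name):
    return 'hello ' + name

# Applying the wrapper by hand does the same thing as the @ line.
manual_greet = shout(plain_greet)

print(greet('world'))          # HELLO WORLD
print(manual_greet('world'))   # HELLO WORLD
print(greet.__name__)          # greet
```

In the retry example, wrapped_f plays the same role: it calls the original function inside a loop and decides what to do when an exception is raised.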

Tuesday, 7 April 2015
