
Python's Data Types

Sue Evans & David Walser




Learning Outcomes

Data types

Whenever we describe a data type, we not only need to give a definition of what that type is but also the operations that can be done on that type. The definition may include the range of possible values that type can hold.

Example: integer
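
As a small illustration of a type being a set of values plus the operations allowed on them, the sketch below looks at Python's integer type. The variable names are made up for this example, and the code is written to run under both Python 2 and Python 3.

```python
# An integer value and its type
count = 17
kind = type(count)            # the int type

# Some of the operations defined on integers
total = count + 5             # addition        -> 22
product = count * 2           # multiplication  -> 34
quotient = count // 5         # floor division  -> 3
remainder = count % 5         # remainder       -> 2

# An operation that is not defined for integers raises an error
try:
    len(count)
    has_length = True
except TypeError:
    has_length = False        # integers have no length

print(kind is int)
print(has_length)
```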

Python's data types

Strings

Data Structures

Why is this important?

Choosing the right data structure to solve a problem affects how easy the problem is to solve, how readable the code will be, and how efficient the resulting program will be.

Lists

Lists vs. Arrays

You may think that a list in Python is the same as an array in C or other languages. They are similar but there are some significant differences.

  1. Although you may have arrays of any type, arrays are homogeneous, meaning they can only hold data of one type. Python lists are heterogeneous. The type of the data being stored can be different at different indices in the same list.

  2. Arrays are a fixed size, but lists shrink and grow as necessary to hold the exact number of items being stored.
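
Both differences can be seen in a few lines. This is only a sketch: the list contents and variable names are invented for illustration, and the code is written to run under both Python 2 and Python 3.

```python
# 1. A single list can hold values of different types
mixed = [42, "milk", 3.5, ["nested", "list"]]
kinds = []
for item in mixed:
    kinds.append(type(item).__name__)
# kinds is ['int', 'str', 'float', 'list']

# 2. A list grows and shrinks as items are added and removed
sizes = [len(mixed)]          # 4 items to start
mixed.append(True)
sizes.append(len(mixed))      # 5 after one append
mixed.pop()
mixed.pop()
sizes.append(len(mixed))      # 3 after two pops

print(kinds)
print(sizes)
```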

Lists in Python

There are several ways to create new lists:

>>> # assign the literal empty list
>>> empty = []
>>> empty
[]

>>> # use the list constructor
>>> empty = list()    
>>> empty
[]

>>> # give a literal assignment of items
>>> items = ["bread", "milk", "cheese", "cider"]
>>> #         0        1       2         3
>>> items
['bread', 'milk', 'cheese', 'cider']

>>> # use the list constructor with a string
>>> letters = list("hello")
>>> letters
['h', 'e', 'l', 'l', 'o']

>>> # use string.split which returns a list of strings
>>> words = "a bunch of words".split()
>>> words
['a', 'bunch', 'of', 'words']

Operators:

Methods:

Built-in functions that operate on lists:

Examples

List Methods

append(item) - add an item to the end of the list

>>> items
['bread', 'milk', 'cheese', 'lemonade']
>>> items.append('ham')
>>> items
['bread', 'milk', 'cheese', 'lemonade', 'ham']
>>> items.append('ham')
>>> items
['bread', 'milk', 'cheese', 'lemonade', 'ham', 'ham']
>>>

count(item) - count occurrences of an item in the list

>>> items
['bread', 'milk', 'cheese', 'lemonade', 'ham', 'ham']
>>> items.count('ham')
2
>>> items.count('cheese')
1
>>>

remove(item) - remove the first occurrence of an item in the list

>>> items
['bread', 'milk', 'cheese', 'lemonade', 'ham', 'ham']
>>> items.remove('ham')
>>> items
['bread', 'milk', 'cheese', 'lemonade', 'ham']
>>>

extend(list) - append multiple items to end of the list

>>> items.extend(['lettuce', 'tomatoes'])
>>> items
['bread', 'milk', 'cheese', 'lemonade', 'ham', 'lettuce', 'tomatoes']
>>>

index(item) - locate the index of an item in the list

>>> items
['bread', 'milk', 'cheese', 'lemonade', 'ham', 'lettuce', 'tomatoes']
>>> items.index('milk')
1
>>>

insert(index, item) - insert an item before the one at the index

>>> items
['bread', 'milk', 'cheese', 'lemonade', 'ham', 'lettuce', 'tomatoes']
>>> items.insert(2, 'eggs')
>>> items
['bread', 'milk', 'eggs', 'cheese', 'lemonade', 'ham', 'lettuce', 'tomatoes']
>>>

reverse() - reverse the order of the items in the list

>>> items
['bread', 'milk', 'eggs', 'cheese', 'lemonade', 'ham', 'lettuce', 'tomatoes']
>>> items.reverse()
>>> items
['tomatoes', 'lettuce', 'ham', 'lemonade', 'cheese', 'eggs', 'milk', 'bread']
>>>

sort() - sort the list

>>> items
['tomatoes', 'lettuce', 'ham', 'lemonade', 'cheese', 'eggs', 'milk', 'bread']
>>> items.sort()
>>> items
['bread', 'cheese', 'eggs', 'ham', 'lemonade', 'lettuce', 'milk', 'tomatoes']
>>>

Built-In Functions

len(list) - count number of items in the list

>>> items
['bread', 'cheese', 'eggs', 'ham', 'lemonade', 'lettuce', 'milk', 'tomatoes']
>>> len(items)
8
>>>

del(list[index]) - remove the item at the index from the list (del is a statement rather than a function, so the parentheses are optional)

>>> items
['bread', 'cheese', 'eggs', 'ham', 'lemonade', 'lettuce', 'milk', 'tomatoes']
>>> del(items[4])
>>> items
['bread', 'cheese', 'eggs', 'ham', 'lettuce', 'milk', 'tomatoes']
>>>

Using lists

Use a list when order matters. 

names = []

name = raw_input("Enter a name (or . to quit): ")

while name != ".":
    names.append(name)
    name = raw_input("Enter a name (or . to quit): ")

names.sort()

for name in names:
    print name

Mutability

An object that is mutable can be changed in place.  Lists are mutable.
Example:  pets[2] = "foo"

>>> pets = ['cat', 'dog', 'bird', 'hamster']
>>> pets
['cat', 'dog', 'bird', 'hamster']
>>> pets[2] = "foo"
>>> pets
['cat', 'dog', 'foo', 'hamster']
>>>

Strings are immutable (that is, not mutable).
Example:  word[3] = 'f' is not supported

>>> word = 'fooooo'
>>> word[3] = 'f'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object does not support item assignment
>>>
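
Because a string cannot be changed in place, the usual workaround is to build a new string from pieces of the old one. A minimal sketch, using the hypothetical variable names word and changed, written to run under both Python 2 and Python 3:

```python
word = 'fooooo'

# Item assignment on a string raises TypeError
try:
    word[3] = 'f'
    assigned = True
except TypeError:
    assigned = False

# Build a new string from the slices on either side of index 3
changed = word[:3] + 'f' + word[4:]

print(assigned)
print(changed)
```

The original string is left untouched; changed is a separate object.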

Lists and other data structures can be used as return values from functions.  They can also be passed as arguments, and a function can change a mutable data structure that it receives.
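
The two roles can be sketched side by side. The function names add_greeting and make_initials are invented for this example, and the code is written to run under both Python 2 and Python 3.

```python
def add_greeting(names):
    """Change the caller's list in place; returns nothing."""
    for i in range(len(names)):
        names[i] = "Hello, " + names[i]


def make_initials(names):
    """Leave the argument alone and return a new list."""
    initials = []
    for name in names:
        initials.append(name[0])
    return initials


people = ["sue", "ben", "dan"]
letters = make_initials(people)   # people is unchanged here
add_greeting(people)              # people is modified here

print(letters)
print(people)
```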

names.dat

Nicholas, Charles
Finin, Tim
desJardins, Marie
Evans, Sue

Example output:

1    DESJARDINS, MARIE
2    EVANS, SUE
3    FININ, TIM
4    NICHOLAS, CHARLES

How about a top-down design?



# printGreeting() explains the program to the user
# Inputs: None
# Output: None
def printGreeting():
    
    print "This program reads in a list of names from a file,"
    print "capitalizes them, sorts them alphabetically, and prints"
    print "them out in a numbered table."


# readNames() reads in the names from a file
# and puts them into a list called names
# Input: filename, the name of the file to open
# Output: names, the list of names 
def readNames(filename):

    names = []

    file = open(filename, 'r')

    for line in file:
        name = line.strip()
        names.append(name)

    file.close()

    return names


# capitalizeNames() capitalizes names (changes the list in-place)
# Inputs: the list of names
# Outputs: None, but the list was modified by this function
def capitalizeNames(names):

    for i in range(len(names)):
        names[i] = names[i].upper()


# printTable() prints the table of numbered names
# one per line
def printTable(names):

    length = len(names)

    for i in range(length):
        print (i + 1), "\t", names[i]


def main():

    printGreeting()

    # get filename from user    
    filename = raw_input("What's the name of the file ? ")

    # process the names
    names = readNames(filename)
    capitalizeNames(names)
    names.sort()

    # print the table
    printTable(names)

main()

Sets

Definition:

A set is an unordered collection of items in which duplicates are not allowed.
Items must be immutable (such as numbers or strings).
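
A short sketch of these properties, assuming made-up names and course numbers that are not part of any assignment. It is written to run under both Python 2 and Python 3.

```python
# Duplicates are dropped when a set is built
names = set(["sue", "ben", "dan", "dan", "sue"])
size = len(names)                       # 3

# Membership test
has_ben = "ben" in names                # True

# Common set operations (hypothetical course lists)
required = set(["CMSC 201", "CMSC 202", "CMSC 203"])
taken = set(["CMSC 201", "MATH 151"])
both = required & taken                 # intersection
either = required | taken               # union
needed = required - taken               # difference

# Mutable items such as lists cannot be stored in a set
try:
    set([["a", "list"]])
    stored = True
except TypeError:
    stored = False

print(size)
print(stored)
```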

Operations:

Sets in Python

Operators

Methods

Built-in functions that operate on sets

Using sets

When would you use a set?

When order doesn't matter and you don't want to allow duplicates.

linuxserver1.cs.umbc.edu[128] python nameset.py
Enter a name (or . to quit): sue
Enter a name (or . to quit): ben
Enter a name (or . to quit): dan
Enter a name (or . to quit): dawn
Enter a name (or . to quit): tess
Enter a name (or . to quit): dan
Enter a name (or . to quit): dawn
Enter a name (or . to quit): .
sue
ben
dan
dawn
tess
linuxserver1.cs.umbc.edu[129]
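
The source of nameset.py is not shown here. A minimal sketch of the idea, assuming the names have already been read into a list (the function name unique_names is made up for this example, and the code is written to run under both Python 2 and Python 3):

```python
def unique_names(entries):
    """Return the set of distinct names in entries."""
    names = set()
    for name in entries:
        names.add(name)       # adding a duplicate has no effect
    return names


typed = ["sue", "ben", "dan", "dawn", "tess", "dan", "dawn"]
names = unique_names(typed)

# A set has no defined order, so sort it for a predictable printout
for name in sorted(names):
    print(name)
```

Iterating over the set directly would also print each name once, but the order would not be guaranteed to match the order in which the names were entered.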

Using Sets for HW4

You should have a file that contains the names of the courses
you've taken toward the CS major.  There should be one course
per line
Please enter the name of this file : courses.txt

Part A Requirements:
You need to take CMSC 411
You need to take CMSC 313
You need to take CMSC 421
You need to take CMSC 341
You need to take CMSC 345
You need to take CMSC 331
You need to take CMSC 441
You need to take CMSC 304

Part B Requirements:
You need to take MATH 221

Oh, no!
We have the right answer, but our list of courses is out of order!

So, after trying two different data structures for HW4 (strings and sets), which data structure do you think would be best for this program?

Associative Arrays

Definition:

An associative array is a collection of key-value pairs that associates exactly one value with each key.  The keys must be immutable (such as numbers or strings).  Values can be any type.  Keys aren't in any particular order.  Associative arrays are implemented with the Python dictionary type.

Operations:

Associative arrays are built into many languages, but they are called many different things:

Associative Arrays in Python

Creating new dictionaries:

# use the constructor to make an empty dictionary
empty = dict()

# use the empty dictionary literal
empty = {}        

# use the constructor with some initial pairings
# in this case the keys are strings
numbers = dict(one=1, two=2)        

# use the dictionary literal with some initial pairings
capitals = {'MD':'Annapolis', 'ME':'Augusta'}
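
Once a dictionary exists, its pairings can be read, added, and tested with the indexing and in operators. The sketch below reuses the capitals dictionary; the flag names are made up for illustration, and the code is written to run under both Python 2 and Python 3.

```python
capitals = {'MD': 'Annapolis', 'ME': 'Augusta'}

# look up the value stored under a key
md_capital = capitals['MD']

# add a new pairing, or replace an existing one, by assignment
capitals['MI'] = 'Lansing'

# test for a key with the in operator
has_me = 'ME' in capitals
has_nd = 'ND' in capitals

# indexing with a missing key raises KeyError
try:
    capitals['VA']
    found = True
except KeyError:
    found = False

print(md_capital)
print(len(capitals))
```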

Operators:

Methods:

Built-in functions that operate on dictionaries:

Dictionary Examples

Operators:

Methods:

copy() - create a copy of the dictionary

>>> cap2 = capitals.copy()
>>> cap2
{'ME': 'Augusta', 'MD': 'Annapolis', 'MI': 'Lansing'}

clear() - remove all entries from the dictionary

>>> cap2.clear()
>>> cap2
{}

get(key) - find the value associated with a key

>>> cap = capitals.get('MD')
>>> cap
'Annapolis'
>>> cap = capitals.get('VA')
>>> cap
>>>

get(key, default) - use a default value when a key isn't found

>>> cap = capitals.get('MD', 'unknown')
>>> cap
'Annapolis'
>>> cap = capitals.get('VA', 'unknown')
>>> cap
'unknown'

has_key(key) - another way to test for key containment (Python 2 only; the in operator works in every version)

>>> capitals.has_key('ME')
True
>>> capitals.has_key('ND')
False

keys() - create a list of the keys in the dictionary

>>> capitals.keys()
['ME', 'MD', 'MI']

values() - create a list of the values in the dictionary

>>> capitals.values()
['Augusta', 'Annapolis', 'Lansing']
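
Since keys come back in no particular order, a common pattern is to sort them before walking through the dictionary. A minimal sketch, with the list name lines invented for this example, written to run under both Python 2 and Python 3:

```python
capitals = {'ME': 'Augusta', 'MD': 'Annapolis', 'MI': 'Lansing'}

# visit the keys in alphabetical order and look up each value
lines = []
for state in sorted(capitals.keys()):
    lines.append(state + " " + capitals[state])

for line in lines:
    print(line)
```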

Built-in functions that operate on dictionaries:

len(dict) - count the number of keys in the dictionary

>>> len(capitals)
3

del(dict[key]) - remove a key (and its value) from the dictionary (del is a statement rather than a function, so the parentheses are optional)

>>> del(capitals['ME'])
>>> capitals
{'MD': 'Annapolis', 'MI': 'Lansing'}
>>> 

Using associative arrays

Why use an associative array?

When you want to associate keys with values.  Keys don't have to be numbers, and if they are, they don't have to be contiguous or start from zero (as list indices do).

ite207-pc-01.cs.umbc.edu[126] python dice.py
Sum     Occurences
------------------
2       38
3       60
4       92
5       104
6       141
7       169
8       132
9       120
10      68
11      43
12      33
ite207-pc-01.cs.umbc.edu[127] 
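
The source of dice.py is not shown here. The counts in the table add up to 1000, so one plausible sketch rolls two dice 1000 times and uses the sum as a dictionary key. The function name roll_dice and its seed parameter are assumptions made for this example, and the code is written to run under both Python 2 and Python 3.

```python
import random


def roll_dice(num_rolls, seed=None):
    """Roll two dice num_rolls times; return a dict of sum -> count."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(num_rolls):
        total = rng.randint(1, 6) + rng.randint(1, 6)
        # get() supplies 0 the first time a sum is seen
        counts[total] = counts.get(total, 0) + 1
    return counts


counts = roll_dice(1000, seed=1)

print("Sum\tOccurrences")
print("-" * 18)
for total in sorted(counts):
    print(str(total) + "\t" + str(counts[total]))
```

The keys here are numbers, but they start at 2 rather than 0, which is why a dictionary is a more natural fit than a list.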

Dictionary Exercise

Here's the data file

Plymouth 02360
Bar_Harbor 04609
Mystic 06355
Cape_May 08204
Hyde_Park 12538
Niagara_Falls 14301
Paradise 17562
Rehoboth_Beach 19971
D.C. 20001
UMBC 21250
Key_West 33040
Nashville 37201
Custer 57730
West_Yellowstone 59758
Whitefish 59901
Chicago 60610
St._Louis 63121
Kansas_City 66101
Baton_Rouge 70801
Austin 78701
Boulder 80301
Cody 82414
Ketchum 83340
Moab 84532
Tempe 85281
Taos 87571
Las_Vegas 89110
Big_Sur 93920
Carmel 93923
Monterey 93940
San_Francisco 94110

zipcodes = dict()

filename = raw_input('Enter the filename of the zipcode file : ')

# each line holds a place and its zip code, separated by whitespace
infile = open(filename, 'r')
for line in infile:
    # split() with no arguments also discards the trailing newline
    place, zipcode = line.split()
    zipcodes[place] = zipcode

infile.close()

place = raw_input('Which location (Q to quit) ? ')

while place != 'Q':

    # get() falls back to 'unknown' when the place isn't a key
    zipcode = zipcodes.get(place, 'unknown')
    print zipcode

    place = raw_input('Which location ? ')

Let's try it out!

linuxserver1.cs.umbc.edu[149] python zipcodes.py
Enter the filename of the zipcode file : zipcodes.txt
Which location (Q to quit) ? Plymouth
02360
Which location ? Carmel
93923
Which location ? UMBC
21250
Which location ? Oshkosh
unknown
Which location ? Niagara_Falls
14301
Which location ? Q
linuxserver1.cs.umbc.edu[150]

Which data structure?

What data structures would you use for these programs?