Marcus Kazmierczak

Working with Python

Collections

Aside from lists and dictionaries, Python has several other useful collection objects, most of which live in the standard collections module.

See: Python Collections documentation

Sets

A Python set is equivalent to a mathematical set and supports similar operations. You can also think of a set as a list in which every element is unique and the order is not preserved.

Note: Sets are built in rather than part of the collections module, but this section of the guide felt like the best place for them.

Create a set

You can create a set in several ways. Creating a set automatically drops any duplicate elements.

 
# using curly brackets
A = { 'a', 'b', 'c', 'a'}
>>> { 'c', 'a', 'b' }
 
# from list
A = set([1, 2, 3, 1])
>>> {1, 2, 3}
 
# using comprehensions
A = { c for c in 'abracadabra' }
>>> {'d', 'c', 'b', 'a', 'r'}
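One thing the curly-bracket syntax cannot do is create an empty set, because empty brackets produce a dict. The short sketch below illustrates that, along with a membership test; the variable names are only for illustration.

```python
# Empty curly brackets create a dict, not a set
empty_dict = {}
empty_set = set()
print(type(empty_dict).__name__)  # dict
print(type(empty_set).__name__)   # set

# Duplicates are dropped, and `in` tests membership
letters = set('abracadabra')
print(len(letters))    # 5
print('a' in letters)  # True
print('z' in letters)  # False
```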

Set Operations

You can perform the common set operations: union, intersection, and difference.

A = { 1, 3, 5, 7, 9 }
B = { 2, 4, 6, 8, 0 }
C = { 2, 3, 4 }
 
# return set that combines the two sets
A.union(B)
>>> {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
 
# return set with elements common between two sets
A.intersection(C)
>>> {3}
 
# return set with elements only in A, not C
A.difference(C)
>>> {1, 5, 9, 7}

The operations also work on an arbitrary number of sets.

 
A = set() # empty set
B = { 2, 3 }
C = { 4, 5 }
D = { 6, 7 }
A.union(B, C, D)
>>> { 2, 3, 4, 5, 6, 7 }
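The same operations are also available as operators, which can read more naturally when both operands are already sets. A minimal sketch reusing the sets from the earlier example; the results are sorted before printing because sets have no guaranteed order.

```python
A = {1, 3, 5, 7, 9}
B = {2, 4, 6, 8, 0}
C = {2, 3, 4}

union = A | B    # same as A.union(B)
common = A & C   # same as A.intersection(C)
only_a = A - C   # same as A.difference(C)
either = A ^ C   # symmetric difference: in A or C, but not both

print(sorted(union))   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(common)          # {3}
print(sorted(only_a))  # [1, 5, 7, 9]
print(sorted(either))  # [1, 2, 4, 5, 7, 9]
```

Unlike the methods, which accept any iterable, the operators require both sides to be sets.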

Default Dictionary

The defaultdict is extremely useful when building a dictionary and you don't want to always check if it already contains a key.

Here's how you might count characters in a string using the standard dict object.

d = {}
s = "abcdadkacb"
for ch in s:
	if ch not in d:
		d[ch] = 1
	else:
		d[ch] += 1

Here's an easier way using the defaultdict type, which avoids the extra check.

from collections import defaultdict
 
d = defaultdict(int)
s = "abcdadkacb"
for ch in s:
	d[ch] += 1

Using int as the default factory gives every missing key a value of 0. You can set your own default using a lambda function that returns a constant. For example, to have a default value of 1, use:

from collections import defaultdict
d = defaultdict(lambda: 1)
keys = 'abc'
d['b'] = 2
for k in keys:
    print(d[k])
>>> 1
>>> 2
>>> 1
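The default factory does not have to produce a number. As a rough sketch, passing list gives every missing key an empty list, which makes grouping items straightforward. The word list and the by_letter name are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# Each missing key starts out as an empty list
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```

Keep in mind that simply reading a missing key with d[key] inserts it into the defaultdict; use `key in d` or d.get(key) to check without adding an entry.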

Named Tuple

The namedtuple is a convenience type that lets you create a tuple whose positions have names, so each value can be accessed as an attribute. This makes code that works with tuples more readable.

from collections import namedtuple
 
Point = namedtuple('Point', ['x', 'y'])
p1 = Point(1, 2)
p2 = Point(y=4, x=2)
 
# Manhattan distance between the two points
d = abs(p2.x - p1.x) + abs(p2.y - p1.y)
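A namedtuple is still a regular tuple, so indexing and unpacking keep working, and it is immutable like any other tuple. The sketch below shows a few of these behaviors using the same Point type; the variable names are illustrative.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(1, 2)

# Works like a plain tuple: index or unpack by position
x, y = p
print(p[0], p.x)  # 1 1

# Fields cannot be reassigned; _replace returns a modified copy
moved = p._replace(x=10)
print(moved)  # Point(x=10, y=2)
print(p)      # Point(x=1, y=2)

# Convert to a dictionary keyed by field name
print(dict(p._asdict()))  # {'x': 1, 'y': 2}
```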

deque

The deque object (pronounced "deck") is an efficient double-ended queue. In addition to the familiar .append() and .pop(), it provides .appendleft() and .popleft(), so you can add and remove items at both the left and right ends.

from collections import deque
 
d = deque(['a', 'b', 'c'])
a = d.popleft()
d.appendleft('z')
print(d)
>>> deque(['z', 'b', 'c'])

Rotation

A deque object additionally has a .rotate(n) method, which rotates the elements n steps. Positive values of n rotate to the right and negative values rotate to the left.

from collections import deque
 
d = deque(['a', 'b', 'c' ,'d'])
d.rotate(2)
print(d)
>>> deque(['c', 'd', 'a', 'b'])
 
d.rotate(-3)
print(d)
>>> deque(['b', 'c', 'd', 'a'])
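One way to picture a rotation is as repeatedly moving the last element to the front. The following sketch compares .rotate(2) with doing that by hand; the manual copy exists only for the comparison.

```python
from collections import deque

d = deque(['a', 'b', 'c', 'd'])
manual = deque(d)  # independent copy for the hand-rolled version

# Rotating right by one step moves the last item to the front
for _ in range(2):
    manual.appendleft(manual.pop())

d.rotate(2)
print(d)            # deque(['c', 'd', 'a', 'b'])
print(d == manual)  # True
```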

Counter

The Counter counts the items in an iterable, such as a list or a string, and stores the result in a dict subclass that maps each item to its count.

from collections import Counter
c = Counter('abcdeabieaabi')
 
print(c)
>>> Counter({'a': 4, 'b': 3, 'e': 2, 'i': 2, 'c': 1, 'd': 1})

Most Common

The .most_common(n) method returns a list of the n most common items, ordered from most to least frequent. Each entry is a tuple containing the item and its count.

from collections import Counter
c = Counter('abcdeabieaabi')
mc = c.most_common(1)
print(f"Most common {mc[0][0]} with count {mc[0][1]}")
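Since each entry is an (item, count) tuple, the result can be unpacked directly in a loop. A short sketch with the same string, also showing that a Counter returns 0 for an item it has never seen rather than raising a KeyError:

```python
from collections import Counter

c = Counter('abcdeabieaabi')

top_two = c.most_common(2)
print(top_two)  # [('a', 4), ('b', 3)]

# Unpack each (item, count) tuple while iterating
for item, count in top_two:
    print(f"{item} appears {count} times")

# Items that were never counted have a count of 0
print(c['z'])  # 0
```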

Total

Use .total() to return the sum of the counts. This method requires Python 3.10 or later.

from collections import Counter
c = Counter('abcdeabieaabi')
print(c.total())
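On Python versions older than 3.10, where .total() is not available, summing the counts by hand should give the same number. A minimal sketch:

```python
from collections import Counter

c = Counter('abcdeabieaabi')

# Add up every count stored in the Counter
total = sum(c.values())
print(total)  # 13
```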