Working with Python
Collections
Aside from lists and dictionaries, Python has several other useful collection objects in the standard collections
module.
See: Python Collections documentation
Sets
Python's set object is equivalent to a mathematical set and supports similar operations. You can also think of a set as a list in which every element is unique and the elements have no order.
Note: sets are built in rather than part of the collections module, but this section of the guide felt like the best place for them.
Create a set
You can create a set in multiple ways. Creating a set automatically drops any duplicate elements.
# using curly brackets
A = { 'a', 'b', 'c', 'a'}
>>> { 'c', 'a', 'b' }
# from list
A = set([1, 2, 3, 1])
>>> {1, 2, 3}
# using comprehensions
A = { c for c in 'abracadabra' }
>>> {'d', 'c', 'b', 'a', 'r'}
Set Operations
You can perform the common set operations: union, intersection, and difference.
A = { 1, 3, 5, 7, 9 }
B = { 2, 4, 6, 8, 0 }
C = { 2, 3, 4 }
# return set that combines the two sets
A.union(B)
>>> {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
# return set with elements common between two sets
A.intersection(C)
>>> {3}
# return set with elements only in A, not C
A.difference(C)
>>> {1, 5, 9, 7}
The operations also work on an arbitrary number of sets.
A = set() # empty set
B = { 2, 3 }
C = { 4, 5 }
D = { 6, 7 }
A.union(B, C, D)
>>> { 2, 3, 4, 5, 6, 7 }
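The same operations can also be written with operators, which read closer to mathematical notation. A minimal sketch reusing the sets from the earlier example; note that the operator forms require both operands to be sets, while the methods accept any iterable.

```python
A = {1, 3, 5, 7, 9}
B = {2, 4, 6, 8, 0}
C = {2, 3, 4}

union = A | B      # same result as A.union(B)
common = A & C     # same result as A.intersection(C)
only_a = A - C     # same result as A.difference(C)

print(union)
print(common)
print(only_a)
```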
Default Dictionary
The defaultdict is extremely useful when building a dictionary and you don't want to check every time whether it already contains a key.
Here's how you might count characters in a string using the standard dict object.
d = {}
s = "abcdadkacb"
for ch in s:
    if ch not in d:
        d[ch] = 1
    else:
        d[ch] += 1
Here's an easier way using the defaultdict type, which avoids the extra check.
from collections import defaultdict
d = defaultdict(int)
s = "abcdadkacb"
for ch in s:
    d[ch] += 1
Using int as the default factory returns 0 for missing keys. You can set your own default by passing a lambda function that returns a constant. For example, to have a default value of 1, use:
from collections import defaultdict
d = defaultdict(lambda: 1)
keys = 'abc'
d['b'] = 2
for k in keys:
    print(d[k])
>>> 1
>>> 2
>>> 1
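The default factory can be any zero-argument callable, not only int or a lambda. As an illustrative sketch (the word list is made up for this example), passing list lets you group values under a key without first checking whether the key exists.

```python
from collections import defaultdict

# Hypothetical sample data, used only for illustration
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

groups = defaultdict(list)
for word in words:
    # A missing key is created with an empty list before append() runs
    groups[word[0]].append(word)

print(dict(groups))
```

Keep in mind that merely reading a missing key, as in d[k], also inserts that key with the default value.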
Named Tuple
The namedtuple is a convenience type that lets you create a tuple and assign names to its positions, so each field can be accessed by name. This makes code more readable and element access simpler.
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
p1 = Point(1, 2)
p2 = Point(y=4, x=2)
# Manhattan distance between the two points
d = abs(p2.x - p1.x) + abs(p2.y - p1.y)
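Because a namedtuple is still a regular tuple, index access and unpacking keep working alongside the named fields. A short sketch reusing the Point type; _replace() and _asdict() are part of the standard namedtuple API.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(1, 2)

x, y = p                  # unpacks like a plain tuple
first = p[0]              # positional access still works
moved = p._replace(x=5)   # tuples are immutable, so this returns a new Point
as_dict = p._asdict()     # maps field names to values

print(x, y, first)
print(moved)
print(as_dict)
```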
deque
The deque object (pronounced "deck") is an efficient double-ended queue. In addition to the list-like .append() and .pop(), it adds .appendleft() and .popleft(), so items can be added and removed at both the left and right ends.
from collections import deque
d = deque(['a', 'b', 'c'])
a = d.popleft()
d.appendleft('z')
print(d)
>>> deque(['z', 'b', 'c'])
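As a minimal sketch of working at both ends, the example below combines the left-side methods with the list-style append() and pop() on the right, so the same object can act as a queue or a stack.

```python
from collections import deque

d = deque(['a', 'b', 'c'])

d.append('d')        # add on the right
d.appendleft('z')    # add on the left
right = d.pop()      # remove from the right
left = d.popleft()   # remove from the left

print(right, left)
print(d)
```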
Rotation
A deque object additionally has a .rotate(n) method, which rotates the elements n steps. A positive n rotates to the right and a negative n rotates to the left.
from collections import deque
d = deque(['a', 'b', 'c', 'd'])
d.rotate(2)
print(d)
>>> deque(['c', 'd', 'a', 'b'])
d.rotate(-3)
print(d)
>>> deque(['b', 'c', 'd', 'a'])
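One way to picture a rotation: the Python documentation describes rotating one step to the right as equivalent to d.appendleft(d.pop()). A small sketch comparing the two on separate copies of the same data:

```python
from collections import deque

rotated = deque(['a', 'b', 'c', 'd'])
rotated.rotate(1)

manual = deque(['a', 'b', 'c', 'd'])
manual.appendleft(manual.pop())   # move the rightmost item to the front

print(rotated)
print(manual)
```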
Counter
The Counter is useful for counting the items in a list or other iterable. It is a dict subclass that maps each item to its count.
from collections import Counter
c = Counter('abcdeabieaabi')
print(c)
>>> Counter({'a': 4, 'b': 3, 'e': 2, 'i': 2, 'c': 1, 'd': 1})
Most Common
The .most_common(n) method returns a list of the n most common items as (item, count) tuples, ordered from most to least common.
from collections import Counter
c = Counter('abcdeabieaabi')
mc = c.most_common(1)
print(f"Most common {mc[0][0]} with count {mc[0][1]}")
>>> Most common a with count 4
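Asking for more than one item returns a longer list that can be unpacked in a loop. A minimal sketch using the same string:

```python
from collections import Counter

c = Counter('abcdeabieaabi')

top_two = c.most_common(2)
for item, count in top_two:
    # Each entry is an (item, count) tuple
    print(f"{item}: {count}")
```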
Total
Use .total() to return the sum of the counts. This method requires Python 3.10 or later.
from collections import Counter
c = Counter('abcdeabieaabi')
print(c.total())
>>> 13
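On Python versions where .total() is unavailable, summing the counts directly gives the same result. A sketch of that fallback:

```python
from collections import Counter

c = Counter('abcdeabieaabi')

# Equivalent to c.total(), but does not depend on the newer method
total = sum(c.values())
print(total)
```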