Nested functions can access variables of the enclosing scope. The criteria for a closure are:
- There is a nested function.
- The nested function references a variable defined in the enclosing function.
- The enclosing function returns the nested function.
def general_multiplier(n):
    def multiplier(x):
        return x*n
    return multiplier
times_2 = general_multiplier(2)
times_3 = general_multiplier(3)
times_2(10) # 20
times_3(10) # 30
del general_multiplier
# still works after the function has been removed
times_2(8) # 16
times_3(8) # 24
The value in the enclosing scope is remembered even when the variable goes out of scope or the function itself is removed from the current namespace.
When to use a closure?
- If a class has only a few methods (typically one), a closure may be a simpler alternative to that class.
- To avoid the use of global variables and provide some form of data hiding.
class A:
    def __init__(self):
        self.data = []
    # make its instances callable
    def __call__(self, a):
        self.data.append(a)
        _sum = sum(self.data)
        return _sum
a = A()
# call the instance
a(1) # 1
a(2) # 3
a(3) # 6
a(4) # 10
# implement the same class as a closure
def f():
    data = [] # acts like an instance attribute
    # named add, not sum, so that the built-in sum() is not shadowed
    def add(a):
        data.append(a)
        _sum = sum(data)
        return _sum
    return add
t = f()
t(1) # 1
t(2) # 3
t(3) # 6
t(4) # 10
When not to use a closure?
- When the number of attributes and methods grows, use a class rather than a closure.
# x, y, z are nonlocal variables of g
def f():
    x = 2
    y = 4
    z = 6
    def g(a, b, c):
        return x*a, y*b, z*c
    return g
s = f()
# check if s is a closure function
s.__closure__ # returns a tuple of cell objects
# each cell object has the attribute cell_contents which stores the closed value
# this will return the values of the nonlocal variables which are 2, 4, 6
s.__closure__[0].cell_contents # 2
s.__closure__[1].cell_contents # 4
s.__closure__[2].cell_contents # 6
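As a small sketch of how this check separates closures from ordinary functions, the example below compares the two. The names `plain` and `make_scaler` are hypothetical and used only for illustration.

```python
def plain(a):
    # uses no variable from an enclosing function
    return a * 2

def make_scaler(k):
    def scale(a):
        # k is a free variable taken from make_scaler
        return a * k
    return scale

scaler = make_scaler(3)

# an ordinary function has no closure cells
print(plain.__closure__)                    # None
# a closure has one cell per captured variable
print(len(scaler.__closure__))              # 1
print(scaler.__closure__[0].cell_contents)  # 3
# names of the captured variables
print(scaler.__code__.co_freevars)          # ('k',)
```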
Use the nonlocal keyword to modify a variable in the outer function scope.
def outer():
    x = 8
    def inner():
        nonlocal x # x must already be defined in the outer scope
        x = 2
    print("x =", x)
    inner()
    print("x =", x)
outer()
# prints
# x = 8
# x = 2
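A common use of nonlocal is keeping private state between calls, which ties back to the data-hiding point made earlier. The following is a minimal sketch; `make_counter` and `increment` are hypothetical names chosen for this example.

```python
def make_counter():
    count = 0  # hidden state, reachable only through increment()
    def increment():
        nonlocal count  # rebind count in make_counter's scope
        count += 1
        return count
    return increment

counter = make_counter()
first = counter()
second = counter()

# a second counter gets its own, independent count
other = make_counter()
third = other()

print(first, second, third)  # 1 2 1
```

Without the nonlocal statement, `count += 1` would raise UnboundLocalError, because the assignment would make `count` a local variable of `increment`.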
Almost everything in Python is an object, i.e., an instance of the object class or of one of its subclasses. Hence, functions are also objects, and they can be passed as arguments to other functions or returned from other functions.
def f(text):
    return text.upper()
def g(text):
    return text.lower()
# function as a parameter
def h(func, text):
    s = func(text)
    print(s)
# functions as arguments
h(f, 'Python') # PYTHON
h(g, 'JavaScript') # javascript
The following are also possible.
- Assign the function to a variable.
- Use del to delete functions.
- Store functions in lists, etc.
def f(n):
    i = 2
    a = []
    def g(x):
        return x*i
    while i <= n:
        # store functions in a list
        a.append(g)
        i += 1
    # return a list of functions
    return a
# assign the function to a variable
h = f
# h and f are two references to the same function
# let's delete f and work with h
del f
# unpack the functions
x, y, z = h(4)
# all are multipliers of 5: g looks up i when it is called, and i is 5 after the loop
x(5) # 25
y(7) # 35
z(9) # 45
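In the example above, every function in the list multiplies by the same final value because the closed-over variable is looked up at call time. If a different multiplier per function is wanted, one common approach is to bind the current value as a default argument. This is only a sketch; `make_multipliers` is a hypothetical name.

```python
def make_multipliers(n):
    multipliers = []
    for i in range(2, n + 1):
        # i=i evaluates i now and stores it as the default value
        def g(x, i=i):
            return x * i
        multipliers.append(g)
    return multipliers

x, y, z = make_multipliers(4)
print(x(5), y(7), z(9))  # 10 21 36
```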
Decorators are one of the most common uses of higher-order functions. A decorator takes a function as an argument, adds functionality to it (decorates it), and returns the resulting function.
def decorate(f):
    def inner():
        print("Just got decorated.")
        f()
    return inner
def g():
    print("I am just a function")
# send g to decorate()
# h is also a closure function
h = decorate(g)
g()
# I am just a function
h()
# Just got decorated.
# I am just a function
In the above, one could have written:
g = decorate(g)
The function g just got upgraded (or decorated). This is exactly what a decorator is. Python has a special syntax for this.
@decorate
def g():
    print('I am just a function')
# above code is equivalent to the following
def g():
    print('I am just a function')
g = decorate(g)
Functions that take parameters can be decorated as follows:
def smart_divider(f):
    def inner(a, b):
        if b == 0:
            print("Can't divide by zero.")
            return
        return f(a, b)
    return inner
@smart_divider
def divider(x, y):
    return x/y
divider(4, 2) # 2.0
divider(4, 0)
# Can't divide by zero.
The parameters of the nested inner() function inside the decorator are the same as the parameters of the function it decorates. A general decorator can be made using *args (which collects positional arguments into a tuple) and **kwargs (which collects keyword arguments into a dictionary).
def decorate(f):
    def inner(*args, **kwargs):
        print('I can decorate any function')
        return f(*args, **kwargs)
    return inner
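To illustrate, the sketch below applies such a general decorator to two functions with different signatures. The functions `greet` and `total` are hypothetical and exist only for this example.

```python
def decorate(f):
    def inner(*args, **kwargs):
        print('I can decorate any function')
        return f(*args, **kwargs)
    return inner

@decorate
def greet(name, greeting='Hello'):
    return greeting + ', ' + name

@decorate
def total(*numbers):
    return sum(numbers)

a = greet('Ada', greeting='Hi')  # positional and keyword arguments
b = total(1, 2, 3)               # any number of positional arguments
print(a)  # Hi, Ada
print(b)  # 6
```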
Decorators can also be chained. A function can be decorated several times, with the same or different decorators.
def first_decorator(f):
    def inner(*args, **kwargs):
        print('I am first decorator')
        return f(*args, **kwargs)
    return inner
def second_decorator(f):
    def inner(*args, **kwargs):
        print('I am second decorator')
        return f(*args, **kwargs)
    return inner
@second_decorator
@first_decorator
def adder(*args):
    return sum(args)
adder(1, 2, 3)
# I am second decorator
# I am first decorator
# 6
# an equivalent way of decorating, applied to the undecorated adder instead of the @ syntax
adder = second_decorator(first_decorator(adder))
One can pass arguments to the decorator itself. To achieve this, define a decorator maker that accepts arguments, then define a decorator inside it. From here on, we call the inner function wrapper() instead of inner() because it wraps around the original function, decorating it.
def decorator_maker(dec_arg_1, dec_arg_2):
    def decorator(func):
        def wrapper(*args, **kwargs):
            ''' I am wrapper '''
            print('I am the wrapper function')
            print(dec_arg_1, dec_arg_2)
            return func(*args, **kwargs)
        return wrapper
    return decorator
@decorator_maker(4, 8)
def f():
    ''' I am f '''
    return 'I want to get decorated'
f()
# I am the wrapper function
# 4 8
# 'I want to get decorated'
If you try to get the metadata of a decorated function, you get the metadata of the wrapper() function instead, which can be frustrating during debugging. Python offers the functools.wraps decorator, which copies the lost metadata from the undecorated function to the decorated closure.
# try to get metadata of f
f.__name__ # 'wrapper'
f.__doc__ # ' I am wrapper '
# the metadata of f() is hidden or overridden by wrapper()
import functools
def decorator(func):
    @functools.wraps(func)
    def wrapper():
        ''' I am wrapper '''
        return func().upper()
    return wrapper
@decorator
def printer():
    ''' I am printer '''
    return 'Python'
# check metadata
printer.__name__ # 'printer'
printer.__doc__ # ' I am printer '
Python offers property(), which is used to create a property of a class. It has the following syntax.
property(fget=None, fset=None, fdel=None, doc=None)
# fget – function used to get the value of the attribute
# fset – function used to set the value of the attribute
# fdel – function used to delete the attribute
# doc – string that contains the documentation (docstring) for the attribute
If doc isn’t provided, property() takes the docstring of the getter function (fget).
class Person:
    def __init__(self, val = 0):
        self.age = val
    # getter
    def getter(self):
        return self._age
    # setter
    def setter(self, val):
        if val < 0:
            raise ValueError('Age cannot be negative.')
        # note the leading underscore: _age is private by convention
        self._age = val
    # give age a property from getter, setter
    age = property(getter, setter)
p = Person(-4) # ValueError: Age cannot be negative.
p = Person(24) # okay
p.age = -12 # ValueError: Age cannot be negative.
# comment out the line where we used property(), then try the following
p.age = -12 # okay
p.__dict__ # {'age': -12}
# uncomment the line where we used property(), then try the following
p.age = 26
p.__dict__ # {'_age': 26}
# interesting: when we modify age, the change is reflected in _age, and __dict__ shows only _age as an attribute
In short, the attribute age has gained some properties from the getter and setter functions. Any code that retrieves the value of age automatically calls getter() instead of doing a dictionary (__dict__) look-up. Any code that assigns a value to age automatically calls setter().
Also, the actual age value is stored in the private _age variable. The age attribute is just a property object which provides an interface to this private variable.
Python offers the @property decorator as syntactic sugar for property().
class Person:
    def __init__(self, val = 0):
        self.age = val
    @property
    def age(self):
        return self._age
    @age.setter
    def age(self, val):
        if val < 0:
            raise ValueError('Negative value is illegal')
        self._age = val
This code has the same functionality as the one we wrote using property().
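The two parts of the property() signature not exercised so far, the deleter and the docstring fallback, can be sketched with the decorator syntax as well. The docstring text and the deleter body below are illustrative assumptions, not part of the original class.

```python
class Person:
    def __init__(self, val=0):
        self.age = val

    @property
    def age(self):
        '''Age in years.'''
        return self._age

    @age.setter
    def age(self, val):
        if val < 0:
            raise ValueError('Negative value is illegal')
        self._age = val

    @age.deleter
    def age(self):
        # called by: del p.age
        del self._age

p = Person(24)
# no doc was given, so the property uses the getter's docstring
print(Person.age.__doc__)  # Age in years.
del p.age
print(p.__dict__)          # {}
```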
Python has the re module to work with regular expressions. See regex101 for building, testing, and debugging regular expressions.
import re
# raw strings (r'...') pass backslash sequences such as \b and \d to the regex engine unchanged
# match a RegEx against the beginning of a string
# returns a re.Match object if a match is found, else returns None
re.match(r'^a.[s].k*t$', 'apsyt') # match = 'apsyt'
re.match(r'\bfoot[pqr]', 'footr') # match = 'footr'
re.match(r'a+j?s{1,2}', 'aaajssh') # match = 'aaajss'
re.match(r'^p.*{tyz}h?(k|o)n', 'python') # None
re.match(r'^p.*t+hon$', 'python') # match = 'python'
# find all matches and return a list
re.findall(r'\d+', 'I am 18, weighs 64 kgs and 6 ft tall.') # ['18', '64', '6']
re.findall(r'\bse', 'Twenty seconds ago, I set the timer.') # ['se', 'se']
re.findall(r'[bc].r', 'I took the car to the bar') # ['car', 'bar']
re.findall(r'a{1,2}', 'It was my cat') # ['a', 'a']
re.findall(r'a*j', 'I love jazz') # ['j']
# split the string where there is a match and return a list of the pieces
re.split(r'\d+', 'I am 18, weighs 64 kgs and 6 ft tall.') # ['I am ', ', weighs ', ' kgs and ', ' ft tall.']
re.split(r'\bse', 'Twenty seconds ago, I set the timer.') # ['Twenty ', 'conds ago, I ', 't the timer.']
re.split(r'[bc].r', 'I took the car to the bar') # ['I took the ', ' to the ', '']
re.split(r'a{1,2}', 'It was my cat') # ['It w', 's my c', 't']
re.split(r'a*j', 'I love jazz') # ['I love ', 'azz']
# some other functions
# return a string where matched occurrences are replaced with the content of the replace variable
re.sub(pattern, replace, string)
# look for the first location where the RegEx pattern produces a match with the string
re.search(pattern, string)
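The last two functions are shown above only as signatures. A concrete sketch follows; the sample sentence is made up for this example.

```python
import re

text = 'I am 18, weigh 64 kgs and am 6 ft tall.'

# replace every run of digits with '#'
masked = re.sub(r'\d+', '#', text)
print(masked)  # I am #, weigh # kgs and am # ft tall.

# search scans the whole string, while match only tries the start
found = re.search(r'\d+', text)
print(found.group())           # 18
print(found.span())            # (5, 7)
print(re.match(r'\d+', text))  # None
```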
See the re library for more functionality.
A process can be broken down into threads, and each thread can be run by a different core of a multi-core processor simultaneously, thus reducing the time spent solving the problem. This is called threading. In Python (specifically CPython), the Global Interpreter Lock (GIL) allows only one thread to hold control of the Python interpreter at a time. This means only one thread can execute Python code at any given moment. Thus, there is no truly parallel threading in Python.
Although threads do not run in parallel in Python, there is a threading library which allows you to use threads. Let's explore this.
import threading
import time
def f():
    for i in range(10000000):
        pass
def g():
    # start time
    t = time.time()
    f()
    f()
    # stop time
    s = time.time()
    # return execution time
    return s - t
# use threading
def h():
    t = time.time()
    # create two threads
    thread_1 = threading.Thread(target = f)
    thread_2 = threading.Thread(target = f)
    # start threads
    thread_1.start()
    thread_2.start()
    # wait until both threads have terminated
    thread_1.join()
    thread_2.join()
    s = time.time()
    # return execution time
    return s - t
# without threading
g()
# with threading
h()
Funnily enough, the threaded version might take as long as, or even longer than, the version without threads. For CPU-bound code like this, the GIL prevents the threads from running in parallel, and switching between them adds overhead. However, if the code spends its time waiting on I/O operations, file operations, or user interaction, threading can be very beneficial.
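To make the I/O case concrete, here is a minimal sketch in which time.sleep() stands in for waiting on a network or disk (an assumption made so the example is self-contained). The names `fake_io`, `sequential` and `threaded` are hypothetical.

```python
import threading
import time

def fake_io():
    # time.sleep stands in for waiting on a network or disk
    time.sleep(0.2)

def sequential():
    t = time.perf_counter()
    fake_io()
    fake_io()
    return time.perf_counter() - t

def threaded():
    t = time.perf_counter()
    threads = [threading.Thread(target=fake_io) for _ in range(2)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return time.perf_counter() - t

seq_time = sequential()  # roughly 0.4 s: the waits happen one after another
thr_time = threaded()    # roughly 0.2 s: the waits overlap
print(seq_time, thr_time)
```

The threads overlap here because a sleeping (or otherwise blocked) thread releases the GIL, so the other thread can proceed while it waits.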
Instead of threading, one can use multiprocessing to break the program into processes rather than threads. Since processes are not constrained by the GIL, multiple cores can be used for multiple processes at a given time. The syntax is very similar.
import multiprocessing
import time
def f():
    for i in range(10000000):
        pass
def g():
    t = time.time()
    # execute f four times
    for i in range(4):
        f()
    s = time.time()
    return s - t
def h():
    t = time.time()
    # create four processes
    proc_1 = multiprocessing.Process(target = f)
    proc_2 = multiprocessing.Process(target = f)
    proc_3 = multiprocessing.Process(target = f)
    proc_4 = multiprocessing.Process(target = f)
    # start processes
    proc_1.start()
    proc_2.start()
    proc_3.start()
    proc_4.start()
    # block the main process until the four processes terminate
    proc_1.join()
    proc_2.join()
    proc_3.join()
    proc_4.join()
    s = time.time()
    return s - t
# the guard keeps child processes from re-running this part when they import the module
if __name__ == '__main__':
    # without multiprocessing
    print(g())
    # with multiprocessing
    print(h())
In the above case, multiprocessing can significantly reduce execution time on a multi-core machine. In general, use multiprocessing for computationally intensive code and threading for code that relies on I/O, file operations, and user interaction.
Each process is independent and has its own memory space.
import os
import multiprocessing
lst = []
def f(nums):
    global lst
    for i in nums:
        lst.append(i)
    print('lst =', lst)
    # os.getpid() returns the process ID of the process running this code
    print(os.getpid())
if __name__ == '__main__':
    nums = [2, 4, 6, 8]
    proc_1 = multiprocessing.Process(target = f, args = (nums, ))
    proc_1.start()
    # wait for process 1 to finish before inspecting lst
    proc_1.join()
    # lst = [2, 4, 6, 8]
    # 4859 (this is the PID, might be different on another run)
    print(lst) # []
    print(os.getpid()) # 4842
    # lst in the main process is not affected by what happens in process 1
Sometimes you may need the processes to communicate or share memory. For this, multiprocessing provides Array and Value objects.
import multiprocessing
def f(lst, val):
    for i, v in enumerate([2, 4, 6, 8]):
        # cannot use append method (why?)
        lst[i] = v
    # you need the [:] operator to convert it into normal list form
    # you need the value attribute of the Value object to access or assign values
    val.value = sum(lst[:])
    print('lst =', lst[:])
    print('sum =', val.value)
if __name__ == '__main__':
    # lst should hold 4 'i'nteger values
    lst = multiprocessing.Array('i', 4)
    # val should hold an integer value
    val = multiprocessing.Value('i')
    # you can give an initial value as
    val = multiprocessing.Value('i', 0)
    # you should give the Array and Value objects as arguments
    proc_1 = multiprocessing.Process(target = f, args = (lst, val))
    proc_1.start()
    # wait for process 1 to finish before reading the shared objects
    proc_1.join()
    # lst = [2, 4, 6, 8]
    # sum = 20
    # changes are reflected in the main process
    print(lst[:]) # [2, 4, 6, 8]
    print(val.value) # 20