Thinking in Functions
atmb4u
CTO @ Profoundis Inc.
Agenda
★ Basics of Functional Programming
★ Functional Programming and Python
★ Everyday Functional Programming
★ Scaling with Functional Programming
★ Performance Implications
★ Data Explorations
House rules
★ Show of hands please
★ Ask anytime - Raise your hands
★ Be hands on
★ Free to use online resources
★ One step at a time
★ No-one left out policy
About Functional Programming
★ Formal Provability
★ Modularity
★ Ease of Debugging and Testing
★ Composability
Basics of Functional Programming
★ Function as the atomic entity
○ First-class functions
○ Higher-order functions
○ Pure functions
★ Side Effects
○ Avoid state
○ Immutability
★ Lazy Evaluation
★ Type Systems
★ Recursion, Tail Recursion, Iterators, Currying, Sequences, Pattern Matching, Monads...
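The terms in the basics list above can be made concrete with a small sketch. It is written for Python 3, and every name in it (square, apply_twice, make_multiplier, append_total, with_total) is made up for illustration rather than taken from the talk.

```python
def square(x):
    """Pure: the result depends only on x and nothing outside is modified."""
    return x * x


def apply_twice(func, value):
    """Higher-order: takes another function as an argument."""
    return func(func(value))


def make_multiplier(factor):
    """Higher-order: builds and returns a new function."""
    def multiply(x):
        return x * factor
    return multiply


# First-class: functions can be stored in data structures like any other value.
operations = {"square": square, "triple": make_multiplier(3)}


def append_total(numbers):
    """Impure: mutates its argument, so the caller's list changes."""
    numbers.append(sum(numbers))
    return numbers


def with_total(numbers):
    """Pure counterpart: returns a new list and leaves the input alone."""
    return numbers + [sum(numbers)]


print(apply_twice(square, 3))
print(operations["triple"](5))
```

The pure version is the easier one to test: calling it twice with the same list gives the same answer, which is not true of append_total.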
Myths of Functional Programming
★ It requires immutability/pureness
★ It requires an advanced type system
★ It is significantly less efficient
★ It makes you learn advanced math
★ You must give up all your imperative programming notions
★ Object orientation and functional paradigms are incompatible
★ Functional programs are easier to debug
★ Dynamic types are better than Static types
Functional methodology in Python
★ itertools and functools
★ Decorators and Generators
★ What Python CAN do
○ lazy eval, lambda, map, reduce, filter
★ What Python CANNOT do (pragmatically)
○ e.g. Tail Recursion, Pure immutability, Pure Functions
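The tail recursion point can be seen directly in CPython, which does not eliminate tail calls. The sketch below is written for Python 3 and uses two hypothetical helpers, sum_to_recursive and sum_to_iterative, to compare a tail-recursive function with a plain loop; it assumes the interpreter's recursion limit is near its default.

```python
import sys


def sum_to_recursive(n, total=0):
    """Tail-recursive in form, but CPython still adds a stack frame per call."""
    if n == 0:
        return total
    return sum_to_recursive(n - 1, total + n)


def sum_to_iterative(n):
    """The same computation as a loop, which uses constant stack space."""
    total = 0
    while n > 0:
        total += n
        n -= 1
    return total


print(sum_to_recursive(100))      # small inputs are fine
print(sum_to_iterative(100000))   # the loop handles large inputs

try:
    sum_to_recursive(100000)
except RecursionError:
    print("recursion limit reached (limit is %d)" % sys.getrecursionlimit())
```

In practice this is why deep recursion in Python is usually rewritten as a loop, a reduce, or a generator.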
Functional Programming: Done Every Day
★ map
★ reduce
★ filter
★ enumerate
★ iter
★ list comprehension
★ Generator Expressions
★ Generators
★ itertools
★ functools
★ operator
★ decorator
★ Lambda Functions
List Comprehension
data = [1,2,3,4,5]
[item * 2 for item in data]
Generator Expression
data = [1,2,3,4,5]
generator_exp = (i*i for i in data)
sum(generator_exp)
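A generator expression is Python's everyday form of lazy evaluation: values are produced only when something asks for them. The following Python 3 sketch reuses the data list from the slide; the names eager, lazy, first, rest and leftover are illustrative.

```python
data = [1, 2, 3, 4, 5]

eager = [i * i for i in data]   # list comprehension: all squares built immediately
lazy = (i * i for i in data)    # generator expression: nothing computed yet

first = next(lazy)              # computes only the first square
rest = sum(lazy)                # consumes the remaining four squares
leftover = list(lazy)           # a generator can be consumed only once

print(eager, first, rest, leftover)
```

The trade-off is that a generator cannot be indexed or replayed; if the values are needed more than once, build a list.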
operator module
import operator
operator.add(40,2)
operator.sub(45,3)
data = {'company': 'Acme Corp.'}
getCompany = operator.itemgetter('company')
getCompany(data)
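operator.itemgetter is most useful as a key function passed to other functions. Below is a small Python 3 sketch; the records and the 'employees' field are invented placeholders, not data from the talk.

```python
from operator import itemgetter

# Placeholder records for illustration only.
records = [
    {"company": "Acme Corp.", "employees": 50},
    {"company": "Beta LLC", "employees": 20},
    {"company": "Gamma Inc.", "employees": 50},
]

# Use itemgetter as a sort key instead of writing a lambda.
by_size = sorted(records, key=itemgetter("employees"))
names = list(map(itemgetter("company"), by_size))

# With several keys, itemgetter returns a tuple of the values.
pair = itemgetter("company", "employees")(records[0])

print(names)
print(pair)
```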
enumerate
for item in enumerate(['subject', 'verb', 'object']):
    print item
map
def upper(s):
    return s.upper()
data = ["pycon", "india"]
map(upper, data)
Map - Explained
from urllib import urlopen
urls = ['http://www.google.com',
        'http://www.wikipedia.com',
        'http://www.apple.com',
        'http://www.python.org'
        ]
result = []
for item in urls:
    result.append(urlopen(item))
return result
def fib(n):
    a, b = 0, 1
    for i in range(n):
        a, b = b, a + b
    ....
integers = [1, 2, 3, 4, 5]
result = []
for item in integers:
    result.append(fib(item))
return result
?
Map - Explained
def map(function, sequence):
    result = []
    for item in sequence:
        result.append(function(item))
    return result
html_texts = map(urlopen, urls)
fib_integers = map(fib, integers)
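The pattern on this slide can be run end to end with a small Python 3 sketch. Two assumptions are made here: the helper is named my_map so it does not shadow the built-in, and fib is completed with `return a`, which is a guess at the part the slide elides.

```python
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a  # assumption: the slide elides the end of this function


def my_map(function, sequence):
    """Hand-written equivalent of the loop-and-append pattern."""
    result = []
    for item in sequence:
        result.append(function(item))
    return result


integers = [1, 2, 3, 4, 5]
fib_integers = my_map(fib, integers)

# In Python 3 the built-in map is lazy, so wrap it in list() to compare.
builtin_result = list(map(fib, integers))

print(fib_integers)
```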
Lambda
count_lambda = lambda w: len(w)
print map(count_lambda, 'It is raining cats and dogs'.split())
# conditions in lambda
lambda x: True if x % 2 == 0 else False
reduce
import operator
data = ["PyCon", "India", "2015"]
reduce(operator.concat, data)
Reduce - Explained
# Sum of a list of numbers
def add(x, y):
    return x + y
def sum(data):
    result = 0
    for x in data:
        result = add(result, x)
    return result
sum([5, 2, 3])
# Smallest in a list
def lesser(x, y):
    if x < y:
        return x
    else:
        return y
def min(data):
    result = 999999999999
    for x in data:
        result = lesser(result, x)
    return result
min([5, 2, 3])
?
Reduce - Explained
# Sum
result = sum(data)
result = reduce(add, data, 0)
# Min
result = min(data)
result = reduce(lesser, data, 999999999999)
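For reference, here is how the same two reductions might look in Python 3, where reduce has moved to functools. The sketch also shows that the large sentinel is optional: without an initial value, reduce starts from the first element (and raises TypeError on an empty sequence). itertools.accumulate is used only to display the intermediate results.

```python
from functools import reduce
from itertools import accumulate


def add(x, y):
    return x + y


def lesser(x, y):
    return x if x < y else y


data = [5, 2, 3]

total = reduce(add, data, 0)       # ((0 + 5) + 2) + 3
smallest = reduce(lesser, data)    # no sentinel: starts from data[0]

# accumulate yields every intermediate value of the same fold.
running_totals = list(accumulate(data, add))

print(total, smallest, running_totals)
```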
filter
def is_even(x):
    return (x % 2) == 0
filter(is_even, range(10))
# even better
from itertools import ifilter
filter_iterator = ifilter(is_even, range(10))
filter_iterator.next()
iter function
numbers = [1,2,3]
it = iter(numbers)
# using while and StopIteration Exception
try:
    while True:
        print it.next()
except StopIteration:
    print "Complete"
# as iterator in for loop
it = iter(numbers)
for value in it:
    print value
itertools - count, cycle, repeat
import itertools
itertools.count(100)
itertools.cycle([1,2,3,4,5])
itertools.repeat("Hello", 5)
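count and cycle never stop on their own, so they are normally paired with something that limits them. A minimal Python 3 sketch, using itertools.islice and zip as the limiting tools (the variable names are illustrative):

```python
import itertools

# islice takes a finite slice out of an infinite iterator.
first_counts = list(itertools.islice(itertools.count(100), 3))
cycled = list(itertools.islice(itertools.cycle([1, 2, 3, 4, 5]), 7))

# repeat with a count is already finite.
repeated = list(itertools.repeat("Hello", 5))

# zip stops at the shortest input, so pairing with count() is safe.
numbered = list(zip(itertools.count(1), ["a", "b", "c"]))

print(first_counts, cycled, numbered)
```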
itertools - chain
it1 = iter([1, 2, 3])
it2 = iter([4, 5, 6])
itertools.chain(it1, it2)
itertools - groupby
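from itertools import groupby
city_list = [('Cochin', 'KL'),
             ('Bengaluru', 'KA'),
             ('Chennai', 'TN'),
             ('Mumbai', 'MH'),
             ('Trivandrum', 'KL'),
             ('Salem', 'TN'),
             ('Pune', 'MH')]
# groupby only merges adjacent items, so sort by state before grouping
get_state = lambda x: x[1]
for state, city_iterator in groupby(sorted(city_list, key=get_state), get_state):
    for city_state in city_iterator:
        print "%s is in %s." % (city_state[0], state)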
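One detail worth seeing in action is that groupby only groups consecutive items with the same key. The Python 3 sketch below uses the same city list and groups by state, with and without sorting first; get_state, unsorted_groups and cities_by_state are names chosen for this example.

```python
from itertools import groupby
from operator import itemgetter

city_list = [('Cochin', 'KL'), ('Bengaluru', 'KA'), ('Chennai', 'TN'),
             ('Mumbai', 'MH'), ('Trivandrum', 'KL'), ('Salem', 'TN'),
             ('Pune', 'MH')]
get_state = itemgetter(1)

# Without sorting, no two neighbours share a state, so every item is its own group.
unsorted_groups = [state for state, _ in groupby(city_list, key=get_state)]

# Sorting by the same key first brings equal states together.
cities_by_state = {
    state: [city for city, _ in group]
    for state, group in groupby(sorted(city_list, key=get_state), key=get_state)
}

print(len(unsorted_groups))
print(cities_by_state)
```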
itertools - combinations, permutations
import itertools
itertools.permutations([1, 2, 3])
itertools.combinations([1, 2, 3, 4, 5], 2)
itertools - izip
dict(itertools.izip("ABCD", [1,2,3,4]))
# OR
dict(zip("ABCD", [1,2,3,4]))
functools - partial
import functools
def log(message, subsystem):
    print '%s: %s' % (subsystem, message)
server_log = functools.partial(log, subsystem='server')
server_log('Unable to open socket')
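A partial object is an ordinary callable that remembers some arguments. The Python 3 sketch below adapts the log example to return the string instead of printing it (a change made here so the result can be inspected), and shows the attributes a partial exposes and that a keyword fixed by partial can still be overridden at call time.

```python
import functools


def log(message, subsystem):
    # Returns the line instead of printing it, unlike the slide's version.
    return '%s: %s' % (subsystem, message)


server_log = functools.partial(log, subsystem='server')

line = server_log('Unable to open socket')

# The partial keeps a reference to the original function and the frozen arguments.
original = server_log.func
frozen = server_log.keywords

# Keyword arguments supplied at call time take precedence over the frozen ones.
overridden = server_log('Disk full', subsystem='storage')

print(line)
print(overridden)
```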
Decorators
def p_decorate(func):
    def func_wrapper(name):
        return "Super %s" % func(name)
    return func_wrapper
@p_decorate
def get_text(name):
    return "Hello %s" % name
print get_text("John")
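A common refinement of this decorator, sketched here for Python 3, is to apply functools.wraps so the decorated function keeps its own name and docstring, and to accept arbitrary arguments so the wrapper works for any signature. This is one conventional way to write it, not necessarily how the talk's code samples do it.

```python
import functools


def p_decorate(func):
    @functools.wraps(func)  # copy __name__, __doc__ etc. from func to the wrapper
    def func_wrapper(*args, **kwargs):
        return "Super %s" % func(*args, **kwargs)
    return func_wrapper


@p_decorate
def get_text(name):
    """Return a greeting for name."""
    return "Hello %s" % name


print(get_text("John"))
print(get_text.__name__)
```

Without functools.wraps, get_text.__name__ would report func_wrapper, which makes tracebacks and debugging harder to follow.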
Yield - creating custom generators
data_list = ["Orange", "Microsoft", "Apple"]
def one_at_a_time(data_list):
    for item in data_list:
        yield item
generator = one_at_a_time(data_list)
generator.next()
Don't do unnecessary Classes
class Greeting(object):
    def __init__(self, greeting="hello"):
        self.greeting = greeting
    def greet(self, name):
        return "{greeting} {name}!".format(
            greeting=self.greeting, name=name)
hola = Greeting("hola")
print hola.greet("bob")
or
?
Don't do unnecessary Classes
import functools
def greet(greeting, name):
    return "{greeting} {name}!".format(greeting=greeting,
                                       name=name)
hello = functools.partial(greet, "Hello")
hello("Dude")
The Bad Parts
Python vs Functional
★ Memory Inefficiencies
★ Purity
★ No Tail Recursion
★ Innately imperative (Guido)
★ Class based type system
★ Only imperative Error Handling (Exception)
★ Function Overloading
★ Mutable variables
Thinking about Scalability with Functions
★ map-reduce-filter - recipe for distributed computing
★ shared states - to be or not to be
★ immutable 'variables'
★ independent functions
★ Execution Pipelines - chained map-reduce
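As a rough illustration of a chained map-reduce pipeline, the Python 3 sketch below filters, maps and reduces a range of numbers, then shows that the work can be split into independent chunks whose partial results are combined afterwards. That independence is what makes the pattern a fit for distributed computing. The pipeline function and the chunk boundaries are invented for this example.

```python
from functools import reduce
import operator


def pipeline(numbers):
    """filter -> map -> reduce: sum of the squares of the even numbers."""
    evens = filter(lambda n: n % 2 == 0, numbers)
    squares = map(lambda n: n * n, evens)
    return reduce(operator.add, squares, 0)


whole = pipeline(range(10))

# Each chunk shares no state with the others, so the chunks could run on
# separate workers; only the small partial results need to be combined.
chunks = [range(0, 5), range(5, 10)]
partials = list(map(pipeline, chunks))
combined = reduce(operator.add, partials, 0)

print(whole, partials, combined)
```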
Performance v/s Scalability
★ Functional Programs vs Object Oriented Programs
★ CPU intensive processes vs I/O intensive processes
★ The curse of GIL - workarounds
○ multiprocessing
★ Benchmarking
○ %timeit
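%timeit is an IPython magic; outside IPython the standard-library timeit module does the same job. A minimal Python 3 sketch follows, comparing three hypothetical ways of squaring a range. The functions and the repeat count are arbitrary choices, and the timings will differ from machine to machine.

```python
import timeit


def squares_loop(n):
    result = []
    for i in range(n):
        result.append(i * i)
    return result


def squares_map(n):
    return list(map(lambda i: i * i, range(n)))


def squares_comprehension(n):
    return [i * i for i in range(n)]


timings = {}
for func in (squares_loop, squares_map, squares_comprehension):
    # timeit calls the given callable `number` times and returns total seconds.
    timings[func.__name__] = timeit.timeit(lambda: func(1000), number=200)

for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print("%-22s %.4f s" % (name, seconds))
```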
multiprocessing.Process
import multiprocessing
def worker():
    print 'Execute your function here'
    return
if __name__ == '__main__':
    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=worker)
        jobs.append(p)
        p.start()
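Where the work is a pure function applied to many inputs, multiprocessing.Pool offers a map that looks like the built-in one but spreads the calls across worker processes. A minimal Python 3 sketch, with square and parallel_squares as made-up names and two workers as an arbitrary choice:

```python
import multiprocessing


def square(n):
    """Pure worker: no shared state, so calls can run in any process."""
    return n * n


def parallel_squares(numbers, processes=2):
    # Pool.map splits the inputs across worker processes and returns
    # the results in the original order.
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(square, numbers)


if __name__ == '__main__':
    print(parallel_squares(range(5)))
```

The worker has to be defined at module level so it can be pickled and sent to the other processes, which is one more reason to keep it a plain function.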
// Why should I think in functions?
why!
★ no side effects - no state, no deadlocks, no semaphores
★ automatic parallelization - unlimited scalability
★ composability - break down into smaller functions
★ testing - independent functions; well-defined arguments and return values
★ partial evaluation - pass around half-baked functions instead of objects
★ elegant code - forces you to write logically correct programs
Hands On: Let’s do some Data Wrangling
Blockbuster Database (http://www.crowdflower.com/data-for-everyone)
➢ Which genre has the most movies?
➢ Which movie studio grosses the most?
➢ Sort by highest-grossing and highest-rated movies
demo code here
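The three questions above could be approached along the following lines in Python 3. This is only a sketch: the rows are invented placeholders, and the field names (title, genre, studio, gross, rating) are assumptions, not the actual columns of the Blockbuster dataset.

```python
from collections import Counter
from functools import reduce
from operator import itemgetter

# Placeholder rows for illustration; not taken from the real dataset.
movies = [
    {"title": "Movie A", "genre": "Action", "studio": "Studio X", "gross": 300.0, "rating": 7.5},
    {"title": "Movie B", "genre": "Drama", "studio": "Studio Y", "gross": 120.0, "rating": 8.1},
    {"title": "Movie C", "genre": "Action", "studio": "Studio Y", "gross": 250.0, "rating": 6.9},
    {"title": "Movie D", "genre": "Comedy", "studio": "Studio X", "gross": 90.0, "rating": 7.0},
]

# Which genre has the most movies?
top_genre = Counter(map(itemgetter("genre"), movies)).most_common(1)[0][0]


# Which studio grosses the most? Fold the rows into per-studio totals.
def add_gross(totals, movie):
    updated = dict(totals)  # copy so the accumulator is never mutated in place
    updated[movie["studio"]] = updated.get(movie["studio"], 0) + movie["gross"]
    return updated


gross_by_studio = reduce(add_gross, movies, {})
top_studio = max(gross_by_studio, key=gross_by_studio.get)

# Sort by gross, then rating, highest first.
ranked_titles = [m["title"] for m in
                 sorted(movies, key=itemgetter("gross", "rating"), reverse=True)]

print(top_genre, top_studio, ranked_titles)
```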
Questions
code samples in this presentation here
atm@profoundis.com
@atmb4u

Thinking in Functions: Functional Programming in Python
