/code/blog

Code, code and code

10 Python One Liners to Impress Your Friends

After 10 Scala / Ruby / Clojure / CoffeeScript one liners to impress your friends, i thought it might be interesting to quickly try out the same in Python too. Without much ado.. here goes. Note that the variable declarations and imports are on separate lines as necessary. Also every line is written so as to print out the results to stdout for quick verification For what it is worth, this hardly took any time - this post is probably one of the quickest I have written.

1. Multiple Each Item in a List by 2

1
print map(lambda x: x * 2, range(1,11))
2. Sum a List of Numbers
1
print sum(range(1,1001))
3. Verify if Exists in a String
1
2
3
4
wordlist = ["scala", "akka", "play framework", "sbt", "typesafe"]
tweet = "This is an example tweet talking about scala and sbt."

print map(lambda x: x in tweet.split(),wordlist)
4. Read in a File
1
print open("ten_one_liners.py").readlines()
5. Happy Birthday to You!
1
print map(lambda x: "Happy Birthday to " + ("you" if x != 2 else "dear Name"),range(4))
6. Filter list of numbers
1
print reduce(lambda(a,b),c: (a+[c],b) if c > 60 else (a,b + [c]), [49, 58, 76, 82, 88, 90],([],[]))
7. Fetch and Parse an XML web service
1
2
3
4
from xml.dom.minidom import parse, parseString
import urllib2
# note - i convert it back into xml to pretty print it
print parse(urllib2.urlopen("http://search.twitter.com/search.atom?&q=python")).toprettyxml(encoding="utf-8")
8. Find minimum (or maximum) in a List
1
2
print min([14, 35, -7, 46, 98])
print max([14, 35, -7, 46, 98])
9. Parallel Processing
1
2
3
4
import multiprocessing
import math

print list(multiprocessing.Pool(processes=4).map(math.exp,range(1,11)))
10. Sieve of Eratosthenes There is no Sieve of Eratosthenes operator, but that is hardly a constraint.
1
2
3
n = 50 # We want to find prime numbers between 2 and 50

print sorted(set(range(2,n+1)).difference(set((p * f) for p in range(2,int(n**0.5) + 2) for f in range(2,(n/p)+1))))

Code Kata : Ruby Programming Challenge for Newbies in Python

An interesting contest caught my eye today. There’s a site Ruby Learning by Pune’s Satish Talim (Twitter : @IndianGuru) which organises regular Ruby Programming Challenge for Newbies and it introduced the 13th challenge earlier yesterday : RPCFN: Economics 101 (#13) by Dr. Bruce Scharlau. While neither being a regular rubyist nor being a newbie, I thought it made for a decent exercise and a diversion for a little while, albeit in python. As an added interest I wrote the solutions twice. Once in a very procedural way and once leveraging the functional programming constructs. Here’s the brief problem as stated in the challenge :
The file cia-1996.xml (links back to rublylearning.org) is the data from the CIA World Factbook of 1996 in XML format. It has details about 260 countries across five continents. Your challenge, should you choose to accept it, is to uncover the following details buried within this file:
  • What is the population of the country with the most people? Yes, we know it’s China, but just how many people lived there in 1996?
  • What are the five countries with the highest inflation rates, and what were those rates in 1996?
  • What are the six continents in the file and which countries belong to which continent? Can you also produce them in alphabetical order?
I used python 2.6 with the lxml xml parser for this exercises. Parsing the xml Since the source file is a zip file, one needs to open the zip, and extract the xml out of it. Since I am using python 2.6 and not python 2.7 I couldn’t use the with construct which would’ve not required the explicit zipfile close() statement.
1
2
3
4
5
6
7
8
9
10
import zipfile
from lxml import objectify
from operator import itemgetter
import itertools

# Unzip file, extract xml, convert to object    
zip = zipfile.ZipFile("../cia-1996.zip")
xmlfile = zip.open("cia-1996.xml")
root = objectify.fromstring("".join(line for line in xmlfile.readlines()))
zip.close()
This opens the zip file, extracts the xml out of it, concatenates all the lines as a single string and converts the data into a single object referred to as root Note: the itemgetter and itertools imports are used subsequently. Procedural : Find the country with the highest population
1
2
3
4
5
6
7
8
9
# Country with highest population
maxp = 0
maxc = ''
for country in root.country :
    if int(country.get('population',0)) >= maxp :
        maxc, maxp =country.get('name'),int(country.get('population',0))
print (maxc,maxp)

# Output is : ('China', 1210004956)
Procedural : Top 5 countries with highest inflation rates
1
2
3
4
5
6
7
8
# Top 5 countries with highest inflation rates
inflation_country_tuples = []
for country in root.country :
    inflation_country_tuples.append((float(country.get('inflation',0.0)), country.get('name')))
inflation_country_tuples = sorted(inflation_country_tuples, key=itemgetter(0), reverse=True)
print inflation_country_tuples[0:5]

# Output : ((244.0, 'Belarus'), (94.0, 'Turkey'), (85.0, 'Azerbaijan'), (83.299999999999997, 'Malawi'), (71.299999999999997, 'Yemen'))
This extracts the inflation, name tuple from each country and creates a list out of it, sorts the list using inflation in a descending order and then prints the first five elements. Procedural : Sorted continents, each associated with all their sorted countries
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
# Countries by continent
continent_country_tuples = []
for country in root.country :
    continent_country_tuples.append((country.get('continent',''), country.get('name')))
continent_country_tuples = sorted(continent_country_tuples)
current_continent = None
countries_of_continent = None
continent_grouped_countries = []
for continent, country in continent_country_tuples :
    if continent != current_continent :
        if current_continent :
            continent_grouped_countries.append((current_continent, countries_of_continent))
        countries_of_continent = []
        current_continent = continent
    countries_of_continent.append(country)
continent_grouped_countries.append((current_continent, countries_of_continent))
print continent_grouped_countries

# Output : too long to include here
Functional Programming Solutions Incidentally the same problem can also be solved using very functional programming constructs as follows. This shows an interesting contrast of solutions in both the procedural and functional programming ways. The logic used across both the sets is virtually the same thought the constructs are different.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
# Country with highest population
print reduce(lambda (maxc,maxp), c :
             (c.get('name'),int(c.get('population',0)))
                if int(c.get('population',0)) >= maxp else
                    (maxc,maxp),
             root.country,('',0))

# Top 5 countries with highest inflation rates
print tuple(itertools.islice(
            sorted((
                (float(country.get('inflation',0.0)), country.get('name'))
                        for country in root.country),
                key=itemgetter(0),reverse=True),
                5))

# Countries by continent
print tuple((continent,tuple(country[1] for country in countries))
        for continent, countries in itertools.groupby(
            sorted((country.get('continent',''), country.get('name'))
                    for country in root.country),itemgetter(0)))
Update I further structured the functional programming approach code. This code has no list comprehensions (for loops) at all. The code is as follows :
1
2
3
4
5
6
7
8
9
10
# Find the country with the maximum population
print reduce(lambda country_pop_max, country_pop_next : country_pop_next if country_pop_next[1] > country_pop_max[1] else country_pop_max,
        map(lambda country : (country.get('name'), int(country.get('population', 0))),root.country),('',0))

print tuple(itertools.islice(sorted(
            map(lambda country : (country.get('name'), float(country.get('inflation', 0.0))),root.country),
            key=itemgetter(1),reverse=True),5))

print map(lambda (continent, continent_country_tuples) : (continent, map(lambda (continent, country) : country, continent_country_tuples)),
    itertools.groupby(sorted(map(lambda country : (country.get('continent'), country.get('name')),root.country)),itemgetter(0)))
Since the above is likely to be a little too cryptic and confusing, here’s the detailed commented code (only comments and whitespace added)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
print reduce(
        # Comparator to decide the country with the maximum population
        lambda country_pop_max, country_pop_next : country_pop_next if country_pop_next[1] > country_pop_max[1] else country_pop_max,
        # get a sequence of tuples of (country name, country population)    
        map(lambda country : (country.get('name'), int(country.get('population', 0))),root.country),
        # initial seed
        ('',0))

print tuple(
        # take the top 5 items
        itertools.islice(
            # sort the list
            sorted(
                # get a sequence of tuples of (country name, inflation)   
                map(lambda country : (country.get('name'), float(country.get('inflation', 0.0))),root.country),
                # sorting to be done using the second element of the tuple
                key=itemgetter(1),
                # sort to be done in the descending order
                reverse=True),
            # count of elements to be sliced
            5))

print map(
    # function to flatten the continent_country_tuple tuple into a country tuple      
    lambda (continent, continent_country_tuples) :
        # first element of the tuple is continent
        (continent,
        # as the second element, return only the country from the continent country tuple to form a tuple of countries                                        
        map(lambda (continent, country) : country, continent_country_tuples)),
    # group the continent country tuples by continent
    itertools.groupby(
        # sort the continent country tuples
        sorted(
            # extract a continent country tuple from the country
            map(lambda country : (country.get('continent'), country.get('name')),root.country)),
        # this is the function to extract the key to sort by
        itemgetter(0)))

Clojure Style Multi Methods in Python

What are multi-methods used for ? Simply put, they allow for function overloading, ie. they allow for different implementations of the same function to be provided for different contexts, and the appropriate context and therefore the implementing function to be automatically selected and performed at runtime. The typical method of function overloading is based on differing type based signatures (a carryover from the C++ / Java days). There is a multimethod module in python. That can be used for type based switching. There is another way function overloading gets used, an approach used by clojure which it terms as multi methods based on an intermediate function used to compute a value, and then switches based upon that value. Note: Switching = Coming to a fork on the road and then deciding which path to take. For an easier reading on clojure multimethods, you may want to first read Clojure multi-methods, a nice post by Alex Miller. One more reason to read that post, is that the specific situation I use to demonstrate the usage of multi methods is also identical to the one described in that post. Here’s one way to implement clojure style multimethods in python. However I will put the cart before the horse, and demonstrate how multi methods could be used in python, before explaining the underlying support functions I use to make that feasible. Situation We have a person structure with a name and an age as member attributes. We would like to print a person’s name but prefix it with Young, Adult or Senior based on the person’s age.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
# Declare the class
class Person(object):
    """ A person. Has a name and an age"""
    def __init__(self,name,age):
        self.name = name
        self.age = age

# Instantiate the objects
jimmy = Person("Jimmy",5)
alex = Person("Alex",36)
edna = Person("Edna",99)

# Run the multi method
print name(jimmy)
print name(alex)
print name(edna)

#Expected output is 
#        Young Jimmy
#        Adult Alex
#        Senior Edna
To do the same, we will actually declare a switcher function which computes a value to switch upon.
1
2
3
4
5
def ticket(person):
    """ The switcher function to decide the result to be used for multi method switching """
    if person.age < 16 : return "young"
    elif person.age >= 66 : return "senior"
    else : return "adult"
We shall need a switcher which will switch based on this value.
1
2
# Declare the existence of a multi method switcher
name = multi(ticket)
Ahh, so we now need three methods to be declared, one each for each value. These we shall do so using decorator @multi_method. Note all the three functions have the same name. Both the decorator and the support that allows the functions to have the same name comes up later.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
# Add the first switched function for result value "young"
@multi_method(name, "young")
def name(person):
    return "Young " + person.name

# Add the next switched function for the result value "adult"
@multi_method(name, "adult")
def name(person):
    return "Adult " + person.name

# Add yet another switched function for the result value "senior"
@multi_method(name, "senior")
def name(person):
    return "Senior " + person.name
Now this can be made to work. But it needs a couple of constructs behind the scenes. First we need the constructor for a multi function, which accepts the switcher function (in this case ticket).
1
2
3
4
5
6
7
8
9
10
11
12
13
def multi(switcher_func):
    """ Declares a multi map based method which will switch to the
    appropriate function based on the results of the switcher func"""

    def dispatcher(*args, **kwargs):
        key = switcher_func(*args, **kwargs)
        func = dispatcher.dispatch_map[key]
        if func :
            return func(*args,**kwargs)
        else :
            raise Exception("No function defined for dispatch key: %s" % key)
    dispatcher.dispatch_map = {}
    return dispatcher
This declares an inner function dispatcher, which has the logic to perform the switch based on the supplied switcher function and the run time arguments to the function. What this method however does is that it sets up a dictionary called dispatch_map as an attribute of the dispatcher function which will be used to store the possible result values and the resultant function to switch to. Having done that, we still need to define each of the individual functions for each possible value of the switcher function. That is done using the following decorator where the dispatch_map that got defined earlier is populated with the appropriate result and the function to dispatch to.. Note that the function name gets changed in order to avoid a conflict.
1
2
3
4
5
6
7
8
9
10
def multi_method(dispatcher, result):
    """ The multi method decorator which allows one method at a time
    to be added to the broader switch for the given result value"""
    def inner(wrapped):
        dispatcher.dispatch_map[result] = wrapped
        # Change the method name from clashing with the multi and allowing
        # all multi methods to be written using the same name
        wrapped.func_name = "_" + wrapped.func_name + "_" + str(result)
        return dispatcher
    return inner
There’s a way to use these capabilities for class members. So if I wanted to dispatch the __str__ method based on the values, the class would now be
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
class Person(object):
    """ A person. Has a name and an age"""
    def __init__(self,name,age):
        self.name = name
        self.age = age

    __str__ = multi(ticket)

    @multi_method(__str__,"young")
    def str_young(self):
        return "Young " + self.name

    @multi_method(__str__,"adult")
    def str_adult(self):
        return "Adult " + self.name

    @multi_method(__str__,"senior")
    def str_senior(self):
        return "Senior " + self.name
Not bad for an hours work. Update 1: Even as I had just tweeted this codeblog post, another tweet caught my attention (possibly unrelated) which expected the switching condition for each function to be defined at each function level and not as a composite function returning some value. Since this was fresh in my mind, I thought that would be an interesting option too, and in a few minutes had another implementation based on the conditions being specified at a per function level. Here’s the separate version for the same. Pay attention to the fact that there is no ticket function, and that condition for each function is declared in the decorator itself.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
def multi():
    """ Declares a multi map based method which will switch to the
    appropriate function based on the results of the switcher func"""

    def dispatcher(*args, **kwargs):
        for condition, func in dispatcher.functions :
            if condition(*args, **kwargs) :
                return func(*args,**kwargs)
        raise "No function identified for dispatch key"

    dispatcher.functions = []
    return dispatcher

def multi_method(dispatcher, condition):
    """ The multi method decorator which allows one method at a time
    to be added to the broader switch for the given result value"""
    def inner(wrapped):
        dispatcher.functions.append((condition, wrapped))
        # Change the method name from clashing with the multi and allowing
        # all multi methods to be written using the same name
        wrapped.func_name = "_" + wrapped.func_name + "_" + str(condition)
        return dispatcher
    return inner

class Person(object):
    """ A person. Has a name and an age"""
    def __init__(self,name,age):
        self.name = name
        self.age = age

# Declare the existence of a multi method switcher
name = multi()

# Add the first switched function for result value "young"
@multi_method(name, lambda p : p.age < 16)
def name(person):
    return "Young " + person.name

# Add the next switched function for the result value "adult"
@multi_method(name, lambda p : 16 <= p.age < 66)
def name(person):
    return "Adult " + person.name

# Add yet another switched function for the result value "senior"
@multi_method(name, lambda p : 66 <= p.age)
def name(person):
    return "Senior " + person.name

# Instantiate the objects
jimmy = Person("Jimmy",5)
alex = Person("Alex",36)
edna = Person("Edna",99)

# Run the multi method
print name(jimmy)
print name(alex)
print name(edna)
Update 2 I was requested to demonstrate this in terms of the clojure polymorphism as described in Clojure - runtime_polymorphism (code snippet from that page listed below) :
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
(defmulti encounter (fn [x y] [(:Species x) (:Species y)]))
(defmethod encounter [:Bunny :Lion] [b l] :run-away)
(defmethod encounter [:Lion :Bunny] [l b] :eat)
(defmethod encounter [:Lion :Lion] [l1 l2] :fight)
(defmethod encounter [:Bunny :Bunny] [b1 b2] :mate)
(def b1 {:Species :Bunny :other :stuff})
(def b2 {:Species :Bunny :other :stuff})
(def l1 {:Species :Lion :other :stuff})
(def l2 {:Species :Lion :other :stuff})
(encounter b1 b2)
-> :mate
(encounter b1 l1)
-> :run-away
(encounter l1 b1)
-> :eat
(encounter l1 l2)
-> :fight
Here’s how the same effect can be had in python using the first version of multi and multi_method I described above.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
# Declare the existence of a multi method switcher
encounter = multi(lambda x,y : (x["Species"], y["Species"]))

@multi_method(encounter, ("Bunny","Lion"))
def encounter(a1, a2):
    return "run-away"

@multi_method(encounter, ("Lion","Bunny"))
def encounter(a1, a2):
    return "eat"

@multi_method(encounter, ("Bunny","Bunny"))
def encounter(a1, a2):
    return "mate"

@multi_method(encounter, ("Lion","Lion"))
def encounter(a1, a2):
    return "fight"

b1 = {"Species" : "Bunny", "Other" : "Stuff"}
b2 = {"Species" : "Bunny", "Other" : "Stuff"}
l1 = {"Species" : "Lion", "Other" : "Stuff"}
l2 = {"Species" : "Lion", "Other" : "Stuff"}

print encounter(b1, b2)
print encounter(b1, l1)
print encounter(l1, b1)
print encounter(l1, l2)
The resultant output is
1
2
3
4
mate
run-away
eat
fight

Simple Dependency Injection in Python

I wrote this code a simple dependency injection capability in my application. Hopefully the comments should be self explanatory. If you want to just understand the usage, see the class decorator usage on lines 52 and 63 and the code from line 69 onwards
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
# Global registry for instance registration
resource_registry = {}

def provides(name):
    """Class decorator for an instance provider.
    The name passed in is what the attribute is called.
    The instance will be auto registered in the registry
    with the value of the handle
    TODO : the name could also be a function which could return the
    attribute name to extract the instance name from
    """
    def wrapper(cls):
        def init_and_register(init):
            def inner(self,*args,**kwargs):
                "The new wrapper constructor"
                # Call the old constructor
                init(self,*args,**kwargs)
                # Update the resource registry with the desired key and instance
                resource_registry[self.__dict__[name]] = self
            return inner
        # Replace the constructor with a wrapped constructor
        cls.__init__ = init_and_register(cls.__init__)
        return cls
    return wrapper

def requires(*resource_types):
    """ Class decorator to inject instances from the registry
    into the instance under construction"""
    def wrap_constructor(self,*args,**kw):
        "The constructor wrapper"
        # First call the original constructor
        self.__oldinit__(*args,**kw)
        # If this instance requires dependencies to be injected
        if hasattr(self,'_resource_types') :
            # for each dependency
            for resource_type in self._resource_types:
                # inject the dependency directly as in instance variable 
                # from the resource registry
                self.__dict__[resource_type] = resource_registry[kw[resource_type]]

    def inner(cls):
        cls._resource_types = resource_types
        cls.__oldinit__ = cls.__init__
        cls.__init__ = wrap_constructor
        return cls

    return inner

# The following class indicates a sample resource provider which could be
# injected. The 'handle' passed to the class decorator is the name of the attribute
# whole value will be used as the key to register the instance name in the registry
@provides('handle')
class Provider(object):
    def __init__(self,handle):
        self.handle = handle
    def __str__(self):
        return 'Provider(%s)' % self.handle

# A sample user object (ie. the object into which the dependency is to be injected
# The constructor needs to accept kw based arguments for the injection to happen
# The requires decorator lists the attribute names under which the dependencies will
# be injected into the instance of this class
@requires('first','second')
class User(object):
    def __init__(self, **kwargs):
        pass

if __name__ == '__main__' :
    # Initialise the providers. They auto register themselves.    
    p1 = Provider('one')
    p2 = Provider('two')

    # Instantiate the user. specify the key  under which the
    # injected objects have been registered in the values of the
    # keyword arguments (the key being the attribute name declared
    # earlier in the class decorator)
    u = User(first = 'one', second = 'two')

    # Test that the dependencies have been appropriately injected
    assert u.first == p1
    assert u.second == p2

Extracting Data Using Recursive Descent Parsing

The other day there was an interesting query on the bangpypers mailing list : How should I do it?. It was a question which wondered how to extract data out of a datastructure. The question reproduced below was as follows : I have a txt file in the following format:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
"confident" => {
  count => 4,
  trans => {
     "ashahvasahta" => 0.74918568,
    "atahmavaishahvaasa" => 0.09095465,
    "pahraaram\.nbha" => 0.06990729,
         "mailatae" => 0.02856427,
           "utanai" => 0.01929341,
             "anaa" => 0.01578552,
         "uthaanae" => 0.01403157,
         "jaitanae" => 0.01227762,
    },
},
"consumers" => {
  count => 4,
  trans => {
    "upabhaokahtaa" => 0.75144362,
    "upabhaokahtaaom\.n" => 0.12980166,
    "sauda\�\�\�dha" => 0.11875471,
    },
},
"a" => {
  count => 1164,
  trans => {
              "eka" => 0.14900491,
           "kaisai" => 0.08834675,
             "haai" => 0.06774697,
             "kaoi" => 0.05394308,
              "kai" => 0.04981982,
         "\(none\)" => 0.04400085,
              "kaa" => 0.03726579,
              "kae" => 0.03446450,
    },
},
and I need to extract “confident” , “ashahvasahta” from the first record, “consumers”, “upabhaokahtaa” from the second record… i.e. “word in english” and the “first word in the probable-translations”
The first level key is an english word and the entries against the subkey “trans” seem translations with their associated weightages. The task is to extract the english word followed by the (sanskrit) translation word that has the highest weightage associated with it corresponding to the word. While the problem could have been solved in different (and perhaps simpler) ways, it seemed an attractive one since it also could be solved using a recursive descent parser - something that I had not worked upon before (which then became the primary motivator for solving it). A quick study showed up a number of recursive descent parsers written in python language and the one I quickly settled on was pyparsing. A good introductory writeup on the same can be found at O’reilly onlamp.com at Building Recursive Descent Parsers with Python. The code below is a suggested solution for the same. Hopefully the comments inline are sufficiently self explanatory. A quick legend to some of the pyparsing constructs : Forward() : This is a forward declaration to a token whose structure will be defined later Word(characters) : A contiguous sequence of one or more characters that form the word Group() : A construct which separately classifies the matching data in the results token ^ token : The caret indicates an OR ie one of the two tokens should be matched. Optional(token) : An instruction which matches 0 or 1 occurences of token (equivalent to ‘?’ in regex) .setResultsName(name) : An instruction which marks the matched result with the given name token.suppress() : An instruction which matches but then ignores the matched data in the result
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
# coding=utf-8
from pyparsing import *
import pprint
import sys

data = '''
"confident" => {
   count => 4,
   trans => {
     "ashahvasahta" => 0.74918568,
     "atahmavaishahvaasa" => 0.09095465,
     "pahraaram\.nbha" => 0.06990729,
     "mailatae" => 0.02856427,
     "utanai" => 0.01929341,
     "anaa" => 0.01578552,
     "uthaanae" => 0.01403157,
     "jaitanae" => 0.01227762,
   },
},
"consumers" => {
 count => 4,
 trans => {
   "upabhaokahtaa" => 0.75144362,
   "upabhaokahtaaom\.n" => 0.12980166,
   "sauda\�\�\�dha" => 0.11875471,
   },
},
"a" => {
 count => 1164,
 trans => {
             "eka" => 0.14900491,
          "kaisai" => 0.08834675,
            "haai" => 0.06774697,
            "kaoi" => 0.05394308,
             "kai" => 0.04981982,
        "\(none\)" => 0.04400085,
             "kaa" => 0.03726579,
             "kae" => 0.03446450,
   },
}
'''

# Setup pyparsing tokens

# Set up a forward reference to a dict (to be defined later)
dict_ = Forward()
# A key is a word of alphabets & numbers or is a quoted string
key = (Word(alphas + nums) ^ quotedString).setResultsName("key")
# A value is a sequence of digits and . or a dictionary
val = (Word("0123456789.") ^ dict_).setResultsName("value")
# A key value pair is a key followed by => operator followed by value
key_value_pair = Group(key + Literal("=>").suppress() + val)
# A key value pair list is a comma delimited list of key value pairs
key_value_pair_list = delimitedList(key_value_pair)
# A dictionary is { plus a key value pair list plus an optional trailing
# comma followed by a }
dict_ << Group(Literal("{").suppress() + key_value_pair_list +
            Optional(Literal(",").suppress()) + Literal("}").suppress())

# parse data
parsed = key_value_pair_list.parseString(data)

# function to extract the parsed data into a dictionary
# Note : the function is kind of weird and does type checks
# due to the way ParseResult is modeled  
def extract(result):
        if 'key' in result.keys() :
            if isinstance(result.value,ParseResults) :
                return ( result.key,  extract(result.value) )
            else :
                return ( result.key,  result.value )
        else :
            return(dict(extract(elem) for elem in result))

# This converts the parsed ParseResult structures into a dictionary       
extracted = extract(parsed)

# print the dictionary
pprint.pprint(extracted, sys.stdout)

# print the english word and first translated word

print "\n\n============\nTranslations\n============\n"
translated = {}
for english, translations in extracted.items() :
    current_word = ''
    current_weight = 0.0
    for word,weight in translations['trans'].items() :
        if float(weight) > current_weight :
            current_word = word
            current_weight = float(weight)
    translated[english] = current_word
print translated
As usual comments are most welcome. Enjoy!

Implementing Request Interceptors for Tornado

One of my favourite new software introductions last year was the Tornado web server. However ever since my Webwork 2 (and subsequently Struts 2) days, I had learnt to greatly appreciate the power and cleanliness of implementing interceptors for a variety of aspects related to request preprocessing. The same feature is also available on a variety of wsgi based web application frameworks except that these are referred to as wsgi middleware. So it was with great disappointment that I realised Tornado simply did not have any such interceptors. Worse, there was no way to plug in or roll one’s own, since the tornado request processing pipeline had no place where one could plug something in. In a statically typed language it would have required me to take one of the two options - (a) fork Tornado codebase, change it to introduce plugins or (b) live without it. Well every once in a while in a dynamic typed language you run into a situation where metaprogramming saves your bacon. It was relatively easy to implement one for Tornado. So how does one do it. Tornado during request processing calls a method called _execute on the handler (which is always available on all handlers since its in the base class for handlers). I wrote a class decorator called interceptor which wraps a class, and in doing so actually wraps one of its methods. It saves the reference to the current _execute method, replaces it with a call to another user supplied method (as a part of applying the interceptor), and calls the saved _execute method reference after the user supplied method returns successfully. Note that as per the semantics, if the user supplied method returns False, further request processing is aborted. It is also possible to chain a number of such interceptors. In the example below, I have taken the basic authentication logic found in one of the Tornado examples and reformatted the same as an interceptor. The first two functions authenticator and user_extractor are rather trivialised implementations of the user supplied authentication logic. So here’s the complete sample tornadoweb application with the MainHandler being wrapped with an interceptors which triggers basic authentication. Enjoy.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
import base64
import logging
import logging.config
import tornado.httpserver
import tornado.ioloop
import tornado.web

log = logging.getLogger("root")

def authenticator(realm,handle,password):
    """ 
    This method is a sample authenticator. 
    It treats authentication as successful
    if the handle and passwords are the same.
    It returns a tuple of handle and user name 
    """
    if handle == password :
        return (handle,'User Name')
    return None

def user_extractor(user_data):
    """
    This method extracts the user handle from
    the data structure returned by the authenticator
    """
    return user_data[0]

def basic_authenticate(realm, authenticator,user_extractor) :
    """
    This is a basic authentication interceptor which 
    protects the desired URIs and requires 
    authentication as per configuration
    """
    def wrapper(self, transforms, *args, **kwargs):
        def _request_basic_auth(self):
            if self._headers_written:
                raise Exception('headers have already been written')

            self.set_status(401)
            self.set_header('WWW-Authenticate','Basic realm="%s"' % realm)
            self.finish()

            return False
        request = self.request
        format = ''
        clazz = self.__class__
        log.debug('intercepting for class : %s', clazz)
        try:
            auth_hdr = request.headers.get('Authorization')

            if auth_hdr == None:
                return _request_basic_auth(self)
            if not auth_hdr.startswith('Basic '):
                return _request_basic_auth(self)

            auth_decoded = base64.decodestring(auth_hdr[6:])
            username, password = auth_decoded.split(':', 2)

            user_info = authenticator(realm, unicode(username), password)
            if user_info :
                self._user_info = user_info
                self._current_user = user_extractor(user_info)
                log.debug('authenticated user is : %s',
                          str(self._user_info))
            else:
                return _request_basic_auth(self)
        except Exception, e:
            return _request_basic_auth(self)
        return True
    return wrapper

def interceptor(func):
    """
    This is a class decorator which is helpful in configuring
    one or more interceptors which are able to intercept, inspect,
    process and approve or reject further processing of the request
    """
    def classwrapper(cls):
        def wrapper(old):
            def inner(self, transforms, *args, **kwargs):
                log.debug('Invoking wrapper %s',func)
                ret = func(self,transforms,*args,**kwargs)
                if ret :
                    return old(self,transforms,*args,**kwargs)
                else :
                    return ret
            return inner
        cls._execute = wrapper(cls._execute)
        return cls
    return classwrapper

@interceptor(basic_authenticate('dummy_realm',authenticator,user_extractor))
class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello, world")

application = tornado.web.Application([
    (r"/", MainHandler),
])

if __name__ == "__main__":
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(8888)
    tornado.ioloop.IOLoop.instance().start()

Functionally Programming Tic Tac Toe

I am a novice functional programmer. So if you find any areas of suggested improvement on my FP skills, do add a comment below. Below is a sample Tic Tac Toe written using Functional Programming constructs (or at least whatever I understood about them). Right now, I’ve implemented only a simple player who basically marks off a cell randomly. It should be possible to define more intelligent players similarly. Hopefully all the comments in the code should be sufficiently self explanatory.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
import random
import functools
import itertools
import copy

def row_gen(board):
    """ 
    Generator which returns tuples of tuples. 
    The outer tuple represents a row.
    The inner tuple represents each cell with 
    its coordinates and data 
    """
    return (
        tuple((row,col,board[row][col])
              for col in range(3))
        for row in range(3))

def col_gen(board):
    """ 
    Generator which returns tuples of tuples. 
    The outer tuple represents a column.
    The inner tuple represents each cell with 
    its coordinates and data 
    """
    return (
        tuple((row,col,board[row][col])
              for row in range(3))
        for col in range(3))

def dia_gen(board):
    """ 
    Generator which returns two tuples of tuples. 
    The outer tuple represents a diagonal.
    The inner tuple represents each cell with 
    its coordinates and data 
    """
    return (
        tuple((i,i,board[i][i]) for i in range(3)),
        tuple((i,2-i,board[i][2-i]) for i in range(3)))

def empty(board):
    """ Gets a list of all empty cells on the board """
    return (
        (row,col)
        for row in range(3)
            for col in range(3)
                if board[row][col] == None)

def is_empty(board):
    """ checks if any cell on the board is still empty """
    for row in range(3) :
        for col in range(3) :
            if board[row][col] is None :
                return False
    return True

def random_player(chr,board):
    """Represents a player who randomly marks his next move """
    empty_cells = list(empty(board))
    play = empty_cells[random.randint(0,len(empty_cells)-1)]
    board = copy.copy(board)
    board[play[0]][play[1]] = chr
    return board

def show(board):
    """ Shows the current representation of the board """
    print '+-+-+-+'
    for row in range(3) :
        print '|%s|' % '|'.join(
            board[row][col]if board[row][col] is not None else ' '
            for col in range(3))
        print '+-+-+-+'

if __name__ == '__main__' :
    # Initialise the board with all values set to None       
    board = list(list(None for i in range(3)) for j in range(3))
    show(board)

    # Initialise the two players. Each is given a separate char
    player1 = functools.partial(random_player,'X')
    player2 = functools.partial(random_player,'O')

    # Setup an iterator to continuously cycle through both players
    players = itertools.cycle((player1,player2))

    # Play game
    over = False
    while not over and not is_empty(board) :
        # Ask the next player to make his move
        board = players.next()(board)
        show(board)

        # Note a cell bag set is all rows, 
        # all cols and all diagonals
        for cell_bag_set in (
                row_gen(board),
                col_gen(board),
                dia_gen(board)) :

            # Note a cell_bag can be a row, column or a dia
            for cell_bag in cell_bag_set :
                vals = set(val[2] for val in cell_bag)
                if len(vals) == 1 and  \
                        vals.__iter__().next() is not None :
                    over = True
                    print 'Over'
                    break

Functional Programming in PHP

Was recently brave (or foolhardy) enough to attempt to present how functional programming constructs could be used even in a language such as PHP. Now I am fully aware that one of the bigger reasons for functional programming ie. concurrency is simply not applicable as yet for a language like PHP, but I thought it might still be useful to reduce global variable proliferation, and improve both composability and testability. The following sample PHP code was what I presented. The associated Twig template is not shown since it is not particularly relevant to the context. Note that the following is simply a sample code - any attempts to use it are at your own risk - I am simply not a good enough PHP programmer to even know if the code is good or has many smells that are entirely invisible (or unodorous) to me. The code simply retrieves data from a table, filters it and shows it on screen.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
<?php
 require_once 'Twig/Autoloader.php';
 Twig_Autoloader::register();

##### Global constants #####
$base='\/funcblog';

function initialise()
{
	##### Environment #####
	$loader = new Twig_Loader_Filesystem('templates');
	$twig = new Twig_Environment($loader, array(
	  'debug' => true,
	  'cache' => 'cache',
	));
	
	
	#### Database initialisation #####
	
	$link = mysql_connect('localhost', 'funcblog', 'funcblog')
	    or die('Could not connect: ' . mysql_error());
	mysql_select_db('funcblog',$link);
	return array('twig' => $twig, 'link' => $link);
}

##### Application configuration #####
$routes = array(
	array(path=>'posts',handler=>'get_posts'),
	array(path=>'categories',handler => 'get_categories')
	);

// Initialise the route map based on the routes configuration
$route_map = array_reduce(
		$routes,
		function ($x,$y) { $x[$y['path']] = $y['handler']; return $x;},	
		array());

##### Functions #####

// Extract path and request components from URI
function compose_request($request_uri,$base)
{
	preg_match_all('/('. $base . ')(?P<path>(\/\w+)+)(\?(?P<query>(\w+=[^\&]+\&?)*))?/', $request_uri, $words, PREG_SET_ORDER);
	return array($words[0]['path'], $words[0]['query']);
}

// process the uri based on path and query options
function process_uri($context, $route_map, $path,$query)
{
	$handler_name = $route_map[ltrim($path,'/')];
	return $handler_name($context, $query);
}

// Display the template with the associated data
function display_template($context,$template,$data)
{
	$context['twig']->loadTemplate($template)->display($data);
}

$context = initialise();

// Closure to extract rows from table
$context['get_rows'] = function ($query) use($context)
{
	$result = mysql_query($query,$context['link']);
	$rows = array();
	while ($row = mysql_fetch_assoc($result))
	{
		$rows[] = $row;
	}
	return $rows;
};

// From the request URI, extract the path and the query parameters
list($path, $query) = compose_request($_SERVER['REQUEST_URI'],$base);

// Dispatch to the appropriate function based on the uri to routes mapping
list($template, $data) = process_uri($context, $route_map,$path,$query);

// Display the template
display_template($context,$template,$data);

##### Custom Application Logic #####

function get_posts($context,$query)
{
	# NOTE : An Array filter function, filters a collection using a
	#        filter function. It returns a subset of that collection
	#        for which the filter function (predicate) evaluates to true
	$posts = array_filter(
		$context['get_rows']('select * from posts'),
		function($row) {
			return ($row['author'] == 'dhananjay') ;
		}
	);
	return array('posts.html', array('rows'=> $posts));	
}
?> 

Dynamically Adding Methods With Metaprogramming : Ruby and Python

I came across an interesting post Metaprogramming: Ruby vs. Javascript, which discusses and contrasts about how metaprogramming can be implemented in Ruby and Javascript. I thought it might be fun to document the same from a python perspective as well. Here are the discussed samples. All the ruby code is quoted from from the blog post linked to above whereas the python code is what I wrote. 1. Initial class declaration and initialisation We first declare a Ninja class and create two instances. Ruby
1
2
3
4
5
6
7
8
9
10
class Ninja
  attr_accessor :name

  def initialize(name)
    @name = name
  end
end

drew = Ninja.new("Drew")
adam = Ninja.new("Adam")
Python
1
2
3
4
5
6
class Ninja(object):
	def __init__(self,name) :
		self.name = name

drew = Ninja('drew')
adam = Ninja('adam')
2. Add a method to the class In this step we add a method to the class dynamically. It is therefore available to all the instances of the class Ruby
1
2
3
4
5
6
7
8
9
10
class Ninja
  def battle_cry
    puts "#{name} says zing!!!"
  end
end

drew.battle_cry
# => Drew says zing!!!
adam.battle_cry
# => Adam says zing!!!
Python
1
2
3
4
5
6
7
8
def battle_cry(self):
	print '%s says zing!!!' % self.name
Ninja.battle_cry = battle_cry

drew.battle_cry()
# => Drew says zing!!!
adam.battle_cry()
# => Adam says zing!!!
3. Add a method to an instance In this case we add a method only to a particular instance. The same method will not be available to other instances of the same class. Ruby
1
2
3
4
5
6
def drew.throw_star
  puts "throwing a star"
end

drew.throw_star
# => throwing a star
Python
1
2
3
4
5
6
7
import types
def throw_star(self):
	print 'throwing a star'
drew.throw_star = types.MethodType(throw_star,drew)

drew.throw_star()
# => throwing a star
4. Invoke a method dynamically In this case we supply the method name as a string and invoke it. Ruby
1
2
drew.send(:battle_cry)
# => Drew says zing!!!
Python
1
2
drew.__getattribute__('battle_cry')()
# => Drew says zing!!!
5. Defining class level methods dynamically with closures Here a class level method is defined which closes over some of the attributes in its context (in this case the method color is able to access the variable color_name as a closure). Ruby
1
2
3
4
5
6
7
8
9
10
color_name = 'black'

Ninja.send(:define_method, 'color') do
  puts "#{name}'s color is #{color_name}"
end

drew.color
# => Drew's color is black
adam.color
# => Adam's color is black
Python
1
2
3
4
5
6
7
8
9
10
color_name = 'black'

def color(self):
	print "%s's color is %s" % (self.name, color_name)
Ninja.color = color

drew.color()
# => Drew's color is black
adam.color()
# => Adam's color is black
6. Defining a method dynamically on an instance that closes over local scope and accesses the instance’s state Note that in this case the function is being defined on the instance using a dynamic name (‘swing’) Ruby
1
2
3
4
5
6
7
8
9
10
11
12
13
14
class Object
  def metaclass
    class << self; self; end
  end
end

sword_symbol = "*********"

drew.metaclass.send(:define_method, 'swing') do |sound_effect|
  puts "#{name}: #{sword_symbol} #{sound_effect}"
end

drew.swing 'slash!!'
# => Drew: ********* slash!!
Python
1
2
3
4
5
6
7
8
sword_symbol = "*********"

def foo(self,sound_effect):
	print "%s : %s %s" % (self.name,sword_symbol,sound_effect)
	
drew.__dict__['swing'] = types.MethodType(foo,drew)
drew.swing('slash!!')
# => Drew : ********* slash!!

Concise Python Coding

I love python for a number of reasons. Brevity is one of them. Readability is another. However writing readable concise code can take some time to getting used to. This post discusses some code snippets especially in the context of some of the comparisons done between python and clojure - another language which excels at brevity and shares yet another feature with python, ie. people either love or hate its syntax, they are rarely indifferent to it. Please note : The intent of this post is not to compare LOC of python with other languages. It is meant to demonstrate python code implementation which helps describe the capabilities of python in terms of writing clean, concise, readable code. This post continuously refers to other posts and may require you to read the referred posts in conjunction with this post to get maximum value. Code and ceremony The post Java vs Clojure – Lets talk ceremony! describes a simple problem which requires 28 LOC in Java and 1 or 9 in clojure. It is a simple logic which creates a list, populates it with four values (two of them identical) and creates a list with unique items. There are two alternative clojure implementations, one in 1 line of code and another in 9 - the former using a built in construct in the language which especially helps to write a more concise logic, and the latter by writing code similar to the Java code. Without any further ado, here are the two alternative python implementations a. By using built in capabilities - In this case we use the set collection
1
print set(("foo","bar","baz","foo"))
b. Now here’s where we will not use a built in operator which automatically chooses the unique elements but again write code similar to the java code. This code is as follows :
1
print reduce(lambda x,y:  x + [y] if y not in x else x ,("foo","bar","baz","foo"),[])
There - both of them are exactly one line long. Project Euler solutions The post Python vs Clojure – Evolving refers to many sample python implementations solving some of the problems documented on project euler site. We shall revisit these python implementations here. a. Find all numbers dividable by 3 or 5 in 0 < x < 1000: In this case the clojure code is 4 lines long where as the python implementation is 3 lines long
1
print sum(i for i in range(1,1000) if i%3 == 0 or i%5 == 0)
Again the implementation is exactly 1 line long - no lambdas, no filters etc. b. Get the sum of all even Fibonaccis below 4 mill. The suggested python implementation is about 17 lines long. The implementation below 7.
1
2
3
4
5
6
7
8
9
from itertools import takewhile

def fib():
    x,y = 1,1
    while True :
        x,y = y,x+y
        yield x

print sum(i for i in takewhile(lambda x : x < 4000000,fib()) if i%2 == 0)
c. Finding Palindromes The suggested python solution uses about 23 lines of code, to 6 in clojure and takes 15 seconds almost 300% slower than the clojure code (emphasis from the post referred to). Here’s another alternative implementation.
1
print max(x * y for x in xrange(100,1000) for y in xrange(100,1000)  if str(x*y) == str(x*y)[::-1])
Again if you notice - exactly one line of code. And incidentally on my notebook it clocks in about 0.85 seconds which makes the clojure implementation about 470% slower than the python implementation. (emphasis mine) Top Rank Per Group In yet another post Python vs Clojure – Reloaded, the author discusses the Rosetta Top Rank Per Group implementation of python. Unsurprisingly the python implementation uses 11 lines of code (not including the initialisation) compared to clojure’s 6. One of the commenters suggests an eminently readable and simple python solution in about 6 lines of code. The code snippet is as follows
1
2
3
4
5
6
7
8
9
10
11
from itertools import groupby
from operator import itemgetter

data = [dict(name="Tyler Bennett", id="E10297", salary=32000, department="D101"), ...]

department = itemgetter('department')
for dep, persons in groupby(sorted(data, key=department), department):
    print "\nDepartment", dep
    print " Employee Name   Employee ID     Salary          Department"
    for person in sorted(persons, key=itemgetter('salary'):
        print "%(name)-15s %(id)-15s %(salary)-15s %(department)-15s"
An alternative far less readable solutions which shaves off one more line but certainly not preferred to the solution above (am just listing it since it shaves off the step of having to create an intermediate data structure at all) is
1
2
3
4
5
6
7
8
9
10
import operator
from itertools import groupby

data = [('Employee Name', 'Employee ID', 'Salary', 'Department'),....]

print reduce(lambda x,y : x + y,
         (('\nDepartment %s\n  Employee Name   Employee ID     Salary          Department  ' % dept +
           reduce(lambda x,y : x + ("\n  " + "%-15s " * len(data[0]) % y),
                  sorted(recs, key=lambda rec: -rec[-2])[:3],''))
         for dept, recs in groupby(sorted(data[1:],key=operator.itemgetter(3)), lambda x : x[3])),'')
Some might wonder whether I am joining multiple lines of code into one just to reduce the line count. The code is consistent with the idiomatic usage of python and also with the way python list comprehensions are documented in the formal python proposal for list comprehensions PEP 202 : List Comprehensions. Besides python is a whitespace sensitive language and the compiler disallows multiple lines of code in one line without using the ‘;’ separator which I have most definitely not used in the suggested solutions. PS: There’s of course an interesting image at the end of the post indicating the perception of how clojure code speaks for itself ;). The reason I mention it is that I believe if one is expressing a strong opinion as the one reflected by the picture, one better be far more careful and thorough with verifying the veracity of the assertionsappropriateness of the samples compared to. Let me clearly state that I do not know that the clojure loc compared to is an appropriate target so do not infer this post to be a Python vs Clojure LOC as much as a post which emphasises that Python is a good vehicle to write concise readable code. FWIW I am learning clojure and look forward to building more competence with it.