Axiom Tutoring

View Original

Combinatorics simulated in Python, using Sympy

As part of a project I'm building I wrote the following code to simulate combinatorial enumeration of certain sets. I'll explain the code a little bit in this blog post, after the code.

from sympy.utilities.iterables import variations, kbins
from sympy import rf, ff, binomial, symbols, simplify, expand

def nkwords(n, k):
    '''Generate words of length k from [n] which have the property'''
    return list(variations([i for i in range(n)], k, True))

def nkperms(n, k): 
    '''Count the number of permutations of length k from [n] which have the property'''
    return list(variations([i for i in range(n)], k))

def nkmult(n,k): 
    '''Generate all multisets of size k from [n].  A list of lists, each nested list represents its index mapping to a count.  The sum of counts is k.'''
    leastone = kbins(list(range(n+k)), n)
    multList = []
    for m in leastone:
        mult = [0 for i in range(n)]
        item = 0
        for x in m:
            mult[n-1-item] = len(x)-1
            item += 1
        multList.append(tuple(mult))
    return multList

def cprod(setsList):
    '''Cartesian product of a list of lists.'''
    nthprod = [[]]
    for set in setsList:
        for i in range(len(nthprod)):
            tup = nthprod.pop(0)
            for x in set:
                extend = tup.copy()
                extend.append(x)
                nthprod.append(extend)
    return nthprod

def generate(domain, map, property):
    '''domain is any list of  elements, map is a function defined on every domain element.  Property is a boolean function taking arguments that are the return values of map.  The return value is the number of elements satisfying the property.'''
    return [map(x) for x in domain if property(map(x))]

Oh boy.

What’s going on here?

Alright, first thing’s first, what even is the point of the function nkwords? It is meant to simulate a language containing n symbols, and forming all possible words of length k from that language. I’ll explain more, as that is probably confusing to anyone not already familiar with what’s going on here.

It’s probably clearest to explain with an example. Suppose that we have the symbols 0, 1, 2, 3. In this example our language will therefore be built up from just an alphabet of these four characters.

(It should be clear that here we mean a “formal language”, which is a sort of mathematical over-simplification of what languages are. The use of words like “language” and “word” are merely metaphors, not to be taken literally.)

We can list out all words of length 2 in the following way.

01, 02, 03, 04, 11, 12, 13, 14, 21, 22, 23, 24, 31, 32, 33, 34, 41, 42, 43, 44

This is in effect what the call nkwords(4,2) returns.


Next is nkperms which attempts to simulate permutations taken from the set [n]. This notation is explained by the equation

$$ [n] = \{0, 1, 2, …, n-1\} $$

That is to say, [n] is notation for a set of n integers. The least in the set is 0 and the maximum is n-1.

Now permutations may be taken “k at a time”. Let’s see this in an example. If we take [1, 2, 3, 4] let’s look at what we get when we list all permutations taken 2 at a time. That means we take all 2-element pairs, such that an element is never repeated in the pair. So the list of all such permutations is

(1,2), (1,3), (1,4), (2,3), (2,4), (3,4)

The function call nkwords(4,2) returns exactly this list.


Lots of people encounter words and permutations in a discrete math class, but it is a slightly rarer discrete math class which ever brings up the idea of a multiset. A multiset can be thought of as a generalization of the idea of a set, in the sense that sets do not “allow for repetition of elements”. That is to say in a traditional set {0, 1, 1} is the same set as just {0, 1}. In a multiset we regard them as distinct. Still just like in sets, we do not respect ordering in multisets. Therefore the multiset {0, 1, 1} is the same as the multiset {1, 0, 1}.

Notice that the idea of a multiset is essentially the same thing as the idea as taking a set, and associating each element with a quantity, meant to represent how many times that element is repeated. Therefore we could say that {0, 1, 1} as a multiset is the same thing as the set {0, 1} together with a function f(0) = 1, f(1) = 2. The function expresses that element 0 is repeated 1 time and that element 1 is repeated 2 times.

We call the size of a multiset the number of things in it. Hence the size of {0, 1, 1} has size 3.

For a given set [n] and fixed size k, the function call nkmult(n,k) returns a list of objects which represent every multiset derived from the set [n] which has size k.


The cprod function is meant to accept a list of sets. Each set is itself represented by a list, but we assume that like a set it has no repetitions and we will not do anything important with the order. It then returns a representation of the Cartesian product of all the input sets. That is to say it returns a set (list) of tuples. In particular it returns the set of every possible tuple which could be formed by making the first tuple coordinate some element of the first set, the second coordinate any element of the second set, and so on.


Finally the intent of the last function, called generate, is to do two functions at once. One of its intended functions is to take an association between the sorts of elements that each of the earlier functions returns, and “translate” them to elements which the user may be interested in. For instance, although the earlier functions may be bound to working with [n], the user may wish that these elements were not numbers but instead cards from a deck of cards (or at least some kind of code representation, having suit and face-value information, or something like that). By means of the map input, we assign each element of [n] to its intended interpretation and possibly some other kind of element.

Next, once the elements have been so interpreted, we then include only the ones satisfying the input property. That is to say, property is a function which takes as arguments the kinds of things that map outputs. And it returns a boolean value, based on whether a particular element does or does not have a desired property. In this way, such a function often goes by the name filter in other computer science contexts.

Finally the generate function returns the list containing only the enumerated elements which satisfy the given property.