School choice in Python

Check out my new GitHub repository at https://github.com/vanderlindenma/school_choice_python.

It contains code to compute school choice assignments via Deferred and Immediate acceptance, and explanations on how to apply the code to different school choice settings.

The code is based on earlier code by Jeremy Kun: stable-marriages (2014), GitHub repository, https://github.com/j2kun/stable-marriages, described in one of Jeremy’s blog posts at http://jeremykun.com/2014/04/02/stable-marriages-and-designing-markets/.

I hope to expand on the code and add more functionality soon.

oTree: when treatment variables don’t fit into a model field

In oTree, because treatment variables are stored in model fields, your treatment
variable must fit into one of the field types available in Django (see oTree’s documentation on treatment variables).

In this post I’ll describe a hack to circumvent this limitation. It is kind of ugly but it works for me, and it’s the only way I have found to have lists as treatment variables.

The idea is to first convert your treatment variable into a string
and then convert the string back to its intended format.

To do so, you can use the ast module from Python’s standard library.
For instance, if your treatment variable consists in choosing between the
two lists [1,2,3] and [3,1,2] at the level of the group, you
would first define a treatment field on the Group model:

class Group(BaseGroup):
    # ...
    treatment = models.CharField()

Then, inside the before_session_starts method, you would turn the
chosen list into a string and assign it to treatment (because
treatment is a CharField, it can only be assigned strings):

# Requires "import random" among the import statements
def before_session_starts(self):
    for group in self.get_groups():
        # Pick one of the two lists and store it as a string
        group.treatment = str(random.choice([[1, 2, 3], [3, 1, 2]]))

Assuming [1,2,3] was picked, group.treatment is now the string
'[1, 2, 3]'. Thus, you will need to define a method that
converts the string group.treatment back to a list whenever
you need it. To do so, first import the ast module:

# Add the following to the import statements before
# class Constants(BaseConstants):

import ast

Then define the following method inside the Group model:

class Group(BaseGroup):
    # ...
    def convert_to_list(self):
        # Parse the stored string back into a list
        self.treatment = ast.literal_eval(self.treatment)

You can now use your treatment variable as a list by calling
convert_to_list. For instance, if you want to use your list
in the Decide page, you could call convert_to_list inside
the vars_for_template method in views.py:

class Decide(Page):
    # ...
    def vars_for_template(self):
        # ...
        self.group.convert_to_list()

This will make group.treatment available as a list in the
corresponding template.
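
To see the round trip in isolation, here is a minimal sketch that runs outside oTree. The variable names (candidates, stored, restored) are only illustrative; the plain string stands in for what a CharField would hold.

```python
import ast
import random

# The two candidate treatments from the example above.
candidates = [[1, 2, 3], [3, 1, 2]]

# Store: pick one list and turn it into a string, as a CharField requires.
stored = str(random.choice(candidates))

# Retrieve: parse the string back into a real list.
restored = ast.literal_eval(stored)

print(type(stored).__name__, type(restored).__name__)  # str list
print(restored in candidates)  # True
```

ast.literal_eval only accepts Python literals (lists, numbers, strings, and so on), so it is a safer choice than eval for this kind of conversion.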

Troubleshooting

  1. Remember to have a look at oTree’s documentation on treatments. The example above is for treatments at the group level, and you may need to adapt it to your needs (e.g. by storing your treatment variable in the participant.vars of the first player in the group, as described in the documentation).
  2. One complication with using Django’s models.CharField() is that, under Python 2, it may hand your string back as a unicode object rather than a str. If that causes trouble, you may need to redefine your convert_to_list method to something like

class Group(BaseGroup):
    # ...
    def convert_to_list(self):
        # Note: "unicode" only exists in Python 2
        if isinstance(self.treatment, str):
            self.treatment = ast.literal_eval(self.treatment)
        elif isinstance(self.treatment, unicode):
            self.treatment = ast.literal_eval(self.treatment.encode("utf-8"))
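
One possible variant, sketched here under my own assumptions, is to leave the stored string untouched and return the list instead of overwriting the field. FakeGroup and treatment_as_list are hypothetical names used only so the example runs without oTree; they are not part of oTree's API.

```python
import ast


class FakeGroup:
    """Hypothetical stand-in for an oTree Group; not part of oTree."""

    def __init__(self, treatment):
        # In oTree this would be a models.CharField holding a string.
        self.treatment = treatment

    def treatment_as_list(self):
        # Parse on demand and return the result; self.treatment stays a string.
        return ast.literal_eval(self.treatment)


group = FakeGroup("[3, 1, 2]")
first = group.treatment_as_list()
second = group.treatment_as_list()  # safe to call repeatedly
print(first, second, group.treatment)
```

Because the field itself is never reassigned, the method can be called several times, whereas calling ast.literal_eval on a value that is already a list raises a ValueError.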

A glance at Python’s memory management

I just learned the hard way about some basics of Python’s memory management. For people coming from R or other languages, it might be confusing to realize that if you define

x = [1,2,3]

then set

y = x

and then modify x, for instance through

x.append(4)

the change in x will propagate to y, because x and y are two names for the same list object. This means that if you query the value of y, you will in fact get

[1,2,3,4]

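This also means that, when another name such as y refers to the same list as x, the sequence

x = [1,2,3]
x.append(4)

is very different from

x = [1,2,3,4]

The first modifies the existing list in place, so y sees the change. The second binds x to a brand-new list and leaves y untouched.

For more on this and how to “really” define a new list with some life of its own in the memory, but with the same value as x, see http://henry.precheur.org/python/copy_list.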
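
A small sketch of the difference between giving a list a second name and copying it (the name z is mine, and list(x) is just one of several ways to make a shallow copy):

```python
x = [1, 2, 3]
y = x            # y is a second name for the same list object
z = list(x)      # z is a new list with the same contents (a shallow copy)

x.append(4)

print(y)         # [1, 2, 3, 4]
print(z)         # [1, 2, 3]
print(y is x, z is x)  # True False
```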

Tackling Python’s project structure

I recently started playing with oTree (“a Django-based framework for implementing multiplayer decision strategy games.”) in order to code a school choice experiment. This led me to dig deeper into Python.

One of the issues I got stuck on is probably the most basic issue of all: how to structure the folder and subfolders containing my Python code?

I was not looking for anything fancy here. The code for an oTree experiment is mostly located in a file called models.py and all I wanted was a way to

  • Define the classes I use in models.py in separate files that I could later “import” in models.py, so that models.py does not become crazy long and unreadable.
  • Group the files in which I define those classes into subfolders so that the folder containing models.py does not get overcrowded either.

For example, because I am coding a school choice experiment, I wanted to have a subfolder containing all the solvers that I use to compute the final assignments based on preferences and priorities, and later import the corresponding classes in models.py in order to compute participants’ payoffs.

As basic as this problem may seem, the solution is not obvious (at least to me). Googling this kind of issue yields a ton of different solutions, and it’s easy to get lost.

The most understandable and functional solution I have found so far is: http://mikegrouchy.com/blog/2012/05/be-pythonic-__init__py.html

If we are to believe the title, this should also be a “Pythonic” solution, which means it should, hopefully, put you at peace with Python aesthetes (from what I understand, “Pythonic” means something like “in the spirit of Python” or “following the coding standards which are considered good practice by Python’s community”, whatever that may mean).

Two warnings about the solution in the link above:

  • Although the solution claims to be “Pythonic”, I’ve often seen people argue against the use of imports of the form from subpackage import *. These people usually say that these tend to “clutter the namespace”. The truth is I have no idea what that refers to, so I don’t know whether the argument applies to the use of import * described in the link. Anyway, it is good to know that import * can sometimes freak people out, even if (like myself) you don’t quite understand why.
  • Suppose that, following the solution described in the link, you’ve specified the __all__ variable in __init__.py in the subpackage directory, and you run from subpackage import * from file2.py (in the package directory). Now say you want to access That_Class, one of the classes from submodule2.py in the subpackage directory. Don’t be surprised if calling That_Class() from file2.py raises the error NameError: name 'That_Class' is not defined. Indeed, you’ve only imported submodule2 “as a single object”, and not all the classes it contains individually. Therefore, in order to call That_Class from file2.py, you will need to use submodule2.That_Class().
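
The second warning can be reproduced with a throwaway package. This is only a sketch: it builds subpackage and submodule2.py in a temporary directory, assumes __all__ lists just 'submodule2', and uses exec with a fresh dictionary to play the role of file2.py.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (all names here are illustrative).
root = Path(tempfile.mkdtemp())
pkg = root / "subpackage"
pkg.mkdir()
(pkg / "__init__.py").write_text("__all__ = ['submodule2']\n")
(pkg / "submodule2.py").write_text("class That_Class:\n    pass\n")

# Make the temporary directory importable.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Run the star import in a fresh namespace, as file2.py would.
namespace = {}
exec("from subpackage import *", namespace)

print("submodule2" in namespace)   # True: the module itself was imported
print("That_Class" in namespace)   # False: the class was not imported by name

# The class is reachable through the module object.
obj = namespace["submodule2"].That_Class()
print(type(obj).__name__)          # That_Class
```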

R: a coercive programming language

I recently got a great answer to a question I asked on economics.stackexchange.com which led me to explore the matchingMarkets R library, by Thilo Klein. The library contains very handy functions to compute the outcome of some of the most famous matching algorithms in the college admission literature.

I wanted to tweak the functions a little to use them in a research project but I quickly realized that my knowledge of R was too limited to do that. So I painfully started learning a little more about R and I figured I might as well document some of my quest for R-mastery here. Hopefully, I won’t be completely off all the time and what I write may help someone.

Here is the first important thing that I learned the hard way and which is not always emphasized in introductions to R: R is a coercive programming language. This means that, from time to time, R will “coerce” variables of a certain type (logical, numeric, …) into another type. Suppose for instance that we ask

1 == TRUE

This seems to make no sense: 1 is a numeric, TRUE is a logical, so what does it mean to ask whether they are equal? Intuitively, one may expect equality comparisons to only be applicable between objects of the same type, and one might expect

1 == TRUE 

to yield an error. Well, it might be so in other languages, but not in R. It is not that R can “really” answer questions like

"apple" == 3.14

But if you ask R to do so, it will still try to see whether it can, in one way or another, transform (i.e. coerce) the type of "apple" or 3.14 to make them comparable. Back to the first query

1 == TRUE 

R knows that TRUE is a logical and that 1 is a numeric. But instead of giving up and returning an error, R will ask “what if I tried to turn TRUE into a numeric and performed the equality check anyway?” The R function in charge of trying to turn any argument into a numeric is as.numeric. Thus when sending R the former query, you are in fact asking

1 == as.numeric(TRUE) 

Now, somewhat unsurprisingly, R people have decided that

as.numeric(TRUE) == 1

Other values also have a defined numeric equivalent. For example, as you might expect,

as.numeric(FALSE) == 0

Thus to sum up

1 == TRUE 

is equivalent to

1 == as.numeric(TRUE) 

and therefore in the console

> 1 == TRUE
[1] TRUE

The last piece of code might seem like nothing, but for someone who starts to learn R, it can be a huge pain to figure out. Now, there is much more to coercion and object types than the little example above. One could for instance ask: “How do we know if R will coerce TRUE into a numeric or 1 into a logical?” The short answer can be found in the help of the == operator (?`==` in the console):

“If the two arguments are atomic vectors of different types, one is coerced to the type of the other, the (decreasing) order of precedence being character, complex, numeric, integer, logical and raw.”

Since numeric comes before logical in this order, it is TRUE that gets coerced into a numeric, and not 1 into a logical.

Also, if == were the only operator to coerce variables, it would not be so much of an issue. Things are much more complex however: many other operators and functions do coerce object types.

As a matter of fact, there would be a lot to say (and I still have a lot to learn) about R types and the way R manages coercion. This is not the place, nor am I the right author, to do this. My only goal here was to stress the following simple point:

When dealing with R, one must learn to live with the fact that object types will not always match and that this will not always lead to an error. In an ideal world (?), code would not rely too much on automatic coercion by R and types would be explicitly matched before performing operations. In practice however, code does rely on automatic coercion, and if one wants to survive the R world, one must learn about and get used to coercion. At the very top of the list of questions to ask yourself when you do not understand a piece of code is: “is there any coercion happening which I am not completely comfortable with?”

To dig deeper, here are a couple of relevant questions on http://stackoverflow.com/: