p-hacking (and other data manipulations) made mainstream?

I am not a big fan of the last part of the talk, which is too optimistic about randomized controlled trials and may make them look like some sort of panacea. RCTs have important limitations, especially when it comes to external validity. If you run enough of them, you are bound to find the kind of patterns Laura Arnold seems to criticize using the 15 ingredients study (7:20 in the video). What works here now might not work there tomorrow. Eventually, you need structural models and sound theory to understand when a treatment works, and when it does not.

That’s the curse of the TED format. Arnold is somewhat critical of the format, but she has to fit in it. I am sure she and the organizations she advocates for know of the limits of RCTs. But there’s only so much you can say in 18 minutes if you want to conclude on an uplifting note.

Still, the overall message is very much worth spreading. It’s important to popularize notions like p-hacking and the file drawer effect, which rarely even make it into introductory statistics classes. And it’s nice to see TED being a little critical of itself (or rather TEDx being critical of TED).

LanguageTool: adding words

One slightly unpleasant feature of LanguageTool with Texstudio is that new words are a little harder to add to the dictionary than when using Texstudio’s native spellcheck.

The good part of having to add words through LanguageTool is that words you add are, well, actually added to the dictionary, whereas adding words to Texstudio’s native dictionary is, in my experience, unstable (I’ve had to add the same words multiple times on many occasions, in particular after updates).

For an explanation of how to add words to LanguageTool’s spell check, once again see the very good documentation on LanguageTool’s website at http://wiki.languagetool.org/hunspell-support#toc0.

I will try to keep an updated list of the words I have added here. The list might be of some use, in particular to those working in a field related to microeconomic theory.

Grammar nightmares: the road to salvation with LanguageTool

I am terrible with grammar, as you will likely observe somewhere in this post or elsewhere on my website. This is often very embarrassing. These days, however, I should be able to spare myself the embarrassment given the plethora of language-checking software.

There are two main reasons this software does not do the job for me:

  1. I do most of my writing in LaTeX with Texstudio, and Texstudio only comes with a rudimentary spell checker with little grammar-checking ability (it’s a pain to copy and paste into Word, mostly because Word’s checker gets caught up in LaTeX syntax).
  2. I am so bad that even state-of-the-art language checkers do not catch most of my mistakes. For instance, I am very bad with homophones. I often get words like “to” and “too” mixed up when I write, which even Word’s checker misses most of the time.

Regarding 2., what I really need is a language checker in which I can set up my own rules. When I realize I’ve made a mistake, I know I am likely to make that mistake again. Thus it is just a matter of making the effort to write down a rule that will catch that mistake for me in the future.

I’ve wanted to do just that for a while, but never found the right tool. My salvation might come from LanguageTool.

  1. LanguageTool works with Texstudio (see https://www.youtube.com/watch?v=VYIY7bbSv4Q for a simple installation tutorial) and natively improves upon the default language checker in Texstudio.
  2. LanguageTool gives you the ability to add your personal rules using a relatively straightforward syntax (there is a learning curve, but it’s not too bad).

For instance, I can easily tell LanguageTool to look for instances of “It is not to bad” (which Word’s checker does not flag) and suggest replacing it with “It is not too bad”.

Rules in LanguageTool are quite versatile and allow for regular expressions (regex).

LanguageTool’s tutorial explains how to create and add rules very didactically at http://wiki.languagetool.org/development-overview#toc4. Because the previous link has broken in the past, here is a direct quote describing the basics:

“Most rules are contained in rules/xx/grammar.xml, whereas xx is a language code like en or de. In the source code, this folder will be found under languagetool-language-modules/xx/src/main/resources/org/languagetool/; the standalone GUI version contains them under org/languagetool/.

A rule is basically a pattern which shows an error message to the user if the pattern matches. A pattern can address words or part-of-speech tags. Here are some examples of patterns that can be used in that file:

  • <token>think</token>
    matches the word think
  • <token>think</token> <token>about</token>
    Matches the phrase think about – as the text is split into words, you need to list each word separately as a token. This will not work: <token>think about</token>
  • <token regexp="yes">think|say</token>
    matches the regular expression think|say, i.e. the word think or the word say. You can write simple rules without knowing regular expressions, but if you want to learn more about them you can try this tutorial.”
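
Putting these pieces together, here is a minimal sketch of what a rule for the “not to bad” example mentioned earlier might look like. The rule id and name are labels I made up for illustration, and as far as I understand it the rule has to sit inside one of the category elements of grammar.xml, so treat this as a starting point rather than a definitive recipe:

```xml
<!-- Hypothetical rule: the id and name are illustrative, not taken from LanguageTool -->
<rule id="NOT_TO_BAD" name="not to bad (not too bad)">
  <pattern>
    <!-- one token per word, as explained in the quote above -->
    <token>not</token>
    <!-- the marker limits the flagged (and replaced) text to the word "to" -->
    <marker>
      <token>to</token>
    </marker>
    <token>bad</token>
  </pattern>
  <!-- message shown to the user, with the suggested replacement -->
  <message>Did you mean <suggestion>too</suggestion>?</message>
  <!-- a sentence the rule should flag, and one it should leave alone -->
  <example correction="too">It is not <marker>to</marker> bad.</example>
  <example>It is not too bad.</example>
</rule>
```

The two example sentences double as a small self-check: the first should trigger the rule and the second should not.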

LanguageTool even has a handy rule editor that you can use to create new rules if you don’t want to learn too much about the rules’ syntax (community.languagetool.org/ruleEditor2/index).

You can download a list of the custom rules I find most useful here. I will try to update the list as I write down new rules.

Finding all stable matchings in roommate (and marriage) problems

Patrick Prosser has some great Java code at http://www.dcs.gla.ac.uk/~pat/roommates/distribution/ which, among other things, can compute all the stable matchings in roommate problems.

If you are interested in two-sided matchings, rejoice: Patrick’s code allows preferences over roommates to include unacceptable roommates. To implement a two-sided market, just make sure every roommate on one side of the market views every other roommate on the same side as unacceptable, and you’re good to go.

If (like me) you are not used to Java, you might struggle a little to get the code working. Here is a little tutorial for Mac OS, which worked for me as of today.