Quick Post: Misrepresenting Evidence In Behavioral Finance/Genetics
TL;DR: this post looks at a representative example of social scientists playing the telephone game with genetics papers. The intended takeaway is that researchers who claim their hypotheses are founded on prior work in biology cannot freely change the composition of the groups or alter the findings of the papers they cite. A simple post, but I think a useful one.
A Remarkable Coincidence
Shortly after the previous marathon pointing out fatal flaws in a score of papers in machine learning in finance/trading, the PLoS ONE editorial staff responsible for a couple of those academic catastrophes informed me that they had “opened investigations” into the problems. Then the other day they also informed me that the openings have been the extent of their efforts. So while they’re busy not doing anything, let’s take a quick look at yet another paper that should not have passed whatever they call peer review.
Paper: Genetic Determinants of Financial Risk Taking — Camelia M. Kuhnen, Joan Y. Chiao
As the title implies, this paper attempts to correlate certain genes with risk taking behavior in participants playing an investment game created using E-Prime. From the abstract:
Individuals vary in their willingness to take financial risks. Here we show that variants of two genes that regulate dopamine and serotonin neurotransmission and have been previously linked to emotional behavior, anxiety and addiction (5-HTTLPR and DRD4) are significant determinants of risk taking in investment decisions. We find that the 5-HTTLPR s/s allele carriers take 28% less risk than those carrying the s/l or l/l alleles of the gene…
Summarizing The Necessities:
- Experiment participants played a repeated investment game. In each round they decided how to allocate a given amount of assets between a riskless asset and a risky asset. At the end they received a nominal payoff based on their investment performance in one randomly drawn round.
- Participants were then genotyped.
- One of the gene regions of interest was 5-HTTLPR which “consists of a 44-base pair insertion or deletion, generating either a long (l) or a short (s) allele.”
- Each participant was then classified by those alleles as having either two long (l/l), two short (s/s), or a long and a short (s/l).
- Citing Lesch et al. the authors write, “the short variant of the polymorphism reduces the transcriptional efficiency of the 5-HTT gene promoter and is associated with higher scores on neuroticism and harm avoidance.”
- From this, the authors “hypothesized that individuals carrying two copies of the s allele of the 5-HTTLPR would be significantly more risk averse relative to individuals carrying one or two copies of the l allele.”
- For results, the authors compare two groups: (1) participants with s/s vs. (2) participants with either s/l or l/l. The first group appears to take significantly less risk.
You should always be suspicious any time you see a grouping like this, especially when the only semblance of justification the authors deploy is the vague and ever hand-wavy “based on previous research.”
It also didn’t smell right. Recall that the authors claimed: “the short variant of the polymorphism reduces the transcriptional efficiency of the 5-HTT gene promoter and is associated with higher scores on neuroticism and harm avoidance.” Naively, this seemed (to me, a non-geneticist) to imply that the s/l group should also show some effect that distinguishes it from the l/l group. The authors should have presented at least the summary stats for the three separate genotypes before assigning them to groups. Skipping that step made it look like they had something to hide.
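The per-genotype summary asked for above takes only a few lines. Here is a minimal sketch, assuming each participant is reduced to a (genotype, average share allocated to the risky asset) pair. The function name, the record layout, and the toy values are my own placeholders, not anything taken from the paper:

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_by_genotype(records):
    """Summarize risk taking separately for each genotype.

    records: iterable of (genotype, risky_share) pairs (hypothetical layout).
    Returns {genotype: (n, mean, sample_sd)}; sample_sd is None when n < 2.
    """
    by_genotype = defaultdict(list)
    for genotype, risky_share in records:
        by_genotype[genotype].append(risky_share)

    summary = {}
    for genotype, values in by_genotype.items():
        sd = stdev(values) if len(values) > 1 else None
        summary[genotype] = (len(values), mean(values), sd)
    return summary


# Made-up placeholder values for illustration only, NOT data from the paper.
toy_records = [
    ("s/s", 0.40), ("s/s", 0.50),
    ("s/l", 0.60), ("s/l", 0.70),
    ("l/l", 0.65),
]

for genotype, (n, avg, sd) in summarize_by_genotype(toy_records).items():
    print(genotype, n, avg, sd)
```

With three rows like these in front of the reader, any two-group split has to be justified against what the separate genotypes actually show.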
They had something to hide. The Lesch et al. paper that the authors use to justify their hypothesis uses a completely different grouping. That paper groups s/s and s/l together (calling it the “S” group), and then compares that group to l/l (the “L” group). Here’s the money shot from Lesch:
In all of these studies, the data associated with the s/s and l/s genotype were similar, whereas both differed from the l/l genotype, suggesting that the polymorphism has more of a dominant-recessive than a codominant-additive effect.
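To make the two groupings concrete, here is a small sketch that codes each genotype by its number of s alleles and then applies both splits. The function names and group labels are mine, chosen for illustration; the only things taken from the sources are the groupings themselves (Lesch: s/s and s/l together vs. l/l; Kuhnen and Chiao: s/s vs. s/l and l/l together):

```python
def s_allele_count(genotype: str) -> int:
    """Number of short alleles in a genotype written like 's/l' or 'l/s'."""
    return genotype.split("/").count("s")


def lesch_group(genotype: str) -> str:
    """Lesch et al.: any s allele puts you in 'S' (s behaves as dominant)."""
    return "S" if s_allele_count(genotype) >= 1 else "L"


def kuhnen_chiao_group(genotype: str) -> str:
    """Kuhnen and Chiao: only two s alleles count (s treated as recessive)."""
    return "s/s" if s_allele_count(genotype) == 2 else "s/l or l/l"


for genotype in ("s/s", "s/l", "l/l"):
    print(genotype, "->", lesch_group(genotype), "|", kuhnen_chiao_group(genotype))
```

The heterozygotes are the only subjects who change sides between the two schemes, and they are exactly the ones the Lesch quote says behave like s/s.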
So by grouping the s/l subjects with the l/l subjects, Kuhnen and Chiao misrepresented the paper they claimed as the biological basis for their hypothesis. From the cited literature, there is no biological justification for such a grouping. And yet the authors’ unfounded hypothesis just happened to be confirmed by the experiment. A remarkable coincidence. The paper’s conclusions are therefore wrong until shown otherwise. But I’m not gonna wait.
Thanks to Noah Smith for initially working through this paper with me.
I’d also like to thank Trombley for pointing out that “panopticon” is an anagram for “tin can poop.”
This post originally appeared on Zachary David’s Market Fails & Computational Gibberish.