The two professors gave HHU students an apparently typical exercise for bioinformaticians: they were told to analyze two datasets, which – separately for men and women – contained information on individuals’ body mass index (BMI) and the number of steps they took per day.
The students were separated into two groups: the first was only asked what they could learn from the data. The other group was additionally tasked with testing three hypotheses, for example whether the number of steps differed significantly between women and men.
Unknown to the students, their teachers had a very different intention. They wanted to see how much a pre-specified analysis path limited the students’ scientific creativity.
For this, Lercher and Yanai had created a dataset that had nothing to do with the situation described to the students. Instead, plotting the data pairs against each other revealed a simple picture of a gorilla. As Prof. Martin Lercher, head of HHU’s Computational Cell Biology group, explained: “Such a simple visualization of data belongs to the basic toolkit of data science; this was already covered in detail in the very first lectures.”
But a large fraction of the students who had been given specific hypotheses did not find the gorilla. They missed it five times more often than their freely analyzing fellow students.
The topic can be subsumed under the terms ‘Day Science’ and ‘Night Science’; it appeared in a corresponding series of editorials in Genome Biology. The focus on Day Science distinguishes modern science from the natural philosophy that preceded it: scientists submit their hunches to planned experiments – here, they don’t make discoveries but rigorously test hypotheses. Night Science, on the other hand, is the unsystematic, creative part of science, where researchers stumble upon new questions, ideas and hypotheses.
Prof. Itai Yanai emphasizes: “With our work, we want to stimulate more discussion about the creative part of science. Our analysis shows: if you’re stuck in ‘Day Science mode’, you easily miss exciting discoveries.”
The authors were very successful in their quest: an analysis of about 3.4 million publications from 2020 across all fields showed that their article “A hypothesis is a liability” ranks 63rd among the 100 most discussed works. In the area of ‘Information and Computing Sciences’, it ranks 5th. To arrive at these numbers, the bibliometrics platform Altmetric evaluated all accessible online references to these articles, almost 88 million in total.
These references include discussions in blogs and on social media platforms such as Facebook and Twitter. The paper was particularly successful on Twitter, where it was retweeted 2566 times. Martin Lercher: “How often our article was shared and discussed took us by surprise. Maybe it exposed a weak spot: that we scientists should give ourselves more space for creative ideas.”
Itai Yanai & Martin Lercher, A hypothesis is a liability, Genome Biology 21, 2020
Analysis by Altmetric: www.altmetric.com/top100/2020/