New Things Under the Sun is a living literature review; as the state of the academic literature evolves, so do we. This post highlights three recent updates.
How Distortionary is Publish-or-Perish to Science?
As I wrote earlier this month, science appears to be getting harder. One possible cause of this is increasing competition and the incentive to publish. Maybe scientists can only keep up in the publishing race by doing increasingly slap-dash work?
The article Publish-or-perish and the quality of science looked at some evidence on this in two very specific contexts where we have exceptionally good data. A new update adds papers that rely on lower-quality data, but which are able to assess a much wider set of contexts:
We can find complementary evidence in two additional papers that have far less precision in their measurement but cover much larger swathes of science. Fanelli, Costas, and Larivière (2015) and Fanelli, Costas, and Ioannidis (2017) each look for statistical correlations between proxies for low-quality research and proxies for pressure to publish. When we zoom out like this, though, we find only mixed evidence that publication pressures are correlated with lower quality research.
Fanelli, Costas, and Larivière (2015) look at the quality of research by focusing on a rare but unambiguous indicator of serious problems: retraction. If we compare authors who end up having to retract their papers to those who do not, do we see signs that the ones who retracted their papers were facing stronger incentives to publish? To answer this, Fanelli, Costas, and Larivière (2015) identify 611 authors with a retracted paper in 2010-2011, and match each of these retracted papers with two papers that were not retracted (the articles published immediately before and after them in the same journal).
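To make that matched-control design concrete, here is a minimal sketch of the control selection described above. It is my own illustration with made-up paper ids, not the authors' code, and the real study of course involves more data cleaning.

```python
# Hypothetical sketch of the matched-control selection described above:
# for each retracted paper, take the articles published immediately before
# and after it in the same journal as non-retracted controls.
def pick_controls(retracted_id, journal_order):
    """journal_order: paper ids for one journal, in publication order."""
    i = journal_order.index(retracted_id)
    controls = []
    if i > 0:
        controls.append(journal_order[i - 1])   # article published immediately before
    if i + 1 < len(journal_order):
        controls.append(journal_order[i + 1])   # article published immediately after
    return controls

# Made-up example
order = ["p1", "p2", "retracted_A", "p3", "p4"]
print(pick_controls("retracted_A", order))  # ['p2', 'p3']
```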
Fanelli, Costas, and Ioannidis (2017) look at a different indicator of “sloppy science.” Recall that in Smaldino and McElreath’s simulation of science, one aspect of a research strategy was the choice of protocols used in research. Some protocols were more prone to false positives than others, and since positive results are easier to publish, labs that adopt these kinds of protocols accumulate better publication records and tend to reproduce their methods. This form of publication bias leaves statistical fingerprints that can be measured. Fanelli, Costas, and Ioannidis (2017) try to measure the extent of publication bias across a large number of disciplines, and we can use this as at least a partial measure of “sloppy science.”
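For a concrete picture of what such a fingerprint can look like, here is an illustrative sketch of one common check: a small-study effect, where less precise studies report systematically larger effects. This is a generic example on simulated data, not necessarily the exact measure Fanelli, Costas, and Ioannidis use.

```python
# Illustrative only: simulate studies of a true effect of 0.2, "publish" the
# precise or statistically significant ones, and check whether noisier studies
# report bigger effects (a classic fingerprint of publication bias).
import numpy as np

rng = np.random.default_rng(0)
n_studies = 500
se = rng.uniform(0.05, 0.5, n_studies)        # each study's standard error
est = 0.2 + rng.normal(0.0, se)               # unbiased estimates of the true effect

# Selective publication: precise studies always publish; noisy ones only if "significant"
published = (se < 0.15) | (est / se > 1.96)

# Among published studies, regress estimates on standard errors.
# A clearly positive slope means small, noisy studies show inflated effects.
slope, intercept = np.polyfit(se[published], est[published], 1)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}  (slope near 0 -> little bias)")
```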
Each of these papers then looks at a number of features that, while admittedly crude, are arguably correlated with stronger incentives to publish. Are the authors of retracted papers more likely to face these stronger publication pressures? Are the authors of papers that exhibit stronger signs of publication bias more likely to face them?
One plausible factor is the stage of an author’s career. Early career researchers may face stronger pressure to publish than established researchers who are already secure in their jobs (and possibly already tenured). And indeed, each paper finds evidence of this: early career researchers are more likely to have to retract papers and show more evidence of publication bias, though the effect on publication bias is quite small.
Another set of variables is the country in which the author’s home institution is based, since countries differ in how academics climb the career ladder. Some countries offer cash incentives for publishing, others disburse public funds to universities based closely on universities’ publication records, and others have tenure-type systems where promotion is more closely tied to publication record. When you sort authors into groups based on the policies of their country, you do find that authors in countries with cash incentives for publication are more likely to retract papers than those working in countries without cash incentives.
But that’s the strongest piece of evidence based on national policy that publication incentives lead to worse science. You don’t observe any statistically significant difference between authors in cash-incentive countries and everyone else when you look at publication bias. Neither do you see anything when you instead group authors by whether they work in a country where promotion is more closely tied to individual performance. And if you group authors by whether they work in a country where publication record plays a large role in how funds are distributed, you actually see the opposite of the expected result: authors are less likely to retract and show fewer signs of publication bias when publication records matter more for how funds are disbursed.
A final piece of suggestive evidence is also interesting. In Smaldino and McElreath, the underlying rationale for engaging in “sloppy science” is to accrue more publications. But in fact, authors who published more papers per year were less likely to retract, and their papers either exhibited less bias or no statistically different amount (depending on whether a multi-authored paper is attributed to its first or its last author). There’s certainly room for a lot of interpretations there, but all else equal that’s not the kind of thing we would predict if we thought sloppy science let you accrue publications more quickly.
Read the whole thing for my view on how all this literature fits together. But the short version is I think publish-or-perish, on average, probably introduces real distortions, but they aren’t enormous.
Measuring the Impact of Strange Combinations of Ideas
A classic school of thought in innovation asserts that the process of innovation is fundamentally a process of combining pre-existing concepts in novel ways. One claim from this school of thought is that innovations that make particularly surprising combinations should be particularly important in the history of innovation. The article The Best New Ideas Combine Disparate Old Ideas looked at a bunch of evidence consistent with this claim, at least in the context of patents and papers.
I’ve updated this article with two papers that provide new ways to measure this, in the context of academic papers. The first is by Carayol, Lahatte, and Llopis (2019):
Carayol, Lahatte, and Llopis (2019) investigate this by using the keywords that authors attach to their own manuscripts as proxies for the ideas that are being combined. For a dataset of about 10 million papers published between 1999 and 2013, they look at each pair of keywords used in each paper, comparing how many other papers use the same pair of keywords to what would be expected if keywords were assigned randomly and independently. Using this metric of novelty, they find the more novel the paper, the more citations it gets and the more likely it is to be among the top 5% most cited. In the figures below, papers are sorted into 100 bins from least novel (left) to most novel (right), and the vertical axis shows the average citations received within 3 years, or the probability of being among the top 5% most cited, for papers in the same centile.
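As a rough illustration of this kind of measure, the sketch below compares how often a pair of keywords actually co-occurs with how often it would co-occur if keywords were assigned to papers independently at random. The papers and keywords are invented, and the published measure is more refined than this.

```python
# Toy illustration of keyword-pair novelty: observed co-occurrence of a pair
# relative to what independent random assignment would predict.
from collections import Counter
from itertools import combinations

papers = {
    "p1": {"graphene", "batteries"},
    "p2": {"graphene", "batteries"},
    "p3": {"solar", "perovskite"},
    "p4": {"solar", "perovskite"},
    "p5": {"graphene", "perovskite"},
    "p6": {"batteries", "solar"},
}

n = len(papers)
kw_counts = Counter(k for kws in papers.values() for k in kws)
pair_counts = Counter(frozenset(pair) for kws in papers.values()
                      for pair in combinations(sorted(kws), 2))

def pair_surprise(a, b):
    """Ratio of observed to expected co-occurrence; values near 0 flag novel pairs."""
    observed = pair_counts[frozenset((a, b))] / n
    expected = (kw_counts[a] / n) * (kw_counts[b] / n)
    return observed / expected

print(pair_surprise("graphene", "batteries"))   # 1.33: co-occurs more than chance
print(pair_surprise("graphene", "solar"))       # 0.0: never combined, a novel pair
```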
The second paper brings in a new way to measure the impact of unusual combinations, rather than a new way of measuring whether ideas are combined in unusual ways.
[A]s with patents, it would be nice to have an alternative to the number of citations received as a measure of the importance of academic papers that combine disparate ideas. Lin, Evans, and Wu (2022) provide one such alternative by comparing how disruptive a paper is with how unusual its combinations of cited references are. Intuitively, disruption is about how much your contribution renders prior work obsolete, and a new line of papers attempts to measure this with an index based on how much your work is cited on its own, and not in conjunction with the stuff your work cited. This is distinct from simply the number of citations a paper receives. You can be highly cited but not highly disruptive, if most of your citations also point to one of your references. And you can be highly disruptive without being highly cited, if most of the citations you do receive cite you and only you.
Lin, Evans, and Wu (2022) measure unusual combinations of ideas in the same way as Uzzi, Mukherjee, Stringer, and Jones and (among other things) compare the extent to which a paper makes unusual combinations to how disruptive it is. They find papers citing conventional combinations of journals are disruptive 36% of the time, whereas papers citing highly atypical combinations of journals are disruptive 61% of the time. In this context, a paper is disruptive if it receives more citations from papers that only cite it than citations from papers that cite both it and one of its references. That suggests unusual combinations are particularly important for forming new platforms upon which subsequent papers build.
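To make that criterion concrete, here is a minimal sketch of the check described above: does a focal paper receive more citations from papers that cite only it than from papers that also cite one of its references? The paper ids and citation links are invented, and the published disruption index involves more bookkeeping than this.

```python
# Minimal sketch of the disruption criterion: compare "i-only" citations
# (citing papers that cite the focal paper but none of its references) with
# citations from papers that cite both the focal paper and a reference.

def is_disruptive(focal, references, citing_papers):
    """citing_papers: dict mapping each citing paper id to the set of ids it cites."""
    i_only = sum(1 for cited in citing_papers.values()
                 if focal in cited and not (cited & references))
    both = sum(1 for cited in citing_papers.values()
               if focal in cited and (cited & references))
    return i_only > both

# Toy example: the focal paper cites r1 and r2; three later papers cite it.
refs = {"r1", "r2"}
citers = {
    "c1": {"focal"},            # cites only the focal paper
    "c2": {"focal", "r1"},      # cites the focal paper and one of its references
    "c3": {"focal"},            # cites only the focal paper
}
print(is_disruptive("focal", refs, citers))  # True: 2 "i-only" citations vs 1 "both"
```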
A Bias Against Novelty
Lastly, the article Conservatism in science examined a bit of a puzzle: scientists are curious people, so why would they appear to exhibit a bias against novel research? One strand in that argument was a paper by Wang, Veugelers, and Stephan, which presented evidence that papers doing highly novel work eventually get more citations, but are less likely to be highly cited by people in their own discipline, and take longer to receive citations. But that paper was inevitably based on just one sample of data using one particular measure of novelty. Carayol, Lahatte, and Llopis (2019) (discussed previously) provides an alternative dataset and measure of novelty that we can use to assess these claims. In the updated piece, I integrate their results with Wang, Veugelers, and Stephan.
…Suppose we’ve recently published an article on an unusual new idea. How is it received by the scientific community?
Several papers look into this, but let’s focus on two. Wang, Veugelers, and Stephan (2017) and Carayol, Lahatte, and Llopis (2019) each look at academic papers across most scientific fields, with the former looking at papers published in 2001 and the latter at papers published between 1999 and 2013. They each devise a way to measure how novel these papers are, and then look at how novel papers are subsequently cited (or not).
…
Echoing what I said at the beginning, highly novel work seems desirable; it’s much more likely to become one of the top cited papers in its field by either measure. Both figures below sort papers from least to most novel as we move from left to right, and give the proportion of papers in a given category that are among the top cited in their field (top 1% cited left, top 5% cited right).
But despite the fact that highly novel work, once published, tends to result in more citations, papers that are really novel face some challenges in the publication game. Both Wang, Veugelers, and Stephan (2017) and Carayol, Lahatte, and Llopis (2019) look at the prestige of the journals that ultimately publish highly novel work. They both use the impact factor of journals as a proxy for prestige: this is meant to capture the average number of citations an article published in the journal receives over some time span.
At first, all seems well: Carayol, Lahatte, and Llopis find that the most novel work is actually more likely to be published in the highest impact journals, and that the average novelty of papers goes up with the impact factor. High impact journals like novel work! But Carayol, Lahatte, and Llopis find evidence that they don’t like it quite enough, at least if their goal is to promote the most highly cited work. Across a large swath of journals, novel papers tend to get about 10% more citations than the average article published in the same journals. To oversimplify, that means that if we think citations are a good proxy for the quality of an article, it’s as if a novel paper has to be 10% better than its more conventional peers in order to get into the same journal as them.
Wang, Veugelers, and Stephan’s evidence on this point is even stronger. They show moderately and highly novel papers are actually just less likely to be published in the best journals (as measured by the journal impact factor). Between the two, that suggests to me that highly novel work is also less likely to be published at all, and so the results we’re seeing might be overestimating the citations received, since they rely on novel work that was good enough to clear skeptical peer review.
Wang, Veugelers, and Stephan additionally present evidence that one reason for this publication penalty might be that peers in your home field are less likely to recognize the merits of highly novel work (at least as measured by unusual combinations of cited journals). In fact, this recognition disproportionately comes from other fields. Restricting attention to citations received from within the same field, novelty isn’t rewarded at all! (Carayol and coauthors do not investigate which fields cite papers.)
Looking across all fields, Wang, Veugelers, and Stephan additionally find these citations take longer to roll in: restricting attention to citations received in just the first three years, again, novelty isn’t really rewarded at all. That said, this finding appears to depend on how you measure novelty. Carayol, Lahatte, and Llopis do not find that novel papers get their citations only with a delay. The papers may be picking up different flavors of novelty, with Wang, Veugelers, and Stephan specifically focusing on the novelty of using ideas that come from other fields (whose journals are rarely cited) while Carayol, Lahatte, and Llopis are not.
New Things Under the Sun is produced in partnership with the Institute for Progress, a Washington, DC-based think tank. You can learn more about their work by visiting their website.