Science is getting harder
Evidence that discoveries are getting smaller on average
Like the rest of New Things Under the Sun, this article will be updated as the state of the academic literature evolves; you can read the latest version here.
One of the most famous recent papers in the economics of innovation is “Are Ideas Getting Harder to Find?” by Bloom, Jones, Van Reenen, and Webb. It showed that more and more R&D effort is necessary to sustain the present rates of technological progress, whether we are talking about Moore’s law, agricultural crop yields, healthcare, or other proxies for progress. Other papers that look into this issue have found similar results. While it is ambiguous whether the rate of technological progress is actually slowing down, it certainly seems to be getting harder and harder to keep up the pace.
What about in science?
A basket of indicators all seem to document a trend similar to what we see with technology. Even as the number of scientists and publications rises substantially, we do not appear to be seeing a concomitant rise in new discoveries that supplant older ones. Science is getting harder.
Before diving into these indicators, I want to head off one potential misunderstanding. My claim is that science is getting harder, in some sense, not that science is ending or that we are on the verge of running out of ideas. Instead, the claim is that discoveries of a given “size” are harder to bring about than in the past.
Raw paper output
We’ll actually start with an indicator that shows no evidence of a slowdown though. Since scientists primarily communicate their discoveries via papers, the first place to look for evidence of increasing difficulty of making discoveries is in the number of papers scientists publish annually. The figure below, drawn from Dashun Wang and Albert-László Barabási’s (free!) book on the Science of Science compares publications to authors over the last century.
At left, we can see the number of papers and authors per year has increased basically in lockstep over the twentieth century. Note that the axis is on a log scale, so that a straight line indicates exponential growth. Meanwhile, at right, the blue dashed line shows that the number of papers per author has hovered around 2 for a century and, rather than falling, has actually been on the rise in recent decades. (As an aside, the solid red line at right is strong evidence for the rise of teams in science, discussed more here.)
So absolutely no evidence that scientists are struggling to find stuff worth writing up. But that’s not definitive evidence, because scientists are strongly incentivized to publish and what constitutes a publishable discovery is whatever editors and peer reviewers think is publishable. If fewer big discoveries are made, scientists may just publish more papers on small discoveries. So let’s take a more critical look at the papers that get published and see if there are any indicators that they contain smaller discoveries than in the past.
Let’s start by looking at some discoveries whose importance is universally acknowledged. The Nobel prize for discoveries in physics, chemistry, and medicine is one of the most prestigious scientific prizes and has a history long enough for us to see any long-run trends. Using a publicly available database on Nobel laureates by Li et al. (2019), we can identify the papers describing research that is eventually awarded a Nobel prize, and the year these papers were published. Note that several papers might be associated with any given award. For each award year, we can then ask what share of the papers related to the discovery were published in the preceding twenty years. The results are presented below, though I smooth the data by taking the ten-year moving average.
Prior to the 1970s, on average 90% of the time, awards went to papers published in the last twenty years. But by 2015, the ten-year moving average was closer to 50%.
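As a rough sketch of this calculation (not the code behind the figure), the snippet below computes the share of prize-related papers published within twenty years of the award and then smooths it. The record layout, the choice to pool all papers by award year, and the use of a trailing rather than centered moving average are assumptions made for illustration; the toy records are invented.

```python
from collections import defaultdict


def recent_share_by_award_year(records, window=20):
    """Share of prize-related papers published within `window` years of the award.

    `records` is an iterable of (award_year, publication_year) pairs, one pair
    per paper linked to a prize. This layout is a hypothetical simplification.
    """
    totals = defaultdict(int)
    recent = defaultdict(int)
    for award_year, pub_year in records:
        totals[award_year] += 1
        if 0 <= award_year - pub_year <= window:
            recent[award_year] += 1
    return {year: recent[year] / totals[year] for year in sorted(totals)}


def trailing_moving_average(series, span=10):
    """Average each year's value with the values from up to span - 1 earlier years."""
    years = sorted(series)
    smoothed = {}
    for year in years:
        values = [series[y] for y in years if year - span < y <= year]
        smoothed[year] = sum(values) / len(values)
    return smoothed


# Invented toy data: (award year, publication year of a prize-related paper).
toy_records = [
    (1960, 1950), (1960, 1935),
    (1961, 1955),
    (1962, 1930), (1962, 1960),
]
shares = recent_share_by_award_year(toy_records)     # {1960: 0.5, 1961: 1.0, 1962: 0.5}
smoothed = trailing_moving_average(shares, span=2)   # a short span, just for the toy data
```

With real data, `span=10` would correspond to the ten-year smoothing described above.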
So recent discoveries seem to have a harder time getting recognized as Nobel-worthy, relative to a few decades ago. We can also compare the importance of different discoveries that won Nobel prizes. In 2018, Patrick Collison and Michael Nielsen asked physicists, chemists, and life scientists to pick the more important discovery (in their field) from pairs of Nobel prize-winning discoveries. For example, they might ask a physicist to say which is more important, the discovery of Giant Magnetoresistance (awarded the Nobel in 2007) or the discovery of the Compton effect (awarded in 1927). For each decade, they look at the probability a randomly selected discovery made in that decade would be picked by their survey respondents over a randomly selected discovery made in another decade. The results are below.[1]
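One simple way to turn pairwise survey answers into a score per decade is to count how often a decade's discoveries win the match-ups they appear in. The sketch below does that under simplifying assumptions: each answer is reduced to a (winner decade, loser decade) pair, same-decade match-ups are ignored, and every answer gets equal weight. It illustrates the idea and is not necessarily the exact estimator Collison and Nielsen used; the toy answers are invented.

```python
from collections import defaultdict


def decade_win_rates(comparisons):
    """Fraction of cross-decade match-ups won by discoveries from each decade.

    `comparisons` is an iterable of (winner_decade, loser_decade) pairs,
    a hypothetical encoding of the survey responses.
    """
    wins = defaultdict(int)
    appearances = defaultdict(int)
    for winner, loser in comparisons:
        if winner == loser:
            continue  # only comparisons across decades are informative here
        wins[winner] += 1
        appearances[winner] += 1
        appearances[loser] += 1
    return {decade: wins[decade] / appearances[decade] for decade in sorted(appearances)}


# Invented toy answers, not survey data.
toy_answers = [(1920, 2000), (1920, 1980), (1980, 2000), (2000, 1980)]
rates = decade_win_rates(toy_answers)
```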
A few points are notable from this exercise. First, physicists seem to think the quantum revolution of the 1910s-1930s was the best era for physics and it’s been broadly downhill since then. That’s certainly consistent with discoveries today being in a sense smaller than the ones of the past, at least for physics.
In contrast, for chemistry and physiology/medicine, the second half of the twentieth century has outperformed the first half. In the Nobel prize data, within the second half of the century, there is no obvious trend up or down for chemistry and medicine. While that’s better than physics, it remains consistent with the notion that science might be getting harder. As we can see in the first figure here, the number of papers and scientists rose substantially between 1950 and 1980, which naively implies that the number of candidates for Nobel-prize winning discoveries should also have risen substantially. If we are selecting the most important discovery from a bigger pool of candidates, we should expect that discovery to be judged more important than discoveries picked from smaller pools. But that doesn’t seem to be the case.
So Nobel prize data is also consistent with the idea that discoveries today aren’t what they used to be. Whereas it used to be quite common for work published in the preceding twenty years to be recognized for a Nobel, that doesn’t happen nearly so much today. That said, an alternative explanation is that the Nobel committee is just trying to work through an enormous backlog of Nobel-worthy work which they want to recognize before the discoverers die. In this explanation, we’ll eventually see just as many awards for the work of today.
But it’s not clear to me this is how the committee is actually thinking: awards still go to recent work about half the time, when the committee thinks the discovery is sufficiently important. For example, Jennifer Doudna and Emmanuelle Charpentier were awarded a Nobel for their work on CRISPR in 2020, less than a decade after the main discoveries. And when you look specifically at the work performed in the 1980s, it doesn’t seem particularly notable, relative to work in the 40s, 50s, 60s, and 70s, despite the fact that many more papers were published in that decade.
Top Cited Papers
Still, perhaps the Nobel prize is simply too idiosyncratic for us to learn much from. Next, let’s look at another indicator of big discoveries, one which shouldn’t be biased by the sort of factors peculiar to the Nobel: the most highly cited papers in a given field. For example, if we look at the top 0.1% most highly cited papers of all time in a particular field, we could ask how easy it is for a new paper to join their ranks. If that has become harder over time, then that’s further evidence that today’s papers aren’t making the same contributions as yesterday’s.
On the other hand though, we might think it should get harder and harder to climb to the top 0.1%, even if discoveries are not getting smaller. After all, if discoveries are of constant size, earlier works have more time to get citations; it may not be possible for later papers to catch up, even if they are just as good. But there are also some factors that lean in the opposite direction. First, if work is only cited when relevant, then newer work should have an easier time being relevant to newer papers. Since the number of new papers grows over time, that gives one advantage to the new; they can be tailored to a bigger audience, in some sense. Second, the most esteemed papers of all time may actually stop being cited at high rates, because their contributions become part of common knowledge: it is no longer necessary to cite Newton when talking about gravity, or even Watson and Crick when asserting DNA has a double-helix shape.
So let’s proceed with seeing if there has been any change in how easy or hard it is to become a top cited paper, noting that won’t be the last piece of evidence we look at.
The closest paper I know of that looks into this is Chu and Evans (2021), which looks at the probability of a new paper ever becoming one of the top 0.1% most cited, even for just one year. But this paper does not plot this probability against time, like the previous charts: instead, it plots this probability against the size of a field, measured by the number of papers published per year. In the scatterplot below, each point corresponds to a field in a year. On the horizontal axis is the number of papers published in the fields in that year and on the vertical axis the probability a paper in that field and year is ever among the top 0.1% most cited. The colored lines are trends for each of these ten fields. Note this figure only includes papers published in the year 2000 or earlier. Since the analysis is conducted with data from 2014, every paper has more than a decade to accrue citations and get into the top 0.1%.
The figure shows pretty clearly that as fields get bigger the probability of jumping to the top 0.1% shrinks. It used to be the case that papers had a greater than 0.1% chance of being in the top 0.1% at some point. That, itself, suggests some degree of turnover and dynamism at the top of the field. But for the largest fields, that isn’t the case anymore. Note, while this chart has the size of a field on the horizontal axis, since fields tend to get bigger every year, this also shows us trends over time: up until the year 2000, newer papers had successively lower chances of supplanting their rivals and becoming one of the top 0.1% most cited.
Another variant of this chart tells the same story. In the figure below, Chu and Evans find the top 50 most highly cited papers in each year. In red below, they then track the proportion of those papers that stay in the top 50 in the next year. As this moves up, that means fewer and fewer papers are supplanting the top 50 most cited. Again, they plot this against the size of the field rather than time, but since the two move together, it also shows what is happening over time.
(In blue is a related measure, the year-to-year correlation of the rank of top 50 cited papers. In an appendix, they specifically show that this measure is correlated with time alone, so that there is less turnover in more recent years)
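To make the turnover measure concrete, here is a minimal sketch of the share of one year's top-k papers that are still in the top k the following year. The input layout (citations received per paper per year), the tie-breaking rule, and the assumption that consecutive keys are consecutive years are all illustrative choices, not details taken from Chu and Evans.

```python
def top_k_retention(citations_by_year, k=50):
    """Share of each year's top-k cited papers that remain in the next year's top k.

    `citations_by_year` maps year -> {paper_id: citations received that year};
    this layout is a hypothetical simplification.
    """
    years = sorted(citations_by_year)

    def top_k(year):
        counts = citations_by_year[year]
        # Sort by citations (descending), breaking ties by paper id for determinism.
        ranked = sorted(counts, key=lambda paper: (-counts[paper], paper))
        return set(ranked[:k])

    retention = {}
    for this_year, next_year in zip(years, years[1:]):
        current = top_k(this_year)
        retention[this_year] = len(current & top_k(next_year)) / len(current)
    return retention


# Invented toy data with k=2 so the sets are easy to check by hand.
toy = {
    2000: {"a": 10, "b": 8, "c": 1},
    2001: {"a": 9, "b": 2, "c": 7},
    2002: {"a": 5, "b": 1, "c": 6},
}
retention = top_k_retention(toy, k=2)   # {2000: 0.5, 2001: 1.0}
```

A value drifting toward 1.0 over time would correspond to the red line moving up: fewer newcomers displacing the incumbents.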
If being a top-cited paper is an indicator of large scientific impact, then the above suggests it’s harder to have a big impact than in the past.
Other interpretations are also possible though, such as the factors mentioned earlier. Alternatively, perhaps as fields grow a canon of select papers emerges, and everyone frames their work in relation to this canon, so that earlier work is perpetually cited, but citations don’t accurately capture the “size” of a new discovery. As with the factors peculiar to the Nobel prize, it might be that these two explanations can coexist. But in case there is something strange about how earlier work is necessarily canonized, let’s now turn to some indicators that cover science more broadly.
Growth in Topics
One issue with all the preceding evidence is that the quality of a research discovery is determined by how other scientists assess it: do they cite it, give it prizes, or tell us how important it is relative to alternatives. That might be the best we can do; a thorny problem in research is that evaluating the quality of research often requires the skills to do research, and most of the people with those skills are researchers. But it also means our assessment of the quality of research is tied up with any biases and blindspots a field might have about itself.
So let’s turn to a metric that isn’t based on the assessment of the scientists themselves. Rather than looking at the size of discoveries, we could instead try to chart the topics covered by a field over time. If a field is steadily spreading into new topics (from electrons to quarks to strings, for example), that suggests a field is learning new things and pushing out its frontier. On the other hand, if a field remains stuck on the same set of things (from strings to strings to strings, for example) that might be indicative that the field is struggling to make progress.
Milojević (2015) tries to get at this question by looking at the titles of published papers. Most people will probably only ever read a paper’s title, and so scientists usually try to broadcast what the paper is about with the title.[2] Milojević uses the text of titles to define the topics a field is studying. She identifies a topic as a string of words in a title that lies between grammatical phrase delimiters (think .,:;) or common English words with non-technical meaning (think “about”, “since”, “using”). When a topic has more than three words, she uses only the last three to define it. As an example, using this technique the most cited paper of all time, “Protein measurement with the Folin phenol reagent,” would be construed to be about the topics “protein measurement” and “Folin phenol reagent.”
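The rule described above is simple enough to sketch in a few lines. The version below is an approximation under stated assumptions: the stop-word list is a tiny illustrative stand-in (the list actually used in the paper is not reproduced here), words are split on whitespace, and original capitalization is kept.

```python
import re

# Small illustrative list of common non-technical words; an assumption,
# not the word list used in the paper.
COMMON_WORDS = {
    "a", "an", "and", "about", "for", "in", "of", "on",
    "since", "the", "using", "with",
}


def extract_topics(title, max_words=3):
    """Split a title into topics at phrase delimiters and common words.

    Topics longer than `max_words` keep only their last `max_words` words.
    """
    topics = []
    for phrase in re.split(r"[.,:;]", title):
        current = []
        for word in phrase.split():
            if word.lower() in COMMON_WORDS:
                if current:
                    topics.append(" ".join(current[-max_words:]))
                current = []
            else:
                current.append(word)
        if current:
            topics.append(" ".join(current[-max_words:]))
    return topics


topics = extract_topics("Protein measurement with the Folin phenol reagent")
# -> ["Protein measurement", "Folin phenol reagent"]
```

For counting distinct topics across many titles, one would probably also want to normalize case and plurals, which this sketch does not attempt.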
Now that Milojević has a way to define topics in a field, she then goes about counting how many distinct topics are mentioned in the titles of papers published in a given year. Analogous to the number of papers published per person, Milojević looks at the number of unique topics for every set of 10,000. In other words, the algorithm reads random paper titles until it reaches 10,000 topics, and then counts how many topics from this set of 10,000 are unique.
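The counting step can be sketched as follows, assuming the topics for each title have already been extracted. The function name, the input layout, and the fixed random seed are assumptions for illustration; the sketch reads titles in random order until it has collected the target number of topics and then counts how many are distinct.

```python
import random


def cognitive_extent(topics_per_title, sample_size=10_000, seed=0):
    """Number of unique topics among the first `sample_size` topics drawn
    from titles read in random order.

    `topics_per_title` is a list of topic lists, one per paper title in a year.
    """
    rng = random.Random(seed)
    order = list(range(len(topics_per_title)))
    rng.shuffle(order)

    sampled = []
    for index in order:
        sampled.extend(topics_per_title[index])
        if len(sampled) >= sample_size:
            break
    if len(sampled) < sample_size:
        raise ValueError("not enough topics to fill the sample")
    return len(set(sampled[:sample_size]))


# Toy checks with a small sample size: heavy repetition gives a low count,
# all-distinct topics give a count equal to the sample size.
repetitive = cognitive_extent([["x", "y"]] * 3, sample_size=4)
diverse = cognitive_extent([[f"t{i}"] for i in range(10)], sample_size=5)
```

Fixing the sample size is what makes years comparable: a year with more papers does not mechanically get a higher count.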
In 2022, Milojević updated the data from her 2015 paper as part of an OECD workshop (report forthcoming). Below is the number of unique topics studied in a random sample of 10,000 topics across all of science, by year (which Milojević calls the cognitive extent).
Through the twentieth century, there was a general rise in the number of unique topics studied in a given sample of scientific titles. Over 1935-1975 this rise was a bumpy one, but it looks like we mostly reverted to trend, so the overall rate of change was steady over a long horizon. But sometime since the 1970s, this upward trend has very gradually slowed to a stop, and even began to reverse slightly. If counting topics is a good way to measure the successful growth rate of a field, then this indicates fields are having a harder time growing today than in the past.
Is counting topics in this manner sensible? Carayol, Lahatte, and Llopis (2019) provide some complementary evidence on the topics authors are choosing to study using a much more straightforward data source: the keywords authors supply to publishers to describe their own papers (some other elements of this paper are also discussed here and here). Unfortunately, this data is only available for 1999-2013. Fortunately, that is precisely the period when Milojević begins to observe the sharpest decline in unique topics. So do we see a similar thing when we look at the keywords authors use to describe their papers? Yes.
In the figure below, Carayol, Lahatte, and Llopis compare the growth rate of three different things, but we are most interested in the green dashed line and the blue line. The green dashed line shows growth in the number of publications over 1999-2013 by comparing annual publication to the number published in 1999. The blue line shows the growth in the total number of unique author-supplied keywords used to describe research, again by comparing the number in a given year to the number in 1999. This figure has a log scale, so that a straight line indicates constant exponential growth.
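For readers who want to reproduce this kind of chart, the normalization is just each year's count divided by the 1999 count, and the "straight line on a log scale" property can be checked directly. The counts below are invented (growing 10% per year) purely to show the mechanics.

```python
import math


def index_to_base(counts, base_year=1999):
    """Express each year's count relative to the count in `base_year`."""
    base = counts[base_year]
    return {year: counts[year] / base for year in sorted(counts)}


# Invented counts growing at a constant 10% per year.
toy_counts = {1999 + i: 1000 * 1.1 ** i for i in range(4)}
indexed = index_to_base(toy_counts)

# Under constant exponential growth, the year-to-year change in the log of the
# index is the same every year, which is why the series plots as a straight line.
log_steps = [
    math.log(indexed[year + 1]) - math.log(indexed[year])
    for year in range(1999, 2002)
]
```

A series growing more slowly than exponentially, like the keyword count described here, would show shrinking log steps instead.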
These two curves have begun to diverge. Whereas the growth in the total number of publications has been steady and exponential, growth in the total number of unique keywords has been slower than exponential. That seems broadly consistent with Milojević’s finding that the number of unique topics for a fixed sample size also stopped growing over this period. In other words, in terms of the number of topics being tackled by science, growth of fields is proceeding more slowly today than in the past.
Citations to Recent Papers
Another approach we could take is to compare the citations received by papers over time. If older papers made bigger discoveries than younger ones, then we might expect them to hold on longer and be more highly cited than new papers. One simple way to assess this is to look at the share of citations made in each year to new papers. A common measure of this is the Price index (named for Derek de Solla Price, not the cost of a good), which computes the share of citations made to papers published in the last 5 years (or 10 years in some variants).
Below, Larivière, Archambault, and Gingras (2007) compute the Price index for all papers on Thomson Scientific over the period 1900-2004. They compute two versions of the index. The 20-year index divides the number of citations to papers published in the preceding 5 years by the number of citations to papers published in the preceding 20 years. The 100-year index does the same thing, but dividing by the number of citations made to papers published in the preceding century.
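The two versions of the index differ only in the denominator, which a short sketch makes explicit. The function below is an illustration under assumptions: references are given as a flat list of cited publication years, and a paper counts as published "in the preceding N years" when its age is between 0 and N inclusive. The exact boundary convention used in the paper may differ.

```python
def price_index(citing_year, cited_years, recent_window=5, total_window=20):
    """Citations to papers at most `recent_window` years old, divided by
    citations to papers at most `total_window` years old.

    `cited_years` holds the publication year of every reference made by
    papers published in `citing_year`.
    """
    ages = [citing_year - year for year in cited_years]
    recent = sum(1 for age in ages if 0 <= age <= recent_window)
    total = sum(1 for age in ages if 0 <= age <= total_window)
    return recent / total if total else float("nan")


# Invented reference list for papers published in 2004.
toy_refs = [2003, 2001, 1995, 1990, 1950]
index_20 = price_index(2004, toy_refs)                     # 2 recent out of 4 -> 0.5
index_100 = price_index(2004, toy_refs, total_window=100)  # 2 recent out of 5 -> 0.4
```

The 100-year version is always at most the 20-year version, since widening the denominator can only add older citations.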
I think this figure is best understood as describing two periods. From 1900-1955 there was a general increase in the share of citations made to recent work, interrupted by the two world wars. Each world war imposed big disruptions on the production of new scientific work (see the first figure in this post), which had the secondary effect of reducing the share of citations to recent work. But since 1955, the share of citations made to new work has fallen dramatically, by an amount comparable to the distortions caused by the world wars (though spread out over many decades).
Cui, Wu, and Evans (2022) document that this trend has continued until 2014, and is not driven by any single field. It appears to be an almost universal phenomenon. In the figure below, they calculate the share of (all) citations made to work published in the preceding 10 years. It’s down across the board.
This looks pretty alarming, but there are explanations besides a sharp decline in size of discoveries. Whenever the share of something goes down, there are two possible causes: it could be the numerator goes down and/or the denominator goes up. And in this case, it’s mostly the latter. The raw number of citations to recent work doesn’t actually seem to have fallen by very much, at least, according to Larivière, Archambault, and Gingras (2007). But the total number of citations papers make has gone way up, and most of that increase has been citations to older work.
So the real question is less “why have researchers stopped citing new work” and more “why are researchers citing old work at such a high rate.” One explanation is that older work contains the bigger discoveries, and we’re still living in their shadow. But another explanation, put forward by Cui, Wu, and Evans, is simply that the scientific labor force is aging and older scientists are more familiar with older papers. Older scientists might push up the share of citations to older papers either by citing them in their own work, or by insisting on them being cited in other people’s work when they serve as peer reviewers. This age dynamic could also be a factor in the persistence of top-cited papers of the past. But whether this is about the size of discoveries or an aging scientific labor force’s preferences (and the two explanations are not mutually exclusive either), the more distant past is increasingly influential in contemporary science.
So let’s close by looking at one more data source, which is somewhat immune to some of the strategic citation factors unique to academia.
Patent Citations to Recent Papers
Patents also cite academic papers, where relevant. While patents have their own issues, one virtue is that inventors face a different set of incentives than academics. Whereas academics might cite papers or choose to work on topics that they know will be viewed favorably by their older peers, an inventor doesn’t face these constraints. In principle, they might just rely on whatever science is most useful for the purposes of getting some technology to work. Moreover, we have some good evidence science really is a useful input to technology, and that this is reflected tolerably well in citations to science. So if inventors are not citing recent science as much as in the past, that’s another indicator that recent science may be struggling to make comparable discoveries to the papers of the past.
I have not been able to find any papers that look at the extent to which patents cite recent academic research (if you know of such work, please send it my way and I can update this). Fortunately, Marx and Fuegi have a publicly available database of patent citations to academic work, which I used to put together the following figure. In each year of the following figure, I pull all citations made to academic papers by US patents whose application was filed in that year, and that were eventually granted within the next five years. I then compute the share of these citations that go to papers published in the preceding five years. Essentially, this is the Price index, discussed in the previous section, but applied to the citations of patents. I started the figure in 1975, as citations to academic papers were quite rare before then.
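A minimal sketch of this calculation might look like the following. The field names (`application_year`, `grant_year`, `paper_year`) are hypothetical and do not correspond to the actual columns of the Marx and Fuegi data, and the inclusive window boundaries are an assumption; the toy rows are invented.

```python
from collections import defaultdict


def patent_price_index(citations, recent_window=5, grant_lag=5):
    """Share of patent-to-paper citations that go to recent papers, by application year.

    `citations` is an iterable of dicts with hypothetical keys
    `application_year`, `grant_year`, and `paper_year`, one per citation.
    Citations from patents granted more than `grant_lag` years after filing
    are dropped.
    """
    recent = defaultdict(int)
    total = defaultdict(int)
    for row in citations:
        app_year = row["application_year"]
        if row["grant_year"] - app_year > grant_lag:
            continue
        total[app_year] += 1
        if 0 <= app_year - row["paper_year"] <= recent_window:
            recent[app_year] += 1
    return {year: recent[year] / total[year] for year in sorted(total)}


# Invented toy citations, not real patent data.
toy_citations = [
    {"application_year": 2000, "grant_year": 2003, "paper_year": 1998},  # recent
    {"application_year": 2000, "grant_year": 2003, "paper_year": 1980},  # old
    {"application_year": 2000, "grant_year": 2008, "paper_year": 1999},  # granted too late
    {"application_year": 2001, "grant_year": 2002, "paper_year": 2000},  # recent
]
patent_index = patent_price_index(toy_citations)   # {2000: 0.5, 2001: 1.0}
```

Restricting to patents granted within a fixed lag keeps later application years comparable with earlier ones, since recent applications have had less time to be granted.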
This figure echoes what we see in the academic Price index: citations to recent work have become increasingly less common. Unlike the academic work though, this is not entirely a story of rising citations to older work and steady citations to new work. Instead, we are actually seeing a decline in the number of citations to recent work. Among the subset of patents that cite academic work, the average number of citations to papers published in the five years prior to the patent’s filing date dropped from about 4 around 2000 to under 3.5 by 2015.
The Big Picture
Stepping back, I’m claiming that science is getting harder, in the sense that it is increasingly challenging to make discoveries that have comparable impact to the ones in the past. Diverse groups - the Nobel nominators, contemporary surveyed scientists, academics, and inventors - all seem to have an increasing preference for the work of the past, relative to the present. And looking at growth in the number of topics covered by scientists also suggests it has become harder to make forward progress. To close, I’ll add two more arguments.
First, we should expect science to get harder because of the “burden of knowledge.” The basic idea is, almost as a tautology, making new discoveries requires new knowledge; otherwise the discovery would likely already have been made. Whenever new knowledge is discovered, it opens the way for new discoveries, and it may also displace or make obsolete some older knowledge. But if new knowledge does not entirely displace old knowledge, then it may be you need steadily more knowledge to make new discoveries. Unfortunately, I think we have quite good evidence this seems to be the case as a general (though probably not absolute) rule, much of it reviewed here. All else equal, if you need more and more knowledge to make a discovery of a given size, then you can probably expect discoveries of a given size to require more time or manpower to bring about. And the evidence of this post suggests that is what we have in fact observed.
Second, if it’s true that innovation in other domains has in fact gotten harder (and I think it has), then to the extent scientific discovery and other forms of innovation are a similar process we shouldn’t be surprised when what applies to the former also applies to the latter. As noted at the outset of this post, if you pick some metric of technological progress, odds are that a constant rate of progress along that metric is accompanied by rising R&D effort. Why should science be so different?
Each of these pieces of evidence has holes in it. But I think they are not the same holes. Stack them all up, and I think you get an argument that can begin to hold water.
A few extra notes:
In an appendix, I briefly list some additional possible explanations for why science might be harder.
A good overview of some of the newer metrics for quantifying science, which helped me in drafting this post, is Wu et al. (2021).
I have turned off comments on substack for this post, but if you would like to comment or discuss it, I invite you to do so at the new Progress Forum, which I think is a better permanent home for comments. This link goes to this post on progress forum.
If you want to chat about this post or innovation in general, let’s grab a virtual coffee. Send me an email at mattclancy at hey dot com and we’ll put something in the calendar.
New Things Under the Sun is produced in partnership with the Institute for Progress, a Washington, DC-based think tank. You can learn more about their work by visiting their website.
[1] Note that decades are when discoveries are made, not when awards are given.
[2] While it’s true that sometimes titles are not informative,* Milojević gets broadly similar results when she uses abstracts instead.
*economists tend to be bad on this front