Importing Knowledge


Nations tend to develop specialties in different areas of science and technology. When one country has exceptional expertise that another lacks, what happens when knowledge workers migrate from the one to the other?

Three Case Studies

Let's start with some case studies. Moser, Voena, and Waldinger (2014) study (mostly Jewish) German chemists who were dismissed from academic posts in the 1930s when the Nazis came to power. About 17% of all German and Austrian chemistry professors were dismissed in this way, and of this group 26 appear to have emigrated to the USA. Moser, Voena, and Waldinger identify fields of chemical innovation where the emigrants were strong by looking at their patents. They find US patenting in these fields increased significantly after 1933, when the German emigrants began to arrive in America, compared to the fields in which non-migrating German chemists were strongest.

Half a century later, Russian scientists migrated to Germany when the Soviet Union collapsed. Ferrucci (2020) uses a methodology similar to Moser, Voena, and Waldinger's to study the impact of this influx on German patenting. Ferrucci uses Soviet patenting abroad (there wasn't a patent system within the USSR) to identify fields in which the USSR had comparative technological strength. Broadly speaking, this strength lay in technology classes like "physics" and "electricity", rather than chemistry. After 1991, when Russian scientists began to arrive in Germany, there is an uptick in German patenting in fields of Soviet strength, as opposed to other fields.

Finally, Choudhury and Kim (2018) exploit a quirk of US immigration law to estimate the impact of Chinese and Indian immigration on innovation based on traditional herbal knowledge. Many firms would like to hire more knowledge workers from abroad but are constrained by the cap on highly skilled migrant visas issued each year, which was 65,000 per year prior to 1999. In that year, driven by the dot-com bubble, the cap was raised to 115,000 per year, before dropping back to 65,000 in 2004. During the period 1999-2004, firms therefore had greater scope to hire foreign knowledge workers. At the same time, a group of firms and other institutions (mostly but not exclusively universities) are exempt from this visa cap. So Choudhury and Kim compare the patent output of firms that were affected by the visa cap expansion to those that weren't.

They are interested in the traditional herbal knowledge brought to America by Indian and Chinese migrants, so they build a database of patents that disclose the use of herbs in their title or abstract. They find the number of herbal patents sought by firms that get to hire more migrants increases sharply when the visa cap is lifted and drops sharply when the cap is tightened, whereas herbal patenting by exempt organizations does not change.

A General Result

Across time and space, these three case studies all tell the same basic story: when country X has technological strength in a certain area, country Y gets stronger in that area after scientists and engineers immigrate from X. It turns out this story can be generalized beyond these three special cases.

To see if this trend is general, Bahar, Choudhury, and Rapoport (2020) look at 95 countries and 651 different technological categories. Among other things, they look at the probability that a country experiences a "take-off" in patenting in a particular technology category after it receives more immigrants with strength in that field. Here, a "take-off" is defined as going from zero patents in that field to having a greater than average share of patents in that field. For example, if the typical country has 5% of its patents in nanotechnology, a country that goes from having 0 nanotechnology patents to more than 5% of its patents in nanotechnology is one that experiences a take-off. It signifies going from 0 to an “above average” national focus on that technology.
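To make the "take-off" definition concrete, here is a minimal sketch with hypothetical patent counts (my illustration, not the authors' actual data or code):

```python
# Illustrative sketch of the "take-off" definition, using made-up numbers.

def patent_share(counts, field):
    """Share of a country's patents that fall in a given field."""
    total = sum(counts.values())
    return counts.get(field, 0) / total if total else 0.0

def is_takeoff(before, after, field, average_share):
    """A take-off: zero patents in the field in the earlier period, and an
    above-average share of patents in the field in the later period."""
    return before.get(field, 0) == 0 and patent_share(after, field) > average_share

# A country with no nanotech patents in one decade...
before = {"nanotech": 0, "chemistry": 40, "machinery": 60}
# ...and 8% of its patents in nanotech the next decade...
after = {"nanotech": 8, "chemistry": 42, "machinery": 50}
# ...experiences a take-off if the typical country's nanotech share is 5%.
print(is_takeoff(before, after, "nanotech", average_share=0.05))  # True
```

Note the asymmetry built into the definition: a take-off requires starting from literally zero patents in the field, not merely a below-average share.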

Take-offs are rare. Over the course of a decade, the probability a given technology field in a given country experiences a take-off is just 2.2%. But Bahar, Choudhury, and Rapoport estimate doubling the number of immigrants who come from countries that are strong in a particular field raises this small probability by nearly a quarter, to 2.7%. And note that "doubling" the immigrant population isn't an unreasonable goal - the typical number of immigrants strong in a particular technology field is just 24.

These results are not driven by one or two countries. They appear to be general. But we might also worry that this is a spurious correlation. For example, if a country is planning to pump a bunch of resources into nanotechnology, then it might pull in immigrant scientists who are also experts in nanotechnology. That might create a correlation between immigration and subsequent take-off, but it might be that the take-off would have happened regardless of the immigration (since they were pouring money into the field anyway).

So what you want to do is separate out immigration that’s pulled in by these possible technology opportunities from immigration that happens for other reasons. One approach they take is to exploit variation in the number of immigrants that is driven by historic immigration trends, since the pull of family (not technology policy) drives a lot of immigration decisions. When they do this, they get results that are consistent with their earlier "naive" estimates.

Bigger Than Expected Impact

So this looks to be a more general point: if you want to develop innovative capacity in a new field, one way to do that is to bring in people from abroad who already have expertise in that area. Not too surprising! What is surprising is the size of the effect: it's frequently bigger than the direct contributions of the immigrants themselves.

In Moser, Voena, and Waldinger’s study of German chemists fleeing Nazi Germany, 157 US chemistry patents can be directly attributed to these migrants. But after we exclude those patents, the number of US patents in fields where German immigrants were strong nearly doubled after their entry, from about 150 to 290 per year per class. In contrast, the number of patents in technology classes where non-immigrating German chemists were strong only rose from 220 to 250 per year.

In Choudhury and Kim’s study of Indian and Chinese migrants, the number of new herbal patents increased from under 20 per year to more than 50 per year for firms that were able to hire more migrants. And yet, herbal patents from inventors with typically Chinese and Indian names (which they read off the patent documents) only increased by at most 15 per year. And the annual number of herbal patents remained higher when the number of visas dropped back down to 65,000: while it fell, it didn’t return to pre-1999 levels. The immigration appears to have triggered an increase in herbal patenting that extended beyond the activities of the guest workers.

At least in these two cases, the effect of immigration on innovation was greater than the direct contributions of the immigrants themselves (we don’t know who is responsible in the other studies, but we'll see later this is probably a general point too). One reason this is surprising is that we might expect talented immigrant inventors to displace less talented American inventors. That would tend to mean the net effect of immigration would be less than the direct contributions of the immigrants themselves, since it would induce a decline in patenting by US-born inventors.

And in fact, this appears to happen! Moser, Voena, and Waldinger also look at the fate of Americans who were patenting in areas of German chemical strength prior to the arrival of the Germans. These incumbents were indeed less likely to patent in the same areas after the influx of talent. Instead, the rise in patenting seems to have come mostly from new inventors in the field.

Similarly, Ferrucci looks at what happens to German inventors who were previously active in fields of Soviet strength. He finds that after collaborating with Russian immigrant scientists, these inventors are more likely to change the technological fields in which they work.

So maybe immigrants bring in new and better knowledge, and that tends to make the knowledge of incumbents obsolete. The thing about knowledge though is that it spreads from mind to mind. What was once the advantage of an immigrant slowly becomes the advantage of all.

Seeds of Knowledge

You can see evidence of this in a few places. Bernstein et al. (2019) cleverly infer the immigration status of hundreds of thousands of US patentees by matching them (via their name and address) to the Infutor database. This database includes the birth year of millions of people, and a fragment of their social security number. Because social security numbers are assigned according to a known formula, the authors can use the social security fragment to infer the year the number was assigned to a patentee. And since they also know the year you were born, they can infer the age you were when you received your social security number. Most US-born citizens are assigned a social security number when they are born or when they work their first job, typically as teenagers. They show the vast majority of people who are assigned a social security number in their 20s are people who immigrated to the USA.
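Once you know the year each number was assigned, the classification step is simple. Here is an illustrative sketch; the age-20 cutoff and the inputs are stand-ins for the paper's more careful procedure:

```python
# Illustrative sketch of the inference in Bernstein et al. (2019). The hard
# part (recovering the assignment year from the SSN fragment) is assumed done.

def age_at_assignment(birth_year, ssn_assignment_year):
    """Age when the social security number was assigned."""
    return ssn_assignment_year - birth_year

def likely_immigrant(birth_year, ssn_assignment_year):
    """Most US-born citizens get an SSN at birth or as teenagers; assignment
    in one's 20s or later is a strong signal the person immigrated."""
    return age_at_assignment(birth_year, ssn_assignment_year) >= 20

print(likely_immigrant(1960, 1961))  # False: assigned in infancy
print(likely_immigrant(1960, 1987))  # True: assigned at age 27
```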

Armed with evidence on the migration status of tons of inventors, they show the patents of immigrants are a bit more likely to cite foreign patents than the patents of US-born inventors. Specifically, the share of citations to foreign patents is about 3 percentage points higher for immigrant inventors. If we trust patent citations are signals of actual knowledge transfer (and they probably are, at least a bit), then immigrants incorporate knowledge of foreign work into their own inventions at a higher rate than US-born inventors.

And that knowledge probably transfers to US-born inventors. Most patents involve teams of inventors, and more than half of the inventors that immigrants work with are not themselves immigrants.

Another paper looks in more detail at the effect of immigrants on citations, but in the context of academic work, rather than patents. That may actually be better than the data on patent citations, since there are some potential issues with patent citations that are not present for academic work. Ganguli (2015) also studies the collapse of the USSR, but looks at how citations to Soviet research change when Russians migrate to the USA. She has two main approaches.

In the first, she looks at how many citations are made to Soviet-era scientific literature by the non-Russian residents of different US cities before and after the collapse of the USSR. On average, for each newly arrived Russian immigrant scientist, a city makes another 8 citations (per year) to Soviet-era scientific literature. The idea here is that an immigrant scientist moves to a US city and the residents there learn about Soviet-era research that was out there but unknown to them.

Second, she looks at how migration affects citations received by an individual scientific article. She matches migrating scientists to a set of control scientists who are similar along a bunch of dimensions and looks to see what happens to the citations their Soviet era work receives from US (again, non-Russian) residents after they migrate. The effect is small, but it's there: migration buys each of your papers about 0.03 extra cites per year from US researchers, relative to the control group that didn’t migrate.

Access to new knowledge for domestic inventors is one way the innovation impact of immigration may be so big. Another is via the combination of existing and new knowledge. To the extent that innovation is about combining old ideas in new ways, teams composed of immigrant scientists and US-born scientists with different knowledge may be able to create innovations unavailable to either group on their own. Choudhury and Kim find evidence of this: patents that mention both an herbal compound and a synthetic compound (for which the US pharmaceutical industry is presumed to have a comparative advantage) are more commonly invented by mixed teams of immigrants and US-born workers, as compared to teams composed entirely of immigrants or entirely of US-born workers.

To close, let's look at a grim but illuminating study. As noted above, Bernstein et al. (2019) have constructed a unique dataset of US inventors who migrated to the USA no earlier than their 20s. Though this group is only 16% of inventors in their dataset, they account for 22% of all patents over 1976-2012 (and an even larger share, when you attempt to adjust for the quality of patents). So the direct contribution to innovation of this group is already quite large. But they additionally seem to have a large and positive effect on their US collaborators.

To show this, the authors look at what happens to an inventor's collaborators (the other inventors they are listed on patents with) when the inventor suffers an early death, where an early death is one that occurs prior to the age of 60. For every inventor who dies early, they match them to a second inventor who did not die early, but was similar to the deceased in other ways up to that point in their life: immigration status, age in the year of death, patenting history, number of co-inventors, etc. Now they have two inventors who are similar in a lot of ways, except one of them died early. They then identify all the collaborators of these people and look to see what happens to them in the years after one inventor dies and another lives. They find those who previously worked with the deceased take out fewer patents every year, as compared to those who worked with a similar inventor who did not die.

But when they split their results up by immigration status, they find the effect is stronger for immigrants. On average, when an immigrant collaborator dies early, US inventors produce 0.4 fewer patents per year thereafter, compared to those whose coauthor did not die. This effect lasts for at least a decade, and seems to get stronger over time. For comparison, US inventors working with a non-immigrant inventor produce 0.1 fewer patents per year after the inventor passes away.

In other words, US inventors benefit a lot from the ability to collaborate with immigrant inventors. And one way to see that is to see how many fewer patents they take out when they lose the ability to collaborate (or even just talk with) that inventor.

So why is the impact of immigration on innovation so much greater than the activity of the migrants themselves? My take is that part of the story is the knowledge and ideas the migrating scientists learned in their home countries comes along for the ride, and then sets up house in the minds of the native born.

Are Ideas Getting Harder to Find Because of the Burden of Knowledge?


Innovation appears to be getting harder. At least, that’s the conclusion of Bloom, Jones, Van Reenen, and Webb (2020). Across a host of measures, getting one “unit” of innovation seems to take more and more R&D resources.

To take a concrete example, although Moore’s law has held for a remarkable 50 years, maintaining the doubling schedule (twice the transistors every two years) takes twice as many researchers every 14 years. You see similar trends for medical research - over time, more scientists are needed to save the same number of years of life. You see similar trends for agriculture - over time, more scientists are needed to increase crop yields by the same proportion. And you see similar trends for the economy writ large - over time, more researchers are needed to increase total factor productivity by the same proportion. Measured in terms of the number of researchers that can be hired, the resources needed to get the same proportional increase in productivity doubles every 17 years.
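A doubling time for required researchers translates into a constant annual rate of decline in research productivity. As a back-of-the-envelope calculation (my arithmetic, not a figure quoted from the paper):

```python
# Back-of-the-envelope: if maintaining a constant rate of progress requires
# twice as many researchers every `doubling_years` years, then ideas per
# researcher shrink at a constant annual rate of 2**(1/doubling_years) - 1.

def annual_productivity_decline(doubling_years):
    return 2 ** (1 / doubling_years) - 1

print(f"{annual_productivity_decline(14):.1%}")  # semiconductors: 5.1% per year
print(f"{annual_productivity_decline(17):.1%}")  # aggregate TFP: 4.2% per year
```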

There are lots of issues with any one of these numbers. I’ve written about some of them (on the recent total factor productivity slowdown here, and on agricultural crop yields here). But taken together, the effects are so large that it does look like something is happening: it takes more people to innovate over time.


The Burden of Knowledge

A 2009 paper by Benjamin Jones, titled The Burden of Knowledge and the Death of the Renaissance Man, provides a possible answer (explainer here). Assume invention is the application of knowledge to solve problems (whether in science or technology). As more problems are solved, we require additional knowledge to solve the ones that remain, or to improve on our existing solutions.

This wouldn’t be a problem, except for the fact that people die and take their knowledge with them. Meanwhile, babies are (inconveniently) born without any knowledge. So each generation needs to acquire knowledge anew, slowly and arduously, over decades of schooling. But since the knowledge necessary to push the frontier keeps growing, the amount of knowledge each generation must learn gets larger. The lengthening retraining cycle slows down innovation.

Age of Achievement

A variety of suggestive evidence is consistent with this story. One line of evidence is the age when people begin to innovate. If people need to learn more in order to innovate, they have to spend more time getting educated and will be older when they start adding their own discoveries to the stock of knowledge.

Brendel and Schweitzer (2019) and Schweitzer and Brendel (2020) look at the age of academic mathematicians and economists when they publish their first solo-authored article in a top journal: it rose from 30 to 35 over 1950-2013 (for math) and 1970-2014 (for economics). For economists, they also look at first solo-authored publication in any journal: the trend is the same. Jones (2010) (explainer here) looks at the age when Nobel prize winners and great inventors did their notable work. Over the twentieth century, it rose by 5 more years than would be predicted by demographic changes. Notably, the time Nobel laureates spent in education also increased - by 4 years.

Brendel and Schweitzer (2019) and Schweitzer and Brendel (2020) also point to another suggestive fact that the knowledge required to push the frontier has been rising. The number of references in mathematicians and economists’ first solo-authored papers is rising sharply. Economists in 1970 cited about 15 papers in their first solo-authored article, but 40 in 2014. Mathematicians cited just 5 papers in the 1950s in their debuts, but over 25 in 2013.

Outside academia, the evidence is a bit more mixed. In Jones’ paper on the burden of knowledge, he looked at the age when US inventors get their first patents and found it rose by about one year, from 30.5 to 31.5, between 1985 and 1998. But this trend subsequently reversed. Jung and Ejermo (2014), studying the population of Sweden, found the age of first invention dropped from a peak of 44.6 in 1997 to 40.4 in 2007. And a recent conference paper by Kaltenberg, Jaffe, and Lachman (2020) found the age of first patent between 1996 and 2016 dropped in the USA as well.

That said, there is some other suggestive evidence that patents these days draw on more knowledge - or at least, scientific knowledge - than in the past. Marx and Fuegi (forthcoming) use text processing algorithms to match scientific references in US and EU patents to data on scientific journal articles in the Microsoft Academic Graph. The average number of citations to scientific journal articles has grown rapidly from basically 0 to 4 between 1980 and today. And as noted in a previous newsletter, there’s a variety of evidence that this reflects actual “use” of the ideas science generates.

Splitting Knowledge Across Heads

But that’s only part of the story. In Jones’ model, scientists don’t just respond to the rising burden of knowledge by spending more time in school. They also team up, so that the burden of knowledge is split up among several heads.

The evidence for this trend is pretty unambiguous. The rise of teams has been documented across a host of disciplines. Between 1980 and 2018, the number of inventors per US patent doubled. Brendel and Schweitzer show the number of coauthors on mathematics and economics articles has also risen sharply through 2013/2014. And Wuchty, Jones, and Uzzi (2007) documented the rise of teams in scientific production through 2000.

We can also take inspiration from Jones (2010) and look at Nobel prizes. The Nobel prizes in physics, chemistry, and medicine have each been given to 1-3 people for most of the years from 1901-2019. When more than one person gets the award, it may be because multiple people contributed to the discovery, or because the award is for multiple separate (but thematically linked) contributions. For example, the 2009 physics Nobel was one half awarded to Charles Kuen Kao "for groundbreaking achievements concerning the transmission of light in fibers for optical communication", with the other half jointly to Willard S. Boyle and George E. Smith "for the invention of an imaging semiconductor circuit - the CCD sensor."

The figure below gives the average number of laureates per contribution over the preceding 10 years. For the physics and chemistry awards, there’s been a steady shift: in the first part of the 20th century, each contribution was usually credited to a single scientist; in the 21st century, there are, on average, two scientists awarded per contribution. In medicine, there was a sharp increase from 1 scientist per contribution to a peak of 2.6 in 1976; the average has slightly declined since then, though it remains above 2.
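The averaging behind that figure is straightforward. Here is a minimal sketch using made-up award data; the real calculation works from the actual Nobel records and splits each prize into its separately cited contributions:

```python
# Trailing average of laureates per contribution (hypothetical data, not
# the actual Nobel records).

def trailing_average(laureates_per_contribution, year, window=10):
    """Mean laureates-per-contribution across awards in (year-window, year]."""
    vals = [v for y, v in laureates_per_contribution.items()
            if year - window < y <= year]
    return sum(vals) / len(vals) if vals else None

# Hypothetical physics awards: laureates sharing each contribution.
data = {y: 1 for y in range(1901, 1911)}        # early years: solo laureates
data.update({y: 2 for y in range(2010, 2020)})  # recent: ~2 per contribution

print(trailing_average(data, 1910))  # 1.0
print(trailing_average(data, 2019))  # 2.0
```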

According to Jones, the reason for teams is that teams can bring more knowledge to a problem than an individual can. If that’s the case, then innovations that come from teams should tend to perform better than those created by individuals, all else equal. For both patents and papers, that’s precisely what Ahmadpoor and Jones (2019) find. For teams of 2-5 people, the bigger the team, the more citations the paper/patent receives (though the extent varies by field). Wu, Wang, and Evans (2019) also find that the bigger the team, the more highly cited are its patents, papers, and software code.

The Death of the Renaissance Man

By using teams to innovate, scientists and innovators reduce the amount of time they need to spend learning. They do this by specializing in obtaining frontier knowledge on an ever narrower slice of the problem. So Jones’ model also predicts an increase in specialization.

In Jones’ paper, specialization was measured by the probability that a solo inventor’s consecutive patents, applied for within 3 years of each other, fall in different technological fields. The idea is that the less likely inventors are to “jump” fields, the more specialized their knowledge must be. For example, if I apply for a patent in battery technology in 1990 and another in software in 1993, that would indicate I’m more of a generalist than someone who is unable to make the jump. Jones used data on 1977 through 1993, but in the figure below I replicate his methodology and bring the data up through 2010. Between 1975 and 2005, the probability a solo inventor patents in different technology classes, on two consecutive patents with applications within 3 years of each other, drops from 56% to 47%.

(While the probability does head back up after 2005, it remains well below prior levels and it's possible this is an artifact of the data - see the technical notes at the bottom of this newsletter if curious)
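For the curious, the jump measure can be sketched as follows, using hypothetical patent records rather than the actual patentsview pipeline:

```python
# Sketch of the specialization measure: among solo inventors, the share of
# consecutive patent pairs (applications within 3 years of each other) that
# fall in different primary classes. Hypothetical records for illustration.

def jump_probability(patents_by_inventor):
    """patents_by_inventor: {inventor: [(application_year, primary_class), ...]}"""
    jumps = pairs = 0
    for records in patents_by_inventor.values():
        records = sorted(records)  # order each inventor's patents by year
        for (y1, c1), (y2, c2) in zip(records, records[1:]):
            if y2 - y1 <= 3:       # consecutive patents within 3 years
                pairs += 1
                if c1 != c2:       # a "jump" across technology classes
                    jumps += 1
    return jumps / pairs if pairs else None

inventors = {
    "generalist": [(1990, "batteries"), (1993, "software")],
    "specialist": [(1990, "batteries"), (1992, "batteries")],
}
print(jump_probability(inventors))  # 0.5
```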

Schweitzer and Brendel exploit the JEL classification system in economics. These classifications can be aggregated up to one of 9 broad fields, and they look at the probability an economist hops from one field to another between two solo-authored publications appearing within 3 years of each other. Among all articles listed on EconLit, this probability fell by more than half, from 33% to 14%, between 1973 and 2014. Restricting attention to publications in top-ten journals, it fell even more sharply, from 28% to 0%(!) in 2014.

Lastly, let’s consider the Nobel prizes again. Since Nobel prizes are awarded for substantially distinct discoveries, winning more than one Nobel prize in physics, chemistry, or medicine may be another signifier of multiple specialties. Just three laureates have won more than one prize in physics, chemistry, or medicine: Marie Curie (1903, 1911), John Bardeen (1956, 1972), and Frederick Sanger (1958, 1980). If it can take as long as 25 years to go from a first prize to a second, then enough time has now passed to be confident that no laureate first awarded between 1959 and 1994 went on to win a second. There were 218 new laureates between 1959 and 1994, compared to 207 between 1901 and 1958. That means there were 3 multiple winners among the first 207 laureates, and 0 among the next 218.

Why are ideas getting harder to find?

Bloom, Jones, Van Reenen and Webb (2020) document that the productivity of research is falling: it takes more inputs to get the same output. Jones (2009) provides an explanation for why that might happen. New problems require new knowledge to solve, but using new knowledge requires understanding at least some of the earlier, more basic knowledge. Over time, the total amount of knowledge needed to solve problems keeps rising. Since knowledge can only be used when it’s inside someone’s head, we end up needing more researchers. And that’s precisely the dimension that Bloom et al. (2020) use to measure the declining productivity of research - it does take more researchers to get the same innovation.

A few closing thoughts.

First, while the evidence discussed above is certainly consistent with Jones’ story, stronger evidence would be nice. Most of the above evidence is about how things have changed over time. But we should also be able to see differences across fields. The story predicts fields with “deeper” knowledge requirements should have bigger teams and more specialization. Jones (2009) provides evidence this is indeed the case for patents, but as far as I know, no one else has updated his work or extended this line of evidence into academia and other domains.

Second, Jones’ model isn’t the only possible explanation for the falling productivity of research. Arora, Belenzon, Patacconi, and Suh (2020) suggest the growing division of labor between universities and the private sector in innovation may be at fault. As universities increasingly focus on basic science and the private sector on applied research, there may be greater difficulty in translating science into applications. Bhattacharya and Packalen (2020) suggest the incentives created by citation in academia have increasingly led scientists to focus on incremental science, rather than potential (risky) breakthroughs. Lastly, it may also be that breakthroughs just come along at random, sometimes after long intervals. Maybe we are simply awaiting a new paradigm to accelerate innovation once again.

Third, where do we go from here? Is innovation doomed to get harder and harder? There are a few possible forces that may work in the opposite direction.

If breakthroughs in science and technology wipe the slate clean, rendering old knowledge obsolete, then it’s possible the burden of knowledge could drop. In fact, Jung and Ejermo (2014) suggest this may be a reason why the age of first patent declined in the mid-1990s: digital innovation became relatively easy and did not depend on deep knowledge. It would be interesting to see if the three measures discussed above tend to reverse in fields undergoing paradigm shifts.

On the other hand, the burden of knowledge may, itself, make breakthroughs more difficult! As discussed in more detail in a previous newsletter, there is some evidence that teams are less likely to produce breakthrough innovations. This might be because it’s harder to spot unexpected connections between ideas when they are split across multiple people’s heads. In that case, the burden of knowledge can become self-perpetuating.

Alternatively, if knowledge leads to greater efficiency in teaching, so that students more quickly vault to the knowledge frontier, that could also reduce the burden of knowledge. Lastly, it may be possible for artificial intelligence to shoulder much of the burden of knowledge. Indeed, artificial general intelligence could hypothetically upend this whole model, if it disrupts the cycle of retraining and teamwork that is required of human innovators. I suppose we’ll know more in 20 years.

Technical Notes

For patent data, I use US patentsview data and their disambiguated inventor data. To calculate the probability of jumping fields, I use the primary US patent classification 3-digit class (as in Jones 2009). This patent classification system was discontinued in mid-2015, and it’s possible this is a contributing factor to the uptick observed after 2005. A patent applied for in 2006 only “counts” as a possible field jump if there was a second patent applied for before 2010 and granted before the classification system was discontinued in 2015. This selection effect might result in an increasingly unrepresentative sample of patents.

Is there a Price for a Covid-19 Vaccine?


Note: I’m experimenting with Substack’s audio features. This week you can read the newsletter below, or listen to me read it by clicking the link above. Thanks!

“No amount of real resources devoted to medical research would have helped European society in 1348 to solve the riddle of the Black Death.” - Joel Mokyr (1998)

There is no currently existing human vaccine for covid-19. Can we force one into existence by promising to spend a lot on it? Is there some price at which we can “buy” a covid-19 vaccine in the next year?

That’s the premise of a proposal by economists Susan Athey, Michael Kremer, Christopher Snyder, and Alex Tabarrok. They propose the US government commit in advance to paying a substantial price for a specified number of vaccine doses: something like $100 each for the first 300 million. The idea is that the potential of winning $30bn will induce pharma companies to pour resources into vaccine development.

Covid-19 and the Profit Motive

We have lots of reasons to believe that a promise to pay more for a covid-19 vaccine would induce more covid-19 vaccine work. Academic research is moving so fast these days that we already have good evidence that pharma companies are extremely responsive to profit signals around covid-19. Bryan, Lemus, and Marshall (2020) track the number of covid-19 therapies at any stage of development, as well as the number of academic publications related to covid-19, to produce this stunning figure:

The black line that is shooting off to the top of the chart is the total number of therapies or publications related to covid-19, as measured against the number of days since the beginning of the pandemic/epidemic. The various dashed lines correspond to the number of therapies and publications for other diseases and/or pandemics (Ebola, Zika, H1N1, and breast cancer). Two things are immediately apparent.

First, covid-19 research is much higher than research related to other pandemic diseases. Second, the gap between covid-19 and other diseases has widened as the magnitude of the covid-19 pandemic becomes clearer. It seems obvious these differences are entirely driven by the difference in demand for a covid-19 therapy, both relative to other drugs and over time, rather than some scientific breakthrough that made it suddenly easier to do covid-19 research. So the above figure is strong evidence that pharma companies respond to profit opportunities and would probably respond further if the government promised to buy a working vaccine at a higher price than the market would normally support.

But dig into the data a bit deeper and there is something troubling. While a vaccine would be the most useful therapy, an unusually large share of the therapies under development are drugs, rather than vaccines. And while it would be nice if an existing drug turned out to be a useful therapy for covid-19, it seems more likely a new disease will require a new kind of drug. But repurposed drugs, rather than novel therapies, account for an unusually large share of trials.

This difference has grown over time, as the scope of the pandemic widened. And the divergence between vaccines vs. drugs, and novel drugs vs. repurposed ones, is significantly larger for covid-19 than for Ebola, Zika, and H1N1.

This suggests the rising profitability of a covid-19 treatment is pushing ever more firms to focus on therapies that are not necessarily the best treatment for the disease, but which are most likely to get to market soon. Vaccines tend to be harder to develop than drugs, and novel drugs tend to be harder than repurposing existing ones.

It turns out the above evidence is quite consistent with existing research on how medical research responds to market demand. We have good evidence that government promises to pay more for vaccines would likely induce more vaccine research. But the evidence we have also suggests such a policy is most effective at bringing to market a vaccine that does not require much more R&D (but read the ending of this newsletter for caveats).

Markets for Vaccines

The kind of program Athey, Kremer, Snyder, and Tabarrok are proposing is called an Advance Market Commitment, and it’s been successfully tried before. In 2007, a coalition of governments and the Gates Foundation pledged $1.5bn towards the production of 200 million annual doses of a pneumococcal conjugate vaccine for developing countries. If a manufacturer agreed to supply the vaccine at a price of no more than $3.50 per dose, the advance market commitment would top up that price with a share of the $1.5bn pledged. The program launched in 2009, and in 2010 GSK and Pfizer each committed to supply 30 million doses annually. This amount increased over time, and a third supplier entered in 2019. Annual distribution exceeded 160 million doses by 2016.

Uptake of the pneumococcal vaccine was much faster than uptake of a vaccine for a different disease (rotavirus) that had no advance market commitment. So the advance market commitment seems to have worked.

But there's an important caveat to all this: very little R&D was required to develop the pneumococcal conjugate vaccine. When it was selected, vaccines for similar diseases in developed countries already existed, and vaccines covering the strains in developing countries were already in late-stage clinical trials. So in this case, the advance market commitment pushed firms to quickly build up manufacturing and distribution capacity, but it didn’t push them to do extensive R&D since none was needed.

This is the only time a large-scale advance market commitment has been tried. But that’s not the only place we can look for evidence.

Finkelstein (2004) identifies three US policy changes that increased the profitability of vaccines for some diseases but not others. She then looks to see if firms respond by creating more new vaccines for the affected diseases, relative to the unaffected diseases. Indeed, they do. Let's dig in a bit more.

The three policies Finkelstein uses are (1) the 1991 CDC recommendation that all infants be vaccinated against Hepatitis B; (2) the 1993 decision for Medicare to fully cover the cost of influenza vaccination for Medicare recipients; and (3) the 1986 creation of the Vaccine Injury Compensation Fund, which indemnified vaccine manufacturers against lawsuits relating to adverse effects for some specified vaccines. In each of these three cases, policy choices made vaccines for some diseases more profitable, but had no effect on other diseases.

As a control group, Finkelstein considers various sets of alternative diseases that were not affected by these policies, but which otherwise share some of the same characteristics as the affected diseases. All told, she has data on preclinical trials, clinical trials, and vaccine approvals for 6 affected diseases and control groups consisting of 7-26 other diseases, over 1983-1999.

Diseases where policy increased profitability saw an additional 1.2 clinical trials per year and an additional 0.3 new approved vaccines per year (but only 7 years after the policy took effect), as compared to controls. So the promise of more profit did pull in more vaccine development.

But the effect only travels so far up the research stream. When Finkelstein looks farther up the development pipeline, the effect disappears. Affected diseases had no more preclinical trials than the control group. This suggests firms responded to the increased profit by pulling vaccines already far along off the shelf and putting them into clinical trials. But if it stimulated more basic research, the effect was too small to be detected.

Markets for Drugs

There is also a rich vein of research on the extent to which general pharma R&D (not just vaccines) responds to changes in the size of the market for different health products. Dubois, de Mouzon, Scott-Morton, and Seabright (2015) look at the link between potential profits and innovation in the context of global pharmaceutical innovation. They have data on drug sales in 14 major countries, which they use to estimate the size of the market for different categories of therapeutic medicine. Their goal is to see how changes in the size of the market for a drug change the propensity to develop new drugs for that market. In this case, they hold the measure of innovation to a relatively high bar: a newly approved drug, marketed in one of their 14 countries, that is also a new chemical entity (i.e., not a modification of an existing drug).

One challenge is that better drugs can, themselves, change the size of the market. Suppose, for example, that new drugs just come along randomly as a result of serendipity. In that case, potential profit doesn't actually induce firms to develop new drugs. But if these new drugs find a market, and we're measuring the size of the market by looking at spending on drugs, then we'll see a misleading correlation between the "size" of the market and the number of new drugs. In this case, the number of drugs is "causing" the size of the market, rather than vice versa. To avoid this, they use a statistical technique (instrumental variables) to isolate the parts of demand that vary due to demographics and overall GDP growth (neither of which should be affected by drug innovation over the 11-year period they work with).

When they do this, they find that bigger markets do indeed lead to more drugs. On average, when the market for a therapeutic category grows by 10%, there are 2.6% more new chemical entities approved over a given time period.

But how scientifically novel are these new drugs? Suggestive evidence comes from Acemoglu and Linn (2004), who perform a similar exercise to Dubois, de Mouzon, Scott-Morton, and Seabright (2015), but on US rather than global sales data. When the market for different diseases in the US changes due to shifting demographics, how does this change the flow of new drug approvals for those diseases? Acemoglu and Linn find the effect of a bigger market is much, much stronger for generic drugs than for new molecular entities.

More direct evidence comes from Dranove, Garthwaite, and Hermosilla (2020), who investigate this question in the context of global drug development over 1997-2018. They use the US Medicare Part D extension to see if the promise of higher profits led firms to pursue more scientifically novel drugs.

The basic idea is that Medicare Part D extended Medicare to cover enrollees' pharmaceutical drugs beginning in 2006. This created a big new market for drugs used by Medicare enrollees (US residents aged 65 and up). Dranove, Garthwaite, and Hermosilla have data on worldwide pharmaceutical company drug trials, and they want to see if companies run more trials on scientifically novel drugs in response to the new opportunities created by Medicare Part D.

To measure the scientific novelty of a drug, Dranove, Garthwaite, and Hermosilla count the number of times the specific "target-based action" of the drug has been explored in previous drug trials (of similar or stronger intensity). A target-based action comprises the specific (targeted) biological entity and the mechanism used to modify its function: for example, a p38 MAP kinase inhibitor is a target-based action that targets the p38 mitogen-activated protein kinases and inhibits their function. If this target-based action has never before been used in a clinical trial, then a drug using it is considered maximally novel. The more often it has been used before, the less novel the drug.

With this measure in hand and data on 76,161 clinical trials on 36,002 molecules, Dranove, Garthwaite, and Hermosilla look to see if therapeutic areas with greater profit potential in the wake of Medicare Part D see more clinical trials for scientifically novel drugs. While they do find that more exposed therapeutic areas do see a small increase in trials for the most novel kinds of drugs, once again the effect is much stronger for the least novel drugs. Over 2012-2018 the number of trials for the least novel group of drugs increased 106%, while the number of trials for the most novel group increased just 14% (with most of the gains coming in the second half of that period).

Can We Buy a Covid-19 Vaccine?

So back to covid-19. Can an advance market commitment “buy” a vaccine that doesn’t yet exist? Or are we in the same position as Joel Mokyr’s medieval kings, whose wealth can’t buy any treatment for the bubonic plague until someone thinks of the germ theory of disease?

First, the studies above suggest these policies do work, but are most effective if the vaccine does not require too much more R&D. Does a covid-19 vaccine require a lot more research? I don’t know. On the one hand, there hasn’t been a human vaccine for this class of virus before. On the other hand, there have been vaccines for veterinary applications (innovation in human and animal health has a lot of similarities), and there seems to be no shortage of options.

Second, the size of the proposed policy is enormous relative to what’s been tried before. So even if these policies normally only work weakly on vaccines that are far from approval, it may be that we still observe a large effect simply because we’re pouring so much money into it.

Third, one of the goals of the Athey, Kremer, Snyder, Tabarrok proposal is explicitly to build manufacturing capacity for vaccines before they are proven, so that we can mass produce them as soon as we find one that works. To the extent that building capacity is not a problem requiring R&D, advance market commitments should work very well. In general, an advance market commitment is only one (albeit big) part of a set of complementary incentives the authors recommend to push and pull a vaccine to market. Give the whole thing a read!

Optimal Kickstarter

Buterin, Hitzig, and Weyl's proposal to fund public goods

Suppose there was a website called Optimal Kickstarter. Like the actual Kickstarter, it lets people propose projects to be funded, and it lets people crowdfund projects. But there are two differences.

First, Optimal Kickstarter only funds public goods: successful projects are made freely available to all. Second, Optimal Kickstarter has a benevolent patron who supplements the contributions of the crowd in a pre-specified formula. When you fund a project through Optimal Kickstarter, it essentially lets you buy support for a project with someone else’s money.

The way the website works is that every project has a calculator to explain how much patron money you can purchase for the project. For example: maybe there’s an open source software program that you think would be really useful. It is currently funded at $88,209. The site tells you that you can increase the funding of this project by $595 for the price of $1, by $1,333 for $5, by $1,888 for $10, and so on (you need a calculator because the price of patron money is not constant).

Why is it called Optimal Kickstarter? Because the formula determining how much of the patron’s money you can “buy” for a project is from a paper by Buterin, Hitzig, and Weyl (2019) (preprint), which optimally funds public goods.

The Trouble with Public Goods

Why so complicated? Public goods are projects that are, by their nature, enjoyable by many users at once and costly to exclude people from using them. Think open source software, art, research, national defense, herd immunity, etc. These goods cannot be efficiently provided by the private market. Since the good can be simultaneously enjoyed by many at once, efficient provision would make it free. But if the good is given away for free, it is impossible to cover the fixed costs necessary to create it. And if you try to charge a price greater than zero (to cover fixed costs), it’s doubly inefficient since it’s costly to prevent people from accessing the good. You have to waste more effort wrapping the project in IP, DRM, etc.

Such goods can be crowd-funded, but under standard economic models, they’ll be drastically underfunded. This is because individuals only consider the private benefits they derive from contributing, not the benefits that accrue to other users. Suppose, for example, that for $1,800 programmers can optimize the program mentioned above and make it run 50% faster. Let’s say the value of that is $10 per person. If there are 180 users, the total value of the optimization justifies the cost.

But each user only derives $10 of value themselves, so no one has an incentive to provide the full $1,800 worth of funding. All the users could band together and provide $10 each, but if contribution is voluntary, this doesn’t solve the problem either. For any individual, if they withhold their $10, the project only raises $1,790. That’s nearly $1,800, enough to optimize the program by nearly 50% and generate, say, $9.95 worth of value for every user. Given this, a selfish individual should just free-ride on the contributions of others, enjoying $9.95 in value for free, instead of spending $10 to get $10 in value. But it doesn’t stop there. The next user can also save $10 and enjoy a bit less than $9.95 in value by not donating. Indeed, this logic continues all the way down the chain, and the project ends up with a tiny fraction of the funding it should have.
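The free-rider arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not a model from any of the papers discussed: the assumption that each user's benefit scales linearly with funding (reaching $10 per user at full funding) is mine, and with it the free-rider's value comes to about $9.94, which the text rounds to $9.95.

```python
# Assumption (for illustration only): each user's benefit scales linearly
# with project funding, reaching $10 per user at full funding of $1,800.
def user_value(funding, full_funding=1800.0, full_value=10.0):
    return full_value * min(funding, full_funding) / full_funding

users, pledge = 180, 10.0

# Net payoff if everyone, including you, contributes $10:
contribute = user_value(users * pledge) - pledge      # $10 value - $10 cost
# Net payoff if you alone withhold your $10:
free_ride = user_value((users - 1) * pledge)          # just under $10, free

print(contribute)           # 0.0
print(round(free_ride, 2))  # 9.94
```

Since `free_ride > contribute`, the selfish move is always to withhold, and the same logic applies to every user in turn.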

Solving the Public Goods Problem

But notice that Optimal Kickstarter doesn’t have this problem. If I chip in $10, the project’s funding rises by $1,888, because the patron supplements my contribution. This is enough to fully cover the cost of optimizing the software, meaning I enjoy the full $10 of benefit. If I don’t chip in $10, the software isn’t optimized and I don’t get to free ride on the improvement. So I might as well spend the $10.

The actual formula linking the contributions of the crowd to the funding level of the project is derived from Buterin, Hitzig, and Weyl (2019), who propose a decentralized way of optimally funding public goods. The actual formula isn’t that complicated, but neither is it very intuitive. Let c1 be the contribution of person 1, c2 be the contribution of person 2, and so on. The formula for N contributors is:

F = (√c1 + √c2 + … + √cN)^2

where F is the total funding of the project. Essentially, you take the square root of everyone’s contribution, add all those up, then square the total.

In the example I’ve been using, I assumed there were 99 funders, each of whom contributed $9. The square root of $9 is $3, so the total funding of the project is (99 x $3)^2 = $88,209. When you throw in another $10, the square root of $10 is about $3.16, so the total funding of the project rises to (99 x $3 + $3.16)^2 ≈ $90,097, which is $1,888 more than $88,209. Hence, for $10, you buy $1,888 in support for the project.
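This calculation can be reproduced with a short Python sketch of the square-root rule (the function name is mine; the formula is the one from the paper):

```python
import math

def total_funding(contributions):
    # Buterin-Hitzig-Weyl rule: sum the square roots of all
    # contributions, then square the total.
    return sum(math.sqrt(c) for c in contributions) ** 2

crowd = [9] * 99                       # 99 contributors of $9 each
base = total_funding(crowd)            # (99 * 3)^2 = 88,209
boosted = total_funding(crowd + [10])  # add one $10 contribution

print(round(base))            # 88209
print(round(boosted - base))  # 1888
```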

This formula lets the crowd provide information to the patron about how valuable the public good is, via their contributions. The more contributors, the more funding the patron provides. Notice that if there were just one person who benefited from the public good, and therefore one contributor, the patron doesn’t provide any support:

F = (√c1)^2 = c1

No more support is needed in this case: the whole problem is when others benefit from your contribution. If it’s just you that benefits, then it’s easy for you to decide how much to fund.

But as the number of contributors grows, there is a stronger and stronger signal that more people benefit from this public good. If there are two contributors each contributing $9, the patron supplies an additional $18. If there are three $9 contributors, the patron supplies an additional $54. If there are 99 contributors ($9 each), the patron supplies an additional $87,318.
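The patron's top-up in each of these cases is just total funding under the square-root rule minus what the crowd itself put in. A quick sketch (the function name is mine):

```python
import math

def patron_topup(contributions):
    # Patron support = total funding under the square-root rule
    # minus the crowd's own contributions.
    total = sum(math.sqrt(c) for c in contributions) ** 2
    return total - sum(contributions)

print(patron_topup([9] * 2))   # 18.0
print(patron_topup([9] * 3))   # 54.0
print(patron_topup([9] * 99))  # 87318.0
```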

Meanwhile, this system ensures people give credible signals about the benefits they receive from the public good since, at the end of the day, they do actually have to spend money to purchase patron support. And the fact that your contributions are supercharged sidelines the free rider problem.

It’s Complicated

In the real world, such a system would have some complications.

For one, each person’s decision about how much to contribute should actually depend on how much others are contributing, but, done properly, all contributions should be made simultaneously. If I’m the first person to contribute to a project, it is hard to know how much patron support my contribution actually buys, because that depends on what other people ultimately give. Buterin, Hitzig, and Weyl suggest the program would have to be iterative, with some initial guesses about what the other contributions will look like, followed by rounds of fine-tuning in response to better information about how much everyone else actually gives.

Second, the patron’s funds are not actually unlimited. Buterin, Hitzig, and Weyl discuss ways the formula can be modified to account for the reality that the patron’s support may be limited. That shifts you away from the optimal solution, but you still end up with something much better than pure crowd-funding, and one where the patron learns a lot about the value of public goods through contributions.

Third, there are ways to game this system. Splitting contributions among several sock puppet contributors, for example, is a way to buy more support than is warranted. Indeed, a fraudster able to funnel money through multiple “contributors” could use Optimal Kickstarter as a money pump, if they could create projects that simply pay the fraudster out of patron money. Any real-world implementation would need fraud-detection mechanisms.

There are many other issues to be considered. But one of my favorite things about Glen Weyl’s work is that in this, and his many other radical proposals, he always tempers utopian visions with advocacy for an incremental implementation. Start small. See how it works. Correct problems and move forward.

Unfortunately, Optimal Kickstarter doesn’t exist. But I think it’s a cool idea. It would be a great public good. Just the kind of website that I bet would get a lot of funding from Optimal Kickstarter.

Update: I was wrong! Since writing the post, I’ve learned there are, in fact, a few platforms that use the Buterin, Hitzig, and Weyl algorithm to fund public goods.


If there are any curious and mathematically inclined readers out there, here’s a short proof for why Optimal Kickstarter is indeed optimal (warning: calculus).

Denote total funding for a project by F. The value of the public good to person i is Vi(F), and the total value of the project is the sum of Vi(F) across all individuals i. Assume Vi(F) is concave, continuous, etc. The optimal level of funding F for the public good satisfies:

V1'(F) + V2'(F) + … + VN'(F) = 1

This is just the level where a dollar of funding buys a dollar of value (when we add up the value of the public good across all people).

An individual using Optimal Kickstarter only considers the value they themselves receive, but also only bears the cost of their own money. They are maximizing Vi(F) - ci, which at the optimum satisfies:

Vi'(F) x (∂F/∂ci) = 1

This is where a dollar of individual i’s money buys a dollar of value to individual i. The key idea is that a dollar of individual i’s money raises total funding F by more than $1.

The partial derivative of F with respect to ci comes from the optimal funding formula, and is:

∂F/∂ci = (√c1 + √c2 + … + √cN)/√ci = √F/√ci

Using this, we rearrange the individual’s optimality condition to yield:

Vi'(F) = √ci/√F

Last, we add up these conditions across all individuals i. That gives us:

V1'(F) + V2'(F) + … + VN'(F) = (√c1 + √c2 + … + √cN)/√F = √F/√F = 1

This is precisely the condition for the optimal level of funding F.
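For skeptical readers, the key derivative in this argument, ∂F/∂ci = √F/√ci, can be checked numerically against a finite-difference approximation. This is just a sanity check of my own, not something from the paper:

```python
import math

def total_funding(contributions):
    # Total funding: square of the sum of square roots.
    return sum(math.sqrt(c) for c in contributions) ** 2

cs = [9.0] * 99 + [10.0]   # 99 contributors of $9 plus one of $10
i, eps = 99, 1e-6          # perturb the last contributor's amount

bumped = cs[:i] + [cs[i] + eps] + cs[i + 1:]
numeric = (total_funding(bumped) - total_funding(cs)) / eps  # finite difference
analytic = math.sqrt(total_funding(cs)) / math.sqrt(cs[i])   # sqrt(F)/sqrt(ci)

# The two estimates of dF/dci should agree closely.
assert abs(numeric - analytic) < 1e-3
```

Note that the derivative is far greater than 1: the marginal dollar of the $10 contributor raises total funding by roughly $95, which is exactly why free-riding stops paying.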

The Case for Remote Work

Stronger than you might think

I recently posted a working paper called The Case for Remote Work that some of you might find interesting. It’s kind of like this newsletter, in that it surveys a lot of different academic papers to make an argument. Also like this newsletter, it’s written in a way that I hope is accessible to non-specialists. But unlike this newsletter, it’s 31 pages (plus 6 pages of references).

From the abstract:

The case for remote work goes well beyond its use during the covid-19 global pandemic. Over the last ten years, research from a variety of subdisciplines in economics and other social sciences collectively makes a strong case for the viability of remote work for the long-run. This paper brings this research together to argue remote work (also called telework) is likely to become far more common in the future for four reasons.

  1. The productivity of individual workers who switch to remote work is comparable to or higher than that of their colocated peers, at least in some industries.

  2. Matching firms to geographically distant workers is becoming easier thanks to technological and social developments.

  3. Remote workers tend to be cheaper because workers value geographic flexibility and the ability to work remotely.

  4. The benefits of knowledge spillovers from being physically close to other knowledge workers have been falling and may no longer exist in many domains of knowledge.

While the prevalence of remote work (pre-covid-19) is small, I show it was already rising rapidly with plenty of room to continue growing. Finally, I argue remote work has positive externalities and should be promoted by policy-makers.

Much of the material under item 4 above is drawn from papers that have been discussed in this newsletter. But the rest is probably new material to regular readers of this newsletter.

Back to your regular coverage of new work on the economics of innovation in a few weeks!