New Things Under the Sun is a living literature review; as the state of the academic literature evolves, so do we. This post highlights some recent updates.
An Internet of Ink and Paper
The post “The Internet and Access to Distant Ideas” highlighted three studies from the early days of the US internet to illustrate how access to the internet facilitated innovation. Firms that are connected to each other by the internet are more likely to collaborate on patents or cite each other’s work, and counties that would normally be left behind by rising geographic concentration of patenting were better able to buck the trend if they enjoyed greater internet penetration.
This post has now been updated to include discussion of a new paper by Hanlon and coauthors, which documents the same kinds of effects for a very different change in the technology of long-distance communication:
This isn’t the first time we’ve seen something like the dynamics brought about by the internet. Hanlon et al. (2022) travel even further back in time to 1840 in Great Britain to study what happens to science and invention when the price of the mail drops. Prior to 1840, the cost of posting a letter in Great Britain varied substantially based on the distance the letter needed to travel, as can be seen in the figure below. But in 1840, a greatly simplified pricing system was introduced: posting a domestic letter, of any distance, cost 1 penny.
As with the preceding papers, Hanlon and coauthors want to know how this drop in the price of long-distance communication affected collaboration (in science this time) and invention. Though it may seem a bit niche to contemporary readers based outside the UK, as a natural experiment in the effects of communication, this setting has several virtues.
In this era, pretty much the only way to communicate with people at a distance was by personal travel or via the postal system (telegrams at this time were primarily used by the railroads, not the general public). So if long-distance communication is important, this price change should matter.
Because prices prior to reform were based on distance, we actually have a lot of variation to work with. Distant towns experienced a big price cut in the costs of communication and nearby towns experienced only a small price cut. We can look to see if the effects of the reform varied across those contexts.
The price changes were substantial enough, by the standards of the day, to matter. The price of mailing a one-page letter from London to Edinburgh fell from 10-20% of a professor’s daily salary to 0.5-1%! Also suggesting the price cuts were material, there was a very large increase in mail posted following the reforms.
To track the impacts, Hanlon and coauthors do two analyses.
The first is based on the citations made by articles published in the premier scientific journal of the day, the Philosophical Transactions of the Royal Society of London. For the ten years before and after the postal pricing reform, they locate where the scientists publishing in the Philosophical Transactions live and where the scientists they cite live. This gives them 1,251 citations between scientists in different parts of Great Britain. Analogously to Forman and van Zeebroeck (2019), they show the postal price cut increased citations between towns, and that this effect was larger for towns where correspondence was previously more expensive. Specifically, the price cuts reduced the “distance” penalty, wherein towns that are farther apart cite each other less, by 70%.
Hanlon and coauthors’ second analysis tries to assess the impact of the reform on new patents. For this, they have to take a different approach, because even if a patent is drawing on distant knowledge (obtained through mail correspondence), this isn’t really visible in the patent document. Patent citations in this era were not a big thing, nor was collaboration at a distance.
After locating where each inventor resides, Hanlon and coauthors try to estimate, for every town, how much the postal reform affected that specific town’s access to ideas from the rest of Great Britain. By this measure, a town that is very remote from all others would experience a big increase in its access to distant ideas, since prior to the pricing reform it would have been quite expensive to correspond with most of the people in Great Britain. In contrast, a town that lies within a geographical cluster of several large population centers may have experienced a much smaller increase in its access to distant ideas. There are some other complicating details, but again they find the same flavor of result as earlier papers: patents increased by a larger amount in more remote towns, following the introduction of uniform postal pricing.
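To make the intuition concrete, here is a minimal Python sketch of an "access to ideas" measure. It is not the paper's actual specification: the towns, their coordinates and populations, the pre-reform price schedule, and the "population reachable per penny" formula are all made-up assumptions for illustration. The only detail taken from the text is the uniform one-penny price after 1840.

```python
from math import hypot

# Hypothetical towns: name -> ((x, y) position in miles, population).
TOWNS = {
    "Hub1": ((0.0, 0.0), 200_000),
    "Hub2": ((20.0, 0.0), 150_000),
    "Hub3": ((0.0, 30.0), 100_000),
    "Remote": ((400.0, 300.0), 5_000),
}


def pre_reform_price(miles):
    """Made-up distance-based postage in pence (NOT the historical schedule)."""
    return 2.0 + miles / 40.0


def uniform_price(miles):
    """After 1840: one penny for a domestic letter of any distance."""
    return 1.0


def access(name, price_fn):
    """Assumed measure: population reachable per penny, summed over other towns."""
    (x0, y0), _ = TOWNS[name]
    total = 0.0
    for other, ((x, y), population) in TOWNS.items():
        if other == name:
            continue
        total += population / price_fn(hypot(x - x0, y - y0))
    return total


def access_gain(name):
    """Ratio of post-reform to pre-reform access for one town."""
    return access(name, uniform_price) / access(name, pre_reform_price)


for town in TOWNS:
    print(town, round(access_gain(town), 2))
```

With these toy numbers, the remote town's access rises by a much larger factor than the clustered hubs' does, which is the kind of variation the patent analysis relies on.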
So in two quite different settings we observe the same general phenomenon: when communication at a distance becomes easier, access to distant ideas is improved and this has a disproportionate benefit to places that are otherwise far from where the inventive action is.
It didn’t make it into the update, but reading these history papers I am always impressed by the amount of work that has to go into creating the dataset. It’s no small thing to locate where each inventor lives in every year, which post office is closest, and how much it would cost to correspond with other post offices!
You can read the rest of the article (now renamed “The internet, the postal service, and access to distant ideas”) including the pre-existing bits about the early internet, here:
Networking at Academic Conferences
The post “Academic Conferences and Collaboration” surveyed a few papers that document how academic conferences can be useful for forging new collaborations. This post has been updated to include discussion of a new paper that tackles this question in a new way:
Instead of comparing people who attend a conference to those that do not, you can also look within a conference and see if attendees who interact more often during the conference are more likely to collaborate on new projects. Two papers find that is also the case.
Zajdela et al. (2022) examine four recent conferences (around 60 attendees) that mixed large topic discussions of around 10 people with small group discussions of 3-4 people. Zajdela and coauthors estimate how much time people spent interacting at the conferences based on their joint assignment to different sessions (they assume you might have interacted more if the session was longer or if the number of attendees was smaller). At the end of the conference they can see if people spontaneously teamed up to submit a proposal for research funding. Do people who spent more time in the same sessions team up at a greater rate? Yes!
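One simple way to turn session assignments into a pairwise interaction score is sketched below in Python. The weighting (session minutes divided by the number of other attendees) is an assumption chosen only to capture "longer sessions and smaller groups mean more interaction"; it is not the paper's formula, and the names and schedule are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical schedule: one large topic discussion and two small groups.
SESSIONS = [
    {"minutes": 60, "attendees": ["ana", "ben", "cy", "dee", "eli",
                                  "fay", "gus", "hal", "ivy", "jo"]},
    {"minutes": 45, "attendees": ["ana", "ben", "cy"]},
    {"minutes": 45, "attendees": ["dee", "eli", "fay", "gus"]},
]


def interaction_scores(sessions):
    """Sum, over shared sessions, an assumed per-pair interaction weight."""
    scores = defaultdict(float)
    for session in sessions:
        n = len(session["attendees"])
        if n < 2:
            continue  # nobody to interact with
        # Assumption: each attendee's time is split evenly among the others.
        weight = session["minutes"] / (n - 1)
        for a, b in combinations(sorted(session["attendees"]), 2):
            scores[(a, b)] += weight
    return dict(scores)


scores = interaction_scores(SESSIONS)
```

In the paper's own measure, pairs who spent more time together did team up at a greater rate.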
But that doesn’t tell us much unless we know how these groups were formed. Maybe the conference organizers tried to match people up who they thought were most likely to want to work together; and maybe these people would have identified each other no matter what, in a conference with just 60 attendees. In that case, time spent in sessions together doesn’t matter - these people would always have collaborated.
Fortunately, Zajdela and coauthors also know the algorithm which was used to assign people to small and large group sessions. The conferences tried to optimally place people together according to some seemingly desirable, but possibly conflicting, rules. Because this group assignment problem is very complex, the algorithm doesn’t exactly solve for the “best” outcome by these criteria. Instead, it just tries to get as close as it can, and there is a bit of randomness in where it ends up. Zajdela and coauthors re-run this algorithm a bunch of times to come up with alternative conference schedules, each of which might well have been the actual schedule but for a bit of algorithmic luck. Then they look to see if collaboration is highly correlated with the actual time spent interacting, rather than the potential time interacting under alternative plausible conference schedules. And it is: among people who did not previously know each other, collaboration was about 9x more likely for pairs that actually attended a small group session together, as compared to pairs who did not attend a small group session together in the real world but would have in alternative possible conference schedules.
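The logic of that comparison can be sketched in a few lines of Python. This is a simplified illustration, not the authors' code: the function name, the pair lists, and the toy data are all hypothetical, and the numbers it produces have nothing to do with the paper's 9x estimate. It assumes we already have the pairs who shared a small group in the real schedule, the pairs who would have shared one under each re-run of the assignment algorithm, and the pairs who ended up collaborating.

```python
def collaboration_rates(actual_pairs, alternative_schedules, collaborators):
    """Compare collaboration among pairs who actually met in a small group
    with pairs who met only in alternative (counterfactual) schedules."""
    actual = {frozenset(p) for p in actual_pairs}
    potential = set()
    for schedule in alternative_schedules:
        potential |= {frozenset(p) for p in schedule}
    # Pairs the algorithm could plausibly have matched, but did not.
    counterfactual_only = potential - actual
    teamed = {frozenset(p) for p in collaborators}

    def rate(pairs):
        return len(pairs & teamed) / len(pairs) if pairs else float("nan")

    return rate(actual), rate(counterfactual_only)


# Toy data (entirely made up).
actual = [("ana", "ben"), ("cy", "dee"), ("eli", "fay"), ("gus", "hal")]
alternatives = [
    [("ana", "cy"), ("ben", "dee"), ("eli", "gus"), ("fay", "hal")],
    [("ana", "dee"), ("ben", "cy"), ("eli", "hal"), ("fay", "gus")],
]
collaborated = [("ana", "ben"), ("eli", "fay"), ("ben", "cy")]

actual_rate, counterfactual_rate = collaboration_rates(
    actual, alternatives, collaborated
)
```

Because the counterfactual-only pairs were just as plausible a match under the assignment rules, a gap between the two rates points to the time spent together, rather than the matching criteria, as the driver of collaboration.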
The post is also updated with a paragraph discussing some results of Lane et al. (2019), which is a longer run follow-up of one of the other papers discussed in the original post (Lane et al. (2019) has also been covered in more detail here).
Responding to a Good Counterargument to a Recent Post
The recent post “How common is independent discovery?” surveyed a few lines of evidence to think through how much redundancy there is in science and invention: if the discoverer of some idea had gotten sidetracked and never made the discovery, how likely is it someone else would have come along to make the discovery instead?
An email correspondent responding to that post made a really good counterargument to my interpretation of the evidence. I thought a good response to the counterargument was possible, but it would require drawing on a few additional papers. However, since “How common is independent discovery?” was already about as long as I want posts on New Things Under the Sun to be, rather than adding more discussion to that post, I instead decided to split what used to be one long article into two shorter articles.
So now there are two (interrelated) articles related to this topic. The original “How common is independent discovery?” has been reorganized and shortened to focus narrowly on papers about exactly what the title promises: independent discovery. Meanwhile a new post titled “Contingency and Science” is now the home to some of the other material from the original post, as well as my discussion of my email correspondent’s counterargument. Here’s an excerpt from this new post, which starts by describing an important counterargument to my earlier post, which relied on evidence from simultaneous discovery:
…there are two reasons to be cautious about leaning too heavily on evidence from multiple independent discovery.
First, rather than reflecting low redundancy in innovation, low levels of simultaneous discovery might be the outcome of scientists/inventors dividing up the intellectual landscape to avoid incursions into the “territory” of their rivals. In practice, scientists are aware of the different specializations and interests of other labs, and may well eschew work in those areas to avoid being scooped. But that doesn’t mean they are incapable of discovering ideas in those areas; they just choose to avoid them. It may be this avoidance that drives low rates of simultaneous discovery, but low rates of simultaneous discovery don’t actually imply low redundancy in innovation…
…Fortunately, we can get some complementary evidence from alternative literatures that essentially look at “divergent paths in the history of science.” To me, these also suggest contingency is important in science.
Staying Out of Rival Territory?
Suppose scientists proactively choose to avoid topics they believe have been informally claimed by others, even though they are perfectly capable of making the same discoveries in that topic. This has quite different implications for the contingency of science. For example, suppose a scientist has informal ownership of a specific topic - they were among the first to publish in the area, and everyone knows they have very good data and skills for continuing to do excellent work there. During this scientist’s life, others avoid work in the area, leading to a very low rate of simultaneous discovery. But if the scientist dies, we might expect a new scientist to take over the topic and make the same discoveries the deceased would have made.
It turns out, there is work that looks at something quite like this example.
Azoulay, Graff Zivin, and Wang (2010) and Azoulay, Fons-Rosen, and Graff Zivin (2019) both study a sample of roughly 100-500 eminent life scientists who died in the midst of an active research career. When these scientists died, did others step into the gap and make the discoveries they would have made, had they lived? Of course, we can’t know that (absent access to a multiverse). But we can do the next best thing and match each of these deceased life scientists to another set of eminent life scientists and follow the trajectory of science across these individuals. For example, if the death of a scientist is associated with observable changes in the kind of research that is performed, relative to what we see among those who live, then that suggests those who follow in the footsteps of the deceased are not merely replicating what the deceased would have done.
Azoulay, Graff Zivin, and Wang (2010) focuses on what happens to the collaborators of eminent life scientists when they pass. One theory we might have about redundancy in science is that when an eminent life scientist passes away, collaborators who work on closely related ideas will pick up the baton and continue the work, at least after a period of grief and mourning. But if that’s the case, it doesn’t show up clearly in the data.
Azoulay, Graff Zivin, and Wang show that when you compare the publications of collaborators working with eminent life scientists who live and eminent life scientists who die, those working with the deceased publish steadily less work over time, as much as 15 years after. Moreover, this publication penalty is actually more severe for those working on the most similar topics as the eminent life scientist (as judged by the overlap in topics they work on). That all suggests the collaborators of an eminent life scientist are not easily able to “replace” the discoveries that would have been made if the life scientist had lived.
What about non-collaborators?
Azoulay, Fons-Rosen, and Graff Zivin (2019) uses the “related articles” algorithm in PubMed to define thousands of little microfields, each consisting of dozens of closely related articles. In some of these microfields an eminent life scientist working in it died amidst an active research career and in others an otherwise similar eminent life scientist lived. Azoulay, Fons-Rosen, and Graff Zivin then look to see how these microfields evolve from that point forward.
Consistent with the notion that scientists do respect intellectual property rights, when a scientist dies new people move into the field. The figure below plots the extra publications published by non-collaborators in fields where an eminent life scientist dies, as compared to fields where an eminent life scientist does not die. That’s consistent with the notion that one reason the probability of simultaneous discovery is not higher is that people avoid working in areas where they know prominent scientists are active.
However, there are several indicators that the ideas these people pursue in this field differ from the ideas that would have been pursued by the deceased. The citation profile of publications changes when an eminent scientist dies; there is an increase in very highly cited new publications, relative to fields where the eminent scientist lives. And the new publications also look different in terms of what they cite themselves: there are fewer citations to the work of the eminent life scientist and fewer citations to pre-existing work in the field. Lastly, the topics under study in this field change. The biomedical sciences use a standardized lexicon of keywords for classifying articles, and the keywords attached to articles in fields where a scientist dies tend to be younger and to feature newer combinations of keywords. It all suggests people are not interchangeable; when one person exits a field, those who come after do not seem to do the same thing.
The rest of the post looks at some evidence of what happens when the scientific ecosystem fractures into relatively isolated communities, and argues we quickly see significant divergence in the topics under study. This material was originally in the post “How common is independent discovery?” but now lives in the “Contingency and Science” post.
Until Next Time
Thanks for reading! As always, if you want to chat about this post or innovation in general, let’s grab a virtual coffee. Send me an email at mattclancy at hey dot com and we’ll put something in the calendar.
New Things Under the Sun is produced in partnership with the Institute for Progress, a Washington, DC-based think tank. You can learn more about their work by visiting their website.