Since the end of 2019, research on Covid-19 has been carried out at high speed and has generated international cooperation in the form of co-authored scientific articles. China, the initial epicentre of the epidemic, is the main producer of research on the new coronavirus, but nearly 60 countries were involved as of 6 April 2020. This note reviews the geography of research on Covid-19 and its evolution between 23 March and 6 April 2020.
Foreword on Scientific Production
The rapid spread of Covid-19 and the need to find solutions to curb the epidemic and deal with its consequences explain why we are witnessing an accelerated mobilization of scientists to publish their results. These publications take the traditional form of articles in peer-reviewed journals and the more direct form (as they do not pass through the filter of peer-review) of preprints on open archives (in particular bioRxiv, medRxiv and ChemRxiv).
From the point of view of the dissemination of knowledge, the academic biomedical publishing sector, dominated by private players with impressive sales margins (see the revenues of Elsevier or Springer), hastened to offer open access to academic literature on the subject of coronavirus (example at Elsevier: https://www.elsevier.com/connect/coronavirus-initiatives). Exceptionally, the knowledge produced by researchers from all over the world is presented as a public good and its open access as indispensable for a global and accelerated circulation of knowledge.
Hoping that these initiatives will not remain exceptional or confined to the crisis we are going through, Vincent Larivière, Fei Shu and Cassidy R. Sugimoto call for academic literature to be opened up “without delay” (Larivière, Shu & Sugimoto, February 2020).
Of course, this race for a vaccine and for publications is not free from suspicions of fraud and scientific error, as evidenced by the number of retracted articles and preprints (see the Retraction Watch tracking site), and it brings its share of scientific controversy. The Raoult case in the field of virology has received sufficient media (and political) coverage to bear witness to this (Science at the time of the Coronavirus, Gingras, 2020). Whether relying on the literature published in peer-reviewed journals or on publications available in open archives, one should therefore take a cautious look at the currently available data on Covid-19 related research.
Tracking progress on Covid-19 research
With these precautions in mind, the web application NETSCITY, designed to help researchers, scientific information specialists and science journalists process data from major bibliographic databases, provides a quick and informative overview of the origin of the first publications on Covid-19.
This application, currently under development, is already available in beta version at: https://www.irit.fr/netscity. It has been jointly developed by three CNRS research units: UMR Géographie-cités (Paris), UMR LISST and UMR IRIT (Toulouse) with the support of the NETSCIENCE group of the SMS LABEX.
In the context of the crisis we are going through, this application can help answer the following questions:
- Where do the scientific articles that have as keywords “COVID-19”, “2019-nCoV” or “SARS-CoV-2” come from, since December 2019?
- Does this geography reflect the geography of the epidemic or are there specificities characteristic of the traditional geography of the field of virology, with a special effort of the areas where the historical laboratories of this field are located?
- What can be said about scientific cooperation in the form of co-authorship of publications? Despite the epidemic and the closing of borders, are we seeing the emergence of connections between researchers located in different cities, countries and continents?
A first analysis, carried out using Web of Science data on two dates, 23 March 2020 and 6 April 2020, has highlighted the pre-eminence of publications from China and the progressive growth of publications from other areas, including the sub-Saharan area.
Here are the details of the data collected, followed by some graphical representations extracted from NETSCITY.
On 23 March 2020, 197 publications were available in the WoS (SCI-EXPANDED, CPCI-S, ESCI), including 70 peer-reviewed articles, 65 editorials, 35 letters, 17 reviews, 9 news items, and 1 correction. By way of comparison, for similar queries, Alexei Lutay found the following day: 386 publications in Scopus, 1262 in Semantic Scholar and 1766 in Dimensions. Given the differences in coverage (the number of journals covered by the WoS remains more limited), these gaps do not seem surprising (Lutay, March 2020). From a thematic point of view, the main fields covered by these publications are general medicine, virology, infectious disease, immunology, microbiology, medical imaging and tropical medicine. The medical journal the Lancet shows the highest number of publications to date (Table 1).
On April 6, 2020, the same query in the Web of Science returned 442 publications (more than twice as many as 15 days earlier). These included 146 articles, 137 editorials, 79 letters, 41 news items, 34 reviews, and 5 corrections. The fields of pediatrics, biology and intensive care are more prominent. The field of tropical medicine is becoming more marginal. The main contributions remain in general medicine, infectious diseases and virology. By this date, the British Medical Journal (BMJ) had surpassed the Lancet in number of publications. The top three Web of Science journals publishing on the subject are the BMJ, the Lancet and the Journal of Medical Virology (Table 2).
Now let’s move to geography!
The geography of Covid-19 research
March 23, 2020
As of March 23, the publications that can be geographically referenced (177 out of 197) come from 39 different countries (Table 3).
The top 5 countries produced 69% of all publications on the subject. In descending order of production, these countries were China, the United States, the United Kingdom, South Korea and Switzerland. They were closely followed by Italy, Germany and France (Map 1).
Production comes from 159 separate urban areas. The top 55 urban areas contributed nearly 80% of production (Table 4).
In NETSCITY, the data are normalised so that when a publication comes from several different agglomerations, each receives an equal fraction of the publication, determined by the total number of participating agglomerations. To produce these statistics, the urban level considered is that of the agglomeration, in the sense that we have grouped together the central city and its suburbs (see the methodology explained here). The main reporting urban areas are Wuhan, Beijing, Hong Kong, Guangzhou and Seoul. The primacy of the city of Wuhan and the fact that the top 5 is entirely Asian are evidence that the geography of research here is directly linked to that of the epidemic (Map 2). These agglomerations are followed by London, which, at this date, is not the European city most affected by the epidemic. Its rank should therefore be attributed to its special place in the scientific fields concerned and to its role as home to many scientific journals (to date, half of London’s publications are editorials).
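To make the normalisation concrete, here is a minimal sketch of fractional counting in Python. It assumes that each publication is represented as a list of agglomeration names and that a publication signed from n distinct agglomerations gives 1/n to each of them. The function name and the toy data are hypothetical and are not taken from NETSCITY itself.

```python
from collections import defaultdict


def fractional_counts(publications):
    """Give each agglomeration 1/n of a publication signed from n distinct agglomerations."""
    totals = defaultdict(float)
    for agglomerations in publications:
        distinct = set(agglomerations)  # an agglomeration listed twice counts once
        if not distinct:
            continue  # skip publications that could not be geolocated
        share = 1.0 / len(distinct)
        for name in distinct:
            totals[name] += share
    return dict(totals)


# Hypothetical toy corpus, not the actual Web of Science data.
sample = [
    ["Wuhan"],
    ["Wuhan", "Beijing"],
    ["Paris", "Bordeaux", "Wuhan", "Wuhan"],
]
counts = fractional_counts(sample)
for name, value in sorted(counts.items(), key=lambda item: -item[1]):
    print(f"{name}: {value:.2f}")
```

With this scheme the fractional counts always sum to the number of geolocated publications, which is why a city's normalised count can be a non-integer value.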
Of the 177 publications, 96 were signed from at least two different agglomerations and 10 were signed from more than 6 agglomerations. This density of co-publications makes it possible to focus on networks of cooperation between places. At the country level, the main collaborative links are between China and the rest of the world: United States, Canada, Australia, Germany, United Kingdom, Belgium, France. Italian scientists have collaborated more specifically with the United States and Brazil (Graph 1).
At the interurban level, sub-national collaborations are predominant (in China: Wuhan-Beijing and Wuhan-Shanghai links; in France: Paris-Bordeaux link, the cities of the first French coronavirus patients; in Korea: Seoul-Taejon/Daejeon link). Next come international collaborations between Philadelphia and Guangzhou; Bangkok and Singapore; Taipei and Wuhan; Rome and Rio; Atlanta and Riyadh; New Haven and Sydney; Copenhagen and Porto; Paris and Wuhan; and between Geneva and Shanghai (Graph 2).
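The co-publication links described above can be sketched as follows. This is only an illustration under stated assumptions: each publication is a list of agglomeration names, every pair of distinct agglomerations on the same publication counts as one link (whole counting), and the function names and toy data are hypothetical. How NETSCITY actually weights links is not specified here.

```python
from collections import Counter
from itertools import combinations


def collaboration_links(publications):
    """Count, for every pair of agglomerations, the publications they co-sign."""
    links = Counter()
    for agglomerations in publications:
        distinct = sorted(set(agglomerations))  # sorting gives each pair a stable key
        for pair in combinations(distinct, 2):
            links[pair] += 1
    return links


def multi_site_count(publications, minimum=2):
    """Number of publications signed from at least `minimum` distinct agglomerations."""
    return sum(1 for agglomerations in publications if len(set(agglomerations)) >= minimum)


# Hypothetical toy corpus, not the actual Web of Science data.
sample = [
    ["Wuhan", "Beijing"],
    ["Beijing", "Wuhan", "Shanghai"],
    ["Paris"],
    ["Paris", "Bordeaux"],
]
links = collaboration_links(sample)
for (city_a, city_b), weight in links.most_common():
    print(f"{city_a} - {city_b}: {weight}")
print("multi-site publications:", multi_site_count(sample))
```

The same function works at the country level if each publication is described by its list of countries instead of its list of agglomerations.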
April 6, 2020
As of 6 April, the publications that can be geographically referenced (381 out of 442) come from 57 different countries (Table 5).
The top 5 countries now account for 66% of all publications on the subject, indicating that production is less concentrated than 15 days earlier. The top three countries remain China, the United States, and the United Kingdom. On the other hand, South Korea and Switzerland are overtaken by Italy, the European country most affected by the epidemic, and Germany (Map 3).
Production comes from 262 separate urban areas (that’s one hundred more than 15 days earlier!). The top 54 urban areas contributed nearly 70% of the production, which also indicates that production has become less concentrated between cities (Table 4).
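The concentration figures quoted for countries and urban areas (for example, 69% and then 66% for the top 5 countries) are shares of total production held by the k largest producers. A minimal sketch of that calculation, assuming a dictionary of normalised publication counts per place; the function name and the numbers below are hypothetical.

```python
def top_k_share(counts, k):
    """Share of total production contributed by the k largest producers."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    largest = sorted(counts.values(), reverse=True)[:k]
    return sum(largest) / total


# Hypothetical counts, not the figures from the tables in this note.
sample_counts = {"A": 50, "B": 20, "C": 10, "D": 10, "E": 5, "F": 5}
print(f"top 2 share: {top_k_share(sample_counts, 2):.0%}")
```

A falling top-k share between two dates, for the same k, is what is meant here by production becoming less concentrated.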
The main reporting urban areas are Wuhan, Beijing, Shanghai, Hong Kong, and Guangzhou. London and Singapore are ahead of Seoul, which was the fifth most publishing city 15 days earlier (Map 4). Tokyo’s normalized number of publications has increased from 1 to 5, propelling the Japanese metropolitan area among the 10 most publishing cities on the subject. A few urban spaces stand out in the southern hemisphere, whose dynamics will be interesting to follow in the coming weeks, especially Singapore, Melbourne and Sydney. Riyadh, Tehran and Beirut in the Middle East are also active, no doubt influenced by the importance of the epidemic in Iran.
Of the 381 publications, 187 were signed from at least two different agglomerations and 20 were signed from more than 6 agglomerations. At the country level, the main collaborative links remain between China and the rest of the world. The United Kingdom is developing cooperation with the United States and Singapore. India (Pune in particular) is connected to China and Thailand. Tanzania is integrated into the global scientific network through one co-publication with South Africa. Similarly, Lebanon is connected to the network through Iran (Graph 3).
At the inter-city level, sub-national collaborations remain important, especially between Chinese cities. In addition to those recorded 15 days earlier, there is privileged cooperation between Atlanta and Seattle in the United States, as well as between Sapporo, Naha and Tokyo in Japan. There is also a very large number of new international collaborations. Links between Toronto and Xi'an, London and Singapore, Ann Arbor and Shanghai are proving important (Graph 4).
Understanding this geography
It may come as a surprise that we are dealing with such a dense network of cooperation at a time when the question of research is only just emerging and when we are in a situation where opportunities for exchange are weakened by the closure of borders.
To better understand what we are observing, it would be useful to differentiate between the different types of publications considered and to conduct interviews with the researchers involved. For example, we can imagine that cooperation with China has proved essential both for the medical management of the crisis and for knowledge of the virus: Chinese scientists rapidly sequenced the genome, followed by those at the Pasteur Institute in Paris (Lemke, January 2020). The laboratories had to coordinate, share their results, schedule clinical trials and exchange biological specimens. This is the case with the Doherty Institute in Melbourne, which announced at the end of January 2020 that it had succeeded in replicating the virus in the laboratory (University of Melbourne, January 2020).
In addition to the accelerated exchanges justified by the urgency of the crisis, we need to take into account the long-standing exchanges between laboratories and researchers who belong to pre-existing scientific communities and who had already worked together. One can think of the community of specialists in coronaviruses, a particular type of virus that Professor Bruno Canard (Aix-Marseille University) has been studying since the early 2000s (Sauvons l’Université, March 2020). Likewise, within the ICTV (International Committee on Taxonomy of Viruses), there is the Coronaviridae Study Group with a majority of American, German and Dutch members.
Also worth mentioning is the role played by the historical virology laboratories, namely the Pasteur Institutes in Paris, Hanoi and Dakar and the Robert Koch Institute in Germany, in monitoring the spread of the virus and in the search for vaccines. To learn more about the history of these two scientists and the institutes that took their names, see the book and documentary of the same name Pasteur and Koch: a duel of giants in the world of microbes. Finally, we note the coordinating role played by the STAG-IH (Strategic and Technical Advisory Group for Infectious Hazards), a committee of experts set up in 2005 at the time of the Ebola epidemic, which provides reports and advice for the World Health Organization.
This justifies that initiatives to make the scientific literature associated with the epidemic available should include literature that predates December 2019. The knowledge base needed to make progress in this field is not limited to publications published since the emergence of the new coronavirus.
For those interested in delving further into these questions, several corpora have been made available to researchers in recent months:
- The COVID-19 open research database (CORD-19), a free resource of more than 44,000 scientific articles, made available by the Allen Institute for AI and its partners. A sub-part of this corpus has been geographically analysed and is available as an online preprint (Dousset & Mothe, 2020). In addition, the Neural Covidex project (University of Waterloo and NYU) provides automated means to explore this corpus.
- All publications with the keyword “coronavirus” from January 2000 to March 2020 available on the PubMed database (6560 documents). These publications are being searched to extract semantic relationships using Gargantext software (ISCPIF, 2020). For further analyses of this type, the first results of Chaomei Chen can be followed using CiteSpace software (Chen, 2020).
- The publication database specially set up by the World Health Organization on COVID-19, which as of April 12, 2020 includes: 5014 articles including 170 from the BMJ, 120 from the journal Nature and 112 from the Lancet (WHO, 2020).
- The open archives, including a database of 1555 preprints deposited on MedRxiv and BioRxiv relating solely to the new coronavirus (MedRxiv, 2020). For a review of the number of contributions related to the new coronavirus in open archives, see the analyses by Nicholas Fraser and Bianca Kramer (Fraser and Kramer, 2020).
- A review of the open access literature from several databases (Dimensions, Scopus etc.) by a team of scientists from Bandung Institute of Technology in Indonesia (Irawan et al., 2020). A complementary analysis also taking into account the rate of evolution of content on the Web of Science and Scopus is also available (Torres-Salinas, 2020).
- The covid-nma database fed by the Cochrane Institute, INSERM and APHP, which currently includes 275 clinical trials. It has been the subject of an initial analysis including mapping (Vuillemot et al., 2020).
- The list provided by the World Health Organization of current vaccine development programmes (Covid-19 candidate vaccines, 2020). This list has just been the subject of an analysis published in Nature Reviews Drug Discovery (Thanh Le et al., 2020). This work indicates that the majority of initiatives are currently being driven by private North American companies.
Finally, while this contribution has focused solely on biomedical research by restricting the query to the Web of Science’s Science and Technology databases and excluding the Human and Social Sciences indexes, this does not mean that coronavirus research is limited to the fields of medicine and biology. Indeed, the epidemic affects all parts of our society, in terms of public service response and epidemiology as well as in its economic, social and environmental aspects. The contribution of the Human and Social Sciences is particularly important in this context, as evidenced by the number of specialists who have been invited by the media in recent weeks to address the lockdown issue. Specific research coordination initiatives are currently being launched to facilitate exchanges between biomedical research and research in the human and social sciences, particularly in the field of epidemiology. In France, one can think of the actions of the CARE committee, as well as research pooling initiatives such as CovidFight.
This note and the results presented were obtained using the NETSCITY application. This web application applies the methodology developed as part of a research program on the geography of science that began in 2010. It allows the rapid processing of large volumes of bibliographic data, the geographic location of publications, the aggregation of data at the level of comparable urban areas, and the building of networks of places between cities and between countries on a global scale.
This web application, still under development (feedback is welcome), is available online at https://www.irit.fr/netscity.
The development team includes Laurent Jégou, geographer and geomatician at the UMR LISST in Toulouse, Guillaume Cabanac, computer scientist and scientometrist at the UMR IRIT in Toulouse and myself, geographer at the UMR Géographie-cités in Paris – Aubervilliers.
Two students from the IUT of computer science in Toulouse also contributed to the web development: Nikita Yakimovich and Nils Bourgon.
A conference paper presented at the International Conference on Scientometrics and Informetrics (ISSI) in Rome in 2019 situates the application among other science data processing tools and explains how to use it. To cite it:
Maisonobe, Marion, Laurent Jégou, Nikita Yakimovich, and Guillaume Cabanac. 2019. ‘NETSCITY: A Geospatial Application to Analyse and Map World Scale Production and Collaboration Data between Cities’. In ISSI’19: 17th International Conference on Scientometrics and Informetrics. Rome.
Laurent Jégou and Guillaume Cabanac contributed to this note.