The evolving ecosystem of COVID-19 contact tracing applications

Since the outbreak of the novel coronavirus, COVID-19, there has been increased interest in the use of digital contact tracing as a means of stopping chains of viral transmission, provoking alarm from privacy advocates. Concerning the ethics of this technology, recent studies have predominantly focused on (1) the formation of guidelines for ethical contact tracing, (2) the analysis of specific implementations, or (3) the review of a select number of contact tracing applications and their relevant privacy or ethical implications. In this study, we provide a comprehensive survey of the evolving ecosystem of COVID-19 tracing applications, examining 152 contact tracing applications and assessing the extent to which they comply with existing guidelines for ethical contact tracing. The assessed criteria cover areas including data collection and storage, transparency and consent, and whether the implementation is open source. We find that although many apps released early in the pandemic fell short of best practices, apps released more recently, following the publication of the Apple/Google exposure notification protocol, have tended to be more closely aligned with ethical contact tracing principles. This dataset will be publicly available and may be updated as the pandemic continues.


Introduction
The outbreak of SARS-CoV-2, the virus responsible for COVID-19, has resulted in an ongoing global pandemic that has, as of February 1, 2021, infected over 100 million people and killed more than 2 million people (Senn, 2021). As an early response, many countries implemented temporary stay-at-home measures compelling non-essential workers to remain indoors under quarantine, as well as enforcing social distancing measures and the wearing of masks in public places. While these measures proved effective in the short run when met with high public adherence levels, they have not proved a feasible and effective long-term solution for controlling the virus's spread in most countries. Full lockdown measures come with significant economic, social, educational, psychological, and medical ramifications; for instance, unemployment in the United States in May 2020 was estimated to be as high as 16% as a result of the pandemic (Kochhar, 2020). A practical and proven method for counteracting the spread of transmissible diseases is contact tracing: using social networks to determine contacts who have had recent interactions with an infected individual. These contacts can then be quarantined, tested, or instructed to self-isolate for 14 days, the putative upper bound on the incubation period of the virus (Linton et al., 2020).
Traditional contact tracing relies on many human workers to track down and reach out to each potential contact. It can therefore be prohibitively expensive, slow, and challenging to scale (Hart et al., 2020). Many countries and private organizations turned to smartphones as an alternative source for personal contact history. There has been a rapid development of mobile applications designed to aid contact tracing to suppress transmission of COVID-19. These applications typically use GPS or Bluetooth data to detect a device's proximity to nearby devices. Although manual contact tracing is expected to have higher effectiveness at reducing transmission rates than digital contact tracing for a fixed number of individuals (Kucharski et al., 2020), the scalability afforded by digital contact tracing coupled with its relatively low cost makes large-scale implementations more feasible than manual contact tracing.
Naturally, the prospect of large-scale monitoring of the public using contact tracing applications has alarmed privacy advocates (Gillmor, 2020). The hasty implementations of such systems in response to the pandemic's rapid proliferation and severity raised concerns about whether potential ethical and privacy implications were sufficiently considered during the rollout of applications. Further privacy concerns have arisen from contact tracing applications utilizing geolocation data, centralized storage systems, closed source implementations, and government coercion (Shukla et al., 2020). Consumer advocates fear that a disproportionate emphasis on health over privacy could undermine the protection of civil liberties far beyond the stated intentions of these applications, similar to the vast expansion of state surveillance in the United States post-9/11 (Guariglia, 2020; Hart et al., 2020). In turn, such concerns may negatively impact adoption rates and thus reduce the effectiveness of contact tracing. Simulations have shown that the efficacy of contact tracing applications is highly dependent on adoption rates, with a level around 50-60% necessary to limit the spread of COVID-19 (Ferretti et al., 2020). Survey results from Canada and the United States have supported the notion that privacy is a significant reason people may be hesitant to use digital contact tracing apps (Rheault & Musulan, 2020; Zhang et al., 2020), and this concern has remained consistent throughout the pandemic (Simko et al., 2020). Although polls have demonstrated a high willingness to download contact tracing apps (Altmann et al., 2020), short of coercion, maximizing adoption rates can only be achieved if individuals do not perceive the application to encroach unreasonably upon civil liberties.
While there may seemingly be exceptions to this rule, such as South Korea's government-backed centralized approach, deviations may be at least partially explained by factors unique to South Korea, such as (1) societal support for government interventions due to prior exposure to outbreaks like Middle East Respiratory Syndrome (MERS), and (2) rapid implementation, which left little time for the app to garner the negative press that may have dampened uptake in other countries (Park et al., 2020).
Digital contact tracing is inherently multidisciplinary and draws upon public health, epidemiology, computer science, and data science. Each of these fields contains multiple sets of ethical guidelines. Several attempts have been made to combine these to form an appropriate set of guidelines for ethical contact tracing (Morley et al., 2020; Parker et al., 2020; WHO, 2020). However, the difficulty in realizing these guidelines in practice lies in the fact that there are already many existing contact tracing applications (at least 124 released, by our estimation) and a multitude of public, private, and academic actors involved in their development. These factors can lead to a lack of coordination due to these various actors' competing interests and discordant practices, resulting in no universally observed standard for ethical contact tracing. Although vaccines are now available and being distributed to the public, there is still no consensus, and questions remain unanswered, about how future contact tracing programs should be conducted to balance the utilization of data and the preservation of individual privacy. In this paper, we conduct an exhaustive survey of contact tracing applications and assess the adherence of these applications to established ethical contact tracing guidelines. Our main contributions are a database of contact tracing applications, an assessment of how well these applications conform with the aforementioned ethical principles based on their synthesis into a single 8-point scale (which we denote the "Ethical Alignment Score"), and a comparative analysis of applications based on location, release date, and other variables of interest.

Data collection and coding
In their interim guidance report on ethical contact tracing, the World Health Organization (WHO) outlined 17 principles to which contact tracing applications should adhere (WHO, 2020). These guidelines dovetail with other ethical contact tracing guidelines (Morley et al., 2020; Parker et al., 2020) and provide a set of criteria upon which existing contact tracing applications can be assessed. The proposed principles are summarized under the following 17 categories: time limitation, testing and evaluation, proportionality, data minimization, use restriction, voluntariness, transparency and explainability, privacy-preserving data storage, security, limited retention, infection reporting, notification, tracking of COVID-19-positive cases, accuracy, accountability, civil society and public engagement, and independent oversight. For a more detailed explanation of these categories, we refer the reader to the original WHO article (2020). The remainder of this paper uses these guidelines as a basis for a set of questions that can be answered with publicly available data about each contact tracing application. The responses to these questions are combined into a single value, which we denote the Ethical Alignment Score.
We compiled a list of COVID-19 contact tracing applications by conducting independent research on applications found in the following sources: (1) the Wikipedia page on COVID-19 apps; (2) Privacy International; (3) an open source community document on COVID-19 contact tracing technologies (Stop-COVID.tech, COVIDWatch, Mitra Ardron); (4) the New Zealand Office of the Privacy Commissioner's Overview of COVID-19 Contact Tracing Apps, 12 May 2020; (5) Amnesty International; (6) a crowdsourced list of projects related to COVID-19 contact tracing; (7) the MIT Technology Review Covid Tracing Tracker; (8) a list of countries using Google and Apple's Contact Tracing API (XDA Developers); (9) the list of publicly available Exposure Notifications apps (Google API); (10) the FIPRA Europe COVID-19 Tracing App Tracker; and (11) Google searches for the terms "covid-19", "contact tracing", "bluetooth contact tracing app", and "gps contact tracing app". Using these resources and the materials put out by the makers of the applications themselves, we extracted a set of features for each app (when information was available) based on a subset of the ethical principles set out by the ethical contact tracing guidelines outlined previously (see table 1). We collected application release dates from appannie.com, a service for distributing statistics on mobile applications and websites, and approximate download numbers from the Google Play Store, when available.
To evaluate contact tracing applications based on the criteria provided in the previous section, we propose the concept of an "Ethical Alignment Score". The Ethical Alignment Score of an application is a value in the range 0-8 representing the extent to which the application meets principles of ethical contact tracing, as outlined in table 1. A score of zero indicates that the application fails to even partially fulfill any of the criteria, while a score of eight is the ideal case indicating that the app conforms with all principles of ethical contact tracing that we were able to assess under this framework. Each principle in table 1 corresponds to a single point in the app's score. The full dataset, as well as an interactive map showing all the applications, is available here: https://benjaminlevy.ca/covid-apps. This dataset also further breaks down each principle into 2 or [...] analysis, since only those two elements will be updated over time. All analyses were completed using Python v3.8.2.

Table 1. Principles and definitions for scoring apps. These principles are based on Morley et al. (2020) and principles from the WHO (2020).
Features | Definition
Temporary: clearly defined lifetime and documented decommissioning process; limited data retention | The scope of the application is limited to the current pandemic and no data is stored longer than deemed necessary to perform reliable and accurate contact tracing.
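As a minimal sketch of how a score of this kind could be computed, the snippet below sums one point per principle and allows fractional credit for partial fulfilment. The principle keys other than "temporary", the fractional-credit convention, and the function name are illustrative assumptions, not the coding scheme used for the published dataset.

```python
from typing import Mapping

# Placeholder keys: only "temporary" is named in the text above; the
# remaining seven names are hypothetical stand-ins for the other principles.
PRINCIPLES = ("temporary",) + tuple(f"principle_{i}" for i in range(2, 9))


def ethical_alignment_score(ratings: Mapping[str, float]) -> float:
    """Sum per-principle ratings (each between 0 and 1) into a 0-8 score."""
    total = 0.0
    for name in PRINCIPLES:
        # Assumption: a principle with no public information earns no credit.
        value = ratings.get(name, 0.0)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"rating for {name!r} must lie in [0, 1], got {value}")
        total += value
    return total


# Usage with made-up ratings: full marks everywhere except a half point
# for the "temporary" principle.
example = {name: 1.0 for name in PRINCIPLES}
example["temporary"] = 0.5
print(ethical_alignment_score(example))
```

Treating missing information as zero credit is one possible convention; a more cautious analysis might instead track unknown items separately so that poorly documented apps are not conflated with non-compliant ones.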
The state of the contact tracing ecosystem
We collected data on a total of 152 applications from 77 different countries (counting the EU as a separate country for apps that are EU-wide). Of these, 124 are released on Apple's App Store or the Google Play Store (or are available as web-based platforms), while 17 have been cancelled and 11 are still in development at the time of writing. The app release dates span from mid-February (Singapore's TraceTogether was released on February 18, 2020) until January (Louisiana's COVID Defense was released on January 18, 2021). Figure 1 shows the progression of app releases in each WHO region (with Canada and the US separated out from Latin America in the North America region). Initially, most new applications came from the Western Pacific region (including Singapore), Southeast Asia, and the EU. There has since been an explosion of apps released across the world, particularly in the EU, Eastern Mediterranean, and North America.
Figure 1. Released contact tracing apps, grouped by WHO region. North America (U.S. and Canada) is separated from Latin America, although normally the WHO groups both into a single Americas region. The total number of apps reflects only those apps that could be assigned to a single country or region.
Comparison of contact tracing/exposure notification protocols. The choice of contact tracing protocol (or lack thereof) has varied substantially throughout the course of the pandemic. Early on, most apps did not employ an established protocol. By the beginning of September 2020, this had shifted such that the majority of new applications employed the Apple/Google exposure notification protocol (see figure 2).
Comparison of Government-based Applications. 110 applications (72%) were developed or backed by government entities. The corresponding ethical alignment scores for these applications, averaged by country, are provided in ranked order in Figure 4. Switzerland, Italy, and Belgium were among the countries with the highest average scores (8), while Hong Kong, Vietnam, Qatar, and China received the lowest scores (1). Although most countries have a single government-backed application, two notable outliers are the United States and India, with 24 and 7 government-backed apps released, respectively. Whereas the United States' apps were released late in the pandemic (all on or after August 3rd, 2020), all of India's apps were developed relatively early, within the period from March 7th to May 9th.
Common factors that resulted in lower alignment scores included (1) the use of a centralized database to store user data, (2) GPS tracking of individuals, and (3) lack of informed consent associated with the collection, storage, or use of individual data. While none of these applications received the worst possible score of zero, only 3 applications received a perfect score of 8. Among the applications with near-perfect scores (7.5), the most common reason for losing points was not having a clearly defined lifetime for the application (13/15).
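The country-level ranking described above amounts to averaging app scores within each country and sorting the result. The following sketch shows one way to do this with the Python standard library; the function name and the input records are hypothetical placeholders rather than values from the dataset.

```python
from collections import defaultdict
from statistics import mean


def rank_countries_by_mean_score(apps):
    """Average scores per country and rank from highest to lowest.

    `apps` is an iterable of (country, score) pairs for government-backed
    apps. Ties are broken alphabetically by country name (an arbitrary
    choice made for this sketch).
    """
    by_country = defaultdict(list)
    for country, score in apps:
        by_country[country].append(score)
    averages = {country: mean(scores) for country, scores in by_country.items()}
    return sorted(averages.items(), key=lambda item: (-item[1], item[0]))


# Made-up records for illustration only.
sample = [
    ("Country A", 8.0),
    ("Country B", 6.5),
    ("Country B", 7.5),
    ("Country C", 1.0),
]
for country, average in rank_countries_by_mean_score(sample):
    print(country, average)
```

Averaging hides within-country spread, which matters for countries with many apps; reporting the number of apps alongside each average would make such a ranking easier to interpret.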
Comparison of Non-Government-Based Applications. In our dataset, 42 applications were not directly associated with a governmental entity. Of these, 20 have been released, 15 have been cancelled, and 7 are either still in development or have an indeterminate status. The countries with the most non-governmental applications of any operational status are the United States (11), Germany (8), Canada (4), the Philippines (2), and Italy (2). Most non-governmental applications were released before June (62%) and all were released before July. When compared to the government-backed applications, these non-governmental applications score lower on average on almost every question (see table 2), with the exception that slightly more non-governmental apps are open source (30%) than governmental apps (28%).
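A per-question comparison of this kind can be sketched as the share of apps in each group that fully satisfy a given criterion. In the snippet below, the group labels, the "open_source" key, the full-point threshold, and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict


def criterion_share(apps, criterion):
    """Return, per group, the fraction of apps that fully meet `criterion`.

    `apps` is an iterable of (group, ratings) pairs, where `ratings` maps
    criterion names to a value between 0 and 1. A missing criterion is
    counted as not met (an assumption of this sketch).
    """
    met = defaultdict(int)
    total = defaultdict(int)
    for group, ratings in apps:
        total[group] += 1
        if ratings.get(criterion, 0.0) >= 1.0:
            met[group] += 1
    return {group: met[group] / total[group] for group in total}


# Made-up records for illustration only.
sample_apps = [
    ("government", {"open_source": 1.0}),
    ("government", {"open_source": 0.0}),
    ("government", {}),
    ("government", {"open_source": 0.5}),
    ("non-government", {"open_source": 1.0}),
    ("non-government", {"open_source": 0.0}),
]
print(criterion_share(sample_apps, "open_source"))
```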

Time Evolution of Ethical Alignment.
To distinguish any trends in ethical alignment scores over time, we examined the ethical alignment scores of different applications as a function of application release dates. As shown by Figure 5, there was initially high variation in scores among apps, while later in the pandemic, apps have tended to have much higher ethical alignment scores.
Disaggregated scores can be seen in Figure 6. Two criteria that are consistently less satisfied than the others regardless of the date are whether the app is open source and whether the app has a clearly defined lifetime.
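One simple way to examine a trend like this is to bucket apps by release month and compare the mean and spread of scores in each bucket. The sketch below does so with the standard library; the function name, the choice of the population standard deviation as the measure of spread, and the sample records are assumptions, not the procedure used to produce the figures.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev


def monthly_score_summary(apps):
    """Summarize scores by release month.

    `apps` is an iterable of (release_date, score) pairs. Returns a dict
    mapping (year, month) to (count, mean score, population std. dev.),
    ordered chronologically.
    """
    buckets = defaultdict(list)
    for released, score in apps:
        buckets[(released.year, released.month)].append(score)
    return {
        month: (len(scores), mean(scores), pstdev(scores))
        for month, scores in sorted(buckets.items())
    }


# Made-up records for illustration only.
sample_releases = [
    (date(2020, 3, 10), 2.0),
    (date(2020, 3, 20), 6.0),
    (date(2020, 9, 1), 7.0),
]
for month, summary in monthly_score_summary(sample_releases).items():
    print(month, summary)
```

A falling standard deviation alongside a rising mean would correspond to the pattern described above, in which early apps vary widely and later apps cluster at higher scores.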
Apps in the United States. Apps in the United States can be neatly divided into two groups shown in Figure 7: (1) private apps released early in the pandemic and (2) government-backed apps released late in the pandemic. The large number of government-backed apps in the US relative to other countries is due to the fact that apps have been developed at the state level, as opposed to the national level in most other countries.
Limitations of Analysis. The assessment of contact tracing applications was limited in scope by the public availability of data, which made analysis of some action points from the WHO ethical contact tracing guidelines difficult. To counter this, we emailed the developers of all applications examined in this study, where possible, with a 17-point questionnaire covering the items that could not be assessed using public information. The response rate to this questionnaire was approximately 20%, which we deemed too low to support generalizations given the possibility of self-selection bias.
Because of the international scope of this work, many apps did not have English-language documentation readily available. Although we attempted to mitigate this with Google Translate, it is likely that certain aspects of these apps were lost in translation. Finally, it may be the case that apps that implement the Apple/Google exposure notification protocol are overrepresented in the dataset relative to non-Apple/Google apps. This is likely due to the presence of lists of apps that implement the Apple/Google protocol, such as one curated by Google itself. Conversely, apps that do not implement the protocol may be harder to find, which would contribute to their apparent scarcity later in the pandemic.

Discussion and significance
In this paper, we have provided a synthesis of existing databases of contact tracing applications, supplemented with additional apps found in the course of research, creating, to our knowledge, the most comprehensive collection and evaluation of COVID-19 contact tracing applications to date. Figure 1 indicates the existence of several waves of contact tracing applications. The first applications were developed in March 2020, coinciding with the first wave of national lockdowns around the world. The period of April-May 2020 saw a precipitous rise in applications, somewhat evenly split between WHO regions, with the exception of the United States and Canada. The period of June saw the release of several more applications, but this quickly stagnated in July until an influx of new applications, predominantly from North America and Europe, during the period August-November. The cause of the temporary stagnation during July is unclear. However, it may be related to incorporating privacy-preserving contact tracing into new applications following the release of the Apple-Google implementation in late May 2020, as well as the possibility that our data collection procedure may have missed some apps that were released in this period.
Figure 4 illustrates the wide variance in country-level differences in ethical alignment scores for government-backed applications. We advance several possible explanations for these disparities. First, there may be intrinsic reasons why a country's apps scored higher, such as higher cultural value placed on privacy vs. public health. Similarly, the large number of high-scoring apps implementing the Apple/Google protocol in European and North American countries may indicate greater cultural appropriateness of that particular protocol to Western nations or greater appeal of the protocol to Western governments. Second, the variance in ethical alignment between countries may also reflect a government's speed of innovation.
Figures 5 and 6 show an apparent disparity between the ethical alignment scores of early implementations, released before June 2020, and those released subsequently. The disparity bisects applications into two opposing categories: data-centric and privacy-centric (Fahey & Hino, 2020). Early implementations tended to exhibit lower ethical alignment scores and placed more emphasis on data collection than individual privacy. In contrast, later implementations tended to exhibit higher ethical alignment scores, emphasizing individual privacy as well as greater transparency. This transition roughly followed the announcement of the Apple-Google contact tracing implementation in April 2020 (Apple, 2020), which was touted by some as well-formulated privacy-preserving contact tracing (ACLU, 2020). By September 2020, applications reached an asymptote corresponding to an ethical alignment score of 7, on average. Therefore, paradoxically, the governments that were more innovative and willing to experiment with digital technologies for combating COVID-19 may have been more likely to implement apps with sub-optimal privacy features, whereas governments that waited to roll out their own contact tracing application may have benefited from the development of privacy-preserving protocols like the Apple/Google Exposure Notification protocol.
Both Germany and the United States are notable for having a large number of non-governmental contact tracing applications relative to other countries, with 8 and 11, respectively. In Germany, many of these applications were developed as part of the WirVsVirus hackathon, which took place from March 20th to 22nd and whose goal was to bring together coders to rapidly prototype digital tools to help confront the challenges posed by COVID-19, including contact tracing applications (Menger, 2020). In the United States, the significant number of independent applications may be a result of both a huge capacity to produce digital solutions (e.g. Apple and Google, the producers of the eventually dominant Exposure Notification protocol, are located in the United States) and a hesitancy of governmental authorities to develop or sanction their own apps. Figure 7, which contrasts different contact tracing implementations in the United States, shows a clear distinction between government-backed and non-governmental contact tracing implementations. Non-governmental applications tended to exhibit lower ethical alignment scores (as was the case across the world; see table 2) and were developed significantly faster than government-backed applications, with a three-month gap between the release of the last non-governmental application and the first government-backed application. This delayed response may be due to a general mistrust of public health measures or technologies that have the capacity to infringe privacy or track individuals. According to the Best Countries survey conducted by U.S. News, the United States ranks near the bottom in terms of the citizenry's trust in the government with respect to matters of healthcare (McPhillips, 2020).
Another possible explanation is the generally decentralized nature of the United States' public health infrastructure, illustrated by the fact that multiple apps have been rolled out independently by different states, as opposed to a single federal app.
Future Implications. The evolution of these applications throughout the COVID-19 pandemic has several important implications for the future of large-scale epidemiological monitoring. If effective, such a large-scale system could prevent or contain large numbers of infections in the event of an outbreak while avoiding the worst impacts of an indiscriminate lockdown. The same system could also be used under normal circumstances to detect and monitor yearly flu outbreaks. Monitoring flu outbreaks would invariably save lives, but the frequent mutation of flu strains also provides a convenient justification to perpetuate the existence of contact tracing applications without end. In this scenario, the potential exists for governments and companies to abuse contact tracing to perform mass surveillance of civilians using seemingly justified means. The implementation of ethical contact tracing minimizes this possibility by empowering civilians with control over the collection and disclosure of their data and ensuring radical transparency, accountability, and data security. All of these factors are instrumental in building public trust, which would be a minimum requirement should a situation ever arise in which some degree of digital surveillance is needed to protect the public against the outbreak of an infectious disease.
Another implication of these analyses concerns the way in which standards around privacy and ethics are set. In lieu of a central authority to enforce strong standards early in the pandemic, Apple and Google effectively became the arbiters of ethical contact tracing, in some instances indirectly pressuring governments to adopt their protocols (in the case of the UK's NHS app) (Sabbagh & Hern, 2020) or outright banning apps that did not conform to their standards (in the case of Iran's ac19r app) (Sato, 2020). Although Apple and Google's protocol did enforce strong protections for individual rights, in our view, it is crucial that future decisions around public health technologies be made in a way that involves multiple stakeholders, including public health authorities, elected representatives, and the general public.

Conclusion
Now that vaccines have been approved for widespread distribution, we are likely to see a gradual return to the normalcy of public life and an easing of government restrictions. However, the virus will likely remain with us for an indefinite period until a sufficient level of immunity is reached through vaccination. Using mobile phone data to track, trace, and quickly respond to new infections and flare-ups has drawn much attention as a potential solution. However, high levels of public adoption are critical to the success of contact tracing applications. Since coercing the public into using these applications would constitute an unacceptable violation of several critical ethical principles, the main avenue to promote adoption is building public trust. It is vital that the public feel that they are in control of their data at all times. Thus, we are heartened to see that most applications, especially those released later in the pandemic, use non-coercive, opt-in, decentralized, and anonymous contact tracing schemes.
It is arguably still too early to tell whether the use of the contact tracing applications discussed here has altered the course of the pandemic in a meaningful way, especially since many applications have been released somewhat recently. However, it is clear from the data presented here that there has been significant heterogeneity, both within and between regions, and across the evolution of the pandemic, in terms of the ethicality of these applications. Crucial to investigating whether these apps have proved efficacious would be the release of precise download numbers from the app developers, which would allow researchers to determine whether uptake reached the thresholds theorized to ameliorate the spread of infection. We hope that future work can build on the database we have assembled to further probe whether these apps have been helpful and how the use of similar technology in future public health crises can be better managed.