Data visualization in the time of coronavirus

Data visualizations about Covid-19 are currently proliferating in the media, which makes this a convenient time to study the topic from the perspective of different disciplines, including information design and mathematics. Although the abundance of such pandemic representations would already be a legitimate reason to address the issue, it is not the central motivation of the present discussion. The uniqueness of the epidemiological phenomenon that we are experiencing highlights new aspects of the production and use of data visualizations, one of which is their diversification beyond the counting and visual representation of events related to the spread of the virus. In this sense, the article discusses, through the analysis of examples, three different approaches to this type of schematic representation, namely: visualization of hypothetical data, visualizations based on secondary data, and visualization for social criticism and self-reflection. Ultimately, we argue that design contributes to the production of data visualizations that can help people understand the causes and implications of the new coronavirus and encourage civic responsibility through self-care and the practice of social distancing.


INTRODUCTION
Today, the expression "unprecedented" is heard quite often. It's as if the language we use to explain our world is falling apart and the superlatives cannot keep up with the new reality brought about by the new coronavirus pandemic. Since March 2020, our lives have been hit by an avalanche of questions that are difficult to answer: "When will this end? How can I protect myself and others? How long will schools remain closed? Am I going to lose my job?".
The irony of the situation is that these doubts emerge at a time of substantial data production, which, however, doesn't seem to be sufficient to clarify the situation. The pandemic is generating growing amounts of data such as case counts, death counts, recovery counts, testing rates, etc. At the same time, data are often uncertain, incomplete, and difficult to analyze as they are continuously reviewed and updated. Faced with the complexity and volatility of the phenomenon, scientists, journalists, and public officials strive to make data meaningful. In this sense, data visualization plays a crucial role in the communication process.
We can observe a proliferation of data visualizations about Covid-19 in the media, which makes this a suitable time to study the topic from the perspective of different disciplines, including information design. Information design plays a crucial role in the production of effective data visualizations. Bonsiepe (2000) states that the process of communicating and sharing knowledge is associated with the presentation of knowledge, and that the presentation of knowledge is a central issue of design:
"... without design interventions knowledge presentation and communication would simply not work, because knowledge needs to be mediated by an interface so that it can be perceived and assimilated [...] here is offered a leverage point for information design as an indispensable domain and tool in the process of communicating and at the same time disclosing knowledge." (p. 3)
Giannella, J.R. & Velho, L. (2021). Data visualization in the time of coronavirus. Strategic Design Research Journal. Volume 14, number 01, January-April 2021. 275-288. DOI: 10.4013/sdrj.2021.141.23
The domain of design is the domain of the interface. Design acts on the interface of physical and/or semiotic artifacts by applying principles of graphic communication (gestalt, graphic semiology, etc.), usability, and a user-centered approach in project development. Ultimately, effective data visualization helps people to understand, in different degrees and aspects, the causes and implications of the new coronavirus, and encourages civic responsibility through self-care and the practice of social distancing.
When the pandemic broke out worldwide in mid-March, maps, bar graphs, and line graphs of confirmed cases, among other types of counts, predominated in the media. However, as the effects of the crisis unfolded, the media began to pay attention to alternative types of data that go beyond the mere count of confirmed cases and mortality rates. Concepts such as probability, exponential growth, logarithmic scale, and moving average, once restricted to specialized domains, reached the daily news and needed to be incorporated into data visualization for lay audiences. At this point, we must include mathematical literacy as a second indispensable competence for the production and consumption of data visualizations.
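To make two of these concepts concrete, the following minimal Python sketch computes a trailing moving average and shows why exponential growth appears as a straight line on a logarithmic scale. All numbers are hypothetical and the function names (`moving_average`, `doubling_series`) are introduced here only for illustration; they do not come from any of the visualizations discussed.

```python
import math

def moving_average(values, window=7):
    """Trailing moving average. For the first window-1 positions,
    the average uses only the values available so far."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

def doubling_series(initial, doubling_days, n_days):
    """Hypothetical cumulative counts that double every `doubling_days` days."""
    return [initial * 2 ** (day / doubling_days) for day in range(n_days)]

# Hypothetical outbreak: 10 cases on day 0, doubling every 3 days.
cases = doubling_series(initial=10, doubling_days=3, n_days=15)

# On a linear scale the day-to-day increase keeps growing...
linear_steps = [b - a for a, b in zip(cases, cases[1:])]

# ...but on a log scale every day adds the same amount,
# which is why the curve looks like a straight line.
log_cases = [math.log10(c) for c in cases]
log_steps = [b - a for a, b in zip(log_cases, log_cases[1:])]
```

In this sketch the entries of `log_steps` are all equal (up to floating-point rounding), while the entries of `linear_steps` grow from day to day. The moving average plays a different role: it smooths irregular daily reports, such as lower counts on weekends, so that the underlying trend is easier to read.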
Perhaps the most emblematic data visualization example that we can cite in this context is the "flatten the curve" chart, which (forgive us the double meaning) went viral and gained countless adaptations. However, to talk more about this chart, we need to take a step back.
The practice of data visualization, and the reflection on it, have a long history that goes back to the empirical production of maps and charts by pioneers such as Joseph Priestley, William Playfair, and Charles Joseph Minard. Additionally, we should mention the systematization of a theory of visual-graphic language proposed by scholars such as Jacques Bertin (2010), Edward Tufte (1983), and Johanna Drucker (2014), among others. In recent years, data visualization has become increasingly present in our daily experiences. Even before the Covid-19 outbreak, it had already become popular in mass communication. Viegas and Wattenberg (2015) highlighted that "... from the perspective of journalism, is that data visualization is an essential part of this communication process. Today, a data-driven story without a chart is like a fashion story without a photo".
Disclosure of concrete and in-process events, that is, a set of data updated in real time in a controlled environment.
None of the situations, however, describes the Covid-19 phenomenon: the pandemic, at the moment, is an event in process in an uncontrolled environment. The situation driven by Covid-19 is unprecedented both in the speed and unpredictability with which the virus spreads and in the fast pace at which news and analysis are published and disseminated in the media. The simultaneity between the phenomenon and the analysis of the phenomenon consolidates a new approach to data visualization. We will call this approach visualizing hypothetical data; its focal example is the "flatten the curve" chart. In this approach, data visualization moves away from its primary function, which is to evidence concrete events, and instead demonstrates concepts and mathematical models using data obtained in simulations.

VISUALIZING HYPOTHETICAL DATA
The Washington Post story signed by Stevens (2020)   The use of an icon in the middle of the sentence (Figure 2)   Next, the article uses simulations of epidemic spread to clarify why social distancing is, according to public health professionals, the best approach to slowing contagion and minimizing the impact of the disease. To support this argument, the article introduces elementary notions of dynamic systems in epidemiological situations by constructing a scenario for the propagation of a fictitious disease called "simulitis". The article does not justify inventing a disease instead of basing the illustration on Covid-19 itself. However, the choice can be understood as a narrative strategy to simplify the explanation by eliminating parameters that are not essential for an initial understanding.
The explanation of the spread of the "simulitis" disease is done at two levels: the first to clarify the categories of people involved in an epidemiological scenario and the second to simulate the spread in a small population.
At the first level of explanation, without explicitly mentioning the SIR model (Kermack & McKendrick, 1927), the text divides a given population into three categories: healthy person (equivalent to category S, Susceptible), sick person (equivalent to category I, Infected), and recovered person (equivalent to category R, Recovered). To clarify the contagion dynamics, the article presents two consecutive animations. The first animation reinforces that a viral disease spreads through contact between a sick person and a healthy person. When a brown ball touches a blue ball, the latter acquires the color of the former, a visual metaphor for contagion (Figure 3a). The second animation shows that a sick person (brown ball) probably becomes a recovered person (purple ball) (Figure 3b).
At the second level of explanation, the article simulates the SIR model in a city, also fictitious, with a population of 200 people, under four scenarios: a) free-for-all circulation; b) forced quarantine; c) moderate distancing; and d) extensive distancing. Each scenario is communicated to the reader through an animation composed of three visual-graphic elements: 1) a count; 2) a chart of change over time; 3) a simulation of the contagion and recovery dynamics among individuals in the population. Figure 4a illustrates the three elements at a given moment of contagion in scenario a, that is, without any social distancing measure, and Figure 4b illustrates a certain moment in scenario b, that is, forced quarantine, a reference to the practice imposed by the Chinese government in Hubei province, Covid-19's ground zero. The animation of the latter scenario includes a barrier that separates sick people from healthy people, but as time passes the barrier breaks, demonstrating the impracticality of forced quarantines without a public awareness campaign.
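For readers who want the mathematics behind the three categories, the SIR model is commonly written as a system of differential equations. The formulation below is the standard textbook one, not something shown in the Washington Post article; the symbols β (transmission rate), γ (recovery rate), and N (total population) are notation introduced here for illustration.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta \, \frac{S I}{N}, \\
\frac{dI}{dt} &= \beta \, \frac{S I}{N} - \gamma I, \\
\frac{dR}{dt} &= \gamma I, \\
N &= S + I + R .
\end{aligned}
```

The first equation corresponds to the first animation (contact between a susceptible and an infected person removes someone from S), and the third corresponds to the second animation (infected people move to R over time). Because the three rates sum to zero, the total population N stays constant. In this reading, social distancing amounts to lowering β, the rate at which contacts lead to new infections.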
After presenting the animations, the article stresses that they are based on hypothetical data that are rendered differently each time the simulation is replayed. However, the charts retain characteristic shapes (Figure 5). Specifically, these are area charts that compare changes over time and show the proportion of the total that each category occupies at a given time. This type of visualization generates insights into general trends and relative values. Without going deeper into the analysis of the charts and the implications of each scenario, the journalist maintains that "moderate social distancing will usually outperform the attempted quarantine, and extensive social distancing usually works best of all." (Stevens, 2020).
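The idea of hypothetical data that change on every run while keeping a recognizable shape can be sketched with a toy stochastic simulation. The Python code below is not the Washington Post's implementation, which is not described in the article; it is a minimal sketch under stated assumptions. The population of 200 comes from the article, while the contact counts, transmission probability, and illness duration are hypothetical parameters chosen only for illustration. The forced-quarantine barrier is not modeled.

```python
import random

def simulate(contacts_per_day, population=200, initial_infected=1,
             p_transmit=0.2, days_sick=10, n_days=120, seed=None):
    """Toy stochastic SIR run.

    Each day, every infected person meets `contacts_per_day` randomly
    chosen people; a susceptible contact is infected with probability
    `p_transmit`. Infected people recover after `days_sick` days.
    Returns a list of (S, I, R) counts, one tuple per day.
    """
    rng = random.Random(seed)
    state = ["I"] * initial_infected + ["S"] * (population - initial_infected)
    sick_for = [0] * population
    history = [(state.count("S"), state.count("I"), state.count("R"))]

    for _day in range(n_days):
        infected = [i for i, s in enumerate(state) if s == "I"]
        newly_infected = set()
        for _person in infected:
            for _contact in range(contacts_per_day):
                other = rng.randrange(population)
                if state[other] == "S" and rng.random() < p_transmit:
                    newly_infected.add(other)
        # Recoveries and new infections are applied after all contacts,
        # so the order in which people are processed does not matter.
        for i in infected:
            sick_for[i] += 1
            if sick_for[i] >= days_sick:
                state[i] = "R"
        for j in newly_infected:
            state[j] = "I"
        history.append((state.count("S"), state.count("I"), state.count("R")))
    return history

def peak_infected(history):
    """Largest number of simultaneously infected people in a run."""
    return max(day[1] for day in history)

# Hypothetical contact levels standing in for three of the scenarios.
scenarios = {"free-for-all": 8, "moderate distancing": 3, "extensive distancing": 1}
for name, contacts in scenarios.items():
    run = simulate(contacts)  # no seed: a different run every time
    print(name, "peak infected:", peak_infected(run))
```

Running the script several times gives different numbers on each run, which mirrors the article's point that the animations never repeat exactly. Passing a fixed `seed` makes a run reproducible. The (S, I, R) tuples in `history` are the series that a stacked area chart of the kind described above would display.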
The Washington Post article simplifies the complex and dynamic phenomenon of epidemic spread. Under concrete conditions, other factors need to be considered, including the death rate of the disease, which, in the case of Covid-19, is real. Except for the first chart  The Economist report points out that the smoother curve means fewer people infected at the same time, which reduces the chances of a collapse of the health system and, in turn, results in fewer deaths. However, this information is only provided in the text. After its first publication in the mainstream media, the "flatten the curve" chart was adapted countless times, inside and outside the journalistic context. Barclay and Scott (2020) and Roberts (2020)

VISUALIZATIONS BASED ON SECONDARY DATA
There is a second wave of data visualizations that reflects the developments involved in the practice of social distancing. These are visualizations based on secondary data, that is, data indirectly related to the advance of the pandemic. We selected some visualizations to comment on and divided them into three axes.

Socio-environmental impact
Popovich (2020)   The company provides anonymous records of GPS locations of users who have chosen to share their data anonymously in the U.S. from February 26, 2020, to March 25, 2020.

Socio-economic impact
On March 27 (2020a) and May 9 (2020b), the print edition of The New York Times published two visually striking front pages (Figure 10) about the wave of unemployment that has hit the United States. The two front pages use a similar strategy: they use charts on a linear scale to highlight the disproportionate peak of unemployment in recent weeks. Data, however, are not neutral units of information, and questioning them is a critical attitude. Yglesias and Animashaun (2020) share some considerations on the data used to address the current unemployment problem. Regarding the March 27 front page, we can argue that unemployment insurance claims in the U.S. have increased dramatically in recent weeks not only because of possible layoffs but also because a law recently enacted in the country has expanded the eligibility of beneficiaries. At the same time, unemployment insurance claims are likely to be underreported because the state systems responsible for receiving requests cannot handle the high volume of claims. This suggests that the actual number of requests exceeds the number registered by official agencies.
Regarding the May 9 front page, we can consider the fact that the monthly change in unemployment is measured in absolute numbers. If it were based on the unemployment rate, would the chart shape have the same visual strength? The NYT itself addresses this aspect by bringing a secondary chart to the front page (Figure 11), which shows the unemployment rate over the same period. The unemployment rate in April 2020 is the highest recorded in history, but showing the numbers as a rate is less visually striking than presenting the variation in absolute numbers.  Leatherby and Gelles (2020)

SOCIAL CRITICISM AND SELF-REFLECTION
There is a third wave of data visualizations leveraged in the context of the pandemic. We are experiencing a health crisis that produces enormous amounts of data that, in turn, are processed and released almost in real time. However, these data, even when visualized, cannot address all our doubts or alleviate our anxieties.
D'Ignazio (2015) draws attention to the fact that we need to invent new ways of representing uncertainty, outward looks, missing data, and defective methods: "While visualizations - particularly popular, public ones - are great at presenting wholly contained worlds, they are not so good at visually representing their limitations. Where are the places that the visualization does not go and cannot go? Can we put those in? How do we represent the data that is missing?"
From this perspective, data visualization practices arise that are guided by other motivations. We can mention, on the one hand, examples that promote social criticism and care for the community and, on the other hand, typical cases that foster self-reflection. visualizations on the subject. The series was shared on her profile on Instagram (Chalabi, 2020). Figure 14 depicts a selection of three data visualizations that show the greater vulnerability of refugees, Black people, and prisoners in the face of the pandemic.

Self-reflection and expression
Regarding data visualization practices aimed at self-reflection and expression, we want to highlight a more paradigmatic change in the flow of data visualization production during the pandemic. So far, we have cited data visualization practices produced by the media (official or alternative) to clearly and reliably inform a wide audience. In addition to this purpose, the Covid-19 context highlights the emergence of data visualizations created for other purposes, such as self-reflection and expression. Along these lines, initiatives such as Diario Visual de la cuarentena (Errea, 2020), Data Selfie da quarentena (Giannella, 2020), and Quarantine portrait (Massimetti & Testa, 2020) (Lupi & Posavec, 2015), the workshop aimed to encourage the participant to reflect on the quarantine period through the collection, representation, and communication of personal data that constituted the background of their activities during isolation. Figure 15 is an example of a data selfie designed by a student. The representation portrays a week of the nostalgic feelings experienced by its author. The horizontal line represents that specific week, starting on Monday (Segunda) and ending on Sunday (Domingo). The color variation represents the two categories of nostalgia: red depicts nostalgia for outdoor activities; blue, for indoor ones. The upward or downward orientation of the line represents whether or not the student spoke about the feeling she was having. Finally, the height of the line indicates the intensity of the feeling (low, medium, high).

CONCLUSIONS
Crises like a pandemic require abundant and consistent communication based on reliable sources. Also, due to its complex and data-based nature, the phenomenon requires visual communication. Different agents take on the mediation of the communication process, including designers. The media, from mainstream vehicles to academic publications and independent initiatives, must be judicious and sensible when selecting and structuring data about Covid-19. Data visualizations are powerful resources for communicating information, but when misused they can deceive and misinform.
In this article, we sought to highlight the contribution of design to the process of producing data visualizations that inform, clarify, and express multiple nuances and facets of the representation of the new coronavirus. Through exploratory research, we selected and analyzed examples of data visualizations that characterize their diversification beyond the counting of events. Without exhausting the subject, the research points to possible future work.
Covid-19 exposes challenges for the visual representation of data: how to explain and describe exponential growth in a clear and accessible way? How to correctly compare the spread of the virus in different countries? How to visually explain uncertainty in numbers such as case counts when the tests performed are not always sufficient to make them reliable indicators of real cases? Rather than reducing the complexity of the phenomenon, data visualizations should embrace visual conventions specific to the context and make use of annotations, narrative strategies, and caveats.