In data science, it is often said that storytelling skills are essential. For example, Forbes declares in "Data Storytelling: The Essential Data Science Skill Everyone Needs" that "For thousands of years, storytelling has been an integral part of our humanity. Even in our digital age, stories continue to appeal to us just as much as they did to our ancient ancestors. Stories play a vibrant role in our daily lives—from the entertainment we consume to the experiences we share with others to what we conjure up in our dreams" (Dykes, 2016). The importance of storytelling in data science was recognized as early as 2012. For example, Andrew Nusca, in a ZDNet post published on December 13, 2012, titled "The key to data science? Telling stories," wrote: "Like journalism, there are many stories to tell from the same set of data, and data scientists must choose carefully" (Nusca, 2012).
In other words, in data science, good storytelling means that data-driven solutions are communicated clearly, concisely, and directly to each relevant target audience. This clear communication benefits both the researchers/writers and, in turn, the audience/readers. It increases the researchers' chances of having their work published and acknowledged; this acknowledgement occurs once the audience has read the text, and it can be measured in citations by other researchers, adoption of ideas by policymakers, or clicks by lay readers. To achieve this, however, the data must clearly tell a story.
The question, then, is what tool data scientists can use to convey their story effectively. A data scientist's familiarity with the organization is not sufficient to convey a story. Data scientists should also understand the industry to which their organization belongs, asking questions such as: What are the industry's impact, role, and contribution to society, and what problems does it create? Such an understanding helps data scientists choose the stories they tell about the data. In this spirit, the Tableau Data Trends 2022 report states that "Data becomes the language for people and organizations to be seen, have their issues understood, and engage with institutions intended to serve them" (Setlur, 2022, p. 21).
Therefore, we propose that one way to construct the story is to apply the rhetorical triangle. It not only guides the development of the story, but also helps adjust the content, jargon, and aim of the story to the relevant audience. Further, as storytelling may not be a common practice among data scientists, the guidance that the rhetorical triangle provides may appeal to them due to its simple structure and ease of implementation, as we show below.
The rhetorical triangle reflects a mode of argumentation that is expected from scientists and engineers. It shows that communication that properly matches the expectations of the audience in terms of format, language, and rhetorical moves will create a convincing argument. In fact, the more surprising or controversial an argument, the greater the need to apply the triangle's principles to convince the relevant audience.
As an equilateral triangle, the rhetorical triangle is composed of three equally weighted parts for delivering a scientific message: logos, pathos, and ethos. Each part targets a different aspect of our human attention. Logos is the internal logic of the message, including evidence, which is a common way for scientists and engineers to impart their message; pathos appeals to emotion, but also demonstrates shared beliefs and knowledge; and ethos is the resulting trust in the writer/presenter based on their perceived authority and professionalism (Aberšek & Aberšek, 2010).
We can use each part of the triangle to guide (data) scientists in telling the story of their data. To start, data scientists will probably find that logos, the logical content, is the clearest part to write. Nevertheless, here too, logos must be tailored to a specific type of reader, i.e., deciding which and how much information the reader needs to understand the data, as well as providing the appropriate connections between the ideas.
Pathos is more complex; it can be expressed in the introduction by referring to shared values (Lunsford, Ruszkiewicz, and Walters, 2010). For example, if we examine research from the current COVID-19 pandemic, there could be several different shared values depending on the target audience. In the medical field, this could be healing the sick or preventing sickness; among the general population, this could be allowing children to continue with their everyday routines or enabling people to maintain their economic status.
Ethos, the ethical appeal, also refers to credibility and trust. Credibility can be judged on the choice of reliable sources or the expertise of the writer/presenter. Moreover, solid logos and pathos are the basis that strengthens ethos, which can be seen as the way the information is 'packaged'. A suitable package will hold the audience's attention so that they keep reading. Have we used appropriate language? Have we been cautious in making claims? Have we provided reliable sources? Have we expressed our ideas clearly? This whole package will contribute to the writer's authority.
To illustrate how the rhetorical triangle is applied, we use the following data science project, which should be conveyed to four audiences: lay (the general public), investors, policymakers, and professionals in the field.
A company that develops a navigation app has been in operation for 10 years. Navigation apps are used by millions of users worldwide. They collect information about their users, including the time, location, frequency, stops, speed, and routes of their trips. They also save information about roads, homes, businesses, infrastructure, and names. The company plans to add a new feature that uses its voice recognition technology. Such technology would also allow the company to provide security, not only in cases when users must talk rather than type, but also when, for example, someone attempts to use their phone without permission. These cases require permission to listen to what users are saying and to record their voice. The company needs to convince various audiences to support this initiative, each for its own reasons.
We consider the company's chief data officer (CDO) and how she presents the new feature to different audiences. She joined the company two years ago, after she sold her third startup.
| Audience / Triangle edge | Logos (the content) | Pathos (shared beliefs) | Ethos (credibility and trustworthiness of the presenter) |
| --- | --- | --- | --- |
| Lay/general | Information about what you will need to do to use or disable the feature | Voice recognition is important for personal security; we take care to prevent data breaches | We have tested our system; so far, for the last 10 years, no information has been stolen from our company |
| Investors | Numbers about the potential companies that would use this app because of the new feature and the number of potential users they might have | Voice recognition is important - the market for security apps is growing; we take care to check the market and the competition and prices | The CDO has experience in similar projects in a past company, a startup that she established and sold, as published in the press |
| Policymakers | Demonstrate how the company can ensure this app is used legally, and which countries already use similar apps | Voice recognition is important for security for preventing crime, monitoring, safety reasons; we take care to prevent data breaches | The CDO has worked with other similar projects in her own companies and with others around the world |
| Scientific/professional | Explain how the technology works (technical details) | Voice recognition is important for security - can be improved by faster technology with fewer options for corruption | The CDO has researched the field and developed a similar product that deals with security in the startup that she sold |
When we compare the three edges of the triangle across audiences, we can see that in several cases there is a general, common element whose specifics change for each audience. For instance, for all audiences, pathos begins with the subject of security, and then the motivation for security changes according to each audience.
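For readers who like to see such a comparison as a data structure, here is a minimal sketch, assuming Python. The names `Appeal`, `TRIANGLE`, `shares_theme`, and `outline` are hypothetical, and the strings shorten the table entries for the navigation app example. It is one possible way to organize the three edges per audience, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Appeal:
    """The three edges of the triangle for a single audience."""
    logos: str   # the content
    pathos: str  # shared beliefs
    ethos: str   # credibility and trustworthiness of the presenter


# Hypothetical encoding: the strings shorten the entries of the example table.
TRIANGLE = {
    "lay": Appeal(
        logos="What you need to do to use or disable the feature",
        pathos="Voice recognition is important for personal security",
        ethos="No information stolen from the company in the last 10 years",
    ),
    "investors": Appeal(
        logos="Numbers on potential client companies and their users",
        pathos="The market for security apps is growing",
        ethos="The CDO established and sold a startup with a similar project",
    ),
    "policymakers": Appeal(
        logos="How legal use is ensured; countries already using similar apps",
        pathos="Security for preventing crime, monitoring, and safety",
        ethos="The CDO has worked on similar projects around the world",
    ),
    "scientific": Appeal(
        logos="How the technology works (technical details)",
        pathos="Security can be improved by faster technology",
        ethos="The CDO researched the field and built a similar product",
    ),
}


def shares_theme(rows, edge, keyword):
    """Check whether the given edge mentions the keyword for every audience."""
    return all(keyword in getattr(row, edge).lower() for row in rows.values())


def outline(audience):
    """Return a three-line talking-point outline for one audience."""
    row = TRIANGLE[audience]
    return [f"Logos: {row.logos}", f"Pathos: {row.pathos}", f"Ethos: {row.ethos}"]


if __name__ == "__main__":
    # Pathos starts from the common subject of security for all audiences.
    print(shares_theme(TRIANGLE, "pathos", "security"))
    for line in outline("investors"):
        print(line)
```

Filling in a structure like this for a new project makes the common element on each edge visible, along with the places where the message must change from one audience to the next.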
We invite readers to pick another data science project and describe it by applying the rhetorical triangle. Here are several possible stories that may be considered: technology for following the spread of COVID-19; a solution for dealing with a type of pollution; or changes in an educational program.
Using the rhetorical triangle is one way to help data scientists effectively present their story to various audiences. The triangle has a clear structure, and focuses on making the data clear and convincing.
Based on this process of applying the rhetorical triangle to data science, we offer several suggestions to our readers. First, we suggest that scientists try to avoid the 'curse of knowledge', that is, forgetting that there was a time when they, too, did not know this information. To do this, scientists can think about when they learned certain pieces of information; for example, if the information requires a degree in the field, then it may be too difficult for the lay audience or may require more explanation. Second, if scientists are debating how much and which background information to give to one of the non-professional audiences, we suggest that they write out the whole story with all of the background they think is needed. Then, the writer can begin to cut irrelevant information, even just by starting from the length limit of the publication. In general, it is important to avoid sounding like a science textbook and giving too much background information.
The awareness needed for such processes is also part of the skills required for data science graduates. This is highlighted in the report of the ACM Data Science Task Force on Computing Competencies for Undergraduate Data Science Curricula, published in January 2021 (Danyluk and Leidig, 2021). For example, the 'Result delivery' skill is mentioned: "[…] on presentation of results, the Data Science graduate needs to explain and interpret the numerical conclusions in the client's terminology, and deliver text and graphics ready to be digested by non-technical personnel" (p. 38).
The relevance today of imparting the rhetorical triangle to data scientists has led us to plan to expand the ideas presented in this post into a research project. Our plan is to investigate how data science students and experts conceive and apply the rhetorical triangle, as well as the impact of this tool on the audience.
References
Aberšek, B. and Aberšek, M. K. (2010). Development of communication training paradigm for engineers. Journal of Baltic Science Education, 9(2), pp. 99–108.
Danyluk, A. and Leidig, P. (January, 2021). ACM Data Science Task Force: Computing Competencies for Undergraduate Data Science Curricula. Association for Computing Machinery. https://dstf.acm.org/DSTF_Final_Report.pdf
Dykes, B. (Mar 31, 2016). Data Storytelling: The Essential Data Science Skill Everyone Needs, Forbes.
Lunsford, A. A., Ruszkiewicz, J. J. and Walters, K. (2010). Everything's an Argument: With Readings. Bedford/St. Martin's.
Nusca, A. (2012). The key to data science? Telling stories. ZDNet. https://www.zdnet.com/article/the-key-to-data-science-telling-stories/
Rakedzon, T. and Rabkin, O. (in preparation, 2022). More than just IMRaD – Rhetorical sensitivity and the rhetorical triangle for impactful scientific writing.
Setlur, V. (2022). AI augments and empowers human expertise. Tableau. https://www.tableau.com/sites/default/files/2022-02/Data_Trends_2022.pdf
Tzipora Rakedzon is Associate Head of the Department of Humanities and Arts of the Technion. Her research focuses on pedagogies and assessment of scientific and professional communication, especially writing and vocabulary. Orit Hazzan is a professor at the Technion's Faculty of Education in Science and Technology. Her research focuses on computer science, software engineering and data science education. For additional details, see https://orithazzan.net.technion.ac.il/.