Why do data scientists tell stories?


Data storytelling: what is behind this new discipline? An interview with Frank Pörschmann, Vice President of the Digital Analytics Associations e.V., Germany's largest professional association for data experts. It also serves as a detailed introductory article on this new and exciting topic for storytellers.

Data and storytelling, how do they fit together?

Storytelling is a key to successful data analysis today. It is no coincidence that storytelling seminars are currently the most popular advanced training courses for professional data analysts.

What distinguishes a data professional is the ability to convey complex, multi-layered and abstract findings in a comprehensible and compact manner. He must therefore master two languages and translate in both directions: from data to business and from business to data.

But there is also a problem here. The training of data analysts in Germany focuses almost entirely on methodological craft and far less on communication. Yet we are the land of poets and thinkers. The art of data poetry is unfortunately still not very well developed in this country.

What makes storytelling so relevant when it comes to data?

The following still applies in organizations of all types: "Nothing is more expensive than a bad decision."

Research in decision science shows that the quality of a decision depends much less on the experience of a manager than previously assumed. It depends primarily on the available knowledge and information base and on the ability to assess possible consequences. This changes the role of the decision maker and justifies the data expert's place on the decision team as an additional source of information.

But since only a few people speak the language of the data world, it has to be translated. Not into numbers, facts and probabilities, but ideally into good stories. Many of the tools of storytelling help here: narrative structures, dramaturgies, knowledge of the audience, identification of protagonist and antagonist, etc.

Data storytelling focuses on the customer as a hero

What should you watch out for when turning data into stories?

Data storytelling is still a young discipline. There is no single method here, even if structure-loving data scientists wish there were. Creativity and empathy are required, but above all enough time for stories to develop. In my experience, six points are decisive:

  1. Know (learn) the language and thought patterns of your clients and build on them
    Pay close attention to the language and structures of your counterpart, e.g. flowery or martial, concrete or metaphorical, result- or process-oriented, starting as early as the briefing and the interim coordination meetings. Language is often very revealing, even in top management.
  2. Focus on the question of decision
    It's not about what the data says in general, only about how it bears on the decision. By the way: as a data expert, it is perfectly legitimate to force your client to articulate their question clearly. This can take several days.
  3. Look for moments of tension, incorporate dramaturgies
    Nothing is more tiring than a string of numbers, data and facts. Facts support core statements. The more vivid and emotional the key message, the better. The simplest form: “You expected X in the preliminary discussion, now let's see what is really the case.” Professionals develop complete dramaturgies with roles and multi-layered levels of conflict.
  4. Take time
    Here I see the number one killer. Even the best data professionals destroy the effect of excellent work here. It is not uncommon for the document to be created the day before the presentation date. The classic: time in minutes / 5 = number of slides. Then material is simply collected to fill them. Awful! A good story takes time and a systematic approach. Experienced experts start developing a first hypothesis-based strawman, a storyboard so to speak, right from the start. Even if the data later reveals completely different findings than expected, the structure remains. Some top management consultancies have been working on this principle for years.
  5. There is only one author
    Have you ever tried to write a poem in a team? Collaboration and art have their limits. Therefore, someone has to guide and lead the storytelling. The team's role is to make sure the story “works” and to contribute ideas. The goal: the story must be understandable, relevant and plausible, and it must enrich the listener (i.e. the client) and provide a solution.
  6. The customer is the hero, whether tragic or funny; the main thing is that he succeeds
    The structure of the classic hero's journey also applies to the data world. The customer / client is already the protagonist by virtue of office and function. Everything he drives forward is done with the aim of success. The decision question is tantamount to a struggle between good and evil: the outcome is uncertain, and the hero has to hold his own amid interwoven conflicts (in the business world these are conflicts of goals, values or resources). In the end, it's about the plan that resolves the tension and lets the customer be a hero, but based on facts.

Are there dangers, for example due to oversimplification?

The main dangers lurk in the imperfection of human thought. On the one hand there is the danger of bias, on the other hand the human weakness of failing to distinguish correlation from causation.

Bias is the term used to describe subconscious tendencies of judgment in one's own thought process. There is an abundance of well-researched biases, which can basically be assigned to four categories, namely those caused by:

  1. too much information
  2. too little meaning
  3. strong pressure to act / make decisions
  4. the limits of human memory

A detailed overview of the cognitive biases can be found here.

Everyone, data experts and decision-makers alike, is subject to their own biases. It is important to filter these out in advance as far as possible and not to carry them unnoticed into the analysis or the later story.

The problem of misunderstood correlation is common in the media. If events show a similar development, they are called correlated, regardless of whether the similarity is direct, indirect or purely coincidental. A relationship is causal only if one event directly depends on the other. For example, a US study showed a connection between people who have tattoos and their likelihood of committing a crime. There was no causal connection. For the mainstream press it still seemed “a good story” and sufficiently plausible, and it fitted so wonderfully into an ingrained cliché (which is also a bias). Experts later found serious methodological flaws in the study, but by then the story had already lodged in people's minds.

A number of flimsy correlations can be found, for example, on the website "Spurious-Correlations" (http://www.tylervigen.com/), which I can recommend to everyone for entertainment. There, for example, the divorce rate in the US state of Maine correlates with per capita margarine consumption in the US.
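How two unrelated quantities can end up looking strongly correlated can be illustrated with a small sketch. The Python example below uses made-up numbers, not the figures from the website: `series_a` and `series_b` are hypothetical yearly values that both happen to decline over time, each with its own independent random noise. The shared trend alone is enough to produce a high correlation coefficient.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def diffs(xs):
    """Year-over-year changes of a series."""
    return [b - a for a, b in zip(xs, xs[1:])]

rng = random.Random(0)  # fixed seed so the run is repeatable
years = range(10)

# Hypothetical data: two series that share nothing but a downward trend.
series_a = [5.0 - 0.1 * t + rng.gauss(0, 0.05) for t in years]
series_b = [8.0 - 0.4 * t + rng.gauss(0, 0.2) for t in years]

r_levels = pearson(series_a, series_b)                  # driven by the common trend
r_changes = pearson(diffs(series_a), diffs(series_b))   # trend removed

print(f"correlation of the yearly values:  {r_levels:.2f}")
print(f"correlation of the yearly changes: {r_changes:.2f}")
```

Comparing the year-over-year changes instead of the raw values removes the common trend, and the correlation that remains is then typically much weaker. That is one simple plausibility check before a correlation is turned into a story.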

Data storytelling translates the results from high-performance computers into the real world

Is big data a special challenge? Why?

The word alone is a challenge. On the one hand it is shaped by the visions of the large IT companies, on the other hand it is emotionally charged by the horror scenarios of consumer advocates. Technically speaking, we talk about big data when four conditions are met. The data arrives:

  1. in massive amounts
  2. in real time
  3. in different, unpredictable structures and
  4. with changing, fuzzy information content.

Technically, that's not even the challenge anymore. The dangers lie in the use and interpretation. The more data I have and the more of it I relate to each other, the more correlations I can find. This allows all the more stories to be derived, including those that support a bad decision. A good deal of expert knowledge is required here in order to translate the results from high-performance computers into the real world.
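The effect of "more data, more correlations" can be sketched in a few lines of Python. The example is purely illustrative and rests on assumptions that are not part of the interview: every column is independent random noise, there are only 20 observations per column, and the helper name `strongest_chance_correlation` is hypothetical. Because the number of possible pairs grows roughly with the square of the number of variables, the strongest correlation found by pure chance can only go up as more columns are included.

```python
import random
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def strongest_chance_correlation(columns):
    """Largest absolute correlation among all pairs of the given columns."""
    return max(abs(pearson(a, b)) for a, b in combinations(columns, 2))

rng = random.Random(42)  # fixed seed so the run is repeatable
N_OBS = 20               # assumed number of observations per variable

# 80 columns of pure noise: by construction, none is related to any other.
noise = [[rng.gauss(0, 1) for _ in range(N_OBS)] for _ in range(80)]

results = []
for n_vars in (5, 20, 80):
    subset = noise[:n_vars]               # use only the first n_vars columns
    n_pairs = n_vars * (n_vars - 1) // 2  # number of pairs that can be compared
    results.append((n_vars, n_pairs, strongest_chance_correlation(subset)))

for n_vars, n_pairs, strongest in results:
    print(f"{n_vars:3d} variables, {n_pairs:5d} pairs, strongest |r| = {strongest:.2f}")
```

None of these correlations means anything, yet each could be dressed up as a story. This is why the question being asked should be fixed before the search for correlations begins.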

Are there examples of good and bad data stories?

Good stories can vary greatly in content and structure. In the end, they have one thing in common: the decision maker can clearly articulate his decision himself and communicate it to others, while at the same time he experienced the path to the decision as fluid and smooth without knowing why. That's the magic of storytelling. This art remains hidden from the listener.

I once had a case in which a data scientist wanted to come to a board meeting with 165 different frequency distributions. The expert could not understand my criticism. “Those are the facts,” he replied.

In another case, a customer wanted to know whether he should accelerate or slow down the development of a new digital product and hoped an intelligent AI algorithm could predict the future for him. The first thing to do here was to sharpen the decision question into one that can actually be analyzed using data. Neither data experts nor algorithms can assume management responsibility.

Data storytelling supports good decisions

How do data experts actually communicate with one another?

For outsiders, data experts speak a foreign language, with foreign vocabulary and impenetrable grammar. For data experts, it's just their jargon. Unlike natural language, perhaps, the language of mathematics impresses with its unambiguity and precision. And yet here, too, there are regular misunderstandings among professionals.

The challenge lies in the translation. If a data expert has to exchange ideas with decision-makers or other experts, translation or foreign language skills are required. This is exactly what data storytelling is.

Is that possible: data expert and storyteller in one person?

There is nothing to be said against it. However, reality shows that the combination is still very rare. On the one hand, this is due to the training: storytelling is not part of the data education repertoire. On the other hand, data experts are often by nature lovers of precision and abstraction. Not all of them want to learn a new language, an emotional one at that, and try their hand at it creatively.

The roles of data visualizer and data storyteller have already emerged. Good experience with them comes from classic consulting: teams that combine data scientists with business consultants experienced in storytelling have proven to be very effective.

As with foreign languages, the same applies here: you don't have to be fluent in the new language of storytelling. Even a first basic vocabulary helps you convey your skills better.

Are there any preferred techniques?

In the data world, we fundamentally differentiate between the discipline of visualization and actual storytelling.

When it comes to visualization, there are extensive techniques, methods and hundreds of different visualization formats, from simple two-dimensional distributions to multi-dimensional, multi-colored mappings and flowing structures. Data visualizers master these and have already received international awards for their work. There, too, the boundaries between graphics and art are blurring.

In contrast, data storytelling is still very young. It's not very systematic and, if we're honest, we're all still experimenting with it. However, findings, basic structures, guidelines and checklists from classic storytelling certainly help. Methods such as storyboards and role profiles are transferable, and the no-gos are the same: what breaks a novel also destroys a data story. What is new and different in data storytelling is the focus on economic relationships and conflicts as well as the handling of the decision question.