How to (Not) Lie with Data: Creating Effective Data Visualizations with Python



What is this book about?

We have entered the age of big data: terabytes upon terabytes of information are available for analysis. With so much information, it is increasingly necessary to produce summaries that convey the story behind the data, reveal its patterns, and expose its outliers. That need has driven the development of data visualization, a field that seeks to create effective graphical representations and summaries of data. Everyone wants to make data-driven decisions, and data visualization is essential to understanding your data.

Effective data visualization is important in every career field. Unfortunately, many people have no training in it, so whatever Excel spits out is what gets used. An effective data visualization draws on natural human tendencies and intuitive understanding to convey a message. It is the difference between staring at a graph and staring at rows in a spreadsheet: one conveys a clear message, and the other does not. This book will show you common lies associated with data visualization techniques and how to counter them to create effective graphs, charts, and visualizations.

What you will learn:

This book uses three Python libraries to teach you about data visualization: Matplotlib, Seaborn, and Plotly. Each library has its own advantages and drawbacks, and each offers a different way to drive an effective visualization home to your reader. This book is for anyone with at least a beginner’s understanding of Python: you should know how to write a function, be familiar with the default Python data types, and be able to install Python and set up your environment.

You will learn about many of the standard data visualization techniques:

  • scatter plots
  • line charts
  • histograms
  • box plots and violin plots
  • bar charts
  • pie charts
  • joint plots and pair plots
  • choropleths
  • bump charts, slope charts, and lollipop charts
  • heatmaps
  • hexbinning
  • animation
  • details on demand with graphical user interfaces

Once you have learned the basics, we will use the different visualization techniques to tell stories about the 2018-2019 English Premier League season, the Birthday Problem paradox, the effect of COVID-19 on TSA passenger throughput, and more.
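To give a flavor of the Birthday Problem mentioned above, here is a minimal sketch of the calculation behind it. It is not taken from the book: the function name `shared_birthday_probability` and the assumption of 365 equally likely birthdays are illustrative choices only.

```python
def shared_birthday_probability(n, days=365):
    """Probability that at least two of n people share a birthday.

    Assumes every day is equally likely and ignores leap years
    (a simplifying assumption, not a claim about real birth data).
    """
    if n > days:
        # More people than days: a shared birthday is guaranteed.
        return 1.0
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    for n in (10, 23, 50):
        print(n, round(shared_birthday_probability(n), 3))
```

Under these assumptions the probability first passes 50% at 23 people, which is the surprising result that makes the problem a good candidate for a chart.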

This book will teach you how to make effective data visualizations and how not to lie with data. To do so, you need to learn the rules and the suggestions of data visualization. You need to see how to lie with data so that you can avoid doing it yourself.

Take a test drive of the box plot and violin plot chapter here.