# How to select the best Excel charts for data analysis and reporting

**Index**

## The importance of data visualization

Data visualization is the presentation of data (both qualitative and quantitative data) in graphical format. Through data visualization you can easily:

- Visualise data (make sense of data, especially big data)
- Classify and categorise data
- Find relationship among data
- Understand the composition of data
- Understand the distribution of data
- Understand the overlapping of data
- Determine patterns and trends
- Detect outliers and other anomalies in data
- Predict future trends
- Tell meaningful and engaging stories to decision makers

Data presentation is a very important skill for an optimizer (marketer, analyst). In fact it is so valuable that LinkedIn lists it as one of the top skills that can get you hired in 2017:

Source: LinkedIn Unveils The Top Skills That Can Get You Hired In 2017

Excel charts are commonly used for data visualisation and presentation. But selecting the right excel chart is always a challenge. If you use an incorrect Excel chart for your analysis, you may misinterpret data and make the wrong business and marketing decisions. If you use incorrect an Excel chart for your presentation, then stakeholders may misinterpret your charts and take wrong decisions.

Therefore selecting the right Excel chart is critically important.

## The anatomy of an Excel chart

In order to read an Excel chart, it is important that you understand the various components of the chart.

Consider the following data table in Excel:

This data table has got five **variables**: ‘Month’, ‘Sales’, ‘Cost’, ‘Profit’ and ‘ROI’:

This data table is made up of **categories** and **data series**:

**Categories **– Here the first category is ‘Jan’, second category is ‘Feb’, third category is ‘Mar’ and so on.

**Data series** – A data series is a set of related data points.

**Data point** – Data point represents an individual unit of data. 10, 20, 30, 40 etc are examples of a data points. In the context of charts, a data point represents a mark on a chart:

Consider the following excel chart which is made from the data table mentioned earlier:

This chart is made up of following **chart elements**:

- Primary Horizontal Axis:
- Primary Vertical Axis
- Secondary Vertical Axis
- Primary Horizontal Axis Title
- Primary Vertical Axis Title
- Secondary Vertical Axis Title
- Chart Title
- Data Labels
- Gridlines
- Legend
- Trendline

In Excel, categories are plotted on the horizontal axis and data series are plotted on the vertical axis:

From the chart above, we can conclude the following:

- Months are plotted on the primary horizontal axis.
- Sales, cost and profit are plotted on the primary vertical axis.
- ROI is plotted on the secondary vertical axis.

In order to add, remove or edit a chart element in Excel (2013), follow the steps below:

**Step-1**: Open MS Excel and navigate to the spreadsheet which contains the chart you want to edit.

**Step-2**: Select the chart and then from the ”* Design*‘ tab select ‘

*‘ drop down menu:*

**Add Chart Element**You can also include ‘**Data table’** in the chart above, by following the steps below:

**Step-1**: Click on the chart in Excel (2013)

**Step-2**: Click on ‘* Add Chart Element*‘ > ‘

*‘ > ‘*

**Data Table***‘:*

**With legend Keys**You can now see your chart with data table:

The type of Excel chart you select for your analysis and reporting depends upon the type of data you want to analyse and report and what you want to do with data:

- Visualise data (make sense of data esp. big data)
- Classify and categorise data
- Find relationship among data
- Understand the composition of data
- Understand the distribution of data
- Understand the overlapping of data
- Determine patterns and trends
- Detect outliers and other anomalies in data
- Predict future trends
- Tell meaningful and engaging stories to decision makers

## Most common data types

Following are the most common data types that can be visualised:

**Quantitative data**(also known as interval/ratio data) is the data that can be measured. For example 10 customers, sales, ROI, weight etc.**Qualitative data**is the data that can be classified/categorized but it can not be measured. For example: colours, satisfaction, rankings etc.**Discrete data**– quantitative data with finite number of values / observations. For example: 5 customers, 17 points, 12 steps etc.**Continuous data**– quantitative data with value / observation within a range/interval. For example: sales in the last one year.**Nominal data**– qualitative data that can not be put into a meaningful order (i.e. ranked). For example: {Blue, Yellow, Green, Red, Black}**Ordinal data**– qualitative data that can be put into a meaningful order (i.e. ranked). For example: {Very Satisfied, Satisfied, Unsatisfied, very unsatisfied} or {Strong dislike, dislike, neutral, like, strong like}

## When to use a clustered column chart

#1 Use a **clustered column chart** when you want to compare two to four data series. In other words, avoid using column charts if you have just one data series to plot:

Alternatively, avoid creating a column chart which has got more than four data series. For example following chart contains just five data series and it has already started looking cluttered:

The chart below contains 11 data series and is very difficult to read and understand:

If you want to to create a column chart which contain lot of data series then you can **try switching ‘row’ and ‘column’ of the chart** and see whether it makes any difference:

For example, after switching the row and column of the chart (with 11 data series), it looks like the one below:

Now this chart, though still look cluttered, is much easier to read and understand.

#2 Use a **clustered column chart** when the data series you want to compare have same unit of measurement. So avoid using column charts which compare data series with different unit of measurements. For example in the chart below ‘Sales’ and ‘ROI’ have different unit of measurements. The data series ‘Sales’ is of type number. Whereas the data series ‘ROI’ is of type percentage:

#3 Use a **clustered column chart** when the data series you want to compare are of comparable sizes. So if the values of one data series dwarf the values of the other data series then do not use the column chart. For example in the chart below the values of the data series ‘Website Traffic’ completely dwarf the values of the data series named ‘Transactions’:

#4 Use a clustered column chart when you want to show maximum and minimum values of each data series you want to compare.

#5 Use a clustered column chart when you want to focus on short term trend (i.e. changes over days or weeks) and/or the order of categories is not important.

## Breaking a clustered column chart

The chart below contains 11 data series and is very difficult to read and understand:

One method of making this chart easier to read and understand is by breaking it into several smaller clustered column charts. For example, you can create one column chart which just compare the sales performance of various countries in January. Create another column chart which just compare the sales performance of various countries in Feb and so on:

The rule of thumb is to **avoid presenting too much data in one chart, regardless of the chart type you use.**

## When to use combination charts

A combination chart is simply a combination of two or more charts. For example the combination of a column chart with a line chart. I use combination charts a lot and I think you must know how to create them as they are very useful. Following is a short video on creating a combination chart in excel:

#1 Use a combination chart when you want to compare two or more data series that have different units of measurement:

#2 Use a combination chart when you want to compare two or more data series that are not of comparable sizes:

## When to use a stacked column chart

Use a stacked column chart when you want to compare data series along with their composition and the overall size of each data series is important:

## When to use a 100% stacked column chart

Use a **100% stacked column chart** when you want to compare data series along with their composition but the overall size of each data series is not important:

## When to use a bar chart

#1 Use a bar chart whenever the axis labels are too long to fit in a column chart:

## When to use a line chart

#1 Use **line charts** when you want to show/focus on data trends (uptrend, downtrend, short term trend, sideways trend, long term) especially long term trends (i.e. changes over several months or years) between the values of the data series:

#2 Use **line charts** when you have too many data points to plot and the use of column or bar chart clutters the chart.

#3 Use a line chart instead of clustered column chart if the order of categories is important:

## When to use a number chart?

If you want to visualise just one type of data and it contains a numeric value which does not fall in any range/interval then use the number chart:

## When to use a gauge chart (also known as speedometer chart)

If you want to visualise just one type of data and it contains a numeric value which falls in a range/interval then use the gauge chart (also known as speedometer chart):

## When to use a scatter chart

#1 Consider using a scatter chart when you want to analyse and report the **relationship/correlation** between two variables:

From this chart we can conclude that the relationship between the two variables (‘x’ and ‘y’) is linear. What that means, as the value of the variable ‘x’ increases there is a corresponding increase in the value of the variable ‘y’.

#2 Create a scatter chart only when there are ten or more data points on the horizontal axis. The more data points the better it is for a scatter chart. Conversely just a few data points (like five or six data points) are not good enough for creating a scatter chart.

#3 Use a scatter chart when you want to show ‘why’. For example: why revenue is correlated with average order value or why conversion rate is correlated with number of transactions.

## When to use a pie chart

#1 Use a **pie chart** when you want to show 100% composition of data. In other words the various pie slices you use, must add up to 100%. What that means, do not create a pie chart where the various pie slices do not represent parts of the whole pie. For example, following pie chart is not a good representation of data composition as the two pie slices add up to 82% and not 100%:

#2 Use a pie chart to show composition of data only when you have got one data series and less than five categories to plot. For example following pie chart shows the breakdown of a website traffic sources in the last one month:

Here I have got only four categories (search traffic, referral traffic, direct traffic and campaigns) to plot. So pie chart is ideal to show the breakdown. However if there were more than four categories to plot, like eight or ten categories, then the pie chart would have become cluttered and hard to read. For example following pie chart looks cluttered because it has got too many categories:

#3 Use a pie chart to show data composition only when the pie slices are of comparable sizes. In other words, do not use a pie chart if the size of one pie slice completely dwarf the size of the other pie slice(s):

#4 Order your pie slices in such a way that as you look clockwise from top to bottom, the biggest pie slice comes first followed by second biggest pie slice and so on. This makes the pie chart easy to read:

These pie charts are made from following data:

In order to create a pie chart where the biggest pie slice comes first followed by second biggest pie slice and so on, I have sorted the data in decreasing order (from largest to smallest).

## When to use a stacked area chart

Use a stacked area chart when you want to show the trend of composition and emphasise the magnitude of change over time. For example following stacked area chart shows the breakdown of website traffic:

## When to use a histogram

Use histogram to show frequency distribution for quantitative data:

**Note:** You would need to install the ‘**Analysis ToolPak**’ in order to create histogram in Excel.

## When to use a Venn diagram

Use a Venn diagram to show overlapping of data. The multi-channel conversion visualizer chart used in Google Analytics to visualize multi-channel attribution is actually a Venn diagram:

In the context of web analytics, we can use a Venn diagram to determine whether or not a website has got attribution problem. If there is little to no overlap between two or more marketing channels then the website doesn’t have attribution issues. If there is a good amount of overlap then the website has got attribution issues and you should seriously consider taking multi-channel attribution into account while analyzing and interpreting the performance of marketing campaigns.

To learn more about attribution modelling read this article:

Beginners Guide to Google Analytics Attribution Modelling

Another great use of Venn diagrams is in visualizing the back links overlaps between websites:

The tool that I have used to create this Venn diagram is known as Venny.

**Note**: You can create a Venn diagram in Excel. Check out this tutorial on Microsoft Office website: Create a Venn diagram

## Charts to avoid for reporting purposes

Throughout this article I have talked about the charts that should be used. But there are some charts which should be avoided for reporting purpose unless your target audience is as data savvy as you. Following are those charts:

### Treemap

### Waterfall chart

### Radar chart

### Bubble chart

The reason you should be avoiding reporting data via these charts to your clients is simple. The majority of people have no idea what you are trying to communicate via these charts. Use these charts only when your target audience is as data savvy as you.

## Best practices for designing column and line charts

### #1 Start the ‘Y’axis value at zero

When you do not start the ‘Y’ axis value of a chart at zero, the chart does not accurately reflect the size of the variables (in case of column charts) and trend (in case of line charts). For example, following column chart amplify changes because the ‘y’ axis value start at 440 instead of 0:

Following is the correct column chart:

Another example. Following line chart amplify the growth of Facebook fans because the ‘y’ axis value start at 2500 instead of 0:

Following is the correct line chart:

### #2 Do not use line chart (to create trends) if you have less than eight data points

When you create a line charts with few data points, the trend that you see can be very misleading. For example, following line chart just contain two data points and as a result it makes the growth look phenomenal:

For a line chart, the more data points the better.

### #3 Do not hide the scale on the ‘y’ axis of a column/line chart

When you hide the scale of the ‘y’ axis, your chart won’t accurately reflect the size of the variables (in case of column charts) and trend (in case of line charts). Without any scale on the y axis, there is no way of knowing where the y axis starts from. When you use such charts it create doubt on your analysis.

### #4 Add context to your chart

Different people analyze and interpret same chart differently. It all depends upon the **context** in which they analyze and interpret the chart. No matter what chart you select, some people will always find a way to misinterpret your chart. Therefore it is critical that you provide context with your chart in the form of written commentary and describe exactly the intent of your chart.

First present the context, then the insight and then the chart to support your insight. In this way you are giving clues to your chart reader regarding how to read your chart. For example:

You can learn more data reporting tips from this articles: **How to become Champion in Data Reporting**

## Announcement about my new books

**Maths and Stats for Web Analytics and Conversion Optimization**

This expert guide will teach you how to leverage the knowledge of maths and statistics in order to accurately interpret data and take actions, which can quickly improve the bottom-line of your online business.

**Master the Essentials of Email Marketing Analytics**

This book focuses solely on the ‘analytics’ that power your email marketing optimization program and will help you dramatically reduce your cost per acquisition and increase marketing ROI by tracking the performance of the various KPIs and metrics used for email marketing.

**Attribution Modelling in Google Analytics and Beyond**

Attribution modelling is the process of determining the most effective marketing channels for investment. This book has been written to help you implement attribution modelling. It will teach you how to leverage the knowledge of attribution modelling in order to allocate marketing budget and understand buying behaviour.