How To Analyse Goal Distribution In Excel (BEGINNER EXCEL DATA ANALYSIS!)


If you’re working with football data in Excel, or if you’re looking for a real-world data analysis task to build your Excel data analysis skills from beginner level, you’ll love our approach to analysing goal distributions in Excel, which is designed to help aspiring football traders and Excel data analysts alike.


EXCEL DOWNLOAD FILES HERE

00:00 INTRODUCTION
03:50 WHY AVERAGE IS … AVERAGE
In the video, I show you two datasets and pose the question – to what extent are the two sets of goal distributions similar, or different?  “That’s easy”, you might think, “All I have to do is find the average”.  And yes, the Excel =AVERAGE formula returns similar values, so these datasets are pretty similar right?

Not quite.  I argue the mean average (what the =AVERAGE formula calculates) is, well, pretty average in terms of the insight it provides.  It gets worse, because the mean average can provide ‘insight’ that’s somewhat misleading, because the measure is disproportionately affected by what statisticians call ‘outlier’ values.  Remember that game in the Covid-affected 2020 season when a team won 30-0?  Surely, we’d want to discount this kind of anomaly from the analysis.  This is the problem with =AVERAGE – it simply doesn’t do that – and beginner Excel data analysts need to bear this in mind.

So, what are the alternatives?

04:50 MEDIAN AND MODE
Statisticians talk about ‘measures of central tendency’ – formulae that give us a sense of where the midpoint of the data might be.  Mean average, for better or worse, is the most common.  I suggest you expand your data analysis toolkit by incorporating the median and mode too.  These formulae are easily accessible in Excel and provide different insight.

The median simply lines up all the values in the dataset and gives us the value of the one in the middle!  Simple, right?  It’s interesting because it’s not affected by outlier values, so it’s a great foil to =AVERAGE which, of course, gives us the mean.

Mode refers to the most-commonly occurring value in the dataset.  It gives us a sense of where the data ‘peaks’, and what the overall shape of the distribution might be.  That’s what we’re working towards – a literal picture of the data, more on that later.

As I explain in the video, as we apply the data analysis formulae in Excel, a more nuanced picture of the differences between the goal distributions emerges …

08:00 STANDARD DEVIATION
A beginner Excel data analysis of numerical data should cover two concepts: central tendency and dispersion.  We’ve covered the first one above, so what’s the second all about?  Dispersion refers to how ‘spread out’ the data is.  Is it tightly grouped around a midpoint?  Or, is it more dispersed across a range?

The best way to measure dispersion in Excel is to use standard deviation.  Let’s provide a quick working definition here, so you can understand the concept.  Read carefully!  Standard deviation works by summing up the differences between each data point and the mean average of the dataset.  You could think of one such difference as a ‘deviation’.  The standard deviation, therefore, is the average of the deviations, and a higher standard deviation implies a greater level of dispersion.  There’s some statistical adjustments in the calculations too, but let’s not worry about those for now!  Focus on the basic concept if you can.

I apply the =STDEV formula in the video to reveal more insights about the datasets.  Now, to complete the picture …

10:45 A VISUALISATION
Nothing tells the story of data like a visualization.  In Excel beginner data analysis, that means one thing: charts!  The key to a powerful Excel chart?  Properly-prepared data.  In the video, I demonstrate how to apply =COUNTIF quickly to create a complete summary analysis of the data.  With well-organised summary analysis, chart creation is relatively routine, though the column chart requires a bit of tweaking to knock it into shape, as usual!

I love the visual impact of charts in Excel, and this column chart completes the story.  The two datasets present different features in terms of dispersion, skew, and overall distribution.  But we wouldn’t have known that from the mean average …

15:20 LEARN MORE ABOUT EXCEL (VBA) FOR FOOTBALL TRADERS

If you’ve enjoyed this video and you are working with football data in Excel, or you’re just looking for a set of ‘real-world’ Excel projects to boost your data analysis skills, you’ll love our Excel (VBA) for Football Traders community.  It opens for new members only a couple of times a year, and we’re currently open!  Don’t miss your chance to get involved …