Why is it so difficult to get the data analysis you need in Excel? Believe me, I’ve been there many times, particularly early in my Excel career. Excel beginners can really struggle with data analysis.
I’m not a computer programmer or a statistician but, over ten years of creating and teaching Excel data analysis, I’ve worked out how to do it. In this video, I show you my five-step process that covers most types of data analysis in Excel.
CLICK HERE TO WATCH THE VIDEO It’s the process that allows me to approach data intensive tasks with confidence, safe in the knowledge I’ll be able to get the answers I need, perhaps with a cool Excel chart or two!
1) Objectives Do you have a clear idea in your own head of the analysis you require? Too often, people muddle through Excel data analysis tasks without a clear idea of where they are going. The bad news – Excel is not telepathic and cannot work out what is required for you! (Though it may suggest a chart 😊)
The inevitable result: frustration and confidence diminished still further …
First, think: what measures are required? Do you want to count, average or sum up data? Literally sketch out what your dream analysis looks like – you’ve actually taken a big step towards it … without even touching Excel.
2) Location Where is that data? Please don’t tell me it is fragmented across multiple sheets, or even files? Oh – it is?!
Don’t worry, this actually happens all the time, but you must commit to keeping data in one place from now on. Yes, there is a temptation to break up datasets when they get big. Excel is actually comfortable with datasets of up to 100 000 rows or more. I work on files with up to 500 000 rows of data. It’s what Excel is designed for.
3) Cleanliness How ‘clean’ is your data? First how would we define ‘clean’? Clean data is predictable: it’s one of a limited number of possible entries, or numerical data is in a pre-determined range and a known format. There are no surprises.
What – yours is not like that? Perhaps you have spelling mistakes, blank cells and that Excel classic – an ‘impossible to detect’ space after a text entry. Even Jack Bauer can’t find those.
A number of formulae, techniques and facilities in Excel can help us ‘clean up’ data.
Firstly, check out the new =UNIQUE formula that quickly returns unique entries in a column. Haven’t got =UNIQUE on your system? Time to update your version of Excel so you have this new(ish) formula. Good old ‘remove duplicates’ in the data tab may also help. And don’t forget the =TRIM formula for eradicating those spaces after entries.
More complex cases may require Power Query or even VBA to resolve …
4) Type First question: is the data numerical or text-based? Let’s take numerical data first.
The next question: is this numerical data discrete or continuous? Discrete data is one of a limited number of values. The example in the video shows a ‘salary band’ column which contains one of four possible values – salary band 1, 2, 3, or 4. So, although the data is numerical, it’s actually ‘discrete’ because the number of possible values is limited. Got it? Good!
So, what is ‘continuous’ data? This type of data is not limited to a small number of possible values. The salary figure in the video is instructive since it could take any possible value: 32.456. 23.133, 100.001 – whatever it might be! I hope you get the idea. So, it requires a different technique to analyse it.
What about text data? Text data is best described as discrete since, usually, the range of possible values is limited. Let’s take the ‘location’ column in the example, for which there are four possible values – North, South, East and West. A limited number of values = discrete data.
5) Technique If you’ve worked through the steps, you’re now ready to unleash Excel’s awesome data analysis formulae.
So, which formula for which data type? Continuous numerical data is probably the most common, and is best analysed using the concept of ‘categories’ or ‘bins’. My tool of choice? Excel’s super powerful =FREQUENCY formula.
Discrete data is best analysed using COUNTIF. This formula is a workhorse in my Excel analysis. The idea is simple – point it to a range and a value; it tells you how many times the value appears in the range. Need to count the number of times each salary band occurs? COUNTIF loves that type of job!
For more complex questions, check out the formulae I list at the end of the video – I will put links to dedicated videos in the pinned comment at the bottom of the YouTube video too.
So, next time you’re doing data analysis in Excel, simply remember and work through the five steps. Let me know how you get on in the YouTube comments …