Understand an Excel File in 3 Steps
We have all been there. Our boss hands us a ‘tricky’ spreadsheet file, and tasks us with ‘understanding’ it. Sounds simple enough, until you open it to find it contains no fewer than 34 sheets, has no consistency between sheets, and no sense of a beginning and an end!
How do you even get started?
Here is the three step process I work through to understand a complex Excel spreadsheet.
1. Get an Overview
It can be tempting to get stuck into the detail, but to do so can compound your frustration – and you are probably already frustrated enough. Rather, try to get an overview of the file first. Working from out to in like this allows you to understand the role of each of its parts.
How do we get an ‘overview’? The biggest building block in an Excel file is a sheet, so let’s first try to ascertain how many sheets are in the file. That means a quick glance at the sheets tabs, right? Yes, but there may be more sheets in the file that are ‘hidden’ or, in particularly tricky cases ‘very hidden’!
First, right-click on one of the sheet tabs at the bottom of the screen, and click ‘unhide’. If unhide is shaded out then great, there are no hidden sheets. If you can click unhide, you will next be presented with a list of hidden sheets in the file. You can click and unhide the sheets individually; but, this is onerous if there are more than a few, so what are the alternatives?
Keyboard shortcuts can help. On a Windows PC, you can use the Alt + H + O + U + H shortcut (hit the keys sequentially, not simultaneously) to speed up the process. Or, if you are comfortable with VBA, you can use a macro to unhide all sheets in the file at the click of a button! The code is included in the download file (link above) – copy / paste into the VBA editor for your file and click the play button to run the routine. Cool!
Real spreadsheet fiends may go to great lengths to hide a sheet. It is good to know that the .visible property of an Excel worksheet can be set to ‘very hidden’, which means that you can’t unhide it in Excel. But, the sheet does appear in the project explorer in the VBA editor, which can confuse! (It has confused me more than once). All sheets will be unhidden if you run the above macro, including the ‘very hidden’ ones. Phew! And people out there who are making sheets ‘very hidden’, we would really rather you didn’t 🙂
All this clicking around the spreadsheet and VBA editor can seem time-consuming. But, it gives us what we absolutely need to de-stress the process – an overview of the whole file, and a sense that it cannot get any worse, because everything is now viewable. Onwards!
At this point it is important to check if the file has external links. Click the Data tab and ‘edit links’. If there are links to other files, you will have to repeat this process with those files too. Sorry, I did not say this would be easy!
2. Click around and categorise
Click on a worksheet and start clicking through some of the cells to ‘get a feel’ for what is going on. This can create a lot of cognitive load. Try not to get stressed and simply take a deep breath if it gets confusing, you will work this out in the end!
The following simple idea helps me see through the formula fog:
Almost all cells in any spreadsheet fall into one of three categories:
Input cell
Process cell
Output cell (including pivot tables)
Input cell
Process cell
Output cell (including pivot tables)
Simply click through and begin to categorise the cells. What do the categories mean?
An input cell contains a value, as opposed to a formula. This means a number that has been inputted or copy / pasted to the cell. You can confirm this by double-clicking on the cell (or by navigating to the cell and hitting the F2 key on a windows PC); if no formula appears, this cell contains a value, and it is an input cell. An input cell has no precedents, it does not depend on any other cell. You could call it a ‘starting point’.
A process cell contains a formula. It supports the calculation chain or ‘process’ of the Excel file. You can identify a process cell by establishing if it contains a formula or not. If you can, make a mental note of the cell(s) on which the formula depends. Keyboard shortcuts will help here: on the Windows PC, F2 puts you in formulae editing mode, which allows you to see precedent cells nearby, and ALT + M + D will show cells dependent on this cell. If a dependency arrow points to another sheet, you can find out which sheet it points to by clicking on the base of the arrow.
An output cell, like a process cell, contains a formula. The critical difference with an output cell is that it has no dependent cells – it is the end of the calculation chain, the end of the spreadsheet, the end of the world! Ok perhaps not, but finding one of the these can be exciting because output cells represent an objective: It is the Excel file’s purpose to display them clearly. They should be prominent in a *well-designed* spreadsheet file; located in a dashboard, or at least at the top of a sheet. Bear in mind most spreadsheets are not well-designed, however… Nevertheless, output cells give us a clue about what the creator of the file was trying to achieve.
We can categorise pivot tables as output cells. How to spot a pivot table? The default pivot table formatting is usually the best identifier. You can confirm this by clicking on the cells and noticing if pivot table options appear. Also remember that charts usually point to output cells, since they communicate analysis that is the main focus of the file.
With the concepts of input, process and output you should be able to categorise most, if not all, of the cells in the spreadsheet file. This should give you an overall impression of what is going on. Focus on the general direction of the calculation, rather than the specifics, at this stage. Next!
3. Tabulate or Visualise
How to organise and retain these insights? With multiple sheets, and hundreds or thousands of dependencies, it is a lot to keep in your head. I recommend creating some kind of visualisation. The simplest option is a three column table with headers ‘input’, ‘process’ and ‘ouput’ under which you list the main sheets / cells that fall into each category.
I create structural diagrams of whole files with each sheet represented by a rectangle, with a short summary of the information on that sheet. You can colour code to distinguish input sheets from process and output sheets, assuming the Excel file is sufficiently well-organised! Or, depending on how the file is set up, you might distinguish different blocks within the same sheet. Whichever level you work on, creating a visualisation can really help the process; moreover, your boss, client or teacher is likely to love it! People tend to like talking spreadsheets if it does not involve looking at … a spreadsheet.
Unfortunately, there is a sting in the tail. By understanding the process cells, we have fully understood the calculation chain, right? Not quite. Cell interactions are often communicated through formulae, but, as any subscriber to the Tiger channel knows, they can also be programmed in VBA! If macros are creating interactions between cells, these interactions are not discernable in the spreadsheet. There is no substitute for opening up the VBA editor and interrogating the code. Once again, I told you this would not be easy!
How to understand what the code is doing? This should be familiar territory for those who have worked through Tiger’s videos: the F8 key on a Windows PC allows you to ‘step into’ the code and understand what is going on. Don’t forget you can search the VBA editor (CTRL + F on a Windows PC) to find important references such as output cells.
Bear in mind that code can be organised into modules, but might also be assigned to worksheets in the VBA editor – if a routine is triggered on a worksheet change, for example. Look out for custom VBA functions; unlike routines, that execute without an apparent trigger mechanism. They can be particularly fiendish and difficult to understand. But, at least you now know to look out for them!
So that’s it: my three-step guide to understanding any Excel file.
Is it a straightforward process? 99% of the time, it is not. It takes serious mental engagement to work out a complicated file, which is why a lot of people tend to give up. The above road map should keep you on track and help you navigate the main features, and retain your sanity. It won’t be easy but, once you have mastered the file, you can embark on a redesign to make it easy for the next person (!)
Good luck!
Have you tried this approach? Did it work? Did you generate any new insights? Leave a comment below the video on the YouTube channel.