Friday, July 9, 2010

Toolkit: Pivot Tables in Microsoft Excel

The applications in the Microsoft Office suite (including Word, Excel, and PowerPoint) are extremely powerful tools.  The majority of users touch on a fraction of the capabilities of these tools, either because (1) their work does not require it or (2) because they do not know certain capabilities exist.  The first reason is understandable; no need to use or learn the powerful equation editor if you have no need for it.  The second reason — not being familiar with key features — is unfortunate.  If users would enhance their knowledge of these tools beyond the basics they would be much more productive and produce better results.

I will have a series of postings on a few of the powerful features of Microsoft Office that everyone should learn once they get past the basics of these applications.  They are part of my consultant’s toolkit because they both add value and save the client time and money.

The capabilities I will share are not “rarely applicable” or designed to “show-off your skills”.  They are frequently useful, meaningful, and beneficial.  I list a few here so you get an idea of the features I am referring to:
  • Pivot tables in Excel
  • Style sheets in Word
  • Tables in PowerPoint
If you are not familiar with these capabilities or do not use them, I encourage you to research them, learn more about them, and apply them.

This posting introduces you to pivot tables in Microsoft Excel by:
  • Providing a simple real-life example of a pivot table and its benefits
  • Presenting an overview of the key concepts of pivot tables
  • Sharing several real-life examples to give you additional ideas on where to apply it in your work
  • Sharing some practical tips on using pivot tables
  • Listing additional resources to learn more about pivot tables.
I focus on the concepts and the big picture of pivot tables in this posting.  I do not provide step-by-step instructions.  Microsoft Excel has changed both the location of where to create/manipulate pivot tables and the look and feel in their releases over the last several years.  Please refer to the help within Microsoft Excel to learn how to specifically perform the pivot table operations in your version of Excel.  The concepts and examples apply to all versions.

Note: The pivot tables depicted in this posting are based on real-life uses.  The data values (e.g., numbers, names, amounts) are fictitious, however, to remove any proprietary data.

SIDEBAR - AN EXAMPLE OF WHY TO LEARN NEW FEATURES

A client sent me an Excel file with a long spreadsheet filled with sales data.  He said he spent a full day each month formatting the information.  Specifically, he spent time inserting new rows in the spreadsheet to put in column headings each time a page break occurred.

I asked him if the “Rows to repeat at top” feature in Page Setup would work for him.  He was not familiar with it.  I showed him the feature and the few clicks needed to use it.  If someone had not showed him this built-in capability of Excel, he would still be doing this work at considerable expense (eight hours versus one minute).  This is independent of whether the formatting task itself added value to the company and could have been eliminated altogether.

I asked him what else he did with the sales data to see if I could help him further.  He mentioned he spent a couple of days each month computing sales by salesman, sales by customer, sales by customer grouped by salesman, and a few other views on the data.  I asked him if he was familiar with pivot tables.  He was not.  Using an export from the sales system I showed him with pivot tables how he could create his reports each month in a couple of hours and with less likelihood of error.

Bottomline:  Three days each month was reduced to a less than half a day by learning and leveraging just two of the features of the tool he used.

REAL-LIFE EXAMPLE OF A PIVOT TABLE AND ITS BENEFITS

One typical activity of a project manager is to track expenses on a project.  For large projects, a project management tool or an accounting system may be used to track these expenses.  For smaller projects or when you want more control over what and when you track expenses, a spreadsheet may be a good choice.

Here is a small portion of a spreadsheet I have used track expenses for multiple programs with multiple projects.

(Click image to enlarge)
Each row in the spreadsheet is an expense.  Each expense notes which program and project it is associated with, the month incurred, the company and person or asset billing the expense, the expense type, the funding type and who will fund it, and the expense in dollars.

A project manager with this information can now ask and receive answers to multiple questions:
  • How much have we spent to date? By month? What is our trend?
  • How much have we spent on each project? On each program which has multiple projects?
  • How much have we spent on consultants? On software?
  • How much have we spent on each phase?
  • Hold it…how much have we spent each month by phase and by project?
Answering these questions (and more) is a mouse drag or mouse click away using a pivot table.

How much spent to date on all expenses tracked? One mouse drag gives the answer.
(Click image to enlarge)
What about by month? One additional mouse drag gives the answer.
(Click image to enlarge)
On the each project? Two drags displays the expenses broken down by project within program.
(Click image to enlarge)
Let’s jump to the last question…how much have we spent each month by phase by program?
(Click image to enlarge)
Several benefits of pivot tables can be seen by this example.

First, pivot tables can answer many questions important to people who use data to make decisions.  By viewing the data in multiple ways the user can grasp a large amount of detailed data more easily, spot potential issues or trends, or find out more about an existing issue.  Pivot tables allow this viewing of the same data in multiple ways.

Second, it is easy to change the way data is viewed.  Once the data is set up in a spreadsheet and a pivot table structure has been created (fairly straightforward using Microsoft Excel’s built-in PivotTable Wizard), it literally is a mouse drag or mouse click to change the view of the data.

Third, the computations in pivot tables are done automatically.  When you change the view, the calculations are handled for you. You do not need to write formulas.

Alternatives to pivot tables include writing formulas, using the subtotal feature of Excel, using filters on the data, or writing macros / purchasing add-ons to help redisplay or analyze the data.  There may be cases where these alternatives are more appropriate or provide a view pivot tables do not support (there are some limitations to pivot tables.)  But these alternatives tend to be more error-prone.

KEY CONCEPTS OF PIVOT TABLES

Three key concepts are the data table, the pivot table, and crosstabs.

Data Table

A data table stores the data to be viewed, analyzed, and summarized.  The top row of the data table contains the column headings.  The rows below the top row contain the data with each row representing one record of data.  Each cell on a data row contains a data value for that record.
(Click image to enlarge)
Each row in a data table needs to contain all the data for that record.  This may mean that some data is duplicated on multiple rows.  For example, if you want to be able to view data (in the earlier example) by project and by program you need to fill in the data value "TRW Program" in the program field on each row of the “Turtle Project” even though it the “Turtle Project” is always associated with the “TRW Program”.

The key design aspect of your data table is to determine how you may want to view your data.  By month?  By day?  By region?  By city?  By product?  By product grouping?  By ship date?   By month? A column needs to be set up and the data needs to be stored in the table for each dimension you would like to view.  Each piece of numerical data you want to compute with (sum, average, minimum, maximum, etc.) should also be a column, such as units, sales, and costs.

Pivot Table

The pivot table is where you view the data from the data table.  The pivot table and the data table are two separate tables.  The pivot table points to the data table behind the scenes.

The pivot table is also where you change the view so you can see the data in different ways ("slice and dice the data" or pivot it).  You do this by moving the field names around, filtering which values you want to see (for example, only show the "Turtle Project"), and setting field values for advanced computations (such as sum, count, variance, running totals, percent of).   Changing the view does not impact any of your data in the data table.  For example, removing a field from your pivot table does not change the underlying data table.

Crosstab

A cross-tabulation (or crosstab) is the intersection ("crossing") of two or more dimensions so that the intersection points represent the computations (sum, count, etc.) for each combination of dimensional values.  Below is a subset of a data table.
(Click image to enlarge)

There are two dimensions in the data table above:
  • Gender, with possible values of Male and Female
  • Handedness, with possible values of Left and Right
A cross-tabulation on these two dimensions can compute how many Males are Lefthanded, Females are Lefthanded, Males are Righthanded, and Females are Righthanded. Below is the crosstab with counts from 732 data rows.
(Click image to enlarge)
Excel can also display the results of the crosstab in percentages.
(Click image to enlarge)
The pivot tables in Excel are quite powerful.  It can perform these crosstabs on more than two dimensions at the same time.  If we had another column heading of “Eye Color”, we could compute the number of Blue-eyed Female Lefthanded people compared to the other combinations of dimensional values.

ADDITIONAL REAL-LIFE EXAMPLES

I have created scores of pivot tables over the years.  I find value-added uses for them in many situations.  Here are a few more examples to give you ideas.

I created this pivot table to track the status of over 200 individual projects each month while managing a large program.
(Click image to enlarge)
Microsoft Excel also has the capability to create pivot charts so you can view the data visually.
(Click image to enlarge)
How about this?  Leverage a tool like SurveyMonkey to capture survey information online through a web browser.  I used it to capture over 200 survey responses to a company event.  SurveyMonkey will allow you to export the data in a format perfect for a data table.
(Click image to enlarge)
Then create several pivots on the same page with the results.
(Click image to enlarge)
TIPS ON USING PIVOT TABLES

Start simple and try it out.  Build a simple data table with the information you think you need, put in column headings and a few sample data rows of mock data, build a pivot table using the PivotTable Wizard, and try it out.  Does it answer the questions you have?  Go back and make adjustments to your data table until you feel you have the right model.  Once you get comfortable with the pivot table concepts and its capabilities, designing data tables properly will become natural.

Include computed fields in your data table, if appropriate.  In the first real-life example, the Total Cost field is computed by multiplying the Units field by the Cost per Unit.  Color this column (to remind you it is computed with a formula and not entered directly) and place at the far right of the data table.  As you add rows to your data table, you can then copy the formulas to the new rows.

You can even use computed fields based on look-up values.  For example, if the “Turtle Project” will always be associated with the “TRW Program”, you could make the Program field a computed value with a VLOOKUP in a Project-Program cross-reference table.  Just be careful.  Make sure that the association will always be a true statement.  For example, a project manager may be leading the “Turtle Project” today but could be replaced by a new project manager later in the project.  If you want to view the data by project manager, you should not compute this field but manually fill it in each time for each record.

A pivot table is based on a range of rows in your data table.  If you insert rows outside of this range, your pivot table will not know to use this data.  To prevent this from happening, clearly mark the last row in the range of your data table.  When you insert rows above and inside the range, the pivot table will automatically adjust and include the new rows.  Here is an example:
(Click image to enlarge)
You can create multiple pivot tables in the same spreadsheet against the same data table.  Put one pivot table on a tab labeled “Playground”.  This can be used to slice and dice anyway you want -- it is your playground to play around with the views.  Put on separate worksheets pivot tables with the specific view and format that you always want to see whenever the data table is updated.  Multiple pivot tables allow you to have your "standard" pivot reports always available and a version of the pivot table you can always change.

While pivot tables are less prone to errors than other approaches, there are still some gotchas you should always check.
  • First, refresh the values in the pivot table by pressing the exclamation point refresh icon.  This makes sure the table includes any changes to the data table since the last time the pivot table was refreshed.
  • Second, make sure the pivot table is referring to all of the data rows in the data table.  Check that the range for the pivot table includes all of the rows you want in your results.
  • Third, make sure there are no filters within the pivot table omitting some values from the results, unless that is what you wanted.  Pivot table functionality allows you hide and filter certain dimensional values.
  • Fourth, make sure the calculation on the crosstab is the calculation you want.  A common mistake is for the computation to be a Sum when you want a Count or vice versa.
  • Finally, as with the use of any spreadsheet, spot-check the answers to make sure they make sense and are not out of line of what you expect.

ADDITIONAL RESOURCES TO FURTHER YOUR KNOWLEDGE

Here are some places to find more information about pivot tables:
  • The Help feature in Microsoft Excel. While it is not as extensive as it should be, it should be your first stop.  More recent versions of Excel include "courses" to step you through simple examples and how-to's within your particular version of Excel.
  • The Microsoft website with support for Excel.
  • Any of the free Pivot Table tutorials online.  Do a search on "Excel pivot table tutorial".  I found this example on a website called ProductivityPortfolio. This tutorial provides examples in Excel 2007.

Happy pivoting!

No comments:

Post a Comment

Toolkit: Pivot Tables in Microsoft Excel ~ DANIEL SKLAR