Breaking rank—differentiating between values that are equal

Challenge: On a dashboard report, show the top five values from a table of data showing the number of respondents. And against these values, provide other data from the same records.

At first glance, the above challenge seems rather simple. The LARGE function brings back the nth largest value in a range, and VLOOKUP can then be used to bring back the associated data from that record.

But because the number of respondents is a discrete number (i.e. it can only take integer values), then there may be a tie. And VLOOKUP doesn't like ties.

The way around this is to add a small, static random number to each of the numbers of respondents in the source table. And to use this as the basis for looking up.

Let's assume the number of respondents start in cell A2. Create two new columns to its right. In cell B2, enter the following formula:

=RAND()*0.0001

The RAND function creates a random number between 0 and 1. The multiplication by 0.0001 makes that number tiny, but—and importantly—positive. Also, given the level of accuracy to which random numbers are stored in Excel, they're most likely unique.

Copy over the formulae in column B with values. This will hard-code the random numbers and avoid them recalculating and jumping around.

Now in cell C2, type the following formula:

=A2+B2

By adding the small random number to the original number of respondents, you've created a number negligibly higher than the number of respondents. And by formatting that new column as an integer with no decimal places, it will appear identical to column A. But importantly, all of the numbers in column C are unique.

Now in the dashboard, you can use the LARGE function against column C to bring back the five largest values. And when you use VLOOKUP against that same column, it will be able to differentiate between the rows and bring back unique values.

Hiding a cell’s contents

If you want to hide the contents of a cell, instead of matching its font colour with its background, give it a custom number format of ";;;" (without the quotes). It's more elegant and won't suddenly appear if the background colour changes.

Charting shortcut

If you select a range of data and hit F11, Excel will create a default chart on a new sheet based on that data. It's often useful as a throw-away action just to visualise how your data trends. Or you can use it as the basis for developing the chart further.

Formatting shortcut

CTRL+1 is a very useful shortcut to bring up the Format Cells dialog box.

Creating conditional dropdowns

Challenge: Create two dropdowns, the second one only showing values relevant to what has been selected in the first one.

This question has come up a lot in my Excel career. A good example is where you have two-tiered data. For example, you may want a user to select a continent and then select a country within that continent. Instead of presenting the user with the full list of countries across the globe, wouldn't it be handy to only show the countries in the selected continent? Here's how.

First, create an area in which the countries within each continent are listed. The title of each column should be the continent. Unfortunately, these column titles cannot contain spaces, so you'll have to use underscores if you need to separate words.

And immediately beneath each continent name, type the countries that make up that continent, using as many cells as necessary.

Now you'll need to create a few named ranges. Create one named range for each of the sets of countries (excluding the title). And lastly, create a named range called "Continents" made up of just the column headings.

In Excel 2007, this can be done throughout the Create From Selection option in the Defined Names area of the Formulas ribbon. But if your sub-options vary in number (as would be the case with countries within continents), it's best to create the ranges manually—otherwise you’ll have empty cells in some of the named ranges and these will carry over into the dropdowns, which isn't pretty.

So by now you should have lists of countries for Asia, Europe, North America etc., and named ranges for each set of countries (called Asia, Europe, North_America etc.). And you should have another named range called Continents containing a list of the values Asia, Europe, North America etc.

That’s the prep. work done. Now you need to create the data entry area. For this, you simply need two cells: one in which people will select the continent, and one in which they will select the country. Let’s assume these are A1 and B1 respectively.

In A1, hit Data Validation, select List and type "=Continents" (without the quotes) in the Source field. This will allow the user to select a continent from the list.

Now for the clever bit. Ready?

In cell B1, use Data Validation again, choose List and put "=INDIRECT(A1)" as the Source. The indirect function merely takes the value of A1 and uses it as text that can be in turn interpreted by Excel. Given that A1 can only contain the name of a continent and each continent has its own named range, the values allowed in B1 are defined by the named range associated with the continent selected in A1. And the two cells can be copied down the columns to make as many contextual dropdowns as you like.

Formatting percentages

Time and again, I receive spreadsheets from people multiplying numbers by 100 to show them as percentages. Here's a brief overview of how percentages work, both generally and specifically in Excel.

A percentage is merely an expression of a ratio. Thirty percent of the class was wearing a tie. This means that of the 30 people in the class, nine were wearing a tie. Nine out of 30 = 0.3

Year-on-year profits increased by 120%. This means that if the profit was £2.0m in 2010, it increased to £4.4m in 2011. 4.4/2.0 = 1.2.

But percentages are always displayed multipled by 100. So the 0.3 above is written as 30%. And the 1.2 is written as 120%.

In Excel, once you've calculated your ratios, you should never multiply them by 100 to display them correctly. Instead, there is a handy percentage format built in. Simply click the % button in the Home ribbon, and your 0.3 will show as 30%. And the number of decimal places showing can be increased or decreased using the appropriate buttons, or in the Formats dialog box.

Multiplying your percentages by 100 creates inefficient formulae, and can cause all sorts of issues when you start using them downstream in further formulae.

Usually, if you see a 100 in a formula, it doesn't need to be there.

Correct syntax for formulae

I think way back when, some highly-read source must have mistakenly advised people that to create a basic arithmetic formula, you needed the SUM, irrespective of the calculation being performed.

So I often see formulae such as:

=SUM(C7-C6)

If you're writing formulae to be efficient and easily interpreted by other people, the better way to write this formula is:

=C7-C6

Best to steer clear of the SUM function unless you're summing more than two things.

1900 was not a leap year, despite what Excel might have you think

If you enter 29 February 1900 into Excel, it will recognise it as a date and format it such. But if you can find a calendar dating back that far, you'll discover that 1900 was not a leap year.

This bug was purposely introduced by Excel to keep it in step with other systems that had been created before Excel's inception. In all but very exceptional circumstances, no one will notice.

Killer Sudoku Pro: spreadsheet helper

Challenge: There is a variant of Sudoku called Killer Sudoku in which areas of the grid, either square, rectangular or some other irregular shape, are caged off. A small number in the corner indicates the value to which the numbers therein must sum (or multiply, or subtract, or divide). Produce a spreadsheet that identifies all possible combinations for three-box cages.

Here's the problem in a bit more depth. There is an indicator in the corner to confirm which of the four mathematical operators to use on the cage. The challenge was to take the leg-work out of figuring out the possible combinations.

The division operator is trivial, so didn't need to be included in the challenge. It only operates on two-box cells, and all combinations can easily be accommodated mentally. So we were left with multiplication, addition and subtraction.

The logic is on the Workings sheet of the attached spreadsheet..

Columns M, N and O contain all 165 three-way combinations of the numbers 1 through 9. Columns A through D deal with the subtraction option; E through H with multiplication; and I through L with addition.

For each row, working right to left, the result is simply the result of the calculation. It's easy for multiplication (PRODUCT) and addition (SUM).

For subtraction it's a bit more tricky. The subtraction rule dictates that you subtract the two smaller numbers from the biggest number. But beautifully, this is the same as =2*MAX(range)-SUM(range). By taking twice the maximum number and subtracting the sum of the three, it's the same as taking the maximum number and subtracting the two smaller numbers.

On the "Answer" sheet, you simply enter the number from the Sudoku puzzle (the one that you're trying to evaluate against) and select which type of cage it is: multiply, sum or minus.

Back on the "Workings" sheet, the "Match?" column simply compares the answer to the value you're evaluating against. It returns a 1 for a match, and a 0 for a non-match.

The sequence column then adds the previous column to give the sequence number of the entry. And the "Unique sequence" column merely strips out the duplicates from the previous column, tagging just the matching rows with the relevant sequence number.

Cell L2 on the "Answers" sheet contains a formula detailing how many exact matches there are. By using a format of ";;;", the cell contains the value but will never display it.

Column E contains the unique reference of each successful match, using L2 as the basis for determining when to stop. Columns F, G and H simply use a VLOOKUP to bring back the relevant values from the "Workings" sheet.

Excel is like life

Life is complex. Problems are complex. And it’s rare that an intricate problem can be solved by a simple solution. Or indeed that a single solution is the only one. Excel is a good analogy here.

It has rows and columns and it was invented to organise data. Simple, huh?

Yet in its 2007 version, it comes with eight standard ribbons. (Further ribbons present themselves in specific situations.) The Home ribbon alone has 42 separate items within it. Twenty of these have dropdowns from which further options can be selected. A very conservative estimate would be that an average of five sub-options are available for each of these. If the other seven ribbons are similar, then that’s 976 options, and that doesn’t account for the plethora of formulae that can be written in each of the cells, and the canvas of colours available. (As a slight aside, that would allow 16 trillion possible actions across the cells of a single worksheet.)

Admittedly, Excel was not designed to solve a single problem. But the arsenal of tools available merely highlights the huge variety of ways in which problems can be addressed. If you give ten people the same problem and ask them to solve it in Excel, each one will address it in a different way, sometimes subtly different, sometimes wildly different. If you don’t tell them which tool to use, then your range of solutions will widen further.