Wizard of Excel

The ROUND function works both sides of the decimal place

Posted on May 25, 2011 by Dan

Most people who have used the ROUND function will have used it to the right of the decimal place.

=ROUND(A1,2)

This will round cell A1 to two decimal places. So 3.1415 will become 3.14. And 2.718 will become 2.72.

But if you can also supply a negative value as the second argument of the function.

=ROUND(A1,-3)

This will round cell A1 to the nearest thousand (i.e. three positions left of the decimal place). So 1,234 will be rounded to 1,000. And 599,999 will be rounded to 600,000.

The same is true for both the ROUNDDOWN and ROUNDUP functions.

It can be quite useful in certain circumstances where superfluous accuracy is not required.

Text to columns: saving your file layout

Posted on May 23, 2011 by Dan

The Text to columns function in Excel is useful. If you have a bunch of data in a single column, it can be used to split it into separate columns. Found on the Data ribbon, you can either specify delimiters (for example, you might specify a space as a delimiter to split a UK postcode into its two halves), or you can click on the positions at which the data should be split based on each field being a fixed width.

The trouble with the fixed-width method is that Excel has no way of saving the file layout for later use. So if you have a need to repeatedly split some data in exactly the same positions, you have to click each column splitting point each time you do it. Very frustrating.

I have a need to do this quite regularly. I receive text files on a regular basis, all sharing the same fixed-width layout. To split them into their constituent fields, using the Text to columns function over and over would drive me crazy and waste an inordinate amount of time.

Instead, I've created a shell spreadsheet containing two sheets:

Raw: into which I can paste the raw data, all in column A, starting in row 4
Split: the split data, starting in column A

At the top of the Split sheet, I have the column titles (row 3), field lengths (row 2) and field start positions (row 1). The field lengths came straight from the file layout; the start positions are calculated, using the following trivial formula in B1:

=A1+A2

Each cell in the top data row (row 4) of the Split sheet contains a formula to bring back that column's data for the corresponding row. For example cell A4 contains the following formula:

=MID(Raw!A4,A$1,A$2)

The dollars before the 1 and the 2 ensure that when you copy the formula down, it still refers to the start position and field length. Leaving the column free to roam means that it always looks up against the relevant column.

Now, instead of having to define the file layout each time I have to analyse an incoming file, I copy the text file into the Raw sheet, flick across to the Split sheet and copy that top row of formulae down to match the corresponding number of rows in the Raw sheet.

It's memory-intensive. But Excel copes just fine with over 20,000 rows of data and 100 columns. Which is all I really need it for.

Conditional minimums

Posted on May 18, 2011 by Dan

A friend asked me the other day how to return the lowest positive value from a row of data in Excel. Given that there’s no MINIF function, at first I struggled.

Then I re-phrased the question: how do you return the Nth lowest value in a range? Ah, we can do that with the little-known SMALL function. And can we find N? Sure we can, using the COUNTIF function.

So, the following formula gets you what you want:

=SMALL(A1:Z1,COUNTIF(A1:Z1,"<=0")+1)

In English, this reads: From the range A1:Z1, count the number of non-positive values, add 1 (the result being N). Now return the Nth lowest value.

A cute Excel conditional formatting quirk

Posted on May 18, 2011 by Dan

Let's say I conditionally format cell A10 based on the contents of cell A1 using the formula bit of the conditional formatting feature and leaving A1 undollared. I then copy A10 in its entirety to A9. Because Excel can’t base its formatting on A0 (as such a cell doesn’t exist), it instead bases it on A65536.

Quite cute really.

SUMPRODUCTIF

Posted on May 9, 2011 by Dan

I recently found out how to do the equivalent of a SUMPRODUCTIF, a formula that doesn't exist in Excel. Here’s the detail on how.

First of all, some context.

SUM does exactly what it says on the tin. It sums a range of cells.

SUMPRODUCT is a lesser-known function, and sums a set of products. (a1*b1*c1) +(a2*b2*c2) + [...] + (an*bn*cn)

SUMIF sums a range of cells if a certain condition is true. For example, sum people’s salaries in a column if they are based in London.

I wanted a way of putting a condition on the SUMPRODUCT function. That is, I wanted to do a sum of the products only where a condition for those rows held true.

To make the formulae more manageable, let’s assume I’ve named a few ranges:

conditions is the column of data that I want to validate the condition against
range1 and range2 are the two ranges that I want to do the conditional SUMPRODUCT on.

And let’s assume I only want to do this if the values in the conditions range are "London".

The formula would read:

=SUMPRODUCT(--(conditions="London"),range1,range2)

Basically, the first term acts as a range in its own right, taking the value of 1 (for London) and 0 (for anything else). This means that it’s not actually doing a conditional SUMPRODUCT, but instead it’s multiplying the entries for which the condition fails by zero, which has the same effect. The double minus at the beginning is to ensure that the first argument is read as a formula.

Variable vlookup offsets

Posted on May 9, 2011 by Dan

Challenge: I want to future-proof the column number returned in my VLOOKUP formula.

VLOOKUP is a hugely useful formula in Excel. Apart from the basic mathematical operators and the IF statement, there’s nothing that touches it in terms of how often I delve for it in trying to solve a problem.

But it’s a very unstable formula. Here’s its syntax.

VLOOKUP(needle,haystack,column number to return,exact match?)

You look for a number in a haystack of data, and return the nth column across from that haystack.

The issue comes when the range that you’re looking up in is liable to change in structure and size.

The needle is unlikely to ever change, so that’s not a problem. Meanwhile, the haystack is likely to be a fixed range, and if it’s not, it can be dealt with by using a names range. The column number to return is the weak link. If you innocently insert or delete a column from the haystack, it can make your VLOOKUPs come tumbling down.

In the main, I’ve always hard-coded the column reference, not thinking to do things any differently. But recently, I found a way of making it dynamic.

At the head of the column in which you’re looking up, likely in a hidden row, store a value that represents the column number that the column represents in the range. So if your data runs from row 3 to row 100, with column titles in row 2, and the column in which you’re looking up is column A, put the following formula in, say, D1:

=COLUMN(D3)-COLUMN($A$3)+1

The COLUMN function simply returns the column number of the cell being referenced.

In this example, the formula will return the value 4.

Copy this formula across the relevant cells in row 1, and the row can then be hidden.

And in the lookup itself, if you want to bring back column D, use the following formula:

=VLOOKUP(needle,$A$3:$Z$100,D$1,0)

By doing this, if a column is inserted somewhere between columns A and D, the data that was in column D will move to column E, but its header number will increase from 4 to 5. And importantly, your VLOOKUP won’t break.

Dynamic named ranges

Posted on May 9, 2011 by Dan

Huge power can result from combining the OFFSET function with Named Ranges in Excel. Here's how.

OFFSET itself is phenomenally powerful. It allows you to refer to a range of a chosen height and width that is a chosen number of rows and columns away from another range. Its syntax (or one of two possible syntaxes) is:

=OFFSET(range, rows, columns, height, width)

Range is the original range from which you're offsetting, which only actually needs to be a single-celled range. Rows and columns are the respective offsets (measured in cells), while height and width define the height and width of the new range.

So OFFSET(A1,2,3,4,5) will give you a range 4 rows high, 5 rows across that starts 2 rows below and 3 columns to the right of A1 (i.e. D3:H6).

Named ranges are also great. You can reference them in formulae, and if they change in dimensions, all you need to do is change the reference behind the named range, and all of your formulae update accordingly. Fantastic!

But what if you know in advance that your referenced range is going to change? And what if the height and width of the range can be determined based on the contents of a column and row?

So let’s combine the two. Let’s assume you have a range of data that starts in A2, with column titles in row 1. And let’s assume that your columns and rows are contiguous (i.e. no blank rows or columns), and a column title signifies that it should be part of the range, while an entry in column A qualifies that row as being part of the range.

Create a named range (Formulas | Define Name) called my_data. And name the range as follows:

=OFFSET(A2,0,0,COUNTA(A:A)-1,COUNTA(1:1))

(Note, the zeros mean that we’re not actually using the offset bit of the function, but that’s by the by.)

The range referenced will start in A2, will (if the earlier assumptions hold) end at the bottom-most row which has a value in column A and go as far right as there are columns with a title. The "-1" adjustor is to account for the title in column A, which you don’t want to count. The COUNTA function simply counts the non-empty cells in a range.

By having a dynamic range, you can add rows and columns to a range that is being looked up in without fear of rows or columns being missed out of the look-up. So a database of contacts can be added to and the new names will be included in any referenced range; and new columns of data can be added with similar confidence.

Useful custom formats

Posted on May 8, 2011 by Dan

Some neat Excel custom formats

A few nice custom Excel formats for you to use.

<;;;> Stores a value, but doesn’t display it
<00000> Always stores five digits. So 453 becomes 00453. Useful for US zip codes
<[Black] General> To hide error messages (e.g. #DIV/0!). If you make your standard font colour the same as the background (usually white), then this will kick in to give non-error cells a black font
<0;-0;;@> Not sure why this works, but it hides zero values
[#.???] This will line up your decimal places in a column, and display three decimal places, but only if they’re significant. "3.2" will display as "3.2", not "3.200". The only slight issue is that "3" will display as "3."
< @> Put a bunch of spaces before the @, and these will appear at the beginning of the cell, resulting in a padding
<#,###"km"> Type in 1000, and it will appear as 1,000km, but you can still use the value to do calculations
<@*.> This will pad your cell out with dots to the width of the cell. So "Excel" will appear as "Excel…………". Might be useful for tables of contents
<#,###,, "M"> Again, not sure how this works, but it divides the value by a million and displays a trailing "M"
<[<=2]"Low"* 0;[>=4]"High"* 0;"Average"* 0> This will display the word at the left of the cell, and the value at the right

To use them, go to Format | Cells and click Custom. Then type the bit between the < and the > above to get the desired result.

The beauty about the above is that only change the display format. They don’t affect the value stored.

Paste Special between Excel instances

Posted on May 8, 2011 by Dan

If you have two instances of Excel open, you can’t Copy | Paste Special between the two. Which can be a bit of a pain.

Instead, create a new spreadsheet in the destination instance, paste into that, and then Copy | Paste Special from that one to the destination. Not elegant, but it does the job in most cases.

Hiding errors

Posted on May 8, 2011 by Dan

Challenge: I want to hide error values.

Error values are pretty ugly. #N/A, #VALUE, #REF etc. When the eye sees them, it triggers an ever so slight wince. They occur when something's awry: when lookups don't yield any results, when cells being referenced no longer exist etc.

In the past, I used to cater for such anomalies in formulae. In the most common example, I used to use an IF statement to get around VLOOKUPS that yielded nothing.

=IF(ISNA(VLOOKUP(A2,Range,2,0)),"",VLOOKUP(A1,Range,2,0))

In human speak, if the lookup brings back "not applicable", leave the cell empty, otherwise bring back the lookup value.

This meant for woefully inefficient formulae and tiresome repetition on the part of their author (me).

Thinking about the problem, I figured it probably makes more sense, both from a presentation and a memory perspective, to cater for this circumstance using a formula-based conditional format, using the formula:

=ISERROR(A1)

As for the format, change the number format to ";;;" (without the quotes). This hides the contents of cells that meet the criterion, meaning that those ugly error messages won't appear.

Conditional formatting of number formats was only introduced in Excel 2007, so this tip is only applicable in more recent versions, I’m afraid.

The ROUND function works both sides of the decimal place

Text to columns: saving your file layout

Conditional minimums

A cute Excel conditional formatting quirk

SUMPRODUCTIF

Variable vlookup offsets

Dynamic named ranges

Useful custom formats

Paste Special between Excel instances

Hiding errors

Wizard’s tips by email

Search this site

Topics

Wizard’s tips by email

Search this site

Topics

Tags