Distances between places as the crow flies

Challenge: I have two the latitude and longitude of two places. I want to know the distance between them, as the crow flies.

This took a lot of trial and error, made more tricky by having to also accommodate two points that straddled the equator. But the formula has been tested and works.

Put the latitude of point A (in degrees) in A1; its longitude in A2. And put the latitude of point B in B1; and its longitude in B2.

Below is the formula for the distance between the points.

=ROUNDDOWN(6371.0072*ACOS(COS(RADIANS(90-A1))*COS(RADIANS(90-B1))+SIN(RADIANS(90-A1))*SIN(RADIANS(90-B1))*COS(RADIANS(A2-B2))),1)

The 6371.0072 is the radius of the Earth, the first time I've ever used it in an Excel formula. The ROUNDDOWN function merely uses the "1" at the end of the formula to round it down to 0.1km in accuracy.

Frequency count on a range of data

Challenge: I have thousands of rows of data, each containing multiple columns of piece of information. I want a frequency count of each piece of information that appears.

I received some data the other day. In total, there were 1,000 rows of raw data. Each record contained a set of comma-separated tags. The most tag-heavy record had 200 tags; other records had but one or two. My mission was to create a single, deduplicated list of tags, together with their frequencies.

First, load the data into the first column of Excel. The next step is to get each tag into a separate cell.

Data | Text to Columns

This gives you the option to declare the comma as the separator, and what was a single column becokmes 200 columns of data. The next step is to stack them. To do this requires two functions: ADDRESS and INDIRECT.

The ADDRESS function brings back the actual text of a cell reference. So =ADDRESS(2,3) will bring back the cell reference of row 2, column 3, i.e. "C2". But it brings this back as text.

This is where the INDIRECT function comes into play. INDIRECT takes some text and interprets it as a formula. So =INDIRECT(C2) will give you the contents of C2.

Combine the two, and you have:

=INDIRECT(ADDRESS(2,3))

This will bring back the contents of cell C2. But if you make the two arguments of the ADDRESS formula into cell references, you have poetry.

The idea is to create two lists containing the vertical and horizontal reference numbers of each cell in the 1,000-by-200 matrix, and use that to inform the above formula.

So in a new sheet, create two columns—one containing the row references of the tags and one containing their column references. In the first, put 200 1s followed by 200 2s, 200 3s etc. until we hit 1,000—200,000 rows in total. The second column contained the numbers 1 through 200, repeated 1,000 times.

In the third column, apply the above formula to each of those row–column combinations. This will bring back every single tag in a list, with zeros where there was no tag. (I.e. a record with five tags will have those listed followed by 195 zeros.)

Copy and paste values, and you can then sort the list alphabetically by tag, removing the zeros.

A simple PivotTable then gives you the frequency count for each tag.