Last week I was putting together a document that involved typing many, many ZIP codes from across the United States. It required looking up addresses for approximately 350 locations, and after a while I realized that I was getting pretty darn good at predicting the first digit of a ZIP code from the location, and vice versa (i.e., if I looked at the first digit of the ZIP code, I could guess the location within a few states).
As I was collecting this data into my spreadsheet, I was developing a hypothesis . . . the first digit of the ZIP code is directly related to the year a state joined the union.
Remember, directly related means that as the year the state joined the union increases, the first digit of the ZIP code also increases. In other words, the first digit of the ZIP code depends on the year the state joined the union. To test my hypothesis, I used a map of the U.S. and wrote in the first digits of the ZIP codes I knew.
And then I created a table of values with the same information. (An X means I didn’t have the ZIP for any location in that particular state. A quick Google search could have helped me find it, but I just didn’t have it in the document I was working from. Also, if my hypothesis proved correct, I likely wouldn’t need it!) Here is the table:
|State|First Digit of ZIP|Year of Statehood|
And I made a scatterplot:
So, I’m going to go ahead and say that my hypothesis was not overwhelmingly correct. It looks like the year the state joined the union may be related to the first digit of the ZIP code, but clearly my theory has some flaws. For example, look at the first few entries in the table. States that joined the union after the first few states have ZIP codes starting with 0, 1, or 2!
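One way to put a number on "may be related" is a correlation coefficient. Below is a minimal Python sketch, assuming a small hand-typed subset of states rather than the full table from this post. The `STATES` list and the `pearson_r` helper are names made up for illustration, and each state is given the first digit that covers most of its ZIP codes.

```python
from math import sqrt

# Illustrative subset only (NOT the full table from the post):
# (state, first digit of most ZIP codes, year of statehood)
STATES = [
    ("Delaware", 1, 1787),
    ("New Jersey", 0, 1787),
    ("Georgia", 3, 1788),
    ("Massachusetts", 0, 1788),
    ("Vermont", 0, 1791),
    ("Ohio", 4, 1803),
    ("Illinois", 6, 1818),
    ("Maine", 0, 1820),
    ("California", 9, 1850),
    ("Colorado", 8, 1876),
    ("Arizona", 8, 1912),
    ("Alaska", 9, 1959),
    ("Hawaii", 9, 1959),
]


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (numerator of r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (used in the denominator of r).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


years = [year for _, _, year in STATES]
digits = [digit for _, digit, _ in STATES]

r = pearson_r(years, digits)
print(f"Pearson r for this subset: {r:.2f}")
```

A value of r near 1 would mean the first digit climbs steadily with the statehood year, and a value near 0 would mean no linear relationship at all. Whatever the subset gives, it only describes those few states, and it says nothing about why the two numbers move together.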
Ugh. Then, you know what I wondered about? Would there even have been a need for ZIP codes (i.e., a post office) when the 13 original colonies became states? In fact, when did the post office start using ZIP codes anyway? Well, I found my answer . . . 1963. Yes, really. 1963.
Goodness gracious. All 50 states had joined the union by the time ZIP codes were introduced.
This experience made me think about two things:
1. Just because two variables are correlated doesn’t mean that one causes the other.
2. I wonder what a better predictor of ZIP codes would be?