Calculating Moran’s I for Spatial Data in Python!
Today, we are going to continue exploring the world of spatial statistics. If you come from a quantitative background, you’ve probably been exposed to many meaningful letters like p and Z and t. In this article, I am going to add another to the statistics alphabet soup: I!
Scenario
Suppose you are analyzing the distribution of crime rates across counties in your state. You are interested not merely in whether crime is high or low in a specific area, but in whether crime rates in neighboring areas relate to each other. Are areas with high crime clustered together? Are safe areas surrounded by similarly safe areas, or are there unexpected contrasts?
I get bored easily and hate seeing data analyses of only crime rates. I want to add a twist to this problem.
Instead of looking at the crime rates themselves, I want to look at cold case rates (cases that still have not been solved) per capita. This captures not only the crimes committed but also the ability of law enforcement to hold culprits accountable. We will look at Virginia for this.
Method
To add another twist, I am also going to perform the analysis with areal data rather than coordinate (point) data. Here, we aggregate our data by region and compare each region with its neighbors. This article will only consider the areal analysis (look out for a future article calculating Moran’s I with coordinate data!).
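Before touching real data, it may help to see what Moran’s I looks like on a toy areal example. The sketch below is not the workflow used for the Virginia data; it is a minimal NumPy version of the standard global Moran’s I formula, I = (n / S0) * (sum of w_ij * z_i * z_j) / (sum of z_i^2), where z is the deviation from the mean and S0 is the sum of all weights. The four regions, their rates, the binary "shares a border" weights, and the function name morans_i are all made up for illustration.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for areal values and a spatial weights matrix.

    values  : one number per region (hypothetical rates here)
    weights : n x n matrix, weights[i][j] > 0 if regions i and j are neighbors
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = x.size
    z = x - x.mean()          # deviations from the mean
    s0 = w.sum()              # sum of all spatial weights
    numerator = z @ w @ z     # sum over i, j of w_ij * z_i * z_j
    denominator = z @ z       # sum over i of z_i squared
    return (n / s0) * numerator / denominator

# Four hypothetical regions laid out in a row: 0 - 1 - 2 - 3.
# Regions are neighbors (weight 1) if they share a border.
w = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

# Made-up rates that rise smoothly from one end to the other.
smooth_rates = np.array([1.0, 2.0, 3.0, 4.0])
# Made-up rates that alternate between low and high.
alternating_rates = np.array([1.0, 4.0, 1.0, 4.0])

moran_smooth = morans_i(smooth_rates, w)
moran_alternating = morans_i(alternating_rates, w)

print(f"smooth: {moran_smooth:.3f}")            # 0.333
print(f"alternating: {moran_alternating:.3f}")  # -1.000
```

Similar values sitting next to each other push I above zero, while a checkerboard-like pattern pushes it below zero. With real areal data the weights matrix would come from the region geometries rather than being typed in by hand.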