Calculating Moran’s I for Spatial Data in Python!

Venn Datagram
7 min read · Nov 2, 2024

Today, we are going to continue exploring the world of spatial statistics. If you come from a quantitative background, you’ve probably been exposed to many meaningful letters like p and Z and t. In this article, I am going to add another to the statistics alphabet soup: I!

Scenario

Suppose you are analyzing the distribution of crime rates across counties in your state. You are interested not merely in whether crime is high or low in a specific county, but also in whether crime rates in neighboring counties relate to each other. Are counties with high crime clustered together? Are safe counties surrounded by similarly safe ones, or are there unexpected contrasts?

I get bored easily and hate seeing data analyses of only crime rates. I want to add a twist to this problem.

Instead of looking at the crime rates themselves, I want to look at cold case rates (cases that still have not been solved) per population. This considers not only the crimes committed, but also the ability of law enforcement to hold culprits accountable. We will look at Virginia for this.

Method

To add another twist, I am also going to perform the analysis with areal data rather than coordinate (point) data. That means the data are aggregated by region instead of being tied to individual locations. This article will only consider the areal analysis (look out for a future article calculating Moran’s I with coordinate data!).

Formula For Moran’s I

Let’s first focus on the global trend and evaluate whether areas with similar cold case rates tend to cluster together. We will use Moran’s I to measure that spatial autocorrelation.

Here is the formula (Fig. 1):

Figure 1: Formula for Moran’s Global I Statistic.
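For readers following along in text, here is the standard textbook form of the global Moran’s I, written with the symbols used in this article (m regions, observed values yᵢ, mean ȳ, and weights wᵢⱼ). This assumes Fig. 1 shows the usual definition:

```latex
I = \frac{m}{\sum_{i=1}^{m}\sum_{j=1}^{m} w_{ij}}
    \cdot
    \frac{\sum_{i=1}^{m}\sum_{j=1}^{m} w_{ij}\,(y_i - \bar{y})(y_j - \bar{y})}
         {\sum_{i=1}^{m} (y_i - \bar{y})^2}
```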

This looks like a jumble. Let’s break it down for a moment.

We have yᵢ as the observed value in region i (i = 1, 2, …, m), ȳ as the mean of the yᵢ, and wᵢⱼ as the spatial proximity between regions i and j. The structure of this equation is similar to the Pearson correlation coefficient (Fig. 2):
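As a reminder, the usual sample Pearson correlation between two variables x and y over m paired observations can be written as follows (x and x̄ are notation introduced here only for the comparison):

```latex
r = \frac{\sum_{i=1}^{m} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{m} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{m} (y_i - \bar{y})^2}}
```

Both statistics sum products of deviations from the mean and scale them by the overall variation. The difference is that Moran’s I pairs each region with the other regions of the same variable and weights every pair by wᵢⱼ.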

Spatial Proximity

What is this? Spatial proximity describes how two observations are related in space. The term proximity implies a varying number associated with how far the…
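To make the role of wᵢⱼ concrete, here is a minimal NumPy sketch of the global Moran’s I on a toy example. Everything in it is an assumption made for illustration: the function name `morans_i`, the four made-up regions arranged in a row, and the binary contiguity weights (1 if two regions share a border, 0 otherwise). Binary contiguity is only one common way to encode proximity, and none of the numbers come from the Virginia data.

```python
import numpy as np


def morans_i(y, w):
    """Global Moran's I for values y (length m) and an m x m weights matrix w."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    m = y.size
    z = y - y.mean()                 # deviations from the mean, y_i - ybar
    cross = z @ w @ z                # sum_i sum_j w_ij * z_i * z_j
    return (m / w.sum()) * cross / (z ** 2).sum()


# Hypothetical layout: four regions in a row, each touching only its neighbors.
w = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

y_trend = [1, 2, 3, 4]   # similar values sit next to each other
y_alt = [1, 4, 1, 4]     # high and low values alternate

print(morans_i(y_trend, w))  # about 0.33: positive spatial autocorrelation
print(morans_i(y_alt, w))    # about -1.0: neighbors are systematically dissimilar
```

The diagonal of `w` is zero so that a region is never counted as its own neighbor. For real areal data, the weights would typically be built from the region geometries rather than typed by hand.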
