20_R

hrafnulf13
Dec 2, 2020

In statistics, a rank correlation is any of several statistics that measure an ordinal association: the relationship between rankings of different ordinal variables, or between different rankings of the same variable. Here a "ranking" is the assignment of the ordering labels "first", "second", "third", etc. to different observations of a particular variable [2].

A rank correlation coefficient measures the degree of similarity between two rankings, and can be used to assess the significance of the relation between them.

If, for example, one variable is the identity of a college basketball program and another variable is the identity of a college football program, one could test for a relationship between the poll rankings of the two types of program: do colleges with a higher-ranked basketball program tend to have a higher-ranked football program? A rank correlation coefficient can measure that relationship, and the measure of significance of the rank correlation coefficient can show whether the measured relationship is small enough to likely be a coincidence.

If there is only one variable, the identity of a college football program, but it is subject to two different poll rankings (say, one by coaches and one by sportswriters), then the similarity of the two different polls' rankings can be measured with a rank correlation coefficient.

As another example, in a contingency table with low income, medium income, and high income in the row variable and educational level (no high school, high school, university) in the column variable, a rank correlation measures the relationship between income and educational level.

Some of the more popular rank correlation statistics include:

Spearman's ρ
Kendall's τ
Goodman and Kruskal's γ
Somers' D

An increasing rank correlation coefficient implies increasing agreement between rankings. The coefficient is inside the interval [−1, 1] and assumes the value:

1 if the agreement between the two rankings is perfect; the two rankings are the same.
0 if the rankings are completely independent.
−1 if the disagreement between the two rankings is perfect; one ranking is the reverse of the other.
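
The two extreme values can be checked with a small sketch. It uses Kendall's τ from the list above as the example coefficient, computed by counting concordant and discordant pairs; the function name `kendall_tau` and the five-item ranking are illustrative assumptions, and the code assumes there are no ties.

```python
from itertools import combinations


def kendall_tau(r1, r2):
    """Kendall's tau for two rankings of the same objects (assumes no ties)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(r1)), 2):
        # Positive product: both rankings order the pair (i, j) the same way.
        sign = (r1[i] - r1[j]) * (r2[i] - r2[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(r1) * (len(r1) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical ranking of five objects.
ranking = [1, 2, 3, 4, 5]
print(kendall_tau(ranking, ranking))        # identical rankings: 1.0
print(kendall_tau(ranking, ranking[::-1]))  # reversed ranking: -1.0
```

The value 0 is not shown here because independence is a statement about the distribution of rankings, not about one particular pair.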

A ranking can be seen as a permutation of a set of objects. Thus we can look at observed rankings as data obtained when the sample space is (identified with) a symmetric group. We can then introduce a metric, making the symmetric group into a metric space. Different metrics will correspond to different rank correlations.
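
As a rough illustration of this idea, the sketch below treats two rankings as permutations, measures how far apart they are in two different ways, and rescales each distance to [−1, 1]. The Kendall tau distance (number of pairs ordered differently) gives Kendall's τ, and the sum of squared rank differences gives Spearman's ρ. The helper names and the two example permutations are assumptions made for illustration, and ties are not handled.

```python
from itertools import combinations


def kendall_distance(p, q):
    """Number of object pairs that the two rankings order differently."""
    return sum(
        1
        for i, j in combinations(range(len(p)), 2)
        if (p[i] - p[j]) * (q[i] - q[j]) < 0
    )


def squared_rank_distance(p, q):
    """Sum of squared differences between the ranks of each object."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def to_coefficient(distance, max_distance):
    """Map distance 0 to +1 and the largest possible distance to -1."""
    return 1 - 2 * distance / max_distance


# Two hypothetical rankings of the same five objects.
p = [1, 2, 3, 4, 5]
q = [2, 1, 3, 5, 4]
n = len(p)

# Largest distances are reached when one ranking is the reverse of the other.
tau = to_coefficient(kendall_distance(p, q), n * (n - 1) / 2)
rho = to_coefficient(squared_rank_distance(p, q), n * (n * n - 1) / 3)
print(round(tau, 2), round(rho, 2))
```

For this pair the two distances are 2 and 4, so the same two rankings get τ = 0.6 but ρ = 0.8: the choice of distance changes the coefficient.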

The Spearman correlation coefficient, ρ, can take values from +1 to −1. A ρ of +1 indicates a perfect association of ranks, a ρ of zero indicates no association between ranks, and a ρ of −1 indicates a perfect negative association of ranks. The closer ρ is to zero, the weaker the association between the ranks [1, 3].

Let's take the example from [1] and calculate a Spearman rank-order correlation on data without any ties: the marks of n = 10 individuals in a Maths and an English exam (see [1] for the data table).

Each mark is converted to a rank within its exam, where d is the difference between an individual's two ranks and d² is that difference squared.

We then calculate the sum of the squared differences, Σd².

We then substitute this into the main equation for data without ties, ρ = 1 − 6Σd² / (n(n² − 1)), with n = 10.

Hence, we have a ρ of 0.67. This indicates a strong positive relationship between the ranks individuals obtained in the Maths and English exams. That is, the higher you ranked in Maths, the higher you ranked in English also, and vice versa.
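
The calculation can be reproduced with a short sketch. The marks below are assumed to match the table in [1] and should be checked against that page; the function names are illustrative, and the code only covers the no-ties case used in this example.

```python
def ranks_desc(values):
    """Rank values so the highest gets rank 1 (only valid without ties)."""
    if len(set(values)) != len(values):
        raise ValueError("this sketch does not handle tied values")
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]


def spearman_no_ties(x, y):
    """Return (sum of squared rank differences, Spearman's rho)."""
    if len(x) != len(y):
        raise ValueError("both samples must have the same length")
    n = len(x)
    rx, ry = ranks_desc(x), ranks_desc(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
    return sum_d2, rho


# Marks of the 10 individuals, assumed to be the data from reference [1].
english = [56, 75, 45, 71, 62, 64, 58, 80, 76, 61]
maths = [66, 70, 40, 60, 65, 56, 59, 77, 67, 63]

sum_d2, rho = spearman_no_ties(english, maths)
print(sum_d2, round(rho, 2))  # 54 0.67
```

With these marks the squared differences sum to 54, so ρ = 1 − (6 × 54) / (10 × 99) = 1 − 324/990 ≈ 0.67, which agrees with the value quoted above.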

References

[1] https://statistics.laerd.com/statistical-guides/spearmans-rank-order-correlation-statistical-guide-2.php
[2] https://en.wikipedia.org/wiki/Rank_correlation
[3] https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient

Statistics 2020-2021

MSc Cybersecurity, Sapienza University
