Which metrics can be used to measure correlation of categorical data?

The chi-square test of independence can be used for this. It tests whether two categorical variables are associated, but the chi-square statistic on its own does not measure how strong that association is.

To quantify the strength of association between categorical variables, one commonly used metric is Cramér’s V.

Cramér’s V is a measure of association between two nominal variables and is an extension of the chi-square test. It ranges from 0 to 1, with 0 indicating no association and 1 indicating a perfect association. The formula for Cramér’s V is:

$$V = \sqrt{\frac{\chi^2}{n \times \min(k - 1,\ r - 1)}}$$

Where:

  • χ² is the chi-square statistic from the contingency table,
  • n is the total number of observations,
  • k is the number of categories in one variable, and
  • r is the number of categories in the other variable.
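As a rough illustration of the formula, the sketch below computes the chi-square statistic and Cramér’s V in plain Python. The function names `chi_square_statistic` and `cramers_v` and the example counts are hypothetical, chosen only for this illustration, and the code assumes every row and column of the table contains at least one observation.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table.

    `table` is a list of rows, each a list of observed counts.
    Assumes no row or column total is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the two variables.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def cramers_v(table):
    """Cramér's V: sqrt(chi2 / (n * min(k - 1, r - 1)))."""
    n = sum(sum(row) for row in table)
    r = len(table)     # number of categories in the row variable
    k = len(table[0])  # number of categories in the column variable
    chi2 = chi_square_statistic(table)
    return math.sqrt(chi2 / (n * min(k - 1, r - 1)))


# Hypothetical 2x2 table of counts, made up for illustration.
example = [[30, 10],
           [10, 30]]
print(cramers_v(example))
```

For this made-up table every expected count is 20, so χ² = 20 and V = √(20 / 80) = 0.5, a moderate association. In practice, a library routine such as SciPy's `scipy.stats.contingency.association` with `method="cramer"` may be a more convenient choice.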

Cramér’s V is commonly used with categorical variables to measure the strength of association between them. Because it is computed from χ², which is never negative, it does not indicate the direction of the association.