# Chi-Square Test in R

## GOAL

To do chi-square test and residual analysis in R. If you want to know chi-square test and implement it in python, refer “Chi-Square Test in Python”.

## Source Code

```> test_data <- data.frame(
groups = c("A","A", "B", "B", "C", "C"),
result = c("success", "failure", "success", "failure", "success",
"failure"),
number = c(23, 100, 65, 44, 158, 119)
)

> test_data
groups  result number
1      A success     23
2      A failure    100
3      B success     65
4      B failure     44
5      C success    158
6      C failure    119

> cross_data <- xtabs(number ~ ., test_data)
> cross_data
result
groups failure success
A     100      23
B      44      65
C     119     158

> result <- chisq.test(cross_data, correct=F)
> result
Pearson's Chi-squared test

data:  cross_data
X-squared = 57.236, df = 2, p-value = 3.727e-13

> reesult\$residuals
result
groups   failure   success
A  4.571703 -4.727030
B -1.641673  1.697450
C -2.016609  2.085125

> result\$stdres
result
groups   failure   success
A  7.551524 -7.551524
B -2.663833  2.663833
C -4.296630  4.296630

> pnorm(abs(result\$stdres), lower.tail = FALSE) * 2
result
groups      failure      success
A 4.301958e-14 4.301958e-14
B 7.725587e-03 7.725587e-03
C 1.734143e-05 1.734143e-05```

## functions

### xtabs

xtabs() function is a function to create a contingency table from cross-classifying factors that contained in a data frame. “~” is the formula to specify variables that serve as aggregation criteria are described. And “~ .” means that this function use all variables (groups+result).

### chisq.test

chisq.test() function is a function to return the test statistic, degree of freedom and p-value. The argument “correct” is continuity correction and set “correct” into F to suppress the continuity correction.

### reesult\$residuals

\$residuals return standardized residuals.