Correlation

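Case I, X and Y are both normally distributed, with population correlation

$$\rho = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y},$$

then $\rho$ is estimated by the sample (Pearson) correlation

$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}, \qquad S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}), \quad S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2, \quad S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2.$$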
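A minimal Python sketch of the sample Pearson correlation, assuming the usual definition $r = S_{xy} / \sqrt{S_{xx} S_{yy}}$. The function name `pearson_r` and the data are illustrative assumptions, not part of the notes.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Made-up illustrative data (not from the notes).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(x, y)
print(f"r = {r:.4f}")
```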

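Case II, A is normal, B is binary (B = 0 or 1); the correlation corr(A, B):

Just treat B as a continuous variable and calculate the Pearson correlation r of A and B (the point-biserial correlation).

Test the significance of this correlation:

Note $t = \dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}$, or equivalently $r^2 = \dfrac{t^2}{t^2 + n - 2}$,

where t is the two-sample t-statistic (with n - 2 degrees of freedom) for testing equality of the means of A in the groups B = 1 and B = 0 under the equal-variance hypothesis.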
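The link between the correlation with a 0/1 variable and the pooled-variance two-sample t-statistic can be checked numerically. The sketch below assumes the standard identity $t = r\sqrt{n-2}/\sqrt{1-r^2}$. The helper names `pearson_r` and `pooled_t` and the data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    s_xx = sum((a - mx) ** 2 for a in x)
    s_yy = sum((b - my) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

def pooled_t(g1, g0):
    """Two-sample t-statistic for mean(g1) - mean(g0), equal variances assumed."""
    n1, n0 = len(g1), len(g0)
    m1, m0 = sum(g1) / n1, sum(g0) / n0
    ss1 = sum((v - m1) ** 2 for v in g1)
    ss0 = sum((v - m0) ** 2 for v in g0)
    pooled_var = (ss1 + ss0) / (n1 + n0 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1 / n1 + 1 / n0))

# Made-up illustrative data: a is continuous, b is binary.
a = [5.1, 6.3, 5.8, 7.2, 4.9, 6.8, 7.5, 5.5]
b = [0, 1, 0, 1, 0, 1, 1, 0]
n = len(a)

r = pearson_r(a, b)
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
t_two_sample = pooled_t([v for v, g in zip(a, b) if g == 1],
                        [v for v, g in zip(a, b) if g == 0])
print(t_from_r, t_two_sample)  # the two values agree up to rounding
```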

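Case III, A is normal, B is a categorical variable with k > 2 categories.

Then the squared correlation (correlation ratio) is $\eta^2 = \dfrac{SS_{between}}{SS_{total}} = \dfrac{(k-1)F}{(k-1)F + (n-k)}$, where F is the F-statistic for testing equality of the k group means in one-way ANOVA.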
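As an illustration, the sketch below computes the correlation ratio $\eta^2 = SS_{between}/SS_{total}$ directly and again from the one-way ANOVA F-statistic, assuming the relation $\eta^2 = (k-1)F / ((k-1)F + n - k)$. The group labels and values are made up.

```python
# Made-up illustrative data: values of A within each category of B.
groups = {
    "low": [4.1, 5.0, 4.6, 5.3],
    "mid": [5.9, 6.4, 5.5, 6.1],
    "high": [7.2, 6.8, 7.9, 7.4],
}

values = [v for g in groups.values() for v in g]
n, k = len(values), len(groups)
grand_mean = sum(values) / n

# Between-group and within-group sums of squares.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                 for g in groups.values())
ss_within = sum((v - sum(g) / len(g)) ** 2
                for g in groups.values() for v in g)

f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
eta2_direct = ss_between / (ss_between + ss_within)
eta2_from_f = (k - 1) * f_stat / ((k - 1) * f_stat + (n - k))
print(eta2_direct, eta2_from_f)  # the two values agree up to rounding
```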

Case IV, if both A, B are binary,

B \ A     1      0
1         N11    N12
0         N21    N22

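Let $N_{i+} = N_{i1} + N_{i2}$ and $N_{+j} = N_{1j} + N_{2j}$ be the row and column totals, and $n = N_{11} + N_{12} + N_{21} + N_{22}$,

then define corr(A, B) as

$$r = \frac{N_{11} N_{22} - N_{12} N_{21}}{\sqrt{N_{1+} N_{2+} N_{+1} N_{+2}}}$$ (analogous to Case I)

Note $r^2 = Q/n$, where Q is the Pearson chi-square statistic for testing the association of A and B.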
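A short numerical check for the 2 x 2 case, assuming the correlation is the phi coefficient $(N_{11}N_{22} - N_{12}N_{21}) / \sqrt{N_{1+}N_{2+}N_{+1}N_{+2}}$ and that Q is the usual Pearson chi-square statistic, so that $Q = n r^2$. The cell counts are made up.

```python
import math

# Made-up illustrative cell counts (rows: B = 1, 0; columns: A = 1, 0).
n11, n12, n21, n22 = 30, 10, 15, 45
table = [[n11, n12], [n21, n22]]

row_totals = [n11 + n12, n21 + n22]
col_totals = [n11 + n21, n12 + n22]
n = sum(row_totals)

# Phi coefficient: Pearson correlation of the two 0/1 variables.
phi = (n11 * n22 - n12 * n21) / math.sqrt(
    row_totals[0] * row_totals[1] * col_totals[0] * col_totals[1])

# Pearson chi-square: sum of (observed - expected)^2 / expected.
q = 0.0
for i in range(2):
    for j in range(2):
        expected = row_totals[i] * col_totals[j] / n
        q += (table[i][j] - expected) ** 2 / expected

print(phi, q, n * phi ** 2)  # q and n * phi^2 agree up to rounding
```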

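Case V, if both A and B are categorical with I and J categories respectively,

Then a standard analogue of Case IV is Cramér's V, $V^2 = \dfrac{Q}{n \min(I-1,\, J-1)}$, where Q is the Pearson chi-square statistic of the I x J table.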

Case VI, both X, Y are non-normal variables.

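Let $R(x_i)$ and $R(y_i)$ be the ranks of $x_i$ and $y_i$ within X and Y respectively, and $d_i = R(x_i) - R(y_i)$; then define the Spearman rank correlation as

$$r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}.$$

It can be shown that, when there are no ties, $r_s$ equals the Pearson correlation of the ranks $R(x_i)$ and $R(y_i)$.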

E.g., two judges each rate the same 10 works.

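When n > 10, an approximate t-test as in Case II can be applied: refer $t = r_s\sqrt{n-2}/\sqrt{1-r_s^2}$ to a t distribution with n - 2 degrees of freedom.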
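A minimal sketch of the Spearman rank correlation, assuming the usual no-ties formula $r_s = 1 - 6\sum d_i^2 / (n(n^2-1))$. The helpers `ranks` and `spearman` are hypothetical names. The data are the five x and y scores used in the Kendall example of these notes, so n is too small for the t approximation and only $r_s$ is computed.

```python
def ranks(values):
    """Rank 1 = smallest value; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(x, y):
    """Spearman rank correlation from the rank differences d_i."""
    n = len(x)
    rank_x, rank_y = ranks(x), ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))

x = [83, 72, 65, 79, 86]
y = [82, 70, 74, 87, 92]
r_s = spearman(x, y)
print(r_s)  # 0.8 up to floating-point rounding (sum of d_i^2 is 4)
```

Ranking from smallest to largest or from largest to smallest gives the same $d_i^2$, so the direction of ranking does not change $r_s$ as long as both variables are ranked the same way.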

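Case VII, alternative to the Spearman rank correlation for small n: Kendall's $\tau$:

Step 1: rank x and y.

Step 2: sort the subjects by the rank of x.

Step 3: count the pairs of y ranks that are out of order (in disarray) in this arrangement, say d pairs.

Step 4: $\tau = 1 - \dfrac{2d}{n(n-1)/2} = 1 - \dfrac{4d}{n(n-1)}$.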

e.g.

Raw scores:

       a     b     c     d     e
x     83    72    65    79    86
y     82    70    74    87    92

→ ranks (1 = largest):

       a     b     c     d     e
x      2     4     5     3     1
y      3     5     4     2     1

→ sorted by the rank of x:

       e     a     d     b     c
x      1     2     3     4     5
y      1     3     2     5     4

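d = 2 (the out-of-order pairs of y ranks are (3, 2) and (5, 4)), so $\tau = 1 - \dfrac{4 \cdot 2}{5 \cdot 4} = 0.6$.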
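The four steps can be sketched in Python as follows, assuming no ties and the formula $\tau = 1 - 4d/(n(n-1))$. The function name `kendall_tau` is hypothetical. The data are the x and y scores of the example.

```python
def kendall_tau(x, y):
    """Kendall's tau by counting out-of-order y pairs after sorting by x.

    Assumes no ties. Returns (tau, d), where d is the number of
    discordant (out-of-order) pairs.
    """
    n = len(x)
    # Steps 1-2: arrange the subjects in increasing order of x.
    y_sorted = [b for _, b in sorted(zip(x, y))]
    # Step 3: count pairs (i, j), i < j, whose y values are out of order.
    d = sum(1 for i in range(n) for j in range(i + 1, n)
            if y_sorted[i] > y_sorted[j])
    # Step 4: convert the count into tau.
    return 1 - 4 * d / (n * (n - 1)), d

x = [83, 72, 65, 79, 86]
y = [82, 70, 74, 87, 92]
tau, d = kendall_tau(x, y)
print(d, tau)  # d is 2 and tau is 0.6 up to floating-point rounding
```

Sorting by the raw x values in increasing order, rather than by ranks with 1 = largest, reverses the arrangement but leaves the number of out-of-order pairs unchanged.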

Case VIII, $X_1, \dots, X_k$ are non-normal variables: Kendall's coefficient of concordance

E.g., k judges each rate the same 10 works (k dependent samples of size n).

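Let $R_{ij}$ be the rank of the ith subject within sample j, $R_i = \sum_{j=1}^{k} R_{ij}$ be the sum of the ranks of subject i across the k samples, and $\bar{R}$ be the mean of the $R_i$. Then the coefficient is

$$W = \frac{12\, S}{k^2\, n\,(n^2 - 1)},$$

where $S = \sum_{i=1}^{n} (R_i - \bar{R})^2$ and $\bar{R} = k(n+1)/2$.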
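A minimal sketch of Kendall's coefficient of concordance, assuming no ties and the standard form $W = 12 S / (k^2 n (n^2 - 1))$ with $S = \sum_i (R_i - \bar{R})^2$. The function name `kendall_w` and the rank tables are made up for illustration.

```python
def kendall_w(rank_table):
    """Kendall's W for k judges ranking n subjects (no ties).

    rank_table[j][i] is the rank that judge j gives to subject i.
    """
    k = len(rank_table)
    n = len(rank_table[0])
    # R_i: sum of the ranks of subject i across the k judges.
    rank_sums = [sum(judge[i] for judge in rank_table) for i in range(n)]
    mean_rank_sum = k * (n + 1) / 2
    s = sum((r - mean_rank_sum) ** 2 for r in rank_sums)
    return 12 * s / (k * k * n * (n * n - 1))

# Made-up rank tables: complete agreement, and two exactly opposed judges.
w_agree = kendall_w([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
w_opposed = kendall_w([[1, 2, 3, 4], [4, 3, 2, 1]])
print(w_agree, w_opposed)  # 1.0 and 0.0
```

W ranges from 0 (the rank sums are all equal, so the judges show no overall agreement) to 1 (all judges give identical rankings).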
