Stats Works

Hotelling's T-Test Example

Here is an example of Hotelling's multivariate (MV) T-test applied to a set of responses and a hypothesized mean vector. Hotelling's test is the multivariate analogue of the one-sample t-test: it compares the observed sample mean vector with a hypothesized mean vector. As in the univariate case, we assume that the population is multivariate normally distributed, that is:

$$ \vec{x} \sim MVN(\vec{\mu}, \Sigma) $$

where $\vec{\mu}$ is the mean vector and $\Sigma$ is the covariance matrix, a square, symmetric, positive definite matrix.
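
For reference, the quantities computed by the code in this post can be written out as follows. Here $\bar{x}$ is the sample mean vector, $S$ the sample covariance matrix, $n$ the number of observations and $p$ the number of variables; the symbol $\vec{\mu}_0$ for the hypothesized mean is notation introduced here for convenience.

$$ T^2 = n (\bar{x} - \vec{\mu}_0)^T S^{-1} (\bar{x} - \vec{\mu}_0), \qquad F = \frac{n - p}{p(n - 1)} T^2 \sim F_{p,\, n - p} \text{ under } H_0 $$

The p-value is the upper-tail probability of the observed $F$ under the $F_{p,\, n - p}$ distribution, which requires $n > p$ so that $S$ is invertible.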

This test is implemented below.

In [4]:
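## Hotelling's T^2 test for a multivariate normal mean vector.

# p-value for H0: the population mean vector equals hypothesized_mean_vector.
prob_null <- function(data, hypothesized_mean_vector) {
  n <- nrow(data)  # number of observations
  p <- ncol(data)  # number of variables
  tsqobs <- hotellings_tsq_statistic(data, hypothesized_mean_vector)
  f_dist <- hotellings_f_statistic_trans(tsqobs, n, p)

  # Upper-tail probability of the F statistic. The result keeps the name 'f'
  # from the subsetting, which is why the p-value prints with an 'f' label.
  return(1 - pf(f_dist['f'], f_dist['df1'], f_dist['df2']))
}

# Convert T^2 to an F statistic with (p, n - p) degrees of freedom.
hotellings_f_statistic_trans <- function(tsqobs, n, p) {
  f <- ((n - p)/(p*(n - 1))) * tsqobs
  df1 <- p
  df2 <- n - p
  return(c(f = f, df1 = df1, df2 = df2))
}

# T^2 = n * (xbar - mu)' S^{-1} (xbar - mu); needs n > p so S is invertible.
hotellings_tsq_statistic <- function(data, hypothesized_mean_vector) {
  sample_mean <- colMeans(data)
  sample_covar <- cov(data)

  S_inv <- solve(sample_covar)
  n <- nrow(data)
  mu <- hypothesized_mean_vector

  tsqobs <- n*t(sample_mean - mu) %*% S_inv %*% (sample_mean - mu)
  return(tsqobs)
}

# Discriminant coefficients S^{-1} (xbar - mu): the linear combination of
# variables that best separates the sample mean from the hypothesized mean.
find_discriminant <- function(data, hypothesized_mean_vector) {
  sample_mean <- colMeans(data)
  sample_covar <- cov(data)

  discriminant <- solve(sample_covar) %*% (sample_mean - hypothesized_mean_vector)
  return(discriminant)
}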
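
One way to sanity-check the formulas above is the case of a single variable, where Hotelling's statistic should reduce to the square of the ordinary one-sample t statistic and the two p-values should agree. The sketch below does this without relying on the functions above; the sample `x` and the null mean `mu0` are made-up values used only for this check.

```r
# Hypothetical univariate sample and null mean, made up for this check.
x   <- c(5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 4.7, 6.3)
mu0 <- 5
n   <- length(x)
p   <- 1

# Hotelling's T^2 with p = 1: n * (xbar - mu0)^2 / s^2
tsq <- n * (mean(x) - mu0)^2 / var(x)

# F transformation used in the post; the scaling factor equals 1 when p = 1
f_obs <- ((n - p) / (p * (n - 1))) * tsq
p_hotelling <- pf(f_obs, p, n - p, lower.tail = FALSE)

# Classical two-sided one-sample t-test on the same data
tt <- t.test(x, mu = mu0)

# Both comparisons should hold up to floating-point tolerance
isTRUE(all.equal(tsq, unname(tt$statistic)^2))
isTRUE(all.equal(p_hotelling, tt$p.value))
```

The agreement follows from the fact that the square of a t variable with n - 1 degrees of freedom has an F distribution with (1, n - 1) degrees of freedom.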
In [5]:
response_data <- matrix(c(51,27,37,42,27,
                          43,41,38,36,26,
                          29,36,20,22,36,
                          18,32,22,21,23,
                          31,20,50,26,41,
                          32,33,43,36,31,
                          27,31,25,35,17,
                          37,34,14,35,25,
                          20,25,32,26,42,
                          27,30,27,29,40,
                          38,16,28,36,25),ncol = 5)
hypothesized_mean <- c(30,25,40,25,30)
response_data

prob_null(response_data, hypothesized_mean)
find_discriminant(response_data, hypothesized_mean)
51 36 50 35 42
27 20 26 17 27
37 22 41 37 30
42 36 32 34 27
27 18 33 14 29
43 32 43 35 40
41 22 36 25 38
38 21 31 20 16
36 23 27 25 28
26 31 31 32 36
29 20 25 26 25
f: 0.00669952528414886
0.5298893
-0.2659554
-0.6946549
0.1483265
0.3206404

Because the p-value (about 0.0067) is less than 0.05, we reject the null hypothesis that the data come from a multivariate normal population with mean vector: $$ \vec{\mu} = \begin{bmatrix} 30 \\ 25 \\ 40 \\ 25 \\ 30 \end{bmatrix} $$

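The discriminant coefficient for the third variable has the largest absolute value (about -0.69), which suggests that the third variable contributes most to the separation between the hypothesized and sample means. Because these raw coefficients depend on each variable's scale, this reading is most reliable when the variables are measured on comparable scales.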
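
Raw discriminant coefficients are tied to the units of each variable, so a common convention is to multiply each coefficient by the standard deviation of its variable before comparing magnitudes. The sketch below applies that convention to the data from this post; the scaling step is an assumption added here, not part of the original analysis, and the names `a` and `a_std` are illustrative.

```r
# Data and hypothesized mean copied from the post
response_data <- matrix(c(51,27,37,42,27,
                          43,41,38,36,26,
                          29,36,20,22,36,
                          18,32,22,21,23,
                          31,20,50,26,41,
                          32,33,43,36,31,
                          27,31,25,35,17,
                          37,34,14,35,25,
                          20,25,32,26,42,
                          27,30,27,29,40,
                          38,16,28,36,25), ncol = 5)
hypothesized_mean <- c(30, 25, 40, 25, 30)

S    <- cov(response_data)
diff <- colMeans(response_data) - hypothesized_mean

# Raw discriminant coefficients, S^{-1} (xbar - mu0)
a <- solve(S, diff)

# Assumed convention: scale each coefficient by its variable's standard deviation
a_std <- a * sqrt(diag(S))

# Identity linking the discriminant to the test statistic: T^2 = n * a' (xbar - mu0)
tsq <- nrow(response_data) * sum(a * diff)

round(cbind(raw = a, standardized = a_std), 4)
```

If the ranking of the variables changes after scaling, the raw coefficients were reflecting differences in spread rather than differences in contribution.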

In [6]:
library(MASS)  # provides mvrnorm
ex <- mvrnorm(11, mu = hypothesized_mean, Sigma = cov(response_data))
prob_null(ex, hypothesized_mean)
find_discriminant(ex, hypothesized_mean)
f: 0.684532555387565
-0.03199313
-0.07721548
0.09417471
0.09898568
-0.13109563

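Here the sample is drawn from a multivariate normal distribution whose mean equals the hypothesized mean and whose covariance equals the sample covariance of the observed data. As expected, the result is not significant (a p-value of about 0.68 in this run). No random seed is set, so the exact values will differ from run to run.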
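
A single non-significant draw is only one data point. A small simulation can illustrate the same idea more systematically: when the null hypothesis is true, roughly 5% of samples should be rejected at the 0.05 level. The sketch below is self-contained; the helper `hotelling_p`, the seed, the number of replications and the covariance matrix `Sigma` are all assumptions chosen for illustration, and the samples are generated with a Cholesky factor instead of `mvrnorm` to avoid the extra dependency.

```r
set.seed(1)  # arbitrary seed, used only to make the sketch reproducible

# Hypothetical helper: p-value of the one-sample Hotelling's T^2 test
hotelling_p <- function(x, mu0) {
  n <- nrow(x)
  p <- ncol(x)
  d <- colMeans(x) - mu0
  tsq <- n * sum(d * solve(cov(x), d))
  pf((n - p) / (p * (n - 1)) * tsq, p, n - p, lower.tail = FALSE)
}

n   <- 11
p   <- 5
mu0 <- c(30, 25, 40, 25, 30)

# Assumed covariance: variance 60 and correlation 0.3 between all pairs
Sigma <- 60 * (0.3 + 0.7 * diag(p))
R <- chol(Sigma)  # upper triangular factor with t(R) %*% R equal to Sigma

p_values <- replicate(2000, {
  z <- matrix(rnorm(n * p), n, p)    # independent standard normals
  x <- sweep(z %*% R, 2, mu0, "+")   # rows are draws from MVN(mu0, Sigma)
  hotelling_p(x, mu0)
})

# Share of simulated samples rejected at the 0.05 level
mean(p_values < 0.05)
```

Because the F transformation is exact under multivariate normality, the rejection rate should sit near 0.05 regardless of the covariance matrix chosen.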

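Now, let's look at a test for the equality of two mean vectors, assuming equal covariance matrices and equal sample sizes. The code below forms the row-wise differences between the two samples and tests them against a zero mean vector with the one-sample test above.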

In [7]:
differences <- response_data - ex
prob_null(differences, rep(0,5))
find_discriminant(differences, rep(0,5))
f: 0.00818771246890448
0.35457592
-0.09934875
-0.70226005
0.01886289
0.33948696

We see the expected result: the difference between the two samples is significant (a p-value of about 0.0082), so there is evidence to reject the null hypothesis that they come from distributions with the same mean vector.
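
Taking row-wise differences requires the two samples to have the same number of rows. For two independent samples of possibly different sizes, the usual alternative under the equal-covariance assumption is the pooled two-sample Hotelling's test. The sketch below is one way to write it; the function name `two_sample_hotelling` and the example data are hypothetical and not part of the original post.

```r
# Hypothetical helper: pooled two-sample Hotelling's T^2 test
two_sample_hotelling <- function(x1, x2) {
  x1 <- as.matrix(x1)
  x2 <- as.matrix(x2)
  n1 <- nrow(x1)
  n2 <- nrow(x2)
  p  <- ncol(x1)
  d  <- colMeans(x1) - colMeans(x2)

  # Pooled covariance, appropriate under the equal-covariance assumption
  s_pooled <- ((n1 - 1) * cov(x1) + (n2 - 1) * cov(x2)) / (n1 + n2 - 2)

  # T^2 = (n1 * n2 / (n1 + n2)) * d' S_pooled^{-1} d
  tsq <- (n1 * n2 / (n1 + n2)) * sum(d * solve(s_pooled, d))

  # F transformation with (p, n1 + n2 - p - 1) degrees of freedom
  df1 <- p
  df2 <- n1 + n2 - p - 1
  f   <- (df2 / (p * (n1 + n2 - 2))) * tsq

  c(tsq = tsq, f = f, df1 = df1, df2 = df2,
    p_value = pf(f, df1, df2, lower.tail = FALSE))
}

# Made-up data purely to exercise the function
set.seed(42)
x1 <- matrix(rnorm(11 * 3, mean = 0), ncol = 3)
x2 <- matrix(rnorm(14 * 3, mean = 1), ncol = 3)
two_sample_hotelling(x1, x2)
```

With a single variable this reduces to the equal-variance two-sample t-test, which gives a convenient way to check the implementation against `t.test(..., var.equal = TRUE)`.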

Sean Ammirati - creator of Stats Works. He can be reached on Github, LinkedIn and email.


Published

Nov 18, 2018

Category

Multivariate Analysis
