Statistics For Data Science Course

All About Student’s t test

A z-test uses the z-score z = (\bar{x} - 𝜇) / (σ / \sqrt{n}), where σ is the population standard deviation, which is not known in most cases. If the sample size is large enough, we can approximate the population standard deviation σ by the sample standard deviation s. But when the sample size is small, we cannot make that approximation, and that is where a t-test comes into the picture.

What is a t-test?

A t-test is a way to test a hypothesis when the sampling distribution of the test statistic follows a t-distribution. When the sample size is small and the population standard deviation is unknown, the standardized sample mean follows a t-distribution (assuming the population is approximately normal). A t-score is given as

t = (\bar{x} - 𝜇) / (𝑠 / \sqrt{n})

where \bar{x} = sample mean, 𝜇 = population mean, 𝑠 = sample standard deviation, 𝑛 = sample size.

t-distribution

A t-distribution closely follows a normal distribution, except that it has heavier (fatter) tails. The degrees of freedom are given as n-1, where n is the sample size. As the sample size, and with it the degrees of freedom, increases, the t-distribution becomes almost identical to a normal distribution.
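To see the tail behaviour numerically, the sketch below evaluates the t-distribution density next to the standard normal density using only Python's standard library. The function names `t_pdf` and `normal_pdf` are illustrative choices, not taken from the article's notebook.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    # Work in log space so that a large df does not overflow the gamma function.
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Compare the densities far out in the tail (x = 3) as df grows.
for df in (2, 5, 30, 1000):
    print(f"df={df:>4}: t={t_pdf(3.0, df):.5f}  normal={normal_pdf(3.0):.5f}")
```

At x = 3 the t density is larger than the normal density for small degrees of freedom, and the gap shrinks as the degrees of freedom grow.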

Hypothesis Testing using a t-test

Let’s recall the process of hypothesis testing, as explained in the following flow diagram.

We can test a null hypothesis using a t-test. For a right-tailed test, the decision is made as per the conditions below. For a two-tailed test, compare |t (test)| with the two-tailed critical value instead.

  • Reject the null hypothesis if t (test) > t (critical), or equivalently if p-value (test) < alpha
  • Fail to reject the null hypothesis if t (test) ≤ t (critical), or equivalently if p-value (test) ≥ alpha

The t-critical value is the t-statistic in a t-table corresponding to the degrees of freedom and the level of significance, and alpha is the level of significance.
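The decision rule can be sketched in Python as follows. This is a minimal standard-library illustration that assumes a right-tailed test and approximates the p-value by numerically integrating the t density with Simpson's rule. The helper names `right_tail_p_value` and `decide` are hypothetical, and in practice a statistics library such as SciPy (`scipy.stats.t.sf`) would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def right_tail_p_value(t_value, df, steps=2000):
    """Approximate P(T > t_value) as 0.5 minus the area between 0 and t_value."""
    if t_value < 0:
        # Use the symmetry of the t-distribution around zero.
        return 1.0 - right_tail_p_value(-t_value, df, steps)
    h = t_value / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t_value, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 0.5 - area)

def decide(t_test, df, alpha=0.05):
    """Right-tailed decision: reject H0 when the p-value is below alpha."""
    p_value = right_tail_p_value(t_test, df)
    return "reject H0" if p_value < alpha else "fail to reject H0"

print(right_tail_p_value(1.711, 24))  # close to 0.05, matching the t-table
print(decide(1.0, 24))
```

Comparing the p-value with alpha and comparing the t-score with the critical value lead to the same decision, so only one of the two checks is needed.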

Types of t-test

There are three types of t-test, as listed below.
  1. One sample t-test
  2. Independent two-sample t-test
  3. Dependent paired – sample t-test

One Sample t-test

In this type of t-test, we test the null hypothesis that the population mean is equal to a specified value 𝜇, based on a sample mean \bar{x}. For a one-sample t-test, the t-score is t = (\bar{x} - 𝜇) / (𝑠 / \sqrt{n}), with n-1 degrees of freedom, where 𝑠 is the sample standard deviation and 𝑛 is the sample size.

One Sample t-test Example

A retail store wants to improve sales. Historical sales data indicate that the average sale was Rs. 100 per transaction. After training the sales force, recent sales data (taken from a sample of 25 salespeople) indicates an average sale of Rs. 130, with a standard deviation of Rs. 15. Did the training work? Assume that the level of significance is 5%.
Step 1. Frame Null and Alternate Hypothesis.
  • H0: Average sales remain Rs. 100 (𝜇 = 100)
  • Ha: Average sales are higher than Rs. 100 (𝜇 > 100)
Step 2. Choose the test statistic. In this problem, the sample size is 25, and the population standard deviation is not known. Hence we will use a t-statistic.

Step 3. Level of significance is given as 0.05

Step 4. Calculate the t-score (test) and p-value (test), as explained for Excel and Python in the next section

Step 5. Calculate the t-score (critical), as explained for Excel and Python in the next section; the critical p-value is simply the level of significance, alpha

Step 6. Decision Making about the null hypothesis
  • Reject a Null Hypothesis,  if t (test) > t(critical) or p-value (test) < alpha
  • Fail to reject a Null Hypothesis, if t (test) < t(critical) or p-value (test) > alpha
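The six steps above can be sketched in Python using the figures from the example. This is a minimal illustration, not the article's notebook: the variable names are arbitrary, and the critical value is read from a standard t-table rather than computed.

```python
import math

# Figures taken from the retail store example.
sample_mean = 130.0      # average sale after training (Rs.)
population_mean = 100.0  # historical average sale under H0 (Rs.)
sample_sd = 15.0         # sample standard deviation (Rs.)
n = 25                   # sample size

standard_error = sample_sd / math.sqrt(n)
t_test = (sample_mean - population_mean) / standard_error
df = n - 1

# One-tailed critical value for alpha = 0.05 and df = 24, from a standard t-table.
T_CRITICAL = 1.711

decision = "reject H0" if t_test > T_CRITICAL else "fail to reject H0"
print(f"t = {t_test:.2f}, df = {df}, decision: {decision}")
```

With these figures the t-score works out to (130 - 100) / (15 / 5) = 10, far above the critical value of about 1.711, so the null hypothesis is rejected at the 5% level.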

Hypothesis Testing using One sample t-test in Excel

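The test t-score is determined using the t-score formula. The critical p-value is the same as the level of significance. The following formula can determine the critical t-score.
  • T.INV(probability, deg_freedom), which returns the left-tailed inverse of the t-distribution. For a right-tailed test at level of significance alpha, use probability = 1 - alpha.
The following formula can determine the p-value at the test score.
  • T.DIST.RT(x, deg_freedom), where x is the t-score, returns the right-tailed p-value. T.DIST(x, deg_freedom, cumulative) returns the left-tailed probability when cumulative is TRUE.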

The following diagram depicts how we can do a One-sample t-test in Excel.

Hypothesis Testing using one-sample t-test in Python

The following notebook explains the way we can do a one-sample t-test in Python.

Independent two-sample t-test

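It tests the null hypothesis that the means of two populations are equal, based on the sample means \bar{x}_1 and \bar{x}_2 of two independent samples. In this type of t-test, the t-score is the difference between the two sample means divided by the standard error of that difference.

When the two populations are assumed to have equal variance, the degrees of freedom are given as n1+n2-2, where n1 and n2 are the sizes of the two samples.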
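As a hedged illustration, the sketch below computes the t-score for the equal-variance (pooled) form of the independent two-sample test, which has n1 + n2 - 2 degrees of freedom. The original formula image is not available, so the pooled form is an assumption here, the function name `pooled_t_test` is hypothetical, and the numbers are made up purely for illustration.

```python
import math
import statistics

def pooled_t_test(sample_1, sample_2):
    """Return (t_score, degrees_of_freedom) assuming equal population variances."""
    n1, n2 = len(sample_1), len(sample_2)
    mean_1, mean_2 = statistics.mean(sample_1), statistics.mean(sample_2)
    # statistics.variance is the sample variance (divides by n - 1).
    var_1, var_2 = statistics.variance(sample_1), statistics.variance(sample_2)
    pooled_var = ((n1 - 1) * var_1 + (n2 - 1) * var_2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean_1 - mean_2) / standard_error, n1 + n2 - 2

# Made-up numbers for illustration only; they are not the article's plant data.
t_score, df = pooled_t_test([2, 4, 6, 8], [1, 2, 3, 4])
print(f"t = {t_score:.3f}, df = {df}")
```

The resulting t-score is then compared with the critical value for the returned degrees of freedom, exactly as in the one-sample case.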

Independent two-sample t-test example

Let’s consider a case where, due to a slowdown in sales, an automobile company wants to shut down one of its two plants in the city. Let’s call those plants A and B. We have data on the number of cars manufactured in each plant for ten days. Which plant is better? From this data, it looks like, on average, Plant A produces 36 more cars than Plant B. Is a difference of 36 cars enough to say that the plants are different?

Step 1. Frame Null and Alternate Hypothesis.

  • H0: The population means of both plants are equal (𝜇1 = 𝜇2)
  • Ha: The population mean of Plant A is higher than that of Plant B (𝜇1 > 𝜇2)

Step 2. Choose the test statistic. In this problem, the sample size is 10 for each plant, and the population standard deviation is not known. Hence we will use a t-statistic.

Step 3. Level of significance is given as 0.05

Step 4. Calculate the t-score (test) and p-value (test), as explained for Excel and Python in the next section

Step 5. Calculate the t-score (critical), as explained for Excel and Python in the next section; the critical p-value is simply the level of significance, alpha

Step 6. Decision Making about the null hypothesis

  • Reject a Null Hypothesis, if t (test) > t (critical) or p-value (test) < alpha
  • Fail to reject a Null Hypothesis, if t (test) < t (critical) or p-value (test) > alpha


Hypothesis Testing using two-sample independent t-test in Excel

The p-value for the difference between the two samples is calculated using this formula.

  • T.TEST(Sample1, Sample2, tails, type)

Here tails is 1 for a one-tailed test or 2 for a two-tailed test, and type is 1 for a paired test, 2 for a two-sample test with equal variance, or 3 for a two-sample test with unequal variance.

The following diagram depicts how we can do a two-sample independent t-test in Excel.
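The unequal-variance option corresponds to what is usually called Welch's t-test. A minimal sketch of that variant is shown below, assuming the standard Welch-Satterthwaite approximation for the degrees of freedom. The function name `welch_t_test` is hypothetical and the numbers are made up for illustration.

```python
import math
import statistics

def welch_t_test(sample_1, sample_2):
    """Return (t_score, degrees_of_freedom) without assuming equal variances."""
    n1, n2 = len(sample_1), len(sample_2)
    # Squared standard error contributed by each sample.
    v1 = statistics.variance(sample_1) / n1
    v2 = statistics.variance(sample_2) / n2
    mean_diff = statistics.mean(sample_1) - statistics.mean(sample_2)
    t_score = mean_diff / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation; usually not a whole number.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_score, df

# Made-up numbers for illustration only; they are not the article's plant data.
t_score, df = welch_t_test([2, 4, 6, 8], [1, 2, 3, 4])
print(f"t = {t_score:.3f}, df = {df:.2f}")
```

When the two samples have different spreads, the Welch degrees of freedom fall below n1 + n2 - 2, which makes the test more conservative than the equal-variance version.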

Hypothesis Testing using two-sample independent t-test in Python

The following notebook explains the way we can do a two-sample independent t-test in Python.

Dependent paired – sample t-test

A paired t-test can be used to compare two population means when the samples are dependent, for example when the same subjects are measured twice. Some examples are:
  • Measurement of test scores of the same set of students before and after introducing peer learning.
  • A comparison of two different measures on the same samples.

Dependent paired – sample t-test Example

A group of 10 people were given ice-creams of two brands, A and B, and were told to rate them out of 10. The data is recorded as in the table. Do the two ice-creams differ?

Step 1. Frame Null and Alternate Hypothesis.

  • H0: The mean ratings of the two ice creams are the same (𝜇1 = 𝜇2)
  • Ha: The mean ratings of the two ice creams are different (𝜇1 ≠ 𝜇2)

Step 2. Choose the test statistic. In this problem, the sample size is 10, and the population standard deviation is not known. Hence we will use a t-statistic.

Step 3. Level of significance is given as 0.05

Step 4. Calculate the t-score (test) and p-value (test), as explained for Excel and Python in the next section

Step 5. Calculate the t-score (critical), as explained for Excel and Python in the next section; the critical p-value is simply the level of significance, alpha

Step 6. Decision Making about the null hypothesis. Because the alternate hypothesis is two-sided, the absolute value of the t-score is compared with the two-tailed critical value.

  • Reject a Null Hypothesis, if |t (test)| > t (critical) or p-value (test) < alpha
  • Fail to reject a Null Hypothesis, if |t (test)| < t (critical) or p-value (test) > alpha
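A paired test reduces to a one-sample t-test on the per-person differences, which the sketch below illustrates. The ratings are made up because the article's table is not available, the function name `paired_t_test` is hypothetical, and the two-tailed critical value for 9 degrees of freedom is read from a standard t-table.

```python
import math
import statistics

def paired_t_test(first, second):
    """Return (t_score, degrees_of_freedom) for paired observations."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)  # sample standard deviation
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1

# Made-up ratings out of 10 from the same 10 people (illustration only).
brand_a = [7, 6, 9, 5, 8, 6, 9, 7, 8, 6]
brand_b = [6, 7, 7, 5, 7, 7, 7, 7, 7, 6]

t_score, df = paired_t_test(brand_a, brand_b)

# Two-tailed critical value for alpha = 0.05 and df = 9, from a standard t-table.
T_CRITICAL_TWO_TAILED = 2.262

decision = ("reject H0" if abs(t_score) > T_CRITICAL_TWO_TAILED
            else "fail to reject H0")
print(f"t = {t_score:.3f}, df = {df}, decision: {decision}")
```

For these made-up ratings the absolute t-score stays below the critical value, so the null hypothesis that the two brands are rated the same is not rejected.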

Hypothesis Testing using Dependent paired – sample t-test in Excel

The p-value for the difference between the two samples is calculated using this formula.
  • T.TEST(Sample1, Sample2, tails, type)
Here tails is 1 for a one-tailed test or 2 for a two-tailed test, and type is 1 for a paired test, 2 for a two-sample test with equal variance, or 3 for a two-sample test with unequal variance. For a paired test, type is 1.
The following diagram depicts how we can do a Dependent paired – sample t-test in Excel.

Hypothesis Testing using Dependent paired – sample t-test in Python

The following notebook explains the way we can do a Dependent paired – sample t-test in Python.

The following video tutorial explains t-test in detail.
