Our previous discussion focused on the basics of hypothesis formulation. As you may recall, a hypothesis is a statement that makes a claim about a population parameter, and our aim is to test that claim using sample data. We touched upon the null hypothesis (H0), which serves as the starting point or baseline, and its opposite, the alternative hypothesis (H1 or Ha), which is what needs to be established using data. Let’s begin the current session by spending more time familiarizing ourselves with these two fundamental ideas.
Rejecting and Failing to Reject the Null Hypothesis
It only makes sense to begin hypothesis testing from established information backed by evidence. This default or status quo assumption is the null hypothesis, which is assumed to be true unless there is reasonably strong evidence to the contrary. The null hypothesis is important because it serves as the baseline for statistical testing.
Let’s say that you’re testing a new drug. In this situation, the null hypothesis might be that the drug has no effect compared to a placebo. By assuming that there is no effect or difference, researchers can objectively determine whether the observed data provides enough evidence to suggest otherwise.
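To make this setup concrete, here is a minimal sketch in Python. The group names, sample sizes, and score scale are all hypothetical, and the data is simulated so that the null hypothesis really is true:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical outcome scores for 30 patients per group. Both groups are
# drawn from the same distribution, so the null hypothesis holds here.
placebo_scores = rng.normal(loc=50, scale=10, size=30)
drug_scores = rng.normal(loc=50, scale=10, size=30)

# A two-sample t-test compares the two group means under H0: no difference.
t_stat, _ = stats.ttest_ind(drug_scores, placebo_scores)
print(f"t-statistic: {t_stat:.3f}")
```

How that t-statistic turns into a reject-or-fail-to-reject decision is exactly what the rest of this session explains.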
When performing hypothesis testing, the goal is to evaluate whether the data collected provides strong enough evidence against the null hypothesis. The decision can be:
- Evidence to the contrary is strong: This means that the data provides strong enough evidence to conclude that there is an effect or difference, implying that the alternative hypothesis is more plausible. In this situation, the decision is to reject the null hypothesis.
- Evidence to the contrary is not strong: This means that the null hypothesis cannot be rejected in favor of the alternative hypothesis; there is not enough evidence to favor the alternative over the null. The decision here is to fail to reject the null hypothesis.
These two outcomes, rejecting the null hypothesis and failing to reject it, are the only possible results. This binary decision exists because hypothesis testing is designed to make objective judgments based on probabilities. Since you’re working with sample data, you can never be certain, but you can make a decision based on how likely your observed data would be if the null hypothesis were true.
The need for this careful phrasing comes from the fact that you’re never “proving” anything in hypothesis testing. Instead, you’re evaluating evidence. The phrase “fail to reject” indicates that the data did not provide enough evidence to contradict the null hypothesis, but it doesn’t prove the null hypothesis is true either.
The Importance of Test Statistics
To decide whether to reject or fail to reject the null hypothesis, researchers rely on test statistics. A test statistic is a numerical value calculated from the sample data; it measures how far the sample deviates from what is expected under the null hypothesis. Common test statistics include the Z-score, t-statistic, and chi-square statistic. The test statistic serves as the basis for the decision rule, a set of criteria used to determine whether to reject or fail to reject the null hypothesis.
The decision rule compares the test statistic to a critical value, which is set by a predefined significance level (𝛼), such as 0.05 or 0.01; a worked sketch follows the list below.
- If the test statistic falls within the critical region (beyond the critical value), you reject the null hypothesis.
- If the test statistic falls outside the critical region, you fail to reject the null hypothesis.
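As a minimal sketch of this decision rule, assume a two-sided z-test on a sample mean with a known population standard deviation; the values of mu_0, sigma, and the sample itself are made up for illustration:

```python
import numpy as np
from scipy import stats

alpha = 0.05   # predefined significance level
mu_0 = 100     # population mean claimed by the null hypothesis
sigma = 15     # population standard deviation (assumed known)

rng = np.random.default_rng(0)
sample = rng.normal(loc=100, scale=15, size=40)  # hypothetical sample data

# Test statistic: how many standard errors the sample mean lies
# from the mean expected under the null hypothesis.
z = (sample.mean() - mu_0) / (sigma / np.sqrt(len(sample)))

# Critical value for a two-sided test at significance level alpha.
z_crit = stats.norm.ppf(1 - alpha / 2)

if abs(z) > z_crit:
    print(f"z = {z:.2f} is in the critical region (|z| > {z_crit:.2f}): reject H0")
else:
    print(f"z = {z:.2f} is not in the critical region: fail to reject H0")
```

Because the sample here is drawn from a population whose mean actually equals mu_0, the statistic will usually land outside the critical region, so the typical output is a fail-to-reject decision.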
Hypothesis testing relies on sampling distributions, which describe how a test statistic varies across repeated random samples taken from the population. Because you’re working with samples and not the entire population, the results of the test are probabilistic in nature: conclusions drawn from hypothesis testing are based on the likelihood of the observed data given the null hypothesis. This means that even when you reject or fail to reject the null hypothesis, there’s a chance that your decision is incorrect, leading to a Type I or Type II error.
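To make the idea of a sampling distribution concrete, the sketch below (reusing the made-up z-test setup from above) draws many samples from a population in which the null hypothesis holds and collects the resulting z-statistics:

```python
import numpy as np

rng = np.random.default_rng(1)
mu_0, sigma, n = 100, 15, 40  # same hypothetical setup as before

# Compute the z-statistic for many independent samples drawn under H0.
z_stats = [
    (rng.normal(mu_0, sigma, n).mean() - mu_0) / (sigma / np.sqrt(n))
    for _ in range(10_000)
]

# Across repeated samples the statistic clusters around 0 with spread
# close to 1, matching the standard normal sampling distribution.
print(f"mean of z ≈ {np.mean(z_stats):.2f}, std of z ≈ {np.std(z_stats):.2f}")
```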
Type I and Type II Errors
Recall that there are two decisions to be made during hypothesis testing: reject the null hypothesis, or fail to reject it. If the null hypothesis is false, you should reject it; if it is true, you should fail to reject it.
A Type I error occurs when the null hypothesis is actually true and you reject it. The probability of a Type I error taking place is denoted as (𝛼), which is the same as the predetermined significance level. This error is also called a false positive, as you conclude that there is an effect or difference when, in reality, there is none.
Example: A company tests a new marketing strategy and rejects the null hypothesis, believing the strategy increased sales. However, in reality, the strategy had no impact on sales, and the observed difference was due to random chance.
The correct decision here is that if the null hypothesis is true, you should fail to reject it. The probability of this event taking place is denoted as (1 – 𝛼).
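A short simulation (again with the hypothetical z-test numbers used earlier) illustrates both probabilities at once: when the null hypothesis is true, it is correctly retained in roughly 1 – 𝛼 of repeated experiments and wrongly rejected in roughly 𝛼 of them:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
alpha, mu_0, sigma, n = 0.05, 100, 15, 40
z_crit = stats.norm.ppf(1 - alpha / 2)

# H0 is true in every trial: all samples come from the null population.
trials = 10_000
rejections = 0
for _ in range(trials):
    z = (rng.normal(mu_0, sigma, n).mean() - mu_0) / (sigma / np.sqrt(n))
    if abs(z) > z_crit:
        rejections += 1  # a Type I error: rejecting a true null hypothesis

print(f"Type I error rate ≈ {rejections / trials:.3f} (expected ≈ {alpha})")
```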
A Type II error happens when the null hypothesis is false and you fail to reject it. The probability of a Type II error taking place is denoted as (β). This is also called a false negative, as you conclude that there is no effect or difference when there actually is one.
Example: A company tests a new product design and fails to reject the null hypothesis, concluding that the new design is no better than the old one. However, in reality, the new design does improve customer satisfaction, but the study did not have enough evidence to detect this improvement.
The correct decision here is that if the null hypothesis is false, you should reject it. The probability of this event taking place is denoted as (1 – β). This is called the Power of the Test.
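The same kind of simulation can illustrate β and the power of the test. Here the true mean is hypothetically set to 105, so the null hypothesis (a mean of 100) is false and every failure to reject is a Type II error:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
alpha, mu_0, sigma, n = 0.05, 100, 15, 40
true_mu = 105  # the null hypothesis is false: the real mean is not mu_0
z_crit = stats.norm.ppf(1 - alpha / 2)

trials = 10_000
misses = 0
for _ in range(trials):
    z = (rng.normal(true_mu, sigma, n).mean() - mu_0) / (sigma / np.sqrt(n))
    if abs(z) <= z_crit:
        misses += 1  # a Type II error: failing to reject a false null hypothesis

beta = misses / trials
print(f"Type II error rate (β) ≈ {beta:.3f}, power (1 – β) ≈ {1 - beta:.3f}")
```

With these particular numbers the test detects the real effect only a little more than half the time, which shows how easily a genuine effect can be missed when the sample is small relative to the effect size.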
In summary:
- A Type I error, or false positive, takes place when you incorrectly conclude that there is an effect or difference. As a consequence, you make a change or take action when it’s not necessary.
- A Type II error, or false negative, happens when you fail to detect a real effect or difference. As a consequence, you miss out on an opportunity for improvement or innovation.
Both errors have important implications in business and decision-making. Minimizing Type I errors reduces the risk of acting on false positives, while minimizing Type II errors ensures that true differences are detected.
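One practical tension is worth noting: for a fixed sample size, the two goals pull against each other, because lowering 𝛼 makes the test more conservative and drives β up. The sketch below (same hypothetical z-test numbers as in the earlier examples) computes the power of the test at several significance levels:

```python
import numpy as np
from scipy import stats

mu_0, true_mu, sigma, n = 100, 105, 15, 40
se = sigma / np.sqrt(n)
shift = (true_mu - mu_0) / se  # distance of the true mean from H0, in standard errors

for alpha in (0.10, 0.05, 0.01):
    z_crit = stats.norm.ppf(1 - alpha / 2)
    # Probability that the z-statistic lands in the critical region
    # when the true mean is true_mu (i.e., the power of the test).
    power = (1 - stats.norm.cdf(z_crit - shift)) + stats.norm.cdf(-z_crit - shift)
    print(f"𝛼 = {alpha:.2f}: power ≈ {power:.3f}, β ≈ {1 - power:.3f}")
```

As 𝛼 shrinks, power drops and β grows, so choosing a significance level is always a trade-off between the two error types. Next session, we’ll talk about the template for hypothesis testing.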