Introduction to Hypothesis Testing

Let’s take a moment to review some of the ideas that we covered in the previous session. In the last post, we focused on estimation and how it’s used to make inferences about an unknown population parameter. Estimation has two aspects: point estimation and interval estimation. The first, point estimation, uses one value to represent the unknown population parameter, such as how the sample mean (\bar{x}) is used as a point estimate of the population mean (\mu). The other, interval estimation, presents a range of values within which, with some level of confidence, the population parameter lies, like how a range built around the sample mean can be used to estimate the population mean.

We also touched upon the confidence interval, which is the range of values that is expected to cover the true unknown population parameter. The upper and lower limits of the confidence interval are determined by using the distribution of the sample mean (\bar{x}) and a multiplier that specifies confidence. Let’s say you calculated that your customers’ average age is between 30 and 35 with 95% confidence. What does this mean? It means that if the sampling process were repeated a large number of times, and an interval were calculated from each sample in the same way, about 95% of those intervals would contain the population parameter, which is the average customer age in this particular case.
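
To make the repeated-sampling idea concrete, here is a minimal Python simulation sketch. Everything in it is an assumption made for illustration: the population of customer ages is taken to be normal with a true mean of 32.5 and a standard deviation of 8, each sample has 100 customers, and the function name coverage_rate is hypothetical. It builds a 95% interval from each sample and counts how often the interval contains the true mean.

```python
import random
import statistics


def coverage_rate(true_mean=32.5, true_sd=8.0, n=100, trials=2000, seed=0):
    """Share of simulated 95% intervals that contain the true mean.

    All population values here are made up for illustration.
    """
    rng = random.Random(seed)
    # Multiplier for 95% confidence under a normal approximation (about 1.96).
    z = statistics.NormalDist().inv_cdf(0.975)
    hits = 0
    for _ in range(trials):
        # Draw one sample of customer ages from the assumed population.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        s = statistics.stdev(sample)
        half_width = z * s / n ** 0.5
        # Check whether this sample's interval covers the true mean.
        if x_bar - half_width <= true_mean <= x_bar + half_width:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"Share of intervals covering the true mean: {coverage_rate():.3f}")
```

The share should land close to 0.95. Any single interval either contains the true mean or it doesn't; the 95% describes how often the procedure succeeds across many samples.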

What Is Hypothesis Testing?

Hypothesis testing is a set of techniques that builds upon the basic ideas of descriptive statistics, probability, and inferential ideas regarding distributions. It is used in statistics to determine whether there is enough evidence in a sample of data to infer that a certain condition holds for the entire population. In short, hypothesis testing provides a systematic way to test claims or ideas about a population based on sample data. This will be the focus of the second portion of the course.

Applying Hypothesis Testing

In business settings, hypothesis testing can be used to determine the validity of claims and assumptions, compare strategies, and measure the effectiveness of actions and business decisions. Let’s take a look at how it can be applied in a realistic scenario and delve into the details later on.

Consider a company that wants to know if the new manufacturing process it has adopted has improved the reliability of its products. Historically, 70% of the products made using the old process passed the reliability test. Now that the company has switched over to the new process, we have a sample of 100 products, 73 of which passed the same test. Does this mean that the adoption of a new manufacturing process has improved the reliability of the company’s products?

The historical pass rate in this situation is 70%, and 73 out of 100 is clearly more than that. However, what are the chances of getting 73 or more reliable items out of 100 products if nothing has actually changed? If the true pass rate is still 70%, then there’s roughly a 30% probability that you’ll see 73 or more reliable products in a sample of 100 by pure luck. This means that even if there is no improvement in the manufacturing process, there’s a substantial chance that you’ll see numbers like these. In other words, there’s no strong evidence that the new process improves reliability. Yes, 73 is more than 70, but a result at least this good would turn up about 30% of the time from a process whose pass rate remains 70%.

Let’s say now that the new process yielded 81 reliable products from a sample of 100 items. In this situation, the chance of seeing 81 or more reliable products from a sample of 100 items, if the pass rate were still 70%, is about 1%. This means there’s only about a 1% chance that, out of sheer luck, you’d get 81 or more reliable products from 100 items made with a process no better than the old one. This strongly suggests that the manufacturing process has improved.
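
The two probabilities quoted above can be checked with a short Python sketch. It assumes each product passes or fails independently with the same 70% pass rate, so the number of reliable products in a sample of 100 follows a binomial distribution. The function name binomial_tail is hypothetical, chosen for illustration.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) when X counts successes in n independent trials,
    each succeeding with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Chance of 73 or more, and of 81 or more, reliable products out of 100
    # if the true pass rate is still the historical 70%.
    print(f"P(73 or more reliable) = {binomial_tail(73, 100, 0.70):.3f}")
    print(f"P(81 or more reliable) = {binomial_tail(81, 100, 0.70):.3f}")
```

The first probability should come out near 30% and the second near 1%, matching the figures in the scenario. Summing the exact binomial terms avoids any normal approximation, and it relies only on Python's standard library.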

What’s the Difference Between Estimation and Hypothesis Testing?

It can be tempting to lump estimation and hypothesis testing together. Both techniques are crucial in inferential statistics and offer complementary approaches to understanding and making decisions based on data. However, there’s a significant difference between the two.

Estimation is typically used when there is no prior claim or assumption about the population parameter of interest. In such a situation, a random sample is taken, a sample statistic is computed, and an appropriate point or interval estimate is reported. The point of estimation is to arrive at a numerical value. This is not the case with hypothesis testing. Rather than arriving at an estimate of a population parameter, hypothesis testing aims to determine the plausibility of a hypothesis about the population parameter by using sample data. It asks a binary question: Is the claim supported by the data, or is it not?

To summarize:

  • Estimation focuses on determining the approximate value of a population parameter by providing either a single value (point estimate) or a range of values (confidence interval).
  • Hypothesis testing focuses on making decisions about a population parameter by testing specific hypotheses and determining statistical significance.

Sample Business Situations Where Hypothesis Testing Can Be Used

Here are sample business scenarios where hypothesis testing can prove to be quite useful.

  • Marketing Campaign Effectiveness – Hypothesis testing can be used to determine if a new marketing campaign has significantly increased sales or customer engagement. For instance, a retail company launches a new online advertising campaign and wants to know if it has led to an increase in website traffic.
  • Product Development and Quality Control – As used in the example above, hypothesis testing can ensure that new products or production processes meet the company’s desired standards. It can be used by a manufacturing firm to test if a new production process reduces the number of defective products.
  • Customer Satisfaction and Feedback – Hypothesis testing can also be applied to assess whether changes in service or product features have improved customer satisfaction. Companies that are implementing a new customer service training program can use hypothesis testing to determine whether the program has improved customer satisfaction scores.
  • Price Optimization – The impact of pricing changes on sales volume and profitability can also be assessed using hypothesis testing. An e-commerce company can use this strategy to test the effect of a price reduction on sales volume.

In summary, the objective of hypothesis testing is to posit a value for a population parameter and perform a statistical test to see whether that value is tenable in light of the evidence gathered from the sample. In the next sessions, we’ll take a closer look at how hypothesis testing is conducted.

About Glen Dimaandal

Glen Dimaandal is a data scientist from the Philippines. He has a post-graduate degree in Data Science and Business Analytics from the prestigious McCombs School of Business in the University of Texas, Austin. He has nearly 20 years of experience in the field as he worked with major brands from the US, UK, Australia and the Asia-Pacific. Glen is also the CEO of SearchWorks.PH, the Philippines' most respected SEO agency.