Explore how to perform different t-tests for hypothesis testing with the SciPy library in this 10-video course, which continues your exploration of data science. This beginner-level course assumes prior experience with Python programming, along with an understanding of terms such as skewness and kurtosis and of concepts from inferential statistics, such as t-tests and regression. Begin by learning how to perform three different t-tests (the one-sample t-test, the independent or two-sample t-test, and the paired t-test) on various samples of data using the SciPy library. Next, explore how to interpret the results to decide whether to reject a hypothesis. The course then shows, as an example, how to use the scikit-learn library to fit a regression model on the returns on an individual stock and the returns on the S&P 500 Index. Finally, watch demonstrations of measuring skewness and kurtosis in a data set. The closing exercise asks you to list three different types of t-tests, identify the values that t-tests return, and write code to calculate percentage returns from time series data using Pandas.
Data Science Statistics: Applied Inferential Statistics
Course Overview
test a hypothesis about a sample by comparing it to the general population using the one-sample t-test available in the SciPy library
compare a sample with another independent sample using the independent t-test, and with a related sample using the paired t-test, both available in the SciPy library
apply independent t-tests to a real dataset to test the hypothesis that managers at a firm have higher salaries than non-managerial employees
work with Pandas and Matplotlib to analyze Volkswagen's stock price in 2008, which was affected by some extreme events
compute the skewness and kurtosis of the returns on Volkswagen stock in 2008 and recognize that a few days of extreme behavior increased those numbers
perform pre-processing operations on a dataset containing closing prices for stocks and indices to prepare it for analysis using linear regression
use the scikit-learn library to fit a linear regression model on the returns on a stock and the returns on the S&P 500 index
use two explanatory variables - the returns on the S&P 500 index and on an index tracking the strength of the US Dollar - to perform a regression on the returns on individual stocks
recall different types of t-tests and identify the values they return, calculate percentage returns from time series data using Pandas, and measure the skewness and kurtosis values for a series
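
As a rough preview of the closing exercise, the sketch below runs the three t-tests named above with SciPy, then computes percentage returns, skewness, and kurtosis with Pandas. It is a minimal illustration and not the course's own code: the data is synthetic, and the variable names (before, after, other_group, prices) are made up for this example.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic samples, invented for illustration only (not course data).
rng = np.random.default_rng(seed=0)
before = rng.normal(loc=50.0, scale=5.0, size=30)
after = before + rng.normal(loc=1.0, scale=2.0, size=30)   # related to `before`
other_group = rng.normal(loc=53.0, scale=5.0, size=30)     # independent sample

# One-sample t-test: compare the sample mean with a hypothesized population mean.
one_sample = stats.ttest_1samp(before, popmean=50.0)

# Independent (two-sample) t-test: compare two unrelated samples.
independent = stats.ttest_ind(before, other_group)

# Paired t-test: compare two related samples of equal length.
paired = stats.ttest_rel(before, after)

# Each test returns a t-statistic and a p-value.
for name, result in [("one-sample", one_sample),
                     ("independent", independent),
                     ("paired", paired)]:
    print(f"{name}: t = {result.statistic:.3f}, p = {result.pvalue:.3f}")

# Percentage returns from a (hypothetical) price series.
prices = pd.Series([100.0, 102.0, 99.96, 104.958, 103.0])
returns = prices.pct_change().dropna() * 100  # the first value is NaN, so drop it

# Pandas reports bias-corrected sample skewness and excess kurtosis.
skewness = returns.skew()
excess_kurtosis = returns.kurt()
print(f"skewness = {skewness:.3f}, excess kurtosis = {excess_kurtosis:.3f}")
```

A small p-value suggests rejecting the null hypothesis at the chosen significance level. With real price data, a few extreme daily returns can raise both the skewness and the kurtosis noticeably.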