# Inferential Statistics with Python
## A Talk Proposal for SciPy India 2017 (Mumbai)

This repository collects the Jupyter notebooks containing the code for the proposed talk on Inferential Statistics with Python at the SciPy India Conference in November 2017.

### Notebooks

The notebooks are as follows:

1. **descriptive_primer.ipynb:** The Descriptive Statistics Notebook. Explains measures of central tendency, measures of spread, the Binomial Distribution, the Normal Distribution, the normality test, Z-Scores and P-Values.
2. **sampling.ipynb:** The Sampling Notebook. Explains the Central Limit Theorem, estimation of a proportion from a sample, and estimation of a mean from a sample.
3. **hypothesis.ipynb:** The Hypothesis Testing Notebook. Explains one-sample and two-sample significance tests, tests for mean(s), tests for proportion(s) and the Chi-Square Significance Test.
4. **correlation.ipynb:** The Correlation, Scatter Plot and Linear Regression Notebook. Explains correlation, scatter plots, heatmaps, pair plots and linear regression with the scikit-learn implementation of a Linear Regressor.
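
To give a flavour of the material, here is a minimal sketch of a one-sample test for a proportion, one of the topics listed for the hypothesis testing notebook. It is not taken from the notebooks: the function name `one_sample_proportion_ztest` and the numbers (60 successes in 100 trials against a hypothesised proportion of 0.5) are made up for illustration, and it uses only the Python standard library so that it runs without any extra dependencies.

```python
from math import sqrt
from statistics import NormalDist  # available in Python 3.8+


def one_sample_proportion_ztest(successes, n, p0):
    """Two-sided z-test for H0: the true proportion equals p0.

    Hypothetical helper for illustration; relies on the normal
    approximation, so n should be reasonably large.
    """
    p_hat = successes / n                      # observed sample proportion
    standard_error = sqrt(p0 * (1 - p0) / n)   # standard error under H0
    z = (p_hat - p0) / standard_error          # z-score of the observation
    # Two-sided p-value: probability of a |z| at least this large under H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Illustrative (made-up) data: 60 successes out of 100 trials, H0: p = 0.5.
z, p_value = one_sample_proportion_ztest(successes=60, n=100, p0=0.5)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

With these made-up numbers the z-score is 2 and the p-value falls just below 0.05, so the null hypothesis would be rejected at the 5% significance level but not at the 1% level.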

### Datasets

The following datasets have been used:

1. **2016 Olympics in Rio de Janeiro:** Athletes, medals, and events from summer games. Uploaded by Rio 2016 on Kaggle. Available at https://www.kaggle.com/rio2016/olympic-games#_=_
2. **Credit Card Fraud Detection:** Anonymized credit card transactions labeled as fraudulent or genuine. Uploaded by Andrea on Kaggle. Available at https://www.kaggle.com/dalpozz/creditcardfraud
3. **Suicides in India:** Suicides in each state, classified according to various parameters, from 2001 to 2012. Uploaded by Rajanand Illangovan on Kaggle. Available at https://www.kaggle.com/rajanand/suicides-in-india
4. **NBA Players Stats - 2014-2015:** Points, Assists, Height, Weight and other personal details and stats. Uploaded by DrGuillermo on Kaggle. Available at https://www.kaggle.com/drgilermo/nba-players-stats-20142015
5. **Top 500 Indian Cities:** What story do the top 500 cities of India tell to the world? Uploaded by Arijit Mukherjee on Kaggle. Available at https://www.kaggle.com/zed9941/top-500-indian-cities
6. **Airbnb New User Bookings:** Where will a new guest book their first travel experience? Uploaded by Airbnb on Kaggle. Available at https://www.kaggle.com/c/airbnb-recruiting-new-user-bookings/data
7. **Racial Discrimination in the Job Market:** Are Emily and Greg More Employable Than Lakisha and Jamal? Uploaded by the American Economic Association. Available at https://www.aeaweb.org/articles?id=10.1257/0002828042002561
8. **The Iris Dataset:** Classify iris plants into three species in this classic dataset. Available in the scikit-learn library.
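
Since the Iris dataset ships with scikit-learn, it can be loaded without a download. The sketch below is an illustration rather than code from the notebooks: it loads the data with `load_iris` and fits scikit-learn's `LinearRegression` to predict petal width from petal length. The choice of these two columns is an assumption made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression

# Load the bundled dataset: 150 flowers, 4 numeric features, 3 species.
iris = load_iris()
print(iris.data.shape)
print(iris.feature_names)

# Column order in iris.data follows iris.feature_names:
# sepal length, sepal width, petal length, petal width (all in cm).
petal_length = iris.data[:, 2].reshape(-1, 1)  # 2-D feature matrix
petal_width = iris.data[:, 3]                  # 1-D target vector

# Fit a simple linear regression: petal width as a function of petal length.
model = LinearRegression()
model.fit(petal_length, petal_width)

# score() returns the coefficient of determination (R^2) on the given data.
r_squared = model.score(petal_length, petal_width)
print(f"slope = {model.coef_[0]:.3f}, intercept = {model.intercept_:.3f}")
print(f"R^2 = {r_squared:.3f}")
```

Because the model is scored on the same data it was fitted to, the R² here describes goodness of fit only, not predictive performance on unseen flowers.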