Bayes Primer

Posted on Sat 17 October 2015 in ml • Tagged with tutorial, bayesian

What is Bayes' Theorem?

Bayes' theorem lets us combine a sampling (or likelihood) distribution with a prior distribution to obtain a posterior distribution.
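
Writing $X$ for the data and $\theta$ for the parameters, the theorem can be stated as:

$$
p(\theta \mid X) = \frac{p(X \mid \theta)\, p(\theta)}{p(X)}
$$

Here $p(\theta)$ is the prior, $p(X \mid \theta)$ is the sampling distribution, $p(\theta \mid X)$ is the posterior, and $p(X)$ is a normalizing constant that does not depend on $\theta$.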

What is a Sampling Distribution?

A sampling distribution is the probability of seeing our data ($X$) given our parameters ($\theta$). This is written as $p(X|\theta)$.

For example, we might have data on 1,000 coin flips, where 1 indicates a head. This can be represented in Python as
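
A minimal sketch of such data and its sampling distribution, using only the Python standard library. The fair-coin probability, the seed, and the names `flips` and `log_likelihood` are illustrative assumptions rather than the post's own code, and the flips are treated as independent Bernoulli draws:

```python
import math
import random

# Assumption: a fair coin and a fixed seed, chosen only for illustration.
rng = random.Random(0)
p_heads = 0.5
n_flips = 1000

# 1 indicates a head, 0 indicates a tail.
flips = [1 if rng.random() < p_heads else 0 for _ in range(n_flips)]


def log_likelihood(theta, data):
    """Log of p(X | theta) for independent Bernoulli(theta) flips.

    theta must lie strictly between 0 and 1.
    """
    heads = sum(data)
    tails = len(data) - heads
    return heads * math.log(theta) + tails * math.log(1 - theta)


print(len(flips), sum(flips))
print(log_likelihood(0.5, flips))
```

Working with the log of the likelihood avoids numerical underflow, since the raw product of 1,000 probabilities is far too small to represent as a float.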


Continue reading

Logistic Regression and Optimization

Posted on Wed 29 April 2015 in ml • Tagged with tutorial, logistic-regression

Logistic Regression and Gradient Descent

Logistic regression is an excellent tool for classification problems, in which you try to assign observations to groups. To make our examples more concrete, we will consider the Iris dataset, which contains 4 attributes for 3 types of iris plants. The goal is to identify which plant you have based only on its attributes. To simplify things, we will consider only 2 attributes and 2 classes. Here is what the data look like:
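
The simplified two-attribute, two-class setup described above can be sketched with plain gradient descent on the logistic loss. The data points below are hypothetical stand-ins, not values taken from the Iris dataset, and the names `fit` and `predict_proba`, the learning rate, and the step count are assumptions made for illustration:

```python
import math

# Hypothetical toy data: two attributes per observation, two classes (0 and 1).
# These numbers are made up for illustration and are not real Iris measurements.
X = [(1.4, 0.2), (1.3, 0.2), (1.5, 0.3), (4.7, 1.4), (4.5, 1.5), (4.9, 1.6)]
y = [0, 0, 0, 1, 1, 1]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, row):
    """Estimated probability that `row` belongs to class 1."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


def fit(X, y, learning_rate=0.1, n_steps=2000):
    """Fit weights and bias by batch gradient descent on the mean logistic loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(n_steps):
        # Prediction error for each observation: p_i - y_i.
        errors = [predict_proba(weights, bias, row) - target
                  for row, target in zip(X, y)]
        # Gradient of the mean loss with respect to each weight and the bias.
        grads = [sum(e * row[j] for e, row in zip(errors, X)) / n
                 for j in range(len(weights))]
        weights = [w - learning_rate * g for w, g in zip(weights, grads)]
        bias -= learning_rate * sum(errors) / n
    return weights, bias


weights, bias = fit(X, y)
predictions = [int(predict_proba(weights, bias, row) >= 0.5) for row in X]
print(predictions)
```

Each step moves the parameters against the gradient of the loss, so the fitted line gradually shifts to separate the two groups.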


Continue reading

Bayes With Continuous Prior

Posted on Fri 03 April 2015 in ml • Tagged with bayesian, tutorial

Continuous Prior

In my introduction to Bayes post, I went over a simple application of Bayes' theorem to Bernoulli-distributed data. In this post, I want to extend our example to use a continuous prior.
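
As a rough sketch of what a continuous prior looks like for Bernoulli data, the example below assumes a Beta prior, a common choice because the posterior is then also a Beta distribution. The prior parameters, the simulated flips, and all variable names are assumptions for illustration, and the post itself may use a different prior:

```python
import random

# Assumption: a Beta(2, 2) prior on theta, the probability of a head.
prior_a, prior_b = 2.0, 2.0

# Illustrative data: 1,000 simulated flips of a fair coin (1 = head).
rng = random.Random(0)
flips = [1 if rng.random() < 0.5 else 0 for _ in range(1000)]

heads = sum(flips)
tails = len(flips) - heads

# Conjugacy: a Beta(a, b) prior with a Bernoulli likelihood gives a
# Beta(a + heads, b + tails) posterior.
post_a = prior_a + heads
post_b = prior_b + tails

# The mean of a Beta(a, b) distribution is a / (a + b).
posterior_mean = post_a / (post_a + post_b)
print(post_a, post_b, posterior_mean)
```

With this much data, the posterior mean sits very close to the observed fraction of heads, because the prior contributes the equivalent of only four extra flips.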

In my last post, I ended with this code:

Python For Data Mining

Posted on Sat 17 January 2015 in ml, data-exploration • Tagged with tutorial

Introduction to Python for Data Mining

Python is a great language for data mining. It has many strong libraries for exploring, modeling, and visualizing data. To get started, I would recommend downloading the Anaconda distribution. It comes with most of the libraries you will need and provides an IDE and a package manager.


Continue reading