Logistic Regression, Mathematically Explained

AI, But Simple Issue #33

Hello from the AI, but simple team! If you enjoy our content, consider supporting us so we can keep doing what we do.

Our newsletter is no longer sustainable to run at no cost, so we’re relying on different measures to cover operational expenses. Thanks again for reading!

Imagine you want to predict whether a student passes (or fails) an exam based on how many hours they studied. Your outcome is categorical—either 1 (pass) or 0 (fail).

We need a function that always stays between 0 and 1 to represent a valid probability. This is where the logistic function (or the sigmoid function) is used.

The logistic (or sigmoid) function takes any real number and “squeezes” it into the open interval (0, 1). It is defined as:

σ(z) = 1 / (1 + e^(−z))

Its graph is an S-shaped curve: it rises smoothly from near 0 on the far left, passes through 0.5 at z = 0, and flattens out near 1 on the far right.

Its end behaviors are the following:

  • As z → +∞, σ(z) → 1.

  • As z → −∞, σ(z) → 0.

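As a quick sketch, the definition and both end behaviors can be checked numerically (a minimal Python sketch using only the standard library):

```python
import math

def sigmoid(z: float) -> float:
    """Logistic (sigmoid) function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Midpoint: an input of 0 gives exactly 0.5
mid = sigmoid(0.0)

# End behavior: large positive z approaches 1, large negative z approaches 0
near_one = sigmoid(10.0)
near_zero = sigmoid(-10.0)
```

However large the input, the output never actually reaches 0 or 1, which is what makes it usable as a probability.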
In machine learning applications, this function is the heart of a very popular binary classifier—logistic regression.

In logistic regression, we model the probability of the positive or true class (y=1) with the logistic or sigmoid function.

  • For instance, an input of 0 (z=0) will result in a 0.5 probability from σ(z).

We can then convert this probability into a classification using a threshold value, which we’ll discuss more later.

It is recommended to have some mathematical background in multivariable calculus, probability, and mathematical notation; without it, the material may be more difficult to follow.

We want this model to be able to learn from data that we give it, allowing it to predict probabilities that will accurately classify samples. In other words, the model needs parameters that it can learn (e.g., β0, β1).

To do this, we wrap a linear combination of inputs (β0 + β1x1 + … + βnxn) inside the logistic function.
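Putting the pieces together, the model computes σ(β0 + β1x1 + … + βnxn). A minimal sketch for the hours-studied example (the coefficients below are made up for illustration, not fitted to real data):

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x: list, betas: list) -> float:
    """P(y=1 | x) = sigmoid(beta0 + beta1*x1 + ... + betan*xn).

    betas[0] is the intercept; betas[1:] pair with the features in x.
    """
    z = betas[0] + sum(b * xi for b, xi in zip(betas[1:], x))
    return sigmoid(z)

# Hypothetical parameters: intercept -4.0, +1.2 per hour studied (made up)
betas = [-4.0, 1.2]

# A student who studied 5 hours: z = -4.0 + 1.2*5 = 2.0, so P(pass) ~ 0.88
p_pass = predict_proba([5.0], betas)
```

In practice these β values are learned from data, typically by maximizing the likelihood of the observed labels (equivalently, minimizing the log loss).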
