Logistic regression
I continued watching
Machine Learning Algorithms in 7 Days By Shovon Sengupta
Video link
And this time the subject was Logistic Regression; having caught a cold did not help my studying this week, sob. Here are my notes.

This is the sigmoid function 1/(1+e^-x). As you can see, it goes from 0 to 1, and at x=0 its value is 1/2. This function is used to classify the inputs into two categories, 0 or 1, no or yes. Once the model is trained, it is evaluated on the inputs and the sigmoid is computed: if the result is closer to 0 then category 0 is chosen, otherwise category 1. So logistic regression can be tried whenever you have a binary classification problem, such as: will this customer subscribe to a new contract?
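The sigmoid and the 0/1 thresholding above can be sketched in a few lines of Python (a minimal illustration; the `classify` helper and its 0.5 threshold are my own naming, not from the course):

```python
import math

def sigmoid(x: float) -> float:
    """Logistic (sigmoid) function: maps any real x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def classify(x: float, threshold: float = 0.5) -> int:
    """Category 1 if sigmoid(x) is at least the threshold, else category 0."""
    return 1 if sigmoid(x) >= threshold else 0

print(sigmoid(0.0))    # 0.5, the midpoint at x = 0
print(classify(-3.0))  # 0: sigmoid(-3) is close to 0
print(classify(3.0))   # 1: sigmoid(3) is close to 1
```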
With a trick it is possible to extend it to multiple categories: suppose you want to classify the inputs into 3 categories A, B, and C. You can formulate 3 binary questions: is it A or something else? is it B or something else? And so on. You can then train 3 logistic models, each predicting a y value, and choose the category with the highest y value as the answer. This scheme is called one-vs-rest; a related technique, softmax, normalizes the scores so they sum to 1 and can be read as probabilities.
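The "pick the highest score" step reduces to an argmax over the three binary models' outputs; a tiny sketch (the scores here are made-up numbers, standing in for the three trained models' predictions):

```python
def predict_category(scores: dict) -> str:
    """One-vs-rest decision: return the category whose binary model
    gave the highest score."""
    return max(scores, key=scores.get)

# Hypothetical outputs of the three "is it X or something else?" models.
scores = {"A": 0.20, "B": 0.70, "C": 0.40}
print(predict_category(scores))  # B, since its model scored highest
```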
The actual formulation does not use the sigmoid directly, but a logarithm of probabilities. Let p be the probability of category 1; the logit function is defined as ln[p / (1-p)]. The trained model will obey this equation:

ln[p / (1-p)] = θ0 + θ1·X1 + θ2·X2 + … + θn·Xn
So you can have as many explanatory variables X as you want. The training process identifies the theta coefficients that best fit the data (formally, it maximizes the likelihood of the observed categories, rather than directly minimizing the classification error).
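A minimal sketch of fitting the thetas, assuming scikit-learn (which is what the linked notebook uses) and a tiny made-up dataset with a single explanatory variable:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data (invented for illustration): category 1 becomes
# likely as the single explanatory variable X grows.
X = np.array([[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression().fit(X, y)

# The fitted thetas of the logit equation:
# intercept_ is theta_0, coef_ holds theta_1..theta_n.
print(model.intercept_, model.coef_)
print(model.predict([[0.7], [3.8]]))  # expect categories [0, 1]
```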
As with linear regression, the video continues by presenting a plethora of statistical indexes for judging the trained model: concordance, deviance, Somers' D, the C statistic, divergence, the likelihood ratio, the Wald chi-square… Luckily the sample code can be used to understand practically what to do in a Jupyter Notebook: https://github.com/PacktPublishing/Machine-Learning-Algorithms-in-7-Days/blob/master/Section%201/Code/MLA_7D_LGR_V2.ipynb
The sample contains examples of:
- Identifying categorical variables, and generating dummy variables to be used in the training
- Plotting frequency histograms to explore the data
- Producing boxplots to understand the data distribution
- Using the describe function to get the average, standard deviation, and quantile distribution
- Using Recursive Feature Elimination to select only the most relevant explanatory variables
- Using MinMax scalers to prepare the data, and training the logistic regression model
- Checking for multi-collinearity to further reduce the explanatory variables used
- Plotting the model's ROC (receiver operating characteristic) curve
- Plotting a SHAP diagram to understand which features are the most relevant and how they influence the classifier
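A few of these steps fit together into a short pipeline. This is my own condensed sketch with a made-up dataset (the notebook uses its own data and column names), covering dummy variables, MinMax scaling, training, and the ROC-based evaluation:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import MinMaxScaler

# Tiny invented dataset: one numeric and one categorical variable.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(20, 60, size=100),
    "plan": rng.choice(["basic", "premium"], size=100),
})
# Target correlated with age, so the model has something to learn.
df["subscribed"] = (df["age"] + rng.normal(0, 5, size=100) > 40).astype(int)

# 1. Generate dummy variables for the categorical column.
X = pd.get_dummies(df[["age", "plan"]], drop_first=True)

# 2. MinMax scaling of the explanatory variables.
X_scaled = MinMaxScaler().fit_transform(X)

# 3. Train the logistic regression model.
model = LogisticRegression().fit(X_scaled, df["subscribed"])

# 4. Judge it with the area under the ROC curve (1.0 is perfect,
#    0.5 is no better than random guessing).
auc = roc_auc_score(df["subscribed"], model.predict_proba(X_scaled)[:, 1])
print(f"AUC: {auc:.2f}")
```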