Logistic regression
I continued watching
Machine Learning Algorithms in 7 Days By Shovon Sengupta
Video link
And this time the subject was Logistic Regression; having caught a cold did not help my studying this week, sob. Here are my notes.

This is the sigmoid function 1/(1+e^-x). As you can see, it goes from 0 to 1, and at x=0 its value is 1/2. This function is used to classify the inputs into two categories, 0 or 1, no or yes. Once the model is trained, it is evaluated on the inputs and the sigmoid is computed: if the result is closer to 0 then category 0 is chosen, otherwise category 1. So logistic regression can be tried whenever you have a binary classification problem, such as: will this customer subscribe to a new contract?
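The sigmoid and the 0/1 thresholding above can be sketched in a few lines of Python (a minimal illustration; the `classify` helper and its 0.5 threshold are my own naming, not from the course):

```python
import math

def sigmoid(x: float) -> float:
    """Logistic (sigmoid) function: maps any real x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def classify(x: float, threshold: float = 0.5) -> int:
    """Category 1 if sigmoid(x) is at least the threshold, else category 0."""
    return 1 if sigmoid(x) >= threshold else 0

print(sigmoid(0.0))    # 0.5, the midpoint at x = 0
print(classify(-3.0))  # 0: sigmoid(-3) is close to 0
print(classify(3.0))   # 1: sigmoid(3) is close to 1
```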
With a trick it is possible to extend it to multiple categories: suppose you want to classify the inputs into 3 categories A, B, and C. You can formulate 3 binary questions: is it A or something else? is it B or something else? And so on. You can then train 3 logistic models, each predicting a y value, and choose the category with the highest y value as the answer. This scheme is called one-vs-rest; a related technique, softmax, normalizes the scores so they sum to 1 and can be read as probabilities.
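The "pick the highest score" step reduces to an argmax over the three binary models' outputs; a tiny sketch (the scores here are made-up numbers, standing in for the three trained models' predictions):

```python
def predict_category(scores: dict) -> str:
    """One-vs-rest decision: return the category whose binary model
    gave the highest score."""
    return max(scores, key=scores.get)

# Hypothetical outputs of the three "is it X or something else?" models.
scores = {"A": 0.20, "B": 0.70, "C": 0.40}
print(predict_category(scores))  # B, since its model scored highest
```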
The actual formulation does not use the sigmoid directly, but a logarithm of probabilities. Let p be the probability of category 1; the logit function is defined as ln[p / (1-p)]. The trained model will obey this equation:

ln[p / (1-p)] = θ0 + θ1·X1 + θ2·X2 + … + θn·Xn
So you can have as many explanatory variables X as you want. The training process identifies the theta coefficients that best fit the data (formally, it maximizes the likelihood of the observed categories, rather than directly minimizing the classification error).
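A minimal sketch of fitting the thetas, assuming scikit-learn (which is what the linked notebook uses) and a tiny made-up dataset with a single explanatory variable:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data (invented for illustration): category 1 becomes
# likely as the single explanatory variable X grows.
X = np.array([[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression().fit(X, y)

# The fitted thetas of the logit equation:
# intercept_ is theta_0, coef_ holds theta_1..theta_n.
print(model.intercept_, model.coef_)
print(model.predict([[0.7], [3.8]]))  # expect categories [0, 1]
```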
As with linear regression, the video continues by presenting a plethora of statistical indexes for judging the trained model: concordance, deviance, Somers' D, the C statistic, divergence, the likelihood ratio, the Wald chi-square… Luckily the sample code can be used to understand practically what to do in a Jupyter Notebook: https://github.com/PacktPublishing/Machine-Learning-Algorithms-in-7-Days/blob/master/Section%201/Code/MLA_7D_LGR_V2.ipynb
The sample contains examples of:
- Identifying categorical variables, and generating dummy variables to be used in the training
- Plotting frequency histograms to explore the data
- Producing boxplots to understand the data distribution
- Using the describe function to get the average, standard deviation, and quantile distribution
- Using Recursive Feature Elimination to select only the most relevant explanatory variables
- Using MinMax scalers to prepare the data, and training the logistic regression model
- Checking for multi-collinearity to further reduce the explanatory variables used
- Plotting the model's ROC (receiver operating characteristic) curve
- Plotting a SHAP diagram to understand which features are the most relevant and how they influence the classifier
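A few of these steps fit together into a short pipeline. This is my own condensed sketch with a made-up dataset (the notebook uses its own data and column names), covering dummy variables, MinMax scaling, training, and the ROC-based evaluation:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import MinMaxScaler

# Tiny invented dataset: one numeric and one categorical variable.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(20, 60, size=100),
    "plan": rng.choice(["basic", "premium"], size=100),
})
# Target correlated with age, so the model has something to learn.
df["subscribed"] = (df["age"] + rng.normal(0, 5, size=100) > 40).astype(int)

# 1. Generate dummy variables for the categorical column.
X = pd.get_dummies(df[["age", "plan"]], drop_first=True)

# 2. MinMax scaling of the explanatory variables.
X_scaled = MinMaxScaler().fit_transform(X)

# 3. Train the logistic regression model.
model = LogisticRegression().fit(X_scaled, df["subscribed"])

# 4. Judge it with the area under the ROC curve (1.0 is perfect,
#    0.5 is no better than random guessing).
auc = roc_auc_score(df["subscribed"], model.predict_proba(X_scaled)[:, 1])
print(f"AUC: {auc:.2f}")
```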