Logistic Regression — Explained and Implemented

Mohan Dogra
3 min read · Dec 3, 2020

In this article, we’re going to learn about Logistic Regression (LR). The article is divided into two parts:

  1. Brief Explanation of Logistic Regression
  2. Implementation of Logistic Regression

For those who are new to Logistic Regression, I’d suggest first going through a more detailed article: Logistic Regression Explained

Let’s get started.

1. Brief Explanation of Logistic Regression

Despite having the word ‘regression’ in its name, Logistic Regression is a binary classification algorithm. It is called ‘Logistic Regression’ because, like Linear Regression, it fits a linear combination of the input features. The term “Logistic” comes from the logistic (sigmoid) function used in this method of classification; the Logit function is its inverse.

It is a binary classification technique, so the output variable is dichotomous in nature, i.e., it takes one of two mutually exclusive values such as (0 or 1) or (yes or no).

This technique has applications such as spam filtering, checking for the presence of a disease, etc.

Logistic regression performs well only when the target label actually depends on the features. That is, say
[x1, x2, x3 …… xn] are the features of our data, and
[y] is the target label of the data, then
features [x1, x2, x3 …… xn] that the target label [y] strongly depends on will perform much better than features [x1, x2, x3 …… xn] that have little relation to the target label [y].

In layman’s terms: predicting the presence of diabetes from relevant features such as body temperature, sugar level, and age of the patient would give more accurate results than predicting it from the patient’s age, location, and gender.

Traditionally, logistic regression is used for binary classification, but it can also be extended to multi-class classification.

Multinomial classification: It is used for multiclass/categorical classification, which is a non-traditional/special case of logistic regression. This method uses the log of odds as the dependent variable.
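To make the multiclass case more concrete, here is a minimal sketch of the softmax function, which is one common way to turn one linear score per class into class probabilities. The article does not name softmax, so treat it as an assumption; the scores below are made-up illustrative values.

```python
import math

def softmax(scores):
    """Turn one linear score per class into probabilities that sum to 1."""
    # Subtracting the maximum avoids overflow in exp() without changing the result.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical linear scores for three classes (illustrative values only).
scores = [2.0, 1.0, 0.1]
probs = softmax(scores)

# The predicted class is the one with the highest probability.
predicted_class = probs.index(max(probs))
print(probs, predicted_class)
```

With only two classes and the second score fixed at 0, this reduces to the sigmoid used in the binary case.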

LR models the logit (log of odds) of a binary event as a linear function of the features, and from it predicts the probability that the event occurs.

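It uses the sigmoid function, whose output always lies in the range (0, 1):

$$p = \frac{1}{1 + e^{-y}}$$

where (y) is the linear score (not the target label) and has a value of:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$$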
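As a small illustration of how the sigmoid and the logit relate, the sketch below computes a probability from a linear score and then recovers the score with the logit. The coefficients and feature values are hypothetical and chosen only for the example.

```python
import math

def sigmoid(y):
    """Map any real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-y))

def logit(p):
    """Log of odds; the inverse of the sigmoid for 0 < p < 1."""
    return math.log(p / (1.0 - p))

# Hypothetical intercept and coefficients for two features (not from the article).
b0 = -0.2
coefficients = [0.8, -0.4]
features = [1.5, 2.0]

# Linear score: intercept plus the weighted sum of the features.
y = b0 + sum(b * x for b, x in zip(coefficients, features))

# Probability of the positive class.
p = sigmoid(y)

print(f"score={y:.3f} probability={p:.3f} logit(probability)={logit(p):.3f}")
```

A score of 0 maps to a probability of exactly 0.5, which is why 0.5 is the usual cut-off between the two classes.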

Linear Regression Vs Logistic Regression

Linear regression is a regression technique and hence gives continuous output, i.e., the predicted target values are not restricted to two classes. It is therefore a better choice for applications such as house price prediction, stock price prediction, etc.

Logistic regression, being a classification technique, gives discrete output such as 0 or 1 to classify the input into one of the target labels. Therefore, LR is a better option for applications such as spam mail detection or cancer detection.
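The difference in output can be sketched in a few lines: a linear model would report the raw score, whereas logistic regression squashes the score into a probability and then cuts it at a threshold. The scores and the 0.5 threshold below are assumptions made for illustration.

```python
import math

def to_label(score, threshold=0.5):
    """Convert a continuous linear score into a discrete 0/1 label."""
    probability = 1.0 / (1.0 + math.exp(-score))
    return 1 if probability >= threshold else 0

# Hypothetical continuous scores, as a linear model would output them.
scores = [-3.0, -0.5, 0.0, 0.5, 3.0]

# Discrete labels, as a logistic regression classifier would output them.
labels = [to_label(s) for s in scores]
print(labels)
```

Raising the threshold makes the classifier more conservative about predicting the positive class, which can matter in settings such as disease detection.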

Advantages
- Low computational cost, widely used

Disadvantages
- In its basic form it handles only two classes, and it can’t solve non-linear problems because its decision boundary is linear.
- It will not perform well when the features have little relation to the target.

2. Implementation

Here we implement Logistic Regression for diabetes detection, using features such as age, body mass, skinfold thickness, blood pressure, etc., to predict whether the person has diabetes (1) or not (0).
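A minimal sketch of such a pipeline with scikit-learn could look like the following. The diabetes data itself is not reproduced in this text, so the example assumes a synthetic stand-in generated with `make_classification` (8 numeric features, binary label); the split ratio, scaling step, and `max_iter` value are also assumptions rather than the original settings.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the diabetes data (assumption): 8 features, labels 0/1.
X, y = make_classification(
    n_samples=500,
    n_features=8,
    n_informative=5,
    n_redundant=1,
    n_clusters_per_class=1,
    random_state=0,
)

# Hold out a quarter of the samples for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale the features, then fit the logistic regression classifier.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Discrete predictions (0 or 1) and the probability of class 1.
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]

accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.3f}")
```

To run this on real data, replace the `make_classification` call with code that loads the feature columns into `X` and the diabetes label into `y`.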

Conclusion

Logistic Regression is one of the most used binary classification techniques. In this article, we briefly learned about logistic regression, its applications, and its advantages and disadvantages. Finally, we implemented LR using ‘scikit-learn’ to detect whether a person has diabetes or not.
