SEPSIS PREDICTION AND API CREATION WITH FASTAPI

Dec 9, 2023

INTRODUCTION

After training, evaluating, and testing your machine learning model, you may want to expose it through an API so that other applications, whether web or mobile, can use it.
For this, we will use FastAPI to create an API for our machine learning model. FastAPI is a modern web framework for building RESTful APIs in Python.
Sepsis is a medical term for any “generalized inflammatory response associated with a serious infection.” This potentially deadly condition occurs when the host’s response to infection, a severe systemic inflammation, damages the body’s own tissues and organs. It can be accompanied by a cytokine storm.

GOAL

1. Build a machine learning model to predict sepsis
2. Build an API with FastAPI to serve the trained model

HYPOTHESIS

Null Hypothesis (H0): There is no significant relationship between sepsis and PRG (plasma glucose).

Alternative Hypothesis (Ha): There is a significant relationship between sepsis and PRG (plasma glucose).

DATA LOADING

Import Packages

Load the Patiens_Files_Train

Let’s check for missing values
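The original screenshots for these steps are not shown here, so the following is only a minimal sketch of loading the data and checking it. The inline rows are hypothetical stand-ins for the training file (in the notebook you would pass the real CSV path to pd.read_csv), and the column subset is an assumption based on the attributes discussed later in the article.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for the training CSV.
SAMPLE_CSV = """ID,PRG,PL,SK,TS,BD2,Sepssis
P001,6,148,35,0,0.627,Positive
P002,1,85,29,0,0.351,Negative
P003,8,183,0,0,0.672,Positive
P004,1,89,23,94,0.167,Negative
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Number of missing values in each column.
missing_per_column = df.isnull().sum()
# Number of rows that are exact duplicates of an earlier row.
duplicate_rows = int(df.duplicated().sum())

print(missing_per_column)
print("Duplicated rows:", duplicate_rows)
```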

Our dataset is clean: it has no missing values and no duplicated rows.

Hypothesis Testing

For this, let’s use the point-biserial correlation coefficient.

The point-biserial correlation coefficient is a measure of the strength and direction of association between a binary variable (in this case, the presence or absence of sepsis, coded as 0 or 1) and a continuous variable (Plasma glucose, PRG). The coefficient ranges from -1 to 1, where -1 indicates a perfect negative correlation, 1 indicates a perfect positive correlation, and 0 indicates no correlation.
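As a rough illustration of the test described above, here is a sketch using scipy.stats.pointbiserialr. The two arrays are toy values, not the article's data, and the 0.05 significance level is an assumption.

```python
import numpy as np
from scipy import stats

# Toy data: binary sepsis flag (0 = negative, 1 = positive) and a continuous PRG value.
sepsis = np.array([0, 0, 0, 0, 1, 1, 1, 1])
prg = np.array([1, 2, 1, 3, 5, 6, 4, 8])

# Returns the correlation coefficient and the two-sided p-value.
r, p_value = stats.pointbiserialr(sepsis, prg)

alpha = 0.05  # assumed significance level
reject_h0 = p_value < alpha
print(f"r = {r:.3f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

If the p-value falls below the chosen level, we reject H0 and conclude that PRG and sepsis are related.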

Univariate Analysis

Our data contains more negative than positive sepsis cases, so the classes in the target variable are imbalanced. They need to be balanced before model training.

Bivariate Analysis

There are more cases of sepsis among people with higher Plasma glucose (PRG).

Among patients with a PRG below 4.0, negative sepsis cases outnumber positive ones.

Multivariate Analysis

Create a correlation matrix or heatmap to identify relationships between different blood work attributes. Additionally, use box plots or violin plots to compare the distribution of each blood work result for patients with and without sepsis.
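A minimal sketch of the correlation step follows. The rows are hypothetical, the column names follow the attributes listed in this section, and the target is assumed to be already coded as 0/1.

```python
import pandas as pd

# Hypothetical rows; the real analysis would use the full training frame.
df = pd.DataFrame({
    "PL": [148, 85, 183, 89, 137, 116],
    "SK": [35, 29, 0, 23, 35, 0],
    "TS": [0, 0, 0, 94, 168, 0],
    "BD2": [0.627, 0.351, 0.672, 0.167, 2.288, 0.201],
    "Sepsis": [1, 0, 1, 0, 1, 0],
})

# Pearson correlation between every pair of columns.
corr_matrix = df.corr()
# Correlation of each blood work attribute with the target, strongest first.
target_corr = corr_matrix["Sepsis"].drop("Sepsis").sort_values(ascending=False)
print(target_corr)
```

The same corr_matrix can be passed to a plotting function such as seaborn.heatmap to draw the heatmap.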

Pl (Attribute 2 Blood Work Result-1) = 0.4497 (moderate positive correlation)

Higher attribute 2 levels may be associated with a higher likelihood of sepsis.

SK (Attribute 4 Blood Work Result-2) = 0.0756 (very weak positive correlation)

The correlation is close to zero, indicating a minimal linear relationship.

TS (Attribute 5 Blood Work Result-3) = 0.1459 (weak positive correlation)

The correlation suggests a slight tendency for higher Blood Work Result-3 to be associated with a higher likelihood of sepsis.

BD2 (Attribute 7 Blood Work Result-4) = 0.1816 (weak positive correlation)

The correlation suggests a slight tendency for higher Blood Work Result-4 to be associated with a higher likelihood of sepsis.

Data Preparation

The target column is misspelled as Sepssis in the dataset, so we rename it to Sepsis.
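With pandas, the rename could look like this (the two-row frame is only a toy stand-in for the real data):

```python
import pandas as pd

# Toy frame with the misspelled target column.
df = pd.DataFrame({"PRG": [6, 1], "Sepssis": ["Positive", "Negative"]})

# rename returns a new frame; only the listed column name changes.
df = df.rename(columns={"Sepssis": "Sepsis"})
print(df.columns.tolist())
```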

Outlier detection

Outliers may not be representative of the underlying patterns in the data. Outliers can disproportionately influence the parameters of a model during the training process. ML models often aim to minimize errors or maximize likelihood, and outliers may introduce large errors that the model will try to minimize, potentially skewing the model’s parameters.

Now we apply the remove_outliers function to our numerical columns.
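The body of remove_outliers is not shown in this article, so the version below is an assumption: a common IQR-based rule that drops rows falling outside 1.5 times the interquartile range. The signature and the toy data are illustrative only.

```python
import pandas as pd


def remove_outliers(df: pd.DataFrame, columns: list, k: float = 1.5) -> pd.DataFrame:
    """Drop rows where any listed column lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    keep = pd.Series(True, index=df.index)
    for col in columns:
        q1 = df[col].quantile(0.25)
        q3 = df[col].quantile(0.75)
        iqr = q3 - q1
        # A row is kept only if it is inside the bounds for every column.
        keep &= df[col].between(q1 - k * iqr, q3 + k * iqr)
    return df[keep]


# Toy data: the value 400 in PL is far outside the rest of the column.
df = pd.DataFrame({"PL": [85, 89, 90, 95, 100, 400], "TS": [0, 10, 20, 30, 40, 50]})
cleaned = remove_outliers(df, ["PL", "TS"])
print(cleaned)
```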

X and Y split

Create Pipeline

After preparing the data, we create a preprocessing pipeline using ColumnTransformer.
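The exact steps of the pipeline are not listed here, so this sketch assumes a typical numeric setup: median imputation followed by standard scaling. The column names and sample rows are assumptions.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

numeric_columns = ["PRG", "PL", "SK", "TS", "BD2"]  # assumed feature names

# Steps applied to every numeric column (an assumed, typical choice).
numeric_steps = Pipeline([
    ("imputer", SimpleImputer(strategy="median")),
    ("scaler", StandardScaler()),
])

preprocessor = ColumnTransformer([("num", numeric_steps, numeric_columns)])

# Hypothetical rows to show the transformer in action.
X = pd.DataFrame({
    "PRG": [6, 1, 8, 1],
    "PL": [148, 85, 183, 89],
    "SK": [35, 29, 0, 23],
    "TS": [0, 0, 0, 94],
    "BD2": [0.627, 0.351, 0.672, 0.167],
})
X_prepared = preprocessor.fit_transform(X)
print(X_prepared.shape)
```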

Balance Target Classes

Our target is not balanced, so we need to balance it using SMOTE.

After Balancing

Train-Test Split

We split X and y using train_test_split from sklearn.
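A minimal sketch of the split on toy arrays. The 80/20 ratio, the stratification, and the random seed are assumptions, not values taken from the notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy features and a perfectly balanced binary target.
X = np.arange(40).reshape(20, 2)
y = np.array([0, 1] * 10)

# stratify=y keeps the class proportions the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
print(X_train.shape, X_test.shape)
```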

Label Encoder

We encode y_train and y_test because they are categorical.
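The encoding step could be sketched as follows, assuming the labels are the strings "Positive" and "Negative" (the lists are toy values):

```python
from sklearn.preprocessing import LabelEncoder

y_train = ["Positive", "Negative", "Negative", "Positive"]
y_test = ["Negative", "Positive"]

encoder = LabelEncoder()
# Learn the label-to-integer mapping from the training labels only.
y_train_encoded = encoder.fit_transform(y_train)
# Reuse the same mapping for the test labels.
y_test_encoded = encoder.transform(y_test)

print(list(encoder.classes_), y_train_encoded, y_test_encoded)
```

LabelEncoder sorts the classes alphabetically, so "Negative" maps to 0 and "Positive" to 1.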

Model

Now it’s time to create a model for our dataset

We train and evaluate four models. XGBoost has the highest accuracy of the four, at 0.85.
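The article does not list the other three models, so the sketch below uses four scikit-learn classifiers as stand-ins, with GradientBoostingClassifier playing the role of XGBoost (xgboost's XGBClassifier could be dropped into the same dictionary). The data is synthetic, so the scores will not match the article's.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data standing in for the prepared sepsis features.
X, y = make_classification(n_samples=300, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Assumed candidate models; the article's actual list is not shown.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=42),
    "random_forest": RandomForestClassifier(random_state=42),
    "gradient_boosting": GradientBoostingClassifier(random_state=42),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

best_name = max(scores, key=scores.get)
print(scores)
print("Best model:", best_name)
```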

Hyperparameter tuning

We now tune the model’s hyperparameters to improve its performance.
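The tuning method used in the notebook is not shown, so this is a sketch of one common option, a grid search with cross-validation. The parameter grid is hypothetical, the data is synthetic, and GradientBoostingClassifier again stands in for XGBoost.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=200, n_features=8, random_state=42)

# Hypothetical search space; real ranges depend on the model and data.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [2, 3],
    "learning_rate": [0.05, 0.1],
}

search = GridSearchCV(
    GradientBoostingClassifier(random_state=42),
    param_grid,
    scoring="accuracy",
    cv=3,  # 3-fold cross-validation for each parameter combination
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", round(search.best_score_, 3))
```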

After optimization

Our model accuracy has slightly increased to 86.7%

Export key components

Now we export our key components so the API can use them.
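A sketch of the export with joblib follows. It assumes the key components are the fitted pipeline and the label encoder; the file names, the temporary folder, and the tiny stand-in pipeline are all illustrative.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Tiny stand-in for the trained pipeline and encoder from the notebook.
pipeline = Pipeline([("scaler", StandardScaler()), ("model", LogisticRegression())])
pipeline.fit([[1.0, 85.0], [6.0, 148.0], [1.0, 89.0], [8.0, 183.0]], [0, 1, 0, 1])
encoder = LabelEncoder().fit(["Negative", "Positive"])

# Hypothetical export location; a project folder would be used in practice.
export_dir = Path(tempfile.mkdtemp())
joblib.dump(pipeline, export_dir / "pipeline.joblib")
joblib.dump(encoder, export_dir / "encoder.joblib")

# Loading it back is what the API will do at startup.
loaded_pipeline = joblib.load(export_dir / "pipeline.joblib")
print(loaded_pipeline.predict([[7.0, 160.0]]))
```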

After exporting, create main.py and import all the necessary libraries. Before that, do not forget to add fastapi to your requirements.txt so the necessary packages get installed.

Import packages

Now create an app instance

And load our pipeline

Next, let’s create a class to define our input and call it SepsiFeature

Let’s create our first endpoint. When we launch the application, we get this page

Now let’s create an endpoint to manage the prediction

Once the function is created, open your terminal and type uvicorn main:app --reload

All the code is available on my GitHub

https://github.com/bambadij/Sepsi_Predict_FastAPI

I’m just a learner trying to share what I’ve learned.
If you have any suggestions or spot anything worded incorrectly, you can contact me at bambadij@gmail.com