SEPSIS PREDICTION AND API CREATION WITH FASTAPI
INTRODUCTION
After training, evaluating and testing your machine learning model, you may decide to expose it through an API so that other applications, whether web or mobile, can use it.
For this we will use FastAPI to build an API for our machine learning model. FastAPI is a modern web framework for creating RESTful APIs in Python.
Sepsis is a medical term for any “generalized inflammatory response associated with a serious infection.” This life-threatening condition occurs when the host’s response to infection, a systemic and severe inflammation of the body, damages its own tissues and organs. It can be accompanied by a cytokine storm.
GOAL
- Build a machine learning model to predict sepsis
- Build an API using FastAPI to serve the trained model
HYPOTHESIS
Null Hypothesis (H0): There is no significant relationship between sepsis and PRG (plasma glucose).
Alternative Hypothesis (Ha): There is a significant relationship between sepsis and PRG (plasma glucose).
DATA LOADING
- Import Packages
Load the Patiens_Files_Train dataset
Let’s check whether there are missing values
Our dataset is clean: there are no missing values and no duplicated rows
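As a minimal sketch of these checks, the snippet below counts missing values and duplicated rows with pandas. The small DataFrame is an illustrative stand-in with made-up values; in the article the data comes from the training file, which you would read with pd.read_csv instead.

```python
import pandas as pd

# Illustrative stand-in for the training data (made-up values).
# In practice, replace this with pd.read_csv(<path to the training file>).
df = pd.DataFrame({
    "PRG": [6, 1, 8, 1, 0, 5],
    "PL": [148, 85, 183, 89, 137, 116],
    "SK": [35, 29, 0, 23, 35, 0],
    "Sepssis": ["Positive", "Negative", "Positive",
                "Negative", "Positive", "Negative"],
})

# Number of missing (NaN) values in each column
missing_per_column = df.isnull().sum()

# Number of rows that are exact copies of an earlier row
duplicate_rows = int(df.duplicated().sum())

print(missing_per_column)
print("duplicated rows:", duplicate_rows)
```

If either count is above zero, that is the point to decide whether to impute, drop or deduplicate before going further.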
Hypothesis Testing
For this, let’s use the point-biserial correlation coefficient.
The point-biserial correlation coefficient is a measure of the strength and direction of association between a binary variable (in this case, the presence or absence of sepsis, coded as 0 or 1) and a continuous variable (Plasma glucose, PRG). The coefficient ranges from -1 to 1, where -1 indicates a perfect negative correlation, 1 indicates a perfect positive correlation, and 0 indicates no correlation.
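Here is a small sketch of the test using scipy.stats.pointbiserialr. The data is synthetic (the 0/1 sepsis labels and PRG values are generated for illustration only), and the 0.05 significance level is an assumption, not a value taken from the original analysis.

```python
import numpy as np
from scipy.stats import pointbiserialr

rng = np.random.default_rng(0)

# Synthetic data for illustration: binary sepsis labels (0 or 1) and a
# continuous PRG value that is shifted upward for positive cases.
sepsis = rng.integers(0, 2, size=200)
prg = rng.normal(loc=3.0 + 1.5 * sepsis, scale=2.0)

# Returns the correlation coefficient and the two-sided p-value
r, p_value = pointbiserialr(sepsis, prg)

alpha = 0.05  # assumed significance level
reject_h0 = p_value < alpha

print(f"r = {r:.3f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

When the p-value is below the chosen level we reject H0 and conclude that the relationship is significant; the sign of r gives the direction of the association.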
Univariate Analysis
- There are more negative than positive cases of sepsis in our data, so the classes in the target variable are imbalanced. They need to be balanced before model training.
Bi-variate Analysis
There are more cases of sepsis among people with higher Plasma glucose (PRG).
PRG values below 4.0 have more negative cases of sepsis.
MultiVariate Analysis
Create a correlation matrix or heatmap to identify relationships between different blood work attributes. Additionally, use box plots or violin plots to compare the distribution of each blood work result for patients with and without sepsis.
- PL (Attribute 2 Blood Work Result-1) = 0.4497 (moderate positive correlation)
Higher attribute 2 levels may be associated with a higher likelihood of sepsis.
- SK (Attribute 4 Blood Work Result-2) = 0.0756 (very weak positive correlation)
The correlation is close to zero, indicating a minimal linear relationship.
- TS (Attribute 5 Blood Work Result-3) = 0.1459 (weak positive correlation)
The correlation suggests a slight tendency for higher Blood Work Result-3 to be associated with a higher likelihood of sepsis.
- BD2 (Attribute 7 Blood Work Result-4) = 0.1816 (weak positive correlation)
The correlation suggests a slight tendency for higher Blood Work Result-4 to be associated with a higher likelihood of sepsis.
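Coefficients like the ones above can be read off a correlation matrix. The sketch below assumes the target has already been encoded as 0/1 and uses synthetic values, so the numbers it prints will not match the ones reported here.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 300

# Synthetic stand-in data: a 0/1 target and four blood work columns.
sepsis = rng.integers(0, 2, size=n)
df = pd.DataFrame({
    "PL": 110 + 25 * sepsis + rng.normal(0, 25, n),
    "SK": rng.normal(20, 15, n),
    "TS": rng.normal(80, 100, n),
    "BD2": rng.normal(0.5, 0.3, n),
    "Sepsis": sepsis,
})

# Pearson correlation between every pair of columns
corr_matrix = df.corr()

# Correlation of each feature with the target, strongest first
target_corr = corr_matrix["Sepsis"].drop("Sepsis").sort_values(ascending=False)
print(target_corr)
```

The same corr_matrix can be passed to a plotting library such as seaborn to draw the heatmap.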
Data Preparation
The target column is spelled Sepssis in the dataset, so we rename it to Sepsis.
Outlier detection
- Outliers may not be representative of the underlying patterns in the data. Outliers can disproportionately influence the parameters of a model during the training process. ML models often aim to minimize errors or maximize likelihood, and outliers may introduce large errors that the model will try to minimize, potentially skewing the model’s parameters.
Now we apply the remove_outliers function to our numerical columns
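The body of remove_outliers is not reproduced here, so the version below is only one possible implementation. It assumes the common interquartile range (IQR) rule and drops any row where a listed column falls outside Q1 - 1.5*IQR or Q3 + 1.5*IQR; the original function may clip or filter differently.

```python
import pandas as pd


def remove_outliers(df, columns, k=1.5):
    """Drop rows where any listed column lies outside the IQR fences.

    Hypothetical implementation: the fences are Q1 - k*IQR and Q3 + k*IQR.
    """
    keep = pd.Series(True, index=df.index)
    for col in columns:
        q1 = df[col].quantile(0.25)
        q3 = df[col].quantile(0.75)
        iqr = q3 - q1
        # A row survives only if it is inside the fences for every column
        keep &= df[col].between(q1 - k * iqr, q3 + k * iqr)
    return df[keep]


# Tiny made-up example: the PRG value 40 is far from the rest
df = pd.DataFrame({
    "PRG": [1, 2, 3, 4, 5, 40],
    "PL": [90, 95, 100, 105, 110, 100],
})
cleaned = remove_outliers(df, ["PRG", "PL"])
print(cleaned)
```

Dropping rows shrinks the dataset, so it is worth comparing len(df) before and after to see how much data the rule removes.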
X and Y split
Create Pipeline
With the data prepared, we now create a preprocessing pipeline using ColumnTransformer
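A minimal sketch of such a pipeline with scikit-learn is shown below. The column list and the choice of median imputation followed by standard scaling are assumptions made for illustration; the original pipeline may use different steps.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumed list of numerical feature columns
numeric_columns = ["PRG", "PL", "SK", "TS", "BD2"]

# Steps applied to every numerical column, in order
numeric_pipeline = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="median")),
    ("scaler", StandardScaler()),
])

# ColumnTransformer routes the listed columns through the pipeline
preprocessor = ColumnTransformer(transformers=[
    ("num", numeric_pipeline, numeric_columns),
])

# Made-up rows, only to show the transformer running end to end
X = pd.DataFrame({
    "PRG": [6, 1, 8, 1],
    "PL": [148, 85, 183, 89],
    "SK": [35, 29, 0, 23],
    "TS": [0, 0, 0, 94],
    "BD2": [0.627, 0.351, 0.672, 0.167],
})
X_prepared = preprocessor.fit_transform(X)
print(X_prepared.shape)
```

Keeping preprocessing inside one object means the exact same transformations can later be applied to incoming API requests.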
Balance Target Classes
Our target is not balanced, so we need to balance it using SMOTE
After balancing
Train-Test Split
We split our X and y using train_test_split from sklearn
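As a quick sketch, the call looks like this. The 80/20 ratio, the random_state and the stratify option are assumptions chosen for illustration, and the arrays are toy data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 20 samples with 2 features and a balanced text label
X = np.arange(40).reshape(20, 2)
y = np.array(["Negative", "Positive"] * 10)

# stratify=y keeps the class proportions the same in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```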
Label Encoder
We encode y_train and y_test, since they are categorical data
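A minimal sketch with scikit-learn's LabelEncoder, assuming the labels are the strings "Negative" and "Positive". The encoder is fitted on the training labels only and then reused on the test labels.

```python
from sklearn.preprocessing import LabelEncoder

# Assumed label values, for illustration
y_train = ["Negative", "Positive", "Negative", "Positive"]
y_test = ["Positive", "Negative"]

encoder = LabelEncoder()

# fit_transform learns the classes (sorted alphabetically) and encodes them
y_train_encoded = encoder.fit_transform(y_train)

# transform reuses the mapping learned from the training labels
y_test_encoded = encoder.transform(y_test)

print(list(encoder.classes_))
print(y_train_encoded, y_test_encoded)
```

encoder.inverse_transform turns the numbers back into the original text labels, which is handy when returning predictions from the API.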
Model
Now it’s time to build a model for our dataset
For that we trained and evaluated four models, and XGBoost had the highest accuracy
Its accuracy was 0.85, higher than the other models
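The comparison can be organised as a simple loop over candidate models. The three scikit-learn classifiers below are stand-ins chosen for illustration, since the other models are not named here, and the data is synthetic. XGBoost's XGBClassifier follows the same fit/predict interface, so it could be added to the dictionary in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data, for illustration only
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Stand-in candidates; an XGBClassifier could be added as another entry
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# Name of the model with the highest test accuracy
best_name = max(scores, key=scores.get)
print(scores)
print("best:", best_name)
```

Because the classes were imbalanced to begin with, it can also be worth looking at precision, recall or F1 next to accuracy.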
Hyperparameter tuning
Next, we tune the model’s hyperparameters to improve its performance
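One common way to do this is a cross-validated grid search. The sketch below uses GridSearchCV with scikit-learn's GradientBoostingClassifier as a stand-in for the boosted model; the parameter grid, the 3-fold setting and the synthetic data are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data, for illustration only
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Assumed search space: every combination is tried
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [2, 3],
    "learning_rate": [0.05, 0.1],
}

search = GridSearchCV(
    estimator=GradientBoostingClassifier(random_state=0),
    param_grid=param_grid,
    scoring="accuracy",
    cv=3,  # 3-fold cross-validation for each combination
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best cross-validated accuracy:", round(search.best_score_, 3))
```

search.best_estimator_ is the model refitted with the best parameters, ready to be evaluated on the test set.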
After optimization
- The model’s accuracy increased slightly, to 86.7%
Export key components
Now we export the key components (the trained pipeline) needed to build the API
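A small sketch of the export step using joblib is shown below. The tiny pipeline and the file name pipeline.joblib are hypothetical stand-ins, and a temporary folder is used so the example does not write into your project.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in pipeline trained on synthetic data, for illustration only
X, y = make_classification(n_samples=100, n_features=5, random_state=0)
pipeline = Pipeline(steps=[
    ("scaler", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])
pipeline.fit(X, y)

# Hypothetical file name inside a temporary export folder
export_dir = tempfile.mkdtemp()
pipeline_path = os.path.join(export_dir, "pipeline.joblib")

# Save the fitted pipeline to disk, then load it back
joblib.dump(pipeline, pipeline_path)
restored = joblib.load(pipeline_path)

print("saved to:", pipeline_path)
```

If a label encoder is used, it can be saved the same way, so the API can turn numeric predictions back into text labels.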
After exporting, create main.py and import all the necessary libraries. Before that, do not forget to add fastapi to your requirements.txt so that the necessary packages get installed
Import package
Now create an instance of the app
And load our pipeline
Then let’s create a class to define our input and call it SepsiFeature
Let’s create our first endpoint; when the application is launched, this is the page we get
Now let’s create an endpoint to manage the prediction
Once the function is created, open your terminal and type uvicorn main:app --reload
All the code is available on my GitHub:
https://github.com/bambadij/Sepsi_Predict_FastAPI
I’m just a learner trying to share what I’ve learned.
If you have any suggestions or spot anything incorrectly worded, you can contact me at bambadij@gmail.com