This repository contains the implementation of a complete Machine Learning workflow, including data preprocessing, model training with hyperparameter tuning, experiment tracking using MLflow, and deployment via FastAPI inside Docker. The project is structured for modularity, scalability, and production readiness.
The goal of this project is to build, train, evaluate, and deploy a sentiment analysis model. The system includes:
- Model Factory pattern to dynamically load ML algorithms.
- Optuna for hyperparameter optimization.
- MLflow for experiment tracking and model versioning.
- FastAPI as a serving layer for inference APIs.
- Docker containerization for consistent and portable deployment.
This project demonstrates a real-world ML engineering workflow and MLOps practices.
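The Model Factory mentioned above can be pictured roughly as below. This is an illustrative sketch only: the class name, registry keys, and estimator choices are assumptions, not the repository's actual `src/models/` code.

```python
# Illustrative sketch of a Model Factory -- the real implementation lives in src/models/.
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC


class ModelFactory:
    """Maps a string name (e.g. from config.yaml) to an untrained estimator."""

    _registry = {
        "logistic_regression": LogisticRegression,
        "random_forest": RandomForestClassifier,
        "linear_svc": LinearSVC,
    }

    @classmethod
    def create(cls, name, **params):
        if name not in cls._registry:
            raise ValueError(f"Unknown model {name!r}. Choices: {list(cls._registry)}")
        return cls._registry[name](**params)


# Usage: the pipeline asks the factory for a model by name.
model = ModelFactory.create("random_forest", n_estimators=200)
```

Registering classes in a dict like this is what lets the pipeline switch algorithms from configuration alone, without touching the training code.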
```
final_project/
│── artifact/             # Dataset and model.pkl storage
│── config/
│   ├── config.yaml       # Global configuration file
│── logs/                 # Log files
│── research/             # Jupyter notebooks for dataset research
│── src/
│   ├── api/              # FastAPI application
│   ├── data/             # Dataset loaders and preprocessing
│   ├── models/           # Model Factory + training logic
│   ├── utils/            # Logging, configuration, helpers
│   ├── services/         # LLM chat and category prediction functions
│   ├── pipeline/         # Training and inference pipelines
│── .dockerignore         # Files excluded from the Docker build context
│── docker-compose.yml    # Docker Compose configuration
│── Dockerfile.fastapi    # Dockerfile for the FastAPI container
│── Home.py               # Streamlit chatbot UI
│── README.md             # Project documentation
│── requirements.txt      # Python dependencies
```
- Multiple model options available through Model Factory
- Automatic hyperparameter optimization using Optuna
- Evaluation metrics: accuracy, precision, recall, F1-score
- MLflow Tracking & Model Registry integrated
- Automatic model logging and artifact storage
- REST API built with FastAPI
- Docker image for production
- Endpoint for real-time prediction
```bash
git clone https://github.com/zippo538/final_project.git
cd final_project
pip install -r requirements.txt
```

Run the training pipeline:

```bash
python src/run_pipeline.py
```

This will:
- Load dataset
- Run Optuna optimization
- Train the best model
- Log everything to MLflow
Start the FastAPI server:

```bash
uvicorn src.api.main:app --reload
```

Open the interactive docs:

http://127.0.0.1:8000/docs
Example prediction request:

```json
{
  "text": "This government policy is terrible"
}
```

Build the Docker image:
```bash
docker build -t myfastapi:latest -f Dockerfile.fastapi .
```

Run the container:
```bash
docker run -d -p 8000:8000 --name myfastapi myfastapi:latest
```

Start the MLflow UI:
```bash
mlflow ui --host 0.0.0.0 --port 5000
```

View experiments at:
http://localhost:5000
- Ensure the dataset path inside `config.yaml` is correct.
- The MLflow tracking server can be switched to remote storage if needed.
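For orientation, `config.yaml` might hold entries along these lines; the keys and values below are purely illustrative, not the file's actual schema, so check `config/config.yaml` for the real layout.

```yaml
# Hypothetical config.yaml layout -- verify against config/config.yaml.
data:
  dataset_path: artifact/dataset.csv    # adjust to your local dataset location
model:
  name: logistic_regression             # resolved by the Model Factory
  artifact_path: artifact/model.pkl
mlflow:
  tracking_uri: http://localhost:5000   # point at a remote server if needed
  experiment_name: sentiment-analysis
```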
Contributions are welcome! Please open an issue or submit a pull request.
This project is licensed under the MIT License.