- Fork or clone this repo to your local machine.
- Navigate to the project directory and activate the virtualenv: `source dev/bin/activate`
- Install requirements: run `pip install -r requirements.txt` from the root directory.
- Navigate to `packages` -> `regression_model`, then run `tox` to make sure that everything is passing.
The following Kaggle notebooks include the complete basic analyses and EDA:
Scaffold template - starter template
Notes to consider:
- If you intend to use this project as a reference to build your own, you need to define your environment variables in CircleCI.
- Tests are disabled for this project; I would appreciate it if anyone raised a pull request to include them.
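As a rough illustration of the first note, a project built from this scaffold could read the environment variables defined in CircleCI at start-up and fail early when one is missing. This is only a sketch: the variable names `EXAMPLE_API_KEY` and `EXAMPLE_APP_NAME` and the helper `load_required_env` are hypothetical and are not part of this repo.

```python
import os

# Hypothetical variable names; replace them with the ones your CircleCI project defines.
REQUIRED_VARS = ("EXAMPLE_API_KEY", "EXAMPLE_APP_NAME")


def load_required_env(names=REQUIRED_VARS, environ=None):
    """Return a dict of the requested variables, raising if any is missing or empty."""
    # Fall back to the real process environment unless a mapping is passed in (handy for tests).
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}
```

Calling `load_required_env()` once at the top of a deploy or training script surfaces a missing variable immediately instead of partway through the CI job.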
I structured the scaffold following OOP principles, which separate the concerns of the code.
- `processing` folder contains any scripts for data wrangling, cleaning, or feature engineering.
- `trained_model` folder contains any scripts dedicated to building the model, tuning it, or any related tasks.
- `pipeline.py` file contains all the procedures that should be done using `sklearn.pipeline`.
- `predict.py` file is dedicated to getting the predictions.
- `train_pipeline.py` file is dedicated to training the model, starting from downloading the dataset, splitting it, applying the pipeline, etc.
- The resulting model is saved as a `.pkl` file versioned with the same version as the package.
- Only one API endpoint is deployed to Heroku so far; pull requests are welcome.
- Build extra API endpoints.
- Build a frontend interface using Streamlit.
- Complete the unit tests.
- Create a configuration file for Travis CI.
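The idea of saving the model as a `.pkl` file versioned with the package version can be sketched as follows. This is a minimal illustration, not the code used in this repo: the file-name pattern, the hard-coded `PACKAGE_VERSION`, and the helpers `model_path`, `save_model`, and `load_model` are all assumptions, and any picklable object (for example a fitted pipeline) could stand in for `model`.

```python
import pickle
from pathlib import Path

# Assumption: a real package would read its version from its own metadata;
# it is hard-coded here only for illustration.
PACKAGE_VERSION = "0.1.0"


def model_path(directory, version=PACKAGE_VERSION, stem="regression_model"):
    """Build a file name such as regression_model_v0.1.0.pkl inside `directory`."""
    return Path(directory) / f"{stem}_v{version}.pkl"


def save_model(model, directory, version=PACKAGE_VERSION):
    """Pickle `model` under a version-stamped name and return the path."""
    path = model_path(directory, version)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(model, fh)
    return path


def load_model(directory, version=PACKAGE_VERSION):
    """Load the pickled model that was saved for `version`."""
    with model_path(directory, version).open("rb") as fh:
        return pickle.load(fh)
```

Stamping the version into the file name lets the prediction code load exactly the artifact that matches the installed package. Note that pickle files should only be loaded from trusted sources.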
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.