Commit 333a9b29 authored by Gavin Lee

Update README.md

parent 2623c0f5
Pipeline #310433 passed in 33 seconds
# renku-mls-plugin
# Renku MLS (machine learning schema) plug-in
## Introduction
## Helping you benchmark
This is a Renku project - basically a git repository with some
bells and whistles. You'll find we have already created some
useful things like `data` and `notebooks` directories and
a `Dockerfile`.
If you have just got your feet wet with machine learning, you will soon realise
that there is a myriad of techniques, models and metrics out there. Nowadays,
many popular machine learning packages, for example `scikit-learn`
or `keras`, have user-friendly interfaces in Python (or other languages).
## Working with the project
When you need to test out several models and metrics at the same time, code
repetition and unnecessary verbosity sometimes creep into your projects.
The simplest way to start your project is right from the Renku
platform - just click on the `Environments` tab and start a new session.
This will start an interactive environment right in your browser.
The Renku MLS plug-in is designed to help you benchmark your machine learning
models more easily, all whilst interfacing with the classic Renku command line.
To work with the project anywhere outside the Renku platform,
click the `Settings` tab where you will find the
git repo URLs - use `git` to clone the project on whichever machine you want.
At present, the plug-in is compatible with `keras`, `scikit-learn` and `XGBoost`
for classification tasks, and supports the following metrics: `accuracy_score`,
`roc_auc_score` and `f1_score`.
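
To make the three metrics concrete, here is a minimal, dependency-free sketch of what each one measures for binary labels. The function names `accuracy`, `f1` and `roc_auc` are hypothetical stand-ins written for illustration; they are not the plug-in's code, and in practice you would call the `scikit-learn` functions named above.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores, positive=1):
    """Chance that a random positive is scored above a random negative.

    Ties count as half a win. This pairwise form is quadratic in the
    number of samples, so it is only suitable for small illustrations.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Note that accuracy and F1 take hard class predictions, whereas ROC AUC takes continuous scores or probabilities, so a benchmark that reports all three needs both kinds of model output.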
### Changing interactive environment dependencies
## About the project
Initially we install a very minimal set of packages to keep the images small.
However, you can add python and conda packages in `requirements.txt` and
`environment.yml` to your heart's content. If you need more fine-grained
control over your environment, please see [the documentation](https://renku.readthedocs.io/en/latest/user/advanced_interfaces.html#dockerfile-modifications).
The dataset used in this project contains numeric data, with the `label` column
as the target variable. The `src/train.py` file is a training script for the data;
it takes command-line arguments to avoid code repetition. The `export()`
function exposes the model to the plug-in.
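
As an illustration of how command-line arguments keep a training script free of repetition, the sketch below parses the choices that would otherwise be hard-coded in separate copies of the script. It is not the contents of `src/train.py`: the flag names, the default data path and the model names are all assumptions made for this example, and the `export()` call is left as a comment because its signature is not shown here.

```python
import argparse


def parse_args(argv=None):
    """Parse training options; every flag and default here is hypothetical."""
    parser = argparse.ArgumentParser(
        description="Train one classifier on the project dataset."
    )
    parser.add_argument("--data", default="data/dataset.csv",
                        help="path to the input CSV (hypothetical default)")
    parser.add_argument("--target", default="label",
                        help="name of the target column")
    parser.add_argument("--model", default="logistic",
                        choices=["logistic", "forest", "xgboost"],
                        help="which model to train (hypothetical names)")
    parser.add_argument("--test-size", type=float, default=0.2,
                        help="fraction of rows held out for evaluation")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # 1. Load args.data and split off the args.target column.
    # 2. Build and fit the estimator selected by args.model.
    # 3. Call the plug-in's export() on the fitted model here.
    return args
```

With this layout, one script can be run once per model, for example `main(["--model", "forest"])`, instead of keeping a near-identical script for each model.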
## Project configuration
Project options can be found in `.renku/renku.ini`. In this
project there is currently only one option, which specifies
the default type of environment to open, in this case `/lab` for
JupyterLab. You may also choose `/tree` to get to the "classic" Jupyter
interface.
## Moving forward
Once you feel at home with your project, we recommend that you replace
this README file with your own project documentation! Happy data wrangling!
\ No newline at end of file
The `notebooks/renku-runs.ipynb` notebook shows the example tasks run in this
sample project.