# S2S AI Challenge Template

This is a template repository with running examples of how to join and contribute to
the `s2s-ai-challenge`.

You were likely referred here from the [public website](https://s2s-ai-challenge.github.io/).

The submission period for the `s2s-ai-challenge` has ended. The steps to join the competition are kept below for documentation purposes.
However, the `s2saichallengescorer` remains active, so you can still verify your 2020 predictions. They will show up in the [RPSS leaderboard](https://renkulab.io/gitlab/aaron.spring/s2s-ai-challenge-leaderboard) but do not qualify for prizes.

---

If you have already forked this project, please fork again or pull recent changes.
Major changes will also be announced on the [challenge website](https://s2s-ai-challenge.github.io/#announcements).
This template repository will have release tags to track changes.

Find an overview of [repositories and websites](https://renkulab.io/gitlab/aaron.spring/s2s-ai-challenge/-/wikis/Flow-of-information:-Where-do-I-find-what%3F).

## Introduction

This is a Renku project. Renku is a platform for reproducible and collaborative data analysis.
At its simplest, a Renku project is a gitlab repository with added functionality,
so you can use this project just as a gitlab repository if you wish. If you are curious
about what else Renku has to offer, the best place to start is the
[Renku documentation](https://renku.readthedocs.io/en/latest/).
You'll find we have already created some useful things like `data` and `notebooks` directories and
a `Dockerfile`.

## Join the challenge

### 1. The simplest way to join the S2S AI Challenge is forking this renku project.
Ensure you fork the renku project and the underlying gitlab repository through the renkulab.io page.

Fork this template renku project from https://renkulab.io/projects/aaron.spring/s2s-ai-challenge-template/settings.

<img src="docs/screenshots/fork_renku.png" width="300">

Name your fork `s2s-ai-challenge-$TEAMNAME`.

If you clone this repository and do not want to download the `git lfs`-backed [renku datasets](https://renku.readthedocs.io/projects/renku-python/en/v0.4.0/cli.html#module-renku.cli.dataset) immediately, use:
```bash
# use either `git clone` (shown here) or `renku clone`
GIT_LFS_SKIP_SMUDGE=1 git clone https://renkulab.io/projects/$YOURNAME/s2s-ai-challenge-$TEAMNAME.git
```
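
With `GIT_LFS_SKIP_SMUDGE=1`, LFS-tracked files are checked out as small text pointer files instead of the real data. The following is a minimal sketch for checking whether a given file is still such a pointer, assuming the standard `git lfs` pointer format. The function name `is_lfs_pointer` is hypothetical and not part of this template.

```bash
# Return 0 if the file is a git-lfs pointer (data not downloaded yet), 1 otherwise.
# Assumption: pointer files start with the standard git-lfs spec line.
is_lfs_pointer() {
    head -n 1 "$1" 2>/dev/null | grep -q '^version https://git-lfs.github.com/spec/v1'
}

# Hypothetical usage inside your clone:
# is_lfs_pointer submissions/ML_prediction_2020.nc && git lfs pull
```

Running `git lfs pull` inside the clone then downloads the real content for the current checkout.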

To be able to pull future changes from the [template](https://renkulab.io/gitlab/aaron.spring/s2s-ai-challenge-template) into your repository, add an `upstream`:

```bash
# in your fork locally
git remote add upstream https://renkulab.io/gitlab/aaron.spring/s2s-ai-challenge-template.git
git pull upstream master
```

### 2. Fill out our [registration form](https://docs.google.com/forms/d/1KEnATjaLOtV-o4N8PLinPXYnpba7egKsCCH_efriCb4).

Registration is not required before October 31st, 2021, but is highly [appreciated for the flow of information](https://renkulab.io/gitlab/aaron.spring/s2s-ai-challenge/-/issues/4).

### 3. Make the project private

Now navigate to the gitlab page by clicking on "View in gitlab" in the upper right corner.
Under "Settings" - "General" - "Visibility" you can set your project private.

<img src="docs/screenshots/gitlab_visibility.png" width="300">

Now other people cannot steal your idea/code.

Please modify the `README` in your fork with your team's details and a
description of your method.

### 4. Add the `s2saichallengescorer` user to your repo with Reporter permissions
The scorer follows the code shown in the [verification notebook](https://renkulab.io/gitlab/aaron.spring/s2s-ai-challenge-template/-/blob/master/notebooks/verification_RPSS.ipynb). The scorer's username on gitlab is `s2saichallengescorer`. You should add it to your project with `Reporter` permissions. Under "Members" - "Invite Members" - "GitLab member or Email address", add `s2saichallengescorer`. The scorer will only ever clone your repository and evaluate your submission. It will never make any changes to your code.

### 5. Add the `s2s-ai-challenge` topic to your repository
To add the project topic navigate to `Settings` -> `General` and then fill in the word `s2s-ai-challenge` in the 
`Topics` field near the top of the page. If you have multiple topics you can separate them by commas.

This allows your repository to be recognized as a participant of the competition. Without
this project topic or if you have not added the scorer as a member of your project 
the automated scoring bot will not evaluate any of your submissions and none of your code
or results will be considered for the competition.

## Make Predictions

### 6. Start jupyter on renku or locally
The simplest way to contribute is right from the Renku platform - 
just click on the `Environments` tab in your renku project and start a new session.
This will start an interactive environment right in your browser.

<img src="docs/screenshots/renku_start_env.png" width="300">

If the docker image fails initially, please re-build docker or touch the `environment.yml` file.

To work with the project anywhere outside the Renku platform,
click the `Settings` tab where you will find the
renku project URLs. Use `renku clone` to clone the project on whichever machine you want:
install [renku first with `pipx`](https://renku-python.readthedocs.io/en/latest/installation.html),
and then run `renku clone https://renkulab.io/gitlab/$YOURNAME/s2s-ai-challenge-$TEAMNAME.git`.

### 7. Train your Machine Learning model

Get training data via 
- [climetlab](https://github.com/ecmwf-lab/climetlab-s2s-ai-challenge)
- [renku datasets](https://renku.readthedocs.io/en/stable/user/data.html)

Get corresponding observations/ground truth:
- [climetlab](https://github.com/ecmwf-lab/climetlab-s2s-ai-challenge)
- IRIDL: [temperature](http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCEP/.CPC/.temperature/.daily/) and accumulated [precipitation](http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCEP/.CPC/.UNIFIED_PRCP/.GAUGE_BASED/.GLOBAL/.v1p0/.extREALTIME/.rain)

### 8. Let the Machine Learning model perform subseasonal 2020 predictions
and save them as `netcdf` files.
The submissions have to be placed in the `submissions` folder with filename `ML_prediction_2020.nc`,
see [example](https://renkulab.io/gitlab/aaron.spring/s2s-ai-competition-bootstrap/-/blob/master/submissions/ML_prediction_2020.nc).
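
A quick check before submitting can catch a misplaced file. The sketch below only tests that the expected path exists and is non-empty; it does not validate the netcdf contents. The function name `check_submission_file` is hypothetical and not part of this template.

```bash
# Check that the submission file exists at submissions/ML_prediction_2020.nc.
# Argument: path to the repository root (defaults to the current directory).
check_submission_file() {
    repo_root="${1:-.}"
    file="$repo_root/submissions/ML_prediction_2020.nc"
    if [ -s "$file" ]; then
        echo "found $file"
    else
        echo "missing or empty: $file" >&2
        return 1
    fi
}
```

Calling `check_submission_file` from the repository root returns a non-zero exit status when the file is absent, so it can guard a later `git commit`.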

### 9. `git commit` training pipeline and netcdf submission
For later verification by the organizers, reproducibility and scoring of submissions,
commit all code, input and output data. For the data files please use `git lfs`. 
If you are unfamiliar with `git lfs` a short introduction can be found 
[here](https://www.atlassian.com/git/tutorials/git-lfs).
This is very important because the organizers need to review and reliably reproduce 
your results. If your results cannot be reliably reproduced then you cannot win
the competition - even if your submitted results had the highest score. 

After committing, tag your submission and push your commit and tag. The automated scorer will
evaluate any tag (regardless of which branch it is on) that starts with the word `submission`
followed by any other combination of characters. In other words, any tags that satisfy the
regex `^submission.*` will be evaluated by the scorer. In addition, the scorer will only look for the
results in a file named `ML_prediction_2020.nc` located in the `submissions` folder
at the root of each competitor's repository.
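
To see which tag names this rule accepts, the regex can be tried locally. This is a minimal sketch using `grep -E`; how the scorer applies the regex internally is not shown in this README, and the function name `is_submission_tag` is hypothetical.

```bash
# Return 0 if a tag name matches the regex ^submission.* quoted above.
is_submission_tag() {
    printf '%s\n' "$1" | grep -Eq '^submission.*'
}

# Example tag names (made up for illustration).
for tag in submission-my_method_name-0.0.1 submission v1.0-submission my-submission; do
    if is_submission_tag "$tag"; then
        echo "$tag: evaluated"
    else
        echo "$tag: ignored"
    fi
done
```

The first two names match because they start with `submission`; the last two are ignored because the word appears later in the name.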

Here is an example of a set of commands that would commit the results and add the scorer tag.
```bash
# run your training and create file ../submissions/ML_prediction_2020.nc
git lfs track "*.nc"  # this will ensure that all *nc files are using lfs and needs to be done only once, already done in https://renkulab.io/gitlab/aaron.spring/s2s-ai-challenge-template
git add submissions/ML_prediction_2020.nc  # submission file to be fetched by s2saichallengescorer
git add notebooks/current_notebook.ipynb  # training and prediction notebook
git commit -m "commit submission for my_method_name" # whatever message you want
git tag "submission-my_method_name-0.0.1"  # add this tag if this is to be evaluated by the s2saichallengescorer
git push  # push the commit on your current branch
git push --tags  # push the tag so the s2saichallengescorer can find it
```

Please note that only submitted/tagged commits will be considered for the competition.
If you have code that produces better results after the competition ends and it has
not been tagged or is tagged after the competition closed then this will not be considered.

### 10. RPSS scoring by `s2saichallengescorer` bot
The `s2saichallengescorer` will fetch your tagged submissions and score them with RPSS against recalibrated ECMWF real-time forecasts.
Your score will be added to the private leaderboard, which will be made public in early November 2021.
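
For orientation, RPSS is built from the ranked probability score (RPS). The block below gives the generic textbook definition for K ordered categories. The categories, the averaging over grid points and forecasts, and the reference forecast actually used by the scorer are defined in the verification notebook, not here.

```math
\mathrm{RPS} = \sum_{k=1}^{K} \left( \sum_{j=1}^{k} p_j - \sum_{j=1}^{k} o_j \right)^2,
\qquad
\mathrm{RPSS} = 1 - \frac{\overline{\mathrm{RPS}}_{\mathrm{forecast}}}{\overline{\mathrm{RPS}}_{\mathrm{reference}}}
```

Here `p_j` is the forecast probability of category `j`, `o_j` is 1 for the observed category and 0 otherwise, and the overbar denotes an average over all forecasts. As an illustration with three categories and an equal-odds reference (both assumptions for this example), a forecast of `(0.2, 0.3, 0.5)` with the third category observed gives RPS = 0.04 + 0.25 + 0 = 0.29. The reference `(1/3, 1/3, 1/3)` gives RPS = 1/9 + 4/9 = 5/9 ≈ 0.556, so RPSS = 1 - 0.29/0.556 ≈ 0.48. Positive values mean the forecast beats the reference.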

The `s2saichallengescorer` is not active for the competition yet.

## More information

- in the [`s2s-ai-challenge` wiki](https://renkulab.io/gitlab/aaron.spring/s2s-ai-challenge/-/wikis/Home)
- all different resources for this [competition](https://renkulab.io/gitlab/aaron.spring/s2s-ai-challenge/-/wikis/Flow-of-information:-Where-do-I-find-what%3F)

## Changing interactive environment dependencies

Initially we install a very minimal set of packages to keep the images small.
However, you can add python and conda packages in `requirements.txt` and
`environment.yml` to your heart's content. If you need more fine-grained
control over your environment, please see [the documentation](https://renku.readthedocs.io/en/latest/user/advanced_interfaces.html#dockerfile-modifications).