Commit bea59b3e authored by CI-bot, committed by renku 0.10.4

renku rerun data/covidtracking/states-metadata.json data/covidtracking/states-daily.json

parent 11039d14
Pipeline #46087 passed with stage in 43 seconds
class: Workflow
cwlVersion: v1.0
hints: []
inputs:
  input_1:
    default: out_folder
    streamable: false
    type: string
  input_2:
    default: data/covidtracking
    streamable: false
    type: string
  input_3:
    default:
      class: File
      path: ../../notebooks/process/download-covidtracking-data.ipynb
    streamable: false
    type: File
  input_4:
    default: runs/download-covidtracking-data.runs.ipynb
    streamable: false
    type: string
  input_5:
    default: states-metadata.json
    streamable: false
    type: string
  input_6:
    default: states-daily.json
    streamable: false
    type: string
outputs:
  output_0:
    outputSource: step_1/output_1
    streamable: false
    type: Directory
  output_1:
    outputSource: step_1/output_0
    streamable: false
    type: File
requirements: []
steps:
  step_1:
    in:
      input_1: input_1
      input_2: input_2
      input_3: input_3
      input_4: input_4
    out:
    - output_0
    - output_1
    run: a17d560c41a54f5aa307ce5f3c5effe5_papermill.cwl
  step_2:
    in:
      filename: input_5
      input_directory: step_1/output_1
    out:
    - output_file
    run:
      arguments: []
      baseCommand:
      - 'true'
      class: CommandLineTool
      cwlVersion: v1.0
      hints: []
      inputs:
        filename:
          default: states-metadata.json
          streamable: false
          type: string
        input_directory:
          streamable: false
          type: Directory
      outputs:
        output_file:
          outputBinding:
            glob: $(inputs.filename)
          streamable: false
          type: File
      permanentFailCodes: []
      requirements:
      - &id001
        class: InlineJavascriptRequirement
      - &id002
        class: InitialWorkDirRequirement
        listing: $(inputs.input_directory.listing)
      successCodes: []
      temporaryFailCodes: []
  step_3:
    in:
      filename: input_6
      input_directory: step_1/output_1
    out:
    - output_file
    run:
      arguments: []
      baseCommand:
      - 'true'
      class: CommandLineTool
      cwlVersion: v1.0
      hints: []
      inputs:
        filename:
          default: states-daily.json
          streamable: false
          type: string
        input_directory:
          streamable: false
          type: Directory
      outputs:
        output_file:
          outputBinding:
            glob: $(inputs.filename)
          streamable: false
          type: File
      permanentFailCodes: []
      requirements:
      - *id001
      - *id002
      successCodes: []
      temporaryFailCodes: []
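In the workflow above, step_2 and step_3 each take the directory produced by step_1, run the no-op command `true`, and use a glob on the file name to expose a single file as their output. A rough Python analogue of that selection is sketched below. The helper name `select_file` and the temporary directory are illustrative assumptions and are not part of the workflow or of any CWL runner.

```python
import tempfile
from pathlib import Path


def select_file(input_directory: Path, filename: str) -> Path:
    """Return the one file called `filename` inside `input_directory`.

    Hypothetical helper: loosely mirrors `glob: $(inputs.filename)` applied
    to a staged copy of the directory listing.
    """
    matches = sorted(p for p in input_directory.glob(filename) if p.is_file())
    if len(matches) != 1:
        raise FileNotFoundError(
            f"expected exactly one {filename!r} in {input_directory}, "
            f"found {len(matches)}"
        )
    return matches[0]


# Stand-in for the directory output of step_1 (assumed layout).
with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp)
    for name in ("states-metadata.json", "states-daily.json"):
        (out_dir / name).write_text("[]")

    # One selection per downstream step, using the defaults of input_5 and input_6.
    selected = [
        select_file(out_dir, name).name
        for name in ("states-metadata.json", "states-daily.json")
    ]
```

Raising when the file is missing is a design choice of this sketch; it is meant to resemble a step failing because its declared `File` output cannot be found.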
This source diff could not be displayed because it is stored in LFS. You can view the blob instead.
%% Cell type:code id: tags:
``` python
import requests
import os
import pandas as pd
```
%% Cell type:code id: tags:parameters
``` python
out_folder = "../data/covidtracking/"
PAPERMILL_OUTPUT_PATH = None
```
%% Cell type:code id: tags:injected-parameters
``` python
# Parameters
- PAPERMILL_INPUT_PATH = "/tmp/cgc1623o/notebooks/process/download-covidtracking-data.ipynb"
+ PAPERMILL_INPUT_PATH = "/tmp/7998tnm6/notebooks/process/download-covidtracking-data.ipynb"
PAPERMILL_OUTPUT_PATH = "runs/download-covidtracking-data.runs.ipynb"
out_folder = "data/covidtracking"
```
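Papermill places the `injected-parameters` cell after the cell tagged `parameters`, so the injected assignments run later and override the defaults. The sketch below imitates that ordering by executing two cell sources in one shared namespace. The `run_cells` helper is a hypothetical stand-in for a notebook kernel, not papermill's API.

```python
def run_cells(cell_sources):
    """Execute cell sources in order in one namespace, like a kernel would."""
    namespace = {}
    for source in cell_sources:
        exec(source, namespace)
    return namespace


# Defaults from the cell tagged `parameters`.
parameters_cell = (
    'out_folder = "../data/covidtracking/"\n'
    "PAPERMILL_OUTPUT_PATH = None\n"
)

# Values from the cell tagged `injected-parameters`.
injected_cell = (
    'PAPERMILL_OUTPUT_PATH = "runs/download-covidtracking-data.runs.ipynb"\n'
    'out_folder = "data/covidtracking"\n'
)

ns = run_cells([parameters_cell, injected_cell])
effective_out_folder = ns["out_folder"]
will_save = bool(ns["PAPERMILL_OUTPUT_PATH"])
```

With the injected values in effect, `PAPERMILL_OUTPUT_PATH` is truthy, which is what enables the `if PAPERMILL_OUTPUT_PATH:` save cells later in the notebook.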
%% Cell type:markdown id: tags:
# Download state metadata
Download the metadata for each US state and several territories, including the URLs of each state's data sources. See the [Google Doc](https://docs.google.com/spreadsheets/d/18oVRrHj3c183mHmq3m89_163yuYltLNlOmPerQ18E8w/htmlview?sle=true).
%% Cell type:code id: tags:
``` python
url = 'http://covidtracking.com/api/states/info'
r = requests.get(url, allow_redirects=True)
states_metadata_json = r.content
```
%% Cell type:code id: tags:
``` python
# save the result
if PAPERMILL_OUTPUT_PATH:
    out_path = os.path.join(out_folder, 'states-metadata.json')
    with open(out_path, 'wb') as f:
        f.write(states_metadata_json)
```
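The save cell above writes into `out_folder` and so relies on that folder already existing. A minimal sketch of a variant that creates the folder first is shown below. The function name `save_bytes` and the temporary paths are assumptions for illustration and do not appear in the notebook.

```python
import os
import tempfile


def save_bytes(content: bytes, out_folder: str, filename: str) -> str:
    """Write `content` to out_folder/filename, creating the folder if needed."""
    os.makedirs(out_folder, exist_ok=True)
    out_path = os.path.join(out_folder, filename)
    with open(out_path, "wb") as f:
        f.write(content)
    return out_path


# Demonstration in a throwaway directory (the payload b"[]" is placeholder data).
with tempfile.TemporaryDirectory() as tmp:
    target_folder = os.path.join(tmp, "data", "covidtracking")
    saved_path = save_bytes(b"[]", target_folder, "states-metadata.json")
    saved_name = os.path.basename(saved_path)
    with open(saved_path, "rb") as f:
        roundtrip = f.read()
```

The same helper would apply unchanged to the `states-daily.json` payload.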
%% Cell type:code id: tags:
``` python
metadata_df = pd.read_json(states_metadata_json)
print(len(metadata_df), "states and territories have metadata")
metadata_df.head(2)
```
%%%% Output: stream
56 states and territories have metadata
%%%% Output: execute_result
state notes \
0 AK Negatives = (Totals – Positives)\nPositives oc...
1 AL Negatives = (Totals - Positives) \nPositives o...
covid19Site \
0 http://dhss.alaska.gov/dph/Epi/id/Pages/COVID-...
1 https://alpublichealth.maps.arcgis.com/apps/op...
covid19SiteSecondary \
0 http://dhss.alaska.gov/dph/Epi/id/Pages/COVID-...
1 https://dph1.adph.state.al.us/covid-19/
covid19SiteTertiary twitter \
0 https://alaska-dhss.maps.arcgis.com/apps/opsda... @Alaska_DHSS
1 @alpublichealth
covid19SiteOld name fips pui pum
0 http://dhss.alaska.gov/dph/Epi/id/Pages/COVID-... Alaska 2 False
1 http://www.alabamapublichealth.gov/infectiousd... Alabama 1 False
%% Cell type:markdown id: tags:
# Download daily state data
%% Cell type:code id: tags:
``` python
url = 'https://covidtracking.com/api/states/daily'
r = requests.get(url, allow_redirects=True)
states_daily_json = r.content
```
%% Cell type:code id: tags:
``` python
# save the result
if PAPERMILL_OUTPUT_PATH:
    out_path = os.path.join(out_folder, 'states-daily.json')
    with open(out_path, 'wb') as f:
        f.write(states_daily_json)
```
%% Cell type:code id: tags:
``` python
data_df = pd.read_json(states_daily_json)
print(len(data_df), "data points")
data_df.head(2)
```
%%%% Output: stream
- 6961 data points
+ 7017 data points
%%%% Output: execute_result
         date state  positive  negative  pending  hospitalizedCurrently  \
- 0  20200707    AK    1184.0  130236.0      NaN                   25.0
- 1  20200707    AL   45785.0  415579.0      NaN                 1073.0
+ 0  20200708    AK    1226.0  132175.0      NaN                   30.0
+ 1  20200708    AL   46962.0  421330.0      NaN                 1110.0
     hospitalizedCumulative  inIcuCurrently  inIcuCumulative  \
  0                     NaN             NaN              NaN
- 1                  2961.0             NaN            858.0
+ 1                  3006.0             NaN            871.0
     onVentilatorCurrently  ...  posNeg  deathIncrease  hospitalizedIncrease  \
- 0                    1.0  ...  131420              1                     0
- 1                    NaN  ...  461364             26                    47
+ 0                    0.0  ...  133401              0                     0
+ 1                    NaN  ...  468292             25                    45
                                         hash  commercialScore  \
- 0  7c7ab728f94cde83c2cd769a80f5de03648a7feb                0
- 1  8d8164f1d3922f1d9d55fff0e476fd676e62b009                0
+ 0  44f40820165a4480d4a1b8b031bb7e6bf3a22a40                0
+ 1  9fc63c874a98fa97cfee6a65d2dff374513b499f                0
  negativeRegularScore negativeScore positiveScore score grade
  0 0 0 0 0
  1 0 0 0 0
- [2 rows x 39 columns]
+ [2 rows x 41 columns]
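The changed output lines can be sanity-checked with a little arithmetic: the newest date moves from 20200707 to 20200708, and the row count grows from 6961 to 7017. The sketch below parses the integer dates and computes both differences. The `parse_date` helper is hypothetical, and reading the 56 extra rows as one new row per state or territory is an interpretation that is consistent with the 56 entries reported by the metadata cell, not something the diff states outright.

```python
from datetime import date, datetime


def parse_date(value: int) -> date:
    """Convert a YYYYMMDD integer, as used in the `date` column, to a date."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Values copied from the old and new notebook outputs.
old_rows, new_rows = 6961, 7017
old_latest, new_latest = parse_date(20200707), parse_date(20200708)

days_added = (new_latest - old_latest).days
rows_added = new_rows - old_rows
```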