Commit 8e75a5c0 authored by CR (covid cron), committed by renku 0.10.4.dev13

renku rerun data/covidtracking/states-metadata.json data/covidtracking/states-daily.json

parent 98a2aca3
Pipeline #30280 canceled in 4 minutes and 55 seconds
class: Workflow
cwlVersion: v1.0
hints: []
inputs:
  input_1:
    default: states-metadata.json
    streamable: false
    type: string
  input_2:
    default: states-daily.json
    streamable: false
    type: string
  input_3:
    default: out_folder
    streamable: false
    type: string
  input_4:
    default: data/covidtracking
    streamable: false
    type: string
  input_5:
    default:
      class: File
      path: ../../notebooks/process/download-covidtracking-data.ipynb
    streamable: false
    type: File
  input_6:
    default: runs/download-covidtracking-data.runs.ipynb
    streamable: false
    type: string
outputs:
  output_0:
    outputSource: step_3/output_0
    streamable: false
    type: File
  output_2:
    outputSource: step_3/output_1
    streamable: false
    type: Directory
requirements: []
steps:
  step_1:
    in:
      filename: input_1
      input_directory: step_3/output_1
    out:
    - output_file
    run:
      arguments: []
      baseCommand:
      - 'true'
      class: CommandLineTool
      cwlVersion: v1.0
      hints: []
      inputs:
        filename:
          default: states-metadata.json
          streamable: false
          type: string
        input_directory:
          streamable: false
          type: Directory
      outputs:
        output_file:
          outputBinding:
            glob: $(inputs.filename)
          streamable: false
          type: File
      permanentFailCodes: []
      requirements:
      - &id001
        class: InlineJavascriptRequirement
      - &id002
        class: InitialWorkDirRequirement
        listing: $(inputs.input_directory.listing)
      successCodes: []
      temporaryFailCodes: []
  step_2:
    in:
      filename: input_2
      input_directory: step_3/output_1
    out:
    - output_file
    run:
      arguments: []
      baseCommand:
      - 'true'
      class: CommandLineTool
      cwlVersion: v1.0
      hints: []
      inputs:
        filename:
          default: states-daily.json
          streamable: false
          type: string
        input_directory:
          streamable: false
          type: Directory
      outputs:
        output_file:
          outputBinding:
            glob: $(inputs.filename)
          streamable: false
          type: File
      permanentFailCodes: []
      requirements:
      - *id001
      - *id002
      successCodes: []
      temporaryFailCodes: []
  step_3:
    in:
      input_1: input_3
      input_2: input_4
      input_3: input_5
      input_4: input_6
    out:
    - output_0
    - output_1
    run: a17d560c41a54f5aa307ce5f3c5effe5_papermill.cwl
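The step wiring in this workflow implies an execution order: `step_1` and `step_2` both read `step_3/output_1`, so `step_3` (the papermill run) has to finish before either of them can collect its file. The sketch below only illustrates that reasoning. The `in` sources are transcribed by hand from the workflow, the helper name `upstream_steps` is invented for illustration, and this is not how renku or a CWL runner actually schedules steps.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Sources of each step's `in` mapping, transcribed from the workflow.
step_sources = {
    "step_1": ["input_1", "step_3/output_1"],
    "step_2": ["input_2", "step_3/output_1"],
    "step_3": ["input_3", "input_4", "input_5", "input_6"],
}


def upstream_steps(sources):
    """Return steps referenced as `<step>/<output>`; plain names are workflow inputs."""
    return {src.split("/", 1)[0] for src in sources if "/" in src}


# Map each step to the set of steps whose outputs it consumes.
depends_on = {step: upstream_steps(srcs) for step, srcs in step_sources.items()}

# static_order() yields every step after all of its dependencies.
order = list(TopologicalSorter(depends_on).static_order())
print(order[0])  # step_3
```

The relative order of `step_1` and `step_2` is not fixed, since neither depends on the other.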
This source diff could not be displayed because it is stored in LFS. You can view the blob instead.
%% Cell type:code id: tags:
``` python
import requests
import os
import pandas as pd
```
%% Cell type:code id: tags:parameters
``` python
out_folder = "../data/covidtracking/"
PAPERMILL_OUTPUT_PATH = None
```
%% Cell type:code id: tags:injected-parameters
``` python
# Parameters
-PAPERMILL_INPUT_PATH = "/tmp/1thbz0bs/notebooks/process/download-covidtracking-data.ipynb"
+PAPERMILL_INPUT_PATH = "/tmp/3ek7e62r/notebooks/process/download-covidtracking-data.ipynb"
PAPERMILL_OUTPUT_PATH = "runs/download-covidtracking-data.runs.ipynb"
out_folder = "data/covidtracking"
```
%% Cell type:markdown id: tags:
# Download state metadata
Download a dataset of URLs for data for each US state and several territories. See [Google Doc](https://docs.google.com/spreadsheets/d/18oVRrHj3c183mHmq3m89_163yuYltLNlOmPerQ18E8w/htmlview?sle=true).
%% Cell type:code id: tags:
``` python
url = 'http://covidtracking.com/api/states/info'
r = requests.get(url, allow_redirects=True)
states_metadata_json = r.content
```
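The download cell keeps `r.content` without checking that the request succeeded or that the body is JSON, so an error page could end up saved as `states-metadata.json`. A minimal sketch of an early check follows. It swaps `requests` for the standard library's `urllib` so that it has no third-party dependencies, and the function name `fetch_json_bytes` and the 30-second timeout are assumptions, not part of the notebook.

```python
import json
from urllib.request import urlopen


def fetch_json_bytes(url, timeout=30.0):
    """Download `url` and return the raw bytes, failing early on a bad payload."""
    # urlopen follows redirects and raises urllib.error.HTTPError on 4xx/5xx.
    with urlopen(url, timeout=timeout) as resp:
        payload = resp.read()
    # json.loads raises ValueError (JSONDecodeError) if the body is not JSON.
    json.loads(payload)
    return payload
```

With `requests`, the comparable checks would be passing `timeout=` to `requests.get` and calling `r.raise_for_status()` before using `r.content`.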
%% Cell type:code id: tags:
``` python
# save the result
if PAPERMILL_OUTPUT_PATH:
    out_path = os.path.join(out_folder, 'states-metadata.json')
    with open(out_path, 'wb') as f:
        f.write(states_metadata_json)
```
%% Cell type:code id: tags:
``` python
metadata_df = pd.read_json(states_metadata_json)
print(len(metadata_df), "states and territories have metadata")
metadata_df.head(2)
```
%%%% Output: stream
56 states and territories have metadata
%%%% Output: execute_result
   state                                      covid19SiteOld  \
0     AK  http://dhss.alaska.gov/dph/Epi/id/Pages/COVID-...
1     AL  http://www.alabamapublichealth.gov/infectiousd...

                                         covid19Site  \
0  http://dhss.alaska.gov/dph/Epi/id/Pages/COVID-...
1  https://alpublichealth.maps.arcgis.com/apps/op...

                                covid19SiteSecondary          twitter  \
0  http://dhss.alaska.gov/dph/Epi/id/Pages/COVID-...     @Alaska_DHSS
1             https://dph1.adph.state.al.us/covid-19/  @alpublichealth

        pui    pum                                              notes  fips  \
0  All data  False  Total tests are taken from the annotations on ...     2
1   No data  False   Negatives = (Totals - Positives) \nPositives o...     1

      name
0   Alaska
1  Alabama
%% Cell type:markdown id: tags:
# Download daily state data
%% Cell type:code id: tags:
``` python
url = 'https://covidtracking.com/api/states/daily'
r = requests.get(url, allow_redirects=True)
states_daily_json = r.content
```
%% Cell type:code id: tags:
``` python
# save the result
if PAPERMILL_OUTPUT_PATH:
    out_path = os.path.join(out_folder, 'states-daily.json')
    with open(out_path, 'wb') as f:
        f.write(states_daily_json)
```
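The two save cells differ only in file name and payload, and both write only when `PAPERMILL_OUTPUT_PATH` is set, that is, when the notebook runs under papermill rather than interactively. A minimal sketch of that shared logic as one helper follows. The name `save_result` and the `os.makedirs` call are additions for illustration; the notebook itself assumes `out_folder` already exists.

```python
import os


def save_result(content, out_folder, filename, papermill_output_path=None):
    """Write `content` (bytes) to out_folder/filename, but only under papermill."""
    if not papermill_output_path:
        # Interactive run: PAPERMILL_OUTPUT_PATH keeps its default of None, skip saving.
        return None
    # Assumption: create the folder if it is missing (the notebook does not do this).
    os.makedirs(out_folder, exist_ok=True)
    out_path = os.path.join(out_folder, filename)
    with open(out_path, "wb") as f:
        f.write(content)
    return out_path
```

Under these assumptions, the second save cell would reduce to `save_result(states_daily_json, out_folder, 'states-daily.json', PAPERMILL_OUTPUT_PATH)`.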
%% Cell type:code id: tags:
``` python
data_df = pd.read_json(states_daily_json)
print(len(data_df), "data points")
data_df.head(2)
```
%%%% Output: stream
-3377 data points
+3433 data points
%%%% Output: execute_result
        date  state  positive  negative  pending  hospitalizedCurrently  \
-0  20200504     AK     370.0   21353.0      NaN                   12.0
-1  20200504     AL    8025.0   95092.0      NaN                    NaN
+0  20200505     AK     371.0   22321.0      NaN                   13.0
+1  20200505     AL    8285.0   98481.0      NaN                    NaN

    hospitalizedCumulative  inIcuCurrently  inIcuCumulative  \
 0                     NaN             NaN              NaN
-1                  1064.0             NaN            411.0
+1                  1107.0             NaN            428.0

-   onVentilatorCurrently  ...  hospitalized     total  totalTestResults  \
-0                    NaN  ...           NaN   21723.0           21723.0
-1                    NaN  ...        1064.0  103117.0          103117.0
-
-     posNeg  fips  deathIncrease  hospitalizedIncrease  negativeIncrease  \
-0   21723.0     2            0.0                   0.0             143.0
-1  103117.0     1            6.0                  29.0           10317.0
+   onVentilatorCurrently  ...  hospitalized   total  totalTestResults  posNeg  \
+0                    NaN  ...           NaN   22692             22692   22692
+1                    NaN  ...        1107.0  106766            106766  106766
+
+   fips  deathIncrease  hospitalizedIncrease  negativeIncrease  \
+0     2            0.0                   0.0             968.0
+1     1           17.0                  43.0            3389.0

    positiveIncrease  totalTestResultsIncrease
-0               2.0                     145.0
-1             300.0                   10617.0
+0               1.0                     969.0
+1             260.0                    3649.0

-[2 rows x 25 columns]
+[2 rows x 27 columns]
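The changed output is consistent with one extra day of data: the newest `date` moves from 20200504 to 20200505, and the row count grows by 3433 - 3377 = 56, which matches the 56 states and territories reported by the metadata cell. A small sketch of that arithmetic, using only numbers that appear in the outputs; the helper name `parse_api_date` is invented here, and reading the `date` column as a `YYYYMMDD` integer is an assumption based on the values shown.

```python
from datetime import datetime


def parse_api_date(value):
    """Parse an integer such as 20200504 into a datetime.date (assumed YYYYMMDD)."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Values copied from the notebook outputs before and after the rerun.
old_rows, new_rows = 3377, 3433
n_states = 56  # "56 states and territories have metadata"

days_added = (parse_api_date(20200505) - parse_api_date(20200504)).days
rows_added = new_rows - old_rows

# One new row per state or territory for each added day.
print(days_added, rows_added, rows_added == days_added * n_states)  # 1 56 True
```

This is a plausibility check, not proof that every state contributed exactly one row.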