Commit 3229456c authored by Pauline Maury Laribière

Merge branch 'r_apis' into 'master'

R apis

See merge request pauline.maury-laribiere/metadata-auto-r-library!1
parents 3d30d334 59ecec16
Pipeline #271446 passed with stage
in 29 seconds
^.*\.Rproj$
^\.Rproj\.user$
......@@ -2,36 +2,102 @@
## Introduction
This is a Renku project - basically a git repository with some
bells and whistles. You'll find we have already created some
useful things like `data` and `notebooks` directories and
a `Dockerfile`.
This repository aims to simplify access to the [Swiss Federal Statistical Office](https://www.bfs.admin.ch/bfs/en/home.html) metadata.
Following the implementation in the [interoperability platform](https://www.i14y.admin.ch) and the [SIS portal](https://sharepoint.admin.ch/edi/bfs/fr-ch/News/Pages/go-life-neues-sis-portals.aspx), the APIs are made available here in R.
This public library is made available for the internal FSO staff, the federal administration and for external actors.
## Working with the project
## Installation
The simplest way to start your project is right from the Renku
platform - just click on the `Environments` tab and start a new session.
This will start an interactive environment right in your browser.
You can install the library with
```
install.packages("fso.metadata")
```
To work with the project anywhere outside the Renku platform,
click the `Settings` tab where you will find the
git repo URLs - use `git` to clone the project on whichever machine you want.
Then, at the beginning of your R script, load it with
```
library("fso.metadata")
```
### Changing interactive environment dependencies
Initially we install a very minimal set of packages to keep the images small.
However, you can add python and conda packages in `requirements.txt` and
`environment.yml`, and R packages to `install.R` (listed as, for example,
`install.packages("ggplot2")`), to your heart's content. If you need more fine-grained
control over your environment, please see [the documentation](https://renku.readthedocs.io/en/latest/user/advanced_interfaces.html#dockerfile-modifications).
## Functionalities
Depending on the metadata you want, call the corresponding function with the parameters described below.
## Project configuration
### Codelists
1. Export a codelist based on an identifier
```
codelist <- get_codelist(identifier, export_format, version_format, annotations)
```
Project options can be found in `.renku/renku.ini`. In this
project there is currently only one option, which specifies
the default type of environment to open, in this case `/rstudio`.
Parameters:
- identifier ("character"): the codelist's identifier
- export_format ("character", default="SDMX-ML"): the export's format.
Available are CSV, XLSX, SDMX-ML or SDMX-JSON.
- version_format ("numeric", default=2.1): the export format's version
(2.0 or 2.1 when format is SDMX-ML).
- annotations ("logical", default=FALSE): flag to include annotations
Returns:
- codelist: the response, whose type depends on the export format
- a data.frame if export_format was CSV or XLSX
- a json if export_format was SDMX-ML or SDMX-JSON.
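As a rough illustration of what this call requests, the sketch below rebuilds the export URL in base R. It follows the `CodeLists/{id}/exports/{export_format}/{version_format}` pattern and the `https://www.i14y.admin.ch` base URL used in the package source; `build_codelist_url` is a hypothetical helper written for this example, not a function exported by `fso.metadata`.

```r
# Hypothetical helper: composes the URL that a codelist export would target.
# It only builds a string and performs no network request.
build_codelist_url <- function(identifier,
                               export_format = "SDMX-ML",
                               version_format = 2.1,
                               annotations = FALSE,
                               base_url = "https://www.i14y.admin.ch") {
  # Path part, following the pattern found in the package's URL mapping
  path <- paste0(
    "CodeLists/", identifier,
    "/exports/", export_format,
    "/", as.character(version_format)
  )
  # Query part: the logical flag is sent in lower case ("true" / "false")
  query <- paste0("annotations=", tolower(as.character(annotations)))
  paste0(base_url, "/api/", path, "?", query)
}

url <- build_codelist_url("CL_NOGA_SECTION")
print(url)
# "https://www.i14y.admin.ch/api/CodeLists/CL_NOGA_SECTION/exports/SDMX-ML/2.1?annotations=false"
```

Printing the URL this way can help when comparing a request with the Swagger UI of the interoperability platform.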
## Moving forward
Once you feel at home with your project, we recommend that you replace
this README file with your own project documentation! Happy data wrangling!
\ No newline at end of file
### Data Structures
1. Get the data structure
```
data_structure <- get_data_structure(identifier, language)
```
Parameters:
- identifier ("character"): the nomenclature's identifier
- language ("character", default='en'): the language of the response data.
Available are 'fr', 'de', 'it', 'en'.
Returns:
- data_structure: data structure
### Nomenclatures
1. Export one level of a nomenclature
```
one_level_df <- get_nomenclature_one_level(identifier, filters, level_number, language, annotations)
```
Parameters:
- identifier ("character"): nomenclature's identifier
- level_number ("numeric"): level to export
- filters (hash::hash): additional filters
- language ("character", default='en'): response data's language
Available are 'fr', 'de', 'it', 'en'.
- annotations ("logical", default=FALSE): flag to include annotations
Returns:
- response (data.frame): dataframe with 3 columns
(Code, Parent and Name in the selected language)
2. Export multiple levels of a nomenclature (from `level_from` to `level_to`)
```
multiple_levels_df <- get_nomenclature_multiple_levels(identifier, filters, level_from, level_to, language, annotations)
```
Parameters:
- identifier ("character"): nomenclature's identifier
- level_from ("numeric"): the 1st level to include
- level_to ("numeric"): the last level to include
- filters (hash::hash): additional filters
- language ("character", default='en'): response data's language
Available are 'fr', 'de', 'it', 'en'.
- annotations ("logical", default=FALSE): flag to include annotations
Returns:
- multiple_levels_df (data.frame): dataframe columns from `level_from` to `level_to` codes
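To make the role of `filters` more concrete, here is a minimal sketch of how such filters can be flattened into the query string of a nomenclature request, in the spirit of the package's internal `hash_to_string()` (documented as turning `a = 1, b = 2, 3` into `a=1&b=2&b=3`). It is an assumption-laden stand-in: `filters_to_query` is a hypothetical name, and a plain named list replaces the `hash::hash` object so the snippet runs without extra packages.

```r
# Hypothetical stand-in for the package's internal hash_to_string():
# a plain named list plays the role of the hash::hash object.
filters_to_query <- function(filters) {
  pairs <- unlist(lapply(names(filters), function(prop) {
    # One "key=value" pair per value, so repeated keys are preserved
    paste0(prop, "=", unlist(filters[[prop]]))
  }))
  paste(pairs, collapse = "&")
}

my_filters <- list(
  AF_ACTIVE = list("0"),
  AFC_ISCO_REDUCED_LIST = list("1")
)

# Query string for a multiple-levels export from level 2 to level 5
query <- paste0(
  "language=en&levelFrom=2&levelTo=5&annotations=false&",
  filters_to_query(my_filters)
)
print(query)
# "language=en&levelFrom=2&levelTo=5&annotations=false&AF_ACTIVE=0&AFC_ISCO_REDUCED_LIST=1"
```

Each filter value becomes its own `key=value` pair, which is why a key with several values appears several times in the query.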
As the APIs continue to be implemented, further functionalities will be added.
## Background
All the APIs made available in this library are also documented in Swagger UI should you want to do more experiments through a UI.
- [Here](https://www.i14y.admin.ch/api/index.html) for APIs of the interoperability platform (public).
- [Here](https://dcat.app.cfap02.atlantica.admin.ch/api/index.html) for dcat APIs (internal to configuration).
## Example
Examples for each API are provided in the [R Markdown](https://renkulab.io/gitlab/pauline.maury-laribiere/metadata-auto-r-library/-/blob/r_apis/example.Rmd).
name: "base"
channels:
- defaults
# dependencies:
# - add packages here
# - one per line
prefix: "/opt/conda"
\ No newline at end of file
---
title: "Examples R Markdown"
author: "Pauline Maury Laribière"
date: "21/10/2021"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
# Installation
You can install the library with
```{r install}
install.packages("fso.metadata")
```
Then, at the beginning of your R script, load it with
```{r library}
library("fso.metadata")
```
# Available everywhere with the interoperability platform (i14y)
## Code list
```{r codelist, echo=FALSE}
get_codelist(identifier='CL_NOGA_SECTION')
```
## Data Structure
```{r , echo=FALSE}
get_data_structure(identifier='HCL_NOGA', language='it')
```
## Nomenclature
```{r , echo=FALSE}
my_filters <- hash::hash(
'AF_ACTIVE'= list("0"),
'AFC_ISCO_REDUCED_LIST' = list("1")
)
```
```{r , echo=FALSE}
get_nomenclature_one_level(identifier='HCL_CH_ISCO_19_PROF', filters=my_filters, level_number=2)
```
```{r , echo=FALSE}
get_nomenclature_multiple_levels(identifier='HCL_CH_ISCO_19_PROF', filters=my_filters, level_from=2, level_to=5)
```
install.packages(c("devtools", "roxygen2", "document", "styler"))
install.packages(c("httr", "jsonlite", "glue", "hash"))
devtools::install_github("opensdmx/rsdmx")
library(hash)
library(methods)
library(glue)
library(httr)
library(jsonlite)
library(rsdmx)
\ No newline at end of file
......@@ -10,4 +10,8 @@ NumSpacesForTab: 2
Encoding: UTF-8
RnwWeave: Sweave
LaTeX: pdfLaTeX
\ No newline at end of file
LaTeX: pdfLaTeX
BuildType: Package
PackageUseDevtools: Yes
PackageInstallArgs: --no-multiarch --with-keep.source
Package: fso.metadata
Type: Package
Title: Metadata Auto R
Version: 0.0.1
Date: 2021-10-22
Authors@R: person("Pauline", "Maury Laribiere", email = "pauline.maury-laribiere@bfs.admin.ch", role = c("aut", "cre"))
Author: Pauline Maury Laribiere [aut, cre]
Maintainer: Pauline Maury Laribiere <pauline.maury-laribiere@bfs.admin.ch>
Description: This package aims to simplify the access to the Swiss Federal Statistical Office metadata. Following the implementation in the interoperability platform and the SIS portal, the APIs are made available here in R. This public library is made available for the internal FSO staff, the federal administration and for external actors.
Depends: R (>= 3.1.0)
License: GPL (>= 2)
Encoding: UTF-8
URL: https://renkulab.io/gitlab/pauline.maury-laribiere/metadata-auto-r-library
LazyData: true
Imports:
data.table,
dplyr,
glue,
hash,
httr,
jsonlite,
methods,
rsdmx
RoxygenNote: 7.1.1
# Generated by roxygen2: do not edit by hand
export(get_codelist)
export(get_data_structure)
export(get_nomenclature_multiple_levels)
export(get_nomenclature_one_level)
importFrom(methods,new)
importFrom(utils,read.csv)
#' Get a codelist based on an identifier
#'
#' @param identifier the codelist's identifier
#' @param export_format the export's format
#' Available are CSV, XLSX, SDMX-ML or JSON.
#' @param version_format the export format's version
#' (2.0 or 2.1 when format is SDMX-ML).
#' @param annotations flag to include annotations
#'
#' @return response based on the export format
#' @export
get_codelist <- function(identifier,
export_format = "SDMX-ML",
version_format = 2.1,
annotations = FALSE) {
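api <- api_class(
api_type = "codelist",
export_format = export_format,
# Forward the requested version; otherwise the class default (2.1) is always used
version_format = version_format,
parameters = glue::glue("annotations={tolower(annotations)}"),
id = identifier
)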
api$get_response()
}
#' Get the data structure
#'
#' @param identifier the dataset's identifier
#' @param language the language of the response data
#' Available are 'fr', 'de', 'it', 'en'.
#'
#' @return data structure
#' @export
get_data_structure <- function(identifier, language = "en") {
api <- api_class(
api_type = "dcat_data_structure",
id = identifier,
language = language
)
api$get_response()
}
#' Get one level of a nomenclature
#'
#' @param identifier nomenclature's identifier
#' @param filters additional filters
#' @param level_number level to export
#' @param language the language of the response data
#' Available are 'fr', 'de', 'it', 'en'.
#' @param annotations flag to include annotations
#'
#' @return dataframe with 3 columns
#' (Code, Parent and Name in the selected language)
#' @export
get_nomenclature_one_level <- function(identifier,
filters,
level_number = 1,
language = "en",
annotations = FALSE) {
parameters <- glue::glue("language={language}&level={level_number}&annotations={tolower(annotations)}&{hash_to_string(filters)}")
api <- api_class(
api_type = "nomenclature_one_level",
id = identifier,
parameters = parameters,
export_format = "CSV"
)
api$get_response()
}
#' Get multiple levels of a nomenclature (from `level_from` to `level_to`)
#'
#' @param identifier nomenclature's identifier
#' @param filters additional filters
#' @param level_from the 1st level to include
#' @param level_to the last level to include
#' @param language the language of the response data
#' Available are 'fr', 'de', 'it', 'en'.
#' @param annotations flag to include annotations
#'
#' @return dataframe columns
#' from `level_from` to `level_to` codes
#' @export
get_nomenclature_multiple_levels <- function(identifier,
filters,
level_from = 1,
level_to = 2,
language = "en",
annotations = FALSE) {
parameters <- glue::glue("language={language}&levelFrom={level_from}&levelTo={level_to}&annotations={tolower(annotations)}&{hash_to_string(filters)}")
api <- api_class(
api_type = "nomenclature_multiple_levels",
id = identifier,
parameters = parameters,
export_format = "CSV"
)
api$get_response()
}
#' Api class to make appropriate request based on parameters
#'
#' @field api_type character. The name of the api to call (see url_mapping)
#' @field export_format character (default = "JSON"). The export's format
#' Available are CSV, XLSX, SDMX-ML and JSON
#' @field parameters character. Additional request parameters
#' @field id character. The identifier or id of the request's object
#' @field language character (default = "en"). The language of the response data.
#' Available are 'fr', 'de', 'it', 'en'
#' @field version_format numeric (default = 2.1). The export format's version
#' (2.0 or 2.1 when format is SDMX-ML)
#' (for 'codelist')
#' @field api_url character. The url to make the request to.
#'
#' @importFrom methods new
api_class <- setRefClass(
"Api",
fields = list(
api_type = "character",
export_format = "character",
parameters = "character",
id = "character",
language = "character",
version_format = "numeric",
api_url = "character"
),
methods = list(
initialize = function(...,
export_format = "JSON",
parameters = "",
id = "",
language = "en",
version_format = 2.1) {
callSuper(
...,
export_format = export_format,
parameters = parameters,
id = id,
language = language,
version_format = version_format,
)
get_url(id, export_format, version_format, language)
},
get_response = function() {
request_function <- REQUEST_FUNCTION_MAPPING[[export_format]]
if (parameters == "") {
url <- glue::glue("{BASE_URL}/api/{api_url}")
} else {
url <- glue::glue("{BASE_URL}/api/{api_url}?{parameters}")
}
request_function(url)
},
get_url = function(id, export_format, version_format, language) {
url_mapping <- hash::hash(
"codelist" =
glue::glue("CodeLists/{id}/exports/{export_format}/{version_format}"),
"dcat_data_structure" =
glue::glue("DataStructures/{id}/{language}"),
"nomenclature_one_level" =
glue::glue("Nomenclatures/{id}/levelexport/CSV"),
"nomenclature_multiple_levels" =
glue::glue("Nomenclatures/{id}/multiplelevels/CSV")
)
api_url <<- url_mapping[[api_type]]
}
)
)
# Root URL constants
BASE_URL <- "https://www.i14y.admin.ch"
#' API query for SDMX output
#'
#' @param url url to query
#'
#' @return dataframe response
sdmx_request <- function(url) {
as.data.frame(rsdmx::readSDMX(url))
}
#' API query for JSON output
#'
#' @param url url to query
#'
#' @return response: dataframe or list of dataframes
json_request <- function(url) {
jsonlite::fromJSON(rawToChar(httr::GET(url)$content))
}
#' API query for CSV output
#'
#' @param url url to query
#' @importFrom utils read.csv
#'
#' @return dataframe response
csv_request <- function(url) {
read.csv(url)
}
# Request function based on expected response
REQUEST_FUNCTION_MAPPING <- hash::hash(
"SDMX-ML" = sdmx_request,
"JSON" = json_request,
"CSV" = csv_request
)
#' Transform a hash object into a string of parameters
#' hash::hash("a" = list("1"), "b" = list("2", "3"))
#' becomes "a=1&b=2&b=3"
#'
#' @param filters hash object
#'
#' @return formatted string of parameters
hash_to_string <- function(filters) {
string <- ""
for (prop in ls(filters)) {
for (value in filters[[prop]]) {
if (string == "") {
string <- glue::glue("{prop}={value}")
} else {
string <- paste(string, glue::glue("{prop}={value}"), sep = "&")
}
}
}
string
}
\ No newline at end of file
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/api_class.R
\docType{class}
\name{Api-class}
\alias{Api-class}
\alias{api_class}
\title{Api class to make appropriate request based on parameters}
\description{
Api class to make appropriate request based on parameters
}
\section{Fields}{
\describe{
\item{\code{api_type}}{character. The name of the api to call (see url_mapping)}
\item{\code{export_format}}{character (default = "JSON"). The export's format
Available are CSV, XLSX, SDMX-ML and JSON}
\item{\code{parameters}}{character. Additional request parameters}
\item{\code{id}}{character. The identifier or id of the request's object}
\item{\code{language}}{character (default = "en"). The language of the response data.
Available are 'fr', 'de', 'it', 'en'}
\item{\code{version_format}}{numeric (default = 2.1). The export format's version
(2.0 or 2.1 when format is SDMX-ML)
(for 'codelist')}
\item{\code{api_url}}{character. The url to make the request to.}
}}
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/format_request.R
\name{csv_request}
\alias{csv_request}
\title{API query for CSV output}
\usage{
csv_request(url)
}
\arguments{
\item{url}{url to query}
}
\value{
dataframe response
}
\description{
API query for CSV output
}
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/api_call.R
\name{get_codelist}
\alias{get_codelist}
\title{Get a codelist based on an identifier}
\usage{
get_codelist(
identifier,
export_format = "SDMX-ML",
version_format = 2.1,
annotations = FALSE
)
}
\arguments{
\item{identifier}{the codelist's identifier}
\item{export_format}{the export's format
Available are CSV, XLSX, SDMX-ML or JSON.}
\item{version_format}{the export format's version
(2.0 or 2.1 when format is SDMX-ML).}
\item{annotations}{flag to include annotations}
}
\value{
response based on the export format
}
\description{
Get a codelist based on an identifier
}
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/api_call.R
\name{get_data_structure}
\alias{get_data_structure}
\title{Get the data structure}
\usage{
get_data_structure(identifier, language = "en")
}
\arguments{
\item{identifier}{the dataset's identifier}
\item{language}{the language of the response data
Available are 'fr', 'de', 'it', 'en'.}
}
\value{
data structure
}
\description{
Get the data structure
}
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/api_call.R
\name{get_nomenclature_multiple_levels}
\alias{get_nomenclature_multiple_levels}
\title{Get multiple levels of a nomenclature (from `level_from` to `level_to`)}
\usage{
get_nomenclature_multiple_levels(
identifier,
filters,
level_from = 1,
level_to = 2,
language = "en",
annotations = FALSE
)
}
\arguments{
\item{identifier}{nomenclature's identifier}
\item{filters}{additional filters}
\item{level_from}{the 1st level to include}
\item{level_to}{the last level to include}
\item{language}{the language of the response data
Available are 'fr', 'de', 'it', 'en'.}