where $\mathbf{Z}$ is a "valid adjustment set". Usually, there are several valid adjustment sets. One set that always works is the set of parents of $X$, i.e., $\mathbf{Z} = \mathbf{PA}_X$.

%% Cell type:markdown id: tags:

We will apply formula (1) to a (famous) kidney stone data set. Let us assume that the causal structure looks as follows:

%% Cell type:code id: tags:

``` R
# plot the assumed causal graph; the edges can be read off the SEM given below
library(igraph)
g <- graph_from_literal(C -+ X, A -+ X, A -+ K, K -+ Y,
                        X -+ F, X -+ D, D -+ G, D -+ Y, Y -+ H)
plot(g)
# comment out the above lines if you cannot install igraph
```

%% Cell type:markdown id: tags:

Let us say, we are interested in the causal effect from $X$ to $Y$, i.e., in the intervention distribution $p^{do(X:=x)} (y)$.

Computing equation (1) is often difficult, even if we are given full knowledge of the observational distribution, that is, if we are given the density $p$ (especially in the case of continuous variables and high-dimensional $\mathbf{Z}$). In this example, we are not even given $p$ but only a sample from $p$ and estimating conditional distributions is a hard statistical problem. Fortunately, the joint distribution $p$ is Gaussian and things become easier. Equation (1) then implies

\begin{equation} \tag{2}

\mathbf{E}^{do(X:=x)} [Y] = \alpha x,

\end{equation}

where $\alpha$ is sometimes called the causal effect from $X$ to $Y$. We can obtain $\alpha$ as the coefficient of $X$ when linearly regressing $Y$ on $X$ and $\mathbf{Z}$.

%% Cell type:markdown id: tags:

### Exercise 2:

1. Estimate $\mathbf{E}^{do(X:=3)} [Y]$ from the data in ```data.loaded```.

2. We have regressed $Y$ on $X$ *and* $\mathbf{Z}$ --- thus the name "adjusting for $\mathbf{Z}$". Do you see what goes wrong when we try to estimate the causal effect $\alpha$ by regressing $Y$ on $X$ without adjusting for $\mathbf{Z}$?

%% Cell type:markdown id: tags:

### Solution 2:

%% Cell type:code id: tags:

``` R

```
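%% Cell type:markdown id: tags:

One possible approach (a sketch, not necessarily the intended solution): the parents of $X$ are $C$ and $A$, so $\mathbf{Z} = \{C, A\}$ is a valid adjustment set. The snippet below regenerates the data from the SEM given further down (so that it is self-contained) and then reads off $\alpha$ as the coefficient of $X$.

%% Cell type:code id: tags:

``` R
# regenerate the data exactly as in the notebook's SEM (same seed and draw order)
set.seed(1); n <- 200
C <- rnorm(n)
A <- 0.8*rnorm(n)
K <- A + 0.1*rnorm(n)
X <- C - 2*A + 0.2*rnorm(n)
F <- 3*X + 0.8*rnorm(n)
D <- -2*X + 0.5*rnorm(n)
G <- D + 0.5*rnorm(n)
Y <- 2*K - D + 0.2*rnorm(n)
H <- 0.5*Y + 0.1*rnorm(n)

# adjust for the parents of X: regress Y on X, C and A
fit <- lm(Y ~ X + C + A)
alpha.hat <- coef(fit)["X"]
3 * alpha.hat  # estimate of E^{do(X:=3)}[Y]; should be close to 6
```

Plugging the SEM into the regression shows why this works: $Y = 2X + 2A + (0.2 N_K - 0.5 N_D + 0.2 N_Y)$, and the noise term is independent of $(X, C, A)$, so the population coefficient of $X$ is exactly $\alpha = 2$.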

%% Cell type:markdown id: tags:

### End of Solution 2

%% Cell type:markdown id: tags:

Whether or not a set is a valid adjustment set can be answered using the notion of d-separation (e.g., the set should block the path $X \leftarrow A \rightarrow K \rightarrow Y$). Since d-separation is an important concept, we would like to revise it here.

In brief, given a DAG, the (disjoint) sets $Q$ and $R$ are d-separated by a set $S$ if all paths between $Q$ and $R$ are blocked by $S$. A path $i_1, i_2, \ldots, i_m$ is blocked by $S$ if there is a node $i_k$ on the path ($1 < k < m$) such that one of the following conditions holds:

* ${i_k} \in {S}$ and

$$

{i_{k-1}} \rightarrow {i_k} \rightarrow {i_{k+1}} \quad \text{or} \quad {i_{k-1}} \leftarrow {i_k} \leftarrow {i_{k+1}} \quad \text{or} \quad {i_{k-1}} \leftarrow {i_k} \rightarrow {i_{k+1}},

$$

* neither ${i_k}$ nor any of its descendants is in ${S}$ and

$$

{i_{k-1}} \rightarrow {i_k} \leftarrow {i_{k+1}}.

$$

The data that we have loaded comes from an SEM. One can show that the distribution is Markov with respect to the corresponding graph. This means that d-separation implies conditional independence. We will try to verify this with a few examples.

%% Cell type:markdown id: tags:

### Exercise 3:

Write down two d-separation statements that hold and two that do not hold. Test for conditional and unconditional independence using the functions ```CondIndTest``` from package ```CondIndTests``` and ```dhsic.test``` from package ```dHSIC```, respectively. Use the data in data.loaded; you can access variables using ```data.loaded[,"X"]```, for example. Do you find the correct conditional independences? Note that conditional independence testing is a difficult statistical problem, especially if the conditioning set is large.

%% Cell type:markdown id: tags:

### Solution 3:

%% Cell type:code id: tags:

``` R

```
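%% Cell type:markdown id: tags:

A sketch of one way to check such statements (not necessarily the intended solution): for example, $C$ and $A$ are d-separated by the empty set (the path $C \rightarrow X \leftarrow A$ contains the collider $X$), whereas $C$ and $A$ are *not* d-separated by $\{X\}$ (conditioning on the collider opens the path). Since the distribution here is Gaussian, conditional independence is equivalent to vanishing partial correlation, so instead of ```CondIndTest``` one can also regress out the conditioning variable and test the residual correlation:

%% Cell type:code id: tags:

``` R
# regenerate the data from the SEM (same seed and draw order as in the notebook)
set.seed(1); n <- 200
C <- rnorm(n); A <- 0.8*rnorm(n); K <- A + 0.1*rnorm(n)
X <- C - 2*A + 0.2*rnorm(n); F <- 3*X + 0.8*rnorm(n)
D <- -2*X + 0.5*rnorm(n); G <- D + 0.5*rnorm(n)
Y <- 2*K - D + 0.2*rnorm(n); H <- 0.5*Y + 0.1*rnorm(n)

# C _||_ A (d-separated by the empty set): typically a large p-value
cor.test(C, A)$p.value

# C _||_ A | X does NOT hold (X is a collider): a tiny p-value
res.C <- resid(lm(C ~ X))
res.A <- resid(lm(A ~ X))
cor.test(res.C, res.A)$p.value
```

The same checks with ```dhsic.test(C, A)``` and ```CondIndTest``` should lead to the same conclusions on this data set.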

%% Cell type:markdown id: tags:

### End of Solution 3

%% Cell type:markdown id: tags:

If we are given the full SEM (this will often not be the case), there is a different strategy for computing causal effects. We will use this to validate our estimate from Exercise 2. In fact, the data that we have provided above was generated by the following SEM:

\begin{align*}

C &:= N_C\\

A &:= 0.8 N_A\\

K &:= A + 0.1 N_K\\

X &:= C - 2 A + 0.2 N_X\\

F &:= 3 X + 0.8 N_F\\

D &:= -2 X + 0.5 N_D\\

G &:= D + 0.5 N_G\\

Y &:= 2 K - D + 0.2 N_Y\\

H &:= 0.5 Y + 0.1 N_H

\end{align*}

with all $N$'s being jointly independent and having a standard normal distribution.

%% Cell type:code id: tags:

``` R

set.seed(1); n <- 200

C <- rnorm(n)

A <- 0.8*rnorm(n)

K <- A + 0.1*rnorm(n)

X <- C - 2*A + 0.2*rnorm(n)

F <- 3*X + 0.8*rnorm(n)

D <- -2*X + 0.5*rnorm(n)

G <- D + 0.5*rnorm(n)

Y <- 2*K - D + 0.2*rnorm(n)

H <- 0.5*Y + 0.1*rnorm(n)

data.check <- cbind(C, A, K, X, F, D, G, Y, H)

```

%% Cell type:markdown id: tags:

Indeed, this yields the same data:

%% Cell type:code id: tags:

``` R

head(data.loaded)

head(data.check)

```

%% Cell type:markdown id: tags:

### Exercise 4:

1. Generate data from the intervened SEM $do(X:=3)$ and compare your findings with the estimate computed above.

2. Look at the graph shown above and at the coefficients of the SEM. Compute $\mathbf{E}^{do(X:=3)}[Y]$ using the path method.
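%% Cell type:markdown id: tags:

For part 1, one possible sketch: replace the structural equation for $X$ by $X := 3$ and propagate through the SEM, leaving all other equations and noise variables unchanged.

%% Cell type:code id: tags:

``` R
# simulate from the intervened SEM do(X := 3)
set.seed(1); n <- 10000
C <- rnorm(n)
A <- 0.8*rnorm(n)
K <- A + 0.1*rnorm(n)
X <- rep(3, n)   # intervention: the structural equation for X is replaced by X := 3
F <- 3*X + 0.8*rnorm(n)
D <- -2*X + 0.5*rnorm(n)
G <- D + 0.5*rnorm(n)
Y <- 2*K - D + 0.2*rnorm(n)
H <- 0.5*Y + 0.1*rnorm(n)
mean(Y)  # should be close to E^{do(X:=3)}[Y] = 6
```

For part 2: the only directed path from $X$ to $Y$ is $X \rightarrow D \rightarrow Y$, with edge coefficients $-2$ and $-1$, so $\alpha = (-2) \cdot (-1) = 2$ and hence $\mathbf{E}^{do(X:=3)}[Y] = 2 \cdot 3 = 6$, matching the simulation and the regression estimate from Exercise 2.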