# Exercise 1 - Classifying newswires: A multiclass classification example
In this exercise, you will build a model to classify Reuters newswires into 46 mutually exclusive topics. Because we have many classes, this problem is an instance of multiclass classification, and because each data point should be classified into only one category, the problem is more specifically an instance of single-label multiclass classification.
If each data point could belong to multiple categories (in this case, topics), we’d be facing a multilabel multiclass classification problem.
%% Cell type:markdown id: tags:
### The Reuters dataset
You’ll work with the _Reuters_ dataset, a set of short newswires and their topics, published by Reuters in 1986. It’s a simple, widely used toy dataset for text classification. There are 46 different topics; some topics are more represented than others, but each topic
has at least 10 examples in the training set.
Like IMDB and MNIST, the Reuters dataset comes packaged as part of Keras. Let’s take a look.
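Loading it is a one-liner (a sketch; `num_words=10000`, which restricts the data to the 10,000 most frequently occurring words, is an assumed cutoff matching common usage of this dataset):

%% Cell type:code id: tags:

``` python
from tensorflow.keras.datasets import reuters

# Keep only the 10,000 most frequent words (assumed cutoff) so the
# vectorized data stays a manageable size
(train_data, train_labels), (test_data, test_labels) = reuters.load_data(
    num_words=10000)

print(len(train_data), "training examples,", len(test_data), "test examples")
```

%% Cell type:markdown id: tags: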
To vectorize the labels, there are two possibilities:
1. you can cast the label list as an integer tensor, or
2. you can use one-hot encoding.
One-hot encoding is a widely used format for categorical data, also called categorical encoding. In this case, one-hot encoding of the labels consists of representing each label as an all-zero vector of length 46 with a 1 in the place of the label index.
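To make this concrete, here is a manual version of the encoding (a sketch in NumPy; the built-in `to_categorical` utility used in the next cell does the same thing):

%% Cell type:code id: tags:

``` python
import numpy as np

def to_one_hot(labels, dimension=46):
    # One row per label: all zeros except a 1 at the label's index
    results = np.zeros((len(labels), dimension))
    for i, label in enumerate(labels):
        results[i, label] = 1.0
    return results
```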
%% Cell type:code id: tags:
``` python
from tensorflow.keras.utils import to_categorical

# One-hot encode the integer labels into 46-dimensional vectors
y_train = to_categorical(train_labels)
y_test = to_categorical(test_labels)
```
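%% Cell type:markdown id: tags:
The other option (1. above) is to keep the labels as an integer tensor. The only downstream change this requires is compiling the model with the `sparse_categorical_crossentropy` loss instead of `categorical_crossentropy`. A sketch:

%% Cell type:code id: tags:

``` python
import numpy as np

# Keep the labels as plain integer arrays instead of one-hot vectors;
# pair these with the sparse_categorical_crossentropy loss
y_train_int = np.asarray(train_labels).astype("int64")
y_test_int = np.asarray(test_labels).astype("int64")
```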
%% Cell type:markdown id: tags:
### TODO: Building your Fully Connected Neural Network Model
In this topic-classification problem, we are trying to classify short snippets of text into one of 46 output classes.
In a stack of Dense layers like those we’ve been using, each layer can only access information present in the output of the previous layer.
If one layer drops some information relevant to the classification problem, that information can never be recovered by later layers: each layer can potentially become an information bottleneck. Because the model needs to learn to separate 46 different classes, layers that are too small could permanently drop relevant information. For this reason, we’ll use larger layers: let’s go with two hidden layers of 64 units each, as sketched below.
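A minimal sketch of such a model (the 46-way `softmax` output produces a probability distribution over the topics, and `categorical_crossentropy` matches the one-hot labels prepared above; the `rmsprop` optimizer is an assumed choice):

%% Cell type:code id: tags:

``` python
from tensorflow import keras
from tensorflow.keras import layers

# Two 64-unit hidden layers, wide enough to avoid an information
# bottleneck, followed by a 46-way softmax output
model = keras.Sequential([
    layers.Dense(64, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(46, activation="softmax"),
])
model.compile(optimizer="rmsprop",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```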
%% Cell type:markdown id: tags:
# Exercise 2 - Recurrent Neural Networks for Prediction of Temperature Time Series
Throughout this exercise, all of our code examples will target a single problem: predicting the temperature 24 hours in the future, given a timeseries of hourly measurements of quantities such as atmospheric pressure and humidity, recorded over the recent past by a set of sensors on the roof of a building. As you will see, it’s a fairly challenging
problem!
We’ll use this temperature-forecasting task to highlight what makes timeseries data fundamentally different from the kinds of datasets you’ve encountered so far. You’ll see that densely connected networks and convolutional networks aren’t well-equipped to deal with this kind of dataset, while recurrent neural networks (RNNs) really shine on this type of problem.
We’ll work with a weather timeseries dataset recorded at the weather station at the Max Planck Institute for Biogeochemistry in Jena, Germany. In this dataset, 14 different quantities (such as temperature, pressure, humidity, wind direction, and so on) were recorded every 10 minutes over several years. The original data goes back to 2003, but the subset of the data we’ll download is limited to 2009–2016.
Let’s start by downloading and uncompressing the data. One way to do this from a notebook, assuming the Keras-hosted S3 mirror is still available, is:
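%% Cell type:code id: tags:

``` python
# Download the Jena climate archive (2009-2016) and unpack the CSV
# (assumes the standard Keras-hosted mirror is reachable)
!wget https://s3.amazonaws.com/keras-datasets/jena_climate_2009_2016.csv.zip
!unzip jena_climate_2009_2016.csv.zip
```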