@@ -58,3 +58,13 @@ y_t = g(X_1,X_2,X_3, ..., X_n)
and this is where the recursive nature comes from. Note that the objective of training is to find the W matrices that best fit our data.
### Creating a model architecture
So far we have talked about a mathematical model that learns the time dependencies in the observations by introducing $`h_t`$. One practical question at this stage is how we, as users, want to construct our custom model for a given time series dataset X.
Let's take our dataset and notebook as an example. We have time vs. load data, coupled with temperature. We decided to use a sliding window of 4 to create median and std data for load, and we pass the temperature T as an additional feature. Here T can be the current temperature, the mean of the last few temperatures, or, even better, the difference between the current and past temperatures (which tells us whether it is increasing or decreasing). In the end, we have 4 features. Next, we need to decide how much of the past is relevant for making predictions about the future. To make it easier to plot, I will take it as 3 time steps here. Our data representation becomes (1, 3, 4) for one training instance: 1 instance, 3 time steps, and 4 features per time step.
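
To make the (1, 3, 4) shape concrete, here is a minimal sketch in plain Python of how such an instance could be built. It is not the notebook's actual code: the helper names `make_features` and `make_sequences`, the toy readings, the choice of the raw load as the fourth feature, and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import median, pstdev


def make_features(load, temp, window=4):
    """Return one row [load, rolling median, rolling std, temperature] per time step.

    Assumption: the raw load is used as the fourth feature.
    The first `window - 1` steps are dropped because the window is incomplete there.
    """
    rows = []
    for t in range(window - 1, len(load)):
        recent = load[t - window + 1 : t + 1]  # the last `window` load values
        # pstdev is the population std; a notebook may use the sample std instead.
        rows.append([load[t], median(recent), pstdev(recent), temp[t]])
    return rows


def make_sequences(rows, lookback=3):
    """Stack `lookback` consecutive feature rows into one training instance."""
    return [rows[i : i + lookback] for i in range(len(rows) - lookback + 1)]


# Made-up toy readings, for illustration only.
load = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0, 16.0, 18.0]
temp = [20.0, 21.0, 21.5, 22.0, 23.0, 22.5, 24.0, 25.0]

rows = make_features(load, temp)   # 8 - 3 = 5 rows of 4 features each
sequences = make_sequences(rows)   # 5 - 3 + 1 = 3 training instances
one_instance = [sequences[0]]      # a batch holding a single instance

# Print (instances, time steps, features)
print(len(one_instance), len(one_instance[0]), len(one_instance[0][0]))
```

With these toy inputs the last line prints `1 3 4`. Swapping `temp[t]` for `temp[t] - temp[t - 1]` would give the temperature-difference variant mentioned above without changing the shape.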
What are the model architecture options, and how are they related to our practical goal? Let's first discuss some popular ideas (you can also get creative here in any way you want).
...