Training and validation sets
This is the Mathematica companion notebook for our Training and Validation Sets exercise. You may need to Make Your Own Copy before starting. See the menu above.
Create data
As before, let us create some synthetic data. The points will be generated by a polynomial with some noise added. You can change the polynomial if you so desire; it is the input to the makeData function. The data will be split into a training set and a validation set. When prompted to automatically evaluate the initialization cell, answer “YES”.
In[]:=
data=makeData[x^3-2x^2+x-3];range=40;dataPlot=plotData[data,range]
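The makeData function is defined in the initialization cell, so its details are not visible here. To give a rough idea of what such a function might do, here is a minimal sketch under our own assumptions: the name sketchMakeData, the number of points, the noise level, the interval for x, and the 70/30 split are all hypothetical and are not taken from the notebook's actual definition.
In[]:=
(* Hypothetical stand-in for makeData; the notebook's own definition may differ. *)
SeedRandom[1];
sketchMakeData[poly_, n_: 60, noise_: 2, frac_: 0.7] :=
 Module[{xs, pts, shuffled, k},
  (* Assumed sampling interval for x. *)
  xs = RandomReal[{-3, 3}, n];
  (* Evaluate the polynomial at each x and add Gaussian noise. *)
  pts = Table[{t, (poly /. x -> t) + RandomVariate[NormalDistribution[0, noise]]}, {t, xs}];
  (* Shuffle, then split into training and validation points. *)
  shuffled = RandomSample[pts];
  k = Round[frac*n];
  <|"train" -> Take[shuffled, k], "validation" -> Drop[shuffled, k]|>]
sketchData = sketchMakeData[x^3 - 2 x^2 + x - 3];
(* Number of points in each part of the split. *)
Length /@ sketchData
The point of the sketch is only that the validation points come from the same noisy process as the training points but are held back from the fitting step.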
Fit polynomials of various degrees
Using only the training data, we can have Mathematica compute the best-fitting polynomial. We do have to choose which degree to use. Evaluating the following cell will compute the best-fitting polynomial of that degree and report the mean squared error (MSE) for the points in the training data as well as for those in the validation data. Which do you expect to be larger?
In[]:=
degree=3;fitPolynomial[data,degree]
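If you are curious how the two errors could be computed by hand, the following is a minimal sketch using the built-in Fit function. It is not the notebook's fitPolynomial. It generates its own small data set so that it can be run on its own, and the names f, pts, train, valid, deg, fit and mse, as well as the noise level and split sizes, are our own assumptions.
In[]:=
(* Illustrative only; this does not use or modify the notebook's data variable. *)
SeedRandom[2];
f[t_] := t^3 - 2 t^2 + t - 3;
pts = Table[{t, f[t] + RandomVariate[NormalDistribution[0, 2]]}, {t, RandomReal[{-3, 3}, 60]}];
(* Hold back 18 of the 60 points for validation. *)
{train, valid} = TakeDrop[RandomSample[pts], 42];
deg = 3;
(* Least-squares fit of 1, x, ..., x^deg using the training points only. *)
fit = Fit[train, x^Range[0, deg], x];
(* Mean squared error of the fitted polynomial on a set of {x, y} points. *)
mse[set_] := Mean[(set[[All, 2]] - (fit /. x -> set[[All, 1]]))^2];
{mse[train], mse[valid]}
The polynomial is chosen to minimize the squared error on the training points, so the first number is the quantity the fit was optimized for, while the second is measured on points the fit never saw.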
Now do the same thing, but this time graphing the results. Choose a variety of degrees for the polynomial, let’s say as high as thirty, and describe what happens as the degree increases. What happens in the interval where we have missing data?
In[]:=
degree=3;range=30;fitPolynomialPlot[data,degree,range]
We can plot the training and validation errors versus n, the degree of the polynomial we are using to fit our data. Which degree should we use to form our model?
In[]:=
range=10;plotErrors[data,range]
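One common way to answer this question is to pick the degree with the smallest validation error. The following is a minimal sketch of that idea, again on its own small data set. It is not the notebook's plotErrors, and the names g, samples, trainSet, validSet, mseFor and bestDegree, along with the noise level and the range of degrees, are our own assumptions.
In[]:=
(* Illustrative only; this does not use or modify the notebook's data variable. *)
SeedRandom[3];
g[t_] := t^3 - 2 t^2 + t - 3;
samples = Table[{t, g[t] + RandomVariate[NormalDistribution[0, 2]]}, {t, RandomReal[{-3, 3}, 60]}];
{trainSet, validSet} = TakeDrop[RandomSample[samples], 42];
(* Fit a degree-n polynomial to the training set, then measure its MSE on the given set. *)
mseFor[n_, set_] := Module[{p = Fit[trainSet, x^Range[0, n], x]},
  Mean[(set[[All, 2]] - (p /. x -> set[[All, 1]]))^2]];
degrees = Range[1, 10];
trainErr = mseFor[#, trainSet] & /@ degrees;
validErr = mseFor[#, validSet] & /@ degrees;
(* Degree whose fit has the smallest validation error. *)
bestDegree = degrees[[First[Ordering[validErr, 1]]]]
ListLinePlot[{Transpose[{degrees, trainErr}], Transpose[{degrees, validErr}]},
 PlotLegends -> {"training", "validation"}, AxesLabel -> {"n", "MSE"}]
Up to numerical error, the training error cannot increase as n grows, because every polynomial of degree n is also a polynomial of degree n+1 with leading coefficient zero. The validation error has no such guarantee, which is why it is the more useful guide for choosing n.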
More complex data
Now let’s do the same with a higher degree polynomial. First create synthetic data using a higher degree polynomial than before:
In[]:=
data=makeData[(x-2)(x-2.6)(x+1)(x+2.5)(x-0.5)(x+1.2)];range=30;dataPlot=plotData[data,range]
And then fit polynomials of various degrees:
In[]:=
degree=15;range=30;fitPolynomialPlot[data,degree,range]
In[]:=
range=50;plotErrors[data,range]
Even more complex data
And finally, what if the function underlying the original data is not even a polynomial?
In[]:=
data=makeData[20*Exp[-Sqrt[Abs[x]]]*Sin[x^2]];range=10;dataPlot=plotData[data,range]
In[]:=
degree=15;range=15;fitPolynomialPlot[data,degree,range]
In[]:=
range=50;plotErrors[data,range]