Underfitting/Overfitting Demo

Here we'll consider a simple example of linear regression with polynomial features, because it's the easiest to visualize. We'll use the following generative model:

$$x \sim \mathrm{Unif}[-1, 1], \qquad \phi_k(x) = x^{k-1}, \qquad y \mid x \sim \mathcal{N}(0.8 \tanh(2x), \sigma^2)$$

for $\phi : [-1, 1] \to \mathbb{R}^d$ being our feature map, $w_{\mathrm{gen}} \in \mathbb{R}^d$ being a parameter vector, and $\mathcal{N}(\mu, \sigma^2)$ denoting the Gaussian distribution with mean $\mu$ and variance $\sigma^2$. We'll start by setting $\sigma = 0.3$ and $d = 12$, and make the training set size 10 and the validation set size 30.
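As a concrete sketch, the generative model above can be sampled with NumPy as follows (the function names and the fixed seed are my own choices, not part of the demo):

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

def phi(x, d=12):
    """Polynomial feature map: phi_k(x) = x^(k-1) for k = 1, ..., d."""
    return np.stack([x**k for k in range(d)], axis=-1)

def sample_dataset(n, sigma=0.3):
    """Draw x ~ Unif[-1, 1] and y | x ~ N(0.8 * tanh(2x), sigma^2)."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = 0.8 * np.tanh(2.0 * x) + sigma * rng.normal(size=n)
    return x, y

x_train, y_train = sample_dataset(10)  # training set of size 10
x_val, y_val = sample_dataset(30)      # validation set of size 30
```

With $d = 12$, `phi` maps each scalar input to a 12-dimensional feature vector $(1, x, x^2, \dots, x^{11})$, which is what makes the linear model expressive enough to overfit 10 training points.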

The diagram visualizes what is going on. On the left, we plot the dataset (training set in blue, validation set in red) along with the predictor $\hat{y} = h(x)$ in green for a particular value of the ridge regression parameter $\lambda$. On the right, we plot the training loss, validation loss, and expected loss over the source distribution against $\lambda$, on a log-log plot. Every time you refresh the page you get a new dataset, and you can also click and drag the points to change the dataset.
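The loss curves in the right panel can be reproduced with the closed-form ridge estimator $w = (\Phi^\top \Phi + \lambda I)^{-1} \Phi^\top y$. Here is a minimal sketch, assuming squared-error loss and that $\lambda$ is not rescaled by the sample size (the demo's exact conventions may differ):

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(x, d=12):
    # Polynomial features: phi_k(x) = x^(k-1), k = 1..d
    return np.stack([x**k for k in range(d)], axis=-1)

def ridge_fit(X, y, lam):
    # Closed-form ridge solution: w = (X^T X + lam * I)^{-1} X^T y
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

# Sample train/validation sets from the generative model above
x_tr = rng.uniform(-1, 1, 10)
y_tr = 0.8 * np.tanh(2 * x_tr) + 0.3 * rng.normal(size=10)
x_va = rng.uniform(-1, 1, 30)
y_va = 0.8 * np.tanh(2 * x_va) + 0.3 * rng.normal(size=30)
X_tr, X_va = phi(x_tr), phi(x_va)

# Sweep lambda on a log grid, as in the log-log plot
for lam in np.logspace(-6, 1, 8):
    w = ridge_fit(X_tr, y_tr, lam)
    print(f"lambda={lam:.1e}  train={mse(X_tr, y_tr, w):.4f}  val={mse(X_va, y_va, w):.4f}")
```

Small $\lambda$ drives the training loss toward zero (overfitting, since $d = 12 > 10$ training points), while very large $\lambda$ shrinks $w$ toward zero and underfits; the validation loss is typically minimized somewhere in between.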

[Interactive figure. Example readout for one dataset: train loss 0.00591, validation loss 0.00591, expected population loss 0.0379.]