Skip to main content

Homework 3: Polynomial Regression

Due: 2024-10-01 23:59:00EDT

Overview

The goals of this week’s assignment are:

In this exercise, you will work through linear and polynomial regression. Our data consists of inputs \(x_{i} \in \mathbb{R}\) and outputs \(y_{i} \in \mathbb{R}\), which are related through a target function \(y = f(x) + \epsilon\). Your goal is to learn a linear predictor \(h_{w}(x)\) that best approximates \(f(x)\).

This assignment is to be done individually. Discussing high-level ideas with others in the class is encouraged, but you should not be sharing code in any way.

Provided Code

Numpy Background

We used numpy before in Homework01, but we’ll use more of its functionality in this homework. It is a good skill to pick up since you will inevitably use \(\texttt{numpy}\) if you plan to do math in Python, e.g. for machine learning. You may find it useful to work through a numpy tutorial first.1

Here are some things to keep in mind as you complete this problem:

Useful numpy functions:

1. Visualization

Visualizations can help us understand the data. For this data set, you can use a scatter plot since the data has only two properties to plot (\(x\) and \(y\)).

Hint: As you implement the remaining exercises, use this plotting function as a debugging tool.

2. Linear Regression

Recall that the objective of linear regression is to minimize the cost function:

\[J(w) = \dfrac{1}{2} \sum_{i=1}^{n} (h_{w}(x_{i}) - y)^{2}\]

In this problem, we will use the matrix-vector form where

\[y = \begin{pmatrix} y_{1} \\ y_{2} \\ \ldots \\ y_{n} \end{pmatrix}, \hspace{25pt} X = \begin{pmatrix} --x_{1}-- \\ --x_{2}--\\ \ldots \\ --x_{n}-- \end{pmatrix}, \hspace{25pt} w = \begin{pmatrix} w_{1} \\ w_{2} \\ \ldots \\ w_{p} \end{pmatrix}\]

where each example \(x_{i} = [1, x_{i1}, \ldots , x_{ip}]\). Our linear regression model is: \(h_{w}(x) = w^{T}x\).

Beginning with the simple linear regression case (\(p = 1\)), we have: \(h_{w}(x) = w_{0} + w_{1}x\)

\[w_{j} \leftarrow w_{j} - \alpha (h_{w}(x_{i}) - y_{i}) x_{ij} \text{ (simultaneously update wj for all j).}\]

With each step of gradient descent, our parameters \(w_{j}\) come closer to the optimal values that will achieve the lowest cost \(J(w)\).

2. Polynomial Regression

Now let us consider the more complicated case of polynomial regression, where, for a polynomial of degree \(d\), our hypothesis is

\[h_{w}(x) = w^{T}x = w_{0} + w_{1}x + w_{2}x^{2} + \ldots +w_{d}x^d.\]

Recall that polynomial regression can be considered as an extension of linear regression in which we replace our input matrix \(X\) with

\[\Phi = \begin{pmatrix} --\phi(x_{1})-- \\ --\phi(x_{2})--\\ \ldots \\ --\phi(x_{n})-- \end{pmatrix},\]

where \(\phi(x)\) is a function such that \(\phi_{k}(x) = x^k\) for \(k = 0, \ldots , d\).

Given \(n\) training examples, it is always possible to obtain a “perfect fit” (a fit in which all the data points are exactly predicted) by setting the degree of the regression to \(n - 1\) (for example, in Handout 2 we had \(n = 2\) points and a polynomial of deg \(d = 1\) fit them perfectly). Of course, we would expect such a fit to generalize poorly. In the remainder of this problem, you will investigate the problem of overfitting as a function of the degree of the polynomial, \(d\). To measure overfitting, we will use the Root-Mean-Square (RMS) error, defined as

\[E_{RMS} = \sqrt{\dfrac{2\mathbb{E}[w]}{n}},\]

where

\[\mathbb{E}[w] = \dfrac{1}{2}\sum_{i=1}^{n}(\sum_{k=0}^{d} w_{k}\phi_{k}(x_{i}) - y)^2 = \dfrac{1}{2}\sum_{i=1}^{n}(h_{w}(x_{i})-y_{i})^2\]

and \(n\) is the number of examples.

3. Regularization

Finally, we will explore the role of regularization. For this problem, we will use \(L_{2}\)-regularization so that our regularized objective function is

\[J(w) = \dfrac{1}{2}\sum_{i=1}^{n}(h_{w}(x_{i})- y_{i})^{2} + \dfrac{\lambda}{2}\sum_{k=1}^{d}w_{k}^2,\]

again optimizing for parameters \(w\).

FAQ

How do I know if my code is working?
Run the provided unit tests provided in tests.py. Initially, many of them will fail. If your program is working correctly, all of them will pass. However, the converse is not true: passing all the tests is not sufficient to demonstrate that your code is working. Using tests.py is strongly encouraged, this is similar to how your code will be graded on gradescope.

README.md

In a text file called README.md answer the following questions:

  1. How much time did you spend on the homework
  2. What did you learn from this homework
  3. optional: What did you struggle with during this homework
  4. optional: any other feedback you would like to share

Submitting

Submit the following files to the assignment called HW02 on Gradescope:

  1. PolynomialRegression.py
  2. run_regression.py
  3. Your readme

Make sure to name these files exactly what we specify here. Otherwise, our autograders might not work and we might have to take points off. Add any other files that you created that are needed for your code to run.

Acknowledgements

Modified from lab by: Sara Mathieson, Ameet Soni.

  1. Try out Numpy’s tutorial (https://numpy.org/doc/stable/user/quickstart.html). Those familiar with Matlab may find the “Numpy for Matlab Users” documentation (https://numpy.org/doc/stable/user/numpy-for-matlab-users.html) more helpful.