$\newcommand{\xv}{\mathbf{x}} \newcommand{\Xv}{\mathbf{X}} \newcommand{\yv}{\mathbf{y}} \newcommand{\zv}{\mathbf{z}} \newcommand{\av}{\mathbf{a}} \newcommand{\Wv}{\mathbf{W}} \newcommand{\wv}{\mathbf{w}} \newcommand{\tv}{\mathbf{t}} \newcommand{\Tv}{\mathbf{T}} \newcommand{\muv}{\boldsymbol{\mu}} \newcommand{\sigmav}{\boldsymbol{\sigma}} \newcommand{\phiv}{\boldsymbol{\phi}} \newcommand{\Phiv}{\boldsymbol{\Phi}} \newcommand{\Sigmav}{\boldsymbol{\Sigma}} \newcommand{\Lambdav}{\boldsymbol{\Lambda}} \newcommand{\half}{\frac{1}{2}} \newcommand{\argmax}[1]{\underset{#1}{\operatorname{argmax}}} \newcommand{\argmin}[1]{\underset{#1}{\operatorname{argmin}}}$

Exercises

These exercises are meant to guide your self-study of concepts that will be used in upcoming assignments.

They will not be checked in, and they will not be graded.

The ReLU activation function is

$$ f(x, w) = \begin{cases} 0, & x^T w \le 0\\ x^T w, & x^T w > 0 \end{cases} $$

Define a Python function named relu that accepts two arguments: $\Xv$, a matrix of input samples, one per row, and $\Wv$, a matrix of weights. If $\Xv$ is $N \times (D+1)$, then $\Wv$ must be $(D+1) \times M$, where $M$ is the number of units.

In [ ]:
def relu(X, W):
    ...
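
As a starting point, here is one possible sketch of relu. It assumes NumPy is available and that X and W are two-dimensional arrays with compatible shapes; the small X and W used to try it out are made-up values, not part of the exercise.

```python
import numpy as np

def relu(X, W):
    """Apply ReLU to the linear combination X @ W.

    X is N x (D+1), W is (D+1) x M, so the result is N x M.
    """
    # np.maximum compares elementwise, replacing negative values with 0
    return np.maximum(0, X @ W)

# Illustrative values only: two samples, D+1 = 2 inputs, M = 1 unit
X = np.array([[1.0, 2.0],
              [1.0, -3.0]])
W = np.array([[0.5],
              [1.0]])
Y = relu(X, W)
print(Y)
```

The first sample gives $x^T w = 2.5$, which passes through unchanged; the second gives $-2.5$, which is clipped to zero.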

Write the gradient of $f(x,w)$ with respect to $w$. Use LaTeX math expressions.
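
One way to write it, following the piecewise definition above: the function is constant where $x^T w \le 0$ and linear in $w$ where $x^T w > 0$. Strictly speaking, $f$ is not differentiable at $x^T w = 0$; assigning the value $0$ there is a common convention, taken here as an assumption.

$$ \nabla_w f(x, w) = \begin{cases} 0, & x^T w \le 0\\ x, & x^T w > 0 \end{cases} $$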

Implement this gradient.

In [ ]:
def grad_relu(X, W):
    ...
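
A possible sketch of grad_relu follows. The exercise does not say what shape the result should have, so this version assumes a three-dimensional array of shape $N \times (D+1) \times M$, where entry [n, :, m] holds the gradient of unit m's output for sample n with respect to that unit's weight column. It also assumes the gradient is taken to be zero where $x^T w = 0$. The test values are made up for illustration.

```python
import numpy as np

def grad_relu(X, W):
    """Gradient of relu(X, W) with respect to the weights.

    Returns an array G of shape N x (D+1) x M (an assumed layout), where
    G[n, :, m] is X[n] if X[n] @ W[:, m] > 0, and all zeros otherwise.
    """
    # 1.0 where the unit is active (x^T w > 0), 0.0 elsewhere; shape N x M
    active = (X @ W > 0).astype(float)
    # Broadcast X (N x (D+1) x 1) against active (N x 1 x M)
    return X[:, :, np.newaxis] * active[:, np.newaxis, :]

# Illustrative values only: two samples, D+1 = 2 inputs, M = 1 unit
X = np.array([[1.0, 2.0],
              [1.0, -3.0]])
W = np.array([[0.5],
              [1.0]])
G = grad_relu(X, W)
print(G.shape)
```

For the first sample the unit is active, so its gradient is the input row itself; for the second sample the unit is inactive and the gradient is all zeros.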

Define a new Python class named LinearModel with mostly empty methods named train and use. These methods should just print "In train" and "In use", respectively, and return nothing. Also define the constructor and the other methods shown in the following partial implementation.

In [1]:
class LinearModel():
    
    def __init__(self):
        ...
        
    def __repr__(self):
        ...
        
    def __str__(self):
        ...
        
    ...
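
For comparison, here is one minimal way the skeleton might be filled in. The attribute W, the strings returned by __repr__ and __str__, and the choice to give train and use no arguments are all assumptions made for illustration; the exercise only fixes the method names and the printed messages.

```python
class LinearModel():

    def __init__(self):
        # Hypothetical attribute: a placeholder for weights set later by train
        self.W = None

    def __repr__(self):
        # Assumed format: a string that looks like the constructor call
        return 'LinearModel()'

    def __str__(self):
        # Assumed format: a short human-readable description
        return 'LinearModel with no weights yet'

    def train(self):
        # Placeholder body, as the exercise asks; implicitly returns None
        print('In train')

    def use(self):
        # Placeholder body, as the exercise asks; implicitly returns None
        print('In use')

# Make an instance and run every method
model = LinearModel()
print(repr(model))
print(str(model))
train_result = model.train()
use_result = model.use()
```

Because train and use have no return statement, both calls evaluate to None, which matches the requirement that they return nothing.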

Test your class definition by making an instance of the class and running all of the methods.

In [ ]: