# %load /Users/facai/Study/book_notes/preconfig.py
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(color_codes=True)
#sns.set(font='SimHei')
plt.rcParams['axes.grid'] = False
#from IPython.display import SVG
def show_image(filename, figsize=None, res_dir=True):
    # Display an image file inline; optionally set the figure size
    # and resolve the path relative to the local ./res directory.
    if figsize:
        plt.figure(figsize=figsize)
    if res_dir:
        filename = './res/{}'.format(filename)
    plt.imshow(plt.imread(filename))
The best-fitting model is a large model that has been regularized appropriately.
\begin{equation} \tilde{J}(\theta; X, y) = J(\theta; X, y) + \alpha \Omega(\theta) \end{equation}
where $\Omega(\theta)$ is a parameter norm penalty.
Typically, the norm penalty is applied only to the weights of the affine transformation at each layer, leaving the biases unregularized.
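A minimal NumPy sketch (the gradients here are hypothetical stand-ins, not the book's code) of how an $L^2$ weight penalty changes the gradient step for the weights $W$ while the bias update is left untouched:

import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(5, 3))         # weights of the affine transform: regularized
b = np.zeros(3)                     # bias: left unregularized
alpha, lr = 1e-2, 1e-1
grad_W = rng.normal(size=W.shape)   # stand-in for dJ/dW
grad_b = rng.normal(size=b.shape)   # stand-in for dJ/db

# Omega(theta) = 0.5 * ||W||^2 contributes alpha * W to the weight
# gradient only; the bias update is the plain gradient step.
W -= lr * (grad_W + alpha * W)
b -= lr * grad_b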
The sparsity induced by $L^1$ regularization can be used as a feature selection mechanism.
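A quick illustration of sparsity as feature selection, using scikit-learn (an assumption; the notebook itself only imports matplotlib and seaborn): on data where only three features matter, the $L^1$-penalized fit zeroes out most coefficients, while the $L^2$ fit keeps all of them nonzero.

import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.5, 1.0]            # only 3 informative features
y = X @ w_true + 0.1 * rng.normal(size=200)

lasso = Lasso(alpha=0.1).fit(X, y)       # L1 penalty
ridge = Ridge(alpha=0.1).fit(X, y)       # L2 penalty
print('L1 nonzero coefs:', np.sum(lasso.coef_ != 0))   # few: features selected
print('L2 nonzero coefs:', np.sum(ridge.coef_ != 0))   # all 20 stay nonzero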
Constrain $\Omega(\theta)$ to be less than some constant $k$ by constructing a generalized Lagrange function:
\begin{equation} \mathcal{L}(\theta, \alpha; X, y) = J(\theta; X, y) + \alpha (\Omega(\theta) - k) \end{equation}
In practice, column norm limitation is always implemented as an explicit constraint with reprojection, as sketched below.
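A sketch of the reprojection step (hypothetical weight matrix W): after each gradient update, any column whose norm exceeds $k$ is rescaled back onto the constraint set.

import numpy as np

def project_column_norms(W, k):
    # Reproject: rescale any column whose L2 norm exceeds k
    # back onto the constraint set {W : ||W[:, j]|| <= k}.
    norms = np.linalg.norm(W, axis=0, keepdims=True)
    scale = np.minimum(1.0, k / np.maximum(norms, 1e-12))
    return W * scale

W = np.random.randn(8, 4) * 3.0
W = project_column_norms(W, k=1.0)
print(np.linalg.norm(W, axis=0))   # all column norms now <= 1.0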
Many linear models rely on inverting $X^\top X$; with $L^2$ regularization we invert $X^\top X + \alpha I$ instead, and the regularized matrix is guaranteed to be invertible.
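A quick numeric check in NumPy: with fewer examples than features, $X^\top X$ is singular, but adding $\alpha I$ makes it full rank.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 5))            # fewer examples than features
A = X.T @ X                            # 5x5 but rank 3: singular
print(np.linalg.matrix_rank(A))        # 3
A_reg = A + 0.1 * np.eye(5)            # X^T X + alpha * I
print(np.linalg.matrix_rank(A_reg))    # 5: invertible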
Dataset augmentation: create fake data and add it to the training set, as sketched below.
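A minimal augmentation sketch (NumPy only; `images` is a hypothetical batch of grayscale images): generate extra examples with label-preserving transformations such as flips and small shifts.

import numpy as np

def augment(images):
    # images: (n, h, w) array; return the originals plus horizontally
    # flipped and slightly shifted copies as extra training data.
    flipped = images[:, :, ::-1]
    shifted = np.roll(images, shift=1, axis=2)
    return np.concatenate([images, flipped, shifted], axis=0)

fake = augment(np.random.rand(4, 28, 28))
print(fake.shape)   # (12, 28, 28)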
Goal: learn a representation so that examples from the same class have similar representations.
show_image("fig7_2.png")
Run training until the validation set error has not improved for some amount of time.
Use the parameters that achieved the lowest validation set error over the whole training run.
show_image("fig7_3.png", figsize=[10, 8])
Place a penalty on the activations of the units in a neural network, encouraging them to be sparse.
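A sketch of such a sparse-representation penalty (NumPy, with a hypothetical one-layer network): the $L^1$ term is placed on the hidden activations $h$, not on the parameters.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 10))          # batch of inputs
W = rng.normal(size=(10, 16))
beta = 1e-3

h = np.maximum(0, X @ W)               # hidden representation (ReLU)
data_loss = 0.0                        # placeholder for J(theta; X, y)
loss = data_loss + beta * np.sum(np.abs(h))   # penalty on h, not on W
print(loss)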
Because dropout reduces the effective capacity of a model, increase the size of the model when using dropout.
When extremely few labeled training examples are available, dropout is less effective.
show_image("fig7_8.png", figsize=[10, 8])