This notebook introduces the algorithms within Dask-GLM for Generalized Linear Models.
Starting the Dask Client is optional. It provides a dashboard that is useful for gaining insight into the computation.
The link to the dashboard becomes visible when you create the client below. We recommend keeping it open on one side of your screen while you work in the notebook on the other. Arranging the windows can take some effort, but seeing both at the same time is very useful when learning.
from dask.distributed import Client, progress
client = Client(processes=False, threads_per_worker=4,
                n_workers=1, memory_limit='2GB')
client
Client-5b571f45-0de1-11ed-a361-000d3a8f7959
Connection method: Cluster object | Cluster type: distributed.LocalCluster
Dashboard: http://10.1.1.64:8787/status
94f79b63
Workers: 1 | Total threads: 4 | Total memory: 1.86 GiB
Status: running | Using processes: False
Scheduler-079365e3-ef5b-4539-84b1-973599540812
Comm: inproc://10.1.1.64/9057/1 | Started: Just now
Comm: inproc://10.1.1.64/9057/4 | Dashboard: http://10.1.1.64:39345/status | Nanny: None
Local directory: /home/runner/work/dask-examples/dask-examples/machine-learning/dask-worker-space/worker-ptl8ho2_
from dask_glm.datasets import make_regression
X, y = make_regression(n_samples=200000, n_features=100, n_informative=5, chunksize=10000)
X
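As a quick sanity check on the sizes above, and assuming that `chunksize` splits the array along the sample axis only and that values are stored as 8-byte floats, the 200,000 rows would fall into 20 blocks of 10,000 rows each. The sketch below works that out in plain Python; the helper name `row_chunks` is hypothetical and is not part of Dask or Dask-GLM.

```python
def row_chunks(n_samples, chunksize):
    """Return the row count of each block when rows are split by chunksize."""
    full, rest = divmod(n_samples, chunksize)
    return [chunksize] * full + ([rest] if rest else [])


chunks = row_chunks(200_000, 10_000)
n_blocks = len(chunks)                      # number of row blocks
floats_per_block = 10_000 * 100             # rows per block times n_features
mb_per_block = floats_per_block * 8 / 1e6   # assumes 8 bytes per value
total_mb = mb_per_block * n_blocks          # rough size of the whole array
```

Under these assumptions the full array is on the order of 160 MB, which is comfortably below the 2GB memory limit given to the client.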
import dask
X, y = dask.persist(X, y)
We also recommend watching the "Graph" dashboard during execution, if it is available.
import dask_glm.algorithms
b = dask_glm.algorithms.admm(X, y, max_iter=5)
b = dask_glm.algorithms.proximal_grad(X, y, max_iter=5)
/usr/share/miniconda3/envs/dask-examples/lib/python3.9/site-packages/dask/core.py:119: RuntimeWarning: overflow encountered in exp
  return func(*(_execute_task(a, cache) for a in args))
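To give a sense of what a proximal gradient solver does on each iteration, here is a minimal pure-Python sketch for an L1-penalized least-squares problem. It is not Dask-GLM's implementation: the function names (`soft_threshold`, `proximal_gradient_lasso`), the fixed step size, the loss, and the toy data are all assumptions made for illustration.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def proximal_gradient_lasso(X, y, lam, step, max_iter=100):
    """Minimize (1/2n) * ||X b - y||^2 + lam * ||b||_1 (illustrative only)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(max_iter):
        # Residuals of the current fit.
        resid = [sum(X[i][j] * beta[j] for j in range(p)) - y[i]
                 for i in range(n)]
        # Gradient of the smooth least-squares term.
        grad = [sum(X[i][j] * resid[i] for i in range(n)) / n
                for j in range(p)]
        # Gradient step on the smooth part, then the proximal step
        # (soft-thresholding) for the L1 part.
        beta = [soft_threshold(beta[j] - step * grad[j], step * lam)
                for j in range(p)]
    return beta


# Hypothetical toy data: y depends only on the first feature (y = 2 * x0).
X_toy = [[1.0, 0.5], [2.0, -0.5], [3.0, 0.5], [4.0, -0.5]]
y_toy = [2.0, 4.0, 6.0, 8.0]
beta = proximal_gradient_lasso(X_toy, y_toy, lam=0.1, step=0.1, max_iter=500)
# beta[0] ends up slightly below 2 (shrinkage); beta[1] is driven to zero.
```

The two-step structure (a gradient step on the smooth loss, then a proximal step for the penalty) is what lets this family of methods handle non-differentiable regularizers such as L1.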
The Dask-GLM project is nicely modular, allowing for different GLM families and regularizers, including a relatively straightforward interface for implementing custom ones.
import dask_glm.families
import dask_glm.regularizers
family = dask_glm.families.Poisson()
regularizer = dask_glm.regularizers.ElasticNet()
b = dask_glm.algorithms.proximal_grad(
    X, y,
    max_iter=5,
    family=family,
    regularizer=regularizer,
)
/usr/share/miniconda3/envs/dask-examples/lib/python3.9/site-packages/dask/core.py:119: RuntimeWarning: overflow encountered in exp
  return func(*(_execute_task(a, cache) for a in args))
dask_glm.families.Poisson??
dask_glm.regularizers.ElasticNet??
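In IPython, appending `??` to an object displays its source, which is a convenient way to see what a family or regularizer has to provide. As a rough, library-independent illustration of those two ingredients, the sketch below computes a Poisson negative log-likelihood (dropping the constant `log(y!)` term) and an elastic-net penalty in plain Python. The function names, the `weight` mixing parameter, and the exact scaling are assumptions for illustration and may differ from what the Dask-GLM source shows.

```python
import math


def poisson_neg_loglike(eta, y):
    """Poisson negative log-likelihood with a log link, up to a constant.

    eta holds the linear predictors (X @ beta); the mean is exp(eta).
    """
    return sum(math.exp(e) - yi * e for e, yi in zip(eta, y))


def elastic_net_penalty(beta, weight=0.5):
    """Mix of L1 and squared-L2 penalties (assumed parameterization)."""
    l1 = sum(abs(b) for b in beta)
    l2 = 0.5 * sum(b * b for b in beta)
    return weight * l1 + (1.0 - weight) * l2


def objective(eta, y, beta, lam=0.1):
    """Penalized objective: data-fit term plus lam times the penalty."""
    return poisson_neg_loglike(eta, y) + lam * elastic_net_penalty(beta)


# Hypothetical values: two observations, two coefficients.
value = objective(eta=[0.0, 0.0], y=[1, 2], beta=[1.0, -2.0], lam=0.1)
```

A solver only needs the data-fit term (and its gradient) from the family and the penalty (and its proximal operator) from the regularizer, which is why the two can be swapped independently.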