Rules:
- It must be a tutorial (a reproducible Jupyter notebook, an .ipynb file), i.e., you should teach some skills, not just express your thoughts on some theoretical concept
- It must not be just a translation of somebody else's material. You can borrow some material, but with fair links/citations
- The prerequisite for reading your tutorial should be basic ML as taught in mlcourse.ai. Do not write in-depth articles about neural nets, probabilistic programming, the Bayesian approach, reinforcement learning, etc.; topics like these need a thorough treatment. Sure, you can write such articles, but not in the format of mlcourse.ai tutorials. On the other hand, a tutorial should not be too simple either (e.g., taking some library and demonstrating only a couple of its methods)
- A typical tutorial should take 30-60 minutes to read and digest (though exceptions are possible)
- Check out the list of tutorials published in previous runs of this course (in Russian). Yes, it's in Russian, but Google Translate can give you some insight into which topics are already covered. For those who have already passed the course: translating somebody else's (or your own) tutorial into English will definitely not work
Here are a dozen sample topics (these are just examples; you can and should come up with your own):
- data collection: crawling, working with XML, JSON, and so on
- overview of dask library
- overview of Bokeh or another viz library
- decision trees with statistical tests in the nodes
- review of H2O library
- Poisson/quantile or another type of regression
- automation of machine learning: AutoML, TPOT
- interpretability of ML: LIME, reducing forests to trees
- review of some clustering method (with motivation for why it is needed)
- dimensionality reduction for visualization (there are methods besides t-SNE, e.g., from the manifold learning family)
- fastText
- methods of imputing missing data
- description of some Kaggle tricks
- counters in supervised learning tasks: WOE, smoothed likelihood, and other methods of engineering features from the target
- there is much more in Scikit-learn that we didn't cover (e.g., different methods of feature selection, feature hashing; you could cover nested CV or elastic net, for instance)
- methods of tree visualization in Python (other than standard graphviz)
- something about statistics, but presented so that a general audience can understand it
- something that better covers the course material, e.g. a broader review of Vowpal Wabbit
- data analysis with bash
- etc.
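To give a taste of what one of these topics might look like in notebook form, here is a minimal sketch of the "smoothed likelihood" counter mentioned above: a category is encoded by its target mean, shrunk toward the global mean. The toy data, column names, and smoothing constant `alpha` are illustrative assumptions, not part of any course material.

```python
import pandas as pd

# Toy dataset: one categorical feature and a binary target (illustrative data)
df = pd.DataFrame({
    "city": ["A", "A", "A", "B", "B", "C"],
    "target": [1, 1, 0, 0, 1, 1],
})

def smoothed_likelihood(df, cat_col, target_col, alpha=10.0):
    """Encode each category by its target mean, shrunk toward the global mean.

    alpha is a hyperparameter: the larger it is, the more rare categories
    are pulled toward the global target mean.
    """
    global_mean = df[target_col].mean()
    stats = df.groupby(cat_col)[target_col].agg(["sum", "count"])
    # Smoothed estimate: (sum + alpha * global_mean) / (count + alpha)
    encoding = (stats["sum"] + alpha * global_mean) / (stats["count"] + alpha)
    return df[cat_col].map(encoding)

df["city_enc"] = smoothed_likelihood(df, "city", "target")
```

Note that in a real tutorial you would also have to discuss target leakage: such encodings should be computed on out-of-fold data, not on the same rows they are applied to.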