Scanned Text Recognition
functionality (mostly done)
resolution detection, deep language modeling
font identification, reading order detection
upsampling, downsampling, better noise removal
character and word bounding boxes
training / data
larger, more diverse training sets
large scale self-supervised training (research)
Scanned Text Recognition (more)
other work
purely convolutional OCR (better suited to current accelerators)
replace line normalization, layout extraction with deep models
replace data augmentation with deep generative models (e.g., GANs)
better semantic segmentation (text, image, table, graph, figure, ...)
non-CTC models and/or automatic decoding
benchmarking of attention-based models
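For context on the CTC item above, the following is a minimal sketch of greedy (best-path) CTC decoding, the baseline that non-CTC and attention-based models would be compared against. It is not taken from the tutorial code: the function name, the `alphabet` argument, and the convention that class index 0 is the blank symbol are all assumptions made for illustration.

```python
from typing import List, Sequence

BLANK = 0  # assumption: class index 0 is the CTC blank symbol


def ctc_greedy_decode(scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Best-path CTC decoding over per-frame class scores.

    scores[t][k] is the score of class k at time step t; class k > 0
    corresponds to alphabet[k - 1] (hypothetical convention).
    """
    # 1. Pick the highest-scoring class in every frame.
    best: List[int] = [
        max(range(len(frame)), key=frame.__getitem__) for frame in scores
    ]
    # 2. Collapse consecutive repeats, then 3. drop blanks.
    chars = []
    prev = BLANK
    for idx in best:
        if idx != BLANK and idx != prev:
            chars.append(alphabet[idx - 1])
        prev = idx
    return "".join(chars)


# Frames: a, a, blank, a, b, b
frames = [
    [0.10, 0.80, 0.10],
    [0.10, 0.70, 0.20],
    [0.90, 0.05, 0.05],
    [0.20, 0.70, 0.10],
    [0.10, 0.20, 0.70],
    [0.10, 0.10, 0.80],
]
decoded = ctc_greedy_decode(frames, "ab")
```

The blank frame between the two runs of `a` is what allows a doubled letter to survive the repeat-collapsing step.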
Camera Captured Recognition
functionality
page boundary detection models
deep learning (DL) dewarping
DL depth estimation (from RGB, from stereo)
training / data
large collection of photographically captured images
automatic generation of photographically distorted images (ray tracing)
automatic DL-based data augmentation
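As a rough illustration of synthetic photographic distortion, the sketch below maps page coordinates through a planar homography. This is a much simpler stand-in for the ray-tracing approach named above (it cannot model page curl, lighting, or lens effects), and the function name and the example matrix are assumptions, not part of the tutorial.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def apply_homography(H: Sequence[Sequence[float]], points: Sequence[Point]) -> List[Point]:
    """Map 2D points through a 3x3 homography H (row-major nested lists)."""
    out = []
    for x, y in points:
        # Homogeneous coordinates: (x, y, 1) -> (u, v, w), then divide by w.
        u = H[0][0] * x + H[0][1] * y + H[0][2]
        v = H[1][0] * x + H[1][1] * y + H[1][2]
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        out.append((u / w, v / w))
    return out


# Hypothetical mild perspective tilt: points farther right shrink toward the origin.
H_tilt = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.001, 0.0, 1.0],
]
page_corners = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0), (0.0, 50.0)]
warped_corners = apply_homography(H_tilt, page_corners)
```

Warping the corner coordinates this way gives ground truth for page boundary detection for free, since the distorted corner positions are known exactly.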
Scene Text Recognition
functionality
DL text detection / extraction (reimplement standard convolutional models)
training / data
good datasets exist; train on them
benchmark CTC vs attention-based models
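A benchmark of CTC against attention-based models needs a shared metric. The notes do not name one, so the sketch below assumes character error rate (CER), computed as total Levenshtein edit distance divided by total reference length; the function names are illustrative.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions cost 1)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # delete r
                cur[j - 1] + 1,          # insert h
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def character_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Corpus-level CER: total edits over total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_chars = sum(len(r) for r in refs)
    return total_edits / total_chars if total_chars else 0.0


cer = character_error_rate(["abcd"], ["abxd"])
```

Running both model families' outputs through the same function on the same test lines keeps the comparison independent of how each model decodes.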