In almost every course on deep learning and TensorFlow, I see the same examples:
1. Image Classification (CNN)
2. Image Segmentation (CNN)
3. Text Classification (RNN)
4. Text Translation (RNN)
and mostly something to do with text, images, and speech.
Why do they form the foundation for anyone learning deep learning?
+ Image classification/segmentation is used as the basis for teaching deep learning because these tasks ingest images directly, with very little preprocessing. So when explaining these models, you can start with images (which everyone can easily understand) and then move on to the models themselves. That is why image classification is used as a stepping stone in books (before NLP tasks).
+ Text-related tasks are used mostly because of the popularity of NLP. As you can imagine, there's more textual data in the world than images. Therefore, NLP is gaining quite a bit of popularity.
It is also the case that most deep learning tasks have to do with images, text, and speech. So if a book is teaching deep learning, it must touch upon these topics.
PhD | Senior Data Scientist | AI/ML Educator
posted 1 month ago
Thanks, Thushan. Handwriting recognition and image-to-text conversion are a few use cases I could quote that are similar to image classification/segmentation and that combine text and images.
My mental picture of a CNN is that it is like recognizing an object by zooming in and out, then classifying or segmenting based on the recognized features.
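That zooming intuition can be sketched in a few lines of plain Python. This is only a toy illustration under my own assumptions, not how TensorFlow implements it: the 5x5 image and the hand-picked edge filter are made up, and a real CNN learns its filters from data. The convolution looks at small patches ("zoomed in") and the pooling step shrinks the result ("zoomed out") while keeping the strongest responses.

```python
def conv2d(image, kernel):
    """Slide a small kernel over a 2D list (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fit are dropped."""
    h = len(feature_map) // size
    w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(w)
        ]
        for i in range(h)
    ]


# Hypothetical 5x5 image: dark on the left (0), bright on the right (1).
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-picked filter that responds to a dark-to-bright vertical edge.
# In a trained CNN these weights would be learned, not written by hand.
edge_kernel = [[-1, 1],
               [-1, 1]]

features = conv2d(image, edge_kernel)  # 4x4 map, large where the edge is
pooled = max_pool(features)            # 2x2 summary of the same map

print(features)
print(pooled)
```

The feature map is non-zero only in the column where the brightness changes, and the pooled map still shows that the edge sits in the left half, at half the resolution.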
For RNNs, it makes great sense to take an LSTM model and think about how we read text and analyze the context of a sentence or a paragraph. We can also go forward and backward over sentences, as in bidirectional networks. And of course we have seq2seq, many-to-one, one-to-many, and many-to-many models.
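The difference between these input/output shapes is easier to see in code. Below is a minimal sketch, assuming a toy scalar recurrent cell with fixed, made-up weights (`w_x`, `w_h`). It is a plain tanh RNN, not an LSTM, and it reuses the same weights in both directions, whereas a real bidirectional layer trains a separate set for each direction.

```python
import math


def rnn_step(x, h, w_x=0.5, w_h=0.8):
    """One step of a toy scalar RNN cell (hypothetical fixed weights)."""
    return math.tanh(w_x * x + w_h * h)


def run_rnn(sequence):
    """Many-to-many: return the hidden state after every time step."""
    h, states = 0.0, []
    for x in sequence:
        h = rnn_step(x, h)
        states.append(h)
    return states


def many_to_one(sequence):
    """Many-to-one: keep only the final state, e.g. for text classification."""
    return run_rnn(sequence)[-1]


def bidirectional(sequence):
    """Pair a left-to-right pass with a right-to-left pass at each position."""
    forward = run_rnn(sequence)
    backward = run_rnn(sequence[::-1])[::-1]  # re-align to original order
    return list(zip(forward, backward))


seq = [1.0, 0.0, 0.0]

print(run_rnn(seq))        # one state per input step
print(many_to_one(seq))    # a single summary of the whole sequence
print(bidirectional(seq))  # each position sees both past and future context
```

Because the state is carried from step to step, the same inputs in a different order give a different final state, which is what lets these models capture context in a sentence.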
Beyond these general examples, which map onto our human eyesight, reading, and listening skills, is it possible to apply these models to strategy games, like what OpenAI did? How complex would such a system be, and what compute resources would generally be required in production?