Deep Learning for Natural Language Processing
In this course, we will dive into the world of Natural Language Processing. We will demonstrate how Deep Learning has reshaped this area of Artificial Intelligence using concepts like word vectors and embeddings, structured deep learning, collaborative filtering, recurrent neural networks, sequence-to-sequence models, and Transformer networks. Throughout the journey, we will be mostly concerned with how to represent language tokens, be it at the word or character level, and how to represent their aggregations, such as sentences or documents, in a semantically sound way.

We start the journey by going through the traditional pipeline of text pre-processing and the different text features, like binary and TF-IDF features with the Bag-of-Words model. Then we will dive into the concept of word vectors and embeddings as a general deep learning idea, with a detailed discussion of famous word embedding techniques like word2vec, GloVe, FastText, and ELMo. This will enable us to take a detour into recommender systems, using collaborative filtering and the twin-tower model as an example of the generic usage of embeddings beyond word representations.

In the second part of the course, we will be concerned with sentence and sequence representations. We will tackle the core NLP task of Language Modeling, at both the statistical and neural levels, using recurrent models like LSTM and GRU. We then tackle sequence-to-sequence models, with the flagship NLP task of Machine Translation, which paves the way to many other tasks under the same seq2seq design pattern, like Question Answering and chatbots. We present the core idea of attention mechanisms with recurrent seq2seq models, before generalizing attention as a generic deep learning concept. This generalization leads to the state-of-the-art Transformer network, which revolutionized the world of NLP using full attention mechanisms.

In the final part of the course, we present the ImageNet moment of NLP, where Transfer Learning comes into play together with pre-trained Transformer architectures like BERT, GPT-1/2/3, RoBERTa, ALBERT, Transformer-XL, and XLNet.
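To make the idea of a "semantically sound" representation concrete, here is a minimal sketch, not course material: the 3-dimensional vectors below are made up for illustration, whereas real embeddings like word2vec or GloVe are learned and have hundreds of dimensions. Related words end up with similar vectors, which cosine similarity detects:

```python
import numpy as np

# Toy 3-d word vectors (illustrative values only; real embeddings
# such as word2vec or GloVe are learned, with 100-300 dimensions).
vectors = {
    "king":  np.array([0.8, 0.3, 0.1]),
    "queen": np.array([0.7, 0.4, 0.1]),
    "apple": np.array([0.1, 0.2, 0.9]),
}

def cosine(u, v):
    """Cosine similarity: close to 1.0 for similar directions, 0.0 for orthogonal."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(vectors["king"], vectors["queen"]))  # high: related words
print(cosine(vectors["king"], vectors["apple"]))  # low: unrelated words
```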
- 1. Module intro and roadmap (Video lesson)
- 2. Why is NLP hard? (Video lesson)
- 3. NLP tasks and apps (Video lesson)
- 4. CV vs. NLP analogy (Video lesson)
- 5. DL in NLP and the Bag-of-Words model (Video lesson)
- 6. Text preprocessing pipeline (Video lesson)
- 7. Text preparation steps (Video lesson)
- 8. Text features: Binary, Count, Frequency, TF-IDF (Video lesson)
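As a taste of what the text features covered in lessons 5-8 look like in code, here is a minimal sketch using scikit-learn; the toy corpus and the library choice are our own illustration, not necessarily what the lessons use:

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]

# Bag-of-Words: each document becomes a vector of raw term counts.
bow = CountVectorizer()
X_counts = bow.fit_transform(corpus)
print(bow.get_feature_names_out())
print(X_counts.toarray())

# Binary features: 1 if the term occurs in the document, else 0.
binary = CountVectorizer(binary=True)
print(binary.fit_transform(corpus).toarray())

# TF-IDF: term frequency re-weighted by inverse document frequency,
# so terms appearing in every document (like "the") are down-weighted.
tfidf = TfidfVectorizer()
print(tfidf.fit_transform(corpus).toarray().round(2))
```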
- 10. Module intro and roadmap (Video lesson)
- 11. Why word embeddings? (Video lesson)
- 12. Traditional word vectors (Video lesson)
- 13. Learnable embedding matrix (Video lesson)
- 14. BoW vectors model (Video lesson)
- 15. Structured Deep Learning (Video lesson)
- 16. Pre-trained word embeddings (Video lesson)
- 17. Word2Vec (Video lesson)
- 18. GloVe (Video lesson)
- 19. FastText and ELMo (Video lesson)
- 20. Evaluation of word embedding vectors (Video lesson)
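The "learnable embedding matrix" of lessons 13-14 boils down to a |V| x d weight table trained with the rest of the network. A minimal PyTorch sketch, assuming a toy vocabulary and dimensions of our own choosing:

```python
import torch
import torch.nn as nn

# Toy vocabulary (an assumption for illustration).
vocab = {"<pad>": 0, "the": 1, "cat": 2, "sat": 3, "mat": 4}

# The embedding matrix is a learnable |V| x d weight table;
# looking up a token id returns its row (the word vector).
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=8, padding_idx=0)

token_ids = torch.tensor([[vocab["the"], vocab["cat"], vocab["sat"]]])  # (1, 3)
word_vectors = embedding(token_ids)                                     # (1, 3, 8)

# A "BoW vectors" sentence representation: average the word vectors,
# ignoring order. Gradients flow back into the embedding table.
sentence_vector = word_vectors.mean(dim=1)                              # (1, 8)
print(sentence_vector.shape)
```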
- 24. Module intro and roadmap (Video lesson)
- 25. Statistical Language Models (SLM) (Video lesson)
- 26. Neural Language Models (NLM) (Video lesson)
- 27. Recurrent Neural Networks (Video lesson)
- 28. RNN as a sentence embedding encoder (Video lesson)
- 29. Example: RNN char-level NLM (Video lesson)
- 30. Example: RNN word-level NLM (Video lesson)
- 31. Language model evaluation methods (Video lesson)
- 32. Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU) (Video lesson)
- 33. Example: LSTM/GRU for text classification apps (Video lesson)
- 34. Conv1D and CNN-LSTM models (Video lesson)
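To preview the kind of model built in lesson 33, here is a minimal PyTorch sketch of an LSTM text classifier; the sizes and the binary-classification setup are assumptions for illustration, not the course's exact model:

```python
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    """Embedding -> LSTM -> last hidden state -> linear classifier."""
    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):              # token_ids: (batch, seq_len)
        embedded = self.embedding(token_ids)   # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.lstm(embedded)      # h_n: (1, batch, hidden_dim)
        return self.fc(h_n[-1])                # (batch, num_classes)

model = LSTMClassifier()
dummy_batch = torch.randint(1, 1000, (4, 20))  # 4 sentences, 20 token ids each
logits = model(dummy_batch)
print(logits.shape)  # torch.Size([4, 2])
```

The final hidden state h_n acts as the sentence embedding discussed in lesson 28; swapping nn.LSTM for nn.GRU (which returns only h_n, no cell state) gives the GRU variant.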
- 35. Module intro and roadmap (Video lesson)
- 36. Seq2seq models overview (Video lesson)
- 37. Unaligned/matched sequences case (CTC loss) (Video lesson)
- 38. Statistical Machine Translation (SMT) (Video lesson)
- 39. Neural Machine Translation (NMT) and the vanilla seq2seq model (Video lesson)
- 40. NMT decoding and beam search (Video lesson)
- 41. Attention mechanisms with seq2seq models (Video lesson)
- 42. Evaluation of seq2seq models (WER and BLEU) (Video lesson)
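The attention mechanism of lesson 41, later generalized in the Transformer, fits in a few lines. A minimal NumPy sketch of scaled dot-product attention (Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V), with toy shapes standing in for decoder and encoder states:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # similarity of each query to each key
    weights = softmax(scores)         # each row is a distribution over keys
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))  # 2 decoder positions (queries), d_k = 4
K = rng.normal(size=(5, 4))  # 5 encoder positions (keys)
V = rng.normal(size=(5, 4))  # values carried by each encoder position
context, weights = scaled_dot_product_attention(Q, K, V)
print(context.shape, weights.shape)  # (2, 4) (2, 5)
```

Each decoder position thus receives a context vector that is a weighted mix of encoder states, with the weights learned from query-key similarity rather than fixed alignment.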