Attention
Some well-explained blog articles on the attention mechanism.
Attention and Augmented Recurrent Neural Networks
Reference: https://distill.pub/2016/augmented-rnns/
Our guess is that these “augmented RNNs” will have an important role to play in extending deep learning’s capabilities over the coming years.
Attention Is All You Need
Reference: https://arxiv.org/abs/1706.03762
Video Explanation: https://www.youtube.com/watch?v=iDulhoQ2pro
Attention? Attention!
Reference: https://lilianweng.github.io/lil-log/2018/06/24/attention-attention.html
Attention has been a fairly popular concept and a useful tool in the deep learning community in recent years. In this post, we look into how attention was invented, and at various attention mechanisms and models, such as the Transformer and SNAIL.
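To make the core idea behind these posts concrete, here is a minimal sketch (not from any of the linked articles) of attention as a softmax-weighted sum: a query scores each encoder state, the scores become weights, and the weights mix the states into a context vector. The state and query values below are made-up toy data.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical encoder states (4 positions, dimension 3) and one decoder query.
states = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0],
                   [1.0, 1.0, 0.0]])
query = np.array([1.0, 0.0, 0.0])

scores = states @ query          # one similarity score per position
weights = softmax(scores)        # scores normalized to a distribution
context = weights @ states       # weighted sum of the states
```

The weights sum to 1, so the context vector is a convex combination of the states, leaning toward the positions most similar to the query.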
Transformers from Scratch
Reference: http://www.peterbloem.nl/blog/transformers
Transformers are a very exciting family of machine learning architectures. Many good tutorials exist (e.g. [1, 2]) but in the last few years, transformers have mostly become simpler, so that it is now much more straightforward to explain how modern architectures work. This post is an attempt to explain directly how modern transformers work, and why, without some of the historical baggage.
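As a rough illustration of what such a post covers, here is a sketch of the scaled dot-product self-attention at the heart of the Transformer, assuming single-head attention and randomly initialized projection matrices (both simplifications, not the full architecture):

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Scaled dot-product self-attention over a sequence x of shape (t, d)."""
    q, k, v = x @ wq, x @ wk, x @ wv
    # Scale by sqrt(d_k) to keep the softmax inputs in a reasonable range.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Row-wise softmax: each position attends over all positions.
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = e / e.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
t, d = 5, 8                                   # toy sequence length and model dim
x = rng.normal(size=(t, d))
wq, wk, wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(x, wq, wk, wv)           # same shape as the input: (5, 8)
```

A real Transformer block adds multiple heads, an output projection, residual connections, layer normalization, and a feed-forward sublayer on top of this core operation.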