Transformers
What is it?⌗
A deep learning architecture built around the multi-head attention mechanism.
Transformers were introduced in the 2017 paper “Attention Is All You Need”, written by eight researchers at Google. The paper is widely seen as a turning point in modern artificial intelligence.
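To give a feel for what each attention head actually computes, here is a minimal NumPy sketch of scaled dot-product attention, the core operation inside multi-head attention. The shapes (2 heads, 4 tokens, head dimension 8) are toy values chosen purely for illustration, not anything from the paper:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)  # (heads, seq, seq)
    weights = softmax(scores, axis=-1)                # attention weights per token
    return weights @ V                                # (heads, seq, d_k)

# Toy example: 2 heads, 4 tokens, head dimension 8
rng = np.random.default_rng(0)
Q = rng.standard_normal((2, 4, 8))
K = rng.standard_normal((2, 4, 8))
V = rng.standard_normal((2, 4, 8))

out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (2, 4, 8) — one output vector per token, per head
```

In the full architecture, Q, K, and V are learned linear projections of the input embeddings, and the per-head outputs are concatenated and projected again.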
Why was it created?⌗
Previously, architectures such as recurrent neural networks (RNNs) and long short-term memory (LSTM) networks processed sequences one token at a time, which made them much slower to train.
Transformers process all tokens in a sequence in parallel, enabling far more efficient training, and in turn made possible the wave of LLMs (Large Language Models) we have access to today.