Self-attention, sometimes called intra-attention, is a mechanism that mimics cognitive attention: it relates different positions of a single sequence to one another in order to compute a representation of that sequence.
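As a concrete illustration, here is a minimal sketch of one head of scaled dot-product self-attention in NumPy. The function name, the projection matrices `w_q`, `w_k`, and `w_v`, and the toy dimensions are assumptions chosen for clarity, not a fixed specification.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over one sequence.

    x: (seq_len, d_model) input embeddings
    w_q, w_k, w_v: (d_model, d_k) learned projection matrices
    Returns: (seq_len, d_k) context-aware representations.
    """
    q = x @ w_q                                 # one query per position
    k = x @ w_k                                 # one key per position
    v = x @ w_v                                 # one value per position
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)             # relevance of every position to every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ v                          # each output is a weighted mix of all positions

# Toy usage: a sequence of 4 positions with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)   # (4, 8)
```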
Structured data are data organized in a tabular form (for example, in tables, databases, or spreadsheets) that can be used to train some machine learning models effectively.
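For example, a small table of structured data can be represented as a pandas DataFrame; the column names and values below are made up purely for illustration.

```python
import pandas as pd

# A toy table of structured (tabular) data: one row per example, one column per field.
homes = pd.DataFrame(
    {
        "square_feet": [1200, 950, 1800],
        "bedrooms": [3, 2, 4],
        "price": [250_000, 180_000, 360_000],
    }
)

# Feature columns and a label column, ready for a tabular model such as a decision tree.
features = homes[["square_feet", "bedrooms"]]
labels = homes["price"]
```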
Transformers are a family of neural network architectures, introduced in 2017, that rely on self-attention mechanisms to transform a sequence of inputs into a sequence of outputs while focusing attention on the most relevant parts of the context around each input. Transformers do not rely on convolutions or recurrent neural networks.
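As a rough sketch of how a Transformer maps one sequence to another, the snippet below passes a batch of already-embedded sequences through PyTorch's built-in encoder stack; the sizes (a model dimension of 64, 4 attention heads, 2 layers) and the random input are arbitrary choices for illustration, not part of any standard configuration.

```python
import torch
from torch import nn

# A stack of self-attention-based encoder layers; no convolutions, no recurrence.
layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2)

# A batch of 3 sequences, each with 10 positions of 64-dimensional embeddings.
x = torch.randn(3, 10, 64)
y = encoder(x)      # a sequence of inputs mapped to a sequence of outputs
print(y.shape)      # torch.Size([3, 10, 64])
```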