That is a groovy title for a research paper that revolutionized natural language processing and introduced a deep learning architecture called the Transformer, which became the foundation of modern AI, the generative AI.
Previously, neural networks (Recurrent Neural Networks, RNNs) processed data sequentially, one word at a time in order of appearance. The Transformer was able to capitalize on patterns that exist between words even when they are not located near each other or in sequence; this capturing of long-distance dependencies between words is the 'attention mechanism'. So not all words are equally important, only some are, and then you work out the pattern across them simultaneously to get the meaning. This is pretty amazing.
Amazing because that is how I also read; in fact, all slow readers attempt such tricks.
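To make the attention idea concrete, here is a minimal sketch of scaled dot-product attention, the core computation in the Transformer paper, in plain NumPy. The toy vectors and the use of the same matrix for queries, keys and values (self-attention) are illustrative assumptions, not anything from the paper itself:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax: turns scores into weights that sum to 1
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # scores[i, j] measures how much word i "attends" to word j,
    # regardless of how far apart the two words sit in the sentence
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights

# toy example: 4 "words", each represented by a 3-dimensional vector
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
out, w = attention(x, x, x)  # self-attention: Q = K = V
```

Every word looks at every other word in one shot, which is exactly the "simultaneous pattern" that an RNN, reading one word at a time, cannot do.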
There are two steps in trying to comprehend written text. The first is to understand the basic structure of language, that is, grammar, as well as the meaning of words. Since I really couldn't understand grammar, I worked around it by reading a lot and absorbing the structure, iterative learning: the more you read, the better you become. So when you write something wrong you get a feeling of something not quite right, so you pause and evaluate it further to get it right. The meaning of words is always contextual, hence iterative learning works. The more you read, the better you become, while quality reading enhances your intellect. This is precisely how AI also learned language, by getting pattern approximation through large data and powerful computation. I discussed this a few years back in the context of GPT (https://depalan.blogspot.com/2021/12/human-intelligence-artificial.html).

Now that you understand language, how do you get the meaning? That is the second step. There was this big gap between wanting to read and the capacity to read. I had racks of books to finish, and I was equally excited to know more as well as to understand high-calibre patterns, but my capacity to read was woefully slow. I really couldn't read more than a few pages in a day, and sometimes I was stuck in a word mesh and sentence loop; it is crazy, but words tend to jump and play around, and it is frustrating to hold the words in their places while you try to get the meaning. I figured that there are two types of reading. One is deliberately slow, for instance when you are reading nuanced writing like a Kafka story or a Dickinson poem, where even the unhinged words add to the depth of meaning. The other is fast reading, for when you are reading nonfiction. The trick for fast reading is to identify signifier words (those that encapsulate much meaning, that 'hold attention', like tokens) and then link them into a pattern to get the meaning.
Though fewer in number, these are the words that carry a high approximation of the meaning, and interestingly, the jumping words add unintended complexity to it. This is quite crazy, but the meaning can slip into a zone of high probability of the not-intended and an eventual mess-up. The probability of success varies with how comfortable you are with the subject; the approximation is generally good if you are adept at identifying the important signifier words. Indeed, I used to teach fast-reading techniques for a few years, a kind of job you finish fast with decent money and then have enough time to focus on real interests.