Wednesday, December 21, 2022

Unfolding the protein universe


As the year comes to an end, the most significant event -if you really want to capture it in the timescale of a year and its year-end shenanigans, has to be DeepMind releasing the structures of 200 million proteins -nearly every catalogued protein known to science. To put it into perspective, before DeepMind only 17% of protein 3D structures were known, and that too after a long and costly process. The decision to open-source AlphaFold is a momentous occasion. It will set off a Cambrian explosion of insights and innovations across multiple fields in the coming years and decades. Proteins play an important role in nearly every important activity that happens inside every living organism on earth. So the possibility opened up by this new awareness of the three-dimensional structures of protein folds, and hence their functions, is mindboggling, set to profoundly impact understanding in fields ranging from fighting diseases to the origin of life. AlphaFold is as significant as CRISPR-Cas9 in the broad span of its impact.
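And the predictions are already there for anyone to pull down. As a minimal sketch -assuming the public AlphaFold Protein Structure Database API at alphafold.ebi.ac.uk, which hosts the released models; the accession used is my own illustrative pick -fetching a predicted structure takes a few lines:

```python
# Minimal sketch: fetch a predicted structure from the public AlphaFold
# Protein Structure Database (alphafold.ebi.ac.uk). The endpoint and the
# UniProt accession (P69905, human hemoglobin alpha chain) are illustrative
# assumptions, not details from this post.
import requests

accession = "P69905"
url = f"https://alphafold.ebi.ac.uk/api/prediction/{accession}"

entry = requests.get(url, timeout=30).json()[0]  # the API returns a list of models
print(entry["uniprotDescription"])

# Save the predicted 3D structure as a PDB file, viewable in PyMOL and the like
pdb = requests.get(entry["pdbUrl"], timeout=30)
with open(f"AF-{accession}.pdb", "wb") as f:
    f.write(pdb.content)
```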

Meanwhile, after ChatGPT -which has set off some amusing online reactions, GPT-4, OpenAI's next powerful generative language model, is to be released very soon. I read that it is going to make astonishing leaps in performance: in memory -retaining and referring back, and in summarizing -distilling the essentials of a text. The issue is that Large Language Models (LLMs) like the GPTs are running out of training data. There is a lot of data generated online, but very little of it is within an acceptable quality threshold. So labs are moving towards transcribing the spoken content of formal meetings. I find GPT's progress exceedingly interesting. These models started in 2017 with the transformer neural network architecture, which acquires language iteratively from text.

I find them very real in the way they acquire language. I got my frame for writing, and of course keep updating it, by iterative addition through constant reading. Better writers and incisive writing mean a better frame to consolidate. Unlike GPT, the human brain takes years to build this, particularly when there was no frame to begin with. Though I began choosing randomly, very soon the conscious part of the brain kicked in to evaluate, and found much of the writing wanting in quality, nuance and span of ideas. Indian writing -with rare exceptions, lacked verve and vigor: mostly cosmetic, planted with wordplay. English classics like Austen, likewise, were too boring and out of my reality, or any reality I could fathom. Russian and Latin American writing was where things were happening. I was looking for contemporary writing -and that is how I bumped into the Nobel prize writers, who seemed a definitive reference in a chaos-filled world of writing. Of course, I have now moved out of fiction -it has become a luxury, into nonfiction -there is so much to catch up on in a fast-changing world. Without an awareness of grammar (not that I didn't try; I really couldn't understand it at all, and it also starts to intrude into writing and thinking, hence I let it go), iterative consolidation is the only way out. I identify with GPT a lot. Like GPT, I sometimes write fascinating stuff but also make silly, embarrassing mistakes, since I get into my own loop and the brain gets tripped. But unlike GPT, the human mind has limitations yet fascinating possibilities. You are not working on words from models of context and reference alone but also through experience and thinking. GPT at this stage is emulating third-rate Indian writers who use a superficial play of words to impress the system and make easy money and influence. Intermittently impressive but mostly mediocre. GPT can easily be better than most Indian writers (who can safely be taken as the benchmark of mediocrity!).
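Coming back to the mechanics: stripped of everything, what a GPT does is a loop of predict-the-next-token, append, repeat. A minimal sketch of that, using the openly released GPT-2 via the Hugging Face transformers library (the model choice and sampling parameters are mine, for illustration -nothing here is from GPT-4):

```python
# Minimal sketch of autoregressive (iterative) text generation, using the
# openly released GPT-2 via the Hugging Face transformers library. The
# model choice and sampling parameters are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Proteins fold into shapes that"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Each new token is sampled given everything generated so far, appended,
# and fed back in -iterative consolidation, one word-piece at a time.
output = model.generate(
    input_ids,
    max_new_tokens=40,
    do_sample=True,
    top_k=50,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```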

There is also research going on to augment data through visual language, or multimodal inputs -after all, much of human speech is complemented by visual cues and representations. When I see something, I am acquiring data, converting the experience through emotions which are then imprinted into memory. Many a time I find it difficult to convert this into words, and even if I do, the words don't really carry the whole feeling of the experience. It seems lacking and futile. Even in the best of writers it is very much hinged to the partial emotions it excites in readers. Of course, consensus is high on great writers (is there a primal quantum entangling in human brains that aligns emotions? Crazy thought!), but it is never complete, and many a time it could even be the reader's limitation. It does create diversity of experience, but it doesn't address the limitation of language. So, is language lacking because it runs in a linear time frame, whereas experience goes beyond linearity, taking input from past and future -oscillating in time, to add to the richness of memory? AlphaZero has exhibited this ability to look deep into time, back and forth, across possible outcomes. It has accessed a reality that human emotions are able to sense but the brain is incapable of articulating. GPT is an open-ended system with a subjective frame, while AlphaZero is a close-ended system with a clear objective; but the way AlphaZero experiences time within its limited frame holds a lot of potential.
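That back-and-forth in time is, mechanically, tree search: step forward into a possible future, judge it, step back, compare. AlphaZero proper couples Monte Carlo Tree Search with a learned policy/value network; the toy sketch below, with hypothetical game hooks of my own naming, is only the bare skeleton of the idea:

```python
# Toy sketch of lookahead search, the "back and forth in time" idea.
# AlphaZero proper uses Monte Carlo Tree Search guided by a learned
# policy/value network; this depth-limited negamax, with hypothetical
# game hooks (legal_moves, apply_move, evaluate), is only the skeleton.
def negamax(state, depth, legal_moves, apply_move, evaluate):
    """Best achievable score for the player to move, looking `depth` plies ahead."""
    moves = legal_moves(state)
    if depth == 0 or not moves:
        return evaluate(state)  # judge the position as it stands
    best = float("-inf")
    for move in moves:
        # Step forward into a possible future, score it, step back, compare
        score = -negamax(apply_move(state, move), depth - 1,
                         legal_moves, apply_move, evaluate)
        best = max(best, score)
    return best

# Trivial demo game: take 1 or 2 stones from a pile; taking the last wins.
moves = lambda n: [m for m in (1, 2) if m <= n]
step = lambda n, m: n - m
judge = lambda n: -1 if n == 0 else 0  # the player to move at 0 has already lost
print(negamax(5, 6, moves, step, judge))  # 1: the mover can force a win
```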