Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar. (gpt-4.pdf)
“GPT-4 is not easy to develop. It took pretty much all of OpenAI working together for a very long time to produce this thing. And there are many many companies who want to do the same thing, so from a competitive side, you can see this as a maturation of the field.” (OpenAI co-founder on company’s past approach to openly sharing research: ‘We were wrong’ – The Verge)
It dawned on me that they have officially ended the era of AI as research and switched gears towards AI as a product. For many startups, AI has always been a product, not research, and OpenAI is one such startup, but they have pretended to be the research lab of a big tech company or an elite university. Although there have been signs of this shift, like not releasing the GPT-3 model to the public, they still kept publishing research papers and people loved it. Until now.
Another iconic part of the GPT-4 “technical report” is the Authorship section (p.15). It looks a lot like the closing credits of a movie, and it asks: Please cite this work as “OpenAI (2023)”. This would never happen if it were a research paper. This signifies that GPT-4 is a product and the whole company has worked towards building it.
What does this mean? When their researchers have done the research-y stuff, it’ll be folded into the next version of GPT and won’t become standalone research. No published paper, that is.
And I think it is already happening. Look at the paper about the Vision-Language model CLIP. This is their early iteration of visual understanding and you can consider it a building block of multimodal models like GPT-4.
This paper is two years old. If they had kept doing the “research”, there should’ve been another paper for its successor. That didn’t happen. There is no CLIP-2; they have built the progress into GPT-4 instead. This is how product development works: The innovations occur inside the product. (Compare this to MSR, which has up to three generations of VLP models!)
After all, there are no papers about, say, Google’s ranking algorithms (forget PageRank), Facebook’s feed scoring, Amazon’s logistics, or Apple’s CPUs. These are all marvels of computer science achieved by smart Ph.D.s, but all of them have occurred inside products. Few papers, if any, were written. GPT has morphed into one of these monumental products.
Am I disappointed? Definitely. Do I complain? Not really. Academic research has its limitations. Sometimes innovation needs a product. It needs a user. It requires real problems. It needs money to be earned. That’s why tech companies hire Ph.D.s and let them write production code instead of academic papers.
The downside is that it mostly happens behind closed doors. This darkness, however, is worth it, at least for the innovating engineers and researchers themselves.
However, for us begging paper readers, it feels like the end of a great season. It’s time to get back to work.
I still hold out a hope: Someday, once this war is over and the victory has made the GPT gangs rich, one of them goes back to school and writes a textbook about AI, in which there is a chapter called “History”, where they share their retrospective as an insider: What they did right. What their peers overlooked. What the vision was. To what extent it was achieved. And, not to forget, some technical details. It’d be a fascinating read.
Until that book arrives, I’ll keep reading. Well, I should get back to work. But after business hours, I’ll keep reading.