"amount of compute" isn't equivalent to "infinite data," though. The observable spece will always be infinitely smaller than the possible space. ChatGPT has the benefit of the entire accumulated data of mankind, and has glaring repetitiveness, gaps, etc. OpenAI as a result is investing in synthetically generated text to improve this - which is problematic in and of itself.
Tesla is also investing in synthetic data, which to me is a bit of a flashing red light: wait, 2.5 million Tesla cars aren't enough to provide this training data?
But we're getting into a difference of approaches here. My general thought is that when you have to approach infinite data to get the results you want, something semantic is missing.
As another analogy, consider music, and specifically training a model for song recognition. If you trained the model on raw sound and let ML "take care of the rest," you'd need a near-infinite number of sound samples before it could recognize the same song played on pianos, horns, guitars, etc., and generalize that recognition to any arbitrary instrument.
But if you take a step of abstraction back and break the music down to its notation form (old-fashioned sheet music), the amount of training data required is orders of magnitude smaller. The "gotcha" is that it takes substantial study to nail the semantics, e.g., coming up with the right notation system.
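To make that concrete, here's a minimal Python sketch. The two "instruments" are just made-up harmonic profiles standing in for piano-like and flute-like timbres, and the helper names are mine, not from any real library. The same five-note melody produces raw waveforms that are measurably far apart, yet a crude FFT-based transcription recovers the identical note sequence from both:

    import numpy as np

    SR = 16_000                      # sample rate in Hz
    MELODY = [60, 62, 64, 65, 67]    # MIDI notes: C4 D4 E4 F4 G4
    NOTE_LEN = 0.5                   # seconds per note

    def midi_to_hz(m):
        return 440.0 * 2 ** ((m - 69) / 12)

    def render(melody, harmonics):
        # Render a melody with a given harmonic amplitude profile
        # (a toy stand-in for an instrument's timbre).
        t = np.arange(int(SR * NOTE_LEN)) / SR
        chunks = []
        for m in melody:
            f0 = midi_to_hz(m)
            wave = sum(a * np.sin(2 * np.pi * f0 * (k + 1) * t)
                       for k, a in enumerate(harmonics))
            chunks.append(wave / np.abs(wave).max())
        return np.concatenate(chunks)

    def transcribe(audio):
        # Naive per-note pitch detection: FFT peak -> nearest MIDI note.
        n = int(SR * NOTE_LEN)
        notes = []
        for i in range(0, len(audio) - n + 1, n):
            spectrum = np.abs(np.fft.rfft(audio[i:i + n]))
            f0 = np.fft.rfftfreq(n, 1 / SR)[spectrum.argmax()]
            notes.append(int(round(69 + 12 * np.log2(f0 / 440.0))))
        return notes

    piano_ish = render(MELODY, harmonics=[1.0, 0.4, 0.2])  # rich overtones
    flute_ish = render(MELODY, harmonics=[1.0, 0.05])      # nearly pure tone

    # The raw signals are measurably far apart, sample by sample...
    diff = np.linalg.norm(piano_ish - flute_ish) / np.linalg.norm(flute_ish)
    print(f"relative waveform difference: {diff:.2f}")

    # ...but the symbolic (notation-level) representation is identical.
    print(transcribe(piano_ish) == transcribe(flute_ish) == MELODY)  # True

The point being: a recognizer trained on the five-integer note sequence generalizes across instruments for free, while one trained on raw waveforms has to relearn every timbre from examples.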
This is in fact what my company is built on: solving a particular problem space for which almost no diverse training data exists.