"amount of compute" isn't equivalent to "infinite data," though. The observable spece will always be infinitely smaller than the possible space. ChatGPT has the benefit of the entire accumulated data of mankind, and has glaring repetitiveness, gaps, etc. OpenAI as a result is investing in synthetically generated text to improve this - which is problematic in and of itself.
Tesla is also investing in synthetic data, which to me is a bit of a flashing red light: wait, 2.5 million Tesla cars aren't enough to provide this training data?
But we're getting into a difference of approaches here. My general thought is that when you have to approach infinite data to get the results you want, something semantic is missing.
As another analogy, consider music, and specifically training a model for song recognition. If you trained the model on raw sound and let ML "take care of the rest," you'd need a near-infinite number of sound samples before it could recognize the same song played on pianos, horns, guitars, etc., and generalize that recognition to any arbitrary instrument.
But if you take a step of abstraction back and break the music down to its notation form (old-fashioned sheet music), the amount of training data required is orders of magnitude smaller. The "gotcha" is that it takes substantial study to nail the semantics, e.g., coming up with the right notation system.
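To make that concrete, here's a minimal Python sketch. The two "instruments" are just made-up harmonic profiles standing in for piano-like and flute-like timbres, and the helper names are mine, not from any real library. The same five-note melody produces raw waveforms that are measurably far apart, yet a crude FFT-based transcription recovers the identical note sequence from both:

    import numpy as np

    SR = 16_000                      # sample rate in Hz
    MELODY = [60, 62, 64, 65, 67]    # MIDI notes: C4 D4 E4 F4 G4
    NOTE_LEN = 0.5                   # seconds per note

    def midi_to_hz(m):
        return 440.0 * 2 ** ((m - 69) / 12)

    def render(melody, harmonics):
        # Render a melody with a given harmonic amplitude profile
        # (a toy stand-in for an instrument's timbre).
        t = np.arange(int(SR * NOTE_LEN)) / SR
        chunks = []
        for m in melody:
            f0 = midi_to_hz(m)
            wave = sum(a * np.sin(2 * np.pi * f0 * (k + 1) * t)
                       for k, a in enumerate(harmonics))
            chunks.append(wave / np.abs(wave).max())
        return np.concatenate(chunks)

    def transcribe(audio):
        # Naive per-note pitch detection: FFT peak -> nearest MIDI note.
        n = int(SR * NOTE_LEN)
        notes = []
        for i in range(0, len(audio) - n + 1, n):
            spectrum = np.abs(np.fft.rfft(audio[i:i + n]))
            f0 = np.fft.rfftfreq(n, 1 / SR)[spectrum.argmax()]
            notes.append(int(round(69 + 12 * np.log2(f0 / 440.0))))
        return notes

    piano_ish = render(MELODY, harmonics=[1.0, 0.4, 0.2])  # rich overtones
    flute_ish = render(MELODY, harmonics=[1.0, 0.05])      # nearly pure tone

    # The raw signals are measurably far apart, sample by sample...
    diff = np.linalg.norm(piano_ish - flute_ish) / np.linalg.norm(flute_ish)
    print(f"relative waveform difference: {diff:.2f}")

    # ...but the symbolic (notation-level) representation is identical.
    print(transcribe(piano_ish) == transcribe(flute_ish) == MELODY)  # True

The point being: a recognizer trained on the five-integer note sequence generalizes across instruments for free, while one trained on raw waveforms has to relearn every timbre from examples.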
This is in fact what my company is built on: solving a particular problem space for which almost no diverse training data exists.