A common misconception is that self-driving car companies (outside of a few smaller startups) are using RL to drive the car. They are not. They use deep learning for perception, and those perception systems produce concrete outputs that are then processed by what amounts to an expert system.
I work in this space, and even if you could assume the RL would never make a mistake, it's not auditable in the way you would need it to be for things like insurance. In general, RL isn't ready to be used in complex situations where people can die when things go wrong. And that's before you even get to the sample efficiency challenges and the problem of handling unseen data.
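
To make the perception-then-rules split concrete, here is a toy sketch of what I mean. It is not anyone's production code: `Detection`, `RULES`, `decide`, and every threshold are made-up names and numbers for illustration. The point is only that the decision layer is hand-written rules over perception outputs, so each action can be traced back to the rule that fired.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical perception output: an object class and its distance ahead."""
    label: str         # e.g. "pedestrian", "vehicle"
    distance_m: float  # distance along the planned path, in metres


# Hand-written rules, checked in priority order.
# Rule names and thresholds are invented for this example.
RULES = [
    ("pedestrian_within_20m",
     lambda d: d.label == "pedestrian" and d.distance_m < 20.0,
     "brake"),
    ("vehicle_within_10m",
     lambda d: d.label == "vehicle" and d.distance_m < 10.0,
     "slow"),
]


def decide(detections):
    """Return (action, reason) so every decision names the rule behind it."""
    for name, predicate, action in RULES:
        for det in detections:
            if predicate(det):
                return action, f"{name}: {det}"
    return "cruise", "default: no rule fired"


if __name__ == "__main__":
    action, reason = decide([Detection("vehicle", 8.0), Detection("pedestrian", 12.0)])
    print(action, "|", reason)
```

The returned reason string is the audit trail: you can log it and later show exactly which rule produced which action, which is the part you don't get from a learned policy.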