Multi-scenario benchmark for autonomous driving systems: Exposing diverse behavioral anomalies