Large (Vision) Language Models for Autonomous Vehicles: Current Trends and Future Directions

Abstract

As autonomous vehicles (AVs) advance, integrating Large (Vision) Language Models (L(V)LMs) has emerged as a promising way to enhance AV capabilities in perception, planning, decision-making, and data generation. However, the practical challenges of incorporating L(V)LMs into AV systems, including computational efficiency, real-time processing, and ethical considerations, remain under-explored. This survey provides a comprehensive review of current research on L(V)LM applications in AVs across four key areas: modular integration, end-to-end integration, data generation, and evaluation platforms. Our findings highlight the potential of L(V)LMs to improve AV system performance, but also emphasise the need for further research on real-world integration, regulatory challenges, and vehicle-to-everything (V2X) communication. The survey offers insights and guidance for researchers and practitioners aiming to optimise the use of L(V)LMs in autonomous vehicles.

Publication
Submitted to IEEE Transactions on Intelligent Transportation Systems (T-ITS)