Edoardo Palladin* Samuel Brucker*
Filippo Ghilotti Praveen Narayanan Mario Bijelic Felix Heide
Outside of urban hubs, autonomous cars and trucks have to master driving on intercity highways. Safe, long-distance highway travel at speeds exceeding 100 km/h demands perception distances of at least 250 m, about five times the 50–100 m typically addressed in city driving, to allow sufficient planning and braking margins. Increasing the perception range also makes it possible to extend autonomy from light two-ton passenger vehicles to forty-ton trucks, which require a longer planning horizon due to their high inertia. However, most existing perception approaches focus on shorter ranges and rely on Bird's Eye View (BEV) representations, which incur quadratic increases in memory and compute costs as distance grows. To overcome this limitation, we build on a sparse representation and introduce an efficient 3D encoding of multi-modal and temporal features, along with a novel self-supervised pretraining scheme that enables large-scale learning from unlabeled camera-LiDAR data by predicting and forecasting the scene geometry. Our approach efficiently fuses camera and LiDAR data, extends perception distances to 250 m, and achieves a 26.6% improvement in mAP for object detection and a 30.5% reduction in Chamfer Distance for LiDAR forecasting compared to existing methods at ranges up to 250 m.
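As a concrete illustration of the forecasting metric quoted above, the sketch below computes a symmetric Chamfer Distance between a forecasted and an observed LiDAR sweep. This is a minimal reference implementation, not the paper's evaluation code: the exact distance convention (squared vs. Euclidean) and normalization used in our experiments may differ, and the function and variable names are purely illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(pred_points: np.ndarray, gt_points: np.ndarray) -> float:
    """Symmetric Chamfer Distance between an (N, 3) and an (M, 3) point cloud.

    Averages nearest-neighbor distances in both directions; other works use a
    squared-distance or differently normalized variant of the same metric.
    """
    tree_pred = cKDTree(pred_points)
    tree_gt = cKDTree(gt_points)
    # For every ground-truth point, distance to its closest predicted point.
    d_gt_to_pred, _ = tree_pred.query(gt_points, k=1)
    # For every predicted point, distance to its closest ground-truth point.
    d_pred_to_gt, _ = tree_gt.query(pred_points, k=1)
    return float(d_gt_to_pred.mean() + d_pred_to_gt.mean())

# Toy usage with random points (stand-ins for forecasted and observed sweeps).
pred = np.random.rand(1000, 3) * 250.0  # forecasted points, up to 250 m ahead
gt = np.random.rand(1200, 3) * 250.0    # observed LiDAR points
print(chamfer_distance(pred, gt))
```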
Coming Soon ...
If you find this work useful, please cite our paper:
@inproceedings{PalladinAndBruckerLRS4Fusion,
    title={Self-Supervised Sparse Sensor Fusion for Long Range Perception},
    author={Palladin, Edoardo and Brucker, Samuel and Ghilotti, Filippo and Narayanan, Praveen and Bijelic, Mario and Heide, Felix},
    booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    year={2025}
}
For questions or feedback, please contact the authors via the project page.