Conference Paper: ST-P3: End-to-End Vision-Based Autonomous Driving via Spatial-Temporal Feature Learning

Title: ST-P3: End-to-End Vision-Based Autonomous Driving via Spatial-Temporal Feature Learning
Authors: Hu, Shengchao; Chen, Li; Wu, Penghao; Li, Hongyang; Yan, Junchi; Tao, Dacheng
Issue Date: 2022
Citation: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2022, v. 13698 LNCS, p. 533-549
Abstract: Many existing autonomous driving paradigms involve a multi-stage discrete pipeline of tasks. To better predict the control signals and enhance user safety, an end-to-end approach that benefits from joint spatial-temporal feature learning is desirable. While there are some pioneering works on LiDAR-based input or implicit design, in this paper we formulate the problem in an interpretable vision-based setting. In particular, we propose a spatial-temporal feature learning scheme towards a set of more representative features for perception, prediction and planning tasks simultaneously, which is called ST-P3. Specifically, an egocentric-aligned accumulation technique is proposed to preserve geometry information in 3D space before the bird’s eye view transformation for perception; a dual-pathway model is devised to take past motion variations into account for future prediction; a temporal-based refinement unit is introduced to compensate for recognizing vision-based elements for planning. To the best of our knowledge, we are the first to systematically investigate each part of an interpretable end-to-end vision-based autonomous driving system. We benchmark our approach against previous state-of-the-art methods on both the open-loop nuScenes dataset and the closed-loop CARLA simulation. The results show the effectiveness of our method. Source code, model and protocol details are made publicly available at https://github.com/OpenPerceptionX/ST-P3.
Persistent Identifier: http://hdl.handle.net/10722/351457
ISSN: 0302-9743 (print), 1611-3349 (electronic)
2023 SCImago Journal Rankings: 0.606
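The abstract above describes a three-stage, camera-only pipeline: perception via egocentric-aligned accumulation of bird's-eye-view (BEV) features, prediction via a dual-pathway model over past and present features, and planning via a temporal refinement unit. The following is a minimal, hypothetical PyTorch sketch of how such a pipeline could be wired together; every module name, tensor shape, and layer choice is an illustrative assumption and is not taken from the authors' released code at https://github.com/OpenPerceptionX/ST-P3.

```python
# Illustrative sketch only (NOT the authors' implementation) of the
# perception -> prediction -> planning structure described in the abstract.
import torch
import torch.nn as nn


class EgocentricAlignedAccumulation(nn.Module):
    """Hypothetical perception stage: accumulates ego-aligned past BEV features."""

    def __init__(self, channels: int):
        super().__init__()
        self.fuse = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, bev_feats: torch.Tensor) -> torch.Tensor:
        # bev_feats: (batch, time, channels, H, W), assumed already aligned
        # to the current ego frame before accumulation.
        accumulated = bev_feats.sum(dim=1)
        return self.fuse(accumulated)


class DualPathwayPrediction(nn.Module):
    """Hypothetical dual-pathway prediction: one branch on the fused present
    state, one on a summary of past motion."""

    def __init__(self, channels: int):
        super().__init__()
        self.present_path = nn.Conv2d(channels, channels, 3, padding=1)
        self.history_path = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, present: torch.Tensor, history: torch.Tensor) -> torch.Tensor:
        return self.present_path(present) + self.history_path(history)


class TemporalRefinementPlanner(nn.Module):
    """Hypothetical planner head: maps predicted future BEV features to waypoints."""

    def __init__(self, channels: int, horizon: int = 6):
        super().__init__()
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(channels, horizon * 2)
        )
        self.horizon = horizon

    def forward(self, future_feats: torch.Tensor) -> torch.Tensor:
        # Returns (batch, horizon, 2) waypoints in BEV coordinates.
        return self.head(future_feats).view(-1, self.horizon, 2)


if __name__ == "__main__":
    b, t, c, h, w = 2, 3, 64, 32, 32
    bev_history = torch.randn(b, t, c, h, w)  # stand-in for camera-derived BEV features
    perception = EgocentricAlignedAccumulation(c)
    prediction = DualPathwayPrediction(c)
    planning = TemporalRefinementPlanner(c)

    present = perception(bev_history)
    future = prediction(present, bev_history.mean(dim=1))
    waypoints = planning(future)
    print(waypoints.shape)  # torch.Size([2, 6, 2])
```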

 

DC Field / Value
dc.contributor.author: Hu, Shengchao
dc.contributor.author: Chen, Li
dc.contributor.author: Wu, Penghao
dc.contributor.author: Li, Hongyang
dc.contributor.author: Yan, Junchi
dc.contributor.author: Tao, Dacheng
dc.date.accessioned: 2024-11-20T03:56:24Z
dc.date.available: 2024-11-20T03:56:24Z
dc.date.issued: 2022
dc.identifier.citation: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2022, v. 13698 LNCS, p. 533-549
dc.identifier.issn: 0302-9743
dc.identifier.uri: http://hdl.handle.net/10722/351457
dc.language: eng
dc.relation.ispartof: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.title: ST-P3: End-to-End Vision-Based Autonomous Driving via Spatial-Temporal Feature Learning
dc.type: Conference_Paper
dc.description.nature: link_to_subscribed_fulltext
dc.identifier.doi: 10.1007/978-3-031-19839-7_31
dc.identifier.scopus: eid_2-s2.0-85142756667
dc.identifier.volume: 13698 LNCS
dc.identifier.spage: 533
dc.identifier.epage: 549
dc.identifier.eissn: 1611-3349
