File Download
There are no files associated with this item.
Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1007/978-3-031-19839-7_31
- Scopus: eid_2-s2.0-85142756667
Citations:
- Scopus: 0
Appears in Collections:
Conference Paper: ST-P3: End-to-End Vision-Based Autonomous Driving via Spatial-Temporal Feature Learning
Title | ST-P3: End-to-End Vision-Based Autonomous Driving via Spatial-Temporal Feature Learning |
---|---|
Authors | Hu, Shengchao; Chen, Li; Wu, Penghao; Li, Hongyang; Yan, Junchi; Tao, Dacheng |
Issue Date | 2022 |
Citation | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2022, v. 13698 LNCS, p. 533-549 |
Abstract | Many existing autonomous driving paradigms involve a multi-stage discrete pipeline of tasks. To better predict the control signals and enhance user safety, an end-to-end approach that benefits from joint spatial-temporal feature learning is desirable. While there are some pioneering works on LiDAR-based input or implicit design, in this paper we formulate the problem in an interpretable vision-based setting. In particular, we propose a spatial-temporal feature learning scheme towards a set of more representative features for perception, prediction and planning tasks simultaneously, which is called ST-P3. Specifically, an egocentric-aligned accumulation technique is proposed to preserve geometry information in 3D space before the bird’s eye view transformation for perception; a dual pathway modeling is devised to take past motion variations into account for future prediction; a temporal-based refinement unit is introduced to compensate for recognizing vision-based elements for planning. To the best of our knowledge, we are the first to systematically investigate each part of an interpretable end-to-end vision-based autonomous driving system. We benchmark our approach against previous state-of-the-arts on both open-loop nuScenes dataset as well as closed-loop CARLA simulation. The results show the effectiveness of our method. Source code, model and protocol details are made publicly available at https://github.com/OpenPerceptionX/ST-P3. |
Persistent Identifier | http://hdl.handle.net/10722/351457 |
ISSN | 0302-9743 (2023 SCImago Journal Rankings: 0.606) |
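
The abstract above describes a three-stage vision-based pipeline: perception with ego-aligned BEV accumulation, prediction with dual-pathway temporal modeling, and planning with a temporal refinement unit. The sketch below is only a minimal illustration of how such a pipeline could be wired together; all module names, tensor shapes, and hyperparameters are hypothetical and do not reproduce the authors' implementation (see the linked GitHub repository for the actual code).

```python
# Minimal, hypothetical sketch of the three ST-P3 stages named in the abstract.
# Module names, shapes, and hyperparameters are illustrative only; the official
# implementation is at https://github.com/OpenPerceptionX/ST-P3.
import torch
import torch.nn as nn


class EgoAlignedAccumulation(nn.Module):
    """Perception: accumulate past BEV features that have already been
    aligned to the current ego pose, then fuse them (hypothetical)."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.fuse = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, bev_history: torch.Tensor) -> torch.Tensor:
        # bev_history: (batch, time, channels, H, W), ego-aligned per frame.
        accumulated = bev_history.sum(dim=1)      # accumulate over past frames
        return self.fuse(accumulated)             # fused BEV feature map


class DualPathwayPrediction(nn.Module):
    """Prediction: one recurrent pathway summarises past motion, a second
    rolls the state forward over the future horizon (hypothetical)."""

    def __init__(self, channels: int = 64, horizon: int = 4):
        super().__init__()
        self.horizon = horizon
        self.history_cell = nn.GRUCell(channels, channels)
        self.future_cell = nn.GRUCell(channels, channels)

    def forward(self, bev_feat: torch.Tensor) -> torch.Tensor:
        # bev_feat: (batch, channels, H, W); pool spatially for brevity.
        state = bev_feat.mean(dim=(2, 3))
        hidden = self.history_cell(state)         # encode past motion
        futures = []
        for _ in range(self.horizon):
            hidden = self.future_cell(state, hidden)
            futures.append(hidden)
        return torch.stack(futures, dim=1)        # (batch, horizon, channels)


class TemporalRefinementPlanner(nn.Module):
    """Planning: decode a trajectory of (x, y) waypoints from the predicted
    future states (hypothetical)."""

    def __init__(self, channels: int = 64, num_waypoints: int = 6):
        super().__init__()
        self.num_waypoints = num_waypoints
        self.head = nn.Linear(channels, num_waypoints * 2)

    def forward(self, future_states: torch.Tensor) -> torch.Tensor:
        pooled = future_states.mean(dim=1)        # (batch, channels)
        return self.head(pooled).view(-1, self.num_waypoints, 2)


if __name__ == "__main__":
    bev_history = torch.randn(1, 3, 64, 32, 32)   # 3 past ego-aligned BEV frames
    bev = EgoAlignedAccumulation()(bev_history)
    futures = DualPathwayPrediction()(bev)
    trajectory = TemporalRefinementPlanner()(futures)
    print(trajectory.shape)                        # torch.Size([1, 6, 2])
```
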
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Hu, Shengchao | - |
dc.contributor.author | Chen, Li | - |
dc.contributor.author | Wu, Penghao | - |
dc.contributor.author | Li, Hongyang | - |
dc.contributor.author | Yan, Junchi | - |
dc.contributor.author | Tao, Dacheng | - |
dc.date.accessioned | 2024-11-20T03:56:24Z | - |
dc.date.available | 2024-11-20T03:56:24Z | - |
dc.date.issued | 2022 | - |
dc.identifier.citation | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2022, v. 13698 LNCS, p. 533-549 | - |
dc.identifier.issn | 0302-9743 | - |
dc.identifier.uri | http://hdl.handle.net/10722/351457 | - |
dc.description.abstract | Many existing autonomous driving paradigms involve a multi-stage discrete pipeline of tasks. To better predict the control signals and enhance user safety, an end-to-end approach that benefits from joint spatial-temporal feature learning is desirable. While there are some pioneering works on LiDAR-based input or implicit design, in this paper we formulate the problem in an interpretable vision-based setting. In particular, we propose a spatial-temporal feature learning scheme towards a set of more representative features for perception, prediction and planning tasks simultaneously, which is called ST-P3. Specifically, an egocentric-aligned accumulation technique is proposed to preserve geometry information in 3D space before the bird’s eye view transformation for perception; a dual pathway modeling is devised to take past motion variations into account for future prediction; a temporal-based refinement unit is introduced to compensate for recognizing vision-based elements for planning. To the best of our knowledge, we are the first to systematically investigate each part of an interpretable end-to-end vision-based autonomous driving system. We benchmark our approach against previous state-of-the-arts on both open-loop nuScenes dataset as well as closed-loop CARLA simulation. The results show the effectiveness of our method. Source code, model and protocol details are made publicly available at https://github.com/OpenPerceptionX/ST-P3. | - |
dc.language | eng | - |
dc.relation.ispartof | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | - |
dc.title | ST-P3: End-to-End Vision-Based Autonomous Driving via Spatial-Temporal Feature Learning | - |
dc.type | Conference_Paper | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1007/978-3-031-19839-7_31 | - |
dc.identifier.scopus | eid_2-s2.0-85142756667 | - |
dc.identifier.volume | 13698 LNCS | - |
dc.identifier.spage | 533 | - |
dc.identifier.epage | 549 | - |
dc.identifier.eissn | 1611-3349 | - |