
Commit 2bc6008

Update notes for MPDM series
1 parent da243bd commit 2bc6008

File tree

6 files changed: +37 / -31 lines changed


README.md

Lines changed: 4 additions & 2 deletions
@@ -34,7 +34,7 @@ I regularly update [my blog in Toward Data Science](https://medium.com/@patrickl
 - [Multimodal Regression](https://towardsdatascience.com/anchors-and-multi-bin-loss-for-multi-modal-target-regression-647ea1974617)
 - [Paper Reading in 2019](https://towardsdatascience.com/the-200-deep-learning-papers-i-read-in-2019-7fb7034f05f7?source=friends_link&sk=7628c5be39f876b2c05e43c13d0b48a3)
 
-## 2024-06 (7)
+## 2024-06 (8)
 - [LINGO-1: Exploring Natural Language for Autonomous Driving](https://wayve.ai/thinking/lingo-natural-language-autonomous-driving/) [[Notes](paper_notes/lingo1.md)] [Wayve, open-loop world model]
 - [LINGO-2: Driving with Natural Language](https://wayve.ai/thinking/lingo-2-driving-with-language/) [[Notes](paper_notes/lingo2.md)] [Wayve, closed-loop world model]
 - [OpenVLA: An Open-Source Vision-Language-Action Model](https://arxiv.org/abs/2406.09246) [open source RT-2]
@@ -49,6 +49,7 @@ I regularly update [my blog in Toward Data Science](https://medium.com/@patrickl
 - [trajdata: A Unified Interface to Multiple Human Trajectory Datasets](https://arxiv.org/abs/2307.13924) <kbd>NeurIPS 2023</kbd> [Marco Pavone, Nvidia]
 - [Optimal Vehicle Trajectory Planning for Static Obstacle Avoidance using Nonlinear Optimization](https://arxiv.org/abs/2307.09466) [Xpeng]
 - [Jointly Learnable Behavior and Trajectory Planning for Self-Driving Vehicles](https://arxiv.org/abs/1910.04586) [[Notes](paper_notes/joint_learned_bptp.md)] <kbd>IROS 2019 Oral</kbd> [Uber ATG, behavioral planning, motion planning]
+- [HiVT: Hierarchical Vector Transformer for Multi-Agent Motion Prediction]
 - [Enhancing End-to-End Autonomous Driving with Latent World Model](https://arxiv.org/abs/2406.08481)
 - [OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments](https://arxiv.org/abs/2312.09243) [Jiwen Lu]
 - [RenderOcc: Vision-Centric 3D Occupancy Prediction with 2D Rendering Supervision](https://arxiv.org/abs/2309.09502) <kbd>ICRA 2024</kbd>
@@ -246,6 +247,8 @@ I regularly update [my blog in Toward Data Science](https://medium.com/@patrickl
 - [BEVDepth: Acquisition of Reliable Depth for Multi-view 3D Object Detection](https://arxiv.org/abs/2206.10092) [[Notes](paper_notes/bevdepth.md)] [BEVNet, NuScenes SOTA, Megvii]
 - [CVT: Cross-view Transformers for real-time Map-view Semantic Segmentation](https://arxiv.org/abs/2205.02833) [[Notes](paper_notes/cvt.md)] <kbd>CVPR 2022 oral</kbd> [UTAustin, Philipp]
 - [Wayformer: Motion Forecasting via Simple & Efficient Attention Networks](https://arxiv.org/abs/2207.05844) [[Notes](paper_notes/wayformer.md)] [Behavior prediction, Waymo]
+- [HiVT: Hierarchical Vector Transformer for Multi-Agent Motion Prediction](https://openaccess.thecvf.com/content/CVPR2022/papers/Zhou_HiVT_Hierarchical_Vector_Transformer_for_Multi-Agent_Motion_Prediction_CVPR_2022_paper.pdf) <kbd>CVPR 2022</kbd> [Zikang Zhou, agent-centric, motion prediction]
+- [QCNet: Query-Centric Trajectory Prediction](https://openaccess.thecvf.com/content/CVPR2023/papers/Zhou_Query-Centric_Trajectory_Prediction_CVPR_2023_paper.pdf) <kbd>CVPR 2023</kbd> [Zikang Zhou, scene-centric, motion prediction]
 
 ## 2022-06 (3)
 - [BEVDet4D: Exploit Temporal Cues in Multi-camera 3D Object Detection](https://arxiv.org/abs/2203.17054) [[Notes](paper_notes/bevdet4d.md)] [BEVNet]
@@ -1519,7 +1522,6 @@ Feature Extraction](https://arxiv.org/abs/2010.02893) [monodepth, semantics, Nav
 - [SAM: Segment Anything](https://arxiv.org/abs/2304.02643) [FAIR]
 - [GeoMIM: Towards Better 3D Knowledge Transfer via Masked Image Modeling for Multi-view 3D Understanding](https://arxiv.org/abs/2303.11325)
 - [Motion Prediction using Trajectory Sets and Self-Driving Domain Knowledge](https://arxiv.org/abs/2006.04767) [Encode Road requirement to prediction]
-- [Hivt: Hierarchical vector transformer for multi-agent motion prediction](https://openaccess.thecvf.com/content/CVPR2022/papers/Zhou_HiVT_Hierarchical_Vector_Transformer_for_Multi-Agent_Motion_Prediction_CVPR_2022_paper.pdf) <kbd>CVPR 2022</kbd>
 - [Transformer Feed-Forward Layers Are Key-Value Memories](https://arxiv.org/abs/2012.14913) <kbd>EMNLP 2021</kbd>
 - [BEV-LaneDet: a Simple and Effective 3D Lane Detection Baseline](https://arxiv.org/abs/2210.06006) <kbd>CVPR 2023</kbd> [BEVNet]
 - [Exploring Recurrent Long-term Temporal Fusion for Multi-view 3D Perception](https://arxiv.org/abs/2303.05970) [BEVNet, megvii]

learning_pnc/pnc_notes.md

Lines changed: 7 additions & 5 deletions
@@ -2,7 +2,7 @@
 
 ## Introduction
 - The notes were taken for the [Prediction, Decision and Planning for Autonomous driving](https://www.shenlanxueyuan.com/course/671) MOOC course from Shenlan Xueyuan.
-- The lecturer is [Wenchao Ding](website: https://wenchaoding.github.io/personal/index.html), former engineer at Huawei and not AP at Fudan University.
+- The lecturer is [Wenchao Ding](https://wenchaoding.github.io/personal/index.html), a former engineer at Huawei and now an AP at Fudan University.
 
 # Model-based Prediction
 ## Overview
@@ -463,11 +463,13 @@
 
 - Continuous state with belief MDP
 - Put complexity into state transition, and solve with ML.
-- Normal solution: MPC, with limited lookup ahead (forward simulation).
-- MCTS
+- Normal solution: MPC-like, with limited look-ahead (forward simulation).
+- MCTS with forward simulation
+- The complexity lies in multi-agent interaction rollout and branching out.
+- MPC: receding horizon planning
 - Plan in MDP
-- Assuming the most likely belief is the real state. In the ULT case, assuming the most likely behavior of the other car to be reality, and act accordingly.
-- Cannot actively collect information. This is actually the charm of POMDP’s intelligence. POMDP will lead to some action that actively collects information.
+- Approximate POMDP as MDP, assuming the most likely belief argmax(b) is the real state. In the ULT case, assume the most likely behavior of the other car to be reality, and act accordingly.
+- MDP cannot actively collect information. This is actually the charm of POMDP’s intelligence: POMDP will lead to some actions that actively collect information.
 
 ## EPSILON
 
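A minimal sketch of the two ideas in the notes above, assuming a toy 1-D state: `plan` approximates the POMDP as an MDP by planning against argmax(b), then does MPC-like limited look-ahead via forward simulation. The transition and cost functions are hypothetical, not from the course.

```python
def most_likely_state(belief):
    """Approximate POMDP as MDP: treat the belief's argmax as the true state."""
    return max(belief, key=belief.get)

def limited_lookahead(state, actions, step, cost, horizon):
    """Exhaustive forward simulation up to `horizon`; returns minimal total cost."""
    if horizon == 0:
        return 0.0
    return min(cost(state, a) + limited_lookahead(step(state, a), actions, step, cost, horizon - 1)
               for a in actions)

def plan(belief, actions, step, cost, horizon=3):
    """MPC-style receding horizon: commit only to the best first action."""
    s = most_likely_state(belief)
    return min(actions,
               key=lambda a: cost(s, a) + limited_lookahead(step(s, a), actions, step, cost, horizon - 1))

# Toy example: 1-D position, drive toward the origin.
belief = {2: 0.6, -1: 0.3, 5: 0.1}        # belief over current position
actions = (-1, 0, 1)
step = lambda s, a: s + a                  # deterministic transition
cost = lambda s, a: (s + a) ** 2           # quadratic cost on the next state
print(plan(belief, actions, step, cost))   # → -1
```

Note the limitation the notes point out: because the belief collapses to argmax(b) before planning, this planner can never choose an action purely to gather information.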

paper_notes/eudm.md

Lines changed: 4 additions & 1 deletion
@@ -11,6 +11,9 @@ In order to make POMDP more tractable it is essential to incorporate domain know
 
 In EUDM, ego behavior is allowed to change, allowing more flexible decision making than MPDM. This allows EUDM to make a lane-change decision even before passing the blocking vehicle (accelerate, then lane change).
 
+![](https://pic3.zhimg.com/80/v2-a7778368cbf39f083ef5ad5a2f931a4e_1440w.webp)
+
 EUDM does guided branching in both action (of ego) and intention (of others).
 
 EUDM couples the prediction and planning modules.
@@ -20,7 +23,7 @@ It is further improved by [MARC](marc.md) where it considers risk-aware continge
 #### Key ideas
 - DCP-Tree (domain-specific closed-loop policy tree), ego-centric
 - Guided branching in action space
-- Each trace only contains ONE change of action (more flexible than MPDM but still manageable).
+- Each trace only contains ONE change of action (more flexible than MPDM but still manageable). This is a tree with a pruning mechanism built in. [MPDM](mpdm.md) essentially prunes much more aggressively, as only one type of action is allowed per trace (KKK, RRR, LLL, etc.).
 - Each semantic action is 2s, 4 levels deep, so planning horizon of 8s.
 - CFB (conditional focused branching), for other agents
 - conditioned on ego intention
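The ONE-change rule above prunes the semantic action tree sharply. A toy enumeration, where the action letters (K/L/R) and the depth of 4 follow the note but the code itself is illustrative, not from the paper:

```python
from itertools import product

SEMANTIC_ACTIONS = ("K", "L", "R")   # keep lane, lane change left, lane change right
DEPTH = 4                            # 4 levels x 2 s per semantic action = 8 s horizon

def dcp_traces(actions=SEMANTIC_ACTIONS, depth=DEPTH):
    """Enumerate DCP-Tree-style traces: at most ONE change of action per trace."""
    traces = []
    for seq in product(actions, repeat=depth):
        changes = sum(a != b for a, b in zip(seq, seq[1:]))
        if changes <= 1:
            traces.append("".join(seq))
    return traces

traces = dcp_traces()
print(len(traces))  # → 21 (3 constant traces + 3 starts x 3 switch points x 2 new actions)
```

Compare 21 traces with the 3^4 = 81 of the unpruned tree; the MPDM-style restriction to a single action type per trace would keep only the 3 constant traces.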

paper_notes/marc.md

Lines changed: 19 additions & 10 deletions
@@ -7,18 +7,25 @@ tl;dr: Generating safe and non-conservative behaviors in dense dynamic environme
 #### Overall impression
 This is a continuation of work in [MPDM](mpdm.md) and [EUDM](eudm.md). It introduces dynamic branching based on scene-level divergence, and risk-aware contingency planning based on user-defined risk tolerance.
 
-POMDP provides a theoretically sounds framework to handle dynamic interaction, but it suffers from curse of dimensionality and making it infeasible to solve in realtime.
+POMDP provides a theoretically sound framework to handle dynamic interaction, but it suffers from the curse of dimensionality, making it infeasible to solve in real time.
 
-* [MPDM](mpdm.md) prunes belief trees heavily and decomposes POMDP into a limited number of closed-loop policy evaluations. MPDM has only one ego policy over planning horizon (8s). Mainly BP.
-* EUDM improves by having multiple (2) policy in planning horizon, and performs DCP-Tree and CFB (conditoned focused branching) to use domain specific knowledge to guide branching in both action and intention space. Mainly BP.
-* MARC performs risk-aware contigency planning based on multiple scenarios. And it combines BP and MP.
-* All previous MPDM-like methods consider the optimal policy and single trajectory generation over all scenarios, resulting in lack of gurantee of policy consistency and loss of multimodality info.
+[MPDM](mpdm.md) and [EUDM](eudm.md) are mainly BP models, but [MARC](marc.md) combines BP and MP.
+
+For the policy tree (or policy-conditioned scenario tree) building, we can see how the tree got built with a more and more careful pruning process across successive works.
+
+* [MPDM](mpdm.md) is the pioneering work: it prunes belief trees heavily and decomposes POMDP into a limited number of closed-loop policy evaluations. MPDM has only one ego policy over the planning horizon (8s).
+* [MPDM](mpdm.md) iterates over all ego policies, and uses the single most likely policy of each other agent given road structure and vehicle pose.
+* [MPDM2](mpdm2.md) iterates over all ego policies, and iterates over (a set of) possible policies of other agents predicted by a motion prediction model.
+* [EUDM](eudm.md) iterates over all ego policies, and then iterates over all possible policies of other agents to identify **critical scenarios** (CFB, conditional focused branching). It guides branching in both action and intention space. [EPSILON](epsilon.md) used the same method.
+* [MARC](marc.md) iterates over all ego policies, iterates over a set of predicted policies of other agents, and identifies **key agents** (ignoring other agents even in critical scenarios).
+
+All previous MPDM-like methods consider the optimal policy and single trajectory generation over all scenarios, resulting in a lack of guarantee of policy consistency and loss of multimodality info.
 
 #### Key ideas
-- Planning is hard from uncertainty and interaction (inherently multimodal intentions).
-- For interactive decision making, MDP or POMDP are mathematically rigorous formulations for decision processes in stochastic environments.
-- For static (non-interactive) decision making, the normal trioka of planninig (sampling, searching, optimization) would suffice.
-- *Contigency planning* generates deterministic behavior for mulutiple future scenarios. In other words, it plans a short-term trajectory that ensures safety for all potential scenarios.
+- *Contingency planning* generates deterministic behavior for multiple future scenarios. In other words, it plans a short-term trajectory that ensures safety for all potential scenarios. --> This is very similar to the idea of a *backup plan* in [EPSILON](epsilon.md).
 - Scenario tree construction
 - generating policy-conditioned critical scenario sets via closed-loop forward simulation (similar to CFB in EUDM?).
 - building scenario tree with scene-level divergence assessment. Determine the latest timestamp at which the scenarios diverge, delaying branching time as much as possible.
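The scene-level divergence assessment above can be sketched in a few lines. This is an illustrative toy, not MARC's actual algorithm; the trajectories, the threshold `eps`, and the yield/no-yield example are invented:

```python
import math

def branch_time(scenarios, eps=0.5):
    """Latest time to branch: first step where scenario trajectories diverge by > eps."""
    horizon = min(len(s) for s in scenarios)
    for t in range(horizon):
        pts = [s[t] for s in scenarios]
        spread = max(math.dist(p, q) for p in pts for q in pts)
        if spread > eps:
            return t          # scenarios have diverged; delay branching until here
    return horizon            # scenarios never diverge within the horizon

# Two scenarios for the other car: it yields vs. it does not yield.
yield_traj    = [(0, 0), (1, 0), (2, 0), (2.5, 0), (2.5, 0)]
no_yield_traj = [(0, 0), (1, 0), (2, 0), (3.5, 0), (5.0, 0)]
print(branch_time([yield_traj, no_yield_traj]))  # → 3
```

Delaying the branch point keeps a longer shared trunk, so the contingency plan commits to a single trajectory as long as the scenarios remain indistinguishable.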
@@ -35,7 +42,9 @@ POMDP provides a theoretically sounds framework to handle dynamic interaction, b
 - with better efficiency (avg speed) and riding comfort (max decel/acc).
 
 #### Technical details
-- Summary of technical details, such as important training details, or bugs of previous benchmarks.
+- Planning is hard due to uncertainty and interaction (inherently multimodal intentions).
+- For interactive decision making, MDP or POMDP are mathematically rigorous formulations for decision processes in stochastic environments.
+- For static (non-interactive) decision making, the normal troika of planning (sampling, searching, optimization) would suffice.
 
 #### Notes
 - Questions and notes on how to improve/revise the current work

paper_notes/mpdm.md

Lines changed: 1 addition & 2 deletions
@@ -33,5 +33,4 @@ Despite simple design, MPDM is a pioneering work in decision making, and improve
 - Summary of technical details, such as important training details, or bugs of previous benchmarks.
 
 #### Notes
-- Questions and notes on how to improve/revise the current work
-
+- The white paper from [May Mobility](https://maymobility.com/resources/autonomy-at-scale-white-paper/) explains the idea in plainer language and with examples.

paper_notes/mpdm2.md

Lines changed: 2 additions & 11 deletions
@@ -5,18 +5,9 @@ _June 2024_
 tl;dr: Improvement of MPDM in predicting the intention of other vehicles.
 
 #### Overall impression
-The majority is the same as the previous work [MPDM](mpdm.md).
-
-For the policy tree (or policy-conditioned scenario tree) building, we can see how the tree got built with more and more careful pruning process with improvements from different works.
-
-* [MPDM](mpdm.md) iterates over all ego policies, and uses the most likely one policy given road structure and pose of vehicle.
-* [MPDM2](mpdm2.md) iterates over all ego policies, and iterate over (a set of) possible policies of other agents predicted by a motion prediction model.
-* [EUDM](eudm.md) itrates all ego policies, and then iterate over all possible policies of other agents to identify **critical scenarios** (CFB, conditioned filtered branching). [EPSILON](epsilon.md) used the same method.
-* [MARC](marc.md) iterates all ego policies, iterates over a set of predicted policies of other agents, identifies **key agents** (and ignores other agents even in critical scenarios).
-
-![](https://pic3.zhimg.com/80/v2-a7778368cbf39f083ef5ad5a2f931a4e_1440w.webp)
+The majority is the same as the previous work [MPDM](mpdm.md). There is a follow-up article as well, [MPDM3](https://link.springer.com/article/10.1007/s10514-017-9619-z), which expands [MPDM2](mpdm2.md) with more experiments but the same methodology.
 
+So the main idea of MPDM is already covered in the original short paper [MPDM](mpdm.md).
 
 #### Key ideas
 - Motion prediction of other agents with a classical ML method (maximum likelihood estimation).
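A toy sketch of what maximum-likelihood intention estimation could look like here; the lateral-offset policy templates and the Gaussian observation model are invented for illustration and are not from the paper:

```python
import math

# Expected lateral offsets per step for each candidate policy of another agent.
POLICY_MODELS = {
    "lane_keep":   [0.0, 0.0, 0.0],
    "lane_change": [0.3, 0.8, 1.5],
}

def log_likelihood(observed, expected, sigma=0.3):
    """Independent Gaussian observation model on lateral offset."""
    return sum(-0.5 * ((o - e) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))
               for o, e in zip(observed, expected))

def most_likely_policy(observed):
    """MLE: keep the policy that best explains the observed motion."""
    return max(POLICY_MODELS, key=lambda p: log_likelihood(observed, POLICY_MODELS[p]))

print(most_likely_policy([0.25, 0.9, 1.4]))  # → lane_change
```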
