Add accurate uncertainties for dl using calibrated regression

patrick-llgc · patrick-llgc · commit 696e8ac13151 · 2019-11-21T00:02:40.000-08:00
diff --git a/README.md b/README.md
@@ -19,7 +19,8 @@ This repository contains my paper reading notes on deep learning and machine lea
 
 The sections below records paper reading activity in chronological order. See notes organized according to subfields [here](organized.md) (up to 06-2019). 
 
-## 2019-11 (9)
+## 2019-11 (15)
+- [Vehicle Detection With Automotive Radar Using Deep Learning on Range-Azimuth-Doppler Tensors](http://openaccess.thecvf.com/content_ICCVW_2019/papers/CVRSUAD/Major_Vehicle_Detection_With_Automotive_Radar_Using_Deep_Learning_on_Range-Azimuth-Doppler_ICCVW_2019_paper.pdf) [[Notes](paper_notes/radar_iccv.md)] <kbd>ICCV 2019</kbd>
 - [GPP: Ground Plane Polling for 6DoF Pose Estimation of Objects on the Road](https://arxiv.org/abs/1811.06666) \[[Notes](paper_notes/gpp.md)] (UCSD, mono 3DOD)
 - [MVRA: Multi-View Reprojection Architecture for Orientation Estimation](http://openaccess.thecvf.com/content_ICCVW_2019/papers/ADW/Choi_Multi-View_Reprojection_Architecture_for_Orientation_Estimation_ICCVW_2019_paper.pdf) [[Notes](paper_notes/mvra.md)] <kbd>ICCV 2019</kbd>
 - [YOLOv3: An Incremental Improvement](https://pjreddie.com/media/files/papers/YOLOv3.pdf)
@@ -30,14 +31,14 @@ The sections below records paper reading activity in chronological order. See no
 - [Can We Trust You? On Calibration of a Probabilistic Object Detector for Autonomous Driving](https://arxiv.org/abs/1909.12358) [[Notes](paper_notes/towards_safe_ad_calib.md)] <kbd>IROS 2019</kbd> (DriveU)
 - [LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving](https://arxiv.org/abs/1903.08701) [[Notes](paper_notes/lasernet.md)] <kbd>CVPR 2019</kbd> (uncertainty)
 - [LaserNet KL: Learning an Uncertainty-Aware Object Detector for Autonomous Driving](https://arxiv.org/abs/1910.11375) \[[Notes](paper_notes/lasernet_kl.md)] (LaserNet with KL divergence)
-- [Sampling-free Epistemic Uncertainty Estimation Using Approximated Variance Propagation](https://arxiv.org/abs/1908.00598) <kbd>ICCV 2019</kbd> (Uncertainty)
 - [IoUNet: Acquisition of 	Localization Confidence for Accurate Object Detection](https://arxiv.org/abs/1807.11590) [[Notes](paper_notes/iou_net.md)] <kbd>ECCV 2018</kbd>
 - [gIoU: Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression](https://arxiv.org/abs/1902.09630) [[Notes](paper_notes/giou.md)] <kbd>CVPR 2019</kbd>
 - [KL Loss: Bounding Box Regression with Uncertainty for Accurate Object Detection](https://arxiv.org/abs/1809.08545) [[Notes](paper_notes/kl_loss.md)] <kbd>CVPR 2019</kbd>
 - [CAM-Convs: Camera-Aware Multi-Scale Convolutions for Single-View Depth](https://arxiv.org/abs/1904.02028) [[Notes](paper_notes/cam_conv.md)] <kbd>CVPR 2019</kbd>
 - [BayesOD: A Bayesian Approach for Uncertainty Estimation in Deep Object Detectors](https://arxiv.org/abs/1903.03838) [[Notes](paper_notes/bayes_od.md)]
-- [Multi-Task Learning of Depth from Tele and Wide Stereo Image Pairs](https://ieeexplore.ieee.org/abstract/document/8803566) <kbd>ICIP 2019</kbd>
 - [TW-SMNet: Deep Multitask Learning of Tele-Wide Stereo Matching](https://arxiv.org/abs/1906.04463) [[Notes](paper_notes/twsm_net.md)] <kbd>ICIP 2019</kbd>
+- [Accurate Uncertainties for Deep Learning Using Calibrated Regression](https://arxiv.org/abs/1807.00263) [[Notes](paper_notes/dl_regression_calib.md)] <kbd>ICML 2018</kbd>
+- [Calibrating Uncertainties in Object Localization Task](https://arxiv.org/abs/1811.11210) [[Notes](paper_notes/2dod_calib.md)] <kbd>NIPS 2018</kbd>
 - [Classification of Objects in Polarimetric Radar Images Using CNNs at 77 GHz](http://sci-hub.tw/10.1109/APMC.2017.8251453) (Radar, polar) <-- todo
 - [Gated2Depth: Real-time Dense Lidar from Gated Images](https://arxiv.org/abs/1902.04997) <kbd>ICCV 2019 oral</kbd>
 - [PifPaf: Composite Fields for Human Pose Estimation](https://arxiv.org/abs/1903.06593) <kbd>CVPR 2019</kbd>
@@ -48,21 +49,18 @@ The sections below records paper reading activity in chronological order. See no
 - [Eliminating the Blind Spot: Adapting 3D Object Detection and Monocular Depth Estimation to 360° Panoramic Imagery](https://arxiv.org/abs/1808.06253) <kbd>ECCV 2018</kbd> (Monocular 3D object detection and depth estimation)
 - [On Calibration of Modern Neural Networks](https://arxiv.org/abs/1706.04599) <kbd>ICML 2017</kbd> (Weinberger)
 - [Measuring Calibration in Deep Learning](https://arxiv.org/abs/1904.01685) <kbd>CVPR 2019</kbd>
-- [Calibrating uncertainties in object localization task](https://arxiv.org/abs/1811.11210)
 - [Probabilistic Object Detection: Definition and Evaluation](https://arxiv.org/abs/1811.10800)
 - [Sampling-free Epistemic Uncertainty Estimation Using Approximated Variance Propagation](https://arxiv.org/abs/1908.00598) <kbd>ICCV 2019</kbd> (epistemic uncertainty)
-- [Vehicle Detection With Automotive Radar Using Deep Learning on Range-Azimuth-Doppler Tensors](http://openaccess.thecvf.com/content_ICCVW_2019/papers/CVRSUAD/Major_Vehicle_Detection_With_Automotive_Radar_Using_Deep_Learning_on_Range-Azimuth-Doppler_ICCVW_2019_paper.pdf) <kbd>ICCV 2019</kbd>
 - [Deep Learning Based 3D Object Detection for Automotive Radar and Camera](https://www.astyx.com/fileadmin/redakteur/dokumente/Deep_Learning_Based_3D_Object_Detection_for_Automotive_Radar_and_Camera.PDF) (Astyx)
 - [Automotive Radar Dataset for Deep Learning Based 3D Object Detection](https://www.astyx.com/fileadmin/redakteur/dokumente/Automotive_Radar_Dataset_for_Deep_learning_Based_3D_Object_Detection.PDF) (Astyx)
 - [End-to-end Lane Detection through Differentiable Least-Squares Fitting](https://arxiv.org/abs/1902.00293) <kbd>ICCV 2019</kbd>
-- [Accurate Uncertainties for Deep Learning Using Calibrated Regression](https://arxiv.org/abs/1807.00263) <kbd>ICML 2018</kbd>
-- [Calibrating Uncertainties in Object Localization Task](https://arxiv.org/abs/1811.11210)
 - [Momentum Contrast for Unsupervised Visual Representation Learning](https://arxiv.org/abs/1911.05722) (Kaiming He)
 - [Frustum ConvNet: Sliding Frustums to Aggregate Local Point-Wise Features for Amodal 3D Object Detection](https://arxiv.org/abs/1903.01864) <kbd>IROS 2019</kbd>
 - [Dropout Sampling for Robust Object Detection in Open-Set Conditions](https://arxiv.org/abs/1710.06677) <kbd>ICRA 2018</kbd> (Niko Sünderhauf)
 - [Evaluating Merging Strategies for Sampling-based Uncertainty Techniques in Object Detection](https://arxiv.org/abs/1809.06006) <kbd>ICRA 2019</kbd> (Niko Sünderhauf)
 - [Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image](https://arxiv.org/abs/1709.07492) <kbd>ICRA 2018</kbd> (depth completion)
 - [Self-supervised Sparse-to-Dense: Self-supervised Depth Completion from LiDAR and Monocular Camera](https://arxiv.org/abs/1807.00275) <kbd>ICRA 2019</kbd>  (depth completion)
+- [Enhancing self-supervised monocular depth estimation with traditional visual odometry](https://arxiv.org/abs/1908.03127) <kbd>3DV 2019</kbd> (sparse to dense)
 
 ## 2019-10 (18)
 - [Review of monocular object detection](paper_notes/review_mono_3dod.md)
diff --git a/paper_notes/dl_regression_calib.md b/paper_notes/dl_regression_calib.md
@@ -0,0 +1,39 @@
+# [Accurate Uncertainties for Deep Learning Using Calibrated Regression](https://arxiv.org/abs/1807.00263)
+
+_November 2019_
+
+tl;dr: Extends NN calibration from classification to regression.
+
+#### Overall impression
+The paper has a great introduction to the background of model calibration, and also summarizes the classification calibration really well.
+
+The method can give calibrated credible intervals given a sufficient amount of i.i.d. data.
+
+For application of this in object detection, see [calibrating uncertainties in object detection](2dod_calib.md) and [can we trust you](towards_safe_ad_calib.md).
+
+#### Key ideas
+- For regression, the regressor H outputs at each step t a CDF $F_t$ targeting $y_t$. 
+- A calibrated regressor H satisfies
+$$\frac{1}{T}\sum_{t=1}^T\mathbb{I}\{y_t \le F_t^{-1}(p)\} \rightarrow p$$ for all $p \in (0, 1)$ as $T \to \infty$. This notion of calibration also extends to general confidence intervals. 
+- The calibration is usually measured with a calibration plot (aka reliability plot)
+	- For classification, divide the predictions $p_t$ into intervals $I_j$, then plot the average prediction x = $mean(p_t)$ against the empirical frequency y = $mean(y_t)$ over $p_t \in I_j$.
+	- For regression, construct the recalibration dataset 
+$$\mathcal{D} =\left\{\left(F_t(y_t), \frac{1}{T}\sum_{\tau=1}^T\mathbb{I}\{F_\tau(y_\tau) \le F_t(y_t) \}\right)\right\}_{t=1}^T$$ 
+For the plot, choose confidence levels $p_j$ and plot each predicted level x = $p_j$ against the empirical frequency y = $\hat{p}_j = \frac{1}{T}\sum_{\tau=1}^T\mathbb{I}\{F_\tau(y_\tau) \le p_j \}$. Then fit a recalibration model (e.g., isotonic regression) on $\mathcal{D}$.
+	- For example, for p = 0.95, if only 80/100 observed $y_t$ fall below the 95% quantile of $F_t$, then the nominal 95% level is recalibrated to 80%.
+
+#### Technical details
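+- Evaluation: calibration error, where $p_j$ is a chosen confidence level, $\hat{p}_j$ the empirical frequency at that level, and $w_j$ a weight
+$$CalErr = \sum_j w_j (p_j - \hat{p}_j)^2$$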
+	- cf ECE (expected calibration error) from [can we trust you](towards_safe_ad_calib.md)
+
+#### Notes
+- [model calibration in the sense of cls](https://pyvideo.org/pycon-israel-2018/model-calibration-is-your-model-ready-for-the-real-world.html)
+- Platt scaling just uses a logistic regression on the output of the model. See [this video](https://pyvideo.org/pycon-israel-2018/model-calibration-is-your-model-ready-for-the-real-world.html) for details. It recalibrates the predictions of a pre-trained classifier in a post-processing step. Thus it is classifier agnostic.
+- [isotonic regression (保序回归)](https://scikit-learn.org/stable/auto_examples/plot_isotonic_regression.html) fits a non-decreasing, piece-wise constant function that approximates the data in the least-squares sense.
+
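+```python
+from sklearn.isotonic import IsotonicRegression  # Platt scaling would use LogisticRegression instead
+ir = IsotonicRegression(out_of_bounds="clip").fit(p_holdout, y_holdout)  # fit on held-out scores and labels
+p_calibrated = ir.transform(p_test)  # recalibrate new scores (p_test is a placeholder name)
+```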
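The regression recalibration described under "Key ideas" in `dl_regression_calib.md` can be sketched roughly as follows. This is a minimal illustration, not the paper's reference code: the overconfident Gaussian forecaster is synthetic, the helper names `fit_recalibrator` and `calibration_error` are hypothetical, and the weights $w_j$ in CalErr are assumed to be 1.

```python
import numpy as np
from scipy.stats import norm
from sklearn.isotonic import IsotonicRegression


def fit_recalibrator(cdf_at_truth):
    """Fit R: predicted confidence level -> empirical frequency.

    cdf_at_truth[t] holds F_t(y_t) evaluated on a held-out set.
    """
    p_pred = np.asarray(cdf_at_truth, dtype=float)
    # Empirical frequency: fraction of points tau with F_tau(y_tau) <= F_t(y_t).
    p_emp = np.mean(p_pred[None, :] <= p_pred[:, None], axis=1)
    recal = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
    recal.fit(p_pred, p_emp)
    return recal


def calibration_error(cdf_at_truth, levels):
    """CalErr = sum_j (p_j - p_hat_j)^2, assuming uniform weights w_j = 1."""
    p_pred = np.asarray(cdf_at_truth, dtype=float)
    p_hat = np.array([np.mean(p_pred <= p) for p in levels])
    return float(np.sum((levels - p_hat) ** 2))


# Synthetic setup (an assumption for illustration): the model predicts
# N(mu, 1) while the true noise has standard deviation 2, so it is overconfident.
rng = np.random.default_rng(0)
mu = rng.normal(size=2000)
y = mu + rng.normal(scale=2.0, size=2000)
cdf_vals = norm.cdf(y, loc=mu, scale=1.0)  # F_t(y_t) for every sample

calib_part, test_part = cdf_vals[:1000], cdf_vals[1000:]
recal = fit_recalibrator(calib_part)

levels = np.linspace(0.05, 0.95, 19)
err_before = calibration_error(test_part, levels)
err_after = calibration_error(recal.predict(test_part), levels)
print(f"CalErr before: {err_before:.4f}, after: {err_after:.4f}")
```

Composing the fitted map with the original CDF, i.e. using `recal.predict(F_t(y))` as the new CDF value, is what changes the predictive distribution from Gaussian to non-Gaussian while keeping it monotone.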
diff --git a/paper_notes/towards_safe_ad_calib.md b/paper_notes/towards_safe_ad_calib.md
@@ -7,7 +7,7 @@ tl;dr: Calibration of the network for a probabilistic object detector
 #### Overall impression
 The paper extends previous works in the [probabilistic lidar detector](towards_safe_ad.md) and its [successor](towards_safe_ad2.md). It is based on the work of Pixor. 
 
-Calibration: a probabilistic object detector should predict uncertainties that match the natural frequency of correct predictions. 90% of the predictions with 0.9 score from a calibrated detector should be correct. Humans have intuitive notion of probability in a frequentist sense. 
+Calibration: a probabilistic object detector should predict uncertainties that match the natural frequency of correct predictions. 90% of the predictions with 0.9 score from a calibrated detector should be correct. Humans have an intuitive notion of probability in a frequentist sense. --> cf [accurate uncertainty via calibrated regression](dl_regression_calib.md).
 
 A calibrated regression is a bit harder to interpret. P(gt < F^{-1}(p)) = p. F^{-1} = F_q is the inverse function of CDF, the quantile function.
 
@@ -26,7 +26,7 @@ The paper also has a very good way to visualize uncertainty in 2D object detecto
 
 $$ECE = \sum_i^M \frac{N_m}{N}|p^m - \hat{p^m}|$$
 
-- Isotonic regression
+- Isotonic regression (保序回归)
 	- During test time, the object detector produced an uncalibrated uncertainty, then corrected by the recalib model g(). In practice, we build a recalib dataset from validation data.
 	- Post-processing, does not guarantee recalibration of individual prediction (only by bins). 
 	- It changes probability distribution, Gaussian --> Non-Gaussian
diff --git a/paper_notes/twsm_net.md b/paper_notes/twsm_net.md
@@ -12,7 +12,9 @@ The paper is among the first to fuse stereo pairs with different focal length. F
 - Single image depth estimation on the wide FoV has better performance on the periphery, but not so much in the overlapped FoV.
 - TW-SMNet merges the depth of the two. The single image depth estimation branch forces the network to learn semantics. Actually only the stereo matching prediction is used during inference. **The single image depth estimation is used as auxiliary training branch**.
 - Proper fusion of the two predictions can also improve performance (the paper has a long discussion on how to fuse them)
-	- the authors fused the input from the initial results (absolute metric value) from the stereo matching in tele FoV to the wide FoV raw image. This idea is similar to the [sparse to dense](sparse_to_dense.md).
+	- input fusion: the authors fused the input from the initial results (absolute metric value) from the stereo matching in tele FoV to the wide FoV raw image. This idea is similar to the [sparse to dense](sparse_to_dense.md).
+	- output fusion: pixel-wise decision selection. --> This leads to abrupt changes in depth, so a global smoother such as FGS (Fast Global Smoother) is needed.
+	- deep fusion of depth uses robust regression as a second-stage refinement.
 
 #### Technical details
 - **classification-based robust regression** loss, by classifying regression target range into bins, then predict. Note that no cross entropy loss is added. The loss is on the soft prediction (weighted average of bin centers by the scores past softmax) --> this is very similar to the multi-bin loss proposed by [deep3dbox](deep3dbox.md).
@@ -22,4 +24,6 @@ The paper is among the first to fuse stereo pairs with different focal length. F
 #### Notes
 - Kitti's stereo pairs has a baseline of 54 cm. Human has baseline of 6 cm. Most trifocal lens system on the market has a couple of cm, smaller than human eye, and thus not much disparity.
 - Note on the results: merging the two actually finds the middle ground between the TW-SMNet models T and W. With stereo info the intersected FoV has much better depth estimation than single image based model.
+- Published version: [Multi-Task Learning of Depth from Tele and Wide Stereo Image Pairs](https://ieeexplore.ieee.org/abstract/document/8803566) <kbd>ICIP 2019</kbd>
+