Commit 1508911

Rename depth in the wild to LearnK
1 parent 7b951c0 commit 1508911

8 files changed (+14, -8 lines)

README.md

Lines changed: 7 additions & 1 deletion
@@ -456,11 +456,17 @@ Crosswalk Behavior](http://openaccess.thecvf.com/content_ICCV_2017_workshops/pap
 - [An intriguing failing of convolutional neural networks and the CoordConv solution](https://arxiv.org/abs/1807.03247) [[Notes](paper_notes/coord_conv.md)] <kbd>NIPS 2018</kbd>
 
+## 2019-08 (0)
+- [Detection in Crowded Scenes: One Proposal, Multiple Predictions](https://arxiv.org/abs/2003.09163) <kbd>CVPR 2020 oral</kbd> [Megvii]
+- [BorderDet: Border Feature for Dense Object Detection](https://arxiv.org/abs/2007.11056) <kbd>ECCV 2020 oral</kbd> [Megvii]
+
 ## 2019-07 (19)
 - [Deep Parametric Continuous Convolutional Neural Networks](http://openaccess.thecvf.com/content_cvpr_2018/papers/Wang_Deep_Parametric_Continuous_CVPR_2018_paper.pdf) [[Notes](paper_notes/parametric_cont_conv.md)] <kbd>CVPR 2018</kbd> (@Uber, sensor fusion)
 - [ContFuse: Deep Continuous Fusion for Multi-Sensor 3D Object Detection](http://openaccess.thecvf.com/content_ECCV_2018/papers/Ming_Liang_Deep_Continuous_Fusion_ECCV_2018_paper.pdf) [[Notes](paper_notes/contfuse.md)] <kbd>ECCV 2018</kbd> [Uber ATG, sensor fusion, BEV]
 - [Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net](http://openaccess.thecvf.com/content_cvpr_2018/papers/Luo_Fast_and_Furious_CVPR_2018_paper.pdf) [[Notes](paper_notes/faf.md)] <kbd>CVPR 2018 oral</kbd> [lidar only, perception and prediction]
-- [Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras](https://arxiv.org/pdf/1904.04998.pdf) [[Notes](paper_notes/mono_depth_video_in_the_wild.md)] <kbd>ICCV 2019</kbd> [monocular depth estimation, intrinsic estimation, SOTA]
+- [LearnK: Unsupervised Monocular Depth Learning from Unknown Cameras](https://arxiv.org/pdf/1904.04998.pdf) [[Notes](paper_notes/learnk.md)] <kbd>ICCV 2019</kbd> [monocular depth estimation, intrinsic estimation, SOTA]
 - [monodepth: Unsupervised Monocular Depth Estimation with Left-Right Consistency](https://arxiv.org/abs/1609.03677) [[Notes](paper_notes/monodepth.md)] <kbd>CVPR 2017 oral</kbd> (monocular depth estimation, stereo for training)
 - [Struct2depth: Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos](https://arxiv.org/abs/1811.06152) [[Notes](paper_notes/struct2depth.md)] <kbd>AAAI 2019</kbd> [monocular depth estimation, estimating movement of dynamic object, infinite depth problem, online finetune]
 - [Unsupervised Learning of Geometry with Edge-aware Depth-Normal Consistency](https://arxiv.org/pdf/1711.03665.pdf) [[Notes](paper_notes/edge_aware_depth_normal.md)] <kbd>AAAI 2018</kbd> (monocular depth estimation, static assumption, surface normal)

paper_notes/deep3dbox.md

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ A simpler version for 3d proposal generation based on 2d bbox and viewpoint clas
 - This is also used in depth/disparity estimation, such as in [TW-SMNet](twsm_net.md).
 - **Representation matters**.
 - Regress dimension and orientation first.
-- The authors tried regressing dimension and distance at the same time but found it to be highly sensitive to input errors. --> This is understandable as dim and distance are highly correlated in determining the dimension of the bbox. (cf. [depth in the wild](mono_depth_video_in_the_wild.md) to understand the coupling of estimation parameters. Sometimes an overall supervision signal is given to two tightly coupled parameters and it is not enough to get accurate estimates for both parameters.)
+- The authors tried regressing dimension and distance at the same time but found it to be highly sensitive to input errors. --> This is understandable as dim and distance are highly correlated in determining the dimension of the bbox. (cf. [depth in the wild](learnk.md) to understand the coupling of estimation parameters. Sometimes an overall supervision signal is given to two tightly coupled parameters and it is not enough to get accurate estimates for both parameters.)
 - Orientation of a car can be estimated fairly accurately, given ground truth (from lidar annotation). Angle errors are: 3 degrees for easy case, 6 for moderate and 8 for hard cases.
 - The translational vector (center of the 3dbbox) is calculated deterministically from solving linear equations. However the center of the 3dbbox can be calculated fairly easily with reprojecting the 2d bbox height and the center to the 3d world. See [monoPSR](monopsr.md) for a rough estimate of the 3D position.
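To make the "reproject the 2d bbox height and center" remark concrete, here is a minimal sketch, not from the paper or this repo, of such a rough 3D-center estimate under a pinhole model with known intrinsics; all names and numbers are illustrative.

```python
import numpy as np

def rough_3d_center(box_2d, h_3d, K):
    """Rough 3D box center from a 2D box and a regressed 3D height.

    box_2d: (x1, y1, x2, y2) in pixels; h_3d: object height in meters;
    K: 3x3 pinhole intrinsics. Assumes the object is roughly upright and
    not truncated, so depth ~= fy * h_3d / h_pixels (similar triangles).
    """
    x1, y1, x2, y2 = box_2d
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]

    h_pix = y2 - y1                           # apparent height in pixels
    z = fy * h_3d / h_pix                     # depth from similar triangles
    u, v = (x1 + x2) / 2.0, (y1 + y2) / 2.0   # 2D box center
    x = (u - cx) * z / fx                     # back-project through the pinhole
    y = (v - cy) * z / fy
    return np.array([x, y, z])

# Example: a 1.5 m tall car spanning 100 px with fx = fy = 720, cx = 640, cy = 360
K = np.array([[720.0, 0.0, 640.0], [0.0, 720.0, 360.0], [0.0, 0.0, 1.0]])
print(rough_3d_center((600.0, 300.0, 700.0, 400.0), 1.5, K))  # ~[0.15, -0.15, 10.8]
```

Deep3DBox itself solves for the translation from the tight-fit constraints of the four 2D box edges; the sketch above is only the coarse, monoPSR-style initialization mentioned in the note.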

paper_notes/glnet.md

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ tl;dr: Combine monodepth with optical flow with geometric and photometric losses
 #### Overall impression
 The paper proposes two online refinement strategies, one finetuning the model and one finetuning the image. --> cf [Struct2Depth](struc2depth.md) and [Consistent video depth](consistent_video_depth.md).
 
-It also predicts intrinsics for videos in the wild. --> cf [Depth from Videos in the Wild](mono_depth_video_in_the_wild.md).
+It also predicts intrinsics for videos in the wild. --> cf [Depth from Videos in the Wild](learnk.md).
 
 The paper has several interesting ideas, but there are some conflicts as well. The main issue is that it uses FlowNet to handle dynamic regions but it still enforces epipolar constraints on the optical flow. Also it does not handle depth of the dynamic regions well.
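As a side note on the conflict described above: enforcing an epipolar constraint on predicted flow implicitly assumes the flow is induced by camera motion alone, which is exactly what fails on dynamic objects. A hypothetical residual check, illustrative only and not GLNet's actual loss:

```python
import numpy as np

def epipolar_residual(p, p_next, F):
    """Algebraic epipolar residual |p'^T F p| for flow correspondences.

    p, p_next: (N, 2) pixel coordinates in frames t and t+1 (p_next = p + flow);
    F: 3x3 fundamental matrix between the two frames. For a static scene the
    residual is ~0; pixels on moving objects generally violate the constraint,
    so penalizing this residual everywhere conflicts with using flow to model
    dynamic regions.
    """
    ones = np.ones((p.shape[0], 1))
    ph = np.hstack([p, ones])            # homogeneous coordinates (N, 3)
    ph_next = np.hstack([p_next, ones])
    return np.abs(np.einsum("ni,ij,nj->n", ph_next, F, ph))
```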

paper_notes/kp3d.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ _March 2020_
 tl;dr: Predict keypoints and depth from videos simultaneously and in an unsupervised fashion.
 
 #### Overall impression
-This paper is based on two streams of unsupervised research based on video. The first is depth estimation starting from [sfm Learner](sfm_learner.md), [depth in the wild](mono_depth_video_in_the_wild.md) and [scale-consistent sfm Learner](sc_sfm_learner.md), and the second is the self-supervised keypoint learning starting from [superpoint](superpoint.md), [unsuperpoint](unsuperpoint.md) and [unsuperpoint with outlier rejection](kp2d.md).
+This paper is based on two streams of unsupervised research based on video. The first is depth estimation starting from [sfm Learner](sfm_learner.md), [depth in the wild](learnk.md) and [scale-consistent sfm Learner](sc_sfm_learner.md), and the second is the self-supervised keypoint learning starting from [superpoint](superpoint.md), [unsuperpoint](unsuperpoint.md) and [unsuperpoint with outlier rejection](kp2d.md).
 
 The two major enablers of this research are [scale-consistent sfm Learner](sc_sfm_learner.md) and [unsuperpoint](unsuperpoint.md).

paper_notes/mono_depth_video_in_the_wild.md renamed to paper_notes/learnk.md

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ tl;dr: Estimate the intrinsics in addition to the extrinsics of the camera from
 #### Overall impression
 Same authors as [Struct2Depth](struct2depth.md). This work eliminates the assumption of the availability of intrinsics. This opens up a whole lot of possibilities to learn from a wide range of videos.
 
-This network regresses depth, ego-motion, object motion and camera intrinsics from mono videos. --> The idea of regressing intrinsics is similar to [GLNet](glnet.md).
+This network regresses depth, ego-motion, object motion and camera intrinsics from mono videos, hence the name LearnK (learning the intrinsics K). --> The idea of regressing intrinsics is similar to [GLNet](glnet.md).
 
 #### Key ideas
 - Estimate each of the intrinsics
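As a rough illustration of the "estimate each of the intrinsics" bullet, an intrinsics head could look like the sketch below (PyTorch, hypothetical, not the authors' code); the activations merely keep the focal lengths positive and the principal point inside the image, and in such a setup the head is supervised only indirectly through the view-synthesis loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IntrinsicsHead(nn.Module):
    """Regress a 3x3 pinhole intrinsics matrix K from a bottleneck feature."""

    def __init__(self, feat_dim: int):
        super().__init__()
        self.fc = nn.Linear(feat_dim, 4)  # raw (fx, fy, cx, cy)

    def forward(self, feat: torch.Tensor, width: int, height: int) -> torch.Tensor:
        fx, fy, cx, cy = self.fc(feat).unbind(dim=-1)
        fx = F.softplus(fx) * width        # positive, roughly image-scaled focal lengths
        fy = F.softplus(fy) * height
        cx = torch.sigmoid(cx) * width     # principal point constrained to the frame
        cy = torch.sigmoid(cy) * height
        zeros, ones = torch.zeros_like(fx), torch.ones_like(fx)
        K = torch.stack([
            torch.stack([fx, zeros, cx], dim=-1),
            torch.stack([zeros, fy, cy], dim=-1),
            torch.stack([zeros, zeros, ones], dim=-1),
        ], dim=-2)
        return K  # (B, 3, 3)

# K = IntrinsicsHead(256)(torch.randn(2, 256), width=640, height=192)
```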

paper_notes/monodepth.md

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ This paper is one pioneering work on monocular depth estimation with self-superv
 
 When people are talking about monocular depth estimation, they mean "monocular at inference". The system can still rely on other supervision at training, either explicit supervision by dense depth map GT or with self-supervision via consistency.
 
-I feel that for self-supervised methods there are tons of tricks and know-how about tuning the model, cf. [google AI's depth in the wild paper](mono_depth_video_in_the_wild.md).
+I feel that for self-supervised methods there are tons of tricks and know-how about tuning the model, cf. [google AI's depth in the wild paper](learnk.md).
 
 Monodepth requires synchronized and rectified image pairs. It also does not handle occlusion in training. It is superseded by [monodepth2](monodepth2.md), which focuses on depth estimation from monocular video.
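For readers new to "self-supervision via consistency", below is a minimal stereo photometric-loss sketch; it is illustrative only and omits monodepth's SSIM, smoothness and left-right consistency terms.

```python
import torch
import torch.nn.functional as F

def photometric_loss(left, right, disp_left):
    """Warp the right image into the left view with predicted disparity and compare.

    left, right: (B, 3, H, W) images; disp_left: (B, 1, H, W) disparity expressed
    as a fraction of image width. No depth ground truth is needed: the warped
    image should reconstruct the left image if the disparity is correct.
    """
    b, _, h, w = left.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h, device=left.device),
        torch.linspace(-1, 1, w, device=left.device),
        indexing="ij",
    )
    grid = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(b, -1, -1, -1).clone()
    # A left pixel at x samples the right image at x - d (rectified stereo);
    # the factor 2 converts a width-fraction disparity to [-1, 1] grid units.
    grid[..., 0] = grid[..., 0] - 2.0 * disp_left.squeeze(1)
    right_warped = F.grid_sample(right, grid, align_corners=True)
    return (left - right_warped).abs().mean()  # L1 photometric reconstruction
```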

paper_notes/oft.md

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ The network does not require explicit info about intrinsics, but rather learns t
 
 #### Technical details
 - Replace batchnorm with groupnorm.
-- Data augmentation and adjusting intrinsic parameters accordingly (including cx, cy, fx and fy, cf. [depth in the wild](mono_depth_video_in_the_wild.md) paper).
+- Data augmentation and adjusting intrinsic parameters accordingly (including cx, cy, fx and fy, cf. [depth in the wild](learnk.md) paper).
 - Sum loss instead of averaging to avoid biasing toward examples with few object instances.
 
 #### Notes
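The intrinsics-adjustment bullet above is simple bookkeeping; a hypothetical helper (illustrative names, not OFT's code) might look like this:

```python
import numpy as np

def adjust_intrinsics(K, scale_x=1.0, scale_y=1.0, crop_x=0.0, crop_y=0.0,
                      flip=False, width=None):
    """Keep a 3x3 pinhole K consistent with resize / crop / horizontal-flip augmentation.

    Resizing scales fx, fy, cx, cy; cropping shifts the principal point;
    a horizontal flip mirrors cx about the augmented image width.
    """
    K = K.astype(np.float64).copy()
    K[0, 0] *= scale_x                        # fx
    K[1, 1] *= scale_y                        # fy
    K[0, 2] = K[0, 2] * scale_x - crop_x      # cx
    K[1, 2] = K[1, 2] * scale_y - crop_y      # cy
    if flip:
        assert width is not None, "need the augmented image width to flip cx"
        K[0, 2] = width - 1 - K[0, 2]
    return K

# Example: resize a 1242x375 KITTI-sized image by 0.5, then crop 50 px off the left edge
K = np.array([[721.5, 0.0, 609.6], [0.0, 721.5, 172.9], [0.0, 0.0, 1.0]])
K_aug = adjust_intrinsics(K, scale_x=0.5, scale_y=0.5, crop_x=50.0)
```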

paper_notes/struct2depth.md

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ The improvement on prediction of depth in dynamic object is amazing. It also pre
 
 The paper's annotation is quite sloppy. I would perhaps need to read the code to understand better.
 
-It directly inspired [depth in the wild](mono_depth_video_in_the_wild.md).
+It directly inspired [depth in the wild](learnk.md).
 
 #### Key ideas
 - Segment each dynamic object with Mask RCNN
