
AI/Paper Review 29

[Paper Review] Domain Generalization-Aware Uncertainty Introspective Learning for 3D Point Clouds Segmentation

Domain Generalization-Aware Uncertainty Introspective Learning for 3D Point Clouds Segmentation | Proceedings of the 32nd ACM International Conference on Multimedia. Published: 28 October 2024 (dl.acm.org). Problem: against domain shifts such as weather conditions, unc..

AI/Paper Review 2025.01.15

[Paper Review] Single Domain Generalization for LiDAR Semantic Segmentation

Single Domain Generalization for LiDAR Semantic Segmentation. Problem: the model performs well on the two target domains, but performs poorly on unseen domains; it fails to produce sparsity-invariant features; and it does not exploit the semantic correlation between datasets. Solution: beam-wise drop based on spherical projection → multi-sparsity augmentation. SIFC loss (Sparsity Invariant Feature Consistency): per-voxel alignment; empty voxels are handled via kNN with wei..
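The beam-wise drop mentioned in the excerpt can be sketched as a small augmentation routine. This is a minimal illustration, not the paper's implementation: it assumes uniformly spaced elevation angles when assigning beam indices, and the parameters `num_beams` and `keep_ratio` are hypothetical names introduced here.

```python
import numpy as np

def beam_drop_augment(points, num_beams=64, keep_ratio=0.5, rng=None):
    """Drop whole LiDAR beams to simulate a sparser sensor.

    points: (N, 3) xyz array. The beam index is inferred from the
    elevation angle of a simple spherical projection, assuming the
    beams are uniformly spaced in elevation (a simplification).
    """
    rng = np.random.default_rng(rng)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    elev = np.arctan2(z, np.sqrt(x**2 + y**2))      # elevation angle per point
    lo, hi = elev.min(), elev.max()
    beam = np.clip(((elev - lo) / (hi - lo + 1e-9) * num_beams).astype(int),
                   0, num_beams - 1)
    # Keep a random subset of beams; points on dropped beams are removed.
    kept = rng.choice(num_beams, size=int(num_beams * keep_ratio),
                      replace=False)
    return points[np.isin(beam, kept)]
```

Applying this with several `keep_ratio` values to the same scan yields the multiple density domains the augmentation is meant to simulate.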

AI/Paper Review 2025.01.15

[Paper Review] AllWeatherNet: Unified Image Enhancement for Autonomous Driving under Adverse Weather and Lowlight-conditions

[2409.02045] AllWeatherNet: Unified Image Enhancement for Autonomous Driving under Adverse Weather and Lowlight-conditions. Adverse conditions like snow, rain, nighttime, and fog pose challenges for autonomous driving perception systems. Existing methods have limited effectiveness in improv..

AI/Paper Review 2025.01.01

[Paper Review] Image Hazing and Dehazing: From the Viewpoint of Two-Way Image Translation With a Weakly Supervised Framework

Image Hazing and Dehazing: From the Viewpoint of Two-Way Image Translation With a Weakly Supervised Framework | IEEE Transactions on Multimedia. Image dehazing is an important task since it is the prerequisite for many downstream high-level computer vision tasks. Previou..

AI/Paper Review 2025.01.01

[Paper Review] Multiview Equivariance Improves 3D Correspondence Understanding with Minimal Feature Finetuning

[2411.19458] Multiview Equivariance Improves 3D Correspondence Understanding with Minimal Feature Finetuning. Vision foundation models, particularly the ViT family, have revolutionized image understanding by providing rich semantic features. However, despite their success in 2D comprehension, their abi..

AI/Paper Review 2024.12.25

[Paper Review] Probing the 3D Awareness of Visual Foundation Models

https://arxiv.org/abs/2404.08636 Probing the 3D Awareness of Visual Foundation Models. Recent advances in large-scale pretraining have yielded visual foundation models with strong capabilities. Not only can recent models generalize to arbitrary images for their training task, their intermediate representations are useful for other visual tas.. Overview: Including VLMs, various vision foundation models (D..

AI/Paper Review 2024.12.24

[Paper Review] Rethinking LiDAR Domain Generalization: Single Source as Multiple Density Domains

[ECCV 2024] Rethinking LiDAR Domain Generalization: Single Source as Multiple Density Domains. Paper link: https://arxiv.org/abs/2312.12098. In the realm of LiDAR-based perception, significant strides have been made, yet domain generalization remains a substantial challenge. The performance often deteriorates when mod..

AI/Paper Review 2024.08.20

[Paper Review] Lift, Splat, Shoot: Encoding Images From Arbitrary Camera Rigs by Implicitly Unprojecting to 3D

[2008.05711] Lift, Splat, Shoot: Encoding Images From Arbitrary Camera Rigs by Implicitly Unprojecting to 3D (arxiv.org). The goal of perception for autonomous vehicles is to extract semantic representations from multiple sensors and fuse these representations into a single "bird's-eye-view" coordinate ..

AI/Paper Review 2024.05.03

[Paper Review] NeRF-SLAM: Real-Time Dense Monocular SLAM with Neural Radiance Fields

Introduction: NeRF is vulnerable to outliers. To address this, the paper supplies depth information while exploiting an uncertainty estimation technique for the depth estimator, which resolves NeRF's weakness and performs SLAM at the same time. Ultimately, it proposes a SLAM method that jointly uses NeRF and uncertainty information. Methodology — Tracking: Dense SLAM with Covariances. This part is somewhat hard to explain... The overall framework follows DROID-SLAM: it takes two images, computes optical flow, and estimates depth from that flow. In this process, local BA (Bundle Ad..

AI/Paper Review 2024.04.27

[Paper Review] DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras

Introduction: Existing methods divide into EKF-based and optimization-based approaches. The core of the optimization approach is full Bundle Adjustment (BA), which jointly optimizes the camera poses and the 3D map. A key advantage of such optimization methods is that they are well suited to using multiple kinds of sensors. For example, ORB-SLAM3 [5] supports monocular, stereo, RGB-D, and IMU sensors, and modern systems can support a variety of camera models [5, 27, 43, 6]. However, they still suffer from feature-tracking failures and cannot fully eliminate drift err..
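The full BA described in the excerpt jointly refines camera poses and 3D points by minimizing reprojection error. Below is a toy sketch under strong simplifying assumptions (identity rotations, unit focal length, translation-only cameras, a finite-difference Gauss-Newton solver), not DROID-SLAM's learned BA layer or ORB-SLAM3's optimizer:

```python
import numpy as np

def project(points, cam_t):
    """Pinhole projection, identity rotation, unit focal length (simplified)."""
    p = points - cam_t              # world -> camera (translation only)
    return p[:, :2] / p[:, 2:3]     # perspective divide

def ba_residuals(params, n_cams, n_pts, obs):
    """Stack reprojection errors over all cameras and points."""
    cams = params[:3 * n_cams].reshape(n_cams, 3)
    pts = params[3 * n_cams:].reshape(n_pts, 3)
    res = [project(pts, cams[c]) - obs[c] for c in range(n_cams)]
    return np.concatenate(res).ravel()

def gauss_newton(fn, x0, iters=20, eps=1e-6):
    """Gauss-Newton with a finite-difference Jacobian. lstsq returns the
    minimum-norm step, which absorbs BA's gauge (global translation/scale)
    ambiguity without extra anchoring."""
    x = x0.copy()
    for _ in range(iters):
        r = fn(x)
        J = np.empty((r.size, x.size))
        for j in range(x.size):
            xp = x.copy()
            xp[j] += eps
            J[:, j] = (fn(xp) - r) / eps
        step, *_ = np.linalg.lstsq(J, -r, rcond=None)
        x = x + step
    return x

# Toy scene: 8 points in front of 2 cameras, noise-free observations.
rng = np.random.default_rng(0)
pts_gt = rng.uniform([-1, -1, 4], [1, 1, 6], size=(8, 3))
cams_gt = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
obs = np.stack([project(pts_gt, c) for c in cams_gt])

# Perturb everything, then jointly refine cameras and points.
x0 = np.concatenate([cams_gt.ravel(), pts_gt.ravel()])
x0 += rng.normal(scale=0.05, size=x0.shape)
fn = lambda x: ba_residuals(x, 2, 8, obs)
x_opt = gauss_newton(fn, x0)
rmse = np.sqrt(np.mean(fn(x_opt) ** 2))
```

Because of the gauge freedom, the refined parameters need not match the ground truth exactly, but the reprojection residual is driven to (numerically) zero, which is the quantity BA actually minimizes.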

AI/Paper Review 2024.04.26