Video Open World Papers

Contents

Zero-Shot Learning Videos

Surveys

  • Zero-Shot Action Recognition in Videos: A Survey (Neurocomputing 2021) [Paper]

  • A Review of Generalized Zero-Shot Learning Methods (TPAMI 2022) [Paper]

  • Generalized Zero-Shot Learning for Action Recognition with Web-Scale Video Data (Arxiv 2017) [Paper]

2023 Papers

WACV

  • Language-Free Training for Zero-Shot Video Grounding (WACV 2023) [Paper]
    Datasets: Charades-STA, ActivityNet Captions
    Task: Video Grounding

  • Semantics Guided Contrastive Learning of Transformers for Zero-Shot Temporal Activity Detection (WACV 2023) [Paper]
    Datasets: THUMOS14 and Charades
    Task: Action Recognition

2022 Papers

CVPR

  • Uni-Perceiver: Pre-Training Unified Architecture for Generic Perception for Zero-Shot and Few-Shot Tasks (CVPR 2022) [Paper]
    Datasets: ImageNet-21k; Kinetics-700 and Moments in Time; BookCorpora & English Wikipedia (Books&Wiki) and PAQ; COCO Caption, SBU Captions (SBU), Visual Genome, CC3M, CC12M and YFCC; Flickr30k, MSVD, VQA, and GLUE
    Task: Image-Text Retrieval; Image and Video Classification

  • Cross-Modal Representation Learning for Zero-Shot Action Recognition (CVPR 2022) [Paper] [Code]
    Datasets: Kinetics -> UCF101, HMDB51, and ActivityNet
    Task: Action Recognition

  • Audio-Visual Generalised Zero-Shot Learning With Cross-Modal Attention and Language (CVPR 2022) [Paper] [Code]
    Datasets: VGGSound; UCF101; ActivityNet
    Task: Action Recognition

  • Alignment-Uniformity Aware Representation Learning for Zero-Shot Video Classification (CVPR 2022) [Paper] [Code]
    Datasets: Kinetics-700 -> UCF101, HMDB51
    Task: Action Recognition

ECCV

  • Temporal and cross-modal attention for audio-visual zero-shot learning (ECCV 2022) [Paper] [Code]
    Datasets: UCF-GZSL^cls, VGGSound-GZSL^cls, and ActivityNet-GZSL^cls
    Task: Action Recognition

  • CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition (ECCV 2022) [Paper] [Code]
    Datasets: Olympic Sports; UCF-101; HMDB-51
    Task: Action Recognition

  • Rethinking Zero-Shot Action Recognition: Learning from Latent Atomic Actions (ECCV 2022) [Paper]
    Datasets: KineticsZSAR, HMDB51, and UCF101
    Task: Action Recognition

  • Zero-Shot Temporal Action Detection via Vision-Language Prompting (ECCV 2022) [Paper] [Code]
    Datasets: THUMOS14; ActivityNet v1.3
    Task: Temporal Action Detection (TAD)

2021 Papers

CVPR

  • Recognizing Actions in Videos From Unseen Viewpoints (CVPR 2021) [Paper]
    Datasets: Human3.6M, MLB-YouTube, Toyota SmartHome (TSH), NTU-RGB-D
    Task: Action Recognition

BMVC

  • Zero-Shot Action Recognition from Diverse Object-Scene Compositions (BMVC 2021) [Paper] [Code]
    Datasets: UCF-101, Kinetics-400
    Task: Action Recognition

Older Papers

  • Out-Of-Distribution Detection for Generalized Zero-Shot Action Recognition (CVPR 2019) [Paper] [Code]
    Datasets: Olympic Sports, HMDB51 and UCF101
    Task: Action Recognition

  • Towards Universal Representation for Unseen Action Recognition (CVPR 2018) [Paper]
    Datasets: ActivityNet, HMDB51 and UCF101
    Task: Action Recognition

Out-of-Distribution Detection Videos

2023 Papers

CVPR

  • Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training (CVPR 2023) [Paper]

2022 Papers

CVPR

  • Unknown-Aware Object Detection: Learning What You Don't Know From Videos in the Wild (CVPR 2022) [Paper] [Code]
    Datasets: (Videos -> Images) BDD100K and YouTube-Video Instance Segmentation (YouTube-VIS) 2021 (ID); MS-COCO and nuImages (OOD)
    Task: Object Detection

Older Papers

  • Uncertainty-aware audiovisual activity recognition using deep Bayesian variational inference (ICCV 2019) [Paper]
    Datasets: MiT
    Task: Audiovisual Action Recognition

  • Bayesian activity recognition using variational inference (NeurIPS 2018) [Paper]
    Datasets: MiT video activity recognition dataset
    Task: Action Recognition

  • Out-Of-Distribution Detection for Generalized Zero-Shot Action Recognition (CVPR 2019) [Paper] [Code]
    Datasets: Olympic Sports, HMDB51 and UCF101
    Task: Action Recognition

Open-Set Recognition Videos

2023 Papers

CVPR

  • Enlarging Instance-specific and Class-specific Information for Open-set Action Recognition (CVPR 2023) [Paper]

  • AutoLabel: CLIP-based framework for Open-set Video Domain Adaptation (CVPR 2023) [Paper]

  • Open Set Action Recognition via Multi-Label Evidential Learning (CVPR 2023) [Paper]

  • OpenGait: Revisiting Gait Recognition Towards Better Practicality (CVPR 2023) [Paper]

  • Open-Category Human-Object Interaction Pre-training via Language Modeling Framework (CVPR 2023) [Paper]

  • SUDS: Scalable Urban Dynamic Scenes (CVPR 2023) [Paper]

WACV

  • Reconstructing Humpty Dumpty: Multi-Feature Graph Autoencoder for Open Set Action Recognition (WACV 2023) [Paper] [Code]
    Datasets: HMDB-51, UCF-101
    Task: Action Recognition

Arxiv & Others

  • Video Instance Segmentation in an Open-World (Arxiv 2023) [Paper]

  • Plan4MC: Skill Reinforcement Learning and Planning for Open-World Minecraft Tasks (Arxiv 2023) [Paper]

  • POAR: Towards Open-World Pedestrian Attribute Recognition (Arxiv 2023) [Paper]

  • Learning to Operate in Open Worlds by Adapting Planning Models (AAMAS 2023) [Paper]

  • PyReason: Software for Open World Temporal Logic (AAAI 2023) [Paper]

  • NovPhy: A Testbed for Physical Reasoning in Open-world Environments (Arxiv 2023) [Paper]

  • Improving Audio-Visual Video Parsing with Pseudo Visual Labels (Arxiv 2023) [Paper]

  • Open-World Object Manipulation using Pre-trained Vision-Language Models (Arxiv 2023) [Paper]

  • Towards Generalized Robot Assembly through Compliance-Enabled Contact Formations (ICRA 2023) [Paper]

  • Discovering Novel Actions in an Open World with Object-Grounded Visual Commonsense Reasoning (Arxiv 2023) [Paper]

  • Temporal-controlled Frame Swap for Generating High-Fidelity Stereo Driving Data for Autonomy Analysis (Arxiv 2023) [Paper]

2022 Papers

CVPR

  • Opening Up Open World Tracking (CVPR 2022) [Paper] [Code]
    Datasets: TAO-OW
    Task: Object Tracking

  • OpenTAL: Towards Open Set Temporal Action Localization (CVPR 2022) [Paper] [Code]
    Datasets: THUMOS14, ActivityNet1.3
    Task: Temporal Action Localization

  • UBnormal: New Benchmark for Supervised Open-Set Video Anomaly Detection (CVPR 2022) [Paper] [Code]
    Datasets: UBnormal, CUHK Avenue, ShanghaiTech
    Task: Anomaly Detection

  • Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity (CVPR 2022) [Paper]
    Datasets: COCO 17, LVIS, UVO (videos), ADE20k
    Task: Instance Segmentation

ECCV

  • Towards Open Set Video Anomaly Detection (ECCV 2022) [Paper]
    Datasets: XD Violence, UCF Crime, ShanghaiTech Campus
    Task: Anomaly Detection

Arxiv & Others

  • Human Activity Recognition in an Open World (Submitted to JAIR 2022) [Paper]

  • Self-Initiated Open World Learning for Autonomous AI Agents (AAAI 2022 Spring Symposium Series) [Paper]

  • UVO Challenge on Video-based Open-World Segmentation 2021: 1st Place Solution (Arxiv 2022) [Paper]

2021 Papers

CVPR

  • Generalizing to the Open World: Deep Visual Odometry With Online Adaptation (CVPR 2021) [Paper]
    Datasets: Cityscapes, KITTI, indoor TUM, NYUv2
    Task: Depth Estimation

ICCV

  • Evidential Deep Learning for Open Set Action Recognition (ICCV 2021) [Paper] [Code]
    Datasets: UCF-101, HMDB-51, MiT-v2
    Task: Action Recognition

  • Unidentified Video Objects: A Benchmark for Dense, Open-World Segmentation (ICCV 2021) [Paper]
    Datasets: UVO, COCO
    Task: Video Object Detection and Segmentation

Arxiv & Others

  • Conditional Extreme Value Theory for Open Set Video Domain Adaptation (MMAsia 2021) [Paper]

  • Dual Metric Discriminator for Open Set Video Domain Adaptation (ICASSP 2021) [Paper]

  • Open-World Active Learning with Stacking Ensemble for Self-Driving Cars (Arxiv 2021) [Paper]

  • Physical Reasoning in an Open World (ACS 2021) [Paper]

  • Person Re-identification based on Robust Features in Open-world (Arxiv 2021) [Paper]

  • Online Action Recognition (AAAI 2021) [Paper]

  • ViNG: Learning Open-World Navigation with Visual Goals (ICRA 2021) [Paper]

Older Papers

  • Specifying weight priors in Bayesian deep neural networks with empirical Bayes (AAAI 2020) [Paper]
    Datasets: UCF-101, Urban Sound 8K, MNIST, Fashion-MNIST, CIFAR10
    Task: Image and Audio Classification, Video Activity Recognition

  • P-ODN: Prototype-based Open Deep Network for Open Set Recognition (Scientific Reports 2020) [Paper]
    Datasets: UCF11, UCF50, UCF101 and HMDB51
    Task: Action Recognition

  • Uncertainty-aware audiovisual activity recognition using deep Bayesian variational inference (ICCV 2019) [Paper]
    Datasets: MiT
    Task: Audiovisual Action Recognition

  • Bayesian activity recognition using variational inference (NeurIPS 2018) [Paper]
    Datasets: MiT video activity recognition dataset
    Task: Action Recognition

  • ODN: Opening the deep network for open-set action recognition (ICME 2018) [Paper]
    Datasets: HMDB51, UCF50, UCF101
    Task: Action Recognition

  • Open-World Stereo Video Matching with Deep RNN (ECCV 2018) [Paper]
    Datasets: KITTI VO, Middlebury Stereo 2005 & 2006, Freiburg Sceneflow, Random dot, Synthia
    Task: Stereo Video Matching

  • Adversarial Open-World Person Re-Identification (ECCV 2018) [Paper]
    Datasets: Market-1501, CUHK01, CUHK03
    Task: Person Re-Identification

  • From Open Set to Closed Set: Counting Objects by Spatial Divide-and-Conquer (ICCV 2019) [Paper] [Code]
    Datasets: Synthesized Cell Counting, UCF-QNRF, ShanghaiTech, UCFCC50, TRANCOS and MTC
    Task: Visual Counting

  • AutOTranS: an Autonomous Open World Transportation System (Arxiv 2018) [Paper]

  • From Known to the Unknown: Transferring Knowledge to Answer Questions about Novel Visual and Semantic Concepts (Arxiv 2018) [Paper]

  • Visual Curiosity: Learning to Ask Questions to Learn Visual Recognition (CoRL 2018 Oral) [Paper]

  • Towards Large-Scale Video Object Mining (ECCVw 2018) [Paper]

Novel Class Discovery Videos

2023 Papers

CVPR

  • Open-Category Human-Object Interaction Pre-training via Language Modeling Framework (CVPR 2023) [Paper]

Arxiv & Others

  • NEV-NCD: Negative Learning, Entropy, and Variance regularization based novel action categories discovery (Arxiv 2023) [Paper]

2022 Papers

ECCV

  • Text-based Temporal Localization of Novel Events (ECCV 2022) [Paper]
    Datasets: Charades-STA Unseen, ActivityNet Captions Unseen
    Task: Temporal Action Localization

  • Discovering Objects That Can Move (ECCV 2022) [Paper] [Code]
    Datasets: KITTI; CATER; TRI-PD
    Task: Object Segmentation

2021 Papers

ICCV

  • Joint Representation Learning and Novel Category Discovery on Single- and Multi-Modal Data (ICCV 2021) [Paper]
    Datasets: ImageNet; CIFAR-10/CIFAR-100; Kinetics-400; VGG-Sound
    Task: Multimodal Data

  • Learning To Better Segment Objects From Unseen Classes With Unlabeled Videos (ICCV 2021) [Paper] [Code]
    Datasets: COCO -> Unseen-VIS; DAVIS
    Task: Instance Segmentation

BMVC

  • Unsupervised Discovery of Actions in Instructional Videos (BMVC 2021) [Paper]
    Datasets: 50-salads dataset, Narrated Instructional Videos (NIV) dataset, Breakfast dataset
    Task: Action Discovery

Older Papers

  • Tracking the Known and the Unknown by Leveraging Semantic Information (BMVC 2019) [Paper] [Code]
    Datasets: NFS, UAV123, LaSOT, TrackingNet, VOT2018
    Task: Object Tracking

  • DetectFusion: Detecting and Segmenting Both Known and Unknown Dynamic Objects in Real-time SLAM (BMVC 2019) [Paper]
    Datasets: TUM-RGB-D, MS COCO, PASCAL VOC
    Task: Object Tracking and Segmentation

  • Localizing Novel Attended Objects in Egocentric Views (BMVC 2020) [Paper]
    Datasets: GTEA Gaze+, Toy Room
    Task: Novel Object Localization

  • Video Face Clustering With Unknown Number of Clusters (ICCV 2019) [Paper] [Code]
    Datasets: MovieGraphs, The Big Bang Theory (BBT) and Buffy the Vampire Slayer (BUFFY)
    Task: Face Clustering

  • Incremental Class Discovery for Semantic Segmentation With RGBD Sensing (ICCV 2019) [Paper]
    Datasets: NYUDv2
    Task: Semantic Segmentation

  • Object Discovery in Videos as Foreground Motion Clustering (ICCV 2019) [Paper]
    Datasets: Flying Things 3D (FT3D), DAVIS2016, Freiburg-Berkeley motion segmentation, Complex Background, and Camouflaged Animal
    Task: Object Discovery

Open Vocabulary Videos

2023 Papers

CVPR

  • Open-Category Human-Object Interaction Pre-training via Language Modeling Framework (CVPR 2023) [Paper]

  • Being Comes from Not-being: Open-vocabulary Text-to-Motion Generation with Wordless Training (CVPR 2023) [Paper]

  • OVTrack: Open-Vocabulary Multiple Object Tracking (CVPR 2023) [Paper]

ICLR

  • The Devil is in the Wrongly-classified Samples: Towards Unified Open-set Recognition (ICLR 2023) [Paper]
    Datasets: CIFAR100, LSUN, MiTv2, UCF101, HMDB51
    Task: Image and Video Classification

ICML

  • Open-VCLIP: Transforming CLIP to an Open-vocabulary Video Model via Interpolated Weight Optimization (ICML 2023) [Paper]

Arxiv & Others

  • Transforming CLIP to an Open-vocabulary Video Model via Interpolated Weight Optimization (Arxiv 2023) [Paper]

  • TagCLIP: Improving Discrimination Ability of Open-Vocabulary Semantic Segmentation (Arxiv 2023) [Paper]

  • MVP-SEG: Multi-View Prompt Learning for Open-Vocabulary Semantic Segmentation (Arxiv 2023) [Paper]

  • Segment Everything Everywhere All at Once (Arxiv 2023) [Paper]

  • Towards Open-Vocabulary Video Instance Segmentation (Arxiv 2023) [Paper]

  • CLIP Surgery for Better Explainability with Enhancement in Open-Vocabulary Tasks (Arxiv 2023) [Paper]

  • Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition (Arxiv 2023) [Paper]

  • V3Det: Vast Vocabulary Visual Detection Dataset (Arxiv 2023) [Paper]

  • Token Merging for Fast Stable Diffusion (Arxiv 2023) [Paper]

  • Going Beyond Nouns With Vision & Language Models Using Synthetic Data (Arxiv 2023) [Paper]

  • MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks (Arxiv 2023) [Paper]

  • ZBS: Zero-shot Background Subtraction via Instance-level Background Modeling and Foreground Selection (CVPR 2023) [Paper]

  • Prompt-Guided Transformers for End-to-End Open-Vocabulary Object Detection (Arxiv 2023) [Paper]

  • Three ways to improve feature alignment for open vocabulary detection (Arxiv 2023) [Paper]

  • Zero-guidance Segmentation Using Zero Segment Labels (Arxiv 2023) [Paper]

  • Open-Vocabulary Object Detection using Pseudo Caption Labels (Arxiv 2023) [Paper]

  • Uni-Fusion: Universal Continuous Mapping (Arxiv 2023) [Paper]

2022 Papers

Arxiv & Others

  • Open-Vocabulary Temporal Action Detection with Off-the-Shelf Image-Text Features (Arxiv 2022) [Paper]

Fine Grained Videos

2023 Papers

CVPR

  • On the Difficulty of Unpaired Infrared-to-Visible Video Translation: Fine-Grained Content-Rich Patches Transfer (CVPR 2023) [Paper]

  • ANetQA: A Large-scale Benchmark for Fine-grained Compositional Reasoning over Untrimmed Videos (CVPR 2023) [Paper]

  • MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models (CVPR 2023) [Paper]

  • Modeling Video as Stochastic Processes for Fine-Grained Video Representation Learning (CVPR 2023) [Paper]

  • Class Prototypes based Contrastive Learning for Classifying Multi-Label and Fine-Grained Educational Videos (CVPR 2023) [Paper]

  • Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis (CVPR 2023) [Paper]

  • Focus On Details: Online Multi-object Tracking with Diverse Fine-grained Representation (CVPR 2023) [Paper]

  • Fine-grained Audible Video Description (CVPR 2023) [Paper]

WACV

  • Fine-Grained Activities of People Worldwide (WACV 2023) [Paper] [Code]
    Datasets: Consented Activities of People (CAP)
    Task: Action Recognition

  • Fine-Grained Affordance Annotation for Egocentric Hand-Object Interaction Videos (WACV 2023) [Paper]
    Datasets: EPIC-KITCHENS
    Task: Action Recognition

Arxiv & Others

  • Hand Guided High Resolution Feature Enhancement for Fine-Grained Atomic Action Segmentation Within Complex Human Assemblies (WACVw 2023) [Paper]

  • A Transformer-Based Late-Fusion Mechanism for Fine-Grained Object Recognition in Videos (WACVw 2023) [Paper]

  • Simplifying Open-Set Video Domain Adaptation with Contrastive Learning (CVIU 2023 under review) [Paper]

2022 Papers

CVPR

  • FineDiving: A Fine-Grained Dataset for Procedure-Aware Action Quality Assessment (CVPR 2022) [Paper] [Code]
    Datasets: FineDiving
    Task: Action Quality Assessment

  • Fine-Grained Temporal Contrastive Learning for Weakly-Supervised Temporal Action Localization (CVPR 2022) [Paper] [Code]
    Datasets: THUMOS14; ActivityNet1.3
    Task: Temporal Action Localization

  • How Do You Do It? Fine-Grained Action Understanding With Pseudo-Adverbs (CVPR 2022) [Paper] [Code]
    Datasets: VATEX Adverbs, ActivityNet Adverbs and MSR-VTT Adverbs
    Task: Adverb Recognition

  • EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching (CVPR 2022) [Paper] [Code]
    Datasets: VATEX-EVAL; ActivityNet-FOIL
    Task: Video Captioning

ECCV

  • Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition (ECCV 2022) [Paper]
    Datasets: Diving48
    Task: Action Recognition

  • Exploring Fine-Grained Audiovisual Categorization with the SSW60 Dataset (ECCV 2022) [Paper] [Code]
    Datasets: SSW60
    Task: Action Recognition

  • Weakly-Supervised Temporal Action Detection for Fine-Grained Videos with Hierarchical Atomic Actions (ECCV 2022) [Paper] [Code]
    Datasets: FineAction; FineGym
    Task: Action Recognition

  • Semantic-Aware Fine-Grained Correspondence (ECCV 2022) [Paper]
    Datasets: DAVIS-2017; JHMDB; Video Instance Parsing (VIP)
    Task: Video Object Segmentation, Human Pose Tracking, Human Part Tracking

  • Spotting Temporally Precise, Fine-Grained Events in Video (ECCV 2022) [Paper]
    Datasets: Tennis, Figure Skating, FineDiving, and Fine-Gym
    Task: Temporally Precise Spotting

  • Fine-Grained Egocentric Hand-Object Segmentation: Dataset, Model, and Applications (ECCV 2022) [Paper] [Code]
    Datasets: EPIC-KITCHENS; Ego4d; THU-READ; Escape Room
    Task: Semantic Segmentation

CVPRw

  • FenceNet: Fine-Grained Footwork Recognition in Fencing (CVPRw 2022) [Paper]
    Datasets: FFD (a publicly available fencing dataset)
    Task: Action Recognition

2021 Papers

CVPR

  • Fine-Grained Shape-Appearance Mutual Learning for Cloth-Changing Person Re-Identification (CVPR 2021) [Paper]

  • Temporal Query Networks for Fine-Grained Video Understanding (CVPR 2021) [Paper]

  • GLAVNet: Global-Local Audio-Visual Cues for Fine-Grained Material Recognition (CVPR 2021) [Paper]

ICCV

  • Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-Identification (ICCV 2021) [Paper]

  • Video Pose Distillation for Few-Shot, Fine-Grained Sports Action Recognition (ICCV 2021) [Paper]

  • FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting (ICCV 2021) [Paper]

Older Papers

  • FineGym: A Hierarchical Video Dataset for Fine-Grained Action Understanding (CVPR 2020) [Paper]

  • Multi-Modal Domain Adaptation for Fine-Grained Action Recognition (CVPR 2020) [Paper]

  • Revisiting Pose-Normalization for Fine-Grained Few-Shot Recognition (CVPR 2020) [Paper]

  • Fine-Grained Video-Text Retrieval With Hierarchical Graph Reasoning (CVPR 2020) [Paper]

  • Stochastic Fine-grained Labeling of Multi-state Sign Glosses for Continuous Sign Language Recognition (ECCV 2020) [Paper]

  • Fine-Grained Motion Representation For Template-Free Visual Tracking (WACV 2020) [Paper]

  • WHENet: Real-time Fine-Grained Estimation for Wide Range Head Pose (BMVC 2020) [Paper]

  • Yoga-82: A New Dataset for Fine-Grained Classification of Human Poses (CVPRw 2020) [Paper]

  • Fine-Grained Pointing Recognition for Natural Drone Guidance (CVPRw 2020) [Paper]

  • Local Temporal Bilinear Pooling for Fine-Grained Action Parsing (CVPR 2019) [Paper]

  • Drive&Act: A Multi-Modal Dataset for Fine-Grained Driver Behavior Recognition in Autonomous Vehicles (ICCV 2019) [Paper]

  • Fine-Grained Action Retrieval Through Multiple Parts-of-Speech Embeddings (ICCV 2019) [Paper]

  • Learning Motion in Feature Space: Locally-Consistent Deformable Convolution Networks for Fine-Grained Action Detection (ICCV 2019) [Paper]

  • ViSiL: Fine-Grained Spatio-Temporal Video Similarity Learning (ICCV 2019) [Paper]

  • Fine-Grained Visual Dribbling Style Analysis for Soccer Videos With Augmented Dribble Energy Image (CVPRw 2019) [Paper]

  • Anticipation of Human Actions With Pose-Based Fine-Grained Representations (CVPRw 2019) [Paper]

  • Fine-Grained Video Captioning for Sports Narrative (CVPR 2018) [Paper]

  • Where Will They Go? Predicting Fine-Grained Adversarial Multi-Agent Motion using Conditional Variational Autoencoders (ECCV 2018) [Paper]

  • Fine-grained Video Categorization with Redundancy Reduction Attention (ECCV 2018) [Paper]

  • Fine-Grained Head Pose Estimation Without Keypoints (CVPRw 2018) [Paper]

  • Fine-Grained Activity Recognition in Baseball Videos (CVPRw 2018) [Paper]

Long Tail Videos

2023 Papers

CVPR

  • Use Your Head: Improving Long-Tail Video Recognition (CVPR 2023) [Paper]

  • FEND: A Future Enhanced Distribution-Aware Contrastive Learning Framework For Long-tail Trajectory Prediction (CVPR 2023) [Paper]

2021 Papers

ICCV

  • VideoLT: Large-Scale Long-Tailed Video Recognition (ICCV 2021) [Paper]

  • On Exposing the Challenging Long Tail in Future Prediction of Traffic Actors (ICCV 2021) [Paper]

Anomaly Detection Videos

2023 Papers

ICCV

  • Multimodal Motion Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection (ICCV 2023) [Paper]

WACV

  • DyAnNet: A Scene Dynamicity Guided Self-Trained Video Anomaly Detection Network (WACV 2023) [Paper]
    Datasets: UCF-Crime, CCTV-Fights, UBI-Fights

  • Cross-Domain Video Anomaly Detection Without Target Domain Adaptation (WACV 2023) [Paper]
    Datasets: SHTdc, SHT and Ped2, HMDB, UCF101

  • Bi-Directional Frame Interpolation for Unsupervised Video Anomaly Detection (WACV 2023) [Paper]
    Datasets: UCSD Ped2, CUHK Avenue, ShanghaiTech Campus

  • Towards Interpretable Video Anomaly Detection (WACV 2023) [Paper]
    Datasets: CUHK Avenue, ShanghaiTech Campus

  • Normality Guided Multiple Instance Learning for Weakly Supervised Video Anomaly Detection (WACV 2023) [Paper]
    Datasets: ShanghaiTech, UCF-Crime, XD-Violence

2022 Papers

CVPR

  • UBnormal: New Benchmark for Supervised Open-Set Video Anomaly Detection (CVPR 2022) [Paper] [Code]
    Datasets: UBnormal, CUHK Avenue, ShanghaiTech

  • Deep Anomaly Discovery From Unlabeled Videos via Normality Advantage and Self-Paced Refinement (CVPR 2022) [Paper] [Code]
    Datasets: UCSD Ped1/UCSD Ped2, Avenue and ShanghaiTech

  • Self-Supervised Predictive Convolutional Attentive Block for Anomaly Detection (CVPR 2022) [Paper] [Code]
    Datasets: MVTec AD, Avenue and ShanghaiTech

  • Anomaly Detection via Reverse Distillation From One-Class Embedding (CVPR 2022) [Paper]
    Datasets: MVTec; MNIST, FashionMNIST and CIFAR10

  • Bayesian Nonparametric Submodular Video Partition for Robust Anomaly Detection (CVPR 2022) [Paper]
    Datasets: ShanghaiTech, Avenue, UCF-Crime

  • Towards Total Recall in Industrial Anomaly Detection (CVPR 2022) [Paper] [Code]
    Datasets: MVTec; Magnetic Tile Defects (MTD); Mini ShanghaiTech Campus (mSTC)

  • Generative Cooperative Learning for Unsupervised Video Anomaly Detection (CVPR 2022) [Paper] [Code]
    Datasets: UCF-Crime (UCFC); ShanghaiTech

ICML

  • Latent Outlier Exposure for Anomaly Detection with Contaminated Data (ICML 2022) [Paper] [Code]
    Datasets: CIFAR-10, Fashion-MNIST, MVTEC, 30 tabular data sets, UCSD Peds1

ECCV

  • Towards Open Set Video Anomaly Detection (ECCV 2022) [Paper]
    Datasets: XD Violence, UCF Crime, ShanghaiTech Campus

  • Scale-Aware Spatio-Temporal Relation Learning for Video Anomaly Detection (ECCV 2022) [Paper] [Code]
    Datasets: UCF-Crime (UCFC); ShanghaiTech

  • Dynamic Local Aggregation Network with Adaptive Clusterer for Anomaly Detection (ECCV 2022) [Paper] [Code]
    Datasets: CUHK Avenue; UCSD Ped2; ShanghaiTech

  • Video Anomaly Detection by Solving Decoupled Spatio-Temporal Jigsaw Puzzles (ECCV 2022) [Paper]
    Datasets: CUHK Avenue; UCSD Ped2; ShanghaiTech

  • Self-Supervised Sparse Representation for Video Anomaly Detection (ECCV 2022) [Paper] [Code]
    Datasets: ShanghaiTech, UCF-Crime, and XD-Violence

  • Registration Based Few-Shot Anomaly Detection (ECCV 2022) [Paper] [Code]
    Datasets: MVTec; MPDD

  • DenseHybrid: Hybrid Anomaly Detection for Dense Open-set Recognition (ECCV 2022) [Paper] [Code]
    Datasets: Fishyscapes, SegmentMeIfYouCan (SMIYC), StreetHazards

CVPRw

  • Unsupervised Anomaly Detection From Time-of-Flight Depth Images (CVPRw 2022) [Paper]
    Datasets: TIMo

  • Adversarial Machine Learning Attacks Against Video Anomaly Detection Systems (CVPRw 2022) [Paper]
    Datasets: CUHK Avenue, the ShanghaiTech Campus

  • Anomaly Detection in Autonomous Driving: A Survey (CVPRw 2022) [Paper]

WACV

  • A Modular and Unified Framework for Detecting and Localizing Video Anomalies (WACV 2022) [Paper]
    Datasets: CUHK Avenue, UCSD Ped2, ShanghaiTech Campus, UR fall

  • FastAno: Fast Anomaly Detection via Spatio-Temporal Patch Transformation (WACV 2022) [Paper]
    Datasets: CUHK Avenue, UCSD Ped2, ShanghaiTech Campus

  • Multi-Branch Neural Networks for Video Anomaly Detection in Adverse Lighting and Weather Conditions (WACV 2022) [Paper] [Code]
    Datasets: CUHK Avenue (Augmented)

  • Discrete Neural Representations for Explainable Anomaly Detection (WACV 2022) [Paper] [Code]
    Datasets: CUHK Avenue, UCSD Ped2, X-MAN

  • Rethinking Video Anomaly Detection - A Continual Learning Approach (WACV 2022) [Paper]
    Datasets: NOLA

2021 Papers

CVPRw

  • Box-Level Tube Tracking and Refinement for Vehicles Anomaly Detection (CVPRw 2021) [Paper]

  • Dual-Modality Vehicle Anomaly Detection via Bilateral Trajectory Tracing (CVPRw 2021) [Paper]

  • A Vision-Based System for Traffic Anomaly Detection Using Deep Learning and Decision Trees (CVPRw 2021) [Paper]

  • Good Practices and a Strong Baseline for Traffic Anomaly Detection (CVPRw 2021) [Paper]

  • An Efficient Approach for Anomaly Detection in Traffic Videos (CVPRw 2021) [Paper]

  • Spacecraft Time-Series Anomaly Detection Using Transfer Learning (CVPRw 2021) [Paper]

Older Papers

  • Multi-Granularity Tracking With Modularlized Components for Unsupervised Vehicles Anomaly Detection (CVPRw 2020) [Paper]

  • Fractional Data Distillation Model for Anomaly Detection in Traffic Videos (CVPRw 2020) [Paper]

  • Towards Real-Time Systems for Vehicle Re-Identification, Multi-Camera Tracking, and Anomaly Detection (CVPRw 2020) [Paper]

  • Fast Unsupervised Anomaly Detection in Traffic Videos (CVPRw 2020) [Paper]

  • Continual Learning for Anomaly Detection in Surveillance Videos (CVPRw 2020) [Paper]

  • Any-Shot Sequential Anomaly Detection in Surveillance Videos (CVPRw 2020) [Paper]

  • Challenges in Time-Stamp Aware Anomaly Detection in Traffic Videos (CVPRw 2019) [Paper]

  • Traffic Anomaly Detection via Perspective Map based on Spatial-temporal Information Matrix (CVPRw 2019) [Paper]

  • Unsupervised Traffic Anomaly Detection Using Trajectories (CVPRw 2019) [Paper]

  • Attention Driven Vehicle Re-identification and Unsupervised Anomaly Detection for Traffic Understanding (CVPRw 2019) [Paper]

  • A Comparative Study of Faster R-CNN Models for Anomaly Detection in 2019 AI City Challenge (CVPRw 2019) [Paper]

  • Anomaly Candidate Identification and Starting Time Estimation of Vehicles from Traffic Videos (CVPRw 2019) [Paper]

  • Hybrid Deep Network for Anomaly Detection (BMVC 2019) [Paper]
    Datasets: CUHK Avenue, UCSD Ped2, Belleview, Traffic-Train

  • Motion-Aware Feature for Improved Video Anomaly Detection (BMVC 2019) [Paper]
    Datasets: UCF Crime

  • Adversarially Learned One-Class Classifier for Novelty Detection (CVPR 2018) [Paper]
    Datasets: MNIST, Caltech-256, UCSD Ped2
    Task: Image Classification, Anomaly Detection

  • Real-World Anomaly Detection in Surveillance Videos (CVPR 2018) [Paper] [Code]
    Datasets: Real-world Surveillance Videos

  • Future Frame Prediction for Anomaly Detection – A New Baseline (CVPR 2018) [Paper] [Code]
    Datasets: CUHK Avenue, UCSD Ped1, UCSD Ped2, ShanghaiTech, Paper's toy dataset

  • Unsupervised Anomaly Detection for Traffic Surveillance Based on Background Modeling (CVPRw 2018) [Paper]

  • Dual-Mode Vehicle Motion Pattern Learning for High Performance Road Traffic Anomaly Detection (CVPRw 2018) [Paper]

Novelty Detection

ECCV

  • incDFM: Incremental Deep Feature Modeling for Continual Novelty Detection (ECCV 2022) [Paper]
    Datasets: CIFAR-10 (10 classes), CIFAR-100 (super-class level, 20 classes), EMNIST (26 classes) and iNaturalist21 (phylum level, 9 classes)
    Task: Image Classification

WACV

  • One-Class Learned Encoder-Decoder Network With Adversarial Context Masking for Novelty Detection (WACV 2022) [Paper] [Code]
    Datasets: MNIST, CIFAR-10, UCSD
    Task: Novelty Detection, Anomaly Detection

2021 Papers

CVPR

  • Learning Deep Classifiers Consistent With Fine-Grained Novelty Detection (CVPR 2021) [Paper]
    Datasets: small- and large-scale FGVC
    Task: Novelty Detection

AAAI

  • A Unifying Framework for Formal Theories of Novelty: Framework, Examples and Discussion (AAAI 2021) [Paper]

BMVC

  • Multi-Class Novelty Detection with Generated Hard Novel Features (BMVC 2021) [Paper]
    Datasets: Stanford Dogs, Caltech 256, CUB 200, FounderType-200
    Task: Image Classification

Older Papers

  • Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents (NeurIPS 2018) [Paper]
    Datasets: OpenAI Gym
    Task: Reinforcement Learning

  • Multivariate Triangular Quantile Maps for Novelty Detection (NeurIPS 2019) [Paper] [Code]
    Datasets: MNIST and Fashion-MNIST, KDDCUP and Thyroid
    Task: Image Classification

  • Multi-class Novelty Detection Using Mix-up Technique (WACV 2020) [Paper]
    Datasets: Caltech 256 and Stanford Dogs
    Task: Image Classification

  • Hierarchical Novelty Detection for Visual Object Recognition (CVPR 2018) [Paper]
    Datasets: ImageNet, AwA2, CUB
    Task: Image Classification

  • Adversarially Learned One-Class Classifier for Novelty Detection (CVPR 2018) [Paper]
    Datasets: MNIST, Caltech-256, UCSD Ped2
    Task: Image Classification, Anomaly Detection

  • Multiple Class Novelty Detection Under Data Distribution Shift (ECCV 2020) [Paper]
    Datasets: SVHN, MNIST and USPS, Office-31
    Task: Image Classification

  • Utilizing Patch-level Category Activation Patterns for Multiple Class Novelty Detection (ECCV 2020) [Paper]
    Datasets: Caltech256, CUB-200, Stanford Dogs and FounderType-200
    Task: Image Classification

  • Unsupervised and Semi-supervised Novelty Detection using Variational Autoencoders in Opportunistic Science Missions (BMVC 2020) [Paper]
    Datasets: Mars novelty detection Mastcam labeled dataset
    Task: Image Classification

  • Where's Wally Now? Deep Generative and Discriminative Embeddings for Novelty Detection (CVPR 2019) [Paper]
    Datasets: CIFAR-10, IN-125
    Task: Image Classification

  • Deep Transfer Learning for Multiple Class Novelty Detection (CVPR 2019) [Paper]
    Datasets: Caltech256, Caltech-UCSD Birds 200 (CUB 200), Stanford Dogs, FounderType-200
    Task: Image Classification

  • Latent Space Autoregression for Novelty Detection (CVPR 2019) [Paper] [Code]
    Datasets: MNIST, CIFAR10, UCSD Ped2 and ShanghaiTech
    Task: Image Classification, Video Anomaly Detection

  • OCGAN: One-Class Novelty Detection Using GANs With Constrained Latent Representations (CVPR 2019) [Paper] [Code]
    Datasets: COIL100, fMNIST, MNIST, CIFAR10
    Task: Image Classification

  • RaPP: Novelty Detection with Reconstruction along Projection Pathway (ICLR 2020) [Paper] [Code]
    Datasets: fMNIST, MNIST, MI-F and MI-V, STL, OTTO, SNSR, EOPT, NASA, RARM
    Task: Image Classification, Anomaly Detection

  • Novelty Detection Via Blurring (ICLR 2020) [Paper]
    Datasets: CIFAR-10, CIFAR-100, CelebA, ImageNet, LSUN, SVHN
    Task: Image Classification

Other Related Papers

  • Understanding Cross-Domain Few-Shot Learning Based on Domain Similarity and Few-Shot Difficulty (NeurIPS 2022) [Paper] [Code]
    Datasets: ImageNet, tieredImageNet, and miniImageNet (source domains); by domain similarity to ImageNet: Places, CUB, Cars, Plantae, EuroSAT, CropDisease, ISIC, ChestX
    Task: Cross-Domain Few-Shot Learning

  • Self-organization in a perceptual network (Info-max) (IEEE 1988) [Paper]

  • Mutual Information-Based Temporal Difference Learning for Human Pose Estimation in Video (CVPR 2023) [Paper]

  • Upcycling Models under Domain and Category Shift (CVPR 2023) [Paper]

  • Generative Meta-Adversarial Network for Unseen Object Navigation (ECCV 2022) [Paper] [Code]
    Datasets: AI2THOR and RoboTHOR
    Task: Object Navigation

Action Recognition Related

Surveys

  • Vision Transformers for Action Recognition: A Survey (Arxiv 2022) [Paper]