Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2025-04-10 | Pychop: Emulating Low-Precision Arithmetic in Numerical Methods and Neural Networks | Erin Carson et.al. | 2504.07835 | null |
2025-04-10 | Traversal Learning Coordination For Lossless And Efficient Distributed Learning | Erdenebileg Batbaatar et.al. | 2504.07471 | null |
2025-04-09 | Identifying regions of interest in whole slide images of renal cell carcinoma | Mohammed Lamine Benomar et.al. | 2504.07313 | null |
2025-04-09 | A new training approach for text classification in Mental Health: LatentGLoss | Korhan Sevinç et.al. | 2504.07245 | null |
2025-04-09 | Deep Learning for Cardiovascular Risk Assessment: Proxy Features from Carotid Sonography as Predictors of Arterial Damage | Christoph Balada et.al. | 2504.06680 | null |
2025-04-08 | Memory-Modular Classification: Learning to Generalize with Memory Replacement | Dahyun Kang et.al. | 2504.06021 | null |
2025-04-08 | Federated Unlearning Made Practical: Seamless Integration via Negated Pseudo-Gradients | Alessio Mora et.al. | 2504.05822 | null |
2025-04-08 | DefMamba: Deformable Visual State Space Model | Leiye Liu et.al. | 2504.05794 | null |
2025-04-08 | Layer-Aware Embedding Fusion for LLMs in Text Classifications | Jiho Gwak et.al. | 2504.05764 | null |
2025-04-07 | REEF: Relevance-Aware and Efficient LLM Adapter for Video Understanding | Sakib Reza et.al. | 2504.05491 | null |
2025-04-07 | Secure Diagnostics: Adversarial Robustness Meets Clinical Interpretability | Mohammad Hossein Najafi et.al. | 2504.05483 | null |
2025-04-07 | Explaining Low Perception Model Competency with High-Competency Counterfactuals | Sara Pohland et.al. | 2504.05254 | null |
2025-04-07 | Federated Learning for Medical Image Classification: A Comprehensive Benchmark | Zhekai Zhou et.al. | 2504.05238 | null |
2025-04-07 | Batch Aggregation: An Approach to Enhance Text Classification with Correlated Augmented Data | Charco Hui et.al. | 2504.05020 | null |
2025-04-07 | RS-RAG: Bridging Remote Sensing Imagery and Comprehensive Knowledge with a Multi-Modal Dataset and Retrieval-Augmented Generation Model | Congcong Wen et.al. | 2504.04988 | null |
2025-04-06 | Your Image Generator Is Your New Private Dataset | Nicolo Resmini et.al. | 2504.04582 | null |
2025-04-06 | Attributed Synthetic Data Generation for Zero-shot Domain-specific Image Classification | Shijian Wang et.al. | 2504.04510 | null |
2025-04-06 | Spatial-Geometry Enhanced 3D Dynamic Snake Convolutional Neural Network for Hyperspectral Image Classification | Guandong Li et.al. | 2504.04463 | null |
2025-04-05 | A Comparative Study of Explainable AI Methods: Model-Agnostic vs. Model-Specific Approaches | Keerthi Devireddy et.al. | 2504.04276 | null |
2025-04-05 | GlotEval: A Test Suite for Massively Multilingual Evaluation of Large Language Models | Hengyu Luo et.al. | 2504.04155 | null |
2025-04-05 | Scaling Federated Learning Solutions with Kubernetes for Synthesizing Histopathology Images | Andrei-Alexandru Preda et.al. | 2504.04130 | null |
2025-04-04 | Adaptive Classification of Interval-Valued Time Series | Wan Tian et.al. | 2504.03318 | null |
2025-04-04 | Beyond the Next Token: Towards Prompt-Robust Zero-Shot Classification via Efficient Multi-Token Prediction | Junlang Qian et.al. | 2504.03159 | null |
2025-04-03 | HQViT: Hybrid Quantum Vision Transformer for Image Classification | Hui Zhang et.al. | 2504.02730 | null |
2025-04-03 | LLM-Guided Evolution: An Autonomous Model Optimization for Object Detection | YiMing Yu et.al. | 2504.02280 | null |
2025-04-02 | Neural Style Transfer for Synthesising a Dataset of Ancient Egyptian Hieroglyphs | Lewis Matheson Creed et.al. | 2504.02163 | null |
2025-04-02 | A thorough benchmark of automatic text classification: From traditional approaches to large language models | Washington Cunha et.al. | 2504.01930 | link |
2025-04-02 | A Randomized Zeroth-Order Hierarchical Framework for Heterogeneous Federated Learning | Yuyang Qiu et.al. | 2504.01839 | null |
2025-04-02 | A Novel Approach To Implementing Knowledge Distillation In Tsetlin Machines | Calvin Kinateder et.al. | 2504.01798 | null |
2025-04-02 | Token Pruning in Audio Transformers: Optimizing Performance and Decoding Patch Importance | Taehan Lee et.al. | 2504.01690 | link |
2025-04-02 | All Patches Matter, More Patches Better: Enhance AI-Generated Image Detection via Panoptic Patch Learning | Zheng Yang et.al. | 2504.01396 | null |
2025-04-01 | TenAd: A Tensor-based Low-rank Black Box Adversarial Attack for Video Classification | Kimia Haghjooei et.al. | 2504.01228 | null |
2025-04-01 | PolygoNet: Leveraging Simplified Polygonal Representation for Effective Image Classification | Salim Khazem et.al. | 2504.01214 | link |
2025-04-01 | Enabling Efficient Processing of Spiking Neural Networks with On-Chip Learning on Commodity Neuromorphic Processors for Edge AI Systems | Rachmad Vidya Wicaksana Putra et.al. | 2504.00957 | null |
2025-04-01 | Impact of Data Duplication on Deep Neural Network-Based Image Classifiers: Robust vs. Standard Models | Alireza Aghabagherloo et.al. | 2504.00638 | null |
2025-04-01 | Geometric Median Matching for Robust k-Subset Selection from Noisy Data | Anish Acharya et.al. | 2504.00564 | null |
2025-03-31 | NoProp: Training Neural Networks without Back-propagation or Forward-propagation | Qinyu Li et.al. | 2503.24322 | null |
2025-03-31 | CIBR: Cross-modal Information Bottleneck Regularization for Robust CLIP Generalization | Yingrui Ji et.al. | 2503.24182 | null |
2025-03-31 | PixelCAM: Pixel Class Activation Mapping for Histology Image Classification and ROI Localization | Alexis Guichemerre et.al. | 2503.24135 | link |
2025-03-31 | Crossmodal Knowledge Distillation with WordNet-Relaxed Text Embeddings for Robust Image Classification | Chenqi Guo et.al. | 2503.24017 | null |
2025-03-31 | FlexiMo: A Flexible Remote Sensing Foundation Model | Xuyang Li et.al. | 2503.23844 | null |
2025-03-31 | Expanding-and-Shrinking Binary Neural Networks | Xulong Shi et.al. | 2503.23709 | link |
2025-03-31 | WHERE and WHICH: Iterative Debate for Biomedical Synthetic Data Augmentation | Zhengyi Zhao et.al. | 2503.23673 | null |
2025-03-30 | Efficient Dynamic Attention 3D Convolution for Hyperspectral Image Classification | Guandong Li et.al. | 2503.23472 | null |
2025-03-30 | KernelDNA: Dynamic Kernel Sharing via Decoupled Naive Adapters | Haiduo Huang et.al. | 2503.23379 | link |
2025-03-29 | Optimizing Distributed Training Approaches for Scaling Neural Networks | Vishnu Vardhan Baligodugula et.al. | 2503.23186 | null |
2025-03-28 | Data-Free Universal Attack by Exploiting the Intrinsic Vulnerability of Deep Models | YangTian Yan et.al. | 2503.22205 | link |
2025-03-28 | Route-and-Aggregate Decentralized Federated Learning Under Communication Errors | Weicai Li et.al. | 2503.22186 | null |
2025-03-27 | On Large Multimodal Models as Open-World Image Classifiers | Alessandro Conti et.al. | 2503.21851 | link |
2025-03-27 | Bayesian Pseudo Posterior Mechanism for Differentially Private Machine Learning | Robert Chew et.al. | 2503.21528 | null |
2025-03-27 | Retinal Fundus Multi-Disease Image Classification using Hybrid CNN-Transformer-Ensemble Architectures | Deependra Singh et.al. | 2503.21465 | link |
2025-03-27 | Fine-Tuning LLMs on Small Medical Datasets: Text Classification and Normalization Effectiveness on Cardiology reports and Discharge records | Noah Losch et.al. | 2503.21349 | null |
2025-03-27 | Improving | Mario García-Márquez et.al. | 2503.21244 | link |
2025-03-27 | Neural Architecture Search by Learning a Hierarchical Search Space | Mehraveh Javan Roshtkhari et.al. | 2503.21061 | null |
2025-03-26 | TS-Inverse: A Gradient Inversion Attack Tailored for Federated Time Series Forecasting Models | Caspar Meijer et.al. | 2503.20952 | link |
2025-03-26 | VESTA: A Versatile SNN-Based Transformer Accelerator with Unified PEs for Multiple Computational Layers | Ching-Yao Chen et.al. | 2503.20246 | null |
2025-03-26 | BeLightRec: A lightweight recommender system enhanced with BERT | Manh Mai Van et.al. | 2503.20206 | null |
2025-03-25 | Vanishing Depth: A Depth Adapter with Positional Depth Encoding for Generalized Image Encoders | Paul Koch et.al. | 2503.19947 | null |
2025-03-25 | Optimizing Breast Cancer Detection in Mammograms: A Comprehensive Study of Transfer Learning, Resolution Reduction, and Multi-View Classification | Daniel G. P. Petrini et.al. | 2503.19945 | null |
2025-03-25 | Extensions of regret-minimization algorithm for optimal design | Youguang Chen et.al. | 2503.19874 | null |
2025-03-25 | VectorFit: Adaptive Singular & Bias Vector Fine-Tuning of Pre-trained Foundation Models | Suhas G Hegde et.al. | 2503.19530 | null |
2025-03-25 | LRSCLIP: A Vision-Language Foundation Model for Aligning Remote Sensing Image with Longer Text | Weizhi Chen et.al. | 2503.19311 | null |
2025-03-25 | Face Spoofing Detection using Deep Learning | Najeebullah et.al. | 2503.19223 | link |
2025-03-24 | Exploring the Integration of Key-Value Attention Into Pure and Hybrid Transformers for Semantic Segmentation | DeShin Hwa et.al. | 2503.18862 | null |
2025-03-24 | Latent Space Class Dispersion: Effective Test Data Quality Assessment for DNNs | Vivek Vekariya et.al. | 2503.18799 | null |
2025-03-24 | Unbiasing through Textual Descriptions: Mitigating Representation Bias in Video Benchmarks | Nina Shvetsova et.al. | 2503.18637 | null |
2025-03-24 | Explaining Domain Shifts in Language: Concept erasing for Interpretable Image Classification | Zequn Zeng et.al. | 2503.18483 | null |
2025-03-24 | Teaching LLMs for Step-Level Automatic Math Correction via Reinforcement Learning | Junsong Li et.al. | 2503.18432 | null |
2025-03-24 | Sun-Shine: A Large Language Model for Tibetan Culture | Cheng Huang et.al. | 2503.18288 | null |
2025-03-23 | Feature Learning beyond the Lazy-Rich Dichotomy: Insights from Representational Geometry | Chi-Ning Chou et.al. | 2503.18114 | null |
2025-03-23 | What Time Tells Us? An Explorative Study of Time Awareness Learned from Static Images | Dongheng Lin et.al. | 2503.17899 | null |
2025-03-21 | Spatiotemporal Learning with Context-aware Video Tubelets for Ultrasound Video Analysis | Gary Y. Li et.al. | 2503.17475 | null |
2025-03-21 | Leveraging Text-to-Image Generation for Handling Spurious Correlation | Aryan Yazdan Parast et.al. | 2503.17226 | null |
2025-03-21 | CoRLD: Contrastive Representation Learning Of Deformable Shapes In Images | Tonmoy Hossain and Miaomiao Zhang et.al. | 2503.17162 | null |
2025-03-21 | Beyond Accuracy: What Matters in Designing Well-Behaved Models? | Robin Hesse et.al. | 2503.17110 | null |
2025-03-21 | Symbolic Audio Classification via Modal Decision Tree Learning | Enrico Marzano et.al. | 2503.17018 | null |
2025-03-21 | EasyRobust: A Comprehensive and Easy-to-use Toolkit for Robust and Generalized Vision | Xiaofeng Mao et.al. | 2503.16975 | null |
2025-03-21 | City2Scene: Improving Acoustic Scene Classification with City Features | Yiqiang Cai et.al. | 2503.16862 | null |
2025-03-20 | MobilePlantViT: A Mobile-friendly Hybrid ViT for Generalized Plant Disease Image Classification | Moshiur Rahman Tonmoy et.al. | 2503.16628 | null |
2025-03-20 | PSA-MIL: A Probabilistic Spatial Attention-Based Multiple Instance Learning for Whole Slide Image Classification | Sharon Peled et.al. | 2503.16284 | link |
2025-03-20 | CLS-RL: Image Classification with Rule-Based Reinforcement Learning | Ming Li et.al. | 2503.16188 | null |
2025-03-20 | Corrective In-Context Learning: Evaluating Self-Correction in Large Language Models | Mario Sanz-Guerrero et.al. | 2503.16022 | link |
2025-03-20 | Beyond the Visible: Multispectral Vision-Language Learning for Earth Observation | Clive Tinashe Marimo et.al. | 2503.15969 | null |
2025-03-19 | Graph-Weighted Contrastive Learning for Semi-Supervised Hyperspectral Image Classification | Yuqing Zhang et.al. | 2503.15731 | null |
2025-03-20 | Dynamic Bi-Elman Attention Networks (DBEAN): Dual-Directional Context-Aware Representation Learning for Enhanced Text Classification | ZhengLin Lai et.al. | 2503.15469 | link |
2025-03-19 | Test-Time Backdoor Detection for Object Detection Models | Hangtao Zhang et.al. | 2503.15293 | null |
2025-03-19 | Efficient allocation of image recognition and LLM tasks on multi-GPU system | Marcin Lawenda et.al. | 2503.15252 | null |
2025-03-19 | Comparing Llama3 and DeepSeekR1 on Biomedical Text Classification Tasks | Yuting Guo et.al. | 2503.15169 | null |
2025-03-19 | ARC: Anchored Representation Clouds for High-Resolution INR Classification | Joost Luijmes et.al. | 2503.15156 | null |
2025-03-19 | Ultrasound Image-to-Video Synthesis via Latent Dynamic Diffusion Models | Tingxiu Chen et.al. | 2503.14966 | null |
2025-03-19 | Optimal Transport Adapter Tuning for Bridging Modality Gaps in Few-Shot Remote Sensing Scene Classification | Zhong Ji et.al. | 2503.14938 | null |
2025-03-18 | RAT: Boosting Misclassification Detection Ability without Extra Data | Ge Yan et.al. | 2503.14783 | null |
2025-03-18 | LipShiFT: A Certifiably Robust Shift-based Vision Transformer | Rohan Menon et.al. | 2503.14751 | null |
2025-03-18 | Utilization of Neighbor Information for Image Classification with Different Levels of Supervision | Gihan Jayatilaka et.al. | 2503.14500 | null |
2025-03-17 | Neural Edge Histogram Descriptors for Underwater Acoustic Target Recognition | Atharva Agashe et.al. | 2503.13763 | null |
2025-03-17 | Micro Text Classification Based on Balanced Positive-Unlabeled Learning | Lin-Han Jia et.al. | 2503.13562 | null |
2025-03-17 | Escaping Plato's Cave: Robust Conceptual Reasoning through Interpretable 3D Neural Object Volumes | Nhi Pham et.al. | 2503.13429 | null |
2025-03-17 | Do Vision Models Develop Human-Like Progressive Difficulty Understanding? | Zeyi Huang et.al. | 2503.13058 | null |
2025-03-16 | Domain Generalization for Improved Human Activity Recognition in Office Space Videos Using Adaptive Pre-processing | Partho Ghosh et.al. | 2503.12678 | null |
2025-03-16 | Scaling Semantic Categories: Investigating the Impact on Vision Transformer Labeling Performance | Anthony Lamelas et.al. | 2503.12617 | null |
2025-03-16 | Defense Against Model Stealing Based on Account-Aware Distribution Discrepancy | Jian-Ping Mei et.al. | 2503.12497 | null |
2025-03-16 | GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing | Zilun Zhang et.al. | 2503.12490 | null |
2025-03-16 | Shape Bias and Robustness Evaluation via Cue Decomposition for Image Classification and Segmentation | Edgar Heinert et.al. | 2503.12453 | null |
2025-03-16 | MExD: An Expert-Infused Diffusion Model for Whole-Slide Image Classification | Jianwei Zhao et.al. | 2503.12401 | null |
2025-03-15 | TLAC: Two-stage LMM Augmented CLIP for Zero-Shot Classification | Ans Munir et.al. | 2503.12206 | null |
2025-03-15 | Goal-Oriented Source Coding using LDPC Codes for Compressed-Domain Image Classification | Ahcen Aliouat et.al. | 2503.11954 | null |
2025-03-14 | Creating a Good Teacher for Knowledge Distillation in Acoustic Scene Classification | Tobias Morocutti et.al. | 2503.11363 | null |
2025-03-14 | PARIC: Probabilistic Attention Regularization for Language Guided Image Classification from Pre-trained Vision Language Models | Mayank Nautiyal et.al. | 2503.11360 | null |
2025-03-14 | APLA: A Simple Adaptation Method for Vision Transformers | Moein Sorkhei et.al. | 2503.11335 | null |
2025-03-14 | Open-Set Plankton Recognition | Joona Kareinen et.al. | 2503.11318 | null |
2025-03-14 | MEET: A Million-Scale Dataset for Fine-Grained Geospatial Scene Classification with Zoom-Free Remote Sensing Imagery | Yansheng Li et.al. | 2503.11219 | null |
2025-03-14 | Falcon: A Remote Sensing Vision-Language Foundation Model | Kelu Yao et.al. | 2503.11070 | null |
2025-03-13 | | Juan Felipe Gomez et.al. | 2503.10945 | null |
2025-03-13 | Learning Interpretable Logic Rules from Deep Vision Models | Chuqin Geng et.al. | 2503.10547 | null |
2025-03-13 | Extreme Learning Machines for Attention-based Multiple Instance Learning in Whole-Slide Image Classification | Rajiv Krishnakumar et.al. | 2503.10510 | null |
2025-03-13 | RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing | Fengxiang Wang et.al. | 2503.10392 | link |
2025-03-13 | PS3C: An Ensemble-Based Two-Step Framework for Classification of Pap Smear Cell Images | Theo Di Piazza et.al. | 2503.10312 | link |
2025-03-13 | Wikipedia is Not a Dictionary, Delete! Text Classification as a Proxy for Analysing Wiki Deletion Discussions | Hsuvas Borkakoty et.al. | 2503.10294 | null |
2025-03-13 | A Multi-Modal Federated Learning Framework for Remote Sensing Image Classification | Barış Büyüktaş et.al. | 2503.10262 | null |
2025-03-13 | Interpretable Image Classification via Non-parametric Part Prototype Learning | Zhijie Zhu et.al. | 2503.10247 | null |
2025-03-13 | Multiplicative Learning | Han Kim et.al. | 2503.10144 | null |
2025-03-13 | Cognitive-Mental-LLM: Leveraging Reasoning in Large Language Models for Mental Health Prediction via Online Text | Avinash Patil et.al. | 2503.10095 | null |
2025-03-13 | Do We Always Need the Simplicity Bias? Looking for Optimal Inductive Biases in the Wild | Damien Teney et.al. | 2503.10065 | null |
2025-03-12 | Fair Federated Medical Image Classification Against Quality Shift via Inter-Client Progressive State Matching | Nannan Wu et.al. | 2503.09587 | null |
2025-03-12 | Double-Stage Feature-Level Clustering-Based Mixture of Experts Framework | Bakary Badjie et.al. | 2503.09504 | null |
2025-03-12 | ForAug: Recombining Foregrounds and Backgrounds to Improve Vision Transformer Training with Bias Mitigation | Tobias Christian Nauen et.al. | 2503.09399 | null |
2025-03-12 | Membership Inference Attacks fueled by Few-Shot Learning to detect privacy leakage tackling data integrity | Daniel Jiménez-López et.al. | 2503.09365 | null |
2025-03-12 | Deep Learning for Climate Action: Computer Vision Analysis of Visual Narratives on X | Katharina Prasse et.al. | 2503.09361 | null |
2025-03-12 | Bayesian Test-Time Adaptation for Vision-Language Models | Lihua Zhou et.al. | 2503.09248 | null |
2025-03-12 | Probing Network Decisions: Capturing Uncertainties and Unveiling Vulnerabilities Without Label Information | Youngju Joung et.al. | 2503.09068 | null |
2025-03-12 | Discovering Influential Neuron Path in Vision Transformers | Yifan Wang et.al. | 2503.09046 | null |
2025-03-11 | KAN-Mixers: a new deep learning architecture for image classification | Jorge Luiz dos Santos Canuto et.al. | 2503.08939 | null |
2025-03-12 | MsaMIL-Net: An End-to-End Multi-Scale Aware Multiple Instance Learning Network for Efficient Whole Slide Image Classification | Jiangping Wen et.al. | 2503.08581 | null |
2025-03-11 | Generalizable and Explainable Deep Learning for Medical Image Computing: An Overview | Ahmad Chaddad et.al. | 2503.08420 | null |
2025-03-11 | Prototype-Based Multiple Instance Learning for Gigapixel Whole Slide Image Classification | Susu Sun et.al. | 2503.08384 | null |
2025-03-11 | Tangentially Aligned Integrated Gradients for User-Friendly Explanations | Lachlan Simpson et.al. | 2503.08240 | null |
2025-03-11 | EnergyFormer: Energy Attention with Fourier Embedding for Hyperspectral Image Classification | Saad Sohail et.al. | 2503.08239 | null |
2025-03-11 | Identification of Star Clusters in M31 from PAndAS Images Based on Deep Learning | Baisong Zhang et.al. | 2503.08130 | null |
2025-03-11 | LabelCoRank: Revolutionizing Long Tail Multi-Label Classification with Co-Occurrence Reranking | Yan Yan et.al. | 2503.07968 | null |
2025-03-12 | Measuring directional bias amplification in image captions using predictability | Rahul Nair et.al. | 2503.07878 | null |
2025-03-10 | Fair Text Classification via Transferable Representations | Thibaud Leteno et.al. | 2503.07691 | null |
2025-03-10 | Keeping Representation Similarity in Finetuning for Medical Image Analysis | Wenqiang Zu et.al. | 2503.07399 | null |
2025-03-10 | Brain Inspired Adaptive Memory Dual-Net for Few-Shot Image Classification | Kexin Di et.al. | 2503.07396 | null |
2025-03-10 | Is My Text in Your AI Model? Gradient-based Membership Inference Test applied to LLMs | Gonzalo Mancera et.al. | 2503.07384 | null |
2025-03-10 | Distilling Knowledge into Quantum Vision Transformers for Biomedical Image Classification | Thomas Boucher et.al. | 2503.07294 | null |
2025-03-10 | A Zero-shot Learning Method Based on Large Language Models for Multi-modal Knowledge Graph Embedding | Bingchen Liu et.al. | 2503.07202 | null |
2025-03-10 | Understanding the Learning Dynamics of LoRA: A Gradient Flow Perspective on Low-Rank Adaptation in Matrix Factorization | Ziqing Xu et.al. | 2503.06982 | null |
2025-03-10 | Task Vector Quantization for Memory-Efficient Model Merging | Youngeun Kim et.al. | 2503.06921 | null |
2025-03-10 | MADS: Multi-Attribute Document Supervision for Zero-Shot Image Classification | Xiangyan Qu et.al. | 2503.06847 | null |
2025-03-09 | Enhancing Layer Attention Efficiency through Pruning Redundant Retrievals | Hanze Li et.al. | 2503.06473 | null |
2025-03-09 | M | Mingxiang Cao et.al. | 2503.06446 | null |
2025-03-07 | Similarity-Based Domain Adaptation with LLMs | Jie He et.al. | 2503.05281 | null |
2025-03-07 | Spatial Context-Driven Positive Pair Sampling for Enhanced Histopathology Image Classification | Willmer Rafell Quinones Robles et.al. | 2503.05170 | null |
2025-03-07 | Ensemble Debiasing Across Class and Sample Levels for Fairer Prompting Accuracy | Ruixi Lin et.al. | 2503.05157 | null |
2025-03-07 | Grouped Sequential Optimization Strategy -- the Application of Hyperparameter Importance Assessment in Deep Learning | Ruinan Wang et.al. | 2503.05106 | null |
2025-03-06 | HieroLM: Egyptian Hieroglyph Recovery with Next Word Prediction Language Model | Xuheng Cai et.al. | 2503.04996 | null |
2025-03-06 | Label Distribution Learning-Enhanced Dual-KNN for Text Classification | Bo Yuan et.al. | 2503.04869 | null |
2025-03-06 | Guiding LLMs to Generate High-Fidelity and High-Quality Counterfactual Explanations for Text Classification | Van Bach Nguyen et.al. | 2503.04463 | null |
2025-03-06 | WeakSupCon: Weakly Supervised Contrastive Learning for Encoder Pre-training | Bodong Zhang et.al. | 2503.04165 | null |
2025-03-04 | Measurement noise scaling laws for cellular representation learning | Gokul Gowri et.al. | 2503.02726 | null |
2025-03-04 | XFMamba: Cross-Fusion Mamba for Multi-View Medical Image Classification | Xiaoyu Zheng et.al. | 2503.02619 | null |
2025-03-04 | Remote Sensing Image Classification Using Convolutional Neural Network (CNN) and Transfer Learning Techniques | Mustafa Majeed Abd Zaid et.al. | 2503.02510 | null |
2025-03-06 | Union of Experts: Adapting Hierarchical Routing to Equivalently Decomposed Transformer | Yujiao Yang et.al. | 2503.02495 | null |
2025-03-04 | Making Better Mistakes in CLIP-Based Zero-Shot Classification with Hierarchy-Aware Language Prompts | Tong Liang et.al. | 2503.02248 | null |
2025-03-04 | Sharpness-Aware Minimization: General Analysis and Improved Rates | Dimitris Oikonomou et.al. | 2503.02225 | null |
2025-03-03 | Mathematical Foundation of Interpretable Equivariant Surrogate Models | Jacopo Joy Colombini et.al. | 2503.01942 | null |
2025-03-03 | Visual-RFT: Visual Reinforcement Fine-Tuning | Ziyu Liu et.al. | 2503.01785 | link |
2025-03-03 | Mamba base PKD for efficient knowledge compression | José Medina et.al. | 2503.01727 | null |
2025-03-04 | SAR-W-MixMAE: SAR Foundation Model Training Using Backscatter Power Weighting | Ali Caglayan et.al. | 2503.01181 | null |
2025-03-03 | Large Language Models for Healthcare Text Classification: A Systematic Review | Hajar Sakai et.al. | 2503.01159 | null |
2025-03-03 | Fast and Accurate Gigapixel Pathological Image Classification with Hierarchical Distillation Multi-Instance Learning | Jiuyang Dong et.al. | 2502.21130 | null |
2025-02-28 | Comparative study of the ansätze in quantum language models | Jordi Del Castillo et.al. | 2502.20744 | null |
2025-02-28 | Exploring the Impact of Temperature Scaling in Softmax for Classification and Adversarial Robustness | Hao Xuan et.al. | 2502.20604 | null |
2025-02-27 | In-Model Merging for Enhancing the Robustness of Medical Imaging Classification Models | Hu Wang et.al. | 2502.20516 | null |
2025-02-27 | Online Meta-learning for AutoML in Real-time (OnMAR) | Mia Gerber et.al. | 2502.20279 | null |
2025-03-03 | Gradient-Guided Annealing for Domain Generalization | Aristotelis Ballas et.al. | 2502.20162 | link |
2025-02-27 | QPM: Discrete Optimization for Globally Interpretable Image Classification | Thomas Norrenbrock et.al. | 2502.20130 | link |
2025-02-27 | ProAPO: Progressively Automatic Prompt Optimization for Visual Classification | Xiangyan Qu et.al. | 2502.19844 | link |
2025-02-27 | Text classification using machine learning methods | Bogdan Oancea et.al. | 2502.19801 | null |
2025-02-27 | InPK: Infusing Prior Knowledge into Prompt for Vision-Language Models | Shuchang Zhou et.al. | 2502.19777 | null |
2025-02-27 | Learning Mask Invariant Mutual Information for Masked Image Modeling | Tao Huang et.al. | 2502.19718 | null |
2025-02-27 | Language-Informed Hyperspectral Image Synthesis for Imbalanced-Small Sample Classification via Semi-Supervised Conditional Diffusion Model | Yimin Zhu et.al. | 2502.19700 | null |
2025-02-27 | Spatial-Spectral Diffusion Contrastive Representation Network for Hyperspectral Image Classification | Yimin Zhu et.al. | 2502.19699 | null |
2025-02-27 | A Residual Multi-task Network for Joint Classification and Regression in Medical Imaging | Junji Lin et.al. | 2502.19692 | null |
2025-02-26 | I Know What I Don't Know: Improving Model Cascades Through Confidence Tuning | Stephan Rabanser et.al. | 2502.19335 | null |
2025-02-26 | Active Few-Shot Learning for Text Classification | Saeed Ahmadnia et.al. | 2502.18782 | null |
2025-02-25 | Enhancing Image Classification with Augmentation: Data Augmentation Techniques for Improved Image Classification | Saorj Kumar et.al. | 2502.18691 | null |
2025-02-25 | Enhancing Text Classification with a Novel Multi-Agent Collaboration Framework Leveraging BERT | Hediyeh Baban et.al. | 2502.18653 | null |
2025-02-25 | MedKAN: An Advanced Kolmogorov-Arnold Network for Medical Image Classification | Zhuoqin Yang et.al. | 2502.18416 | null |
2025-02-26 | A Fusion Model for Art Author Identification Based on Convolutional Neural Networks and Transformers | Zhenyu Wang et.al. | 2502.18083 | null |
2025-02-25 | MAGE: Multi-Head Attention Guided Embeddings for Low Resource Sentiment Classification | Varun Vashisht et.al. | 2502.17987 | null |
2025-02-25 | Dual Classification Head Self-training Network for Cross-scene Hyperspectral Image Classification | Rong Liu et.al. | 2502.17879 | null |
2025-02-24 | Can Score-Based Generative Modeling Effectively Handle Medical Image Classification? | Sushmita Sarker et.al. | 2502.17727 | null |
2025-02-24 | A Priori Generalizability Estimate for a CNN | Cito Balsells et.al. | 2502.17622 | null |
2025-02-24 | Neural Attention: A Novel Mechanism for Enhanced Expressive Power in Transformer Models | Andrew DiGiugno et.al. | 2502.17206 | null |
2025-02-24 | Disentangling Visual Transformers: Patch-level Interpretability for Image Classification | Guillaume Jeanneret et.al. | 2502.17196 | null |
2025-02-24 | Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment | Chenghao Fan et.al. | 2502.16894 | null |
2025-02-24 | Applying LLMs to Active Learning: Towards Cost-Efficient Cross-Task Text Classification without Manually Labeled Data | Yejian Zhang et.al. | 2502.16892 | null |
2025-02-24 | A Transformer-in-Transformer Network Utilizing Knowledge Distillation for Image Recognition | Dewan Tauhid Rahman et.al. | 2502.16762 | null |
2025-02-23 | AUKT: Adaptive Uncertainty-Guided Knowledge Transfer with Conformal Prediction | Rui Liu et.al. | 2502.16736 | null |
2025-02-22 | MOB-GCN: A Novel Multiscale Object-Based Graph Neural Network for Hyperspectral Image Classification | Tuan-Anh Yang et.al. | 2502.16289 | link |
2025-02-22 | A Multi-Scale Isolation Forest Approach for Real-Time Detection and Filtering of FGSM Adversarial Attacks in Video Streams of Autonomous Vehicles | Richard Abhulimhen et.al. | 2502.16044 | null |
2025-02-21 | MMRAG: Multi-Mode Retrieval-Augmented Generation with Large Language Models for Biomedical In-Context Learning | Zaifu Zhan et.al. | 2502.15954 | null |
2025-02-21 | Directional Gradient Projection for Robust Fine-Tuning of Foundation Models | Chengyue Huang et.al. | 2502.15895 | null |
2025-02-21 | MHQA: A Diverse, Knowledge Intensive Mental Health Question Answering Challenge for Language Models | Suraj Racha et.al. | 2502.15418 | null |
2025-02-21 | A Novel Riemannian Sparse Representation Learning Network for Polarimetric SAR Image Classification | Junfei Shi et.al. | 2502.15302 | null |
2025-02-21 | Quantum autoencoders for image classification | Hinako Asaoka et.al. | 2502.15254 | null |
2025-02-21 | Steganographic Embeddings as an Effective Data Augmentation | Nicholas DiSalvo et.al. | 2502.15245 | null |
2025-02-21 | Learning to Collaborate: A Capability Vectors-based Architecture for Adaptive Human-AI Decision Making | Renlong Jie et.al. | 2502.15196 | null |
2025-02-21 | TransMamba: Fast Universal Architecture Adaption from Transformers to Mamba | Xiuwei Chen et.al. | 2502.15130 | null |
2025-02-20 | Fundamental Survey on Neuromorphic Based Audio Classification | Amlan Basu et.al. | 2502.15056 | null |
2025-02-20 | Reinforcement Learning for Ultrasound Image Analysis A Comprehensive Review of Advances and Applications | Maha Ezzelarab et.al. | 2502.14995 | null |
2025-02-20 | Sparse Activations as Conformal Predictors | Margarida M. Campos et.al. | 2502.14773 | link |
2025-02-20 | An Enhancement of Jiang, Z., et al.'s Compression-Based Classification Algorithm Applied to News Article Categorization | Sean Lester C. Benavides et.al. | 2502.14444 | null |
2025-02-20 | Stochastic Resonance Improves the Detection of Low Contrast Images in Deep Learning Models | Siegfried Ludwig et.al. | 2502.14442 | null |
2025-02-20 | Token-Level Density-Based Uncertainty Quantification Methods for Eliciting Truthfulness of Large Language Models | Artem Vazhentsev et.al. | 2502.14427 | null |
2025-02-20 | Reliable Explainability of Deep Learning Spatial-Spectral Classifiers for Improved Semantic Segmentation in Autonomous Driving | Jon Gutiérrez-Zaballa et.al. | 2502.14416 | null |
2025-02-20 | QUAD-LLM-MLTC: Large Language Models Ensemble Learning for Healthcare Text Multi-Label Classification | Hajar Sakai et.al. | 2502.14189 | null |
2025-02-19 | Self-Regularization with Latent Space Explanations for Controllable LLM-based Classification | Xuansheng Wu et.al. | 2502.14133 | null |
2025-02-19 | Medical Image Classification with KAN-Integrated Transformers and Dilated Neighborhood Attention | Omid Nejati Manzari et.al. | 2502.13693 | link |
2025-02-18 | Language Models Can Predict Their Own Behavior | Dhananjay Ashok et.al. | 2502.13329 | null |
2025-02-18 | Performance Evaluation of Sentiment Analysis on Text and Emoji Data Using End-to-End, Transfer Learning, Distributed and Explainable AI Models | Sirisha Velampalli et.al. | 2502.13278 | null |
2025-02-18 | Private Text Generation by Seeding Large Language Model Prompts | Supriya Nagesh et.al. | 2502.13193 | null |
2025-02-18 | RingFormer: Rethinking Recurrent Transformer with Adaptive Level Signals | Jaemu Heo et.al. | 2502.13181 | null |
2025-02-18 | Benchmarking MedMNIST dataset on real quantum hardware | Gurinder Singh et.al. | 2502.13056 | null |
2025-02-18 | Likelihood-Ratio Regularized Quantile Regression: Adapting Conformal Prediction to High-Dimensional Covariate Shifts | Sunay Joshi et.al. | 2502.13030 | null |
2025-02-18 | A Survey of Text Classification Under Class Distribution Shift | Adriana Valentina Costache et.al. | 2502.12965 | null |
2025-02-18 | Task-Informed Anti-Curriculum by Masking Improves Downstream Performance on Text | Andrei Jarca et.al. | 2502.12953 | null |
2025-02-18 | DAMamba: Vision State Space Model with Dynamic Adaptive Scan | Tanzhe Li et.al. | 2502.12627 | null |
2025-02-18 | When Segmentation Meets Hyperspectral Image: New Paradigm for Hyperspectral Image Classification | Weilian Zhou et.al. | 2502.12541 | null |
2025-02-17 | Achieving Upper Bound Accuracy of Joint Training in Continual Learning | Saleh Momeni et.al. | 2502.12388 | null |
2025-02-17 | OCT Data is All You Need: How Vision Transformers with and without Pre-training Benefit Imaging | Zihao Han et.al. | 2502.12379 | null |
2025-02-17 | AdaSplash: Adaptive Sparse Flash Attention | Nuno Gonçalves et.al. | 2502.12082 | null |
2025-02-17 | Masked Latent Prediction and Classification for Self-Supervised Audio Representation Learning | Aurian Quelennec et.al. | 2502.12031 | null |
2025-02-17 | Text Classification in the LLM Era - Where do we stand? | Sowmya Vajjala et.al. | 2502.11830 | null |
2025-02-17 | Variable-frame CNNLSTM for Breast Nodule Classification using Ultrasound Videos | Xiangxiang Cui et.al. | 2502.11481 | null |
2025-02-16 | Leveraging Conditional Mutual Information to Improve Large Language Model Fine-Tuning For Classification | Thanushon Sivakaran et.al. | 2502.11258 | null |
2025-02-16 | UNITE-FND: Reframing Multimodal Fake News Detection through Unimodal Scene Translation | Arka Mukherjee et.al. | 2502.11132 | null |
2025-02-16 | Towards Achieving Concept Completeness for Unsupervised Textual Concept Bottleneck Models | Milan Bhan et.al. | 2502.11100 | null |
2025-02-16 | Leveraging Large Language Models for Cybersecurity: Enhancing SMS Spam Detection with Robust and Context-Aware Text Classification | Mohsen Ahmadi et.al. | 2502.11014 | null |
2025-02-15 | Simulations of Common Unsupervised Domain Adaptation Algorithms for Image Classification | Ahmad Chaddad et.al. | 2502.10694 | null |
2025-02-15 | REAL: Realism Evaluation of Text-to-Image Generation Models for Effective Data Augmentation | Ran Li et.al. | 2502.10663 | null |
2025-02-14 | Simplifying DINO via Coding Rate Regularization | Ziyang Wu et.al. | 2502.10385 | null |
2025-02-14 | Ocular Disease Classification Using CNN with Deep Convolutional Generative Adversarial Network | Arun Kunwar et.al. | 2502.10334 | null |
2025-02-14 | SeWA: Selective Weight Average via Probabilistic Masking | Peng Wang et.al. | 2502.10119 | null |
2025-02-14 | On Space Folds of ReLU Neural Networks | Michal Lewandowski et.al. | 2502.09954 | null |
2025-02-13 | A CNN Approach to Automated Detection and Classification of Brain Tumors | Md. Zahid Hasan et.al. | 2502.09731 | null |
2025-02-13 | GAIA: A Global, Multi-modal, Multi-scale Vision-Language Dataset for Remote Sensing Image Analysis | Angelos Zavras et.al. | 2502.09598 | link |
2025-02-14 | Optimizing GPT for Video Understanding: Zero-Shot Performance and Prompt Engineering | Mark Beliaev et.al. | 2502.09573 | null |
2025-02-13 | Feature-based Graph Attention Networks Improve Online Continual Learning | Adjovi Sim et.al. | 2502.09143 | null |
2025-02-13 | A Hybrid Model for Few-Shot Text Classification Using Transfer and Meta-Learning | Jia Gao et.al. | 2502.09086 | null |
2025-02-13 | Hierarchical Vision Transformer with Prototypes for Interpretable Medical Image Classification | Luisa Gallée et.al. | 2502.08997 | null |
2025-02-13 | Quantum Approaches for Dysphonia Assessment in Small Speech Datasets | Ha Tran et.al. | 2502.08968 | null |
2025-02-12 | Measuring Diversity in Synthetic Datasets | Yuchang Zhu et.al. | 2502.08512 | null |
2025-02-12 | ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification | Jiangbo Shi et.al. | 2502.08391 | null |
2025-02-12 | Keep your distance: learning dispersed embeddings on | Evgeniia Tokarchuk et.al. | 2502.08231 | null |
2025-02-12 | Riemannian Complex Hermit Positive Definite Convolution Network for Polarimetric SAR Image Classification | Junfei Shi et.al. | 2502.08137 | null |
2025-02-12 | Knowledge Swapping via Learning and Unlearning | Mingyu Xing et.al. | 2502.08075 | null |
2025-02-12 | Can Machine Learning Support the Selection of Studies for Systematic Literature Review Updates? | Marcelo Costalonga et.al. | 2502.08050 | null |
2025-02-11 | ESPFormer: Doubly-Stochastic Attention with Expected Sliced Transport Plans | Ashkan Shahbazi et.al. | 2502.07962 | null |
2025-02-11 | Optimizing Knowledge Distillation in Transformers: Enabling Multi-Head Attention without Alignment Barriers | Zhaodong Bing et.al. | 2502.07436 | null |
2025-02-11 | MoENAS: Mixture-of-Expert based Neural Architecture Search for jointly Accurate, Fair, and Robust Edge Deep Neural Networks | Lotfi Abdelkrim Mecharbat et.al. | 2502.07422 | null |
2025-02-11 | MGPATH: Vision-Language Model with Multi-Granular Prompt Learning for Few-Shot WSI Classification | Anh-Tien Nguyen et.al. | 2502.07409 | null |
2025-02-11 | Don't Just Demo, Teach Me the Principles: A Principle-Based Multi-Agent Prompting Strategy for Text Classification | Peipei Wei et.al. | 2502.07165 | null |
2025-02-10 | From Image to Video: An Empirical Study of Diffusion Representations | Pedro Vélez et.al. | 2502.07001 | null |
2025-02-10 | Krum Federated Chain (KFC): Using blockchain to defend against adversarial attacks in Federated Learning | Mario García-Márquez et.al. | 2502.06917 | null |
2025-02-10 | Enhancing Performance of Explainable AI Models with Constrained Concept Refinement | Geyu Liang et.al. | 2502.06775 | null |
2025-02-10 | Efficient Scientific Full Text Classification: The Case of EICAT Impact Assessments | Marc Felix Brinner et.al. | 2502.06551 | null |
2025-02-10 | Hybrid State-Space and GRU-based Graph Tokenization Mamba for Hyperspectral Image Classification | Muhammad Ahmad et.al. | 2502.06427 | null |
2025-02-10 | Provably Near-Optimal Federated Ensemble Distillation with Negligible Overhead | Won-Jun Jang et.al. | 2502.06349 | null |
2025-02-10 | From Pixels to Components: Eigenvector Masking for Visual Representation Learning | Alice Bizeul et.al. | 2502.06314 | null |
2025-02-10 | Beyond Batch Learning: Global Awareness Enhanced Domain Adaptation | Lingkun Luo et.al. | 2502.06272 | null |
2025-02-10 | Multi-Scale Transformer Architecture for Accurate Medical Image Classification | Jiacheng Hu et.al. | 2502.06243 | null |
2025-02-10 | Low Tensor-Rank Adaptation of Kolmogorov-Arnold Networks | Yihang Gao et.al. | 2502.06153 | null |
2025-02-09 | Benchmarking Prompt Sensitivity in Large Language Models | Amirhossein Razavi et.al. | 2502.06065 | null |
2025-02-09 | ARISE: Iterative Rule Induction and Synthetic Data Generation for Text Classification | Yashwanth M. et.al. | 2502.05923 | null |
2025-02-07 | Training-free Neural Architecture Search through Variance of Knowledge of Deep Network Weights | Ondřej Týbl et.al. | 2502.04975 | null |
2025-02-07 | Enhancing Disinformation Detection with Explainable AI and Named Entity Replacement | Santiago González-Silot et.al. | 2502.04863 | null |
2025-02-07 | AIQViT: Architecture-Informed Post-Training Quantization for Vision Transformers | Runqing Jiang et.al. | 2502.04628 | null |
2025-02-06 | Augmented Conditioning Is Enough For Effective Training Image Generation | Jiahui Chen et.al. | 2502.04475 | null |
2025-02-06 | How does a Multilingual LM Handle Multiple Languages? | Santhosh Kakarla et.al. | 2502.04269 | null |
2025-02-06 | Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion | Marco Mistretta et.al. | 2502.04263 | null |
2025-02-06 | Expanding Training Data for Endoscopic Phenotyping of Eosinophilic Esophagitis | Juming Xiong et.al. | 2502.04199 | null |
2025-02-06 | Improving Natural Language Understanding for LLMs via Large-Scale Instruction Synthesis | Lin Yuan et.al. | 2502.03843 | null |
2025-02-06 | Self-Supervised Learning for Solar Radio Spectrum Classification | Siqi Li et.al. | 2502.03778 | null |
2025-02-06 | Conditional Diffusion Models are Medical Image Classifiers that Provide Explainability and Uncertainty for Free | Gian Mario Favero et.al. | 2502.03687 | null |
2025-02-05 | A Study in Dataset Distillation for Image Super-Resolution | Tobias Dietz et.al. | 2502.03656 | null |
2025-02-05 | Gompertz Linear Units: Leveraging Asymmetry for Enhanced Learning Dynamics | Indrashis Das et.al. | 2502.03654 | null |
2025-02-05 | Clinically-Inspired Hierarchical Multi-Label Classification of Chest X-rays with a Penalty-Based Loss Function | Mehrdad Asadi et.al. | 2502.03591 | link |
2025-02-05 | Optimal Task Order for Continual Learning of Multiple Tasks | Ziyan Li et.al. | 2502.03350 | null |
2025-02-05 | Out-of-Distribution Detection using Synthetic Data Generation | Momin Abbas et.al. | 2502.03323 | null |
2025-02-05 | Long-tailed Medical Diagnosis with Relation-aware Representation Learning and Iterative Classifier Calibration | Li Pan et.al. | 2502.03238 | null |
2025-02-05 | Adversarial Dependence Minimization | Pierre-François De Plaen et.al. | 2502.03227 | null |
2025-02-05 | Disentangling CLIP Features for Enhanced Localized Understanding | Samyak Rawelekar et.al. | 2502.02977 | null |
2025-02-05 | Slowing Learning by Erasing Simple Features | Lucia Quirke et.al. | 2502.02820 | null |
2025-02-04 | The Skin Game: Revolutionizing Standards for AI Dermatology Model Comparison | Łukasz Miętkiewicz et.al. | 2502.02500 | null |
2025-02-04 | BRIDLE: Generalized Self-supervised Learning with Quantization | Hoang M. Nguyen et.al. | 2502.02118 | null |
2025-02-04 | DCT-Mamba3D: Spectral Decorrelation and Spatial-Spectral Feature Extraction for Hyperspectral Image Classification | Weijia Cao et.al. | 2502.01986 | null |
2025-02-04 | Generative Data Mining with Longtail-Guided Diffusion | David S. Hayden et.al. | 2502.01980 | null |
2025-02-03 | A Multi-Scale Feature Fusion Framework Integrating Frequency Domain and Cross-View Attention for Dual-View X-ray Security Inspections | Shilong Hong et.al. | 2502.01710 | null |
2025-02-03 | Activation by Interval-wise Dropout: A Simple Way to Prevent Neural Networks from Plasticity Loss | Sangyeon Park et.al. | 2502.01342 | null |
2025-02-03 | A Framework for Double-Blind Federated Adaptation of Foundation Models | Nurbek Tastan et.al. | 2502.01289 | null |
2025-02-02 | Synthetic Artifact Auditing: Tracing LLM-Generated Synthetic Data Usage in Downstream Applications | Yixin Wu et.al. | 2502.00808 | null |
2025-02-02 | Enhanced Convolutional Neural Networks for Improved Image Classification | Xiaoran Yang et.al. | 2502.00663 | null |
2025-02-01 | Fast Vision Mamba: Pooling Spatial Dimensions for Accelerated Processing | Saarthak Kapse et.al. | 2502.00594 | null |
2025-01-31 | Redefining Machine Unlearning: A Conformal Prediction-Motivated Approach | Yingdan Shi et.al. | 2501.19403 | null |
2025-01-31 | An All-digital 65-nm Tsetlin Machine Image Classification Accelerator with 8.6 nJ per MNIST Frame at 60.3k Frames per Second | Svein Anders Tunheim et.al. | 2501.19347 | null |
2025-01-31 | Through the Looking Glass: LLM-Based Analysis of AR/VR Android Applications Privacy Policies | Abdulaziz Alghamdi et.al. | 2501.19223 | null |
2025-01-31 | Fairness Analysis of CLIP-Based Foundation Models for X-Ray Image Classification | Xiangyu Sun et.al. | 2501.19086 | null |
2025-01-31 | Memory-Efficient Fine-Tuning of Transformers via Token Selection | Antoine Simoulin et.al. | 2501.18824 | null |
2025-01-30 | OT-Transformer: A Continuous-time Transformer Architecture with Optimal Transport Regularization | Kelvin Kan et.al. | 2501.18793 | null |
2025-01-29 | Semantic Consistency Regularization with Large Language Models for Semi-supervised Sentiment Analysis | Kunrong Li et.al. | 2501.17598 | null |
2025-01-28 | Extending Information Bottleneck Attribution to Video Sequences | Veronika Solopova et.al. | 2501.16889 | link |
2025-01-28 | Misspellings in Natural Language Processing: A survey | Gianluca Sperduti et.al. | 2501.16836 | null |
2025-01-28 | DebugAgent: Efficient and Interpretable Error Slice Discovery for Comprehensive Model Debugging | Muxi Chen et.al. | 2501.16751 | null |
2025-01-28 | Toward Relative Positional Encoding in Spiking Transformers | Changze Lv et.al. | 2501.16745 | null |
2025-01-28 | Improving Interpretability and Accuracy in Neuro-Symbolic Rule Extraction Using Class-Specific Sparse Filters | Parth Padalkar et.al. | 2501.16677 | null |
2025-01-27 | Generating customized prompts for Zero-Shot Rare Event Medical Image Classification using LLM | Payal Kamboj et.al. | 2501.16481 | link |
2025-01-28 | SPECIAL: Zero-shot Hyperspectral Image Classification With CLIP | Li Pang et.al. | 2501.16222 | link |
2025-01-27 | The Linear Attention Resurrection in Vision Transformer | Chuanyang Zheng et.al. | 2501.16182 | null |
2025-01-27 | Enhancing the Convergence of Federated Learning Aggregation Strategies with Limited Data | Judith Sáinz-Pardo Díaz et.al. | 2501.15949 | null |
2025-01-26 | Quantum-Enhanced Attention Mechanism in NLP: A Hybrid Classical-Quantum Approach | S. M. Yousuf Iqbal Tomal et.al. | 2501.15630 | null |
2025-01-26 | Building Efficient Lightweight CNN Models | Nathan Isong et.al. | 2501.15547 | null |
2025-01-26 | Fuzzy-aware Loss for Source-free Domain Adaptation in Visual Emotion Recognition | Ying Zheng et.al. | 2501.15519 | null |
2025-01-26 | Variational Bayesian Adaptive Learning of Deep Latent Variables for Acoustic Knowledge Transfer | Hu Hu et.al. | 2501.15496 | null |
2025-01-25 | Pre-trained Model Guided Mixture Knowledge Distillation for Adversarial Federated Learning | Yu Qiao et.al. | 2501.15257 | null |
2025-01-24 | Feasible Learning | Juan Ramirez et.al. | 2501.14912 | link |
2025-01-24 | Rethinking Foundation Models for Medical Image Classification through a Benchmark Study on MedMNIST | Fuping Wu et.al. | 2501.14685 | null |
2025-01-24 | Geometric Mean Improves Loss For Few-Shot Learning | Tong Wu et.al. | 2501.14593 | null |
2025-01-24 | Idiom Detection in Sorani Kurdish Texts | Skala Kamaran Omer et.al. | 2501.14528 | null |
2025-01-24 | | Guobin Shen et.al. | 2501.14484 | null |
2025-01-24 | Impact of Batch Normalization on Convolutional Network Representations | Hermanus L. Potgieter et.al. | 2501.14441 | null |
2025-01-24 | Quantum Neural Networks: A Comparative Analysis and Noise Robustness Evaluation | Tasnim Ahmed et.al. | 2501.14412 | null |
2025-01-24 | Correlation-Based Band Selection for Hyperspectral Image Classification | Dibyabha Deb et.al. | 2501.14338 | link |
2025-01-24 | Relative Layer-Wise Relevance Propagation: a more Robust Neural Networks eXplaination | Eric Nyiri et.al. | 2501.14322 | null |
2025-01-24 | A Comprehensive Framework for Semantic Similarity Detection Using Transformer Architectures and Enhanced Ensemble Techniques | Lifu Gao et.al. | 2501.14288 | null |
2025-01-24 | TLXML: Task-Level Explanation of Meta-Learning via Influence Functions | Yoshihiro Mitsuka et.al. | 2501.14271 | null |
2025-01-23 | A Study of the Plausibility of Attention between RNN Encoders in Natural Language Inference | Duc Hau Nguyen et.al. | 2501.13735 | null |
2025-01-23 | A Transformer-based Autoregressive Decoder Architecture for Hierarchical Text Classification | Younes Yousef et.al. | 2501.13598 | link |
2025-01-23 | Multi-Level Attention and Contrastive Learning for Enhanced Text Classification with an Optimized Transformer | Jia Gao et.al. | 2501.13467 | null |
2025-01-23 | Atmospheric Noise-Resilient Image Classification in a Real-World Scenario: Using Hybrid CNN and Pin-GTSVM | Shlok Mehendale et.al. | 2501.13422 | null |
2025-01-23 | AEON: Adaptive Estimation of Instance-Dependent In-Distribution and Out-of-Distribution Label Noise for Robust Learning | Arpit Garg et.al. | 2501.13389 | null |
2025-01-23 | Multi-aspect Knowledge Distillation with Large Language Model | Taegyeong Lee et.al. | 2501.13341 | null |
2025-01-22 | Revisiting Data Augmentation for Ultrasound Images | Adam Tupper et.al. | 2501.13193 | link |
2025-01-22 | Regularization, Semi-supervision, and Supervision for a Plausible Attention-Based Explanation | Duc Hau Nguyen et.al. | 2501.12775 | link |
2025-01-22 | Estimating the Conformal Prediction Threshold from Noisy Labels | Coby Penso et.al. | 2501.12749 | link |
2025-01-22 | Adapting OpenAI's CLIP Model for Few-Shot Image Inspection in Manufacturing Quality Control: An Expository Case Study with Multiple Application Examples | Fadel M. Megahed et.al. | 2501.12596 | null |
2025-01-21 | Efficient Lung Ultrasound Severity Scoring Using Dedicated Feature Extractor | Jiaqi Guo et.al. | 2501.12524 | null |
2025-01-21 | CCESAR: Coastline Classification-Extraction From SAR Images Using CNN-U-Net Combination | Vidhu Arora et.al. | 2501.12384 | null |
2025-01-21 | CBVLM: Training-free Explainable Concept-based Large Vision Language Models for Medical Image Classification | Cristiano Patrício et.al. | 2501.12266 | null |
2025-01-21 | Early Detection and Classification of Breast Cancer Using Deep Learning Techniques | Mst. Mumtahina Labonno et.al. | 2501.12217 | null |
2025-01-21 | UAV-Assisted Real-Time Disaster Detection Using Optimized Transformer Model | Branislava Jankovic et.al. | 2501.12087 | null |
2025-01-20 | Communication-Efficient Federated Learning Based on Explanation-Guided Pruning for Remote Sensing Image Classification | Jonas Klotz et.al. | 2501.11493 | null |
2025-01-22 | QGAIC: Quantum Inspired Genetic Algorithm for Image Classification | Akhilesh Kumar Singh et.al. | 2501.11477 | null |
2025-01-20 | GenVidBench: A Challenging Benchmark for Detecting AI-Generated Video | Zhenliang Ni et.al. | 2501.11340 | null |
2025-01-20 | KPL: Training-Free Medical Knowledge Mining of Vision-Language Models | Jiaxiang Liu et.al. | 2501.11231 | link |
2025-01-19 | CLOFAI: A Dataset of Real And Fake Image Classification Tasks for Continual Learning | William Doherty et.al. | 2501.11140 | link |
2025-01-19 | Leveraging counterfactual concepts for debugging and improving CNN model performance | Syed Ali Tariq et.al. | 2501.11087 | null |
2025-01-17 | A Vision-Language Framework for Multispectral Scene Representation Using Language-Grounded Features | Enes Karanfil et.al. | 2501.10144 | null |
2025-01-17 | Classifier Ensemble for Efficient Uncertainty Calibration of Deep Neural Networks for Image Classification | Michael Schulze et.al. | 2501.10089 | null |
2025-01-17 | One-D-Piece: Image Tokenizer Meets Quality-Controllable Compression | Keita Miwa et.al. | 2501.10064 | null |
2025-01-17 | LWGANet: A Lightweight Group Attention Backbone for Remote Sensing Visual Tasks | Wei Lu et.al. | 2501.10040 | link |
2025-01-16 | Empirical Evaluation of Embedding Models in the Context of Text Classification in Document Review in Construction Delay Disputes | Fusheng Wei et.al. | 2501.09859 | null |
2025-01-16 | SRE-Conv: Symmetric Rotation Equivariant Convolution for Biomedical Image Classification | Yuexi Du et.al. | 2501.09753 | link |
2025-01-16 | Practical Continual Forgetting for Pre-trained Vision Models | Hongbo Zhao et.al. | 2501.09705 | link |
2025-01-16 | Multimodal Marvels of Deep Learning in Medical Diagnosis: A Comprehensive Review of COVID-19 Detection | Md Shofiqul Islama et.al. | 2501.09506 | link |
2025-01-16 | HydraMix: Multi-Image Feature Mixing for Small Data Image Classification | Christoph Reinders et.al. | 2501.09504 | null |
2025-01-16 | Quantum-Enhanced Transformers for Robust Acoustic Scene Classification in IoT Environments | Minh K. Quan et.al. | 2501.09394 | null |
2025-01-16 | Shape-Based Single Object Classification Using Ensemble Method Classifiers | Nur Shazwani Kamarudin et.al. | 2501.09311 | null |
2025-01-16 | Efficient Few-Shot Medical Image Analysis via Hierarchical Contrastive Vision-Language Learning | Harrison Fuller et.al. | 2501.09294 | null |
2025-01-16 | A Simple Graph Contrastive Learning Framework for Short Text Classification | Yonghao Liu et.al. | 2501.09219 | link |
2025-01-16 | Boosting Short Text Classification with Multi-Source Information Exploration and Dual-Level Contrastive Learning | Yonghao Liu et.al. | 2501.09214 | link |
2025-01-15 | Augmenting Human-Annotated Training Data with Large Language Model Generation and Distillation in Open-Response Assessment | Conrad Borchers et.al. | 2501.09126 | null |
2025-01-15 | IDEA: Image Description Enhanced CLIP-Adapter | Zhipeng Ye et.al. | 2501.08816 | null |
2025-01-15 | MIAFEx: An Attention-based Feature Extraction Method for Medical Image Classification | Oscar Ramos-Soto et.al. | 2501.08562 | null |
2025-01-14 | Towards Zero-Shot & Explainable Video Description by Reasoning over Graphs of Events in Space and Time | Mihai Masala et.al. | 2501.08460 | null |
2025-01-14 | Large Language Models For Text Classification: Case Study And Comprehensive Review | Arina Kostina et.al. | 2501.08457 | null |
2025-01-14 | READ: Reinforcement-based Adversarial Learning for Text Classification with Limited Labeled Data | Rohit Sharma et.al. | 2501.08035 | null |
2025-01-14 | Training Hybrid Neural Networks with Multimode Optical Nonlinearities Using Digital Twins | Ilker Oguz et.al. | 2501.07991 | null |
2025-01-14 | deepTerra -- AI Land Classification Made Easy | Andrew Keith Wilkinson et.al. | 2501.07859 | null |
2025-01-14 | A Low-cost and Ultra-lightweight Binary Neural Network for Traffic Signal Recognition | Mingke Xiao et.al. | 2501.07808 | null |
2025-01-14 | Balance Divergence for Knowledge Distillation | Yafei Qi et.al. | 2501.07804 | null |
2025-01-14 | Parameter-Inverted Image Pyramid Networks for Visual Perception and Multimodal Understanding | Zhaokai Wang et.al. | 2501.07783 | link |
2025-01-13 | Universal Training of Neural Networks to Achieve Bayes Optimal Classification Accuracy | Mohammadreza Tavasoli Naeini et.al. | 2501.07754 | null |
2025-01-13 | Uncertainty Guarantees on Automated Precision Weeding using Conformal Prediction | Paul Melki et.al. | 2501.07185 | null |
2025-01-13 | Adaptive Noise-Tolerant Network for Image Segmentation | Weizhi Li et.al. | 2501.07163 | null |
2025-01-12 | LarvSeg: Exploring Image Classification Data For Large Vocabulary Semantic Segmentation via Category-wise Attentive Classifier | Haojun Yu et.al. | 2501.06862 | link |
2025-01-12 | Rice Leaf Disease Detection: A Comparative Study Between CNN, Transformer and Non-neural Network Architectures | Samia Mehnaz et.al. | 2501.06740 | null |
2025-01-12 | Multi-Label Scene Classification in Remote Sensing Benefits from Image Super-Resolution | Ashitha Mudraje et.al. | 2501.06720 | null |
2025-01-11 | Synthetic Feature Augmentation Improves Generalization Performance of Language Models | Ashok Choudhary et.al. | 2501.06434 | null |
2025-01-10 | Kolmogorov-Arnold networks for metal surface defect classification | Maciej Krzywda et.al. | 2501.06389 | null |
2025-01-10 | Merging Feed-Forward Sublayers for Compressed Transformers | Neha Verma et.al. | 2501.06126 | link |
2025-01-10 | Averaged Adam accelerates stochastic optimization in the training of deep neural network approximations for partial differential equation and optimal control problems | Steffen Dereich et.al. | 2501.06081 | link |
2025-01-10 | Constrained Over-the-Air Model Updating for Wireless Online Federated Learning with Delayed Information | Juncheng Wang et.al. | 2501.05637 | null |
2025-01-10 | The Impact of Model Scaling on Seen and Unseen Language Performance | Rhitabrat Pokharel et.al. | 2501.05629 | null |
2025-01-09 | Vision-Language Models for Autonomous Driving: CLIP-Based Dynamic Scene Understanding | Mohammed Elhenawy et.al. | 2501.05566 | null |
2025-01-09 | Spatial Information Integration in Small Language Models for Document Layout Generation and Classification | Pablo Melendez et.al. | 2501.05497 | null |
2025-01-09 | An Empirical Study of Autoregressive Pre-training from Videos | Jathushan Rajasegaran et.al. | 2501.05453 | null |
2025-01-09 | A 1Mb mixed-precision quantized encoder for image classification and patch-based compression | Van Thien Nguyen et.al. | 2501.05097 | null |
2025-01-09 | A CT Image Classification Network Framework for Lung Tumors Based on Pre-trained MobileNetV2 Model and Transfer learning, And Its Application and Market Analysis in the Medical field | Ziyang Gao et.al. | 2501.04996 | null |
2025-01-09 | MambaHSI: Spatial-Spectral Mamba for Hyperspectral Image Classification | Yapeng Li et.al. | 2501.04944 | null |
2025-01-09 | A New Perspective on Privacy Protection in Federated Learning with Granular-Ball Computing | Guannan Lai et.al. | 2501.04940 | link |
2025-01-09 | ThriftLLM: On Cost-Effective Selection of Large Language Models for Classification Queries | Keke Huang et.al. | 2501.04901 | null |
2025-01-09 | Online Continual Learning: A Systematic Literature Review of Approaches, Challenges, and Benchmarks | Seyed Amir Bidaki et.al. | 2501.04897 | link |
2025-01-08 | Planarian Neural Networks: Evolutionary Patterns from Basic Bilateria Shaping Modern Artificial Neural Network Architectures | Ziyuan Huang et.al. | 2501.04700 | null |
2025-01-08 | Discrete Wavelet Transform-Based Capsule Network for Hyperspectral Image Classification | Zhiqiang Gao et.al. | 2501.04643 | null |
2025-01-08 | Enhancing Scene Classification in Cloudy Image Scenarios: A Collaborative Transfer Method with Information Regulation Mechanism using Optical Cloud-Covered and SAR Remote Sensing Images | Yuze Wang et.al. | 2501.04283 | null |
2025-01-08 | Comparison of Neural Models for X-ray Image Classification in COVID-19 Detection | Jimi Togni et.al. | 2501.04196 | null |
2025-01-07 | Temporal Feature Weaving for Neonatal Echocardiographic Viewpoint Video Classification | Satchel French et.al. | 2501.03967 | link |
2025-01-07 | Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback | Jiakang Yuan et.al. | 2501.03916 | null |
2025-01-07 | MedFocusCLIP : Improving few shot classification in medical datasets using pixel wise attention | Aadya Arora et.al. | 2501.03839 | null |
2025-01-07 | LHGNN: Local-Higher Order Graph Neural Networks For Audio Classification and Tagging | Shubhr Singh et.al. | 2501.03464 | null |
2025-01-06 | FTA-FTL: A Fine-Tuned Aggregation Federated Transfer Learning Scheme for Lithology Microscopic Image Classification | Keyvan RahimiZadeh et.al. | 2501.03349 | link |
2025-01-06 | CM3T: Framework for Efficient Multimodal Learning for Inhomogeneous Interaction Datasets | Tanay Agrawal et.al. | 2501.03332 | null |
2025-01-06 | Plant Leaf Disease Detection and Classification Using Deep Learning: A Review and A Proposed System on Bangladesh's Perspective | Md. Jalal Uddin Chowdhury et.al. | 2501.03305 | null |
2025-01-06 | Deep-Relative-Trust-Based Diffusion for Decentralized Deep Learning | Muyun Li et.al. | 2501.03162 | null |
2025-01-06 | Graph-based Retrieval Augmented Generation for Dynamic Few-shot Text Classification | Yubo Wang et.al. | 2501.02844 | null |
2025-01-06 | TARDiS : Text Augmentation for Refining Diversity and Separability | Kyungmin Kim et.al. | 2501.02739 | null |
2025-01-05 | FedRSClip: Federated Learning for Remote Sensing Scene Classification Using Vision-Language Models | Hui Lin et.al. | 2501.02461 | null |
2025-01-04 | Exploring Secure Machine Learning Through Payload Injection and FGSM Attacks on ResNet-50 | Umesh Yadav et.al. | 2501.02147 | null |
2025-01-03 | A Separable Self-attention Inspired by the State Space Model for Computer Vision | Juntao Zhang et.al. | 2501.02040 | link |
2025-01-03 | Google is all you need: Semi-Supervised Transfer Learning Strategy For Light Multimodal Multi-Task Classification Model | Haixu Liu et.al. | 2501.01611 | null |
2025-01-02 | Multi-Modal Video Feature Extraction for Popularity Prediction | Haixu Liu et.al. | 2501.01422 | null |
2025-01-02 | A Multi-task Supervised Compression Model for Split Computing | Yoshitomo Matsubara et.al. | 2501.01420 | link |
2025-01-02 | Multi-Head Explainer: A General Framework to Improve Explainability in CNNs and Transformers | Bohang Sun et.al. | 2501.01311 | null |
2025-01-02 | FAST: Fast Audio Spectrogram Transformer | Anugunj Naman et.al. | 2501.01104 | null |
2025-01-01 | A Novel Approach using CapsNet and Deep Belief Network for Detection and Identification of Oral Leukopenia | Hirthik Mathesh GV et.al. | 2501.00876 | null |
2025-01-01 | Ensuring superior learning outcomes and data security for authorized learner | Jeongho Bang et.al. | 2501.00754 | null |
2024-12-31 | TSPE: Task-Specific Prompt Ensemble for Improved Zero-Shot Audio Classification | Nishit Anand et.al. | 2501.00398 | null |
2024-12-31 | Exploring Variability in Fine-Tuned Models for Text Classification with DistilBERT | Giuliano Lorenzoni et.al. | 2501.00241 | null |
2024-12-30 | The Text Classification Pipeline: Starting Shallow going Deeper | Marco Siino et.al. | 2501.00174 | null |
2024-12-30 | Text Classification: Neural Networks VS Machine Learning Models VS Pre-trained Models | Christos Petridis et.al. | 2412.21022 | null |
2024-12-30 | FPGA-based Acceleration of Neural Network for Image Classification using Vitis AI | Zhengdong Li et.al. | 2412.20974 | null |
2024-12-30 | Uncertainty-Aware Out-of-Distribution Detection with Gaussian Processes | Yang Chen et.al. | 2412.20918 | null |
2024-12-30 | UniRS: Unifying Multi-temporal Remote Sensing Tasks through Vision Language Models | Yujie Li et.al. | 2412.20742 | null |
2024-12-30 | Improving Acoustic Scene Classification in Low-Resource Conditions | Zhi Chen et.al. | 2412.20722 | null |
2024-12-29 | Hilbert Curve Based Molecular Sequence Analysis | Sarwan Ali et.al. | 2412.20616 | null |
2024-12-29 | A Novel FPGA-based CNN Hardware Accelerator: Optimization for Convolutional Layers using Karatsuba Ofman Multiplier | Amit Sarkar et.al. | 2412.20393 | null |
2024-12-29 | HindiLLM: Large Language Model for Hindi | Sanjay Chouhan et.al. | 2412.20357 | null |
2024-12-29 | Deep Learning in Image Classification: Evaluating VGG19's Performance on Complex Visual Data | Weijie He et.al. | 2412.20345 | null |
2024-12-28 | Few-shot Algorithm Assurance | Dang Nguyen et.al. | 2412.20275 | null |
2024-12-27 | Asymmetrical Reciprocity-based Federated Learning for Resolving Disparities in Medical Diagnosis | Jiaqi Wang et.al. | 2412.19654 | null |
2024-12-27 | Enhancing Fine-grained Image Classification through Attentive Batch Training | Duy M. Le et.al. | 2412.19606 | null |
2024-12-27 | A Comparative Study of Machine Unlearning Techniques for Image and Text Classification Models | Omar M. Safa et.al. | 2412.19583 | null |
2024-12-27 | Multi-label Classification using Deep Multi-order Context-aware Kernel Networks | Mingyuan Jiu et.al. | 2412.19491 | null |
2024-12-27 | Residual Feature-Reutilization Inception Network for Image Classification | Yuanpeng He et.al. | 2412.19433 | null |
2024-12-27 | An In-Depth Analysis of Adversarial Discriminative Domain Adaptation for Digit Classification | Eugene Choi et.al. | 2412.19391 | link |
2024-12-26 | Assessing Pre-trained Models for Transfer Learning through Distribution of Spectral Components | Tengxue Zhang et.al. | 2412.19085 | null |
2024-12-26 | Let the Rule Speak: Enhancing In-context Learning Debiasing with Interpretability | Ruixi Lin et.al. | 2412.19018 | null |
2024-12-25 | Injecting Bias into Text Classification Models using Backdoor Attacks | A. Dilara Yavuz et.al. | 2412.18975 | null |
2024-12-25 | Research Experiment on Multi-Model Comparison for Chinese Text Classification Tasks | JiaCheng Li et.al. | 2412.18908 | null |
2024-12-24 | VisionGRU: A Linear-Complexity RNN Model for Efficient Image Analysis | Shicheng Yin et.al. | 2412.18178 | link |
2024-12-24 | Beyond Gradient Averaging in Parallel Optimization: Improved Robustness through Gradient Agreement Filtering | Francois Chaubard et.al. | 2412.18052 | null |
2024-12-23 | Explainability in Neural Networks for Natural Language Processing Tasks | Melkamu Mersha et.al. | 2412.18036 | null |
2024-12-23 | COBRA: COmBinatorial Retrieval Augmentation for Few-Shot Learning | Arnav M. Das et.al. | 2412.17684 | null |
2024-12-23 | Resource-Aware Arabic LLM Creation: Model Adaptation, Integration, and Multi-Domain Testing | Prakash Aryan et.al. | 2412.17548 | link |
2024-12-23 | Domain-Incremental Learning for Audio Classification | Manjunath Mulimani et.al. | 2412.17424 | null |
2024-12-23 | An Experimental Evaluation of Japanese Tokenizers for Sentiment-Based Text Classification | Andre Rusli et.al. | 2412.17361 | link |
2024-12-23 | DiffFormer: a Differential Spatial-Spectral Transformer for Hyperspectral Image Classification | Muhammad Ahmad et.al. | 2412.17350 | link |
2024-12-22 | Survey on Abstractive Text Summarization: Dataset, Models, and Metrics | Gospel Ozioma Nnadi et.al. | 2412.17165 | link |
2024-12-22 | LH-Mix: Local Hierarchy Correlation Guided Mixup over Hierarchical Prompt Tuning | Fanshuang Kong et.al. | 2412.16963 | link |
2024-12-22 | Predicting the Reliability of an Image Classifier under Image Distortion | Dang Nguyen et.al. | 2412.16881 | null |
2024-12-21 | Forget Vectors at Play: Universal Input Perturbations Driving Machine Unlearning in Image Classification | Changchang Sun et.al. | 2412.16780 | null |
2024-12-21 | UNEM: UNrolled Generalized EM for Transductive Few-Shot Learning | Long Zhou et.al. | 2412.16739 | link |
2024-12-20 | Mamba2D: A Natively Multi-Dimensional State-Space Model for Vision Tasks | Enis Baty et.al. | 2412.16146 | null |
2024-12-20 | Towards Interpretable Radiology Report Generation via Concept Bottlenecks using a Multi-Agentic RAG | Hasan Md Tusfiqur Alam et.al. | 2412.16086 | link |
2024-12-20 | A Thorough Investigation into the Application of Deep CNN for Enhancing Natural Language Processing Capabilities | Chang Weng et.al. | 2412.15900 | null |
2024-12-20 | Continual Learning Using a Kernel-Based Method Over Foundation Models | Saleh Momeni et.al. | 2412.15571 | link |
2024-12-19 | Time Will Tell: Timing Side Channels via Output Token Count in Large Language Models | Tianchen Zhang et.al. | 2412.15431 | null |
2024-12-19 | Till the Layers Collapse: Compressing a Deep Neural Network through the Lenses of Batch Normalization Layers | Zhu Liao et.al. | 2412.15077 | null |
2024-12-18 | Zero-Shot Prompting and Few-Shot Fine-Tuning: Revisiting Document Image Classification Using Large Language Models | Anna Scius-Bertrand et.al. | 2412.13859 | null |
2024-12-18 | Modelling Multi-modal Cross-interaction for ML-FSIC Based on Local Feature Selection | Kun Yan et.al. | 2412.13732 | null |
2024-12-18 | MBInception: A new Multi-Block Inception Model for Enhancing Image Processing Efficiency | Fatemeh Froughirad et.al. | 2412.13703 | null |
2024-12-17 | Identifying Bias in Deep Neural Networks Using Image Transforms | Sai Teja Erukude et.al. | 2412.13079 | link |
2024-12-17 | Token-Level Graphs for Short Text Classification | Gregor Donabauer et.al. | 2412.12754 | link |
2024-12-17 | Your Next State-of-the-Art Could Come from Another Domain: A Cross-Domain Analysis of Hierarchical Text Classification | Nan Li et.al. | 2412.12744 | link |
2024-12-17 | ShotVL: Human-Centric Highlight Frame Retrieval via Language Queries | Wangyu Xue et.al. | 2412.12675 | null |
2024-12-17 | Structural Pruning via Spatial-aware Information Redundancy for Semantic Segmentation | Dongyue Wu et.al. | 2412.12672 | link |
2024-12-19 | RemoteTrimmer: Adaptive Structural Pruning for Remote Sensing Image Classification | Guangwenjie Zou et.al. | 2412.12603 | link |
2024-12-17 | Addressing Small and Imbalanced Medical Image Datasets Using Generative Models: A Comparative Study of DDPM and PGGANs with Random and Greedy K Sampling | Iman Khazrak et.al. | 2412.12532 | link |
2024-12-16 | Gramian Multimodal Representation Learning and Alignment | Giordano Cicchetti et.al. | 2412.11959 | null |
2024-12-16 | The Impact of Generalization Techniques on the Interplay Among Privacy, Utility, and Fairness in Image Classification | Ahmad Hassanpour et.al. | 2412.11951 | null |
2024-12-16 | Does VLM Classification Benefit from LLM Description Semantics? | Pingchuan Ma et.al. | 2412.11917 | link |
2024-12-16 | Discrepancy-Aware Attention Network for Enhanced Audio-Visual Zero-Shot Learning | RunLin Yu et.al. | 2412.11715 | null |
2024-12-16 | LMM-Regularized CLIP Embeddings for Image Classification | Maria Tzelepi et.al. | 2412.11663 | null |
2024-12-16 | Non-Convex Optimization in Federated Learning via Variance Reduction and Adaptive Learning | Dipanwita Thakur et.al. | 2412.11660 | null |
2024-12-16 | CNNtention: Can CNNs do better with Attention? | Julian Glattki et.al. | 2412.11657 | null |
2024-12-16 | Explicit and Implicit Graduated Optimization in Deep Neural Networks | Naoki Sato et.al. | 2412.11501 | link |
2024-12-16 | Towards Better Multi-task Learning: A Framework for Optimizing Dataset Combinations in Large Language Models | Zaifu Zhan et.al. | 2412.11455 | null |
2024-12-16 | Scaled Conjugate Gradient Method for Nonconvex Optimization in Deep Neural Networks | Naoki Sato et.al. | 2412.11400 | null |
2024-12-13 | Robust image classification with multi-modal large language models | Francesco Villani et.al. | 2412.10353 | null |
2024-12-13 | MVQ: Towards Efficient DNN Compression and Acceleration with Masked Vector Quantization | Shuaiting Li et.al. | 2412.10261 | null |
2024-12-13 | Label-template based Few-Shot Text Classification with Contrastive Learning | Guanghua Hou et.al. | 2412.10110 | null |
2024-12-13 | Data Pruning Can Do More: A Comprehensive Data Pruning Approach for Object Re-identification | Zi Yang et.al. | 2412.10091 | link |
2024-12-13 | Low-Resource Fast Text Classification Based on Intra-Class and Inter-Class Distance Calculation | Yanxu Mao et.al. | 2412.09922 | null |
2024-12-12 | DQA: An Efficient Method for Deep Quantization of Deep Neural Network Activations | Wenhao Hu et.al. | 2412.09687 | null |
2024-12-12 | Embeddings are all you need! Achieving High Performance Medical Image Classification through Training-Free Embedding Analysis | Raj Hansini Khoiwal et.al. | 2412.09445 | null |
2024-12-12 | Learned Compression for Compressed Learning | Dan Jacobellis et.al. | 2412.09405 | link |
2024-12-12 | Advancing Attribution-Based Neural Network Explainability through Relative Absolute Magnitude Layer-Wise Relevance Propagation and Multi-Component Evaluation | Davor Vukadin et.al. | 2412.09311 | link |
2024-12-13 | An Efficient Framework for Enhancing Discriminative Models via Diffusion Techniques | Chunxiao Li et.al. | 2412.09063 | null |
2024-12-12 | STEAM: Squeeze and Transform Enhanced Attention Module | Rishabh Sabharwal et.al. | 2412.09023 | null |
2024-12-12 | Stochastic Learning of Non-Conjugate Variational Posterior for Image Classification | Kart-Leong Lim et.al. | 2412.08951 | null |
2024-12-11 | BDA: Bangla Text Data Augmentation Framework | Md. Tariquzzaman et.al. | 2412.08753 | null |
2024-12-11 | Advancing Single- and Multi-task Text Classification through Large Language Model Fine-tuning | Hang Zhao et.al. | 2412.08587 | null |
2024-12-11 | ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts | Sinan Du et.al. | 2412.08341 | null |
2024-12-11 | Online training and pruning of photonic neural networks | Jiawei Zhang et.al. | 2412.08184 | null |
2024-12-11 | Wasserstein Distance Rivals Kullback-Leibler Divergence for Knowledge Distillation | Jiaming Lv et.al. | 2412.08139 | null |
2024-12-11 | Concept Bottleneck Large Language Models | Chung-En Sun et.al. | 2412.07992 | link |
2024-12-10 | FastDDS-Based Middleware System for Remote X-Ray Image Classification Using Raspberry Pi | Omar H. Khater et.al. | 2412.07818 | null |
2024-12-10 | Leveraging Content and Context Cues for Low-Light Image Enhancement | Igor Morawski et.al. | 2412.07693 | link |
2024-12-10 | Post-Training Non-Uniform Quantization for Convolutional Neural Networks | Ahmed Luqman et.al. | 2412.07391 | null |
2024-12-10 | Image Classification Using Singular Value Decomposition and Optimization | Isabela M. Yepes et.al. | 2412.07288 | link |
2024-12-10 | An Enhancement of CNN Algorithm for Rice Leaf Disease Image Classification in Mobile Applications | Kayne Uriel K. Rodrigo et.al. | 2412.07182 | null |
2024-12-09 | Convolution goes higher-order: a biologically inspired mechanism empowers image classification | Simone Azeglio et.al. | 2412.06740 | null |
2024-12-09 | Impact of Privacy Parameters on Deep Learning Models for Image Classification | Basanta Chaulagain et.al. | 2412.06689 | null |
2024-12-10 | Data Quality Enhancement on the Basis of Diversity with Large Language Models for Text Classification: Uncovered, Difficult, and Noisy | Min Zeng et.al. | 2412.06575 | null |
2024-12-09 | How Certain are Uncertainty Estimates? Three Novel Earth Observation Datasets for Benchmarking Uncertainty Quantification in Machine Learning | Yuanyuan Wang et.al. | 2412.06451 | null |
2024-12-09 | Optimizing Multi-Task Learning for Enhanced Performance in Large Language Models | Zhen Qi et.al. | 2412.06249 | null |
2024-12-08 | Hyperspectral Image Spectral-Spatial Feature Extraction via Tensor Principal Component Analysis | Yuemei Ren et.al. | 2412.06075 | null |
2024-12-08 | Vision Transformer-based Semantic Communications With Importance-Aware Quantization | Joohyuk Park et.al. | 2412.06038 | null |
2024-12-06 | Sparse autoencoders reveal selective remapping of visual concepts during adaptation | Hyesu Lim et.al. | 2412.05276 | link |
2024-12-06 | MTSpark: Enabling Multi-Task Learning with Spiking Neural Networks for Generalist Agents | Avaneesh Devkota et.al. | 2412.04847 | null |
2024-12-05 | Grounding Descriptions in Images informs Zero-Shot Visual Recognition | Shaunak Halbe et.al. | 2412.04429 | link |
2024-12-05 | FedDUAL: A Dual-Strategy with Adaptive Loss and Dynamic Aggregation for Mitigating Data Heterogeneity in Federated Learning | Pranab Sahoo et.al. | 2412.04416 | link |
2024-12-05 | Enhancing Whole Slide Image Classification through Supervised Contrastive Domain Adaptation | Ilán Carretero et.al. | 2412.04260 | null |
2024-12-05 | Demonstration Selection for In-Context Learning via Reinforcement Learning | Xubin Wang et.al. | 2412.03966 | null |
2024-12-05 | Quantized and Interpretable Learning Scheme for Deep Neural Networks in Classification Task | Alireza Maleki et.al. | 2412.03915 | null |
2024-12-05 | Multisource Collaborative Domain Generalization for Cross-Scene Remote Sensing Image Classification | Zhu Han et.al. | 2412.03897 | null |
2024-12-05 | Dual-Branch Subpixel-Guided Network for Hyperspectral Image Classification | Zhu Han et.al. | 2412.03893 | link |
2024-12-04 | Language Model Meets Prototypes: Towards Interpretable Text Classification Models through Prototypical Networks | Ximing Wen et.al. | 2412.03761 | null |
2024-12-05 | Continual Low-Rank Scaled Dot-product Attention | Ginés Carreto Picón et.al. | 2412.03214 | null |
2024-12-04 | Multi-Level Correlation Network For Few-Shot Image Classification | Yunkai Dang et.al. | 2412.03159 | link |
2024-12-04 | Assessing the performance of CT image denoisers using Laguerre-Gauss Channelized Hotelling Observer for lesion detection | Prabhat Kc et.al. | 2412.02920 | null |
2024-12-04 | Higher Order Transformers: Efficient Attention Mechanism for Tensor Structured Data | Soroush Omranpour et.al. | 2412.02919 | null |
2024-12-03 | Synergistic Development of Perovskite Memristors and Algorithms for Robust Analog Computing | Nanyang Ye et.al. | 2412.02779 | null |
2024-12-03 | Mixture of Physical Priors Adapter for Parameter-Efficient Fine-Tuning | Zhaozhi Wang et.al. | 2412.02759 | null |
2024-12-03 | Multimodal Remote Sensing Scene Classification Using VLMs and Dual-Cross Attention Networks | Jinjin Cai et.al. | 2412.02531 | null |
2024-12-04 | GenMix: Effective Data Augmentation with Generative Diffusion Model Image Editing | Khawar Islam et.al. | 2412.02366 | null |
2024-12-03 | Multi-Granularity Tibetan Textual Adversarial Attack Method Based on Masked Language Model | Xi Cao et.al. | 2412.02343 | null |
2024-12-03 | Active Learning via Classifier Impact and Greedy Selection for Interactive Image Retrieval | Leah Bar et.al. | 2412.02310 | link |
2024-12-03 | A Classic-Quantum Hybrid Network Framework: CQH-Net | Ao Liu et.al. | 2412.02059 | null |
2024-12-02 | PROFIT: A PROximal FIne Tuning Optimizer for Multi-Task Learning | Anirudh S Chakravarthy et.al. | 2412.01930 | null |
2024-12-02 | Concept Based Continuous Prompts for Interpretable Text Classification | Qian Chen et.al. | 2412.01644 | link |
2024-12-02 | NYT-Connections: A Deceptively Simple Text Classification Task that Stumps System-1 Thinkers | Angel Yahir Loredo Lopez et.al. | 2412.01621 | null |
2024-12-02 | Explaining the Unexplained: Revealing Hidden Correlations for Better Interpretability | Wen-Dong Jiang et.al. | 2412.01365 | null |
2024-12-02 | Class Distance Weighted Cross Entropy Loss for Classification of Disease Severity | Gorkem Polat et.al. | 2412.01246 | null |
2024-11-29 | LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification | Taja Kuzman et.al. | 2411.19638 | link |
2024-11-29 | FairDD: Fair Dataset Distillation via Synchronized Matching | Qihang Zhou et.al. | 2411.19623 | null |
2024-11-29 | Memristive Nanowire Network for Energy Efficient Audio Classification: Pre-Processing-Free Reservoir Computing with Reduced Latency | Akshaya Rajesh et.al. | 2411.19611 | null |
2024-11-29 | Contextual Checkerboard Denoise -- A Novel Neural Network-Based Approach for Classification-Aware OCT Image Denoising | Md. Touhidul Islam et.al. | 2411.19549 | link |
2024-11-28 | CLIP meets DINO for Tuning Zero-Shot Classifier using Unlabeled Image Collections | Mohamed Fazli Imam et.al. | 2411.19346 | link |
2024-11-28 | Quantum Neural Networks in Practice: A Comparative Study with Classical Models from Standard Data Sets to Industrial Images | Daniel Basilewitsch et.al. | 2411.19276 | null |
2024-11-28 | Controlling Participation in Federated Learning with Feedback | Michael Cummins et.al. | 2411.19242 | null |
2024-11-28 | Introducing Three New Benchmark Datasets for Hierarchical Text Classification | Jaco du Toit et.al. | 2411.19119 | null |
2024-11-28 | MVFormer: Diversifying Feature Normalization and Token Mixing for Efficient Vision Transformers | Jongseong Bae et.al. | 2411.18995 | null |
2024-11-27 | Fall Leaf Adversarial Attack on Traffic Sign Classification | Anthony Etim et.al. | 2411.18776 | null |
2024-11-27 | Leveraging Semi-Supervised Learning to Enhance Data Mining for Image Classification under Limited Labeled Data | Aoran Shen et.al. | 2411.18622 | null |
2024-11-27 | Pruning Deep Convolutional Neural Network Using Conditional Mutual Information | Tien Vu-Van et.al. | 2411.18578 | null |
2024-11-27 | Mixture of Experts in Image Classification: What's the Sweet Spot? | Mathurin Videau et.al. | 2411.18322 | null |
2024-11-27 | KANs for Computer Vision: An Experimental Study | Karthik Mohan et.al. | 2411.18224 | null |
2024-11-27 | Spectral-Spatial Transformer with Active Transfer Learning for Hyperspectral Image Classification | Muhammad Ahmad et.al. | 2411.18115 | link |
2024-11-27 | Vision Mamba Distillation for Low-resolution Fine-grained Image Classification | Yao Chen et.al. | 2411.17980 | link |
2024-11-27 | Optimized Tradeoffs for Private Prediction with Majority Ensembling | Shuli Jiang et.al. | 2411.17965 | null |
2024-11-26 | What Differentiates Educational Literature? A Multimodal Fusion Approach of Transformers and Computational Linguistics | Jordan J. Bird et.al. | 2411.17593 | null |
2024-11-26 | TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba | Xiaowen Ma et.al. | 2411.17473 | link |
2024-11-26 | SpikeAtConv: An Integrated Spiking-Convolutional Attention Architecture for Energy-Efficient Neuromorphic Vision Processing | Wangdan Liao et.al. | 2411.17439 | null |
2024-11-26 | CoA: Chain-of-Action for Generative Semantic Labels | Meng Wei et.al. | 2411.17406 | link |
2024-11-26 | BadScan: An Architectural Backdoor Attack on Visual State Space Models | Om Suhas Deshmukh et.al. | 2411.17283 | null |
2024-11-26 | An In-depth Investigation of Sparse Rate Reduction in Transformer-like Models | Yunzhe Hu et.al. | 2411.17182 | null |
2024-11-25 | Contrastive Multi-graph Learning with Neighbor Hierarchical Sifting for Semi-supervised Text Classification | Wei Ai et.al. | 2411.16787 | null |
2024-11-25 | A Supervised Machine Learning Approach for Assessing Grant Peer Review Reports | Gabriel Okasa et.al. | 2411.16662 | link |
2024-11-25 | Debiasing Classifiers by Amplifying Bias with Latent Diffusion and Large Language Models | Donggeun Ko et.al. | 2411.16079 | null |
2024-11-24 | Context-Aware Detection of Mixed Critical Events using Video Classification | Filza Akhlaq et.al. | 2411.15773 | null |
2024-11-23 | MUNBa: Machine Unlearning via Nash Bargaining | Jing Wu et.al. | 2411.15537 | null |
2024-11-23 | Twin Trigger Generative Networks for Backdoor Attacks against Object Detection | Zhiying Li et.al. | 2411.15439 | null |
2024-11-22 | MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs | Chaoyou Fu et.al. | 2411.15296 | null |
2024-11-21 | CODE-CL: COnceptor-Based Gradient Projection for DEep Continual Learning | Marco Paul E. Apolinario et.al. | 2411.15235 | null |
2024-11-21 | BiomedCoOp: Learning to Prompt for Biomedical Vision-Language Models | Taha Koleilat et.al. | 2411.15232 | null |
2024-11-22 | FOCUS: Knowledge-enhanced Adaptive Visual Compression for Few-shot Whole Slide Image Classification | Zhengrui Guo et.al. | 2411.14743 | link |
2024-11-21 | Adaptable Embeddings Network (AEN) | Stan Loosmore et.al. | 2411.13786 | null |
2024-11-20 | Hierarchical Text Classification (HTC) vs. eXtreme Multilabel Classification (XML): Two Sides of the Same Medal | Nerijus Bertalis et.al. | 2411.13687 | link |
2024-11-20 | Combining Autoregressive and Autoencoder Language Models for Text Classification | João Gonçalves et.al. | 2411.13282 | link |
2024-11-20 | MEGL: Multimodal Explanation-Guided Learning | Yifei Zhang et.al. | 2411.13053 | null |
2024-11-19 | Problem-dependent convergence bounds for randomized linear gradient compression | Thomas Flynn et.al. | 2411.12898 | null |
2024-11-19 | Enhancing Multi-Class Disease Classification: Neoplasms, Cardiovascular, Nervous System, and Digestive Disorders Using Advanced LLMs | Ahmed Akib Jawad Karim et.al. | 2411.12712 | null |
2024-11-22 | STREAM: A Universal State-Space Model for Sparse Geometric Data | Mark Schöne et.al. | 2411.12603 | null |
2024-11-19 | AdaCM | Yuanbin Man et.al. | 2411.12593 | null |
2024-11-19 | Zero-Shot Crate Digging: DJ Tool Retrieval Using Speech Activity, Music Structure And CLAP Embeddings | Iroro Orife et.al. | 2411.12209 | link |
2024-11-19 | Invariant Shape Representation Learning For Image Classification | Tonmoy Hossain et.al. | 2411.12201 | link |
2024-11-19 | Self-Supervised Learning in Deep Networks: A Pathway to Robust Few-Shot Classification | Yuyang Xiao et.al. | 2411.12151 | null |
2024-11-18 | Just Leaf It: Accelerating Diffusion Classifiers with Hierarchical Class Pruning | Arundhati S. Shanbhag et.al. | 2411.12073 | link |
2024-11-18 | Vision Language Models Are Few-Shot Audio Spectrogram Classifiers | Satvik Dixit et.al. | 2411.12058 | null |
2024-11-18 | Fair Distillation: Teaching Fairness from Biased Teachers in Medical Imaging | Milad Masroor et.al. | 2411.11939 | null |
2024-11-18 | Exploring Emerging Trends and Research Opportunities in Visual Place Recognition | Antonios Gasteratos et.al. | 2411.11481 | null |
2024-11-16 | MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map | Yuhong Chou et.al. | 2411.10741 | null |
2024-11-16 | Diagnostic Text-guided Representation Learning in Hierarchical Classification for Pathological Whole Slide Image | Jiawen Li et.al. | 2411.10709 | null |
2024-11-16 | Multi-perspective Contrastive Logit Distillation | Qi Wang et.al. | 2411.10693 | null |
2024-11-15 | Vision Eagle Attention: A New Lens for Advancing Image Classification | Mahmudul Hasan et.al. | 2411.10564 | link |
2024-11-15 | On the Cost of Model-Serving Frameworks: An Experimental Evaluation | Pasquale De Rosa et.al. | 2411.10337 | null |
2024-11-15 | Embedding Byzantine Fault Tolerance into Federated Learning via Virtual Data-Driven Consistency Scoring Plugin | Youngjoon Lee et.al. | 2411.10212 | link |
2024-11-15 | Outliers resistant image classification by anomaly detection | Anton Sergeev et.al. | 2411.10150 | null |
2024-11-15 | Adapting the Biological SSVEP Response to Artificial Neural Networks | Emirhan Böge et.al. | 2411.10084 | null |
2024-11-15 | Evidential Federated Learning for Skin Lesion Image Classification | Rutger Hendrix et.al. | 2411.10071 | null |
2024-11-14 | Adversarial Attacks Using Differentiable Rendering: A Survey | Matthew Hull et.al. | 2411.09749 | null |
2024-11-14 | ResidualDroppath: Enhancing Feature Reuse over Residual Connections | Sejik Park et.al. | 2411.09475 | null |
2024-11-14 | SAG-ViT: A Scale-Aware, High-Fidelity Patching Approach with Graph Attention for Vision Transformers | Shravan Venkatraman et.al. | 2411.09420 | null |
2024-11-14 | Heuristical Comparison of Vision Transformers Against Convolutional Neural Networks for Semantic Segmentation on Remote Sensing Imagery | Ashim Dahal et.al. | 2411.09101 | link |
2024-11-13 | Computed tomography using meta-optics | Maksym Zhelyeznuyakov et.al. | 2411.08995 | null |
2024-11-13 | CoCoP: Enhancing Text Classification with LLM through Code Completion Prompt | Mohammad Mahdi Mohajeri et.al. | 2411.08979 | null |
2024-11-13 | ScaleNet: Scale Invariance Learning in Directed Graphs | Qin Jiang et.al. | 2411.08758 | link |
2024-11-13 | Efficient Whole Slide Image Classification through Fisher Vector Representation | Ravi Kant Gupta et.al. | 2411.08530 | null |
2024-11-12 | HMIL: Hierarchical Multi-Instance Learning for Fine-Grained Whole Slide Image Classification | Cheng Jin et.al. | 2411.07660 | null |
2024-11-12 | Semantic segmentation on multi-resolution optical and microwave data using deep learning | Jai G Singla et.al. | 2411.07581 | null |
2024-11-11 | The Inherent Adversarial Robustness of Analog In-Memory Computing | Corey Lammie et.al. | 2411.07023 | null |
2024-11-11 | ScaleKD: Strong Vision Transformers Could Be Excellent Teachers | Jiawei Fan et.al. | 2411.06786 | link |
2024-11-11 | A Text Classification Model Combining Adversarial Training with Pre-trained Language Model and neural networks: A Case Study on Telecom Fraud Incident Texts | Liu Zhuoxian et.al. | 2411.06772 | null |
2024-11-11 | Can KAN Work? Exploring the Potential of Kolmogorov-Arnold Networks in Computer Vision | Yueyang Cang et.al. | 2411.06727 | null |
2024-11-10 | Deep Active Learning in the Open World | Tian Xie et.al. | 2411.06353 | null |
2024-11-09 | Clustering Algorithms and RAG Enhancing Semi-Supervised Text Classification with Large LLMs | Shan Zhong et.al. | 2411.06175 | null |
2024-11-09 | AI-Compass: A Comprehensive and Effective Multi-module Testing Tool for AI Systems | Zhiyu Zhu et.al. | 2411.06146 | null |
2024-11-09 | Exploring Structural Nonlinearity in Binary Polariton-Based Neuromorphic Architectures | Evgeny Sedov et.al. | 2411.06124 | null |
2024-11-09 | Mutual-energy inner product optimization method for constructing feature coordinates and image classification in Machine Learning | Yuanxiu Wang et.al. | 2411.06100 | null |
2024-11-08 | GUIDEQ: Framework for Guided Questioning for progressive informational collection and classification | Priya Mishra et.al. | 2411.05991 | link |
2024-11-08 | FisherMask: Enhancing Neural Network Labeling Efficiency in Image Classification Using Fisher Information | Shreen Gul et.al. | 2411.05752 | link |
2024-11-08 | Visual-TCAV: Concept-based Attribution and Saliency Maps for Post-hoc Explainability in Image Classification | Antonio De Santis et.al. | 2411.05698 | null |
2024-11-08 | Efficient Audio-Visual Fusion for Video Classification | Mahrukh Awan et.al. | 2411.05603 | null |
2024-11-08 | Training objective drives the consistency of representational similarity across datasets | Laure Ciernik et.al. | 2411.05561 | link |
2024-11-08 | Estimating the Influence of Sequentially Correlated Literary Properties in Textual Classification: A Data-Centric Hypothesis-Testing Approach | Gideon Yoffe et.al. | 2411.04950 | null |
2024-11-07 | Attention Masks Help Adversarial Attacks to Bypass Safety Detectors | Yunfan Shi et.al. | 2411.04772 | link |
2024-11-07 | Zero-Shot Temporal Resolution Domain Adaptation for Spiking Neural Networks | Sanja Karilanova et.al. | 2411.04760 | null |
2024-11-07 | Is network fragmentation a useful complexity measure? | Coenraad Mouton et.al. | 2411.04695 | null |
2024-11-07 | DISCO: DISCovering Overfittings as Causal Rules for Text Classification Models | Zijian Zhang et.al. | 2411.04649 | null |
2024-11-07 | Neural Fingerprints for Adversarial Attack Detection | Haim Fisher et.al. | 2411.04533 | link |
2024-11-06 | Multimodal Structure-Aware Quantum Data Processing | Hala Hawashin et.al. | 2411.04242 | null |
2024-11-06 | RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models | Maya Varma et.al. | 2411.04097 | link |
2024-11-06 | Overcoming label shift in targeted federated learning | Edvin Listo Zec et.al. | 2411.03799 | null |
2024-11-06 | Deferred Poisoning: Making the Model More Vulnerable via Hessian Singularization | Yuhao He et.al. | 2411.03752 | null |
2024-11-05 | Judge Like a Real Doctor: Dual Teacher Sample Consistency Framework for Semi-supervised Medical Image Classification | Zhang Qixiang et.al. | 2411.03041 | null |
2024-11-06 | Confidence Calibration of Classifiers with Many Classes | Adrien LeCoz et.al. | 2411.02988 | link |
2024-11-05 | Domain Expansion and Boundary Growth for Open-Set Single-Source Domain Generalization | Pengkun Jiao et.al. | 2411.02920 | null |
2024-11-05 | ADOPT: Modified Adam Can Converge with Any | Shohei Taniguchi et.al. | 2411.02853 | link |
2024-11-05 | Integrated lithium niobate photonic computing circuit based on efficient and high-speed electro-optic conversion | Yaowen Hu et.al. | 2411.02734 | null |
2024-11-06 | Wave Network: An Ultra-Small Language Model | Xin Zhang et.al. | 2411.02674 | null |
2024-11-04 | FUSECAPS: Investigating Feature Fusion Based Framework for Capsule Endoscopy Image Classification | Bidisha Chakraborty et.al. | 2411.02637 | null |
2024-11-04 | TripletCLIP: Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives | Maitreya Patel et.al. | 2411.02545 | null |
2024-11-04 | A Comparative Analysis of Instruction Fine-Tuning LLMs for Financial Text Classification | Sorouralsadat Fatemi et.al. | 2411.02476 | null |
2024-11-04 | Exploiting Contextual Uncertainty of Visual Data for Efficient Training of Deep Models | Sharat Agarwal et.al. | 2411.01925 | null |
2024-11-03 | Optimizing Gastrointestinal Diagnostics: A CNN-Based Model for VCE Image Classification | Vaneeta Ahlawat et.al. | 2411.01652 | null |
2024-11-03 | ParseCaps: An Interpretable Parsing Capsule Network for Medical Image Diagnosis | Xinyu Geng et.al. | 2411.01564 | null |
2024-11-03 | Efficient Deep Learning Infrastructures for Embedded Computing Systems: A Comprehensive Survey and Future Envision | Xiangzhong Luo et.al. | 2411.01431 | null |
2024-11-02 | Combining Financial Data and News Articles for Stock Price Movement Prediction Using Large Language Models | Ali Elahi et.al. | 2411.01368 | null |
2024-11-02 | Optimizing Violence Detection in Video Classification Accuracy through 3D Convolutional Neural Networks | Aarjav Kavathia et.al. | 2411.01348 | null |
2024-11-02 | MIC: Medical Image Classification Using Chest X-ray (COVID-19 and Pneumonia) Dataset with the Help of CNN and Customized CNN | Nafiz Fahad et.al. | 2411.01163 | null |
2024-11-02 | Few-Class Arena: A Benchmark for Efficient Selection of Vision Models and Dataset Difficulty Measurement | Bryan Bo Cao et.al. | 2411.01099 | link |
2024-11-01 | Towards Robust Text Classification: Mitigating Spurious Correlations with Causal Learning | Yuqing Zhou et.al. | 2411.01045 | null |
2024-11-01 | FISHing in Uncertainty: Synthetic Contrastive Learning for Genetic Aberration Detection | Simon Gutwein et.al. | 2411.01025 | link |
2024-10-31 | Video Token Merging for Long-form Video Understanding | Seon-Ho Lee et.al. | 2410.23782 | null |
2024-10-31 | Neurobench: DCASE 2020 Acoustic Scene Classification benchmark on XyloAudio 2 | Weijie Ke et.al. | 2410.23776 | null |
2024-10-31 | QUEST-A: Untrained Filtering with Trained Focusing led to Enhanced Quantum Architectures | Lian-Hui Yu et.al. | 2410.23560 | link |
2024-11-01 | Large Language Models for Patient Comments Multi-Label Classification | Hajar Sakai et.al. | 2410.23528 | null |
2024-10-30 | Multilingual Vision-Language Pre-training for the Remote Sensing Domain | João Daniel Silva et.al. | 2410.23370 | null |
2024-10-30 | Domain-decomposed image classification algorithms using linear discriminant analysis and convolutional neural networks | Axel Klawonn et.al. | 2410.23359 | null |
2024-10-30 | CLIPErase: Efficient Unlearning of Visual-Textual Associations in CLIP | Tianyu Yang et.al. | 2410.23330 | null |
2024-10-30 | Don't Just Pay Attention, PLANT It: Transfer L2R Models to Fine-tune Attention in Extreme Multi-Label Text Classification | Debjyoti Saharoy et.al. | 2410.23066 | null |
2024-10-30 | Automated Trustworthiness Oracle Generation for Machine Learning Text Classifiers | Lam Nguyen Tung et.al. | 2410.22663 | null |
2024-10-29 | Developing Convolutional Neural Networks using a Novel Lamarckian Co-Evolutionary Algorithm | Zaniar Sharifi et.al. | 2410.22487 | null |
2024-10-29 | EfficientNet with Hybrid Attention Mechanisms for Enhanced Breast Histopathology Classification: A Comprehensive Approach | Naren Sengodan et.al. | 2410.22392 | null |
2024-10-29 | DISCERN: Decoding Systematic Errors in Natural Language for Text Classifiers | Rakesh R. Menon et.al. | 2410.22239 | null |
2024-10-29 | Class-Aware Contrastive Optimization for Imbalanced Text Classification | Grigorii Khvatskii et.al. | 2410.22197 | null |
2024-10-29 | Active Learning for Vision-Language Models | Bardia Safaei et.al. | 2410.22187 | null |
2024-10-29 | Multi-Level Feature Distillation of Joint Teachers Trained on Distinct Image Datasets | Adrian Iordache et.al. | 2410.22184 | link |
2024-10-29 | Natural Language Processing for Analyzing Electronic Health Records and Clinical Notes in Cancer Research: A Review | Muhammad Bilal et.al. | 2410.22180 | null |
2024-10-29 | FakeFormer: Efficient Vulnerability-Driven Transformers for Generalisable Deepfake Detection | Dat Nguyen et.al. | 2410.21964 | null |
2024-10-29 | Bayesian Optimization for Hyperparameters Tuning in Neural Networks | Gabriele Onorato et.al. | 2410.21886 | null |
2024-10-29 | Advancing Efficient Brain Tumor Multi-Class Classification -- New Insights from the Vision Mamba Model in Transfer Learning | Yinyi Lai et.al. | 2410.21872 | null |
2024-10-28 | Audio Classification of Low Feature Spectrograms Utilizing Convolutional Neural Networks | Noel Elias et.al. | 2410.21561 | null |
2024-10-30 | A Novel Score-CAM based Denoiser for Spectrographic Signature Extraction without Ground Truth | Noel Elias et.al. | 2410.21557 | null |
2024-10-28 | Attacking Misinformation Detection Using Adversarial Examples Generated by Language Models | Piotr Przybyła et.al. | 2410.20940 | null |
2024-10-28 | Data-Efficient Low-Complexity Acoustic Scene Classification via Distilling and Progressive Pruning | Bing Han et.al. | 2410.20775 | null |
2024-10-28 | Interpretable Image Classification with Adaptive Prototype-based Vision Transformers | Chiyu Ma et.al. | 2410.20722 | null |
2024-10-27 | Graph Neural Networks on Discriminative Graphs of Words | Yassine Abbahaddou et.al. | 2410.20469 | null |
2024-10-27 | Historical Test-time Prompt Tuning for Vision Foundation Models | Jingyi Zhang et.al. | 2410.20346 | null |
2024-10-27 | Sequential Large Language Model-Based Hyper-Parameter Optimization | Kanan Mahammadli et.al. | 2410.20302 | link |
2024-10-26 | Enhancing CNN Classification with Lamarckian Memetic Algorithms and Local Search | Akhilbaran Ghosh et.al. | 2410.20234 | null |
2024-10-26 | Annotation Efficiency: Identifying Hard Samples via Blocked Sparse Linear Bandits | Adit Jain et.al. | 2410.20041 | null |
2024-10-26 | Attacks against Abstractive Text Summarization Models through Lead Bias and Influence Functions | Poojitha Thota et.al. | 2410.20019 | null |
2024-10-26 | Vulnerability of LLMs to Vertically Aligned Text Manipulations | Zhecheng Li et.al. | 2410.20016 | null |
2024-10-25 | Learning the Regularization Strength for Deep Fine-Tuning via a Data-Emphasized Variational Objective | Ethan Harvey et.al. | 2410.19675 | null |
2024-10-24 | Noise Adaption Network for Morse Code Image Classification | Xiaxia Wang et.al. | 2410.19180 | link |
2024-10-24 | Hybrid Quantum-Classical Feature Extraction approach for Image Classification using Autoencoders and Quantum SVMs | Donovan Slabbert et.al. | 2410.18814 | null |
2024-10-24 | Spatial-Temporal Search for Spiking Neural Networks | Kaiwei Che et.al. | 2410.18580 | null |
2024-10-25 | Interpretable Bilingual Multimodal Large Language Model for Diverse Biomedical Tasks | Lehan Wang et.al. | 2410.18387 | null |
2024-10-23 | Using Cartesian slice plots of a cosmological simulation as input of a convolutional neural network | Guillermo Arreaga-Garcia et.al. | 2410.18320 | null |
2024-10-25 | Backdoor in Seconds: Unlocking Vulnerabilities in Large Pre-trained Models via Model Editing | Dongliang Guo et.al. | 2410.18267 | null |
2024-10-23 | Future Token Prediction -- Causal Language Modelling with Per-Token Semantic State Vector for Multi-Token Prediction | Nicholas Walker et.al. | 2410.18160 | null |
2024-10-23 | Deep Learning for Active Region Classification: A Systematic Study from Convolutional Neural Networks to Vision Transformers | Edoardo Legnaro et.al. | 2410.17816 | null |
2024-10-23 | New Insight in Cervical Cancer Diagnosis Using Convolution Neural Network Architecture | Ach. Khozaimi et.al. | 2410.17735 | null |
2024-10-24 | Advancing Interpretability in Text Classification through Prototype Learning | Bowen Wei et.al. | 2410.17546 | null |
2024-10-23 | Enhancing Multimodal Medical Image Classification using Cross-Graph Modal Contrastive Learning | Jun-En Ding et.al. | 2410.17494 | null |
2024-10-22 | Data Obfuscation through Latent Space Projection (LSP) for Privacy-Preserving AI Governance: Case Studies in Medical Diagnosis and Finance Fraud Detection | Mahesh Vaijainthymala Krishnamoorthy et.al. | 2410.17459 | null |
2024-10-22 | Altogether: Image Captioning via Re-aligning Alt-text | Hu Xu et.al. | 2410.17251 | null |
2024-10-22 | KANICE: Kolmogorov-Arnold Networks with Interactive Convolutional Elements | Md Meftahul Ferdaus et.al. | 2410.17172 | link |
2024-10-22 | Development of CNN Architectures using Transfer Learning Methods for Medical Image Classification | Ganga Prasad Basyal et.al. | 2410.16711 | null |
2024-10-21 | Efficient Neural Network Training via Subset Pretraining | Jan Spörer et.al. | 2410.16523 | null |
2024-10-21 | 1024m at SMM4H 2024: Tasks 3, 5 & 6 -- Ensembles of Transformers and Large Language Models for Medical Text Classification | Ram Mohan Rao Kadiyala et.al. | 2410.15998 | null |
2024-10-21 | Visual Representation Learning Guided By Multi-modal Prior Knowledge | Hongkuan Zhou et.al. | 2410.15981 | null |
2024-10-21 | AutoTrain: No-code training for state-of-the-art models | Abhishek Thakur et.al. | 2410.15735 | link |
2024-10-21 | ViMoE: An Empirical Study of Designing Vision Mixture-of-Experts | Xumeng Han et.al. | 2410.15732 | null |
2024-10-21 | P-YOLOv8: Efficient and Accurate Real-Time Detection of Distracted Driving | Mohamed R. Elshamy et.al. | 2410.15602 | null |
2024-10-20 | Open-vocabulary vs. Closed-set: Best Practice for Few-shot Object Detection Considering Text Describability | Yusuke Hosoya et.al. | 2410.15315 | link |
2024-10-19 | Spatial-Mamba: Effective Visual State Space Models via Structure-Aware State Fusion | Chaodong Xiao et.al. | 2410.15091 | link |
2024-10-19 | PAT: Parameter-Free Audio-Text Aligner to Boost Zero-Shot Audio Classification | Ashish Seth et.al. | 2410.15062 | null |
2024-10-19 | Weakly-supervised diagnosis identification from Italian discharge letters | Vittorio Torri et.al. | 2410.15051 | null |
2024-10-19 | Reflexive Guidance: Improving OoDD in Vision-Language Models via Self-Guided Image-Adaptive Concept Generation | Seulbi Lee et.al. | 2410.14975 | null |
2024-10-18 | A Hybrid Feature Fusion Deep Learning Framework for Leukemia Cancer Detection in Microscopic Blood Sample Using Gated Recurrent Unit and Uncertainty Quantification | Maksuda Akter et.al. | 2410.14536 | null |
2024-10-18 | Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation | Shuai Zhao et.al. | 2410.14425 | link |
2024-10-18 | A Novel Method to Metigate Demographic and Expert Bias in ICD Coding with Causal Inference | Bin Zhang et.al. | 2410.14236 | null |
2024-10-18 | Comparative Evaluation of Clustered Federated Learning Method | Michael Ben Ali et.al. | 2410.14212 | link |
2024-10-17 | Reproducibility study of "LICO: Explainable Models with Language-Image Consistency" | Luan Fletcher et.al. | 2410.13989 | link |
2024-10-17 | LoLDU: Low-Rank Adaptation via Lower-Diag-Upper Decomposition for Parameter-Efficient Fine-Tuning | Yiming Shi et.al. | 2410.13618 | link |
2024-10-17 | Augmentation Policy Generation for Image Classification Using Large Language Models | Ant Duru et.al. | 2410.13453 | null |
2024-10-17 | Similarity-Dissimilarity Loss with Supervised Contrastive Learning for Multi-label Classification | Guangming Huang et.al. | 2410.13439 | null |
2024-10-16 | Interpreting and Analyzing CLIP's Zero-Shot Image Classification via Mutual Knowledge | Fawaz Sammani et.al. | 2410.13016 | link |
2024-10-16 | PND-Net: Plant Nutrition Deficiency and Disease Classification using Graph Convolutional Network | Asish Bera et.al. | 2410.12742 | null |
2024-10-16 | Beyond Speech and More: Investigating the Emergent Ability of Speech Foundation Models for Classifying Physiological Time-Series Signals | Orchid Chetia Phukan et.al. | 2410.12645 | null |
2024-10-17 | From Measurement Instruments to Data: Leveraging Theory-Driven Synthetic Training Data for Classifying Social Constructs | Lukas Birkenmaier et.al. | 2410.12622 | null |
2024-10-16 | Feature Augmentation for Self-supervised Contrastive Learning: A Closer Look | Yong Zhang et.al. | 2410.12396 | null |
2024-10-15 | Clustering doc2vec output for topic-dimensionality reduction: A MITRE ATT&CK calibration | Nathan Monnet et.al. | 2410.11573 | null |
2024-10-15 | LoKO: Low-Rank Kalman Optimizer for Online Fine-Tuning of Large Models | Hossein Abdi et.al. | 2410.11551 | null |
2024-10-15 | Reducing Labeling Costs in Sentiment Analysis via Semi-Supervised Learning | Minoo Jafarlou et.al. | 2410.11355 | null |
2024-10-14 | Towards a More Complete Theory of Function Preserving Transforms | Michael Painter et.al. | 2410.11038 | null |
2024-10-14 | Enhancing JEPAs with Spatial Conditioning: Robust and Efficient Representation Learning | Etai Littwin et.al. | 2410.10773 | null |
2024-10-15 | Ensemble of ConvNeXt V2 and MaxViT for Long-Tailed CXR Classification with View-Based Aggregation | Yosuke Yamagishi et.al. | 2410.10710 | link |
2024-10-14 | Queryable Prototype Multiple Instance Learning with Vision-Language Models for Incremental Whole Slide Image Classification | Jiaxiang Gou et.al. | 2410.10573 | null |
2024-10-14 | Dynamic Power Control in a Hardware Neural Network with Error-Configurable MAC Units | Maedeh Ghaderi et.al. | 2410.10545 | null |
2024-10-14 | Improve Meta-learning for Few-Shot Text Classification with All You Can Acquire from the Tasks | Xinyue Liu et.al. | 2410.10454 | link |
2024-10-14 | GlobalMamba: Global Image Serialization for Vision Mamba | Chengkun Wang et.al. | 2410.10316 | link |
2024-10-14 | A Multi-Task Text Classification Pipeline with Natural Language Explanations: A User-Centric Evaluation in Sentiment Analysis and Offensive Language Identification in Greek Tweets | Nikolaos Mylonas et.al. | 2410.10290 | null |
2024-10-14 | big.LITTLE Vision Transformer for Efficient Visual Recognition | He Guo et.al. | 2410.10267 | null |
2024-10-14 | SkillAggregation: Reference-free LLM-Dependent Aggregation | Guangzhi Sun et.al. | 2410.10215 | null |
2024-10-14 | Will the Inclusion of Generated Data Amplify Bias Across Generations in Future Image Classification Models? | Zeliang Zhang et.al. | 2410.10160 | null |
2024-10-11 | Efficient Hyperparameter Importance Assessment for CNNs | Ruinan Wang et.al. | 2410.08920 | null |
2024-10-11 | Parameter-Efficient Fine-Tuning of Large Language Models using Semantic Knowledge Tuning | Nusrat Jahan Prottasha et.al. | 2410.08598 | null |
2024-10-11 | DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention | Nguyen Huu Bao Long et.al. | 2410.08582 | link |
2024-10-11 | Accelerated Distributed Stochastic Non-Convex Optimization over Time-Varying Directed Networks | Yiyue Chen et.al. | 2410.08508 | null |
2024-10-11 | Semantic Token Reweighting for Interpretable and Controllable Text Embeddings in CLIP | Eunji Kim et.al. | 2410.08469 | null |
2024-10-10 | Bilinear MLPs enable weight-based mechanistic interpretability | Michael T. Pearce et.al. | 2410.08417 | null |
2024-10-10 | What is Left After Distillation? How Knowledge Transfer Impacts Fairness and Bias | Aida Mohammadshahi et.al. | 2410.08407 | null |
2024-10-10 | Time Traveling to Defend Against Adversarial Example Attacks in Image Classification | Anthony Etim et.al. | 2410.08338 | null |
2024-10-10 | More Experts Than Galaxies: Conditionally-overlapping Experts With Biologically-Inspired Fixed Routing | Sagi Shaier et.al. | 2410.08003 | null |
2024-10-10 | When the Small-Loss Trick is Not Enough: Multi-Label Image Classification with Noisy Labels Applied to CCTV Sewer Inspections | Keryan Chelouche et.al. | 2410.07689 | null |
2024-10-10 | Invisibility Cloak: Disappearance under Human Pose Estimation via Backdoor Attacks | Minxing Zhang et.al. | 2410.07670 | null |
2024-10-10 | StablePrompt: Automatic Prompt Tuning using Reinforcement Learning for Large Language Models | Minchan Kwon et.al. | 2410.07652 | null |
2024-10-10 | Explainability of Deep Neural Networks for Brain Tumor Detection | S. Park et.al. | 2410.07613 | link |
2024-10-10 | CSA: Data-efficient Mapping of Unimodal Features to Multimodal Features | Po-han Li et.al. | 2410.07610 | null |
2024-10-09 | One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation | Fabian Paischer et.al. | 2410.07170 | link |
2024-10-09 | JPEG Inspired Deep Learning | Ahmed H. Salamah et.al. | 2410.07081 | link |
2024-10-09 | Optimizing Estimators of Squared Calibration Errors in Classification | Sebastian G. Gruber et.al. | 2410.07014 | null |
2024-10-09 | Spectral and Rhythm Features for Audio Classification with Deep Convolutional Neural Networks | Friedrich Wolf-Monheim et.al. | 2410.06927 | null |
2024-10-09 | QuadMamba: Learning Quadtree-based Selective Scan for Visual State Space Model | Fei Xie et.al. | 2410.06806 | null |
2024-10-09 | Convex Distillation: Efficient Compression of Deep Networks via Convex Optimization | Prateek Varshney et.al. | 2410.06567 | null |
2024-10-08 | A Comparative Study of Hybrid Models in Health Misinformation Text Classification | Mkululi Sikosana et.al. | 2410.06311 | null |
2024-10-08 | Conformal Structured Prediction | Botong Zhang et.al. | 2410.06296 | link |
2024-10-08 | TEOChat: A Large Vision-Language Assistant for Temporal Earth Observation Data | Jeremy Andrew Irvin et.al. | 2410.06234 | null |
2024-10-08 | Manual Verbalizer Enrichment for Few-Shot Text Classification | Quang Anh Nguyen et.al. | 2410.06173 | null |
2024-10-07 | LoTLIP: Improving Language-Image Pre-training for Long Text Understanding | Wei Wu et.al. | 2410.05249 | null |
2024-10-07 | Variable Resolution Pixel Quantization for Low Power Machine Vision Application on Edge | Senorita Deb et.al. | 2410.05189 | null |
2024-10-07 | IGroupSS-Mamba: Interval Group Spatial-Spectral Mamba for Hyperspectral Image Classification | Yan He et.al. | 2410.05100 | null |
2024-10-07 | Explanation sensitivity to the randomness of large language models: the case of journalistic text classification | Jeremie Bogaert et.al. | 2410.05085 | null |
2024-10-07 | Control-oriented Clustering of Visual Latent Representation | Han Qi et.al. | 2410.05063 | null |
2024-10-07 | SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Classification | Benjamin Feuer et.al. | 2410.05057 | link |
2024-10-07 | Art Forgery Detection using Kolmogorov Arnold and Convolutional Neural Networks | Sandro Boccuzzo et.al. | 2410.04866 | null |
2024-10-06 | MECFormer: Multi-task Whole Slide Image Classification with Expert Consultation Network | Doanh C. Bui et.al. | 2410.04507 | null |
2024-10-06 | Interpret Your Decision: Logical Reasoning Regularization for Generalization in Visual Classification | Zhaorui Tan et.al. | 2410.04492 | link |
2024-10-05 | IT | Nikita Durasov et.al. | 2410.04201 | null |
2024-10-04 | Classification-Denoising Networks | Louis Thiry et.al. | 2410.03505 | null |
2024-10-04 | A Multimodal Framework for Deepfake Detection | Kashish Gandhi et.al. | 2410.03487 | null |
2024-10-04 | On Uncertainty In Natural Language Processing | Dennis Ulmer et.al. | 2410.03446 | link |
2024-10-04 | Comparing zero-shot self-explanations with human rationales in multilingual text classification | Stephanie Brandl et.al. | 2410.03296 | null |
2024-10-04 | Sm: enhanced localization in Multiple Instance Learning for medical imaging classification | Francisco M. Castro-Macías et.al. | 2410.03276 | null |
2024-10-04 | Selective Transformer for Hyperspectral Image Classification | Yichu Xu et.al. | 2410.03171 | null |
2024-10-03 | CPFD: Confidence-aware Privileged Feature Distillation for Short Video Classification | Jinghao Shi et.al. | 2410.03038 | null |
2024-10-03 | On Expert Estimation in Hierarchical Mixture of Experts: Beyond Softmax Gating Functions | Huy Nguyen et.al. | 2410.02935 | null |
2024-10-03 | Lie Algebra Canonicalization: Equivariant Neural Operators under arbitrary Lie Groups | Zakhar Shumaylov et.al. | 2410.02698 | null |
2024-10-03 | LoGra-Med: Long Context Multi-Graph Alignment for Medical Vision-Language Model | Duy M. H. Nguyen et.al. | 2410.02615 | null |
2024-10-03 | Personalized Quantum Federated Learning for Privacy Image Classification | Jinjing Shi et.al. | 2410.02547 | null |
2024-10-03 | BiSSL: Bilevel Optimization for Self-Supervised Pre-Training and Fine-Tuning | Gustav Wagner Zakarias et.al. | 2410.02387 | null |
2024-10-03 | CTARR: A fast and robust method for identifying anatomical regions on CT images via atlas registration | Thomas Buddenkotte et.al. | 2410.02316 | link |
2024-10-03 | Hard Negative Sample Mining for Whole Slide Image Classification | Wentao Huang et.al. | 2410.02212 | link |
2024-10-02 | Kolmogorov-Arnold Network Autoencoders | Mohammadamin Moradi et.al. | 2410.02077 | link |
2024-10-02 | Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data | Sreyan Ghosh et.al. | 2410.02056 | null |
2024-10-02 | FLAG: Financial Long Document Classification via AMR-based GNN | Bolun et.al. | 2410.02024 | link |
2024-10-02 | MONICA: Benchmarking on Long-tailed Medical Image Classification | Lie Ju et.al. | 2410.02010 | null |
2024-10-02 | Revisiting Hierarchical Text Classification: Inference and Metrics | Roman Plaud et.al. | 2410.01305 | link |
2024-10-02 | Automatic deductive coding in discourse analysis: an application of large language models in learning analytics | Lishan Zhang et.al. | 2410.01240 | null |
2024-10-01 | Deep Nets with Subsampling Layers Unwittingly Discard Useful Activations at Test-Time | Chiao-An Yang et.al. | 2410.01083 | link |
2024-10-01 | Local-to-Global Self-Supervised Representation Learning for Diabetic Retinopathy Grading | Mostafa Hajighasemloua et.al. | 2410.00779 | null |
2024-10-01 | NECOMIMI: Neural-Cognitive Multimodal EEG-informed Image Generation with Diffusion Models | Chi-Sheng Chen et.al. | 2410.00712 | null |
2024-10-01 | TikGuard: A Deep Learning Transformer-Based Solution for Detecting Unsuitable TikTok Content for Kids | Mazen Balat et.al. | 2410.00403 | null |
2024-09-30 | KPCA-CAM: Visual Explainability of Deep Computer Vision Models using Kernel PCA | Sachin Karmani et.al. | 2410.00267 | null |
2024-09-30 | A Methodology for Explainable Large Language Models with Integrated Gradients and Linguistic Analysis in Text Classification | Marina Ribeiro et.al. | 2410.00250 | null |
2024-09-30 | Evaluating the performance of state-of-the-art esg domain-specific pre-trained large language models in text classification against existing models and traditional machine learning techniques | Tin Yuet Chung et.al. | 2410.00207 | null |
2024-10-02 | Evaluating the fairness of task-adaptive pretraining on unlabeled test data before few-shot text classification | Kush Dubey et.al. | 2410.00179 | link |
2024-09-30 | POMONAG: Pareto-Optimal Many-Objective Neural Architecture Generator | Eugenio Lomurno et.al. | 2409.20447 | null |
2024-09-30 | Satellite image classification with neural quantum kernels | Pablo Rodriguez-Grasa et.al. | 2409.20356 | null |
2024-09-30 | All-optical autoencoder machine learning framework using diffractive processors | Peijie Feng et.al. | 2409.20346 | null |
2024-09-30 | Fine-Tuning Personalization in Federated Learning to Mitigate Adversarial Clients | Youssef Allouah et.al. | 2409.20329 | null |
2024-09-30 | Classroom-Inspired Multi-Mentor Distillation with Adaptive Learning Strategies | Shalini Sarode et.al. | 2409.20237 | null |
2024-09-30 | Classification of Radiological Text in Small and Imbalanced Datasets in a Non-English Language | Vincent Beliveau et.al. | 2409.20147 | null |
2024-09-30 | SATA: Spatial Autocorrelation Token Analysis for Enhancing the Robustness of Vision Transformers | Nick Nikzad et.al. | 2409.19850 | null |
2024-09-29 | Adversarial Examples for DNA Classification | Hyunwoo Yoo et.al. | 2409.19788 | null |
2024-09-29 | FAST: A Dual-tier Few-Shot Learning Paradigm for Whole Slide Image Classification | Kexue Fu et.al. | 2409.19720 | null |
2024-09-29 | Vision-Language Models are Strong Noisy Label Detectors | Tong Wei et.al. | 2409.19696 | link |
2024-09-27 | Unconditional stability of a recurrent neural circuit implementing divisive normalization | Shivang Rawat et.al. | 2409.18946 | null |
2024-09-27 | Subspace Preserving Quantum Convolutional Neural Network Architectures | Léo Monbroussou et.al. | 2409.18918 | null |
2024-09-27 | Med-IC: Fusing a Single Layer Involution with Convolutions for Enhanced Medical Image Classification and Segmentation | Md. Farhadul Islam et.al. | 2409.18506 | null |
2024-09-26 | Towards the Mitigation of Confirmation Bias in Semi-supervised Learning: a Debiased Training Perspective | Yu Wang et.al. | 2409.18316 | null |
2024-09-26 | Realistic Evaluation of Model Merging for Compositional Generalization | Derek Tam et.al. | 2409.18314 | null |
2024-09-26 | DARE: Diverse Visual Question Answering with Robustness Evaluation | Hannah Sterz et.al. | 2409.18023 | null |
2024-09-26 | The Lou Dataset -- Exploring the Impact of Gender-Fair Language in German Text Classification | Andreas Waldis et.al. | 2409.17929 | null |
2024-09-26 | Cascade Prompt Learning for Vision-Language Model Adaptation | Ge Wu et.al. | 2409.17805 | null |
2024-09-26 | Byzantine-Robust Aggregation for Securing Decentralized Federated Learning | Diego Cajaraville-Aboy et.al. | 2409.17754 | null |
2024-09-26 | Let the Quantum Creep In: Designing Quantum Neural Network Models by Gradually Swapping Out Classical Components | Peiyong Wang et.al. | 2409.17583 | link |
2024-09-26 | Leveraging Annotator Disagreement for Text Classification | Jin Xu et.al. | 2409.17577 | null |
2024-09-26 | Uni-Med: A Unified Medical Generalist Foundation Model For Multi-Task Learning Via Connector-MoE | Xun Zhu et.al. | 2409.17508 | null |
2024-09-26 | Reducing and Exploiting Data Augmentation Noise through Meta Reweighting Contrastive Learning for Text Classification | Guanyi Mou et.al. | 2409.17474 | null |
2024-09-26 | Navigating the Shortcut Maze: A Comprehensive Analysis of Shortcut Learning in Text Classification by Language Models | Yuqing Zhou et.al. | 2409.17455 | null |
2024-09-25 | Block Expanded DINORET: Adapting Natural Domain Foundation Models for Retinal Imaging Without Catastrophic Forgetting | Jay Zoellin et.al. | 2409.17332 | null |
2024-09-25 | BitQ: Tailoring Block Floating Point Precision for Improved DNN Efficiency on Resource-Constrained Devices | Yongqi Xu et.al. | 2409.17093 | link |
2024-09-25 | Accumulator-Aware Post-Training Quantization | Ian Colbert et.al. | 2409.17092 | null |
2024-09-26 | HVT: A Comprehensive Vision Framework for Learning in Non-Euclidean Space | Jacob Fein-Ashley et.al. | 2409.16897 | link |
2024-09-25 | Shifting from endangerment to rebirth in the Artificial Intelligence Age: An Ensemble Machine Learning Approach for Hawrami Text Classification | Aram Khaksar et.al. | 2409.16884 | null |
2024-09-25 | Explicitly Modeling Pre-Cortical Vision with a Neuro-Inspired Front-End Improves CNN Robustness | Lucas Piper et.al. | 2409.16838 | link |
2024-09-24 | Unleashing the Potential of Synthetic Images: A Study on Histopathology Image Classification | Leire Benito-Del-Valle et.al. | 2409.16002 | link |
2024-09-24 | An ensemble framework approach of hybrid Quantum convolutional neural networks for classification of breast cancer images | Dibyasree Guha et.al. | 2409.15958 | null |
2024-09-24 | iGAiVA: Integrated Generative AI and Visual Analytics in a Machine Learning Workflow for Text Classification | Yuanzhe Jin et.al. | 2409.15848 | link |
2024-09-23 | Optimizing News Text Classification with Bi-LSTM and Attention Mechanism for Efficient Data Processing | Bingyao Liu et.al. | 2409.15576 | null |
2024-09-23 | Critic Loss for Image Classification | Brendan Hogan Rappazzo et.al. | 2409.15565 | null |
2024-09-23 | VLMine: Long-Tail Data Mining with Vision Language Models | Mao Ye et.al. | 2409.15486 | null |
2024-09-23 | HydroVision: LiDAR-Guided Hydrometric Prediction with Vision Transformers and Hybrid Graph Learning | Naghmeh Shafiee Roudbari et.al. | 2409.15213 | null |
2024-09-23 | Benchmarking Edge AI Platforms for High-Performance ML Inference | Rakshith Jayanth et.al. | 2409.14803 | null |
2024-09-23 | Less yet robust: crucial region selection for scene recognition | Jianqi Zhang et.al. | 2409.14741 | null |
2024-09-22 | Low-Light Enhancement Effect on Classification and Detection: An Empirical Study | Xu Wu et.al. | 2409.14461 | null |
2024-09-18 | Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes | Nikita Kiselev et.al. | 2409.11995 | link |
2024-09-18 | Data Efficient Acoustic Scene Classification using Teacher-Informed Confusing Class Instruction | Jin Jie Sean Yeo et.al. | 2409.11964 | null |
2024-09-18 | Agglomerative Token Clustering | Joakim Bruslund Haurum et.al. | 2409.11923 | null |
2024-09-18 | Distillation-free Scaling of Large SSMs for Images and Videos | Hamid Suleman et.al. | 2409.11867 | null |
2024-09-18 | Community Shaping in the Digital Age: A Temporal Fusion Framework for Analyzing Discourse Fragmentation in Online Social Networks | Amirhossein Dezhboro et.al. | 2409.11665 | null |
2024-09-18 | Few-Shot Learning Approach on Tuberculosis Classification Based on Chest X-Ray Images | A. A. G. Yogi Pramana et.al. | 2409.11644 | null |
2024-09-18 | Hyperspectral Image Classification Based on Faster Residual Multi-branch Spiking Neural Network | Yang Liu et.al. | 2409.11619 | null |
2024-09-17 | Multi-Cohort Framework with Cohort-Aware Attention and Adversarial Mutual-Information Minimization for Whole Slide Image Classification | Sharon Peled et.al. | 2409.11119 | null |
2024-09-17 | Anti-ESIA: Analyzing and Mitigating Impacts of Electromagnetic Signal Injection Attacks | Denglin Kang et.al. | 2409.10922 | null |
2024-09-16 | Are Deep Learning Models Robust to Partial Object Occlusion in Visual Recognition Tasks? | Kaleb Kassaw et.al. | 2409.10775 | null |
2024-09-16 | Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning | Amin Karimi Monsefi et.al. | 2409.10362 | null |
2024-09-16 | InfoDisent: Explainability of Image Classification Models by Information Disentanglement | Łukasz Struski et.al. | 2409.10329 | null |
2024-09-16 | Enhancing Image Classification in Small and Unbalanced Datasets through Synthetic Data Augmentation | Neil De La Fuente et.al. | 2409.10286 | null |
2024-09-15 | Finetuning CLIP to Reason about Pairwise Differences | Dylan Sam et.al. | 2409.09721 | null |
2024-09-15 | Compositional Audio Representation Learning | Sripathi Sridhar et.al. | 2409.09619 | null |
2024-09-14 | One missing piece in Vision and Language: A Survey on Comics Understanding | Emanuele Vivoli et.al. | 2409.09502 | link |
2024-09-14 | Real-world Adversarial Defense against Patch Attacks based on Diffusion Model | Xingxing Wei et.al. | 2409.09406 | null |
2024-09-14 | Turbo your multi-modal classification with contrastive learning | Zhiyu Zhang et.al. | 2409.09282 | null |
2024-09-14 | Leveraging Foundation Models for Efficient Federated Learning in Resource-restricted Edge Networks | S. Kawa Atapour et.al. | 2409.09273 | null |
2024-09-13 | ReCLAP: Improving Zero Shot Audio Classification by Describing Sounds | Sreyan Ghosh et.al. | 2409.09213 | link |
2024-09-13 | Pushing the boundaries of event subsampling in event-based video classification using CNNs | Hesam Araghi et.al. | 2409.08953 | link |
2024-09-13 | Pushing Joint Image Denoising and Classification to the Edge | Thomas C Markhorst et.al. | 2409.08943 | null |
2024-09-13 | Byzantine-Robust and Communication-Efficient Distributed Learning via Compressed Momentum Filtering | Changxin Liu et.al. | 2409.08640 | null |
2024-09-13 | Anytime Continual Learning for Open Vocabulary Classification | Zhen Zhu et.al. | 2409.08518 | link |
2024-09-12 | Enhancing Few-Shot Image Classification through Learnable Multi-Scale Embedding and Attention Mechanisms | Fatemeh Askari et.al. | 2409.07989 | link |
2024-09-12 | Microscopic-Mamba: Revealing the Secrets of Microscopic Images with Just 4M Parameters | Shun Zou et.al. | 2409.07896 | link |
2024-09-12 | Classifying Images with CoLaNET Spiking Neural Network -- the MNIST Example | Mikhail Kiselev et.al. | 2409.07833 | null |
2024-09-12 | Efficient Privacy-Preserving KAN Inference Using Homomorphic Encryption | Zhizheng Lai et.al. | 2409.07751 | null |
2024-09-12 | DFDG: Data-Free Dual-Generator Adversarial Distillation for One-Shot Federated Learning | Kangyang Luo et.al. | 2409.07734 | null |
2024-09-12 | Cooperative Inference with Interleaved Operator Partitioning for CNNs | Zhibang Liu et.al. | 2409.07693 | null |
2024-09-11 | Token Turing Machines are Efficient Vision Models | Purvish Jajal et.al. | 2409.07613 | null |
2024-09-11 | Minimizing Embedding Distortion for Robust Out-of-Distribution Performance | Tom Shaked et.al. | 2409.07582 | null |
2024-09-11 | A Contrastive Symmetric Forward-Forward Algorithm (SFFA) for Continual Learning Tasks | Erik B. Terres-Escudero et.al. | 2409.07387 | null |
2024-09-11 | Optimizing Neural Network Performance and Interpretability with Diophantine Equation Encoding | Ronald Katende et.al. | 2409.07310 | null |
2024-09-11 | LLM-based feature generation from text for interpretable machine learning | Vojtěch Balek et.al. | 2409.07132 | null |
2024-09-11 | Privacy-Preserving Federated Learning with Consistency via Knowledge Distillation Using Conditional Generator | Kangyang Luo et.al. | 2409.06955 | null |
2024-09-10 | Dynamic Decoupling of Placid Terminal Attractor-based Gradient Descent Algorithm | Jinwei Zhao et.al. | 2409.06542 | null |
2024-09-10 | Seam Carving as Feature Pooling in CNN | Mohammad Imrul Jubair et.al. | 2409.06311 | null |
2024-09-10 | EntAugment: Entropy-Driven Adaptive Data Augmentation Framework for Image Classification | Suorong Yang et.al. | 2409.06290 | link |
2024-09-09 | A Small Claims Court for the NLP: Judging Legal Text Classification Strategies With Small Datasets | Mariana Yukari Noguti et.al. | 2409.05972 | null |
2024-09-09 | SVFit: Parameter-Efficient Fine-Tuning of Large Pre-Trained Models Using Singular Values | Chengwei Sun et.al. | 2409.05926 | null |
2024-09-09 | Adversarial Attacks on Data Attribution | Xinhe Wang et.al. | 2409.05657 | null |
2024-09-09 | Look One and More: Distilling Hybrid Order Relational Knowledge for Cross-Resolution Image Recognition | Shiming Ge et.al. | 2409.05384 | null |
2024-09-09 | RexUniNLU: Recursive Method with Explicit Schema Instructor for Universal NLU | Chengyuan Liu et.al. | 2409.05275 | null |
2024-09-09 | Scalable Frame Sampling for Video Classification: A Semi-Optimal Policy Approach with Reduced Search Space | Junho Lee et.al. | 2409.05260 | null |
2024-09-08 | PatchAlign:Fair and Accurate Skin Disease Image Classification by Alignment with Clinical Labels | Aayushman et.al. | 2409.04975 | link |
2024-09-07 | Activation Function Optimization Scheme for Image Classification | Abdur Rahman et.al. | 2409.04915 | null |
2024-09-07 | LoCa: Logit Calibration for Knowledge Distillation | Runming Yang et.al. | 2409.04778 | null |
2024-09-07 | Swin Transformer for Robust Differentiation of Real and Synthetic Images: Intra- and Inter-Dataset Analysis | Preetu Mehta et.al. | 2409.04734 | null |
2024-09-06 | Connectivity-Inspired Network for Context-Aware Recognition | Gianluca Carloni et.al. | 2409.04360 | null |
2024-09-06 | An optically accelerated extreme learning machine using hot atomic vapors | Pierre Azam et.al. | 2409.04312 | null |
2024-09-06 | PlantSeg: A Large-Scale In-the-wild Dataset for Plant Disease Segmentation | Tianqi Wei et.al. | 2409.04038 | null |
2024-09-05 | Deep Clustering of Remote Sensing Scenes through Heterogeneous Transfer Learning | Isaac Ray et.al. | 2409.03938 | null |
2024-09-05 | WaterMAS: Sharpness-Aware Maximization for Neural Network Watermarking | Carl De Sousa Trias et.al. | 2409.03902 | null |
2024-09-05 | On-board Satellite Image Classification for Earth Observation: A Comparative Study of Pre-Trained Vision Transformer Models | Thanh-Dung Le et.al. | 2409.03901 | null |
2024-09-05 | Have Large Vision-Language Models Mastered Art History? | Ombretta Strafforello et.al. | 2409.03521 | null |
2024-09-05 | Non-Uniform Illumination Attack for Fooling Convolutional Neural Networks | Akshay Jain et.al. | 2409.03458 | link |
2024-09-05 | Training-free Conversion of Pretrained ANNs to SNNs for Low-Power and High-Performance Applications | Tong Bu et.al. | 2409.03368 | null |
2024-09-05 | PEPL: Precision-Enhanced Pseudo-Labeling for Fine-Grained Image Classification in Semi-Supervised Learning | Bowen Tian et.al. | 2409.03192 | null |
2024-09-05 | The AdEMAMix Optimizer: Better, Faster, Older | Matteo Pagliardini et.al. | 2409.03137 | null |
2024-09-04 | iConFormer: Dynamic Parameter-Efficient Tuning with Input-Conditioned Adaptation | Hayeon Jo et.al. | 2409.02838 | null |
2024-09-03 | MedUnA: Language guided Unsupervised Adaptation of Vision-Language Models for Medical Image Classification | Umaima Rahman et.al. | 2409.02729 | null |
2024-09-05 | OpenFact at CheckThat! 2024: Combining Multiple Attack Methods for Effective Adversarial Text Generation | Włodzimierz Lewoniewski et.al. | 2409.02649 | null |
2024-09-04 | Boosting Generalizability towards Zero-Shot Cross-Dataset Single-Image Indoor Depth by Meta-Initialization | Cho-Ying Wu et.al. | 2409.02486 | null |
2024-09-03 | Evaluation and Comparison of Visual Language Models for Transportation Engineering Problems | Sanjita Prajapati et.al. | 2409.02278 | null |
2024-09-05 | Robust Clustering on High-Dimensional Data with Stochastic Quantization | Anton Kozyriev et.al. | 2409.02066 | link |
2024-09-03 | Compressed learning based onboard semantic compression for remote sensing platforms | Protim Bhattacharjee et.al. | 2409.01988 | null |
2024-09-03 | State-of-the-art Advances of Deep-learning Linguistic Steganalysis Research | Yihao Wang et.al. | 2409.01780 | null |
2024-09-03 | Enhancing Fine-Grained Visual Recognition in the Low-Data Regime Through Feature Magnitude Regularization | Avraham Chapman et.al. | 2409.01672 | null |
2024-09-03 | ReSpike: Residual Frames-based Hybrid Spiking Neural Networks for Efficient Action Recognition | Shiting Xiao et.al. | 2409.01564 | null |
2024-08-30 | Assessing Generative Language Models in Classification Tasks: Performance and Self-Evaluation Capabilities in the Environmental and Climate Change Domain | Francesca Grasso et.al. | 2408.17362 | link |
2024-08-30 | Covariance-corrected Whitening Alleviates Network Degeneration on Imbalanced Classification | Zhiwei Zhang et.al. | 2408.17197 | null |
2024-08-30 | Improving Extraction of Clinical Event Contextual Properties from Electronic Health Records: A Comparative Study | Shubham Agarwal et.al. | 2408.17181 | null |
2024-09-02 | Instant Adversarial Purification with Adversarial Consistency Distillation | Chun Tong Lei et.al. | 2408.17064 | null |
2024-08-30 | Generative Modeling Perspective for Control and Reasoning in Robotics | Takuma Yoneda et.al. | 2408.17041 | null |
2024-08-29 | Tex-ViT: A Generalizable, Robust, Texture-based dual-branch cross-attention deepfake detector | Deepak Dagar et.al. | 2408.16892 | null |
2024-08-29 | SODAWideNet++: Combining Attention and Convolutions for Salient Object Detection | Rohit Venkata Sai Dulam et.al. | 2408.16645 | null |
2024-08-29 | Android Malware Detection Based on RGB Images and Multi-feature Fusion | Zhiqiang Wang et.al. | 2408.16555 | null |
2024-08-29 | SAU: A Dual-Branch Network to Enhance Long-Tailed Recognition via Generative Models | Guangxi Li et.al. | 2408.16273 | link |
2024-08-29 | Improving Diffusion-based Data Augmentation with Inversion Spherical Interpolation | Yanghao Wang et.al. | 2408.16266 | null |
2024-08-29 | Low Saturation Confidence Distribution-based Test-Time Adaptation for Cross-Domain Remote Sensing Image Classification | Yu Liang et.al. | 2408.16265 | null |
2024-08-28 | EMP: Enhance Memory in Data Pruning | Jinying Xiao et.al. | 2408.16031 | null |
2024-08-28 | Local Descriptors Weighted Adaptive Threshold Filtering For Few-Shot Learning | Bingchen Yan et.al. | 2408.15924 | null |
2024-08-28 | ModalityMirror: Improving Audio Classification in Modality Heterogeneity Federated Learning with Multimodal Distillation | Tiantian Feng et.al. | 2408.15803 | null |
2024-08-28 | Visual Prompt Engineering for Medical Vision Language Models in Radiology | Stefan Denner et.al. | 2408.15802 | null |
2024-08-28 | Harnessing the Intrinsic Knowledge of Pretrained Language Models for Challenging Text Classification Settings | Lingyu Gao et.al. | 2408.15650 | null |
2024-08-27 | DCT-CryptoNets: Scaling Private Inference in the Frequency Domain | Arjun Roy et.al. | 2408.15231 | null |
2024-08-27 | A Review of Transformer-Based Models for Computer Vision Tasks: Capturing Global Context and Spatial Relationships | Gracile Astlin Pereira et.al. | 2408.15178 | null |
2024-08-28 | AnomalousPatchCore: Exploring the Use of Anomalous Samples in Industrial Anomaly Detection | Mykhailo Koshil et.al. | 2408.15113 | null |
2024-08-27 | Data downlink prioritization using image classification on-board a 6U CubeSat | Keenan A. A. Chatar et.al. | 2408.14865 | null |
2024-08-27 | Leveraging Self-supervised Audio Representations for Data-Efficient Acoustic Scene Classification | Yiqiang Cai et.al. | 2408.14862 | null |
2024-08-27 | Text-guided Foundation Model Adaptation for Long-Tailed Medical Image Classification | Sirui Li et.al. | 2408.14770 | null |
2024-08-26 | On-Chip Learning with Memristor-Based Neural Networks: Assessing Accuracy and Efficiency Under Device Variations, Conductance Errors, and Input Noise | M. Reza Eslami et.al. | 2408.14680 | null |
2024-08-26 | Attend-Fusion: Efficient Audio-Visual Fusion for Video Classification | Mahrukh Awan et.al. | 2408.14441 | null |
2024-08-26 | Uncertainties of Latent Representations in Computer Vision | Michael Kirchhof et.al. | 2408.14281 | null |
2024-08-26 | MSFMamba: Multi-Scale Feature Fusion State Space Model for Multi-Source Remote Sensing Image Classification | Feng Gao et.al. | 2408.14255 | null |
2024-08-26 | Feature Aligning Few shot Learning Method Using Local Descriptors Weighted Rules | Bingchen Yan et.al. | 2408.14192 | null |
2024-08-26 | GenFormer -- Generated Images are All You Need to Improve Robustness of Transformers on Small Datasets | Sven Oehri et.al. | 2408.14131 | null |
2024-08-25 | Few-Shot Histopathology Image Classification: Evaluating State-of-the-Art Methods and Unveiling Performance Insights | Ardhendu Sekhar et.al. | 2408.13816 | null |
2024-08-25 | On the Robustness of Kolmogorov-Arnold Networks: An Adversarial Perspective | Tal Alter et.al. | 2408.13809 | null |
2024-08-25 | Enhancing Adaptive Deep Networks for Image Classification via Uncertainty-aware Decision Fusion | Xu Zhang et.al. | 2408.13744 | link |
2024-08-25 | 3D-RCNet: Learning from Transformer to Build a 3D Relational ConvNet for Hyperspectral Image Classification | Haizhao Jing et.al. | 2408.13728 | null |
2024-08-24 | Enhanced Astronomical Source Classification with Integration of Attention Mechanisms and Vision Transformers | Srinadh Reddy Bhavanam et.al. | 2408.13634 | null |
2024-08-23 | Domain-specific long text classification from sparse relevant information | Célia D'Cruz et.al. | 2408.13253 | null |
2024-08-23 | EAViT: External Attention Vision Transformer for Audio Classification | Aquib Iqbal et.al. | 2408.13201 | null |
2024-08-23 | A gradient system based on anisotropic monochrome image processing with orientation auto-adjustment | Harbir Antil et.al. | 2408.12847 | null |
2024-08-23 | Underwater SONAR Image Classification and Analysis using LIME-based Explainable Artificial Intelligence | Purushothaman Natarajan et.al. | 2408.12837 | null |
2024-08-23 | VALE: A Multimodal Visual and Language Explanation Framework for Image Classifiers using eXplainable AI and Language Models | Purushothaman Natarajan et.al. | 2408.12808 | null |
2024-08-23 | BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks on Large Language Models | Yige Li et.al. | 2408.12798 | null |
2024-08-23 | Semi-Supervised Variational Adversarial Active Learning via Learning to Rank and Agreement-Based Pseudo Labeling | Zongyao Lyu et.al. | 2408.12774 | null |
2024-08-23 | Symmetric masking strategy enhances the performance of Masked Image Modeling | Khanh-Binh Nguyen et.al. | 2408.12772 | null |
2024-08-22 | ssProp: Energy-Efficient Training for Convolutional Neural Networks with Scheduled Sparse Back Propagation | Lujia Zhong et.al. | 2408.12561 | link |
2024-08-22 | The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design | Artem Snegirev et.al. | 2408.12503 | null |
2024-08-22 | Enhanced Infield Agriculture with Interpretable Machine Learning Approaches for Crop Classification | Sudi Murindanyi et.al. | 2408.12426 | null |
2024-08-22 | AT-SNN: Adaptive Tokens for Vision Transformer on Spiking Neural Network | Donghwa Kang et.al. | 2408.12293 | null |
2024-08-22 | Whole Slide Image Classification of Salivary Gland Tumours | John Charlton et.al. | 2408.12275 | null |
2024-08-22 | Query-Efficient Video Adversarial Attack with Stylized Logo | Duoxun Tang et.al. | 2408.12099 | null |
2024-08-21 | Approaching Deep Learning through the Spectral Dynamics of Weights | David Yunis et.al. | 2408.11804 | link |
2024-08-21 | SBDet: A Symmetry-Breaking Object Detector via Relaxed Rotation-Equivariance | Zhiqiang Wu et.al. | 2408.11760 | null |
2024-08-21 | Improving Calibration by Relating Focal Loss, Temperature Scaling, and Properness | Viacheslav Komisarenko et.al. | 2408.11598 | link |
2024-08-21 | MSCPT: Few-shot Whole Slide Image Classification with Multi-scale and Context-focused Prompt Tuning | Minghao Han et.al. | 2408.11505 | null |
2024-08-21 | Enabling Small Models for Zero-Shot Classification through Model Label Learning | Jia Zhang et.al. | 2408.11449 | null |
2024-08-21 | Automatic Dataset Construction (ADC): Sample Collection, Data Curation, and Beyond | Minghao Liu et.al. | 2408.11338 | null |
2024-08-21 | Towards Evaluating Large Language Models on Sarcasm Understanding | Yazhou Zhang et.al. | 2408.11319 | null |
2024-08-20 | Privacy-preserving Universal Adversarial Defense for Black-box Models | Qiao Li et.al. | 2408.10647 | null |
2024-08-20 | A Tutorial on Explainable Image Classification for Dementia Stages Using Convolutional Neural Network and Gradient-weighted Class Activation Mapping | Kevin Kam Fung Yuen et.al. | 2408.10572 | null |
2024-08-20 | NoMatterXAI: Generating "No Matter What" Alterfactual Examples for Explaining Black-Box Text Classification Models | Tuc Nguyen et.al. | 2408.10528 | null |
2024-08-20 | Cervical Cancer Detection Using Multi-Branch Deep Learning Model | Tatsuhiro Baba et.al. | 2408.10498 | null |
2024-08-19 | HaSPeR: An Image Repository for Hand Shadow Puppet Recognition | Syed Rifat Raiyan et.al. | 2408.10360 | link |
2024-08-19 | Leveraging Superfluous Information in Contrastive Representation Learning | Xuechu Yu et.al. | 2408.10292 | null |
2024-08-19 | SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models | Anke Tang et.al. | 2408.10174 | link |
2024-08-19 | Towards Robust Federated Image Classification: An Empirical Study of Weight Selection Strategies in Manufacturing | Vinit Hegiste et.al. | 2408.10024 | null |
2024-08-19 | Detecting Adversarial Attacks in Semantic Segmentation via Uncertainty Estimation: A Deep Analysis | Kira Maag et.al. | 2408.10021 | null |
2024-08-19 | Active Learning for Identifying Disaster-Related Tweets: A Comparison with Keyword Filtering and Generic Fine-Tuning | David Hanny et.al. | 2408.09914 | null |
2024-08-19 | Ranking Generated Answers: On the Agreement of Retrieval Models with Humans on Consumer Health Questions | Sebastian Heineking et.al. | 2408.09831 | null |
2024-08-19 | AutoML-guided Fusion of Entity and LLM-based representations | Boshko Koloski et.al. | 2408.09794 | null |
2024-08-19 | Dataset Distillation for Histopathology Image Classification | Cong Cong et.al. | 2408.09709 | null |
2024-08-19 | A Strategy to Combine 1stGen Transformers and Open LLMs for Automatic Text Classification | Claudio M. V. de Andrade et.al. | 2408.09629 | null |
2024-08-18 | Attention Is Not What You Need: Revisiting Multi-Instance Learning for Whole Slide Image Classification | Xin Liu et.al. | 2408.09449 | null |
2024-08-17 | Narrowing the Focus: Learned Optimizers for Pretrained Models | Gus Kristiansen et.al. | 2408.09310 | null |
2024-08-16 | DPA: Dual Prototypes Alignment for Unsupervised Adaptation of Vision-Language Models | Eman Ali et.al. | 2408.08855 | null |
2024-08-16 | LEVIS: Large Exact Verifiable Input Spaces for Neural Networks | Mohamad Fares El Hajj Chehade et.al. | 2408.08824 | null |
2024-08-16 | Leveraging FourierKAN Classification Head for Pre-Trained Transformer-based Text Classification | Abdullah Al Imran et.al. | 2408.08803 | null |
2024-08-16 | Xpikeformer: Hybrid Analog-Digital Hardware Acceleration for Spiking Transformers | Zihang Song et.al. | 2408.08794 | null |
2024-08-16 | Quantum convolutional neural networks for jet images classification | Hala Elhag et.al. | 2408.08701 | null |
2024-08-16 | MM-UNet: A Mixed MLP Architecture for Improved Ophthalmic Image Segmentation | Zunjie Xiao et.al. | 2408.08600 | null |
2024-08-16 | Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs | Jinming Liu et.al. | 2408.08575 | null |
2024-08-16 | Efficient Image-to-Image Diffusion Classifier for Adversarial Robustness | Hefei Mei et.al. | 2408.08502 | link |
2024-08-15 | Beyond Uniform Query Distribution: Key-Driven Grouped Query Attention | Zohaib Khan et.al. | 2408.08454 | null |
2024-08-15 | Predictive uncertainty estimation in deep learning for lung carcinoma classification in digital pathology under real dataset shifts | Abdur R. Fayjie et.al. | 2408.08432 | null |
2024-08-15 | SLCA++: Unleash the Power of Sequential Fine-tuning for Continual Learning with Pre-training | Gengwei Zhang et.al. | 2408.08295 | link |
2024-08-15 | Moving Healthcare AI-Support Systems for Visually Detectable Diseases onto Constrained Devices | Tess Watt et.al. | 2408.08215 | null |
2024-08-15 | Towards flexible perception with visual memory | Robert Geirhos et.al. | 2408.08172 | null |
2024-08-15 | Category-Prompt Refined Feature Learning for Long-Tailed Multi-Label Image Classification | Jiexuan Yan et.al. | 2408.08125 | link |
2024-08-15 | HAIR: Hypernetworks-based All-in-One Image Restoration | Jin Cao et.al. | 2408.08091 | link |
2024-08-14 | Large Language Models Prompting With Episodic Memory | Dai Do et.al. | 2408.07465 | null |
2024-08-14 | Leveraging Perceptual Scores for Dataset Pruning in Computer Vision Tasks | Raghavendra Singh et.al. | 2408.07243 | null |
2024-08-13 | Efficient Search for Customized Activation Functions with Gradient Descent | Lukas Strack et.al. | 2408.06820 | link |
2024-08-13 | Do Vision-Language Foundational models show Robust Visual Perception? | Shivam Chandhok et.al. | 2408.06781 | link |
2024-08-13 | Towards Cross-Domain Single Blood Cell Image Classification via Large-Scale LoRA-based Segment Anything Model | Yongcheng Li et.al. | 2408.06716 | link |
2024-08-13 | Coherence Awareness in Diffractive Neural Networks | Matan Kleiner et.al. | 2408.06681 | null |
2024-08-12 | Is it a work or leisure travel? Applying text classification to identify work-related travel on social networks | Lucas Félix et.al. | 2408.06341 | null |
2024-08-12 | Audio Enhancement for Computer Audition -- An Iterative Training Paradigm Using Sample Importance | Manuel Milling et.al. | 2408.06264 | null |
2024-08-12 | Deep Learning System Boundary Testing through Latent Space Style Mixing | Amr Abdellatif et.al. | 2408.06258 | null |
2024-08-12 | Global-to-Local Support Spectrums for Language Model Explainability | Lucas Agussurja et.al. | 2408.05976 | null |
2024-08-12 | A Simple Task-aware Contrastive Local Descriptor Selection Strategy for Few-shot Learning between inter class and intra class | Qian Qiao et.al. | 2408.05953 | null |
2024-08-12 | Classifier Guidance Enhances Diffusion-based Adversarial Purification by Preserving Predictive Information | Mingkun Zhang et.al. | 2408.05900 | null |
2024-08-11 | HiLight: A Hierarchy-aware Light Global Model with Hierarchical Local ConTrastive Learning | Zhijian Chen et.al. | 2408.05786 | null |
2024-08-11 | PRECISe: Prototype-Reservation for Explainable Classification under Imbalanced and Scarce-Data Settings | Vaibhav Ganatra et.al. | 2408.05754 | null |
2024-08-11 | Disposable-key-based image encryption for collaborative learning of Vision Transformer | Rei Aso et.al. | 2408.05737 | null |
2024-08-11 | A Novel Momentum-Based Deep Learning Techniques for Medical Image Classification and Segmentation | Koushik Biswas et.al. | 2408.05692 | null |
2024-08-09 | A conformalized learning of a prediction set with applications to medical imaging classification | Roy Hirsch et.al. | 2408.05037 | null |
2024-08-09 | Generalisation First, Memorisation Second? Memorisation Localisation for Natural Language Classification Tasks | Verna Dankers et.al. | 2408.04965 | null |
2024-08-09 | LiD-FL: Towards List-Decodable Federated Learning | Hong Liu et.al. | 2408.04963 | null |
2024-08-09 | In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation | Dahyun Kang et.al. | 2408.04961 | link |
2024-08-08 | Enhanced Prototypical Part Network (EPPNet) For Explainable Image Classification Via Prototypes | Bhushan Atote et.al. | 2408.04606 | null |
2024-08-08 | SCENE: Evaluating Explainable AI Techniques Using Soft Counterfactuals | Haoran Zheng et.al. | 2408.04575 | null |
2024-08-08 | An experimental comparative study of backpropagation and alternatives for training binary neural networks for image classification | Ben Crulis et.al. | 2408.04460 | null |
2024-08-08 | Dual-branch PolSAR Image Classification Based on GraphMAE and Local Feature Extraction | Yuchen Wang et.al. | 2408.04294 | null |
2024-08-07 | FMiFood: Multi-modal Contrastive Learning for Food Image Classification | Xinyue Pan et.al. | 2408.03922 | null |
2024-08-07 | Leveraging Variation Theory in Counterfactual Data Augmentation for Optimized Active Learning | Simret Araya Gebreegziabher et.al. | 2408.03819 | null |
2024-08-07 | Intuitionistic Fuzzy Cognitive Maps for Interpretable Image Classification | Georgia Sovatzidi et.al. | 2408.03745 | null |
2024-08-07 | CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications | Tianfang Zhang et.al. | 2408.03703 | link |
2024-08-07 | Designing Extremely Memory-Efficient CNNs for On-device Vision Tasks | Jaewook Lee et.al. | 2408.03663 | null |
2024-08-07 | Making Robust Generalizers Less Rigid with Soft Ascent-Descent | Matthew J. Holland et.al. | 2408.03619 | null |
2024-08-06 | AI Foundation Models in Remote Sensing: A Survey | Siqi Lu et.al. | 2408.03464 | null |
2024-08-06 | Compress and Compare: Interactively Evaluating Efficiency and Behavior Across ML Model Compression Experiments | Angie Boggust et.al. | 2408.03274 | null |
2024-08-06 | A Debiased Nearest Neighbors Framework for Multi-Label Text Classification | Zifeng Cheng et.al. | 2408.03202 | null |
2024-08-06 | Leveraging Parameter Efficient Training Methods for Low Resource Text Classification: A Case Study in Marathi | Pranita Deshmukh et.al. | 2408.03172 | null |
2024-08-06 | Comb, Prune, Distill: Towards Unified Pruning for Vision Model Compression | Jonas Schmitt et.al. | 2408.03046 | null |
2024-08-06 | L3iTC at the FinLLM Challenge Task: Quantization for Financial Text Classification & Summarization | Elvys Linhares Pontes et.al. | 2408.03033 | null |
2024-08-06 | Adversarial Robustness of Open-source Text Classification Models and Fine-Tuning Chains | Hao Qin et.al. | 2408.02963 | null |
2024-08-06 | Dual-View Pyramid Pooling in Deep Neural Networks for Improved Medical Image Classification and Confidence Calibration | Xiaoqing Zhang et.al. | 2408.02906 | null |
2024-08-05 | Interpretation of the Intent Detection Problem as Dynamics in a Low-dimensional Space | Eduardo Sanchez-Karhunen et.al. | 2408.02838 | null |
2024-08-05 | Pre-trained Encoder Inference: Revealing Upstream Encoders In Downstream Machine Learning Services | Shaopeng Fu et.al. | 2408.02814 | null |
2024-08-05 | FPT+: A Parameter and Memory Efficient Transfer Learning Method for High-resolution Medical Image Classification | Yijin Huang et.al. | 2408.02426 | null |
2024-08-05 | On the Robustness of Malware Detectors to Adversarial Samples | Muhammad Salman et.al. | 2408.02310 | null |
2024-08-05 | Low-Cost Self-Ensembles Based on Multi-Branch Transformation and Grouped Convolution | Hojung Lee et.al. | 2408.02307 | null |
2024-08-05 | Network Fission Ensembles for Low-Cost Self-Ensembles | Hojung Lee et.al. | 2408.02301 | null |
2024-08-04 | VidModEx: Interpretable and Efficient Black Box Model Extraction for High-Dimensional Spaces | Somnath Sendhil Kumar et.al. | 2408.02140 | null |
2024-08-04 | DeMansia: Mamba Never Forgets Any Tokens | Ricky Fang et.al. | 2408.01986 | null |
2024-08-06 | A Survey and Evaluation of Adversarial Attacks for Object Detection | Khoi Nguyen Tiet Nguyen et.al. | 2408.01934 | null |
2024-08-03 | Safe Semi-Supervised Contrastive Learning Using In-Distribution Data as Positive Examples | Min Gu Kwak et.al. | 2408.01872 | null |
2024-08-03 | LAM3D: Leveraging Attention for Monocular 3D Object Detection | Diana-Alexandra Sas et.al. | 2408.01739 | null |
2024-08-02 | Counterfactual Explanations for Medical Image Classification and Regression using Diffusion Autoencoder | Matan Atad et.al. | 2408.01571 | null |
2024-08-02 | Spatial-Spectral Morphological Mamba for Hyperspectral Image Classification | Muhammad Ahmad et.al. | 2408.01372 | null |
2024-08-02 | WaveMamba: Spatial-Spectral Wavelet Mamba for Hyperspectral Image Classification | Muhammad Ahmad et.al. | 2408.01231 | null |
2024-08-02 | Multi-head Spatial-Spectral Mamba for Hyperspectral Image Classification | Muhammad Ahmad et.al. | 2408.01224 | null |
2024-08-02 | Rethinking Pre-trained Feature Extractor Selection in Multiple Instance Learning for Whole Slide Image Classification | Bryan Wong et.al. | 2408.01167 | null |
2024-08-01 | CERT-ED: Certifiably Robust Text Classification for Edit Distance | Zhuoqun Huang et.al. | 2408.00728 | null |
2024-08-01 | Deep Learning in Medical Image Classification from MRI-based Brain Tumor Images | Xiaoyi Liu et.al. | 2408.00636 | null |
2024-08-01 | DECIDER: Leveraging Foundation Model Priors for Improved Model Failure Detection and Explanation | Rakshith Subramanyam et.al. | 2408.00331 | null |
2024-07-31 | Vera Verto: Multimodal Hijacking Attack | Minxing Zhang et.al. | 2408.00129 | null |
2024-07-31 | Learning Video Context as Interleaved Multimodal Sequences | Kevin Qinghong Lin et.al. | 2407.21757 | null |
2024-07-30 | Contrasting Deep Learning Models for Direct Respiratory Insufficiency Detection Versus Blood Oxygen Saturation Estimation | Marcelo Matheus Gauy et.al. | 2407.20989 | null |
2024-07-30 | Faithful and Plausible Natural Language Explanations for Image Classification: A Pipeline Approach | Adam Wojciechowski et.al. | 2407.20899 | null |
2024-08-01 | DFE-IANet: A Method for Polyp Image Classification Based on Dual-domain Feature Extraction and Interaction Attention | Wei Wang et.al. | 2407.20843 | null |
2024-08-01 | The Susceptibility of Example-Based Explainability Methods to Class Outliers | Ikhtiyor Nematov et.al. | 2407.20678 | null |
2024-07-30 | Knowledge Fused Recognition: Fusing Hierarchical Knowledge for Image Recognition through Quantitative Relativity Modeling and Deep Metric Learning | Yunfeng Zhao et.al. | 2407.20600 | null |
2024-07-30 | Exploring Liquid Neural Networks on Loihi-2 | Wiktoria Agata Pawlak et.al. | 2407.20590 | null |
2024-07-29 | Graphite: A Graph-based Extreme Multi-Label Short Text Classifier for Keyphrase Recommendation | Ashirbad Mishra et.al. | 2407.20462 | null |
2024-07-29 | Diffusion Feedback Helps CLIP See Better | Wenxuan Wang et.al. | 2407.20171 | null |
2024-07-29 | Distilling High Diagnostic Value Patches for Whole Slide Image Classification Using Attention Mechanism | Tianhang Nan et.al. | 2407.19821 | null |
2024-07-28 | Competition-based Adaptive ReLU for Deep Neural Networks | Junjia Chen et.al. | 2407.19441 | null |
2024-07-28 | Depth-Wise Convolutions in Vision Transformers for Efficient Training on Small Datasets | Tianxiao Zhang et.al. | 2407.19394 | link |
2024-07-27 | Inference-Time Selective Debiasing | Gleb Kuzmin et.al. | 2407.19345 | null |
2024-07-27 | Stellar Blend Image Classification Using Computationally Efficient Gaussian Processes | Chinedu Eleh et.al. | 2407.19297 | null |
2024-07-27 | Towards Robust Few-shot Class Incremental Learning in Audio Classification using Contrastive Representation | Riyansha Singh et.al. | 2407.19265 | null |
2024-07-27 | A Survey of Malware Detection Using Deep Learning | Ahmed Bensaoud et.al. | 2407.19153 | null |
2024-07-26 | UniForensics: Face Forgery Detection via General Facial Representation | Ziyuan Fang et.al. | 2407.19079 | null |
2024-07-26 | A Scalable Quantum Non-local Neural Network for Image Classification | Sparsh Gupta et.al. | 2407.18906 | link |
2024-07-26 | Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment | Yuze Zheng et.al. | 2407.18854 | null |
2024-07-26 | Local Binary Pattern (LBP) Optimization for Feature Extraction | Zeinab Sedaghatjoo et.al. | 2407.18665 | null |
2024-07-26 | Topology Optimization of Random Memristors for Input-Aware Dynamic SNN | Bo Wang et.al. | 2407.18625 | null |
2024-07-26 | Content-driven Magnitude-Derivative Spectrum Complementary Learning for Hyperspectral Image Classification | Huiyan Bai et.al. | 2407.18593 | null |
2024-07-26 | VSSD: Vision Mamba with Non-Causal State Space Duality | Yuheng Shi et.al. | 2407.18559 | link |
2024-07-25 | Self-supervised pre-training with diffusion model for few-shot landmark detection in x-ray images | Roberto Di Via et.al. | 2407.18125 | null |
2024-07-25 | Mew: Multiplexed Immunofluorescence Image Analysis through an Efficient Multiplex Network | Sukwon Yun et.al. | 2407.17857 | link |
2024-07-25 | SAM-MIL: A Spatial Contextual Aware Multiple Instance Learning Approach for Whole Slide Image Classification | Heng Fang et.al. | 2407.17689 | link |
2024-07-26 | Unsqueeze [CLS] Bottleneck to Learn Rich Representations | Qing Su et.al. | 2407.17671 | link |
2024-07-24 | Explaining the Model, Protecting Your Data: Revealing and Mitigating the Data Privacy Risks of Post-Hoc Model Explanations via Membership Inference | Catherine Huang et.al. | 2407.17663 | null |
2024-07-23 | S-E Pipeline: A Vision Transformer (ViT) based Resilient Classification Pipeline for Medical Imaging Against Adversarial Attacks | Neha A S et.al. | 2407.17587 | null |
2024-07-24 | A Novel Two-Step Fine-Tuning Pipeline for Cold-Start Active Learning in Text Classification Tasks | Fabiano Belém et.al. | 2407.17284 | null |
2024-07-24 | Graph Neural Networks: A suitable Alternative to MLPs in Latent 3D Medical Image Classification? | Johannes Kiechle et.al. | 2407.17219 | link |
2024-07-24 | Quanv4EO: Empowering Earth Observation by means of Quanvolutional Neural Networks | Alessandro Sebastianelli et.al. | 2407.17108 | null |
2024-07-24 | An Adaptive Gradient Regularization Method | Huixiu Jiang et.al. | 2407.16944 | null |
2024-07-23 | Lawma: The Power of Specialization for Legal Tasks | Ricardo Dominguez-Olmedo et.al. | 2407.16615 | null |
2024-07-23 | Deep Bayesian segmentation for colon polyps: Well-calibrated predictions in medical imaging | Daniela L. Ramos et.al. | 2407.16608 | null |
2024-07-23 | Designing robust diffractive neural networks with improved transverse shift tolerance | Daniil V. Soshnikov et.al. | 2407.16456 | null |
2024-07-23 | Image Classification using Fuzzy Pooling in Convolutional Kolmogorov-Arnold Networks | Ayan Igali et.al. | 2407.16268 | null |
2024-07-23 | HSVLT: Hierarchical Scale-Aware Vision-Language Transformer for Multi-Label Image Classification | Shuyi Ouyang et.al. | 2407.16244 | null |
2024-07-23 | Improved Few-Shot Image Classification Through Multiple-Choice Questions | Dipika Khullar et.al. | 2407.16145 | null |
2024-07-22 | Pavement Fatigue Crack Detection and Severity Classification Based on Convolutional Neural Network | Zhen Wang et.al. | 2407.16021 | null |
2024-07-22 | AIDE: Antithetical, Intent-based, and Diverse Example-Based Explanations | Ikhtiyor Nematov et.al. | 2407.16010 | null |
2024-07-22 | Comprehensive Study on Performance Evaluation and Optimization of Model Compression: Bridging Traditional Deep Learning and Large Language Models | Aayush Saxena et.al. | 2407.15904 | null |
2024-07-22 | Beyond Size and Class Balance: Alpha as a New Dataset Quality Metric for Deep Learning | Josiah Couch et.al. | 2407.15724 | null |
2024-07-22 | Retinomorphic Feature Detection and Machine Vision in a Network Laser | Wai Kit Ng et.al. | 2407.15558 | null |
2024-07-22 | Learning deep illumination-robust features from multispectral filter array images | Anis Amziane et.al. | 2407.15472 | null |
2024-07-22 | Is user feedback always informative? Retrieval Latent Defending for Semi-Supervised Domain Adaptation without Source Data | Junha Song et.al. | 2407.15383 | null |
2024-07-22 | FMDNN: A Fuzzy-guided Multi-granular Deep Neural Network for Histopathological Image Classification | Weiping Ding et.al. | 2407.15312 | null |
2024-07-21 | Assessing Sample Quality via the Latent Space of Generative Models | Jingyi Xu et.al. | 2407.15171 | null |
2024-07-21 | A multi-level multi-label text classification dataset of 19th century Ottoman and Russian literary and critical texts | Gokcen Gokceoglu et.al. | 2407.15136 | null |
2024-07-20 | Toward Efficient Convolutional Neural Networks With Structured Ternary Patterns | Christos Kyrkou et.al. | 2407.14831 | link |
2024-07-20 | Subgraph Clustering and Atom Learning for Improved Image Classification | Aryan Singh et.al. | 2407.14772 | null |
2024-07-20 | A Comprehensive Review of Few-shot Action Recognition | Yuyang Wanyan et.al. | 2407.14744 | null |
2024-07-19 | DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks | Sarah Jabbour et.al. | 2407.14509 | null |
2024-07-19 | Enhancing Zero-shot Audio Classification using Sound Attribute Knowledge from Large Language Models | Xuenan Xu et.al. | 2407.14355 | null |
2024-07-19 | EmoCAM: Toward Understanding What Drives CNN-based Emotion Recognition | Youssef Doulfoukar et.al. | 2407.14314 | null |
2024-07-18 | CoAPT: Context Attribute words for Prompt Tuning | Gun Lee et.al. | 2407.13808 | null |
2024-07-18 | GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model | Abdelrahman Shaker et.al. | 2407.13772 | link |
2024-07-18 | Addressing Imbalance for Class Incremental Learning in Medical Image Classification | Xuze Hao et.al. | 2407.13768 | null |
2024-07-18 | Differential Privacy Mechanisms in Neural Tangent Kernel Regression | Jiuxiang Gu et.al. | 2407.13621 | null |
2024-07-18 | CycleMix: Mixing Source Domains for Domain Generalization in Style-Dependent Data | Aristotelis Ballas et.al. | 2407.13421 | link |
2024-07-17 | LookupViT: Compressing visual information to a limited number of tokens | Rajat Koner et.al. | 2407.12753 | null |
2024-07-17 | Toward INT4 Fixed-Point Training via Exploring Quantization Error for Gradients | Dohyung Kim et.al. | 2407.12637 | null |
2024-07-17 | Domain-specific or Uncertainty-aware models: Does it really make a difference for biomedical text classification? | Aman Sinha et.al. | 2407.12626 | null |
2024-07-18 | Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks | Antoni Kowalczuk et.al. | 2407.12588 | link |
2024-07-17 | Non-parametric regularization for class imbalance federated medical image classification | Jeffry Wicaksana et.al. | 2407.12446 | link |
2024-07-17 | FETCH: A Memory-Efficient Replay Approach for Continual Learning in Image Classification | Markus Weißflog et.al. | 2407.12375 | null |
2024-07-17 | Adaptive Cascading Network for Continual Test-Time Adaptation | Kien X. Nguyen et.al. | 2407.12240 | null |
2024-07-16 | Generalized Coverage for More Robust Low-Budget Active Learning | Wonho Bae et.al. | 2407.12212 | null |
2024-07-18 | A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification | Markus Marks et.al. | 2407.12210 | null |
2024-07-16 | Novel Artistic Scene-Centric Datasets for Effective Transfer Learning in Fragrant Spaces | Shumei Liu et.al. | 2407.11701 | null |
2024-07-16 | Probing the Efficacy of Federated Parameter-Efficient Fine-Tuning of Vision Transformers for Medical Image Classification | Naif Alkhunaizi et.al. | 2407.11573 | null |
2024-07-16 | TCFormer: Visual Recognition via Token Clustering Transformer | Wang Zeng et.al. | 2407.11321 | link |
2024-07-16 | PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer | Pierre-David Letourneau et.al. | 2407.11306 | null |
2024-07-15 | Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion | Philipp Allgeuer et.al. | 2407.11211 | null |
2024-07-16 | DataDream: Few-shot Guided Dataset Generation | Jae Myung Kim et.al. | 2407.10910 | link |
2024-07-15 | Pathology-knowledge Enhanced Multi-instance Prompt Learning for Few-shot Whole Slide Image Classification | Linhao Qu et.al. | 2407.10814 | null |
2024-07-15 | Employing Sentence Space Embedding for Classification of Data Stream from Fake News Domain | Paweł Zyblewski et.al. | 2407.10807 | null |
2024-07-15 | Anticipating Future Object Compositions without Forgetting | Youssef Zahran et.al. | 2407.10723 | null |
2024-07-15 | GeoMix: Towards Geometry-Aware Data Augmentation | Wentao Zhao et.al. | 2407.10681 | link |
2024-07-15 | Learning Natural Consistency Representation for Face Forgery Video Detection | Daichi Zhang et.al. | 2407.10550 | null |
2024-07-15 | Improving Hyperbolic Representations via Gromov-Wasserstein Regularization | Yifei Yang et.al. | 2407.10495 | null |
2024-07-15 | Backdoor Attacks against Image-to-Image Networks | Wenbo Jiang et.al. | 2407.10445 | null |
2024-07-14 | Deep Learning Algorithms for Early Diagnosis of Acute Lymphoblastic Leukemia | Dimitris Papaioannou et.al. | 2407.10251 | null |
2024-07-14 | Advancing Continual Learning for Robust Deepfake Audio Classification | Feiyi Dong et.al. | 2407.10108 | null |
2024-07-12 | Evaluating the Adversarial Robustness of Semantic Segmentation: Trying Harder Pays Off | Levente Halmosi et.al. | 2407.09150 | link |
2024-07-12 | Open Vocabulary Multi-Label Video Classification | Rohit Gupta et.al. | 2407.09073 | null |
2024-07-12 | GPC: Generative and General Pathology Image Classifier | Anh Tien Nguyen et.al. | 2407.09035 | null |
2024-07-12 | CAMP: Continuous and Adaptive Learning Model in Pathology | Anh Tien Nguyen et.al. | 2407.09030 | null |
2024-07-12 | SlideGCD: Slide-based Graph Collaborative Training with Knowledge Distillation for Whole Slide Image Classification | Tong Shu et.al. | 2407.08968 | null |
2024-07-12 | Domain-Hierarchy Adaptation via Chain of Iterative Reasoning for Few-shot Hierarchical Text Classification | Ke Ji et.al. | 2407.08959 | null |
2024-07-11 | Local Clustering for Lung Cancer Image Classification via Sparse Solution Technique | Jackson Hamel et.al. | 2407.08800 | null |
2024-07-11 | Data Adaptive Traceback for Vision-Language Foundation Models in Image Classification | Wenshuo Peng et.al. | 2407.08787 | null |
2024-07-11 | ElasticAST: An Audio Spectrogram Transformer for All Length and Resolutions | Jiu Feng et.al. | 2407.08691 | link |
2024-07-11 | Histopathological Image Classification with Cell Morphology Aware Deep Neural Networks | Andrey Ignatov et.al. | 2407.08625 | link |
2024-07-11 | BiasPruner: Debiased Continual Learning for Medical Image Classification | Nourhan Bayasi et.al. | 2407.08609 | link |
2024-07-11 | GraphMamba: An Efficient Graph Structure Learning Vision Mamba for Hyperspectral Image Classification | Aitao Yang et.al. | 2407.08255 | link |
2024-07-11 | Beyond Text: Leveraging Multi-Task Learning and Cognitive Appraisal Theory for Post-Purchase Intention Analysis | Gerard Christopher Yeo et.al. | 2407.08182 | null |
2024-07-11 | Enrich the content of the image Using Context-Aware Copy Paste | Qiushi Guo et.al. | 2407.08151 | null |
2024-07-10 | MambaVision: A Hybrid Mamba-Transformer Vision Backbone | Ali Hatamizadeh et.al. | 2407.08083 | link |
2024-07-10 | The Misclassification Likelihood Matrix: Some Classes Are More Likely To Be Misclassified Than Others | Daniel Sikar et.al. | 2407.07818 | null |
2024-07-11 | Trainable Highly-expressive Activation Functions | Irit Chelly et.al. | 2407.07564 | null |
2024-07-10 | HDKD: Hybrid Data-Efficient Knowledge Distillation Network for Medical Image Classification | Omar S. EL-Assiouti et.al. | 2407.07516 | null |
2024-07-10 | Towards a text-based quantitative and explainable histopathology image analysis | Anh Tien Nguyen et.al. | 2407.07360 | null |
2024-07-11 | FALFormer: Feature-aware Landmarks self-attention for Whole-slide Image Classification | Doanh C. Bui et.al. | 2407.07340 | link |
2024-07-10 | Dual-stage Hyperspectral Image Classification Model with Spectral Supertoken | Peifu Liu et.al. | 2407.07307 | link |
2024-07-09 | Exploring Camera Encoder Designs for Autonomous Driving Perception | Barath Lakshmanan et.al. | 2407.07276 | null |
2024-07-09 | CTRL-F: Pairing Convolution with Transformer for Image Classification via Multi-Level Feature Cross-Attention and Representation Learning Fusion | Hosam S. EL-Assiouti et.al. | 2407.06673 | null |
2024-07-09 | NoisyAG-News: A Benchmark for Addressing Instance-Dependent Noise in Text Classification | Hongfei Huang et.al. | 2407.06579 | null |
2024-07-08 | Hybrid Classical-Quantum architecture for vectorised image classification of hand-written sketches | Y. Cordero et.al. | 2407.06416 | null |
2024-07-08 | GeoWATCH for Detecting Heavy Construction in Heterogeneous Time Series of Satellite Images | Jon Crall et.al. | 2407.06337 | null |
2024-07-08 | Multi-Label Plant Species Classification with Self-Supervised Vision Transformers | Murilo Gustineli et.al. | 2407.06298 | link |
2024-07-08 | Active Label Refinement for Robust Training of Imbalanced Medical Image Classification Tasks in the Presence of High Label Noise | Bidur Khanal et.al. | 2407.05973 | null |
2024-07-08 | Wavelet Convolutions for Large Receptive Fields | Shahaf E. Finder et.al. | 2407.05848 | link |
2024-07-08 | Evaluating the Fairness of Neural Collapse in Medical Image Classification | Kaouther Mouheb et.al. | 2407.05843 | null |
2024-07-08 | Learning to Adapt Category Consistent Meta-Feature of CLIP for Few-Shot Classification | Jiaying Shi et.al. | 2407.05647 | null |
2024-07-08 | New Directions in Text Classification Research: Maximizing The Performance of Sentiment Classification from Limited Data | Surya Agustian et.al. | 2407.05627 | null |
2024-07-08 | Momentum Auxiliary Network for Supervised Local Learning | Junhao Su et.al. | 2407.05623 | link |
2024-07-08 | Open-world Multi-label Text Classification with Extremely Weak Supervision | Xintong Li et.al. | 2407.05609 | link |
2024-07-08 | FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance | Jiedong Zhuang et.al. | 2407.05578 | null |
2024-07-08 | An accurate detection is not all you need to combat label noise in web-noisy datasets | Paul Albert et.al. | 2407.05528 | null |
2024-07-07 | Leveraging Topological Guidance for Improved Knowledge Distillation | Eun Som Jeon et.al. | 2407.05316 | link |
2024-07-05 | AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation | Yuhan Zhu et.al. | 2407.04603 | null |
2024-07-05 | AMD: Automatic Multi-step Distillation of Large-scale Vision Models | Cheng Han et.al. | 2407.04208 | null |
2024-07-04 | LeDNet: Localization-enabled Deep Neural Network for Multi-Label Radiography Image Classification | Lalit Pant et.al. | 2407.03931 | null |
2024-07-04 | DocXplain: A Novel Model-Agnostic Explainability Method for Document Image Classification | Saifullah Saifullah et.al. | 2407.03830 | null |
2024-07-04 | reBEN: Refined BigEarthNet Dataset for Remote Sensing Image Analysis | Kai Norman Clasen et.al. | 2407.03653 | link |
2024-07-04 | Resampled Datasets Are Not Enough: Mitigating Societal Bias Beyond Single Attributes | Yusuke Hirota et.al. | 2407.03623 | null |
2024-07-04 | Self Adaptive Threshold Pseudo-labeling and Unreliable Sample Contrastive Loss for Semi-supervised Image Classification | Xuerong Zhang et.al. | 2407.03596 | null |
2024-07-04 | DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification | Wenhui Zhu et.al. | 2407.03575 | link |
2024-07-03 | A multicategory jet image classification framework using deep neural network | Jairo Orozco Sandoval et.al. | 2407.03524 | null |
2024-07-03 | Model Guidance via Explanations Turns Image Classifiers into Segmentation Models | Xiaoyan Yu et.al. | 2407.03009 | null |
2024-07-03 | ShiftAddAug: Augment Multiplication-Free Tiny Neural Network with Hybrid Computation | Yipin Guo et.al. | 2407.02881 | null |
2024-07-03 | Fine-Grained Scene Image Classification with Modality-Agnostic Adapter | Yiqun Wang et.al. | 2407.02769 | link |
2024-07-03 | ADFQ-ViT: Activation-Distribution-Friendly Post-Training Quantization for Vision Transformers | Yanfeng Jiang et.al. | 2407.02763 | null |
2024-07-02 | Spectral Graph Reasoning Network for Hyperspectral Image Classification | Huiling Wang et.al. | 2407.02647 | null |
2024-07-01 | CGRclust: Chaos Game Representation for Twin Contrastive Clustering of Unlabelled DNA Sequences | Fatemeh Alipour et.al. | 2407.02538 | link |
2024-07-02 | Exploring the Role of Transliteration in In-Context Learning for Low-resource Languages Written in Non-Latin Scripts | Chunlan Ma et.al. | 2407.02320 | null |
2024-07-03 | Federated Distillation for Medical Image Classification: Towards Trustworthy Computer-Aided Diagnosis | Sufen Ren et.al. | 2407.02261 | null |
2024-07-02 | Hybrid Feature Collaborative Reconstruction Network for Few-Shot Fine-Grained Image Classification | Shulei Qiu et.al. | 2407.02123 | null |
2024-07-01 | Optimized Learning for X-Ray Image Classification for Multi-Class Disease Diagnoses with Accelerated Computing Strategies | Sebastian A. Cruz Romero et.al. | 2407.01705 | null |
2024-07-02 | xLSTM-UNet can be an Effective 2D & 3D Medical Image Segmentation Backbone with Vision-LSTM (ViL) better than its Mamba Counterpart | Tianrun Chen et.al. | 2407.01530 | link |
2024-07-01 | Scarecrow monitoring system: employing mobilenet ssd for enhanced animal supervision | Balaji VS et.al. | 2407.01435 | null |
2024-07-01 | Semantic Compositions Enhance Vision-Language Contrastive Learning | Maxwell Aladago et.al. | 2407.01408 | null |
2024-07-01 | GalLoP: Learning Global and Local Prompts for Vision-Language Models | Marc Lafon et.al. | 2407.01400 | null |
2024-07-01 | Protecting Privacy in Classifiers by Token Manipulation | Re'em Harel et.al. | 2407.01334 | null |
2024-07-01 | Gradient-based Class Weighting for Unsupervised Domain Adaptation in Dense Prediction Visual Tasks | Roberto Alcover-Couso et.al. | 2407.01327 | null |
2024-06-28 | Extract More from Less: Efficient Fine-Grained Visual Recognition in Low-Data Regimes | Dmitry Demidov et.al. | 2406.19814 | link |
2024-06-27 | Fibottention: Inceptive Visual Representation Learning with Diverse Attention Across Heads | Ali Khaleghi Rahimian et.al. | 2406.19391 | link |
2024-06-27 | Learning Visual Conditioning Tokens to Correct Domain Shift for Fully Test-time Adaptation | Yushun Tang et.al. | 2406.19341 | null |
2024-06-27 | Spiking Convolutional Neural Networks for Text Classification | Changze Lv et.al. | 2406.19230 | link |
2024-06-27 | Adaptive Stochastic Weight Averaging | Caglar Demir et.al. | 2406.19092 | link |
2024-06-27 | FedMLP: Federated Multi-Label Medical Image Classification under Task Heterogeneity | Zhaobin Sun et.al. | 2406.18995 | link |
2024-06-26 | Detecting Machine-Generated Texts: Not Just "AI vs Humans" and Explainability is Complicated | Jiazhou Ji et.al. | 2406.18259 | null |
2024-06-26 | ViT-1.58b: Mobile Vision Transformers in the 1-bit Era | Zhengqing Yuan et.al. | 2406.18051 | null |
2024-06-25 | Benchmarking Deep Learning Models on NVIDIA Jetson Nano for Real-Time Systems: An Empirical Investigation | Tushar Prasanna Swaminathan et.al. | 2406.17749 | link |
2024-06-25 | Structured Unrestricted-Rank Matrices for Parameter Efficient Fine-tuning | Arijit Sehanobish et.al. | 2406.17740 | null |
2024-06-25 | BayTTA: Uncertainty-aware medical image classification with optimized test-time augmentation using Bayesian model averaging | Zeinab Sherkatghanad et.al. | 2406.17640 | link |
2024-06-26 | Mitigate the Gap: Investigating Approaches for Improving Cross-Modal Alignment in CLIP | Sedigheh Eslami et.al. | 2406.17639 | null |
2024-06-25 | Knowledge Distillation in Automated Annotation: Supervised Text Classification with LLM-Generated Training Labels | Nicholas Pangakis et.al. | 2406.17633 | null |
2024-06-25 | Retrieval-style In-Context Learning for Few-shot Hierarchical Text Classification | Huiyao Chen et.al. | 2406.17534 | link |
2024-06-25 | TSynD: Targeted Synthetic Data Generation for Enhanced Medical Image Classification | Joshua Niemeijer et.al. | 2406.17473 | null |
2024-06-25 | Dynamic Scheduling for Vehicle-to-Vehicle Communications Enhanced Federated Learning | Jintao Yan et.al. | 2406.17470 | null |
2024-06-25 | Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes | Qi Ma et.al. | 2406.17438 | null |
2024-06-25 | Robustly Optimized Deep Feature Decoupling Network for Fatty Liver Diseases Detection | Peng Huang et.al. | 2406.17338 | null |
2024-06-24 | Evaluation of Language Models in the Medical Context Under Resource-Constrained Settings | Andrea Posada et.al. | 2406.16611 | link |
2024-06-24 | Improving robustness to corruptions with multiplicative weight perturbations | Trung Trinh et.al. | 2406.16540 | null |
2024-06-24 | UNICAD: A Unified Approach for Attack Detection, Noise Reduction and Novel Class Identification | Alvaro Lopez Pellicer et.al. | 2406.16501 | null |
2024-06-24 | Improving Quaternion Neural Networks with Quaternionic Activation Functions | Johannes Pöppelbaum et.al. | 2406.16481 | null |
2024-06-24 | Learning in Wilson-Cowan model for metapopulation | Raffaele Marino et.al. | 2406.16453 | link |
2024-06-24 | Context-augmented Retrieval: A Novel Framework for Fast Information Retrieval based Response Generation using Large Language Model | Sai Ganesh et.al. | 2406.16383 | null |
2024-06-24 | Combining Supervised Learning and Reinforcement Learning for Multi-Label Classification Tasks with Partial Labels | Zixia Jia et.al. | 2406.16293 | null |
2024-06-23 | Jacobian Descent for Multi-Objective Optimization | Pierre Quinton et.al. | 2406.16232 | null |
2024-06-23 | Learning with Noisy Ground Truth: From 2D Classification to 3D Reconstruction | Yangdi Lu et.al. | 2406.15982 | null |
2024-06-22 | PUDD: Towards Robust Multi-modal Prototype-based Deepfake Detection | Alvaro Lopez Pellicer et.al. | 2406.15921 | null |
2024-06-21 | Retrieval Augmented Zero-Shot Text Classification | Tassallah Abdullahi et.al. | 2406.15241 | null |
2024-06-21 | DiffExplainer: Unveiling Black Box Models Via Counterfactual Generation | Yingying Fang et.al. | 2406.15182 | null |
2024-06-21 | This actually looks like that: Proto-BagNets for local and global interpretability-by-design | Kerol Djoumessi et.al. | 2406.15168 | link |
2024-06-21 | Hierarchical thematic classification of major conference proceedings | Arsentii Kuzmin et.al. | 2406.14983 | null |
2024-06-21 | Demonstrating the Efficacy of Kolmogorov-Arnold Networks in Vision Tasks | Minjong Cheon et.al. | 2406.14916 | link |
2024-06-21 | MU-Bench: A Multitask Multimodal Benchmark for Machine Unlearning | Jiali Cheng et.al. | 2406.14796 | null |
2024-06-20 | Depth | Parker Seegmiller et.al. | 2406.14695 | null |
2024-06-20 | Automatic Labels are as Effective as Manual Labels in Biomedical Images Classification with Deep Learning | Niccolò Marini et.al. | 2406.14351 | null |
2024-06-20 | Self-supervised Interpretable Concept-based Models for Text Classification | Francesco De Santis et.al. | 2406.14335 | null |
2024-06-20 | Adaptive Adversarial Cross-Entropy Loss for Sharpness-Aware Minimization | Tanapat Ratchatorn et.al. | 2406.14329 | null |
2024-06-20 | Boosting Hyperspectral Image Classification with Gate-Shift-Fuse Mechanisms in a Novel CNN-Transformer Approach | Mohamed Fadhlallah Guerri et.al. | 2406.14120 | null |
2024-06-20 | Seg-LSTM: Performance of xLSTM for Semantic Segmentation of Remotely Sensed Images | Qinfeng Zhu et.al. | 2406.14086 | link |
2024-06-21 | CMTNet: Convolutional Meets Transformer Network for Hyperspectral Images Classification | Faxu Guo et.al. | 2406.14080 | null |
2024-06-20 | Communication-Efficient Adaptive Batch Size Strategies for Distributed Local Gradient Methods | Tim Tsz-Kit Lau et.al. | 2406.13936 | null |
2024-06-19 | WATT: Weight Average Test-Time Adaption of CLIP | David Osowiechi et.al. | 2406.13875 | link |
2024-06-19 | CNN Based Flank Predictor for Quadruped Animal Species | Vanessa Suessle et.al. | 2406.13588 | null |
2024-06-19 | Online Domain-Incremental Learning Approach to Classify Acoustic Scenes in All Locations | Manjunath Mulimani et.al. | 2406.13386 | null |
2024-06-18 | LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging | Jinuk Kim et.al. | 2406.12837 | link |
2024-06-18 | Privacy Preserving Federated Learning in Medical Imaging with Uncertainty Estimation | Nikolas Koutsoubis et.al. | 2406.12815 | link |
2024-06-18 | Online Anchor-based Training for Image Classification Tasks | Maria Tzelepi et.al. | 2406.12662 | null |
2024-06-18 | Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation | Branislav Pecher et.al. | 2406.12471 | null |
2024-06-18 | GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory | Haoze Wu et.al. | 2406.12375 | null |
2024-06-18 | What Did I Do Wrong? Quantifying LLMs' Sensitivity and Consistency to Prompt Engineering | Federico Errica et.al. | 2406.12334 | null |
2024-06-18 | Unleashing the Potential of Open-set Noisy Samples Against Label Noise for Medical Image Classification | Zehui Liao et.al. | 2406.12293 | null |
2024-06-18 | Advancing Cross-Domain Generalizability in Face Anti-Spoofing: Insights, Design, and Metrics | Hyojin Kim et.al. | 2406.12258 | null |
2024-06-19 | MiSuRe is all you need to explain your image segmentation | Syed Nouman Hasany et.al. | 2406.12173 | null |
2024-06-17 | Enhancing Text Classification through LLM-Driven Active Learning and Human Annotation | Hamidreza Rouzegar et.al. | 2406.12114 | link |
2024-06-17 | Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99% | Lei Zhu et.al. | 2406.11837 | link |
2024-06-17 | PrAViC: Probabilistic Adaptation Framework for Real-Time Video Classification | Magdalena Trędowicz et.al. | 2406.11443 | null |
2024-06-17 | Cross-domain Open-world Discovery | Shuo Wen et.al. | 2406.11422 | link |
2024-06-17 | BaFTA: Backprop-Free Test-Time Adaptation For Zero-Shot Vision-Language Models | Xuefeng Hu et.al. | 2406.11309 | null |
2024-06-17 | An Empirical Investigation of Matrix Factorization Methods for Pre-trained Transformers | Ashim Gupta et.al. | 2406.11307 | null |
2024-06-17 | Text Grafting: Near-Distribution Weak Supervision for Minority Classes in Text Classification | Letian Peng et.al. | 2406.11115 | null |
2024-06-16 | Fine-grained Classes and How to Find Them | Matej Grcić et.al. | 2406.11070 | link |
2024-06-16 | Leveraging Foundation Models for Multi-modal Federated Learning with Incomplete Modality | Liwei Che et.al. | 2406.11048 | null |
2024-06-16 | Curating Stopwords in Marathi: A TF-IDF Approach for Improved Text Analysis and Information Retrieval | Rohan Chavan et.al. | 2406.11029 | link |
2024-06-16 | Universal Cross-Lingual Text Classification | Riya Savant et.al. | 2406.11028 | null |
2024-06-14 | UniAudio 1.5: Large Language Model-driven Audio Codec is A Few-shot Audio Task Learner | Dongchao Yang et.al. | 2406.10056 | null |
2024-06-14 | Comparison of fine-tuning strategies for transfer learning in medical image classification | Ana Davila et.al. | 2406.10050 | null |
2024-06-14 | Forgetting Order of Continual Learning: Examples That are Learned First are Forgotten Last | Guy Hacohen et.al. | 2406.09935 | null |
2024-06-13 | MirrorCheck: Efficient Adversarial Defense for Vision-Language Models | Samar Fares et.al. | 2406.09250 | null |
2024-06-13 | Self-Training for Sample-Efficient Active Learning for Text Classification with Pre-Trained Language Models | Christopher Schröder et.al. | 2406.09206 | null |
2024-06-13 | Large-Scale Evaluation of Open-Set Image Classification Techniques | Halil Bisgin et.al. | 2406.09112 | link |
2024-06-13 | LaCoOT: Layer Collapse through Optimal Transport | Victor Quétu et.al. | 2406.08933 | null |
2024-06-13 | The Penalized Inverse Probability Measure for Conformal Classification | Paul Melki et.al. | 2406.08884 | null |
2024-06-13 | Conceptual Learning via Embedding Approximations for Reinforcing Interpretability and Transparency | Maor Dikter et.al. | 2406.08840 | link |
2024-06-13 | DenoiseReID: Denoising Model for Representation Learning of Person Re-Identification | Zhengrui Xu et.al. | 2406.08773 | null |
2024-06-12 | Fine-Tuned 'Small' LLMs (Still) Significantly Outperform Zero-Shot Generative AI Models in Text Classification | Martin Juan José Bucher et.al. | 2406.08660 | null |
2024-06-12 | Intelligent Multi-View Test Time Augmentation | Efe Ozturk et.al. | 2406.08593 | null |
2024-06-12 | Transformation-Dependent Adversarial Attacks | Yaoteng Tan et.al. | 2406.08443 | null |
2024-06-12 | AdaNCA: Neural Cellular Automata As Adaptors For More Robust Vision Transformer | Yitao Xu et.al. | 2406.08298 | null |
2024-06-12 | DistilDoc: Knowledge Distillation for Visually-Rich Document Applications | Jordy Van Landeghem et.al. | 2406.08226 | null |
2024-06-12 | Fully Few-shot Class-incremental Audio Classification Using Expandable Dual-embedding Extractor | Yongjie Si et.al. | 2406.08122 | null |
2024-06-12 | Low-Complexity Acoustic Scene Classification Using Parallel Attention-Convolution Network | Yanxiong Li et.al. | 2406.08119 | null |
2024-06-12 | A | Lixian Zhang et.al. | 2406.08079 | null |
2024-06-12 | Adversarial Evasion Attack Efficiency against Large Language Models | João Vitorino et.al. | 2406.08050 | null |
2024-06-12 | Accurate Explanation Model for Image Classifiers using Class Association Embedding | Ruitao Xie et.al. | 2406.07961 | link |
2024-06-12 | Multi-Teacher Multi-Objective Meta-Learning for Zero-Shot Hyperspectral Band Selection | Jie Feng et.al. | 2406.07949 | null |
2024-06-12 | Small Scale Data-Free Knowledge Distillation | He Liu et.al. | 2406.07876 | link |
2024-06-11 | fKAN: Fractional Kolmogorov-Arnold Networks with trainable Jacobi basis functions | Alireza Afzal Aghaei et.al. | 2406.07456 | link |
2024-06-11 | Minimizing Energy Costs in Deep Learning Model Training: The Gaussian Sampling Approach | Challapalli Phanindra Revanth et.al. | 2406.07332 | null |
2024-06-11 | Noise-Robust Voice Conversion by Conditional Denoising Training Using Latent Variables of Recording Quality and Environment | Takuto Igarashi et.al. | 2406.07280 | null |
2024-06-11 | EEG-ImageNet: An Electroencephalogram Dataset and Benchmarks with Image Visual Stimuli of Multi-Granularity Labels | Shuqi Zhu et.al. | 2406.07151 | link |
2024-06-11 | RS-Agent: Automating Remote Sensing Tasks through Intelligent Agents | Wenjia Xu et.al. | 2406.07089 | null |
2024-06-11 | DualMamba: A Lightweight Spectral-Spatial Mamba-Convolution Network for Hyperspectral Image Classification | Jiamu Sheng et.al. | 2406.07050 | null |
2024-06-11 | Fairness-Aware Meta-Learning via Nash Bargaining | Yi Zeng et.al. | 2406.07029 | null |
2024-06-11 | Mitigating Boundary Ambiguity and Inherent Bias for Text Classification in the Era of Large Language Models | Zhenyi Lu et.al. | 2406.07001 | link |
2024-06-11 | Scaling up masked audio encoder learning for general audio classification | Heinrich Dinkel et.al. | 2406.06992 | null |
2024-06-10 | Multi-Objective Neural Architecture Search for In-Memory Computing | Md Hasibul Amin et.al. | 2406.06746 | null |
2024-06-10 | Robust Latent Representation Tuning for Image-text Classification | Hao Sun et.al. | 2406.06048 | null |
2024-06-09 | Contrastive Learning from Synthetic Audio Doppelgangers | Manuel Cherep et.al. | 2406.05923 | null |
2024-06-09 | Scaling Graph Convolutions for Mobile Vision | William Avery et.al. | 2406.05850 | link |
2024-06-09 | Evolution-aware VAriance (EVA) Coreset Selection for Medical Image Classification | Yuxin Hong et.al. | 2406.05677 | null |
2024-06-09 | Which Backbone to Use: A Resource-efficient Domain Specific Comparison for Computer Vision | Pranav Jeevan et.al. | 2406.05612 | link |
2024-06-08 | Aligning Human Knowledge with Visual Concepts Towards Explainable Medical Image Classification | Yunhe Gao et.al. | 2406.05596 | null |
2024-06-07 | The Unmet Promise of Synthetic Training Images: Using Retrieved Real Images Performs Better | Scott Geng et.al. | 2406.05184 | link |
2024-06-07 | A Novel Time Series-to-Image Encoding Approach for Weather Phenomena Classification | Christian Giannetti et.al. | 2406.05096 | null |
2024-06-07 | Classification Metrics for Image Explanations: Towards Building Reliable XAI-Evaluations | Benjamin Fresz et.al. | 2406.05068 | link |
2024-06-07 | REP: Resource-Efficient Prompting for On-device Continual Learning | Sungho Jeon et.al. | 2406.04772 | null |
2024-06-07 | AICoderEval: Improving AI Domain Code Generation of Large Language Models | Yinghui Xia et.al. | 2406.04712 | null |
2024-06-07 | Cooperative Meta-Learning with Gradient Augmentation | Jongyun Shin et.al. | 2406.04639 | link |
2024-06-06 | OCCAM: Towards Cost-Efficient and Accuracy-Aware Image Classification Inference | Dujian Ding et.al. | 2406.04508 | null |
2024-06-06 | Can Language Models Use Forecasting Strategies? | Sarah Pratt et.al. | 2406.04446 | null |
2024-06-06 | Parameter-Inverted Image Pyramid Networks | Xizhou Zhu et.al. | 2406.04330 | link |
2024-06-07 | BEADs: Bias Evaluation Across Domains | Shaina Raza et.al. | 2406.04220 | null |
2024-06-06 | What Do Language Models Learn in Context? The Structured Task Hypothesis | Jiaoda Li et.al. | 2406.04216 | null |
2024-06-06 | Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness | Lars Hillebrand et.al. | 2406.04156 | link |
2024-06-07 | ReDistill: Residual Encoded Distillation for Peak Memory Reduction | Fang Chen et.al. | 2406.03744 | null |
2024-06-06 | LLMEmbed: Rethinking Lightweight LLM's Genuine Function in Text Classification | Chun Liu et.al. | 2406.03725 | link |
2024-06-05 | Convolutional Neural Networks and Vision Transformers for Fashion MNIST Classification: A Literature Review | Sonia Bbouzidi et.al. | 2406.03478 | null |
2024-06-05 | IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models | David Ifeoluwa Adelani et.al. | 2406.03368 | null |
2024-06-05 | Audio Mamba: Bidirectional State Space Model for Audio Representation Learning | Mehmet Hamza Erol et.al. | 2406.03344 | link |
2024-06-05 | FusionBench: A Comprehensive Benchmark of Deep Model Fusion | Anke Tang et.al. | 2406.03280 | null |
2024-06-05 | VWise: A novel benchmark for evaluating scene classification for vehicular applications | Pedro Azevedo et.al. | 2406.03273 | null |
2024-06-05 | Tiny models from tiny data: Textual and null-text inversion for few-shot distillation | Erik Landolsi et.al. | 2406.03146 | link |
2024-06-05 | Exploiting LMM-based knowledge for image classification tasks | Maria Tzelepi et.al. | 2406.03071 | null |
2024-06-04 | Randomized Geometric Algebra Methods for Convex Neural Networks | Yifei Wang et.al. | 2406.02806 | null |
2024-06-04 | DL-KDD: Dual-Light Knowledge Distillation for Action Recognition in the Dark | Chi-Jui Chang et.al. | 2406.02468 | null |
2024-06-04 | GrootVL: Tree Topology is All You Need in State Space Model | Yicheng Xiao et.al. | 2406.02395 | link |
2024-06-04 | Hybrid Quantum-Classical Neural Network for LAB Color Space Image Classification | Kwokho Ng et.al. | 2406.02229 | null |
2024-06-03 | Few-Shot Classification of Interactive Activities of Daily Living (InteractADL) | Zane Durante et.al. | 2406.01662 | link |
2024-06-03 | CoLa-DCE -- Concept-guided Latent Diffusion Counterfactual Explanations | Franz Motzkus et.al. | 2406.01649 | null |
2024-06-03 | Asynchronous Multi-Server Federated Learning for Geo-Distributed Clients | Yuncong Zuo et.al. | 2406.01439 | null |
2024-06-03 | Compute-Efficient Medical Image Classification with Softmax-Free Transformers and Sequence Normalization | Firas Khader et.al. | 2406.01314 | null |
2024-06-03 | Continuous Geometry-Aware Graph Diffusion via Hyperbolic Neural PDE | Jiaxu Liu et.al. | 2406.01282 | null |
2024-06-04 | MultiMax: Sparse and Multi-Modal Attention Learning | Yuxuan Zhou et.al. | 2406.01189 | link |
2024-06-03 | Synergizing Unsupervised and Supervised Learning: A Hybrid Approach for Accurate Natural Language Task Modeling | Wrick Talukdar et.al. | 2406.01096 | null |
2024-05-31 | You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet | Zhen Qin et.al. | 2405.21022 | null |
2024-05-31 | Investigating Calibration and Corruption Robustness of Post-hoc Pruned Perception CNNs: An Image Classification Benchmark Study | Pallavi Mitra et.al. | 2405.20876 | null |
2024-05-31 | Improving Generalization and Convergence by Enhancing Implicit Regularization | Mingze Wang et.al. | 2405.20763 | null |
2024-05-31 | Robust Stable Spiking Neural Networks | Jianhao Ding et.al. | 2405.20694 | null |
2024-05-31 | Enhancing Counterfactual Image Generation Using Mahalanobis Distance with Distribution Preferences in Feature Space | Yukai Zhang et.al. | 2405.20685 | null |
2024-05-31 | GenMix: Combining Generative and Mixture Data Augmentation for Medical Image Classification | Hansang Lee et.al. | 2405.20650 | null |
2024-05-31 | ToxVidLLM: A Multimodal LLM-based Framework for Toxicity Detection in Code-Mixed Videos | Krishanu Maity et.al. | 2405.20628 | null |
2024-05-30 | Mitigating the Impact of Labeling Errors on Training via Rockafellian Relaxation | Louis L. Chen et.al. | 2405.20531 | null |
2024-05-30 | DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark | Haoxing Chen et.al. | 2405.19707 | link |
2024-05-30 | A Novel Approach for Automated Design Information Mining from Issue Logs | Jiuang Zhao et.al. | 2405.19623 | null |
2024-05-29 | I Bet You Did Not Mean That: Testing Semantic Importance via Betting | Jacopo Teneggi et.al. | 2405.19146 | link |
2024-05-29 | Verifiably Robust Conformal Prediction | Linus Jeary et.al. | 2405.18942 | null |
2024-05-29 | Leveraging Many-To-Many Relationships for Defending Against Visual-Language Adversarial Attacks | Futa Waseda et.al. | 2405.18770 | null |
2024-05-29 | GIST: Greedy Independent Set Thresholding for Diverse Data Summarization | Matthew Fahrbach et.al. | 2405.18754 | null |
2024-05-29 | LLM-based Hierarchical Concept Decomposition for Interpretable Fine-Grained Image Classification | Renyi Qu et.al. | 2405.18672 | null |
2024-05-28 | It's Not a Modality Gap: Characterizing and Addressing the Contrastive Gap | Abrar Fahim et.al. | 2405.18570 | null |
2024-05-28 | Why are Visually-Grounded Language Models Bad at Image Classification? | Yuhui Zhang et.al. | 2405.18415 | link |
2024-05-28 | MSPE: Multi-Scale Patch Embedding Prompts Vision Transformers to Any Resolution | Wenzhuo Liu et.al. | 2405.18240 | null |
2024-05-28 | Confidence-aware multi-modality learning for eye disease screening | Ke Zou et.al. | 2405.18167 | link |
2024-05-28 | 4-bit Shampoo for Memory-Efficient Network Training | Sike Wang et.al. | 2405.18144 | null |
2024-05-28 | DMT-JEPA: Discriminative Masked Targets for Joint-Embedding Predictive Architecture | Shentong Mo et.al. | 2405.17995 | null |
2024-05-27 | WASH: Train your Ensemble with Communication-Efficient Weight Shuffling, then Average | Louis Fournier et.al. | 2405.17517 | null |
2024-05-27 | Model-Agnostic Zeroth-Order Policy Optimization for Meta-Learning of Ergodic Linear Quadratic Regulators | Yunian Pan et.al. | 2405.17370 | null |
2024-05-27 | On the Noise Robustness of In-Context Learning for Text Generation | Hongfu Gao et.al. | 2405.17264 | null |
2024-05-27 | Superpixelwise Low-rank Approximation based Partial Label Learning for Hyperspectral Image Classification | Shujun Yang et.al. | 2405.17110 | link |
2024-05-26 | Demystify Mamba in Vision: A Linear Attention Perspective | Dongchen Han et.al. | 2405.16605 | null |
2024-05-26 | AdaFisher: Adaptive Second Order Optimization via Fisher Information | Damien Martins Gomes et.al. | 2405.16397 | null |
2024-05-25 | ModelLock: Locking Your Model With a Spell | Yifeng Gao et.al. | 2405.16285 | null |
2024-05-25 | Accelerating Transformers with Spectrum-Preserving Token Merging | Hoai-Chau Tran et.al. | 2405.16148 | null |
2024-05-25 | Breaking the False Sense of Security in Backdoor Defense through Re-Activation Attack | Mingli Zhu et.al. | 2405.16134 | null |
2024-05-24 | Grounding Stylistic Domain Generalization with Quantitative Domain Shift Measures and Synthetic Scene Images | Yiran Luo et.al. | 2405.15961 | null |
2024-05-24 | A Neurosymbolic Framework for Bias Correction in CNNs | Parth Padalkar et.al. | 2405.15886 | null |
2024-05-24 | What Do You See? Enhancing Zero-Shot Image Classification with Multimodal Large Language Models | Abdelrahman Abdelhamed et.al. | 2405.15668 | null |
2024-05-24 | Class Machine Unlearning for Complex Data via Concepts Inference and Data Poisoning | Wenhan Chang et.al. | 2405.15662 | null |
2024-05-24 | Exposing Image Classifier Shortcuts with Counterfactual Frequency (CoF) Tables | James Hinns et.al. | 2405.15661 | null |
2024-05-24 | Harnessing Increased Client Participation with Cohort-Parallel Federated Learning | Akash Dhasade et.al. | 2405.15644 | null |
2024-05-24 | Transformer-based Federated Learning for Multi-Label Remote Sensing Image Classification | Barış Büyüktaş et.al. | 2405.15405 | null |
2024-05-24 | CLIP model is an Efficient Online Lifelong Learner | Leyuan Wang et.al. | 2405.15155 | null |
2024-05-24 | OptLLM: Optimal Assignment of Queries to Large Language Models | Yueyue Liu et.al. | 2405.15130 | null |
2024-05-23 | A Lost Opportunity for Vision-Language Models: A Comparative Study of Online Test-time Adaptation for Vision-Language Models | Mario Döbler et.al. | 2405.14977 | link |
2024-05-23 | Domain Wall Magnetic Tunnel Junction Reliable Integrate and Fire Neuron | Can Cui et.al. | 2405.14851 | null |
2024-05-23 | Explaining Black-box Model Predictions via Two-level Nested Feature Attributions with Consistency Property | Yuya Yoshikawa et.al. | 2405.14522 | null |
2024-05-23 | SIAVC: Semi-Supervised Framework for Industrial Accident Video Classification | Zuoyong Li et.al. | 2405.14506 | null |
2024-05-23 | Scalable Visual State Space Model with Fractal Scanning | Lv Tang et.al. | 2405.14480 | null |
2024-05-23 | Segformer++: Efficient Token-Merging Strategies for High-Resolution Semantic Segmentation | Daniel Kienzle et.al. | 2405.14467 | null |
2024-05-23 | Boosting Robustness by Clipping Gradients in Distributed Learning | Youssef Allouah et.al. | 2405.14432 | null |
2024-05-23 | Advancing Spiking Neural Networks for Sequential Modeling with Central Pattern Generators | Changze Lv et.al. | 2405.14362 | null |
2024-05-23 | Simple Hamiltonian dynamics is a powerful quantum processing resource | Akitada Sakurai et.al. | 2405.14245 | null |
2024-05-23 | ChronosLex: Time-aware Incremental Training for Temporal Generalization of Legal Classification Tasks | T. Y. S. S Santosh et.al. | 2405.14211 | null |
2024-05-22 | Just rotate it! Uncertainty estimation in closed-source models via multiple queries | Konstantinos Pitas et.al. | 2405.13864 | null |
2024-05-21 | Decentralized Federated Learning Over Imperfect Communication Channels | Weicai Li et.al. | 2405.12894 | null |
2024-05-21 | Multimodal Adaptive Inference for Document Image Classification with Anytime Early Exiting | Omar Hamed et.al. | 2405.12705 | null |
2024-05-21 | Exploration of Masked and Causal Language Modelling for Text Generation | Nicolo Micheletti et.al. | 2405.12630 | null |
2024-05-21 | 3DSS-Mamba: 3D-Spectral-Spatial Mamba for Hyperspectral Image Classification | Yan He et.al. | 2405.12487 | null |
2024-05-20 | Alzheimer's Magnetic Resonance Imaging Classification Using Deep and Meta-Learning Models | Nida Nasir et.al. | 2405.12126 | null |
2024-05-20 | Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification | Weilian Zhou et.al. | 2405.12003 | link |
2024-05-20 | A Constraint-Enforcing Reward for Adversarial Attacks on Text Classifiers | Tom Roth et.al. | 2405.11904 | null |
2024-05-21 | A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus | Eduard Poesina et.al. | 2405.11877 | link |
2024-05-20 | SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model | Siavash Shams et.al. | 2405.11831 | link |
2024-05-20 | Exploring Ordinality in Text Classification: A Comparative Study of Explicit and Implicit Techniques | Siva Rajesh Kasa et.al. | 2405.11775 | null |
2024-05-19 | SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization | Jialong Guo et.al. | 2405.11582 | link |
2024-05-19 | Reproducibility Study of CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification | Manan Shah et.al. | 2405.11574 | link |
2024-05-19 | An Invisible Backdoor Attack Based On Semantic Feature | Yangming Chen et.al. | 2405.11551 | null |
2024-05-19 | Verification technology for finger vein biometric | George Kumi Kyeremeh et.al. | 2405.11540 | null |
2024-05-17 | Reduced storage direct tensor ring decomposition for convolutional neural networks compression | Mateusz Gabor et.al. | 2405.10802 | link |
2024-05-17 | Benchmarking Large Language Models on CFLUE -- A Chinese Financial Language Understanding Evaluation Dataset | Jie Zhu et.al. | 2405.10542 | link |
2024-05-17 | Smart Expert System: Large Language Models as Text Classifiers | Zhiqiang Wang et.al. | 2405.10523 | link |
2024-05-16 | Data-Efficient Low-Complexity Acoustic Scene Classification in the DCASE 2024 Challenge | Florian Schmid et.al. | 2405.10018 | null |
2024-05-16 | ROCOv2: Radiology Objects in COntext Version 2, an Updated Multimodal Image Dataset | Johannes Rückert et.al. | 2405.10004 | link |
2024-05-15 | Improving Label Error Detection and Elimination with Uncertainty Quantification | Johannes Jakubik et.al. | 2405.09602 | null |
2024-05-15 | Tackling Distribution Shifts in Task-Oriented Communication with Information Bottleneck | Hongru Li et.al. | 2405.09514 | null |
2024-05-15 | Feature-based Federated Transfer Learning: Communication Efficiency, Robustness and Privacy | Feng Wang et.al. | 2405.09014 | link |
2024-05-14 | The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks | Ziquan Liu et.al. | 2405.08886 | link |
2024-05-14 | Harnessing the power of longitudinal medical imaging for eye disease prognosis using Transformer-based sequence modeling | Gregory Holste et.al. | 2405.08780 | null |
2024-05-14 | FolkTalent: Enhancing Classification and Tagging of Indian Folk Paintings | Nancy Hada et.al. | 2405.08776 | null |
2024-05-14 | The impact of Compositionality in Zero-shot Multi-label action recognition for Object-based tasks | Carmela Calabrese et.al. | 2405.08695 | null |
2024-05-14 | Achieving Fairness Through Channel Pruning for Dermatological Disease Diagnosis | Qingpeng Kong et.al. | 2405.08681 | link |
2024-05-14 | Investigating Design Choices in Joint-Embedding Predictive Architectures for General Audio Representation Learning | Alain Riou et.al. | 2405.08679 | null |
2024-05-14 | Dual-Branch Network for Portrait Image Quality Assessment | Wei Sun et.al. | 2405.08555 | null |
2024-05-13 | Who's in and who's out? A case study of multimodal CLIP-filtering in DataComp | Rachel Hong et.al. | 2405.08209 | link |
2024-05-14 | MambaOut: Do We Really Need Mamba for Vision? | Weihao Yu et.al. | 2405.07992 | link |
2024-05-13 | Constrained Exploration via Reflected Replica Exchange Stochastic Gradient Langevin Dynamics | Haoyang Zheng et.al. | 2405.07839 | link |
2024-05-13 | Analysis of the rate of convergence of an over-parametrized convolutional neural network image classifier learned by gradient descent | Michael Kohler et.al. | 2405.07619 | null |
2024-05-13 | On-device Online Learning and Semantic Management of TinyML Systems | Haoyu Ren et.al. | 2405.07601 | null |
2024-05-13 | GLiRA: Black-Box Membership Inference Attack via Knowledge Distillation | Andrey V. Galichin et.al. | 2405.07562 | null |
2024-05-13 | Fine-tuning the SwissBERT Encoder Model for Embedding Sentences and Documents | Juri Grosjean et.al. | 2405.07513 | null |
2024-05-13 | MoVL:Exploring Fusion Strategies for the Domain-Adaptive Application of Pretrained Models in Medical Imaging Tasks | Haijiang Tian et.al. | 2405.07411 | null |
2024-05-12 | Explainable Convolutional Neural Networks for Retinal Fundus Classification and Cutting-Edge Segmentation Models for Retinal Blood Vessels from Fundus Images | Fatema Tuj Johora Faria et.al. | 2405.07338 | null |
2024-05-12 | Differentiable Model Scaling using Differentiable Topk | Kai Liu et.al. | 2405.07194 | null |
2024-05-11 | A framework of text-dependent speaker verification for chinese numerical string corpus | Litong Zheng et.al. | 2405.07029 | null |
2024-05-10 | Pseudo-Prompt Generating in Pre-trained Vision-Language Models for Multi-Label Medical Image Classification | Yaoqin Ye et.al. | 2405.06468 | null |
2024-05-10 | Multi-level Personalized Federated Learning on Heterogeneous and Long-Tailed Data | Rongyu Zhang et.al. | 2405.06413 | null |
2024-05-10 | SaudiBERT: A Large Language Model Pretrained on Saudi Dialect Corpora | Faisal Qarah et.al. | 2405.06239 | null |
2024-05-09 | Deep Multi-Task Learning for Malware Image Classification | Ahmed Bensaoud et.al. | 2405.05906 | null |
2024-05-09 | Enhancing Suicide Risk Detection on Social Media through Semi-Supervised Deep Label Smoothing | Matthew Squires et.al. | 2405.05795 | null |
2024-05-09 | CSA-Net: Channel-wise Spatially Autocorrelated Attention Networks | Nick et.al. | 2405.05755 | null |
2024-05-09 | How Quality Affects Deep Neural Networks in Fine-Grained Image Classification | Joseph Smith et.al. | 2405.05742 | null |
2024-05-09 | End-to-End Generative Semantic Communication Powered by Shared Semantic Knowledge Base | Shuling Li et.al. | 2405.05738 | null |
2024-05-09 | Using Machine Translation to Augment Multilingual Classification | Adam King et.al. | 2405.05478 | null |
2024-05-08 | AFEN: Respiratory Disease Classification using Ensemble Learning | Rahul Nadkarni et.al. | 2405.05467 | null |
2024-05-08 | XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples | Peiqin Lin et.al. | 2405.05116 | link |
2024-05-08 | Explanation as a Watermark: Towards Harmless and Multi-bit Model Ownership Verification via Watermarking Feature Attribution | Shuo Shao et.al. | 2405.04825 | null |
2024-05-07 | Exploring Explainable AI Techniques for Improved Interpretability in Lung and Colon Cancer Classification | Mukaffi Bin Moin et.al. | 2405.04610 | link |
2024-05-07 | Pragmatist Intelligence: Where the Principle of Usefulness Can Take ANNs | Antonio Bikić et.al. | 2405.04386 | null |
2024-05-07 | Semi-Supervised Disease Classification based on Limited Medical Image Data | Yan Zhang et.al. | 2405.04295 | null |
2024-05-07 | DCNN: Dual Cross-current Neural Networks Realized Using An Interactive Deep Learning Discriminator for Fine-grained Objects | Da Fu et.al. | 2405.04093 | null |
2024-05-07 | Feature Map Convergence Evaluation for Functional Module | Ludan Zhang et.al. | 2405.04041 | null |
2024-05-07 | VMambaCC: A Visual State Space Model for Crowd Counting | Hao-Yuan Ma et.al. | 2405.03978 | null |
2024-05-06 | On Adversarial Examples for Text Classification by Perturbing Latent Representations | Korn Sooksatra et.al. | 2405.03789 | null |
2024-05-06 | CICA: Content-Injected Contrastive Alignment for Zero-Shot Document Image Classification | Sankalp Sinha et.al. | 2405.03660 | null |
2024-05-06 | Deep Space Separable Distillation for Lightweight Acoustic Scene Classification | ShuQi Ye et.al. | 2405.03567 | null |
2024-05-06 | Liberating Seen Classes: Boosting Few-Shot and Zero-Shot Text Classification via Anchor Generation and Classification Reframing | Han Liu et.al. | 2405.03565 | null |
2024-05-06 | A Lightweight Neural Architecture Search Model for Medical Image Classification | Lunchen Xie et.al. | 2405.03462 | null |
2024-05-06 | Interpretable Network Visualizations: A Human-in-the-Loop Approach for Post-hoc Explainability of CNN-based Image Classification | Matteo Bianchi et.al. | 2405.03301 | null |
2024-05-06 | TED: Accelerate Model Training by Internal Generalization | Jinying Xiao et.al. | 2405.03228 | null |
2024-05-06 | Advancing Multimodal Medical Capabilities of Gemini | Lin Yang et.al. | 2405.03162 | null |
2024-05-05 | A scoping review of using Large Language Models (LLMs) to investigate Electronic Health Records (EHRs) | Lingyao Li et.al. | 2405.03066 | null |
2024-05-05 | Parameter-Efficient Fine-Tuning with Discrete Fourier Transform | Ziqi Gao et.al. | 2405.03003 | null |
2024-05-04 | MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning | Vishal Nedungadi et.al. | 2405.02771 | null |
2024-05-03 | Multi-method Integration with Confidence-based Weighting for Zero-shot Image Classification | Siqi Yin et.al. | 2405.02155 | null |
2024-05-03 | The Trade-off between Performance, Efficiency, and Fairness in Adapter Modules for Text Classification | Minh Duc Bui et.al. | 2405.02010 | null |
2024-05-03 | Which Identities Are Mobilized: Towards an automated detection of social group appeals in political texts | Felicia Riethmüller et.al. | 2405.01904 | null |
2024-05-02 | PVF (Parameter Vulnerability Factor): A Quantitative Metric Measuring AI Vulnerability and Resilience Against Parameter Corruptions | Xun Jiao et.al. | 2405.01741 | null |
2024-05-02 | Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey | Guoping Xu et.al. | 2405.01725 | link |
2024-05-02 | SOAR: Advancements in Small Body Object Detection for Aerial Imagery Using State Space Models and Programmable Gradients | Tushar Verma et.al. | 2405.01699 | null |
2024-05-02 | Explainable AI (XAI) in Image Segmentation in Medicine, Industry, and Beyond: A Survey | Rokas Gipiškis et.al. | 2405.01636 | null |
2024-05-02 | Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models | Nishad Singhi et.al. | 2405.01531 | null |
2024-05-03 | Decoupling Feature Extraction and Classification Layers for Calibrated Neural Networks | Mikkel Jordahn et.al. | 2405.01196 | null |
2024-05-02 | Uncertainty-aware self-training with expectation maximization basis transformation | Zijia Wang et.al. | 2405.01175 | null |
2024-05-02 | Transformers Fusion across Disjoint Samples for Hyperspectral Image Classification | Muhammad Ahmad et.al. | 2405.01095 | null |
2024-05-02 | Efficient and Flexible Method for Reducing Moderate-size Deep Neural Networks with Condensation | Tianyi Chen et.al. | 2405.01041 | null |
2024-05-02 | Benchmarking Representations for Speech, Music, and Acoustic Events | Moreno La Quatra et.al. | 2405.00934 | link |
2024-05-01 | Digital-analog quantum convolutional neural networks for image classification | Anton Simen et.al. | 2405.00548 | null |
2024-05-03 | BiomedRAG: A Retrieval Augmented Large Language Model for Biomedicine | Mingchen Li et.al. | 2405.00465 | null |
2024-05-01 | Visual and audio scene classification for detecting discrepancies in video: a baseline method and experimental protocol | Konstantinos Apostolidis et.al. | 2405.00384 | null |
2024-05-01 | Data Augmentation Policy Search for Long-Term Forecasting | Liran Nochumsohn et.al. | 2405.00319 | null |
2024-04-30 | Let's Focus: Focused Backdoor Attack against Federated Transfer Learning | Marco Arazzi et.al. | 2404.19420 | null |
2024-04-30 | Large Language Model Informed Patent Image Retrieval | Hao-Cheng Lo et.al. | 2404.19360 | null |
2024-04-30 | Enhancing Intrinsic Features for Debiasing via Investigating Class-Discerning Common Attributes in Bias-Contrastive Pair | Jeonghoon Park et.al. | 2404.19250 | null |
2024-04-29 | Spectral-Spatial Mamba for Hyperspectral Image Classification | Lingbo Huang et.al. | 2404.18401 | null |
2024-04-28 | TextGram: Towards a better domain-adaptive pretraining | Sharayu Hiwarkhedkar et.al. | 2404.18228 | null |
2024-04-28 | L3Cube-MahaNews: News-based Short Text and Long Document Classification Datasets in Marathi | Saloni Mittal et.al. | 2404.18216 | link |
2024-04-28 | S | Guanchun Wang et.al. | 2404.18213 | null |
2024-04-27 | Implicit Generative Prior for Bayesian Neural Networks | Yijia Liu et.al. | 2404.18008 | link |
2024-04-27 | Towards Privacy-Preserving Audio Classification Systems | Bhawana Chhaglani et.al. | 2404.18002 | null |
2024-04-27 | A Method of Moments Embedding Constraint and its Application to Semi-Supervised Learning | Michael Majurski et.al. | 2404.17978 | null |
2024-04-27 | Spatial, Temporal, and Geometric Fusion for Remote Sensing Images | Hessah Albanwan et.al. | 2404.17851 | null |
2024-04-27 | Leveraging Cross-Modal Neighbor Representation for Improved CLIP Classification | Chao Yi et.al. | 2404.17753 | link |
2024-04-26 | SPLICE -- Streamlining Digital Pathology Image Processing | Areej Alsaafin et.al. | 2404.17704 | null |
2024-04-26 | SDFD: Building a Versatile Synthetic Face Image Dataset with Diverse Attributes | Georgia Baltsou et.al. | 2404.17255 | null |
2024-04-25 | Incorporating Lexical and Syntactic Knowledge for Unsupervised Cross-Lingual Transfer | Jianyu Zheng et.al. | 2404.16627 | link |
2024-04-25 | IMWA: Iterative Model Weight Averaging Benefits Class-Imbalanced Learning Tasks | Zitong Huang et.al. | 2404.16331 | null |
2024-04-25 | Lacunarity Pooling Layers for Plant Image Classification using Texture Analysis | Akshatha Mohan et.al. | 2404.16268 | link |
2024-04-24 | MiMICRI: Towards Domain-centered Counterfactual Explanations of Cardiovascular Image Classification Models | Grace Guo et.al. | 2404.16174 | null |
2024-04-24 | MoDE: CLIP Data Experts via Clustering | Jiawei Ma et.al. | 2404.16030 | link |
2024-04-26 | A Survey on Visual Mamba | Hanwei Zhang et.al. | 2404.15956 | null |
2024-04-24 | Vision Transformer-based Adversarial Domain Adaptation | Yahan Li et.al. | 2404.15817 | link |
2024-04-24 | Rethinking Model Prototyping through the MedMNIST+ Dataset Collection | Sebastian Doerrich et.al. | 2404.15786 | null |
2024-04-24 | Efficient Multi-Model Fusion with Adversarial Complementary Representation Learning | Zuheng Kang et.al. | 2404.15704 | null |
2024-04-24 | Brain Storm Optimization Based Swarm Learning for Diabetic Retinopathy Image Classification | Liang Qu et.al. | 2404.15585 | null |
2024-04-23 | An MRP Formulation for Supervised Learning: Generalized Temporal Difference Learning Models | Yangchen Pan et.al. | 2404.15518 | null |
2024-04-23 | Deep multi-prototype capsule networks | Saeid Abbassi et.al. | 2404.15445 | null |
2024-04-23 | A review of deep learning-based information fusion techniques for multimodal medical image classification | Yihao Li et.al. | 2404.15022 | null |
2024-04-23 | Social Media and Artificial Intelligence for Sustainable Cities and Societies: A Water Quality Analysis Use-case | Muhammad Asif Auyb et.al. | 2404.14977 | null |
2024-04-23 | Traditional to Transformers: A Survey on Current Trends and Future Prospects for Hyperspectral Image Classification | Muhammad Ahmad et.al. | 2404.14955 | link |
2024-04-23 | Pyramid Hierarchical Transformer for Hyperspectral Image Classification | Muhammad Ahmad et.al. | 2404.14945 | link |
2024-04-23 | Importance of Disjoint Sampling in Conventional and Transformer Models for Hyperspectral Image Classification | Muhammad Ahmad et.al. | 2404.14944 | link |
2024-04-23 | CoProNN: Concept-based Prototypical Nearest Neighbors for Explaining Vision Models | Teodor Chiaburu et.al. | 2404.14830 | link |
2024-04-22 | WangLab at MEDIQA-M3G 2024: Multimodal Medical Answer Generation using Large Language Models | Ronald Xie et.al. | 2404.14567 | null |
2024-04-22 | CKD: Contrastive Knowledge Distillation from A Sample-wise Perspective | Wencheng Zhu et.al. | 2404.14109 | null |
2024-04-21 | EncodeNet: A Framework for Boosting DNN Accuracy with Entropy-driven Generalized Converting Autoencoder | Hasanul Mahmud et.al. | 2404.13770 | null |
2024-04-21 | PEACH: Pretrained-embedding Explanation Across Contextual and Hierarchical Structure | Feiqi Cao et.al. | 2404.13645 | link |
2024-04-21 | I2CANSAY:Inter-Class Analogical Augmentation and Intra-Class Significance Analysis for Non-Exemplar Online Task-Free Continual Learning | Songlin Dong et.al. | 2404.13576 | null |
2024-04-21 | IMO: Greedy Layer-Wise Sparse Representation Learning for Out-of-Distribution Text Classification with Pre-trained Models | Tao Feng et.al. | 2404.13504 | null |
2024-04-20 | Nested-TNT: Hierarchical Vision Transformers with Multi-Scale Feature Processing | Yuang Liu et.al. | 2404.13434 | null |
2024-04-20 | Evaluating Subword Tokenization: Alien Subword Composition and OOV Generalization Challenge | Khuyagbaatar Batsuren et.al. | 2404.13292 | link |
2024-04-20 | 3D-Convolution Guided Spectral-Spatial Transformer for Hyperspectral Image Classification | Shyam Varahagiri et.al. | 2404.13252 | link |
2024-04-19 | On-board classification of underwater images using hybrid classical-quantum CNN based method | Sreeraj Rajan Warrier et.al. | 2404.13130 | null |
2024-04-19 | Next Generation Loss Function for Image Classification | Shakhnaz Akhmedova et.al. | 2404.12948 | null |
2024-04-19 | A Hybrid Generative and Discriminative PointNet on Unordered Point Sets | Yang Ye et.al. | 2404.12925 | null |
2024-04-19 | Transformer-Based Classification Outcome Prediction for Multimodal Stroke Treatment | Danqing Ma et.al. | 2404.12634 | null |
2024-04-18 | When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes | Asaf Yehudai et.al. | 2404.12365 | null |
2024-04-18 | Observation, Analysis, and Solution: Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training | Jin Gao et.al. | 2404.12210 | link |
2024-04-18 | Concept Induction using LLMs: a user experiment for assessment | Adrita Barua et.al. | 2404.11875 | null |
2024-04-17 | Pretraining Billion-scale Geospatial Foundational Models on Frontier | Aristeidis Tsaris et.al. | 2404.11706 | null |
2024-04-17 | AI-Enhanced Cognitive Behavioral Therapy: Deep Learning and Large Language Models for Extracting Cognitive Pathways from Social Media Texts | Meng Jiang et.al. | 2404.11449 | null |
2024-04-17 | Achieving Rotation Invariance in Convolution Operations: Shifting from Data-Driven to Mechanism-Assured | Hanlin Mo et.al. | 2404.11309 | null |
2024-04-17 | A Progressive Framework of Vision-language Knowledge Distillation and Alignment for Multilingual Scene | Wenbo Zhang et.al. | 2404.11249 | null |
2024-04-17 | A Novel ICD Coding Framework Based on Associated and Hierarchical Code Description Distillation | Bin Zhang et.al. | 2404.11132 | null |
2024-04-17 | Small Language Models are Good Too: An Empirical Study of Zero-Shot Classification | Pierre Lepagnol et.al. | 2404.11122 | null |
2024-04-18 | Supervised Contrastive Vision Transformer for Breast Histopathological Image Classification | Mohammad Shiri et.al. | 2404.11052 | null |
2024-04-17 | InfoMatch: Entropy Neural Estimation for Semi-Supervised Image Classification | Qi Han et.al. | 2404.11003 | link |
2024-04-16 | Incubating Text Classifiers Following User Instruction with Nothing but LLM | Letian Peng et.al. | 2404.10877 | null |
2024-04-16 | Vocabulary-free Image Classification and Semantic Segmentation | Alessandro Conti et.al. | 2404.10864 | link |
2024-04-16 | Assessing The Impact of CNN Auto Encoder-Based Image Denoising on Image Classification Tasks | Mohsen Hami et.al. | 2404.10664 | null |
2024-04-16 | Tree Bandits for Generative Bayes | Sean O'Hagan et.al. | 2404.10436 | null |
2024-04-16 | AudioProtoPNet: An interpretable deep learning model for bird sound classification | René Heinrich et.al. | 2404.10420 | null |
2024-04-16 | Lighter, Better, Faster Multi-Source Domain Adaptation with Gaussian Mixture Models and Optimal Transport | Eduardo Fernandes Montesuma et.al. | 2404.10261 | null |
2024-04-15 | Distributed Federated Learning-Based Deep Learning Model for Privacy MRI Brain Tumor Detection | Lisang Zhou et.al. | 2404.10026 | null |
2024-04-15 | Interaction as Explanation: A User Interaction-based Method for Explaining Image Classification Models | Hyeonggeun Yun et.al. | 2404.09828 | null |
2024-04-15 | Quantization of Large Language Models with an Overdetermined Basis | Daniil Merkulov et.al. | 2404.09737 | null |
2024-04-15 | Pseudo-label Learning with Calibrated Confidence Using an Energy-based Model | Masahito Toba et.al. | 2404.09585 | null |
2024-04-14 | Breast Cancer Image Classification Method Based on Deep Transfer Learning | Weimin Wang et.al. | 2404.09226 | null |
2024-04-14 | Coreset Selection for Object Detection | Hojun Lee et.al. | 2404.09161 | null |
2024-04-13 | Exploring Explainability in Video Action Recognition | Avinab Saha et.al. | 2404.09067 | null |
2024-04-13 | Fast Fishing: Approximating BAIT for Efficient and Scalable Deep Active Image Classification | Denis Huseljic et.al. | 2404.08981 | link |
2024-04-13 | PM2: A New Prompting Multi-modal Model Paradigm for Few-shot Medical Image Classification | Zhenwei Wang et.al. | 2404.08915 | null |
2024-04-12 | VertAttack: Taking advantage of Text Classifiers' horizontal vision | Jonathan Rusert et.al. | 2404.08538 | null |
2024-04-12 | SpectralMamba: Efficient Mamba for Hyperspectral Image Classification | Jing Yao et.al. | 2404.08489 | null |
2024-04-12 | OTTER: Improving Zero-Shot Classification via Optimal Transport | Changho Shin et.al. | 2404.08461 | null |
2024-04-12 | A Survey of Neural Network Robustness Assessment in Image Recognition | Jie Wang et.al. | 2404.08285 | null |
2024-04-12 | Convolutional neural network classification of cancer cytopathology images: taking breast cancer as an example | MingXuan Xiao et.al. | 2404.08279 | null |
2024-04-11 | HGRN2: Gated Linear RNNs with State Expansion | Zhen Qin et.al. | 2404.07904 | link |
2024-04-11 | Exploiting Object-based and Segmentation-based Semantic Features for Deep Learning-based Indoor Scene Classification | Ricardo Pereira et.al. | 2404.07739 | null |
2024-04-11 | Contrastive-Based Deep Embeddings for Label Noise-Resilient Histopathology Image Classification | Lucas Dedieu et.al. | 2404.07605 | link |
2024-04-11 | Learning to Classify New Foods Incrementally Via Compressed Exemplars | Justin Yang et.al. | 2404.07507 | null |
2024-04-11 | Interactive Prompt Debugging with Sequence Salience | Ian Tenney et.al. | 2404.07498 | null |
2024-04-11 | Privacy preserving layer partitioning for Deep Neural Network models | Kishore Rajasekar et.al. | 2404.07437 | null |
2024-04-11 | CopilotCAD: Empowering Radiologists with Report Completion Models and Quantitative Evidence from Medical Image Foundation Models | Sheng Wang et.al. | 2404.07424 | null |
2024-04-11 | Improving Shift Invariance in Convolutional Neural Networks with Translation Invariant Polyphase Sampling | Sourajit Saha et.al. | 2404.07410 | null |
2024-04-10 | Lost in Translation: Modern Neural Networks Still Struggle With Small Realistic Image Transformations | Ofir Shifman et.al. | 2404.07153 | null |
2024-04-10 | Learning of deep convolutional network image classifiers via stochastic gradient descent and over-parametrization | Michael Kohler et.al. | 2404.07128 | null |
2024-04-10 | Accelerating Cardiac MRI Reconstruction with CMRatt: An Attention-Driven Approach | Anam Hashmi et.al. | 2404.06941 | null |
2024-04-10 | Multi-Label Continual Learning for the Medical Domain: A Novel Benchmark | Marina Ceccon et.al. | 2404.06859 | null |
2024-04-10 | Neural Optimizer Equation, Decay Function, and Learning Rate Schedule Joint Evolution | Brandon Morgan et.al. | 2404.06679 | null |
2024-04-09 | Variational Stochastic Gradient Descent for Deep Neural Networks | Haotian Chen et.al. | 2404.06549 | link |
2024-04-09 | On adversarial training and the 1 Nearest Neighbor classifier | Amir Hagai et.al. | 2404.06313 | link |
2024-04-09 | Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models | David Kurzendörfer et.al. | 2404.06309 | link |
2024-04-09 | Counterfactual Reasoning for Multi-Label Image Classification via Patching-Based Training | Ming-Kun Xie et.al. | 2404.06287 | null |
2024-04-09 | Quantum Circuit | Yuka Hashimoto et.al. | 2404.06218 | null |
2024-04-09 | VI-OOD: A Unified Representation Learning Framework for Textual Out-of-distribution Detection | Li-Ming Zhan et.al. | 2404.06217 | link |
2024-04-09 | Symmetry-guided gradient descent for quantum neural networks | Kaiming Bian et.al. | 2404.06108 | null |
2024-04-10 | Using Few-Shot Learning to Classify Primary Lung Cancer and Other Malignancy with Lung Metastasis in Cytological Imaging via Endobronchial Ultrasound Procedures | Ching-Kai Lin et.al. | 2404.06080 | null |
2024-04-08 | Neural Cellular Automata for Lightweight, Robust and Explainable Classification of White Blood Cell Images | Michael Deutges et.al. | 2404.05584 | null |
2024-04-08 | On the Convergence of Continual Learning with Adaptive Methods | Seungyub Han et.al. | 2404.05555 | null |
2024-04-08 | Multi-Task Learning for Features Extraction in Financial Annual Reports | Syrielle Montariol et.al. | 2404.05281 | link |
2024-04-08 | Allowing humans to interactively guide machines where to look does not always improve a human-AI team's classification accuracy | Giang Nguyen et.al. | 2404.05238 | null |
2024-04-08 | iVPT: Improving Task-relevant Information Sharing in Visual Prompt Tuning by Cross-layer Dynamic Connection | Nan Zhou et.al. | 2404.05207 | null |
2024-04-08 | Semantic Stealth: Adversarial Text Attacks on NLP Using Several Methods | Roopkatha Dey et.al. | 2404.05159 | null |
2024-04-07 | PairAug: What Can Augmented Image-Text Pairs Do for Radiology? | Yutong Xie et.al. | 2404.04960 | link |
2024-04-07 | GvT: A Graph-based Vision Transformer with Talking-Heads Utilizing Sparsity, Trained from Scratch on Small Datasets | Dongjing Shan et.al. | 2404.04924 | null |
2024-04-06 | Focused Active Learning for Histopathological Image Classification | Arne Schmidt et.al. | 2404.04663 | null |
2024-04-06 | Trustless Audits without Revealing Data or Models | Suppakit Waiwitlikhit et.al. | 2404.04500 | null |
2024-04-05 | Evaluating Adversarial Robustness: A Comparison Of FGSM, Carlini-Wagner Attacks, And The Role of Distillation as Defense Mechanism | Trilokesh Ranjan Sarkar et.al. | 2404.04245 | null |
2024-04-05 | Noisy Label Processing for Classification: A Survey | Mengting Li et.al. | 2404.04159 | null |
2024-04-05 | Learning Correlation Structures for Vision Transformers | Manjin Kim et.al. | 2404.03924 | null |
2024-04-05 | LiDAR-Guided Cross-Attention Fusion for Hyperspectral Band Selection and Image Classification | Judy X Yang et.al. | 2404.03883 | null |
2024-04-04 | Dendrites endow artificial neural networks with accurate, robust and parameter-efficient learning | Spyridon Chavlis et.al. | 2404.03708 | null |
2024-04-05 | A Methodology to Study the Impact of Spiking Neural Network Parameters considering Event-Based Automotive Data | Iqra Bano et.al. | 2404.03493 | null |
2024-04-04 | Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks | Lei Zhang et.al. | 2404.03340 | null |
2024-04-04 | Sparse Concept Bottleneck Models: Gumbel Tricks in Contrastive Learning | Andrei Semenov et.al. | 2404.03323 | link |
2024-04-04 | FACTUAL: A Novel Framework for Contrastive Learning Based Robust SAR Image Classification | Xu Wang et.al. | 2404.03225 | null |
2024-04-03 | Exploring the Trade-off Between Model Performance and Explanation Plausibility of Text Classifiers Using Human Rationales | Lucas E. Resck et.al. | 2404.03098 | link |
2024-04-03 | Guarantees of confidentiality via Hammersley-Chapman-Robbins bounds | Kamalika Chaudhuri et.al. | 2404.02866 | link |
2024-04-03 | FPT: Feature Prompt Tuning for Few-shot Readability Assessment | Ziyang Wang et.al. | 2404.02772 | link |
2024-04-03 | Adversarial Attacks and Dimensionality in Text Classifiers | Nandish Chattopadhyay et.al. | 2404.02660 | null |
2024-04-04 | Non-negative Subspace Feature Representation for Few-shot Learning in Medical Imaging | Keqiang Fan et.al. | 2404.02656 | null |
2024-04-03 | Adaptive Cross-lingual Text Classification through In-Context One-Shot Demonstrations | Emilio Villa-Cueva et.al. | 2404.02452 | link |
2024-04-03 | A Novel Approach to Breast Cancer Histopathological Image Classification Using Cross-Colour Space Feature Fusion and Quantum-Classical Stack Ensemble Method | Sambit Mallick et.al. | 2404.02447 | null |
2024-04-03 | Enhancing Low-Resource LLMs Classification with PEFT and Synthetic Data | Parth Patwa et.al. | 2404.02422 | null |
2024-04-02 | Smooth Deep Saliency | Rudolf Herdt et.al. | 2404.02282 | null |
2024-04-02 | Visual Concept Connectome (VCC): Open World Concept Discovery and their Interlayer Connections in Deep Models | Matthew Kowal et.al. | 2404.02233 | null |
2024-04-02 | ImageNot: A contrast with ImageNet preserves model rankings | Olawale Salaudeen et.al. | 2404.02112 | null |
2024-04-02 | Explainability in JupyterLab and Beyond: Interactive XAI Systems for Integrated and Collaborative Workflows | Grace Guo et.al. | 2404.02081 | null |
2024-04-02 | Ukrainian Texts Classification: Exploration of Cross-lingual Knowledge Transfer Approaches | Daryna Dementieva et.al. | 2404.02043 | null |
2024-04-02 | CAM-Based Methods Can See through Walls | Magamed Taimeskhanov et.al. | 2404.01964 | link |
2024-04-02 | Beyond Image Super-Resolution for Image Recognition with Task-Driven Perceptual Loss | Jaeha Kim et.al. | 2404.01692 | null |
2024-04-02 | A Universal Knowledge Embedded Contrastive Learning Framework for Hyperspectral Image Classification | Quanwei Liu et.al. | 2404.01673 | null |
2024-04-01 | Can Biases in ImageNet Models Explain Generalization? | Paul Gavrikov et.al. | 2404.01509 | link |
2024-04-01 | Parallel Proportional Fusion of Spiking Quantum Neural Network for Optimizing Image Classification | Zuyu Xu et.al. | 2404.01359 | null |
2024-04-01 | Bridging Remote Sensors with Multisensor Geospatial Foundation Models | Boran Han et.al. | 2404.01260 | link |
2024-04-01 | Diagnosis of Skin Cancer Using VGG16 and VGG19 Based Transfer Learning Models | Amir Faghihi et.al. | 2404.01160 | null |
2024-03-29 | Learn "No" to Say "Yes" Better: Improving Vision-Language Models via Negations | Jaisidh Singh et.al. | 2403.20312 | link |
2024-03-29 | MCNet: A crowd density estimation network based on integrating multiscale attention module | Qiang Guo et.al. | 2403.20173 | null |
2024-03-29 | Segmentation, Classification and Interpretation of Breast Cancer Medical Images using Human-in-the-Loop Machine Learning | David Vázquez-Lema et.al. | 2403.20112 | null |
2024-03-29 | Adverb Is the Key: Simple Text Data Augmentation with Adverb Deletion | Juhwan Choi et.al. | 2403.20015 | null |
2024-03-29 | Diverse Feature Learning by Self-distillation and Reset | Sejik Park et.al. | 2403.19941 | null |
2024-03-29 | Heterogeneous Network Based Contrastive Learning Method for PolSAR Land Cover Classification | Jianfeng Cai et.al. | 2403.19902 | link |
2024-03-28 | X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization | Anna Kukleva et.al. | 2403.19811 | link |
2024-03-28 | RSMamba: Remote Sensing Image Classification with State Space Model | Keyan Chen et.al. | 2403.19654 | link |
2024-03-28 | Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model | Zhicai Wang et.al. | 2403.19600 | link |
2024-03-28 | The Bad Batches: Enhancing Self-Supervised Learning in Image Classification Through Representative Batch Curation | Ozgu Goksu et.al. | 2403.19579 | null |
2024-03-28 | Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach | Wei Dong et.al. | 2403.19067 | link |
2024-03-27 | Evaluating Large Language Models for Health-Related Text Classification Tasks with Public Social Media Data | Yuting Guo et.al. | 2403.19031 | null |
2024-03-27 | Robustness and Visual Explanation for Black Box Image, Video, and ECG Signal Classification with Reinforcement Learning | Soumyendu Sarkar et.al. | 2403.18985 | null |
2024-03-27 | The Impact of Uniform Inputs on Activation Sparsity and Energy-Latency Attacks in Computer Vision | Andreas Müller et.al. | 2403.18587 | link |
2024-03-27 | Uncertainty-Aware SAR ATR: Defending Against Adversarial Attacks via Bayesian Neural Networks | Tian Ye et.al. | 2403.18318 | null |
2024-03-27 | Multi-scale Unified Network for Image Classification | Wenzhuo Liu et.al. | 2403.18294 | null |
2024-03-26 | The Need for Speed: Pruning Transformers with One Recipe | Samir Khaki et.al. | 2403.17921 | link |
2024-03-26 | Compressed Multi-task embeddings for Data-Efficient Downstream training and inference in Earth Observation | Carlos Gomes et.al. | 2403.17886 | null |
2024-03-26 | PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition | Chenhongyi Yang et.al. | 2403.17695 | link |
2024-03-26 | Language Models for Text Classification: Is In-Context Learning Enough? | Aleksandra Edwards et.al. | 2403.17661 | null |
2024-03-26 | Boosting Few-Shot Learning with Disentangled Self-Supervised Learning and Meta-Learning for Medical Image Classification | Eva Pachetti et.al. | 2403.17530 | null |
2024-03-26 | HILL: Hierarchy-aware Information Lossless Contrastive Learning for Hierarchical Text Classification | He Zhu et.al. | 2403.17307 | link |
2024-03-25 | Histogram Layers for Neural Engineered Features | Joshua Peeples et.al. | 2403.17176 | link |
2024-03-25 | Task2Box: Box Embeddings for Modeling Asymmetric Task Relationships | Rangel Daroya et.al. | 2403.17173 | link |
2024-03-25 | CipherFormer: Efficient Transformer Private Inference with Low Round Complexity | Weize Wang et.al. | 2403.16860 | null |
2024-03-25 | Assessing the Performance of Deep Learning for Automated Gleason Grading in Prostate Cancer | Dominik Müller et.al. | 2403.16695 | null |
2024-03-25 | DeepGleason: a System for Automated Gleason Grading of Prostate Cancer using Deep Neural Networks | Dominik Müller et.al. | 2403.16678 | link |
2024-03-25 | LARA: Linguistic-Adaptive Retrieval-Augmented LLMs for Multi-Turn Intent Classification | Liu Junhua et.al. | 2403.16504 | null |
2024-03-24 | On machine learning analysis of atomic force microscopy images for image classification, sample surface recognition | Igor Sokolov et.al. | 2403.16230 | null |
2024-03-24 | Leveraging Deep Learning and Xception Architecture for High-Accuracy MRI Classification in Alzheimer Diagnosis | Shaojie Li et.al. | 2403.16212 | null |
2024-03-24 | Multi-Task Learning with Multi-Task Optimization | Lu Bai et.al. | 2403.16162 | null |
2024-03-24 | CBGT-Net: A Neuromimetic Architecture for Robust Classification of Streaming Data | Shreya Sharma et.al. | 2403.15974 | link |
2024-03-23 | A Deep Learning Architectures for Kidney Disease Classification | Muhammad Shoaib Farooq et.al. | 2403.15895 | null |
2024-03-23 | VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding | Phong Nguyen-Thuan Do et.al. | 2403.15882 | null |
2024-03-23 | VLM-CPL: Consensus Pseudo Labels from Vision-Language Models for Human Annotation-Free Pathological Image Classification | Lanfeng Zhong et.al. | 2403.15836 | null |
2024-03-22 | Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion | Sofia Casarin et.al. | 2403.15194 | null |
2024-03-22 | Image Classification with Rotation-Invariant Variational Quantum Circuits | Paul San Sebastian et.al. | 2403.15031 | null |
2024-03-22 | Extracting Human Attention through Crowdsourced Patch Labeling | Minsuk Chang et.al. | 2403.15013 | null |
2024-03-22 | Clean-image Backdoor Attacks | Dazhong Rong et.al. | 2403.15010 | null |
2024-03-22 | ParFormer: Vision Transformer Baseline with Parallel Local Global Token Mixer and Convolution Attention Patch Embedding | Novendra Setyawan et.al. | 2403.15004 | null |
2024-03-22 | MasonTigers at SemEval-2024 Task 8: Performance Analysis of Transformer-based Models on Machine-Generated Text Detection | Sadiya Sayara Chowdhury Puspo et.al. | 2403.14989 | null |
2024-03-21 | Learning with SASQuaTCh: a Novel Variational Quantum Transformer Architecture with Kernel-Based Self-Attention | Ethan N. Evans et.al. | 2403.14753 | null |
2024-03-21 | Estimating Physical Information Consistency of Channel Data Augmentation for Remote Sensing Images | Tom Burgert et.al. | 2403.14547 | null |
2024-03-21 | Multi-Level Explanations for Generative Language Models | Lucas Monteiro Paes et.al. | 2403.14459 | null |
2024-03-21 | Tensor network compressibility of convolutional models | Sukhbinder Singh et.al. | 2403.14379 | null |
2024-03-21 | LayoutLLM: Large Language Model Instruction Tuning for Visually Rich Document Understanding | Masato Fujitake et.al. | 2403.14252 | null |
2024-03-21 | Safeguarding Medical Image Segmentation Datasets against Unauthorized Training via Contour- and Texture-Aware Perturbations | Xun Lin et.al. | 2403.14250 | null |
2024-03-21 | Improving Image Classification Accuracy through Complementary Intra-Class and Inter-Class Mixup | Ye Xu et.al. | 2403.14137 | link |
2024-03-20 | Bridge the Modality and Capacity Gaps in Vision-Language Model Selection | Chao Yi et.al. | 2403.13797 | null |
2024-03-20 | Leveraging feature communication in federated learning for remote sensing image classification | Anh-Kiet Duong et.al. | 2403.13575 | null |
2024-03-20 | MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Di Wang et.al. | 2403.13430 | link |
2024-03-20 | Building Optimal Neural Architectures using Interpretable Knowledge | Keith G. Mills et.al. | 2403.13293 | link |
2024-03-19 | LUWA Dataset: Learning Lithic Use-Wear Analysis on Microscopic Images | Jing Zhang et.al. | 2403.13171 | null |
2024-03-19 | Improved EATFormer: A Vision Transformer for Medical Image Classification | Yulong Shisu et.al. | 2403.13167 | null |
2024-03-19 | SIFT-DBT: Self-supervised Initialization and Fine-Tuning for Imbalanced Digital Breast Tomosynthesis Image Classification | Yuexi Du et.al. | 2403.13148 | link |
2024-03-19 | Using evolutionary computation to optimize task performance of unclocked, recurrent Boolean circuits in FPGAs | Raphael Norman-Tenazas et.al. | 2403.13105 | null |
2024-03-19 | Investigating Text Shortening Strategy in BERT: Truncation vs Summarization | Mirza Alim Mutasodirin et.al. | 2403.12799 | link |
2024-03-18 | Posterior Uncertainty Quantification in Neural Networks using Data Augmentation | Luhuan Wu et.al. | 2403.12729 | null |
2024-03-19 | SEVEN: Pruning Transformer Model by Reserving Sentinels | Jinying Xiao et.al. | 2403.12688 | link |
2024-03-19 | Simple Hack for Transformers against Heavy Long-Text Classification on a Time- and Memory-Limited GPU Service | Mirza Alim Mutasodirin et.al. | 2403.12563 | null |
2024-03-19 | Prompt-Guided Adaptive Model Transformation for Whole Slide Image Classification | Yi Lin et.al. | 2403.12537 | null |
2024-03-19 | CrossTune: Black-Box Few-Shot Classification with Label Enhancement | Danqing Luo et.al. | 2403.12468 | null |
2024-03-18 | Generalizing deep learning models for medical image classification | Matta Sarah et.al. | 2403.12167 | null |
2024-03-19 | Leveraging Spatial and Semantic Feature Extraction for Skin Cancer Diagnosis with Capsule Networks and Graph Neural Networks | K. P. Santoso et.al. | 2403.12009 | null |
2024-03-18 | High-energy physics image classification: A Survey of Jet Applications | Hamza Kheddar et.al. | 2403.11934 | null |
2024-03-18 | Better (pseudo-)labels for semi-supervised instance segmentation | François Porcher et.al. | 2403.11675 | null |
2024-03-18 | Continual Forgetting for Pre-trained Vision Models | Hongbo Zhao et.al. | 2403.11530 | link |
2024-03-18 | Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting | Mingkui Tan et.al. | 2403.11491 | null |
2024-03-17 | Potential of Domain Adaptation in Machine Learning in Ecology and Hydrology to Improve Model Extrapolability | Haiyang Shi et.al. | 2403.11331 | null |
2024-03-17 | A Modified Word Saliency-Based Adversarial Attack on Text Classification Models | Hetvi Waghela et.al. | 2403.11297 | null |
2024-03-17 | Forging the Forger: An Attempt to Improve Authorship Verification via Data Augmentation | Silvia Corbara et.al. | 2403.11265 | null |
2024-03-17 | Multiple Teachers-Meticulous Student: A Domain Adaptive Meta-Knowledge Distillation Model for Medical Image Classification | Shahabedin Nabavi et.al. | 2403.11226 | null |
2024-03-16 | Forward Learning of Graph Neural Networks | Namyong Park et.al. | 2403.11004 | null |
2024-03-16 | Understanding Robustness of Visual State Space Models for Image Classification | Chengbin Du et.al. | 2403.10935 | null |
2024-03-16 | Automatic location detection based on deep learning | Anjali Karangiya et.al. | 2403.10912 | null |
2024-03-14 | Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models | Akhil Kedia et.al. | 2403.09635 | link |
2024-03-14 | XCoOp: Explainable Prompt Learning for Computer-Aided Diagnosis via Concept-guided Context Optimization | Yequan Bie et.al. | 2403.09410 | null |
2024-03-14 | ConDiSR: Contrastive Disentanglement and Style Regularization for Single Domain Generalization | Aleksandr Matsun et.al. | 2403.09400 | null |
2024-03-14 | A Hierarchical Fused Quantum Fuzzy Neural Network for Image Classification | Sheng-Yao Wu et.al. | 2403.09318 | null |
2024-03-14 | CLIP-EBC: CLIP Can Count Accurately through Enhanced Blockwise Classification | Yiming Ma et.al. | 2403.09281 | null |
2024-03-14 | Are Vision Language Models Texture or Shape Biased and Can We Steer Them? | Paul Gavrikov et.al. | 2403.09193 | null |
2024-03-14 | Randomized Principal Component Analysis for Hyperspectral Image Classification | Mustafa Ustuner et.al. | 2403.09117 | null |
2024-03-14 | CardioCaps: Attention-based Capsule Network for Class-Imbalanced Echocardiogram Classification | Hyunkyung Han et.al. | 2403.09108 | link |
2024-03-14 | The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models? | Qinyu Zhao et.al. | 2403.09037 | link |
2024-03-13 | PathM3: A Multimodal Multi-Task Multiple Instance Learning Framework for Whole Slide Image Classification and Captioning | Qifeng Zhou et.al. | 2403.08967 | null |
2024-03-13 | DAM: Dynamic Adapter Merging for Continual Video QA Learning | Feng Cheng et.al. | 2403.08755 | link |
2024-03-13 | Leveraging Compressed Frame Sizes For Ultra-Fast Video Classification | Yuxing Han et.al. | 2403.08580 | null |
2024-03-13 | HOLMES: HOLonym-MEronym based Semantic inspection for Convolutional Image Classifiers | Francesco Dibitonto et.al. | 2403.08536 | link |
2024-03-13 | Pig aggression classification using CNN, Transformers and Recurrent Networks | Junior Silva Souza et.al. | 2403.08528 | null |
2024-03-13 | Reduced Jeffries-Matusita distance: A Novel Loss Function to Improve Generalization Performance of Deep Classification Models | Mohammad Lashkari et.al. | 2403.08408 | null |
2024-03-13 | Iterative Online Image Synthesis via Diffusion Model for Imbalanced Classification | Shuhan Li et.al. | 2403.08407 | null |
2024-03-13 | Advancing Security in AI Systems: A Novel Approach to Detecting Backdoors in Deep Neural Networks | Khondoker Murad Hossain et.al. | 2403.08208 | null |
2024-03-13 | Multiscale Low-Frequency Memory Network for Improved Feature Extraction in Convolutional Neural Networks | Fuzhi Wu et.al. | 2403.08157 | link |
2024-03-12 | Harnessing Artificial Intelligence to Combat Online Hate: Exploring the Challenges and Opportunities of Large Language Models in Hate Speech Detection | Tharindu Kumarage et.al. | 2403.08035 | null |
2024-03-13 | Visual Decoding and Reconstruction via EEG Embeddings with Guided Diffusion | Dongyang Li et.al. | 2403.07721 | link |
2024-03-12 | FPT: Fine-grained Prompt Tuning for Parameter and Memory Efficient Fine Tuning in High-resolution Medical Image Classification | Yijin Huang et.al. | 2403.07576 | null |
2024-03-12 | Backdoor Attack with Mode Mixture Latent Modification | Hongwei Zhang et.al. | 2403.07463 | null |
2024-03-12 | In-context learning enables multimodal large language models to classify cancer pathology images | Dyke Ferber et.al. | 2403.07407 | null |
2024-03-12 | Premonition: Using Generative Models to Preempt Future Data Changes in Continual Learning | Mark D. McDonnell et.al. | 2403.07356 | null |
2024-03-12 | How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance | Hongkang Li et.al. | 2403.07310 | null |
2024-03-12 | A Bayesian Approach to OOD Robustness in Image Classification | Prakhar Kaushik et.al. | 2403.07277 | null |
2024-03-11 | LeOCLR: Leveraging Original Images for Contrastive Learning of Visual Representations | Mohammad Alkhalefi et.al. | 2403.06813 | null |
2024-03-11 | Dynamic Perturbation-Adaptive Adversarial Training on Medical Image Classification | Shuai Li et.al. | 2403.06798 | null |
2024-03-11 | Leveraging Internal Representations of Model for Magnetic Image Classification | Adarsh N L et.al. | 2403.06797 | null |
2024-03-11 | Shortcut Learning in Medical Image Segmentation | Manxi Lin et.al. | 2403.06748 | null |
2024-03-11 | Active Generation for Image Classification | Tao Huang et.al. | 2403.06517 | null |
2024-03-11 | Evolving Knowledge Distillation with Large Language Models and Active Learning | Chengyuan Liu et.al. | 2403.06414 | null |
2024-03-11 | 'One size doesn't fit all': Learning how many Examples to use for In-Context Learning for Improved Text Classification | Manish Chandra et.al. | 2403.06402 | null |
2024-03-10 | Probing Image Compression For Class-Incremental Learning | Justin Yang et.al. | 2403.06288 | null |
2024-03-10 | Bayesian Random Semantic Data Augmentation for Medical Image Classification | Yaoyao Zhu et.al. | 2403.06138 | link |
2024-03-10 | Universal Debiased Editing for Fair Medical Image Classification | Ruinan Jin et.al. | 2403.06104 | null |
2024-03-08 | Tune without Validation: Searching for Learning Rate and Weight Decay on Training Sets | Lorenzo Brigato et.al. | 2403.05532 | null |
2024-03-08 | Generalized Correspondence Matching via Flexible Hierarchical Refinement and Patch Descriptor Distillation | Yu Han et.al. | 2403.05388 | null |
2024-03-08 | The Impact of Quantization on the Robustness of Transformer-based Text Classifiers | Seyed Parsa Neshaei et.al. | 2403.05365 | null |
2024-03-08 | Multiple Instance Learning with random sampling for Whole Slide Image Classification | H. Keshvarikhojasteh et.al. | 2403.05351 | null |
2024-03-08 | Learning Expressive And Generalizable Motion Features For Face Forgery Detection | Jingyi Zhang et.al. | 2403.05172 | null |
2024-03-08 | Defending Against Unforeseen Failure Modes with Latent Adversarial Training | Stephen Casper et.al. | 2403.05030 | link |
2024-03-07 | Fooling Neural Networks for Motion Forecasting via Adversarial Attacks | Edgar Medina et.al. | 2403.04954 | null |
2024-03-07 | T-TAME: Trainable Attention Mechanism for Explaining Convolutional Networks and Vision Transformers | Mariano V. Ntrougkas et.al. | 2403.04523 | null |
2024-03-07 | Source Matters: Source Dataset Impact on Model Robustness in Medical Imaging | Dovile Juodelyte et.al. | 2403.04484 | link |
2024-03-07 | Advancing Biomedical Text Mining with Community Challenges | Hui Zong et.al. | 2403.04261 | null |
2024-03-07 | Scalable On-Chip Optical Linear Processing Unit Using a Single Thin-Film Lithium Niobate Ring Modulator | Zhaoang Deng et.al. | 2403.04216 | null |
2024-03-07 | Scalable and Robust Transformer Decoders for Interpretable Image Classification with Foundation Models | Evelyn Mannix et.al. | 2403.04125 | null |
2024-03-07 | Privacy-preserving Fine-tuning of Large Language Models through Flatness | Tiejin Chen et.al. | 2403.04124 | null |
2024-03-06 | MedMamba: Vision Mamba for Medical Image Classification | Yubiao Yue et.al. | 2403.03849 | link |
2024-03-06 | On the Effectiveness of Distillation in Mitigating Backdoors in Pre-trained Encoder | Tingxu Han et.al. | 2403.03846 | link |
2024-03-06 | RADIA -- Radio Advertisement Detection with Intelligent Analytics | Jorge Álvarez et.al. | 2403.03538 | null |
2024-03-06 | Inverse-Free Fast Natural Gradient Descent Method for Deep Learning | Xinwei Ou et.al. | 2403.03473 | null |
2024-03-06 | Sparse Spiking Neural Network: Exploiting Heterogeneity in Timescales for Pruning Recurrent SNN | Biswadeep Chakraborty et.al. | 2403.03409 | null |
2024-03-05 | RulePrompt: Weakly Supervised Text Classification with Prompting PLMs and Self-Iterative Logical Rules | Miaomiao Li et.al. | 2403.02932 | link |
2024-03-05 | Demonstrating Mutual Reinforcement Effect through Information Flow | Chengguang Gan et.al. | 2403.02902 | null |
2024-03-05 | Quantum Mixed-State Self-Attention Network | Fu Chen et.al. | 2403.02871 | null |
2024-03-05 | SOFIM: Stochastic Optimization Using Regularized Fisher Information Matrix | Gayathri C et.al. | 2403.02833 | null |
2024-03-05 | SGD with Partial Hessian for Deep Neural Networks Optimization | Ying Sun et.al. | 2403.02681 | link |
2024-03-05 | G-EvoNAS: Evolutionary Neural Architecture Search Based on Network Growth | Juan Zou et.al. | 2403.02667 | null |
2024-03-05 | Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad | Sayantan Choudhury et.al. | 2403.02648 | link |
2024-03-05 | Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use | Imad Eddine Toubal et.al. | 2403.02626 | null |
2024-03-04 | When do Convolutional Neural Networks Stop Learning? | Sahan Ahmad et.al. | 2403.02473 | link |
2024-03-04 | NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function | Abdullah Nazhat Abdullah et.al. | 2403.02411 | link |
2024-03-02 | Can a Confident Prior Replace a Cold Posterior? | Martin Marek et.al. | 2403.01272 | link |
2024-03-02 | Leveraging Self-Supervised Learning for Scene Recognition in Child Sexual Abuse Imagery | Pedro H. V. Valois et.al. | 2403.01183 | null |
2024-03-02 | Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation | Lian Xu et.al. | 2403.01156 | null |
2024-03-02 | ELA: Efficient Local Attention for Deep Convolutional Neural Networks | Wei Xu et.al. | 2403.01123 | null |
2024-03-01 | Margin Discrepancy-based Adversarial Training for Multi-Domain Text Classification | Yuan Wu et.al. | 2403.00888 | null |
2024-03-01 | Text classification of column headers with a controlled vocabulary: leveraging LLMs for metadata enrichment | Margherita Martorana et.al. | 2403.00884 | null |
2024-03-01 | SURE: SUrvey REcipes for building reliable and robust deep networks | Yuting Li et.al. | 2403.00543 | link |
2024-03-01 | Invariant Test-Time Adaptation for Vision-Language Model Generalization | Huan Ma et.al. | 2403.00376 | null |
2024-02-29 | TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision | Yunyi Zhang et.al. | 2403.00165 | null |
2024-02-29 | Assessing Visually-Continuous Corruption Robustness of Neural Networks Relative to Human Performance | Huakun Shen et.al. | 2402.19401 | null |
2024-02-29 | Stitching Gaps: Fusing Situated Perceptual Knowledge with Vision Transformers for High-Level Image Classification | Delfina Sol Martinez Pandiani et.al. | 2402.19339 | null |
2024-02-29 | Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction | Hao Li et.al. | 2402.19326 | null |
2024-02-29 | Decompose-and-Compose: A Compositional Approach to Mitigating Spurious Correlation | Fahimeh Hosseini Noohdani et.al. | 2402.18919 | null |
2024-02-29 | Utilizing Local Hierarchy with Adversarial Training for Hierarchical Text Classification | Zihan Wang et.al. | 2402.18825 | link |
2024-02-28 | Comparing Importance Sampling Based Methods for Mitigating the Effect of Class Imbalance | Indu Panigrahi et.al. | 2402.18742 | link |
2024-02-28 | Deep Neural Network Models Trained With A Fixed Random Classifier Transfer Better Across Domains | Hafiz Tiomoko Ali et.al. | 2402.18614 | null |
2024-02-28 | Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling | Mahdi Karami et.al. | 2402.18508 | null |
2024-02-28 | Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization | Deng Li et.al. | 2402.18447 | null |
2024-02-29 | A Modular System for Enhanced Robustness of Multimedia Understanding Networks via Deep Parametric Estimation | Francesco Barbato et.al. | 2402.18402 | null |
2024-02-28 | A Multimodal Handover Failure Detection Dataset and Baselines | Santosh Thoduka et.al. | 2402.18319 | null |
2024-02-28 | Classes Are Not Equal: An Empirical Study on Image Recognition Fairness | Jiequan Cui et.al. | 2402.18133 | null |
2024-02-27 | Understanding Neural Network Binarization with Forward and Backward Proximal Quantizers | Yiwei Lu et.al. | 2402.17710 | null |
2024-02-27 | SDF2Net: Shallow to Deep Feature Fusion Network for PolSAR Image Classification | Mohammed Q. Alkhatib et.al. | 2402.17672 | link |
2024-02-27 | Predict the Next Word: | Evgenia Ilia et.al. | 2402.17527 | null |
2024-02-27 | Scaling Supervised Local Learning with Augmented Auxiliary Networks | Chenxiang Ma et.al. | 2402.17318 | link |
2024-02-26 | Offline Writer Identification Using Convolutional Neural Network Activation Features | Vincent Christlein et.al. | 2402.17029 | null |
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2025-04-10 | Detect Anything 3D in the Wild | Hanxue Zhang et.al. | 2504.07958 | null |
2025-04-10 | Pychop: Emulating Low-Precision Arithmetic in Numerical Methods and Neural Networks | Erin Carson et.al. | 2504.07835 | null |
2025-04-10 | P2Object: Single Point Supervised Object Detection and Instance Segmentation | Pengfei Chen et.al. | 2504.07813 | null |
2025-04-10 | Nonlocal Retinex-Based Variational Model and its Deep Unfolding Twin for Low-Light Image Enhancement | Daniel Torres et.al. | 2504.07810 | null |
2025-04-10 | Adaptive Detection of Fast Moving Celestial Objects Using a Mixture of Experts and Physical-Inspired Neural Network | Peng Jia et.al. | 2504.07777 | null |
2025-04-10 | Prediction of Usage Probabilities of Shopping-Mall Corridors Using Heterogeneous Graph Neural Networks | Malik M Barakathullah et.al. | 2504.07645 | null |
2025-04-10 | VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model | Haozhan Shen et.al. | 2504.07615 | null |
2025-04-10 | RASMD: RGB And SWIR Multispectral Driving Dataset for Robust Perception in Adverse Conditions | Youngwan Jin et.al. | 2504.07603 | null |
2025-04-10 | WS-DETR: Robust Water Surface Object Detection through Vision-Radar Fusion with Detection Transformer | Huilin Yin et.al. | 2504.07441 | null |
2025-04-10 | Model Discrepancy Learning: Synthetic Faces Detection Based on Multi-Reconstruction | Qingchao Jiang et.al. | 2504.07382 | null |
2025-04-09 | Generalized Semantic Contrastive Learning via Embedding Side Information for Few-Shot Object Detection | Ruoyu Chen et.al. | 2504.07060 | null |
2025-04-09 | UAV Position Estimation using a LiDAR-based 3D Object Detection Method | Uthman Olawoye et.al. | 2504.07028 | null |
2025-04-09 | Towards Efficient Roadside LiDAR Deployment: A Fast Surrogate Metric Based on Entropy-Guided Visibility | Yuze Jiang et.al. | 2504.06772 | null |
2025-04-09 | Domain-Conditioned Scene Graphs for State-Grounded Task Planning | Jonas Herzog et.al. | 2504.06661 | null |
2025-04-09 | Visually Similar Pair Alignment for Robust Cross-Domain Object Detection | Onkar Krishna et.al. | 2504.06607 | null |
2025-04-08 | From Broadcast to Minimap: Achieving State-of-the-Art SoccerNet Game State Reconstruction | Vladimir Golovkin et.al. | 2504.06357 | null |
2025-04-08 | Analyzing the Impact of Low-Rank Adaptation for Cross-Domain Few-Shot Object Detection in Aerial Images | Hicham Talaoubrid et.al. | 2504.06330 | null |
2025-04-08 | Security Analysis of Thumbnail-Preserving Image Encryption and a New Framework | Dong Xie et.al. | 2504.06083 | null |
2025-04-08 | Balancing long- and short-term dynamics for the modeling of saliency in videos | Theodor Wulff et.al. | 2504.05913 | null |
2025-04-08 | PRIMEDrive-CoT: A Precognitive Chain-of-Thought Framework for Uncertainty-Aware Object Interaction in Driving Scene Scenario | Sriram Mandalika et.al. | 2504.05908 | null |
2025-04-08 | Intrinsic Saliency Guided Trunk-Collateral Network for Unsupervised Video Object Segmentation | Xiangyu Zheng et.al. | 2504.05904 | null |
2025-04-08 | KAN-SAM: Kolmogorov-Arnold Network Guided Segment Anything Model for RGB-T Salient Object Detection | Xingyuan Li et.al. | 2504.05878 | null |
2025-04-08 | DefMamba: Deformable Visual State Space Model | Leiye Liu et.al. | 2504.05794 | null |
2025-04-08 | Event-based Civil Infrastructure Visual Defect Detection: ev-CIVIL Dataset and Benchmark | Udayanga G. W. K. N. Gamage et.al. | 2504.05679 | null |
2025-04-08 | POD: Predictive Object Detection with Single-Frame FMCW LiDAR Point Cloud | Yining Shi et.al. | 2504.05649 | null |
2025-04-08 | AD-Det: Boosting Object Detection in UAV Images with Focused Small Objects and Balanced Tail Classes | Zhenteng Li et.al. | 2504.05601 | null |
2025-04-07 | SSLFusion: Scale & Space Aligned Latent Fusion Model for Multimodal 3D Object Detection | Bonan Ding et.al. | 2504.05170 | null |
2025-04-07 | Inland Waterway Object Detection in Multi-environment: Dataset and Approach | Shanshan Wang et.al. | 2504.04835 | null |
2025-04-07 | Playing Non-Embedded Card-Based Games with Reinforcement Learning | Tianyang Wu et.al. | 2504.04783 | null |
2025-04-07 | Feedback-Enhanced Hallucination-Resistant Vision-Language Model for Real-Time Scene Understanding | Zahir Alsulaimawi et.al. | 2504.04772 | null |
2025-04-07 | Inverse++: Vision-Centric 3D Semantic Occupancy Prediction Assisted with 3D Object Detection | Zhenxing Ming et.al. | 2504.04732 | null |
2025-04-06 | Enhance Then Search: An Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection | Jiancheng Pan et.al. | 2504.04517 | link |
2025-04-06 | eKalibr-Stereo: Continuous-Time Spatiotemporal Calibration for Event-Based Stereo Visual Systems | Shuolong Chen et.al. | 2504.04451 | link |
2025-04-05 | Autoregressive High-Order Finite Difference Modulo Imaging: High-Dynamic Range for Computer Vision Applications | Brayan Monroy et.al. | 2504.04228 | null |
2025-04-05 | An Optimized Density-Based Lane Keeping System for A Cost-Efficient Autonomous Vehicle Platform: AurigaBot V1 | Farbod Younesi et.al. | 2504.04217 | null |
2025-04-05 | Learning about the Physical World through Analytic Concepts | Jianhua Sun et.al. | 2504.04170 | null |
2025-04-04 | VISTA-OCR: Towards generative and interactive end to end OCR models | Laziz Hamdi et.al. | 2504.03621 | null |
2025-04-04 | PF3Det: A Prompted Foundation Feature Assisted Visual LiDAR 3D Detector | Kaidong Li et.al. | 2504.03563 | null |
2025-04-04 | ZFusion: An Effective Fuser of Camera and 4D Radar for 3D Object Perception in Autonomous Driving | Sheng Yang et.al. | 2504.03438 | null |
2025-04-04 | Infrared bubble recognition in the Milky Way and beyond using deep learning | Shimpei Nishimoto et.al. | 2504.03367 | null |
2025-04-04 | Real-Time Roadway Obstacle Detection for Electric Scooters Using Deep Learning and Multi-Sensor Fusion | Zeyang Zheng et.al. | 2504.03171 | null |
2025-04-04 | Finding the Reflection Point: Unpadding Images to Remove Data Augmentation Artifacts in Large Open Source Image Datasets for Machine Learning | Lucas Choi et.al. | 2504.03168 | null |
2025-04-03 | Attention-Aware Multi-View Pedestrian Tracking | Reef Alturki et.al. | 2504.03047 | null |
2025-04-03 | LiDAR-based Object Detection with Real-time Voice Specifications | Anurag Kulkarni et.al. | 2504.02920 | null |
2025-04-03 | BOP Challenge 2024 on Model-Based and Model-Free 6D Object Pose Estimation | Van Nguyen Nguyen et.al. | 2504.02812 | null |
2025-04-03 | Rip Current Segmentation: A Novel Benchmark and YOLOv8 Baseline Results | Andrei Dumitriu et.al. | 2504.02558 | null |
2025-04-03 | Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision | Xiaofeng Han et.al. | 2504.02477 | null |
2025-04-03 | CornerPoint3D: Look at the Nearest Corner Instead of the Center | Ruixiao Zhang et.al. | 2504.02464 | null |
2025-04-03 | Hyperspectral Remote Sensing Images Salient Object Detection: The First Benchmark Dataset and Baseline | Peifu Liu et.al. | 2504.02416 | null |
2025-04-03 | SemiISP/SemiIE: Semi-Supervised Image Signal Processor and Image Enhancement Leveraging One-to-Many Mapping sRGB-to-RAW | Masakazu Yoshimura et.al. | 2504.02345 | null |
2025-04-03 | Improving Harmful Text Detection with Joint Retrieval and External Knowledge | Zidong Yu et.al. | 2504.02310 | null |
2025-04-03 | LLM-Guided Evolution: An Autonomous Model Optimization for Object Detection | YiMing Yu et.al. | 2504.02280 | null |
2025-04-02 | Cat-Eye Inspired Active-Passive-Composite Aperture-Shared Sub-Terahertz Meta-Imager for Non-Interactive Concealed Object Detection | Mingshuang Hu et.al. | 2504.01473 | null |
2025-04-02 | CFMD: Dynamic Cross-layer Feature Fusion for Salient Object Detection | Jin Lian et.al. | 2504.01326 | null |
2025-04-01 | Enabling Efficient Processing of Spiking Neural Networks with On-Chip Learning on Commodity Neuromorphic Processors for Edge AI Systems | Rachmad Vidya Wicaksana Putra et.al. | 2504.00957 | null |
2025-04-01 | NeuRadar: Neural Radiance Fields for Automotive Radar Point Clouds | Mahan Rafidashti et.al. | 2504.00859 | null |
2025-04-01 | AttentiveGRU: Recurrent Spatio-Temporal Modeling for Advanced Radar-Based BEV Object Detection | Loveneet Saini et.al. | 2504.00559 | null |
2025-04-01 | Archival Faces: Detection of Faces in Digitized Historical Documents | Marek Vaško et.al. | 2504.00558 | null |
2025-04-01 | High-Quality Pseudo-Label Generation Based on Visual Prompt Assisted Cloud Model Update | Xinrun Xu et.al. | 2504.00526 | null |
2025-04-01 | Intrinsic-feature-guided 3D Object Detection | Wanjing Zhang et.al. | 2504.00382 | null |
2025-04-01 | CamoSAM2: Motion-Appearance Induced Auto-Refining Prompts for Video Camouflaged Object Detection | Xin Zhang et.al. | 2504.00375 | null |
2025-03-31 | Towards Precise Action Spotting: Addressing Temporal Misalignment in Labels with Dynamic Label Assignment | Masato Tamura et.al. | 2504.00149 | null |
2025-03-31 | SU-YOLO: Spiking Neural Network for Efficient Underwater Object Detection | Chenyang Li et.al. | 2503.24389 | link |
2025-03-31 | MB-ORES: A Multi-Branch Object Reasoner for Visual Grounding in Remote Sensing | Karim Radouane et.al. | 2503.24219 | link |
2025-03-31 | Spectral-Adaptive Modulation Networks for Visual Perception | Guhnoo Yun et.al. | 2503.23947 | null |
2025-03-31 | Reliable Traffic Monitoring Using Low-Cost Doppler Radar Units | Mishay Naidoo et.al. | 2503.23926 | null |
2025-03-31 | Expanding-and-Shrinking Binary Neural Networks | Xulong Shi et.al. | 2503.23709 | link |
2025-03-30 | Beyond Detection: Designing AI-Resilient Assessments with Automated Feedback Tool to Foster Critical Thinking | Muhammad Sajjad Akbar et.al. | 2503.23622 | null |
2025-03-30 | Re-Aligning Language to Visual Objects with an Agentic Workflow | Yuming Chen et.al. | 2503.23508 | null |
2025-03-30 | EagleVision: Object-level Attribute Multimodal LLM for Remote Sensing | Hongxiang Jiang et.al. | 2503.23330 | null |
2025-03-29 | Context in object detection: a systematic literature review | Mahtab Jamali et.al. | 2503.23249 | null |
2025-03-29 | Large Self-Supervised Models Bridge the Gap in Domain Adaptive Object Detection | Marc-Antoine Lavoie et.al. | 2503.23220 | null |
2025-03-28 | AnnoPage Dataset: Dataset of Non-Textual Elements in Documents with Fine-Grained Categorization | Martin Kišš et.al. | 2503.22526 | null |
2025-03-28 | Data Quality Matters: Quantifying Image Quality Impact on Machine Learning Performance | Christian Steinhauser et.al. | 2503.22375 | null |
2025-03-28 | ForcePose: A Deep Learning Approach for Force Calculation Based on Action Recognition Using MediaPipe Pose Estimation Combined with Object Detection | Nandakishor M et.al. | 2503.22363 | null |
2025-03-28 | SKDU at De-Factify 4.0: Natural Language Features for AI-Generated Text-Detection | Shrikant Malviya et.al. | 2503.22338 | link |
2025-03-28 | Knowledge Rectification for Camouflaged Object Detection: Unlocking Insights from Low-Quality Data | Juwei Guan et.al. | 2503.22180 | null |
2025-03-28 | A Survey on Remote Sensing Foundation Models: From Vision to Multimodality | Ziyue Huang et.al. | 2503.22081 | null |
2025-03-27 | AGILE: A Diffusion-Based Attention-Guided Image and Label Translation for Efficient Cross-Domain Plant Trait Identification | Earl Ranario et.al. | 2503.22019 | null |
2025-03-27 | FACETS: Efficient Once-for-all Object Detection via Constrained Iterative Search | Tony Tran et.al. | 2503.21999 | null |
2025-03-27 | Exponentially Weighted Instance-Aware Repeat Factor Sampling for Long-Tailed Object Detection Model Training in Unmanned Aerial Vehicles Surveillance Scenarios | Taufiq Ahmed et.al. | 2503.21893 | null |
2025-03-27 | Learning Class Prototypes for Unified Sparse Supervised 3D Object Detection | Yun Zhu et.al. | 2503.21099 | link |
2025-03-26 | SaViD: Spectravista Aesthetic Vision Integration for Robust and Discerning 3D Object Detection in Challenging Environments | Tanmoy Dam et.al. | 2503.20614 | link |
2025-03-26 | Small Object Detection: A Comprehensive Survey on Challenges, Techniques and Real-World Applications | Mahya Nikouei et.al. | 2503.20516 | null |
2025-03-25 | Gemini Robotics: Bringing AI into the Physical World | Gemini Robotics Team et.al. | 2503.20020 | null |
2025-03-25 | Hyperdimensional Uncertainty Quantification for Multimodal Uncertainty Fusion in Autonomous Vehicles Perception | Luke Chen et.al. | 2503.20011 | null |
2025-03-25 | Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models | Ilias Stogiannidis et.al. | 2503.19707 | null |
2025-03-25 | BiblioPage: A Dataset of Scanned Title Pages for Bibliographic Metadata Extraction | Jan Kohút et.al. | 2503.19658 | null |
2025-03-25 | Single Shot AI-assisted quantification of KI-67 proliferation index in breast cancer | Deepti Madurai Muthu et.al. | 2503.19606 | null |
2025-03-25 | MATT-GS: Masked Attention-based 3DGS for Robot Perception and Object Detection | Jee Won Lee et.al. | 2503.19330 | null |
2025-03-25 | Multiscale Feature Importance-based Bit Allocation for End-to-End Feature Coding for Machines | Junle Liu et.al. | 2503.19278 | null |
2025-03-24 | Benchmarking Object Detectors under Real-World Distribution Shifts in Satellite Imagery | Sara Al-Emadi et.al. | 2503.19202 | null |
2025-03-24 | Pitch Contour Exploration Across Audio Domains: A Vision-Based Transfer Learning Approach | Jakob Abeßer et.al. | 2503.19161 | null |
2025-03-24 | Cooperative Control of Multi-Quadrotors for Transporting Cable-Suspended Payloads: Obstacle-Aware Planning and Event-Based Nonlinear Model Predictive Control | Tohid Kargar Tasooji et.al. | 2503.19135 | null |
2025-03-24 | Building Blocks for Robust and Effective Semi-Supervised Real-World Object Detection | Moussa Kassem Sbeyti et.al. | 2503.18903 | null |
2025-03-24 | LGI-DETR: Local-Global Interaction for UAV Object Detection | Zifa Chen et.al. | 2503.18785 | null |
2025-03-25 | Frequency Dynamic Convolution for Dense Image Prediction | Linwei Chen et.al. | 2503.18783 | null |
2025-03-24 | CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection | Zhichao Sun et.al. | 2503.18430 | null |
2025-03-24 | Vision-Guided Loco-Manipulation with a Snake Robot | Adarsh Salagame et.al. | 2503.18308 | null |
2025-03-23 | Extended Visibility of Autonomous Vehicles via Optimized Cooperative Perception under Imperfect Communication | Ahmad Sarlak et.al. | 2503.18192 | null |
2025-03-22 | MAMAT: 3D Mamba-Based Atmospheric Turbulence Removal and its Object Detection Capability | Paul Hill et.al. | 2503.17700 | null |
2025-03-22 | Sense4FL: Vehicular Crowdsensing Enhanced Federated Learning for Autonomous Driving | Yanan Ma et.al. | 2503.17697 | null |
2025-03-21 | Should we pre-train a decoder in contrastive learning for dense prediction tasks? | Sébastien Quetin et.al. | 2503.17526 | null |
2025-03-21 | Event-Based Crossing Dataset (EBCD) | Joey Mulé et.al. | 2503.17499 | null |
2025-03-21 | An Iterative Feedback Mechanism for Improving Natural Language Class Descriptions in Open-Vocabulary Object Detection | Louis Y. Kim et.al. | 2503.17285 | null |
2025-03-21 | Which2comm: An Efficient Collaborative Perception Framework for 3D Object Detection | Duanrui Yu et.al. | 2503.17175 | null |
2025-03-21 | Hi-ALPS -- An Experimental Robustness Quantification of Six LiDAR-based Object Detection Systems for Autonomous Driving | Alexandra Arzberger et.al. | 2503.17168 | null |
2025-03-21 | R-LiViT: A LiDAR-Visual-Thermal Dataset Enabling Vulnerable Road User Focused Roadside Perception | Jonas Mirlach et.al. | 2503.17122 | null |
2025-03-21 | Exploring Few-Shot Object Detection on Blood Smear Images: A Case Study of Leukocytes and Schistocytes | Davide Antonio Mura et.al. | 2503.17107 | null |
2025-03-21 | R2LDM: An Efficient 4D Radar Super-Resolution Framework Leveraging Diffusion Model | Boyuan Zheng et.al. | 2503.17097 | null |
2025-03-21 | Superpowering Open-Vocabulary Object Detectors for X-ray Vision | Pablo Garcia-Fernandez et.al. | 2503.17071 | null |
2025-03-21 | Scoring, Remember, and Reference: Catching Camouflaged Objects in Videos | Yuang Feng et.al. | 2503.17050 | null |
2025-03-21 | Salient Object Detection in Traffic Scene through the TSOD10K Dataset | Yu Qiu et.al. | 2503.16910 | null |
2025-03-21 | Seg2Box: 3D Object Detection by Point-Wise Semantics Supervision | Maoji Zheng et.al. | 2503.16811 | null |
2025-03-20 | RESFL: An Uncertainty-Aware Framework for Responsible Federated Learning by Balancing Privacy, Fairness and Utility in Autonomous Vehicles | Dawood Wasif et.al. | 2503.16251 | null |
2025-03-20 | MapGlue: Multimodal Remote Sensing Image Matching | Peihao Wu et.al. | 2503.16185 | null |
2025-03-20 | Uncertainty Meets Diversity: A Comprehensive Active Learning Framework for Indoor 3D Object Detection | Jiangyi Wang et.al. | 2503.16125 | null |
2025-03-20 | Semantic-Guided Global-Local Collaborative Networks for Lightweight Image Super-Resolution | Wanshu Fan et.al. | 2503.16056 | null |
2025-03-19 | A Context-Driven Training-Free Network for Lightweight Scene Text Segmentation and Recognition | Ritabrata Chakraborty et.al. | 2503.15639 | null |
2025-03-19 | DCA: Dividing and Conquering Amnesia in Incremental Object Detection | Aoting Zhang et.al. | 2503.15295 | null |
2025-03-19 | Test-Time Backdoor Detection for Object Detection Models | Hangtao Zhang et.al. | 2503.15293 | null |
2025-03-19 | GO-N3RDet: Geometry Optimized NeRF-enhanced 3D Object Detector | Zechuan Li et.al. | 2503.15211 | null |
2025-03-19 | UltraFlwr -- An Efficient Federated Medical and Surgical Object Detection Framework | Yang Li et.al. | 2503.15161 | null |
2025-03-19 | An Investigation of Beam Density on LiDAR Object Detection Performance | Christoph Griesbacher et.al. | 2503.15087 | null |
2025-03-19 | SPADE: Systematic Prompt Framework for Automated Dialogue Expansion in Machine-Generated Text Detection | Haoyi Li et.al. | 2503.15044 | null |
2025-03-19 | Fine-Grained Open-Vocabulary Object Detection with Fined-Grained Prompts: Task, Dataset and Benchmark | Ying Liu et.al. | 2503.14862 | null |
2025-03-19 | State Space Model Meets Transformer: A New Paradigm for 3D Object Detection | Chuxin Wang et.al. | 2503.14493 | null |
2025-03-18 | Panoramic Distortion-Aware Tokenization for Person Detection and Localization Using Transformers in Overhead Fisheye Images | Nobuhiko Wakai et.al. | 2503.14228 | null |
2025-03-18 | A Revisit to the Decoder for Camouflaged Object Detection | Seung Woo Ko et.al. | 2503.14035 | null |
2025-03-18 | Shift, Scale and Rotation Invariant Multiple Object Detection using Balanced Joint Transform Correlator | Xi Shen et.al. | 2503.14034 | null |
2025-03-18 | LEGNet: Lightweight Edge-Gaussian Driven Network for Low-Quality Remote Sensing Image Object Detection | Wei Lu et.al. | 2503.14012 | null |
2025-03-18 | FrustumFusionNets: A Three-Dimensional Object Detection Network Based on Tractor Road Scene | Lili Yang et.al. | 2503.13951 | null |
2025-03-18 | Is Discretization Fusion All You Need for Collaborative Perception? | Kang Yang et.al. | 2503.13946 | null |
2025-03-18 | PSA-SSL: Pose and Size-aware Self-Supervised Learning on LiDAR Point Clouds | Barza Nisar et.al. | 2503.13914 | null |
2025-03-18 | HSOD-BIT-V2: A New Challenging Benchmark for Hyperspectral Salient Object Detection | Yuhao Qiu et.al. | 2503.13906 | null |
2025-03-18 | TGBFormer: Transformer-GraphFormer Blender Network for Video Object Detection | Qiang Qi et.al. | 2503.13903 | null |
2025-03-17 | Beyond RGB: Adaptive Parallel Processing for RAW Object Detection | Shani Gamrian et.al. | 2503.13163 | null |
2025-03-17 | Who Wrote This? Identifying Machine vs Human-Generated Text in Hausa | Babangida Sani et.al. | 2503.13101 | null |
2025-03-17 | SparseAlign: A Fully Sparse Framework for Cooperative Object Detection | Yunshuang Yuan et.al. | 2503.12982 | null |
2025-03-17 | Efficient Multimodal 3D Object Detector via Instance-Level Contrastive Distillation | Zhuoqun Su et.al. | 2503.12914 | null |
2025-03-16 | Point Cloud Based Scene Segmentation: A Survey | Dan Halperin et.al. | 2503.12595 | null |
2025-03-16 | GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing | Zilun Zhang et.al. | 2503.12490 | null |
2025-03-16 | Deepfake Detection with Optimized Hybrid Model: EAR Biometric Descriptor via Improved RCNN | Ruchika Sharma et.al. | 2503.12381 | null |
2025-03-15 | An Efficient Deep Learning-Based Approach to Automating Invoice Document Validation | Aziz Amari et.al. | 2503.12267 | null |
2025-03-15 | Minuscule Cell Detection in AS-OCT Images with Progressive Field-of-View Focusing | Boyu Chen et.al. | 2503.12249 | null |
2025-03-15 | SFMNet: Sparse Focal Modulation for 3D Object Detection | Oren Shrout et.al. | 2503.12093 | null |
2025-03-14 | FLASHμ: Fast Localizing And Sizing of Holographic Microparticles | Ayush Paliwal et.al. | 2503.11538 | null |
2025-03-14 | Falcon: A Remote Sensing Vision-Language Foundation Model | Kelu Yao et.al. | 2503.11070 | null |
2025-03-14 | FMNet: Frequency-Assisted Mamba-Like Linear Attention Network for Camouflaged Object Detection | Ming Deng et.al. | 2503.11030 | null |
2025-03-14 | Comparative Analysis of Advanced AI-based Object Detection Models for Pavement Marking Quality Assessment during Daytime | Gian Antariksa et.al. | 2503.11008 | null |
2025-03-14 | Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection | Chuhan Zhang et.al. | 2503.11005 | null |
2025-03-14 | Enhanced Multi-View Pedestrian Detection Using Probabilistic Occupancy Volume | Reef Alturki et.al. | 2503.10982 | null |
2025-03-13 | The Power of One: A Single Example is All it Takes for Segmentation in VLMs | Mir Rayat Imtiaz Hossain et.al. | 2503.10779 | null |
2025-03-13 | HeightFormer: Learning Height Prediction in Voxel Features for Roadside Vision Centric 3D Object Detection via Transformer | Zhang Zhang et.al. | 2503.10777 | null |
2025-03-13 | Semantic-Supervised Spatial-Temporal Fusion for LiDAR-based 3D Object Detection | Chaoqun Wang et.al. | 2503.10579 | null |
2025-03-13 | RoCo-Sim: Enhancing Roadside Collaborative Perception through Foreground Simulation | Yuwen Du et.al. | 2503.10410 | link |
2025-03-13 | RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing | Fengxiang Wang et.al. | 2503.10392 | link |
2025-03-13 | Object detection characteristics in a learning factory environment using YOLOv8 | Toni Schneidereit et.al. | 2503.10356 | null |
2025-03-13 | TARS: Traffic-Aware Radar Scene Flow Estimation | Jialong Wu et.al. | 2503.10210 | null |
2025-03-13 | A Hierarchical Semantic Distillation Framework for Open-Vocabulary Object Detection | Shenghao Fu et.al. | 2503.10152 | link |
2025-03-13 | Deep Learning-Based Direct Leaf Area Estimation using Two RGBD Datasets for Model Development | Namal Jayasuriya et.al. | 2503.10129 | null |
2025-03-13 | Style Evolving along Chain-of-Thought for Unknown-Domain Object Detection | Zihao Zhang et.al. | 2503.09968 | null |
2025-03-12 | CleverDistiller: Simple and Spatially Consistent Cross-modal Distillation | Hariprasath Govindarajan et.al. | 2503.09878 | null |
2025-03-12 | How good are deep learning methods for automated road safety analysis using video data? An experimental study | Qingwu Liu et.al. | 2503.09807 | null |
2025-03-12 | Deep Learning for Climate Action: Computer Vision Analysis of Visual Narratives on X | Katharina Prasse et.al. | 2503.09361 | null |
2025-03-12 | Fully-Synthetic Training for Visual Quality Inspection in Automotive Production | Christoph Huber et.al. | 2503.09354 | null |
2025-03-12 | DitHub: A Modular Framework for Incremental Open-Vocabulary Object Detection | Chiara Cappellino et.al. | 2503.09271 | null |
2025-03-12 | Polygonizing Roof Segments from High-Resolution Aerial Images Using Yolov8-Based Edge Detection | Qipeng Mei et.al. | 2503.09187 | null |
2025-03-12 | RFUAV: A Benchmark Dataset for Unmanned Aerial Vehicle Detection and Identification | Rui Shi et.al. | 2503.09033 | null |
2025-03-12 | Dual-Domain Homogeneous Fusion with Cross-Modal Mamba and Progressive Decoder for 3D Object Detection | Xuzhong Hu et.al. | 2503.08992 | null |
2025-03-11 | GBlobs: Explicit Local Structure via Gaussian Blobs for Improved Cross-Domain LiDAR-based 3D Object Detection | Dušan Malić et.al. | 2503.08639 | null |
2025-03-11 | Referring to Any Person | Qing Jiang et.al. | 2503.08507 | null |
2025-03-11 | SuperCap: Multi-resolution Superpixel-based Image Captioning | Henry Senior et.al. | 2503.08496 | null |
2025-03-13 | Learning to Detect Objects from Multi-Agent LiDAR Scans without Manual Labels | Qiming Xia et.al. | 2503.08421 | null |
2025-03-11 | Embodied Crowd Counting | Runling Long et.al. | 2503.08367 | null |
2025-03-11 | Physics-based AI methodology for Material Parameter Extraction from Optical Data | M. Koumans et.al. | 2503.08183 | null |
2025-03-11 | Bring Remote Sensing Object Detect Into Nature Language Model: Using SFT Method | Fei Wang et.al. | 2503.08144 | null |
2025-03-11 | Accelerate 3D Object Detection Models via Zero-Shot Attention Key Pruning | Lizhen Xu et.al. | 2503.08101 | link |
2025-03-11 | SparseVoxFormer: Sparse Voxel-based Transformer for Multi-modal 3D Object Detection | Hyeongseok Son et.al. | 2503.08092 | null |
2025-03-11 | Simulating Automotive Radar with Lidar and Camera Inputs | Peili Song et.al. | 2503.08068 | null |
2025-03-10 | YOLOE: Real-Time Seeing Anything | Ao Wang et.al. | 2503.07465 | link |
2025-03-10 | HGO-YOLO: Advancing Anomaly Behavior Detection with Hierarchical Features and Lightweight Optimized Detection | Qizhi Zheng et.al. | 2503.07371 | null |
2025-03-10 | Mitigating Hallucinations in YOLO-based Object Detection Models: A Revisit to Out-of-Distribution Detection | Weicheng He et.al. | 2503.07330 | null |
2025-03-10 | Semantic Communications with Computer Vision Sensing for Edge Video Transmission | Yubo Peng et.al. | 2503.07252 | null |
2025-03-10 | MIRAM: Masked Image Reconstruction Across Multiple Scales for Breast Lesion Risk Prediction | Hung Q. Vo et.al. | 2503.07157 | null |
2025-03-10 | A Light Perspective for 3D Object Detection | Marcelo Eduardo Pederiva et.al. | 2503.07133 | null |
2025-03-10 | SimROD: A Simple Baseline for Raw Object Detection with Global and Local Enhancements | Haiyang Xie et.al. | 2503.07101 | null |
2025-03-10 | RS2V-L: Vehicle-Mounted LiDAR Data Generation from Roadside Sensor Observations | Ruidan Xing et.al. | 2503.07085 | null |
2025-03-10 | Availability-aware Sensor Fusion via Unified Canonical Space for 4D Radar, LiDAR, and Camera | Dong-Hee Paek et.al. | 2503.07029 | null |
2025-03-10 | Large Language Model Guided Progressive Feature Alignment for Multimodal UAV Object Detection | Wentao Wu et.al. | 2503.06948 | null |
2025-03-06 | Collaborative Evaluation of Deepfake Text with Deliberation-Enhancing Dialogue Systems | Jooyoung Lee et.al. | 2503.04945 | null |
2025-03-06 | Fine-Tuning Florence2 for Enhanced Object Detection in Un-constructed Environments: Vision-Language Model Approach | Soumyadeep Ro et.al. | 2503.04918 | null |
2025-03-06 | Floxels: Fast Unsupervised Voxel Based Scene Flow Estimation | David T. Hoffmann et.al. | 2503.04718 | null |
2025-03-06 | DEAL-YOLO: Drone-based Efficient Animal Localization using YOLO | Aditya Prashant Naidu et.al. | 2503.04698 | null |
2025-03-06 | Teach YOLO to Remember: A Self-Distillation Approach for Continual Object Detection | Riccardo De Monte et.al. | 2503.04688 | null |
2025-03-06 | ReynoldsFlow: Exquisite Flow Estimation via Reynolds Transport Theorem | Yu-Hsi Chen et.al. | 2503.04500 | null |
2025-03-06 | A lightweight model FDM-YOLO for small target improvement based on YOLOv8 | Xuerui Zhang et.al. | 2503.04452 | null |
2025-03-06 | Shaken, Not Stirred: A Novel Dataset for Visual Understanding of Glasses in Human-Robot Bartending Tasks | Lukáš Gajdošech et.al. | 2503.04308 | null |
2025-03-06 | CA-W3D: Leveraging Context-Aware Knowledge for Weakly Supervised Monocular 3D Detection | Chupeng Liu et.al. | 2503.04154 | null |
2025-03-06 | Robust Computer-Vision based Construction Site Detection for Assistive-Technology Applications | Junchi Feng et.al. | 2503.04139 | null |
2025-03-06 | Fractional Correspondence Framework in Detection Transformer | Masoumeh Zareapoor et.al. | 2503.04107 | null |
2025-03-05 | DualDiff+: Dual-Branch Diffusion for High-Fidelity Video Generation with Reward Guidance | Zhao Yang et.al. | 2503.03689 | null |
2025-03-05 | 4D Radar Ground Truth Augmentation with LiDAR-to-4D Radar Data Synthesis | Woo-Jin Jung et.al. | 2503.03637 | null |
2025-03-05 | Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders | Kristian Kuznetsov et.al. | 2503.03601 | null |
2025-03-05 | Simulation-Based Performance Evaluation of 3D Object Detection Methods with Deep Learning for a LiDAR Point Cloud Dataset in a SOTIF-related Use Case | Milin Patel et.al. | 2503.03548 | link |
2025-03-05 | AI-Driven Multi-Stage Computer Vision System for Defect Detection in Laser-Engraved Industrial Nameplates | Adhish Anitha Vilasan et.al. | 2503.03395 | null |
2025-03-05 | MIAdapt: Source-free Few-shot Domain Adaptive Object Detection for Microscopic Images | Nimra Dilawar et.al. | 2503.03370 | null |
2025-03-05 | Automated Attendee Recognition System for Large-Scale Social Events or Conference Gathering | Dhruv Motwani et.al. | 2503.03330 | null |
2025-03-05 | BEVMOSNet: Multimodal Fusion for BEV Moving Object Segmentation | Hiep Truong Cong et.al. | 2503.03280 | null |
2025-03-05 | Find Matching Faces Based On Face Parameters | Setu A. Bhatt et.al. | 2503.03204 | null |
2025-03-04 | Revolutionizing Traffic Management with AI-Powered Machine Vision: A Step Toward Smart Cities | Seyed Hossein Hosseini DolatAbadi et.al. | 2503.02967 | null |
2025-03-04 | Class-Aware PillarMix: Can Mixed Sample Data Augmentation Enhance 3D Object Detection with Radar Point Clouds? | Miao Zhang et.al. | 2503.02687 | null |
2025-03-04 | Exploring Model Quantization in GenAI-based Image Inpainting and Detection of Arable Plants | Sourav Modak et.al. | 2503.02420 | null |
2025-03-04 | Robust detection of overlapping bioacoustic sound events | Louis Mahon et.al. | 2503.02389 | null |
2025-03-04 | YOLO-PRO: Enhancing Instance-Specific Object Detection with Full-Channel Global Self-Attention | Lin Huang et.al. | 2503.02348 | null |
2025-03-04 | SSNet: Saliency Prior and State Space Model-based Network for Salient Object Detection in RGB-D Images | Gargi Panda et.al. | 2503.02270 | null |
2025-03-03 | Generalized Diffusion Detector: Mining Robust Features from Diffusion Models for Domain-Generalized Detection | Boyong He et.al. | 2503.02101 | null |
2025-03-03 | Uncertainty Representation in a SOTIF-Related Use Case with Dempster-Shafer Theory for LiDAR Sensor-Based Object Detection | Milin Patel et.al. | 2503.02087 | link |
2025-03-03 | Visual-RFT: Visual Reinforcement Fine-Tuning | Ziyu Liu et.al. | 2503.01785 | link |
2025-03-03 | Enhancing Object Detection Accuracy in Underwater Sonar Images through Deep Learning-based Denoising | Ziyu Wang et.al. | 2503.01655 | null |
2025-03-03 | Evaluating Stenosis Detection with Grounding DINO, YOLO, and DINO-DETR | Muhammad Musab Ansari et.al. | 2503.01601 | null |
2025-02-28 | The Common Objects Underwater (COU) Dataset for Robust Underwater Object Detection | Rishi Mukherjee et.al. | 2502.20651 | null |
2025-02-28 | RTGen: Real-Time Generative Detection Transformer | Chi Ruan et.al. | 2502.20622 | null |
2025-02-28 | LV-DOT: LiDAR-visual dynamic obstacle detection and tracking for autonomous robot navigation | Zhefan Xu et.al. | 2502.20607 | null |
2025-02-27 | Multi-Scale Neighborhood Occupancy Masked Autoencoder for Self-Supervised Learning in LiDAR Point Clouds | Mohamed Abdelsamad et.al. | 2502.20316 | null |
2025-02-27 | OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels | Meng Lou et.al. | 2502.20087 | link |
2025-02-27 | Night-Voyager: Consistent and Efficient Nocturnal Vision-Aided State Estimation in Object Maps | Tianxiao Gao et.al. | 2502.20054 | null |
2025-02-27 | Learning Mask Invariant Mutual Information for Masked Image Modeling | Tao Huang et.al. | 2502.19718 | null |
2025-02-27 | BEVDiffuser: Plug-and-Play Diffusion Model for BEV Denoising with Ground-Truth Guidance | Xin Ye et.al. | 2502.19694 | null |
2025-02-26 | Ev-3DOD: Pushing the Temporal Boundaries of 3D Object Detection with Event Cameras | Hoonhee Cho et.al. | 2502.19630 | null |
2025-02-26 | Is Your Paper Being Reviewed by an LLM? A New Benchmark Dataset and Approach for Detecting AI Text in Peer Review | Sungduk Yu et.al. | 2502.19614 | null |
2025-02-23 | Rewards-based image analysis in microscopy | Kamyar Barakati et.al. | 2502.18522 | null |
2025-02-25 | Multi-Perspective Data Augmentation for Few-shot Object Detection | Anh-Khoa Nguyen Vu et.al. | 2502.18195 | null |
2025-02-25 | Progressive Local Alignment for Medical Multimodal Pre-training | Huimin Yan et.al. | 2502.18047 | null |
2025-02-25 | Automatic Vehicle Detection using DETR: A Transformer-Based Approach for Navigating Treacherous Roads | Istiaq Ahmed Fahad et.al. | 2502.17843 | null |
2025-02-24 | Semi-Supervised Weed Detection in Vegetable Fields: In-domain and Cross-domain Experiments | Boyang Deng et.al. | 2502.17673 | null |
2025-02-24 | Experimental validation of UAV search and detection system in real wilderness environment | Stella Dumenčić et.al. | 2502.17372 | null |
2025-02-24 | LCV2I: Communication-Efficient and High-Performance Collaborative Perception Framework with Low-Resolution LiDAR | Xinxin Feng et.al. | 2502.17039 | null |
2025-02-24 | Sarang at DEFACTIFY 4.0: Detecting AI-Generated Text Using Noised Data and an Ensemble of DeBERTa Models | Avinash Trivedi et.al. | 2502.16857 | null |
2025-02-23 | Geometry-Aware 3D Salient Object Detection Network | Chen Wang et.al. | 2502.16488 | null |
2025-02-26 | MQADet: A Plug-and-Play Paradigm for Enhancing Open-Vocabulary Object Detection via Multimodal Question Answering | Caixiong Li et.al. | 2502.16486 | null |
2025-02-23 | Cross-domain Few-shot Object Detection with Multi-modal Textual Enrichment | Zeyu Shangguan et.al. | 2502.16469 | null |
2025-02-23 | Deep learning approaches to surgical video segmentation and object detection: A Scoping Review | Devanish N. Kamtam et.al. | 2502.16459 | null |
2025-02-22 | FeatSharp: Your Vision Model Features, Sharper | Mike Ranzinger et.al. | 2502.16025 | null |
2025-02-21 | Generative AI Framework for 3D Object Generation in Augmented Reality | Majid Behravan et.al. | 2502.15869 | null |
2025-02-21 | Machine-generated text detection prevents language model collapse | George Drayson et.al. | 2502.15654 | null |
2025-02-21 | Depth-aware Fusion Method based on Image and 4D Radar Spectrum for 3D Object Detection | Yue Sun et.al. | 2502.15516 | null |
2025-02-21 | Q-PETR: Quant-aware Position Embedding Transformation for Multi-View 3D Object Detection | Jiangyong Yu et.al. | 2502.15488 | null |
2025-02-21 | PFSD: A Multi-Modal Pedestrian-Focus Scene Dataset for Rich Tasks in Semi-Structured Environments | Yueting Liu et.al. | 2502.15342 | null |
2025-02-20 | Synth It Like KITTI: Synthetic Data Generation for Object Detection in Driving Scenarios | Richard Marcus et.al. | 2502.15076 | null |
2025-02-20 | YOLOv12: A Breakdown of the Key Architectural Features | Mujadded Al Rabbani Alif et.al. | 2502.14740 | null |
2025-02-20 | LXLv2: Enhanced LiDAR Excluded Lean 3D Object Detection with Fusion of 4D Radar and Camera | Weiyi Xiong et.al. | 2502.14503 | null |
2025-02-20 | ODVerse33: Is the New YOLO Version Always Better? A Multi Domain benchmark from YOLO v5 to v11 | Tianyou Jiang et.al. | 2502.14314 | null |
2025-02-19 | PedDet: Adaptive Spectral Optimization for Multimodal Pedestrian Detection | Rui Zhao et.al. | 2502.14063 | link |
2025-02-19 | Image compositing is all you need for data augmentation | Ang Jia Ning Shermaine et.al. | 2502.13936 | null |
2025-02-19 | MSVCOD: A Large-Scale Multi-Scene Dataset for Video Camouflage Object Detection | Shuyong Gao et.al. | 2502.13859 | null |
2025-02-19 | An Overall Real-Time Mechanism for Classification and Quality Evaluation of Rice | Wanke Xia et.al. | 2502.13764 | null |
2025-02-18 | Multiple Distribution Shift -- Aerial (MDS-A): A Dataset for Test-Time Error Detection and Model Adaptation | Noel Ngu et.al. | 2502.13289 | null |
2025-02-18 | RobuRCDet: Enhancing Robustness of Radar-Camera Fusion in Bird's Eye View for 3D Object Detection | Jingtong Yue et.al. | 2502.13071 | null |
2025-02-18 | Task-Oriented Semantic Communication for Stereo-Vision 3D Object Detection | Zijian Cao et.al. | 2502.12735 | null |
2025-02-18 | Iron Sharpens Iron: Defending Against Attacks in Machine-Generated Text Detection with Adversarial Training | Yuanfan Li et.al. | 2502.12734 | null |
2025-02-18 | DAMamba: Vision State Space Model with Dynamic Adaptive Scan | Tanzhe Li et.al. | 2502.12627 | null |
2025-02-18 | Who Writes What: Unveiling the Impact of Author Roles on AI-generated Text Detection | Jiatao Li et.al. | 2502.12611 | null |
2025-02-18 | Gaseous Object Detection | Kailai Zhou et.al. | 2502.12415 | null |
2025-02-17 | AI-generated Text Detection with a GLTR-based Approach | Lucía Yan Wu et.al. | 2502.12064 | null |
2025-02-17 | Enhancing Transparent Object Pose Estimation: A Fusion of GDR-Net and Edge Detection | Tessa Pulli et.al. | 2502.12027 | null |
2025-02-17 | ExaGPT: Example-Based Machine-Generated Text Detection for Human Interpretability | Ryuto Koike et.al. | 2502.11336 | null |
2025-02-16 | DAViMNet: SSMs-Based Domain Adaptive Object Detection | A. Enes Doruk et.al. | 2502.11178 | null |
2025-02-15 | CLoCKDistill: Consistent Location-and-Context-aware Knowledge Distillation for DETRs | Qizhen Lan et.al. | 2502.10683 | null |
2025-02-14 | Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding | Wenxuan Guo et.al. | 2502.10392 | null |
2025-02-14 | Object Detection and Tracking | Md Pranto et.al. | 2502.10310 | null |
2025-02-14 | Artificial Intelligence to Assess Dental Findings from Panoramic Radiographs -- A Multinational Study | Yin-Chih Chelsea Wang et.al. | 2502.10277 | null |
2025-02-13 | Instance Segmentation of Scene Sketches Using Natural Image Priors | Mia Tang et.al. | 2502.09608 | null |
2025-02-13 | Wholly-WOOD: Wholly Leveraging Diversified-quality Labels for Weakly-supervised Oriented Object Detection | Yi Yu et.al. | 2502.09471 | link |
2025-02-13 | Mitigating the Impact of Prominent Position Shift in Drone-based RGBT Object Detection | Yan Zhang et.al. | 2502.09311 | null |
2025-02-13 | Billet Number Recognition Based on Test-Time Adaptation | Yuan Wei et.al. | 2502.09026 | null |
2025-02-12 | Uncertainty Aware Human-machine Collaboration in Camouflaged Object Detection | Ziyue Yang et.al. | 2502.08373 | link |
2025-02-12 | Modification and Generated-Text Detection: Achieving Dual Detection Capabilities for the Outputs of LLM by Watermark | Yuhang Cai et.al. | 2502.08332 | null |
2025-02-12 | Plantation Monitoring Using Drone Images: A Dataset and Performance Review | Yashwanth Karumanchi et.al. | 2502.08233 | null |
2025-02-12 | Take What You Need: Flexible Multi-Task Semantic Communications with Channel Adaptation | Xiang Chen et.al. | 2502.08221 | null |
2025-02-13 | SARChat-Bench-2M: A Multi-Task Vision-Language Benchmark for SAR Image Interpretation | Zhiming Ma et.al. | 2502.08168 | null |
2025-02-12 | Knowledge Swapping via Learning and Unlearning | Mingyu Xing et.al. | 2502.08075 | null |
2025-02-11 | Visual-based spatial audio generation system for multi-speaker environments | Xiaojing Liu et.al. | 2502.07538 | null |
2025-02-11 | Quantitative Analysis of Objects in Prisoner Artworks | Thea Christoffersen et.al. | 2502.07440 | null |
2025-02-11 | Fast-COS: A Fast One-Stage Object Detector Based on Reparameterized Attention Vision Transformer for Autonomous Driving | Novendra Setyawan et.al. | 2502.07417 | null |
2025-02-11 | Multi-Task-oriented Nighttime Haze Imaging Enhancer for Vision-driven Measurement Systems | Ai Chen et.al. | 2502.07351 | link |
2025-02-11 | SparseFormer: Detecting Objects in HRW Shots via Sparse Vision Transformer | Wenxi Li et.al. | 2502.07216 | null |
2025-02-11 | Dense Object Detection Based on De-homogenized Queries | Yueming Huang et.al. | 2502.07194 | null |
2025-02-11 | Foreign-Object Detection in High-Voltage Transmission Line Based on Improved YOLOv8m | Zhenyue Wang et.al. | 2502.07175 | null |
2025-02-11 | A Survey on Mamba Architecture for Vision Applications | Fady Ibrahim et.al. | 2502.07161 | null |
2025-02-10 | Multimodal Search on a Line | Jared Coleman et.al. | 2502.07000 | null |
2025-02-10 | AgilePilot: DRL-Based Drone Agent for Real-Time Motion Planning in Dynamic Environments by Leveraging Object Detection | Roohan Ahmed Khan et.al. | 2502.06725 | null |
2025-02-10 | EdgeMLBalancer: A Self-Adaptive Approach for Dynamic Model Switching on Resource-Constrained Edge Devices | Akhila Matathammal et.al. | 2502.06493 | null |
2025-02-10 | PLATTER: A Page-Level Handwritten Text Recognition System for Indic Scripts | Badri Vishal Kasuba et.al. | 2502.06172 | null |
2025-02-10 | Enhancing Document Key Information Localization Through Data Augmentation | Yue Dai et.al. | 2502.06132 | null |
2025-02-10 | Improved YOLOv5s model for key components detection of power transmission lines | Chen Chen et.al. | 2502.06127 | null |
2025-02-10 | A Novel Multi-Teacher Knowledge Distillation for Real-Time Object Detection using 4D Radar | Seung-Hyun Song et.al. | 2502.06114 | null |
2025-02-09 | Training-free Anomaly Event Detection via LLM-guided Symbolic Pattern Discovery | Yuhui Zeng et.al. | 2502.05843 | null |
2025-02-08 | Demystifying Catastrophic Forgetting in Two-Stage Incremental Object Detector | Qirui Wu et.al. | 2502.05540 | null |
2025-02-07 | Invizo: Arabic Handwritten Document Optical Character Recognition Solution | Alhossien Waly et.al. | 2502.05277 | null |
2025-02-07 | LP-DETR: Layer-wise Progressive Relations for Object Detection | Zhengjian Kang et.al. | 2502.05147 | null |
2025-02-07 | Counting Fish with Temporal Representations of Sonar Video | Kai Van Brunt et.al. | 2502.05129 | null |
2025-02-07 | DetVPCC: RoI-based Point Cloud Sequence Compression for 3D Object Detection | Mingxuan Yan et.al. | 2502.04804 | null |
2025-02-07 | MHAF-YOLO: Multi-Branch Heterogeneous Auxiliary Fusion YOLO for accurate object detection | Zhiqiang Yang et.al. | 2502.04656 | null |
2025-02-07 | AIQViT: Architecture-Informed Post-Training Quantization for Vision Transformers | Runqing Jiang et.al. | 2502.04628 | null |
2025-02-06 | An Optimized YOLOv5 Based Approach For Real-time Vehicle Detection At Road Intersections Using Fisheye Cameras | Md. Jahin Alam et.al. | 2502.04566 | null |
2025-02-06 | Group-Adaptive Threshold Optimization for Robust AI-Generated Text Detection | Minseok Jung et.al. | 2502.04528 | null |
2025-02-06 | OneTrack-M: A multitask approach to transformer-based MOT models | Luiz C. S. de Araujo et.al. | 2502.04478 | null |
2025-02-07 | Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances | Yi Yu et.al. | 2502.04268 | null |
2025-02-06 | An object detection approach for lane change and overtake detection from motion profiles | Andrea Benericetti et.al. | 2502.04244 | null |
2025-02-06 | YOLOv4: A Breakthrough in Real-Time Object Detection | Athulya Sundaresan Geetha et.al. | 2502.04161 | null |
2025-02-06 | Advanced Object Detection and Pose Estimation with Hybrid Task Cascade and High-Resolution Networks | Yuhui Jin et.al. | 2502.03877 | null |
2025-02-06 | Pursuing Better Decision Boundaries for Long-Tailed Object Detection via Category Information Amount | Yanbiao Ma et.al. | 2502.03852 | null |
2025-02-06 | Single-Domain Generalized Object Detection by Balancing Domain Diversity and Invariance | Zhenwei He et.al. | 2502.03835 | null |
2025-02-06 | UAV Cognitive Semantic Communications Enabled by Knowledge Graph for Robust Object Detection | Xi Song et.al. | 2502.03761 | null |
2025-02-06 | RAMOTS: A Real-Time System for Aerial Multi-Object Tracking based on Deep Learning and Big Data Technology | Nhat-Tan Do et.al. | 2502.03760 | null |
2025-02-05 | An Empirical Study of Methods for Small Object Detection from Satellite Imagery | Xiaohui Yuan et.al. | 2502.03674 | null |
2025-02-05 | Gompertz Linear Units: Leveraging Asymmetry for Enhanced Learning Dynamics | Indrashis Das et.al. | 2502.03654 | null |
2025-02-05 | RoboGrasp: A Universal Grasping Policy for Robust Robotic Control | Yiqi Huang et.al. | 2502.03072 | null |
2025-02-05 | Enhancing Quantum-ready QUBO-based Suppression for Object Detection with Appearance and Confidence Features | Keiichiro Yamamura et.al. | 2502.02895 | null |
2025-02-05 | RS-YOLOX: A High Precision Detector for Object Detection in Satellite Remote Sensing Images | Lei Yang et.al. | 2502.02850 | null |
2025-02-04 | Learning the RoPEs: Better 2D and 3D Position Encodings with STRING | Connor Schenck et.al. | 2502.02562 | null |
2025-02-04 | Uncertainty Quantification for Collaborative Object Detection Under Adversarial Attacks | Huiqun Huang et.al. | 2502.02537 | null |
2025-02-04 | Improving Generalization Ability for 3D Object Detection by Learning Sparsity-invariant Features | Hsin-Cheng Lu et.al. | 2502.02322 | null |
2025-02-04 | From Fog to Failure: How Dehazing Can Harm Clear Image Object Detection | Ashutosh Kumar et.al. | 2502.02027 | null |
2025-02-04 | Memory Efficient Transformer Adapter for Dense Predictions | Dong Zhang et.al. | 2502.01962 | null |
2025-02-04 | INTACT: Inducing Noise Tolerance through Adversarial Curriculum Training for LiDAR-based Safety-Critical Perception and Autonomy | Nastaran Darabi et.al. | 2502.01896 | null |
2025-02-04 | SimBEV: A Synthetic Multi-Task Multi-Sensor Driving Data Generation Tool and Dataset | Goodarz Mehr et.al. | 2502.01894 | null |
2025-02-03 | Reliability-Driven LiDAR-Camera Fusion for Robust 3D Object Detection | Reza Sadeghian et.al. | 2502.01856 | null |
2025-02-03 | GauCho: Gaussian Distributions with Cholesky Decomposition for Oriented Object Detection | Jeffri Murrugarra-LLerena et.al. | 2502.01565 | null |
2025-02-03 | Human Body Restoration with One-Step Diffusion Model and A New Benchmark | Jue Gong et.al. | 2502.01411 | null |
2025-01-31 | Let Human Sketches Help: Empowering Challenging Image Segmentation Task with Freehand Sketches | Ying Zang et.al. | 2501.19329 | null |
2025-01-31 | Beyond checkmate: exploring the creative chokepoints in AI text | Nafis Irtiza Tripto et.al. | 2501.19301 | link |
2025-01-31 | GO: The Great Outdoors Multimodal Dataset | Peng Jiang et.al. | 2501.19274 | null |
2025-01-31 | Adversarial Attacks on AI-Generated Text Detection Models: A Token Probability-Based Approach Using Embeddings | Ahmed K. Kadhim et.al. | 2501.18998 | null |
2025-01-31 | Early Diagnosis and Severity Assessment of Weligama Coconut Leaf Wilt Disease and Coconut Caterpillar Infestation using Deep Learning-based Image Processing Techniques | Samitha Vidhanaarachchi et.al. | 2501.18835 | null |
2025-01-30 | Tuning Event Camera Biases Heuristic for Object Detection Applications in Staring Scenarios | David El-Chai Ben-Ezra et.al. | 2501.18788 | null |
2025-01-30 | Adaptive Object Detection for Indoor Navigation Assistance: A Performance Evaluation of Real-Time Algorithms | Abhinav Pratap et.al. | 2501.18444 | null |
2025-01-29 | Real Time Scheduling Framework for Multi Object Detection via Spiking Neural Networks | Donghwa Kang et.al. | 2501.18412 | null |
2025-01-30 | IROAM: Improving Roadside Monocular 3D Object Detection Learning from Autonomous Vehicle Data Domain | Zhe Wang et.al. | 2501.18162 | null |
2025-02-03 | Efficient Feature Fusion for UAV Object Detection | Xudong Wang et.al. | 2501.17983 | null |
2025-01-29 | TransRAD: Retentive Vision Transformer for Enhanced Radar Object Detection | Lei Cheng et.al. | 2501.17977 | link |
2025-01-28 | Object Detection with Deep Learning for Rare Event Search in the GADGET II TPC | Tyler Wheeler et.al. | 2501.17892 | null |
2025-01-29 | Detection of Oscillation-like Patterns in Eclipsing Binary Light Curves using Neural Network-based Object Detection Algorithms | Burak Ulaş et.al. | 2501.17538 | null |
2025-01-30 | Assessing the Capability of YOLO- and Transformer-based Object Detectors for Real-time Weed Detection | Alicia Allmendinger et.al. | 2501.17387 | null |
2025-01-28 | DINOSTAR: Deep Iterative Neural Object Detector Self-Supervised Training for Roadside LiDAR Applications | Muhammad Shahbaz et.al. | 2501.17076 | null |
2025-01-28 | Contextual Self-paced Learning for Weakly Supervised Spatio-Temporal Video Grounding | Akash Kumar et.al. | 2501.17053 | null |
2025-01-28 | Approach Towards Semi-Automated Certification for Low Criticality ML-Enabled Airborne Applications | Chandrasekar Sridhar et.al. | 2501.17028 | null |
2025-01-28 | Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection | Xiangyu Gao et.al. | 2501.16981 | null |
2025-01-28 | B-FPGM: Lightweight Face Detection via Bayesian-Optimized Soft FPGM Pruning | Nikolaos Kaparinos et.al. | 2501.16917 | null |
2025-01-28 | SSF-PAN: Semantic Scene Flow-Based Perception for Autonomous Navigation in Traffic Scenarios | Yinqi Chen et.al. | 2501.16754 | null |
2025-01-28 | DebugAgent: Efficient and Interpretable Error Slice Discovery for Comprehensive Model Debugging | Muxi Chen et.al. | 2501.16751 | null |
2025-01-28 | DFCon: Attention-Driven Supervised Contrastive Learning for Robust Deepfake Detection | MD Sadik Hossain Shanto et.al. | 2501.16704 | null |
2025-01-27 | Efficient Object Detection of Marine Debris using Pruned YOLO Model | Abi Aryaza et.al. | 2501.16571 | null |
2025-01-27 | Object Detection for Medical Image Analysis: Insights from the RT-DETR Model | Weijie He et.al. | 2501.16469 | null |
2025-01-27 | The Linear Attention Resurrection in Vision Transformer | Chuanyang Zheng et.al. | 2501.16182 | null |
2025-01-27 | Real-Time Brain Tumor Detection in Intraoperative Ultrasound Using YOLO11: From Model Training to Deployment in the Operating Room | Santiago Cepeda et.al. | 2501.15994 | null |
2025-01-26 | Classifying Deepfakes Using Swin Transformers | Aprille J. Xi et.al. | 2501.15656 | null |
2025-01-26 | A Privacy Enhancing Technique to Evade Detection by Street Video Cameras Without Using Adversarial Accessories | Jacob Shams et.al. | 2501.15653 | null |
2025-01-26 | Breaking the SSL-AL Barrier: A Synergistic Semi-Supervised Active Learning Framework for 3D Object Detection | Zengran Wang et.al. | 2501.15449 | null |
2025-01-26 | FAVbot: An Autonomous Target Tracking Micro-Robot with Frequency Actuation Control | Zhijian Hao et.al. | 2501.15426 | null |
2025-01-26 | Doracamom: Joint 3D Detection and Occupancy Prediction with Multi-view 4D Radars and Cameras for Omnidirectional Perception | Lianqing Zheng et.al. | 2501.15394 | null |
2025-01-26 | iFormer: Integrating ConvNet and Transformer for Mobile Application | Chuanyang Zheng et.al. | 2501.15369 | link |
2025-01-25 | Explainable YOLO-Based Dyslexia Detection in Synthetic Handwriting Data | Nora Fink et.al. | 2501.15263 | null |
2025-01-25 | SpikSSD: Better Extraction and Fusion for Object Detection with Spiking Neuron Networks | Yimeng Fan et.al. | 2501.15151 | link |
2025-01-24 | LiDAR-Based Vehicle Detection and Tracking for Autonomous Racing | Marcello Cellina et.al. | 2501.14502 | null |
2025-01-24 | TD-RD: A Top-Down Benchmark with Real-Time Framework for Road Damage Detection | Xi Xiao et.al. | 2501.14302 | null |
2025-01-24 | A Comprehensive Framework for Semantic Similarity Detection Using Transformer Architectures and Enhanced Ensemble Techniques | Lifu Gao et.al. | 2501.14288 | null |
2025-01-23 | Efficient Precision Control in Object Detection Models for Enhanced and Reliable Ovarian Follicle Counting | Vincent Blot et.al. | 2501.14036 | null |
2025-01-23 | PointOBB-v3: Expanding Performance Boundaries of Single Point-Supervised Oriented Object Detection | Peiyuan Zhang et.al. | 2501.13898 | link |
2025-01-23 | First Lessons Learned of an Artificial Intelligence Robotic System for Autonomous Coarse Waste Recycling Using Multispectral Imaging-Based Methods | Timo Lange et.al. | 2501.13855 | null |
2025-01-23 | Integrating Causality with Neurochaos Learning: Proposed Approach and Research Agenda | Nanjangud C. Narendra et.al. | 2501.13763 | null |
2025-01-23 | You Only Crash Once v2: Perceptually Consistent Strong Features for One-Stage Domain Adaptive Detection of Space Terrain | Timothy Chase Jr et.al. | 2501.13725 | null |
2025-01-23 | YOLO11-JDE: Fast and Accurate Multi-Object Tracking with Self-Supervised Re-ID | Iñaki Erregue et.al. | 2501.13710 | link |
2025-01-23 | Emotion estimation from video footage with LSTM | Samer Attrah et.al. | 2501.13432 | link |
2025-01-23 | Multi-aspect Knowledge Distillation with Large Language Model | Taegyeong Lee et.al. | 2501.13341 | null |
2025-01-22 | MONA: Moving Object Detection from Videos Shot by Dynamic Camera | Boxun Hu et.al. | 2501.13183 | null |
2025-01-21 | Large-image Object Detection for Fine-grained Recognition of Punches Patterns in Medieval Panel Painting | Josh Bruegger et.al. | 2501.12489 | link |
2025-01-21 | TOFFE -- Temporally-binned Object Flow from Events for High-speed and Energy-Efficient Object Detection and Tracking | Adarsh Kumar Kosta et.al. | 2501.12482 | null |
2025-01-21 | Benchmarking Image Perturbations for Testing Automated Driving Assistance Systems | Stefano Carlo Lambertenghi et.al. | 2501.12269 | null |
2025-01-21 | DLEN: Dual Branch of Transformer for Low-Light Image Enhancement in Dual Domains | Junyu Xia et.al. | 2501.12235 | null |
2025-01-21 | SVGS-DSGAT: An IoT-Enabled Innovation in Underwater Robotic Object Detection Technology | Dongli Wu et.al. | 2501.12169 | null |
2025-01-21 | Co-Paced Learning Strategy Based on Confidence for Flying Bird Object Detection Model Training | Zi-Wei Sun et.al. | 2501.12071 | null |
2025-01-21 | SMamba: Sparse Mamba for Event-based Object Detection | Nan Yang et.al. | 2501.11971 | null |
2025-01-21 | LuxVeri at GenAI Detection Task 1: Inverse Perplexity Weighted Ensemble for Robust Detection of AI-Generated Text across English and Multilingual Contexts | Md Kamrujjaman Mobin et.al. | 2501.11914 | null |
2025-01-20 | Synthetic Data Can Mislead Evaluations: Membership Inference as Machine Text Detection | Ali Naseh et.al. | 2501.11786 | null |
2025-01-20 | Everyone's Privacy Matters! An Analysis of Privacy Leakage from Real-World Facial Images on Twitter and Associated User Behaviors | Yuqi Niu et.al. | 2501.11756 | null |
2025-01-20 | Automatic Labelling & Semantic Segmentation with 4D Radar Tensors | Botao Sun et.al. | 2501.11351 | null |
2025-01-20 | Enhancing SAR Object Detection with Self-Supervised Pre-training on Masked Auto-Encoders | Xinyang Pu et.al. | 2501.11249 | null |
2025-01-17 | MutualForce: Mutual-Aware Enhancement for 4D Radar-LiDAR 3D Object Detection | Xiangyuan Peng et.al. | 2501.10266 | null |
2025-01-17 | Leveraging Confident Image Regions for Source-Free Domain-Adaptive Object Detection | Mohamed Lamine Mekhalfi et.al. | 2501.10081 | null |
2025-01-17 | One-D-Piece: Image Tokenizer Meets Quality-Controllable Compression | Keita Miwa et.al. | 2501.10064 | null |
2025-01-17 | LWGANet: A Lightweight Group Attention Backbone for Remote Sensing Visual Tasks | Wei Lu et.al. | 2501.10040 | link |
2025-01-17 | FLORA: Formal Language Model Enables Robust Training-free Zero-shot Object Referring Analysis | Zhe Chen et.al. | 2501.09887 | null |
2025-01-16 | Qwen it detect machine-generated text? | Teodor-George Marchitan et.al. | 2501.09813 | link |
2025-01-16 | A Simple Aerial Detection Baseline of Multimodal Language Models | Qingyun Li et.al. | 2501.09720 | link |
2025-01-16 | Practical Continual Forgetting for Pre-trained Vision Models | Hongbo Zhao et.al. | 2501.09705 | link |
2025-01-16 | Exploring AI-based System Design for Pixel-level Protected Health Information Detection in Medical Images | Tuan Truong et.al. | 2501.09552 | null |
2025-01-16 | Multi-task deep-learning for sleep event detection and stage classification | Adriana Anido-Alonso et.al. | 2501.09519 | link |
2025-01-16 | The Devil is in the Details: Simple Remedies for Image-to-LiDAR Representation Learning | Wonjun Jo et.al. | 2501.09485 | null |
2025-01-16 | MonoSOWA: Scalable monocular 3D Object detector Without human Annotations | Jan Skvrna et.al. | 2501.09481 | null |
2025-01-16 | RE-POSE: Synergizing Reinforcement Learning-Based Partitioning and Offloading for Edge Object Detection | Jianrui Shi et.al. | 2501.09465 | null |
2025-01-16 | On the Relation between Optical Aperture and Automotive Object Detection | Ofer Bar-Shalom et.al. | 2501.09456 | null |
2025-01-16 | SoccerSynth-Detection: A Synthetic Dataset for Soccer Player Detection | Haobin Qin et.al. | 2501.09281 | null |
2025-01-15 | GenAI Content Detection Task 3: Cross-Domain Machine-Generated Text Detection Challenge | Liam Dugan et.al. | 2501.08913 | null |
2025-01-15 | PACF: Prototype Augmented Compact Features for Improving Domain Adaptive Object Detection | Chenguang Liu et.al. | 2501.08605 | null |
2025-01-14 | Predicting Performance of Object Detection Models in Electron Microscopy Using Random Forests | Ni Li et.al. | 2501.08465 | link |
2025-01-14 | Bootstrapping Corner Cases: High-Resolution Inpainting for Safety Critical Detect and Avoid for Automated Flying | Jonathan Lyhs et.al. | 2501.08142 | null |
2025-01-14 | Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation | Yunzhi Zhuge et.al. | 2501.07806 | link |
2025-01-14 | Parameter-Inverted Image Pyramid Networks for Visual Perception and Multimodal Understanding | Zhaokai Wang et.al. | 2501.07783 | link |
2025-01-13 | SST-EM: Advanced Metrics for Evaluating Semantic, Spatial and Temporal Aspects in Video Editing | Varun Biyyala et.al. | 2501.07554 | link |
2025-01-13 | ML Mule: Mobile-Driven Context-Aware Collaborative Learning | Haoxiang Yu et.al. | 2501.07536 | null |
2025-01-13 | TimberVision: A Multi-Task Dataset and Framework for Log-Component Segmentation and Tracking in Autonomous Forestry Operations | Daniel Steininger et.al. | 2501.07360 | null |
2025-01-13 | Toward Realistic Camouflaged Object Detection: Benchmarks and Method | Zhimeng Xin et.al. | 2501.07297 | link |
2025-01-13 | Dual Scale-aware Adaptive Masked Knowledge Distillation for Object Detection | ZhouRui Zhang et.al. | 2501.07101 | null |
2025-01-11 | CoreNet: Conflict Resolution Network for Point-Pixel Misalignment and Sub-Task Suppression of 3D LiDAR-Camera Object Detection | Yiheng Li et.al. | 2501.06550 | link |
2025-01-11 | CPDR: Towards Highly-Efficient Salient Object Detection via Crossed Post-decoder Refinement | Yijie Li et.al. | 2501.06441 | null |
2025-01-11 | FocusDD: Real-World Scene Infusion for Robust Dataset Distillation | Youbing Hu et.al. | 2501.06405 | null |
2025-01-10 | A Holistically Point-guided Text Framework for Weakly-Supervised Camouflaged Object Detection | Tsui Qin Mok et.al. | 2501.06038 | null |
2025-01-10 | Minimizing Occlusion Effect on Multi-View Camera Perception in BEV with Multi-Sensor Fusion | Sanjay Kumar et.al. | 2501.05997 | null |
2025-01-10 | EDNet: Edge-Optimized Small Target Detection in UAV Imagery -- Faster Context Attention, Better Feature Fusion, and Hardware Acceleration | Zhifan Song et.al. | 2501.05885 | null |
2025-01-10 | Automatic detection of single-electron regime of quantum dots and definition of virtual gates using U-Net and clustering | Yui Muto et.al. | 2501.05878 | null |
2025-01-10 | Zero-shot Shark Tracking and Biometrics from Aerial Imagery | Chinmay K Lalgudi et.al. | 2501.05717 | null |
2025-01-10 | Dark Energy Survey Year 6 Results: Synthetic-source Injection Across the Full Survey Using Balrog | D. Anbajagane et.al. | 2501.05683 | null |
2025-01-09 | Approximate Supervised Object Distance Estimation on Unmanned Surface Vehicles | Benjamin Kiefer et.al. | 2501.05567 | null |
2025-01-09 | Performance of YOLOv7 in Kitchen Safety While Handling Knife | Athulya Sundaresan Geetha et.al. | 2501.05399 | null |
2025-01-09 | The global consensus on the risk management of autonomous driving | Sebastian Krügel et.al. | 2501.05391 | null |
2025-01-09 | A Systematic Literature Review on Deep Learning-based Depth Estimation in Computer Vision | Ali Rohan et.al. | 2501.05147 | null |
2025-01-09 | CorrDiff: Adaptive Delay-aware Detector with Temporal Cue Inputs for Real-time Object Detection | Xiang Zhang et.al. | 2501.05132 | null |
2025-01-09 | AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data | Haoran Zhu et.al. | 2501.04969 | link |
2025-01-09 | Online Continual Learning: A Systematic Literature Review of Approaches, Challenges, and Benchmarks | Seyed Amir Bidaki et.al. | 2501.04897 | link |
2025-01-08 | Video Summarisation with Incident and Context Information using Generative AI | Ulindu De Silva et.al. | 2501.04764 | null |
2025-01-08 | Boosting Salient Object Detection with Knowledge Distillated from Large Foundation Models | Miaoyang He et.al. | 2501.04582 | null |
2025-01-08 | Combining YOLO and Visual Rhythm for Vehicle Counting | Victor Nascimento Ribeiro et.al. | 2501.04534 | link |
2025-01-08 | RSAR: Restricted State Angle Resolver and Rotated SAR Benchmark | Xin Zhang et.al. | 2501.04440 | link |
2025-01-08 | Integrating LLMs with ITS: Recent Advances, Potentials, Challenges, and Future Directions | Doaa Mahmud et.al. | 2501.04437 | null |
2025-01-08 | FGU3R: Fine-Grained Fusion via Unified 3D Representation for Multimodal 3D Object Detection | Guoxin Zhang et.al. | 2501.04373 | null |
2025-01-08 | H-MBA: Hierarchical MamBa Adaptation for Multi-Modal Video Understanding in Autonomous Driving | Siran Chen et.al. | 2501.04302 | null |
2025-01-08 | UPAQ: A Framework for Real-Time and Energy-Efficient 3D Object Detection in Autonomous Vehicles | Abhishek Balasubramaniam et.al. | 2501.04213 | null |
2025-01-07 | LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving | Lingdong Kong et.al. | 2501.04005 | null |
2025-01-07 | Not all tokens are created equal: Perplexity Attention Weighted Networks for AI generated text detection | Pablo Miralles-González et.al. | 2501.03940 | null |
2025-01-07 | Visual question answering: from early developments to recent advances -- a survey | Ngoc Dung Huynh et.al. | 2501.03939 | null |
2025-01-07 | SCC-YOLO: An Improved Object Detector for Assisting in Brain Tumor Diagnosis | Runci Bai et.al. | 2501.03836 | null |
2025-01-07 | Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection | Xinbin Yuan et.al. | 2501.03775 | link |
2025-01-07 | AuxDepthNet: Real-Time Monocular 3D Object Detection with Depth-Sensitive Features | Ruochen Zhang et.al. | 2501.03700 | null |
2025-01-07 | Anomaly Triplet-Net: Progress Recognition Model Using Deep Metric Learning Considering Occlusion for Manual Assembly Work | Takumi Kitsukawa et.al. | 2501.03533 | null |
2025-01-07 | SceneVTG++: Controllable Multilingual Visual Text Generation in the Wild | Jiawei Liu et.al. | 2501.02962 | null |
2025-01-05 | Multispectral Pedestrian Detection with Sparsely Annotated Label | Chan Lee et.al. | 2501.02640 | null |
2025-01-05 | Generalization-Enhanced Few-Shot Object Detection in Remote Sensing | Hui Lin et.al. | 2501.02474 | link |
2025-01-04 | Who Wrote This? Zero-Shot Statistical Tests for LLM-Generated Text Detection using Finite Sample Concentration Inequalities | Tara Radvand et.al. | 2501.02406 | null |
2025-01-04 | V2X-DGPE: Addressing Domain Gaps and Pose Errors for Robust Collaborative 3D Object Detection | Sichao Wang et.al. | 2501.02363 | null |
2025-01-04 | Accurate Crop Yield Estimation of Blueberries using Deep Learning and Smart Drones | Hieu D. Nguyen et.al. | 2501.02344 | null |
2025-01-04 | On The Causal Network Of Face-selective Regions In Human Brain During Movie Watching | Ali Bavafa et.al. | 2501.02333 | null |
2025-01-04 | RadarNeXt: Real-Time and Reliable 3D Object Detector Based On 4D mmWave Imaging Radar | Liye Jia et.al. | 2501.02314 | null |
2025-01-03 | A Separable Self-attention Inspired by the State Space Model for Computer Vision | Juntao Zhang et.al. | 2501.02040 | link |
2025-01-03 | UAV-DETR: Efficient End-to-End Object Detection for Unmanned Aerial Vehicle Imagery | Huaxiang Zhang et.al. | 2501.01855 | null |
2025-01-03 | Dual Mutual Learning Network with Global-local Awareness for RGB-D Salient Object Detection | Kang Yi et.al. | 2501.01648 | null |
2025-01-02 | A Multi-task Supervised Compression Model for Split Computing | Yoshitomo Matsubara et.al. | 2501.01420 | link |
2025-01-02 | MSC-Bench: Benchmarking and Analyzing Multi-Sensor Corruption for Driving Perception | Xiaoshuai Hao et.al. | 2501.01037 | null |
2025-01-01 | A Novel Approach using CapsNet and Deep Belief Network for Detection and Identification of Oral Leukopenia | Hirthik Mathesh GV et.al. | 2501.00876 | null |
2025-01-01 | NMM-HRI: Natural Multi-modal Human-Robot Interaction with Voice and Deictic Posture via Large Language Model | Yuzhi Lai et.al. | 2501.00785 | null |
2024-12-31 | Gaussian Building Mesh (GBM): Extract a Building's 3D Mesh with Google Earth and Gaussian Splatting | Kyle Gao et.al. | 2501.00625 | null |
2024-12-31 | B2Net: Camouflaged Object Detection via Boundary Aware and Boundary Fusion | Junmin Cai et.al. | 2501.00426 | null |
2024-12-31 | Research on vehicle detection based on improved YOLOv8 network | Haocheng Guo et.al. | 2501.00300 | null |
2024-12-30 | TiGDistill-BEV: Multi-view BEV 3D Object Detection via Target Inner-Geometry Learning Distillation | Shaoqing Xu et.al. | 2412.20911 | link |
2024-12-30 | Humanoid Robot RHP Friends: Seamless Combination of Autonomous and Teleoperated Tasks in a Nursing Context | Mehdi Benallegue et.al. | 2412.20770 | null |
2024-12-30 | Solar Filaments Detection using Active Contours Without Edges | Sanmoy Bandyopadhyay et.al. | 2412.20749 | null |
2024-12-30 | Open-Set Object Detection By Aligning Known Class Representations | Hiran Sarkar et.al. | 2412.20701 | null |
2024-12-30 | SM3Det: A Unified Model for Multi-Modal Remote Sensing Object Detection | Yuxuan Li et.al. | 2412.20665 | link |
2024-12-30 | YOLO-UniOW: Efficient Universal Open-World Object Detection | Lihao Liu et.al. | 2412.20645 | link |
2024-12-29 | Controlling Out-of-Domain Gaps in LLMs for Genre Classification and Generated Text Detection | Dmitri Roussinov et.al. | 2412.20595 | link |
2024-12-29 | A Novel FPGA-based CNN Hardware Accelerator: Optimization for Convolutional Layers using Karatsuba Ofman Multiplier | Amit Sarkar et.al. | 2412.20393 | null |
2024-12-29 | Differential Evolution Integrated Hybrid Deep Learning Model for Object Detection in Pre-made Dishes | Lujia Lv et.al. | 2412.20370 | null |
2024-12-28 | Plastic Waste Classification Using Deep Learning: Insights from the WaDaBa Dataset | Suman Kunwar et.al. | 2412.20232 | null |
2024-12-27 | Chimera: A Block-Based Neural Architecture Search Framework for Event-Based Object Detection | Diego A. Silva et.al. | 2412.19646 | null |
2024-12-27 | Optimizing Helmet Detection with Hybrid YOLO Pipelines: A Detailed Analysis | Vaikunth M et.al. | 2412.19467 | null |
2024-12-26 | Revisiting Monocular 3D Object Detection from Scene-Level Depth Retargeting to Instance-Level Spatial Refinement | Qiude Zhang et.al. | 2412.19165 | null |
2024-12-26 | From Coin to Data: The Impact of Object Detection on Digital Numismatics | Rafael Cabral et.al. | 2412.19091 | null |
2024-12-26 | Assessing Pre-trained Models for Transfer Learning through Distribution of Spectral Components | Tengxue Zhang et.al. | 2412.19085 | null |
2024-12-25 | MTCAE-DFER: Multi-Task Cascaded Autoencoder for Dynamic Facial Expression Recognition | Peihao Xiang et.al. | 2412.18988 | null |
2024-12-25 | CGCOD: Class-Guided Camouflaged Object Detection | Chenxi Zhang et.al. | 2412.18977 | null |
2024-12-25 | HV-BEV: Decoupling Horizontal and Vertical Feature Sampling for Multi-View 3D Object Detection | Di Wu et.al. | 2412.18884 | null |
2024-12-25 | TSceneJAL: Joint Active Learning of Traffic Scenes for 3D Object Detection | Chenyang Lei et.al. | 2412.18870 | null |
2024-12-25 | Distortion-Aware Adversarial Attacks on Bounding Boxes of Object Detectors | Pham Phuc et.al. | 2412.18815 | link |
2024-12-24 | Sampling Bag of Views for Open-Vocabulary Object Detection | Hojun Choi et.al. | 2412.18273 | null |
2024-12-24 | Efficient Detection Framework Adaptation for Edge Computing: A Plug-and-play Neural Network Toolbox Enabling Edge Deployment | Jiaqi Wu et.al. | 2412.18230 | null |
2024-12-24 | SDM-Car: A Dataset for Small and Dim Moving Vehicles Detection in Satellite Videos | Zhen Zhang et.al. | 2412.18214 | link |
2024-12-24 | Spectrum-oriented Point-supervised Saliency Detector for Hyperspectral Images | Peifu Liu et.al. | 2412.18112 | link |
2024-12-24 | Multi-Point Positional Insertion Tuning for Small Object Detection | Kanoko Goto et.al. | 2412.18090 | null |
2024-12-24 | COMO: Cross-Mamba Interaction and Offset-Guided Fusion for Multimodal Object Detection | Chang Liu et.al. | 2412.18076 | null |
2024-12-23 | Comprehensive Multi-Modal Prototypes are Simple and Effective Classifiers for Vast-Vocabulary Object Detection | Yitong Chen et.al. | 2412.17800 | link |
2024-12-23 | Enhanced Temporal Processing in Spiking Neural Networks for Static Object Detection Using 3D Convolutions | Huaxu He et.al. | 2412.17654 | null |
2024-12-23 | Impact of Evidence Theory Uncertainty on Training Object Detection Models | M. Tahasanul Ibrahim et.al. | 2412.17405 | null |
2024-12-23 | Feature Based Methods Domain Adaptation for Object Detection: A Review Paper | Helia Mohamadi et.al. | 2412.17325 | null |
2024-12-23 | Towards Unsupervised Model Selection for Domain Adaptive Object Detection | Hengfu Yu et.al. | 2412.17284 | null |
2024-12-22 | NumbOD: A Spatial-Frequency Fusion Attack Against Object Detectors | Ziqi Zhou et.al. | 2412.16955 | link |
2024-12-22 | Separating Drone Point Clouds From Complex Backgrounds by Cluster Filter -- Technical Report for CVPR 2024 UG2 Challenge | Hanfang Liang et.al. | 2412.16947 | null |
2024-12-22 | Seamless Detection: Unifying Salient Object Detection and Camouflaged Object Detection | Yi Liu et.al. | 2412.16840 | link |
2024-12-22 | Human-Guided Image Generation for Expanding Small-Scale Training Image Datasets | Changjian Chen et.al. | 2412.16839 | null |
2024-12-21 | IV-tuning: Parameter-Efficient Transfer Learning for Infrared-Visible Tasks | Yaming Zhang et.al. | 2412.16654 | link |
2024-12-20 | NeRF-To-Real Tester: Neural Radiance Fields as Test Image Generators for Vision of Autonomous Systems | Laura Weihl et.al. | 2412.16141 | null |
2024-12-20 | MR-GDINO: Efficient Open-World Continual Object Detection | Bowen Dong et.al. | 2412.15979 | link |
2024-12-20 | Mask-RadarNet: Enhancing Transformer With Spatial-Temporal Semantic Context for Radar Object Detection in Autonomous Driving | Yuzhi Wu et.al. | 2412.15595 | null |
2024-12-19 | Exploring Machine Learning Engineering for Object Detection and Tracking by Unmanned Aerial Vehicle (UAV) | Aneesha Guna et.al. | 2412.15347 | null |
2024-12-19 | Leveraging Color Channel Independence for Improved Unsupervised Object Detection | Bastian Jäckl et.al. | 2412.15150 | null |
2024-12-19 | Explainable Tampered Text Detection via Multimodal Large Models | Chenfan Qu et.al. | 2412.14816 | null |
2024-12-19 | Explicit Relational Reasoning Network for Scene Text Detection | Yuchen Su et.al. | 2412.14692 | null |
2024-12-19 | A Light-Weight Framework for Open-Set Object Detection with Decoupled Feature Alignment in Joint Space | Yonghao He et.al. | 2412.14680 | link |
2024-12-19 | Progressive Fine-to-Coarse Reconstruction for Accurate Low-Bit Post-Training Quantization in Vision Transformers | Rui Ding et.al. | 2412.14633 | null |
2024-12-19 | Alignment-Free RGB-T Salient Object Detection: A Large-scale Dataset and Progressive Correlation Network | Kunpeng Wang et.al. | 2412.14576 | link |
2024-12-19 | SCKD: Semi-Supervised Cross-Modality Knowledge Distillation for 4D Radar Object Detection | Ruoyu Xu et.al. | 2412.14571 | null |
2024-12-18 | HA-RDet: Hybrid Anchor Rotation Detector for Oriented Object Detection | Phuc D. A. Nguyen et.al. | 2412.14379 | link |
2024-12-18 | Joint Perception and Prediction for Autonomous Driving: A Survey | Lucas Dal'Col et.al. | 2412.14088 | link |
2024-12-18 | Object Style Diffusion for Generalized Object Detection in Urban Scene | Hao Li et.al. | 2412.13815 | null |
2024-12-18 | MMO-IG: Multi-Class and Multi-Scale Object Image Generation for Remote Sensing | Chuang Yang et.al. | 2412.13684 | null |
2024-12-18 | Evaluation of LLM Vulnerabilities to Being Misused for Personalized Disinformation Generation | Aneta Zugecova et.al. | 2412.13666 | null |
2024-12-18 | Multi-View Pedestrian Occupancy Prediction with a Novel Synthetic Dataset | Sithu Aung et.al. | 2412.13569 | null |
2024-12-18 | Comparative Analysis of YOLOv9, YOLOv10 and RT-DETR for Real-Time Weed Detection | Ahmet Oğuz Saltık et.al. | 2412.13490 | null |
2024-12-17 | Continuous Patient Monitoring with AI: Real-Time Analysis of Video in Hospital Care Settings | Paolo Gabriel et.al. | 2412.13152 | null |
2024-12-17 | A New Adversarial Perspective for LiDAR-based 3D Object Detection | Shijun Zheng et.al. | 2412.13017 | null |
2024-12-17 | What is YOLOv6? A Deep Insight into the Object Detection Model | Athulya Sundaresan Geetha et.al. | 2412.13006 | null |
2024-12-17 | Differential Alignment for Domain Adaptive Object Detection | Xinyu He et.al. | 2412.12830 | null |
2024-12-17 | RCTrans: Radar-Camera Transformer via Radar Densifier and Sequential Decoder for 3D Object Detection | Yiheng Li et.al. | 2412.12799 | link |
2024-12-17 | RaCFormer: Towards High-Quality 3D Object Detection via Query-based Radar-Camera Fusion | Xiaomeng Chu et.al. | 2412.12725 | null |
2024-12-17 | Efficient Oriented Object Detection with Enhanced Small Object Recognition in Aerial Images | Zhifei Shi et.al. | 2412.12562 | null |
2024-12-17 | CREST: An Efficient Conjointly-trained Spike-driven Framework for Event-based Object Detection Exploiting Spatiotemporal Dynamics | Ruixin Mao et.al. | 2412.12525 | link |
2024-12-17 | PromptDet: A Lightweight 3D Object Detection Framework with LiDAR Prompts | Kun Guo et.al. | 2412.12460 | link |
2024-12-16 | Domain Generalization in Autonomous Driving: Evaluating YOLOv8s, RT-DETR, and YOLO-NAS with the ROAD-Almaty Dataset | Madiyar Alimov et.al. | 2412.12349 | null |
2024-12-16 | Coconut Palm Tree Counting on Drone Images with Deep Object Detection and Synthetic Training Data | Tobias Rohe et.al. | 2412.11949 | null |
2024-12-16 | Sonar-based Deep Learning in Underwater Robotics: Overview, Robustness and Challenges | Martin Aubard et.al. | 2412.11840 | null |
2024-12-16 | CLDA-YOLO: Visual Contrastive Learning Based Domain Adaptive YOLO Detector | Tianheng Qiu et.al. | 2412.11812 | null |
2024-12-16 | PhysAug: A Physical-guided and Frequency-based Data Augmentation for Single-Domain Generalized Object Detection | Xiaoran Xu et.al. | 2412.11807 | link |
2024-12-16 | Impact of Face Alignment on Face Image Quality | Eren Onaran et.al. | 2412.11779 | null |
2024-12-16 | Learning UAV-based path planning for efficient localization of objects using prior knowledge | Rick van Essen et.al. | 2412.11717 | null |
2024-12-16 | Oriented Tiny Object Detection: A Dataset, Benchmark, and Dynamic Unbiased Learning | Chang Xu et.al. | 2412.11582 | null |
2024-12-16 | Glimpse: Enabling White-Box Methods to Use Proprietary Models for Zero-Shot LLM-Generated Text Detection | Guangsheng Bao et.al. | 2412.11506 | link |
2024-12-16 | HGSFusion: Radar-Camera Fusion with Hybrid Generation and Synchronization for 3D Object Detection | Zijian Gu et.al. | 2412.11489 | link |
2024-12-16 | Universal Domain Adaptive Object Detection via Dual Probabilistic Alignment | Yuanfan Zheng et.al. | 2412.11443 | link |
2024-12-13 | A dual contrastive framework | Yuan Sun et.al. | 2412.10348 | null |
2024-12-13 | MVQ:Towards Efficient DNN Compression and Acceleration with Masked Vector Quantization | Shuaiting Li et.al. | 2412.10261 | null |
2024-12-13 | Copy-Move Detection in Optical Microscopy: A Segmentation Network and A Dataset | Hao-Chiang Shao et.al. | 2412.10258 | null |
2024-12-13 | UN-DETR: Promoting Objectness Learning via Joint Supervision for Unknown Object Detection | Haomiao Liu et.al. | 2412.10176 | link |
2024-12-13 | HS-FPN: High Frequency and Spatial Perception FPN for Tiny Object Detection | Zican Shi et.al. | 2412.10116 | null |
2024-12-13 | RemDet: Rethinking Efficient Model Design for UAV Object Detection | Chen Li et.al. | 2412.10040 | link |
2024-12-13 | Timealign: A multi-modal object detection method for time misalignment fusing in autonomous driving | Zhihang Song et.al. | 2412.10033 | null |
2024-12-13 | Object-Focused Data Selection for Dense Prediction Tasks | Niclas Popp et.al. | 2412.10032 | null |
2024-12-13 | CP-DETR: Concept Prompt Guide DETR Toward Stronger Universal Object Detection | Qibo Chen et.al. | 2412.09799 | null |
2024-12-12 | FD2-Net: Frequency-Driven Feature Decomposition Network for Infrared-Visible Object Detection | Ke Li et.al. | 2412.09258 | null |
2024-12-12 | UADet: A Remarkably Simple Yet Effective Uncertainty-Aware Open-Set Object Detection Framework | Silin Cheng et.al. | 2412.09229 | null |
2024-12-12 | ContextHOI: Spatial Context Learning for Human-Object Interaction Detection | Mingda Jia et.al. | 2412.09050 | null |
2024-12-12 | STEAM: Squeeze and Transform Enhanced Attention Module | Rishabh Sabharwal et.al. | 2412.09023 | null |
2024-12-12 | Sensing for Space Safety and Sustainability: A Deep Learning Approach with Vision Transformers | Wenxuan Zhang et.al. | 2412.08913 | null |
2024-12-11 | DALI: Domain Adaptive LiDAR Object Detection via Distribution-level and Instance-level Pseudo Label Denoising | Xiaohu Lu et.al. | 2412.08806 | link |
2024-12-11 | Utilizing Multi-step Loss for Single Image Reflection Removal | Abdelrahman Elnenaey et.al. | 2412.08582 | link |
2024-12-11 | PointCFormer: a Relation-based Progressive Feature Extraction Network for Point Cloud Completion | Yi Zhong et.al. | 2412.08421 | null |
2024-12-11 | Physical Informed Driving World Model | Zhuoran Yang et.al. | 2412.08410 | null |
2024-12-11 | Wasserstein Distance Rivals Kullback-Leibler Divergence for Knowledge Distillation | Jiaming Lv et.al. | 2412.08139 | null |
2024-12-11 | DTAA: A Detect, Track and Avoid Architecture for navigation in spaces with Multiple Velocity Objects | Samuel Nordström et.al. | 2412.08121 | null |
2024-12-11 | THUD++: Large-Scale Dynamic Indoor Scene Dataset and Benchmark for Mobile Robots | Zeshun Li et.al. | 2412.08096 | null |
2024-12-11 | MAGIC: Mastering Physical Adversarial Generation in Context through Collaborative LLM Agents | Yun Xing et.al. | 2412.08014 | null |
2024-12-10 | Low-Latency Scalable Streaming for Event-Based Vision | Andrew Hamara et.al. | 2412.07889 | null |
2024-12-10 | Leveraging Content and Context Cues for Low-Light Image Enhancement | Igor Morawski et.al. | 2412.07693 | link |
2024-12-10 | Multimodal Contextualized Support for Enhancing Video Retrieval System | Quoc-Bao Nguyen-Le et.al. | 2412.07584 | null |
2024-12-10 | Making the Flow Glow -- Robot Perception under Severe Lighting Conditions using Normalizing Flow Gradients | Simon Kristoffersson Lind et.al. | 2412.07565 | null |
2024-12-10 | Enhancing 3D Object Detection in Autonomous Vehicles Based on Synthetic Virtual Environment Analysis | Vladislav Li et.al. | 2412.07509 | null |
2024-12-10 | DSFEC: Efficient and Deployable Deep Radar Object Detection | Gayathri Dandugula et.al. | 2412.07411 | null |
2024-12-10 | Benchmarking Vision-Based Object Tracking for USVs in Complex Maritime Environments | Muhayy Ud Din et.al. | 2412.07392 | null |
2024-12-09 | FlexEvent: Event Camera Object Detection at Arbitrary Frequencies | Dongyue Lu et.al. | 2412.06708 | null |
2024-12-09 | EMOv2: Pushing 5M Vision Model Frontier | Jiangning Zhang et.al. | 2412.06674 | link |
2024-12-09 | Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset | Xiao Wang et.al. | 2412.06647 | null |
2024-12-09 | Prediction of Occluded Pedestrians in Road Scenes using Human-like Reasoning: Insights from the OccluRoads Dataset | Melo Castillo Angie Nataly et.al. | 2412.06549 | null |
2024-12-09 | Self-Paced Learning Strategy with Easy Sample Prior Based on Confidence for the Flying Bird Object Detection Model Training | Zi-Wei Sun et.al. | 2412.06306 | null |
2024-12-09 | No Annotations for Object Detection in Art through Stable Diffusion | Patrick Ramos et.al. | 2412.06286 | link |
2024-12-09 | DenseVLM: A Retrieval and Decoupled Alignment Framework for Open-Vocabulary Dense Prediction | Yunheng Li et.al. | 2412.06244 | null |
2024-12-09 | A Real-Time Defense Against Object Vanishing Adversarial Patch Attacks for Object Detection in Autonomous Vehicles | Jaden Mu et.al. | 2412.06215 | null |
2024-12-09 | PoLaRIS Dataset: A Maritime Object Detection and Tracking Dataset in Pohang Canal | Jiwon Choi et.al. | 2412.06192 | null |
2024-12-08 | Tiny Object Detection with Single Point Supervision | Haoran Zhu et.al. | 2412.05837 | null |
2024-12-06 | From classical techniques to convolution-based models: A review of object detection algorithms | Fnu Neha et.al. | 2412.05252 | null |
2024-12-06 | Towards Flexible 3D Perception: Object-Centric Occupancy Completion Augments 3D Object Detection | Chaoda Zheng et.al. | 2412.05154 | link |
2024-12-06 | DEYOLO: Dual-Feature-Enhancement YOLO for Cross-Modality Object Detection | Yishuo Chen et.al. | 2412.04931 | link |
2024-12-06 | Beyond Boxes: Mask-Guided Spatio-Temporal Feature Aggregation for Video Object Detection | Khurram Azeem Hashmi et.al. | 2412.04915 | null |
2024-12-05 | Cubify Anything: Scaling Indoor 3D Object Detection | Justin Lazarow et.al. | 2412.04458 | null |
2024-12-05 | Reflective Teacher: Semi-Supervised Multimodal 3D Object Detection in Bird's-Eye-View via Uncertainty Measure | Saheli Hazra et.al. | 2412.04337 | null |
2024-12-05 | YOLO-CCA: A Context-Based Approach for Traffic Sign Detection | Linfeng Jiang et.al. | 2412.04289 | link |
2024-12-05 | DEIM: DETR with Improved Matching for Fast Convergence | Shihua Huang et.al. | 2412.04234 | link |
2024-12-05 | Frequency-Adaptive Low-Latency Object Detection Using Events and Frames | Haitian Zhang et.al. | 2412.04149 | null |
2024-12-05 | MVUDA: Unsupervised Domain Adaptation for Multi-view Pedestrian Detection | Erik Brorsson et.al. | 2412.04117 | link |
2024-12-05 | Thermal and RGB Images Work Better Together in Wind Turbine Damage Detection | Serhii Svystun et.al. | 2412.04114 | null |
2024-12-05 | SoRA: Singular Value Decomposed Low-Rank Adaptation for Domain Generalizable Representation Learning | Seokju Yun et.al. | 2412.04077 | null |
2024-12-05 | Space to Policy: Scalable Brick Kiln Detection and Automatic Compliance Monitoring with Geospatial Data | Zeel B Patel et.al. | 2412.04065 | null |
2024-12-05 | UNCOVER: Unknown Class Object Detection for Autonomous Vehicles in Real-time | Lars Schmarje et.al. | 2412.03986 | null |
2024-12-04 | Perception Tokens Enhance Visual Reasoning in Multimodal Language Models | Mahtab Bigverdi et.al. | 2412.03548 | null |
2024-12-04 | Data Fusion of Semantic and Depth Information in the Context of Object Detection | Md Abu Yusuf et.al. | 2412.03490 | null |
2024-12-04 | Task-driven Image Fusion with Learnable Fusion Loss | Haowen Bai et.al. | 2412.03240 | null |
2024-12-04 | ObjectFinder: Open-Vocabulary Assistive System for Interactive Object Search by Blind People | Ruiping Liu et.al. | 2412.03118 | null |
2024-12-04 | TREND: Unsupervised 3D Representation Learning via Temporal Forecasting for LiDAR Perception | Runjian Chen et.al. | 2412.03054 | null |
2024-12-04 | Assessing the performance of CT image denoisers using Laguerre-Gauss Channelized Hotelling Observer for lesion detection | Prabhat Kc et.al. | 2412.02920 | null |
2024-12-03 | EvRT-DETR: The Surprising Effectiveness of DETR-based Detection for Event Cameras | Dmitrii Torbunov et.al. | 2412.02890 | null |
2024-12-03 | Optimized CNNs for Rapid 3D Point Cloud Object Recognition | Tianyi Lyu et.al. | 2412.02855 | null |
2024-12-03 | Gaussian Splatting Under Attack: Investigating Adversarial Noise in 3D Objects | Abdurrahman Zeybey et.al. | 2412.02803 | null |
2024-12-03 | SJTU:Spatial judgments in multimodal models towards unified segmentation through coordinate detection | Joongwon Chae et.al. | 2412.02565 | null |
2024-12-03 | Underload: Defending against Latency Attacks for Object Detectors on Edge Devices | Tianyi Wang et.al. | 2412.02171 | null |
2024-12-03 | Redundant Queries in DETR-Based 3D Detection Methods: Unnecessary and Prunable | Lizhen Xu et.al. | 2412.02054 | null |
2024-12-02 | Smart Parking with Pixel-Wise ROI Selection for Vehicle Detection Using YOLOv8, YOLOv9, YOLOv10, and YOLOv11 | Gustavo P. C. P. da Luz et.al. | 2412.01983 | null |
2024-12-02 | HPRM: High-Performance Robotic Middleware for Intelligent Autonomous Systems | Jacky Kwok et.al. | 2412.01799 | null |
2024-12-02 | Identifying Reliable Predictions in Detection Transformers | Young-Jin Park et.al. | 2412.01782 | null |
2024-12-02 | FEVER-OOD: Free Energy Vulnerability Elimination for Robust Out-of-Distribution Detection | Brian K. S. Isaac-Medina et.al. | 2412.01596 | null |
2024-12-02 | Divide-and-Conquer: Confluent Triple-Flow Network for RGB-T Salient Object Detection | Hao Tang et.al. | 2412.01556 | null |
2024-12-03 | GFreeDet: Exploiting Gaussian Splatting and Foundation Models for Model-free Unseen Object Detection in the BOP Challenge 2024 | Xingyu Liu et.al. | 2412.01552 | null |
2024-12-02 | Improving Object Detection by Modifying Synthetic Data with Explainable AI | Nitish Mital et.al. | 2412.01477 | null |
2024-11-29 | SpaRC: Sparse Radar-Camera Fusion for 3D Object Detection | Philipp Wolters et.al. | 2411.19860 | null |
2024-11-29 | Feedback-driven object detection and iterative model improvement | Sönke Tenckhoff et.al. | 2411.19835 | link |
2024-11-29 | Real-Time Anomaly Detection in Video Streams | Fabien Poirier et.al. | 2411.19731 | null |
2024-11-29 | LDA-AQU: Adaptive Query-guided Upsampling via Local Deformable Attention | Zewen Du et.al. | 2411.19585 | link |
2024-11-29 | Bootstraping Clustering of Gaussians for View-consistent 3D Scene Understanding | Wenbo Zhang et.al. | 2411.19551 | null |
2024-11-28 | Automatic Prompt Generation and Grounding Object Detection for Zero-Shot Image Anomaly Detection | Tsun-Hin Cheung et.al. | 2411.19220 | null |
2024-11-28 | Co-Learning: Towards Semi-Supervised Object Detection with Road-side Cameras | Jicheng Yuan et.al. | 2411.19143 | null |
2024-11-28 | On Moving Object Segmentation from Monocular Video with Transformers | Christian Homeyer et.al. | 2411.19141 | null |
2024-11-28 | Dynamic Attention and Bi-directional Fusion for Safety Helmet Wearing Detection | Junwei Feng et.al. | 2411.19071 | null |
2024-11-28 | MVFormer: Diversifying Feature Normalization and Token Mixing for Efficient Vision Transformers | Jongseong Bae et.al. | 2411.18995 | null |
2024-11-27 | Exploring Depth Information for Detecting Manipulated Face Videos | Haoyue Wang et.al. | 2411.18572 | null |
2024-11-27 | Efficient Dynamic LiDAR Odometry for Mobile Robots with Structured Point Clouds | Jonathan Lichtenfeld et.al. | 2411.18443 | link |
2024-11-27 | Deep Fourier-embedded Network for Bi-modal Salient Object Detection | Pengfei Lyu et.al. | 2411.18409 | link |
2024-11-27 | Optimizing Multispectral Object Detection: A Bag of Tricks and Comprehensive Benchmarks | Chen Zhou et.al. | 2411.18288 | link |
2024-11-27 | From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects | Zizhao Li et.al. | 2411.18207 | link |
2024-11-27 | RPEE-HEADS: A Novel Benchmark for Pedestrian Head Detection in Crowd Videos | Mohamad Abubaker et.al. | 2411.18164 | null |
2024-11-27 | Revisiting Misalignment in Multispectral Pedestrian Detection: A Language-Driven Approach for Cross-modal Alignment Fusion | Taeheon Kim et.al. | 2411.17995 | null |
2024-11-27 | ROICtrl: Boosting Instance Control for Visual Generation | Yuchao Gu et.al. | 2411.17949 | null |
2024-11-26 | Box for Mask and Mask for Box: weak losses for multi-task partially supervised learning | Hoàng-Ân Lê et.al. | 2411.17536 | link |
2024-11-26 | TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba | Xiaowen Ma et.al. | 2411.17473 | link |
2024-11-26 | Communication-Efficient Cooperative SLAMMOT via Determining the Number of Collaboration Vehicles | Susu Fang et.al. | 2411.17432 | null |
2024-11-26 | DGNN-YOLO: Dynamic Graph Neural Networks with YOLO11 for Small Object Detection and Tracking in Traffic Surveillance | Shahriar Soudeep et.al. | 2411.17251 | null |
2024-11-26 | Event-based Spiking Neural Networks for Object Detection: A Review of Datasets, Architectures, Learning Rules, and Implementation | Craig Iaboni et.al. | 2411.17006 | link |
2024-11-25 | Online Episodic Memory Visual Query Localization with Egocentric Streaming Object Memory | Zaira Manigrasso et.al. | 2411.16934 | null |
2024-11-25 | Open Vocabulary Monocular 3D Object Detection | Jin Yao et.al. | 2411.16833 | link |
2024-11-25 | Imperceptible Adversarial Examples in the Physical World | Weilin Xu et.al. | 2411.16622 | null |
2024-11-25 | STDWeb: Simple Transient Detection pipeline for the Web | Sergey Karpov et.al. | 2411.16470 | null |
2024-11-25 | Machine Learning for the Digital Typhoon Dataset: Extensions to Multiple Basins and New Developments in Representations and Tasks | Asanobu Kitamoto et.al. | 2411.16421 | link |
2024-11-25 | CutS3D: Cutting Semantics in 3D for 2D Unsupervised Instance Segmentation | Leon Sick et.al. | 2411.16319 | null |
2024-11-25 | Diagnosis of diabetic retinopathy using machine learning & deep learning technique | Eric Shah et.al. | 2411.16250 | null |
2024-11-25 | Interpreting Object-level Foundation Models via Visual Precision Search | Ruoyu Chen et.al. | 2411.16198 | null |
2024-11-25 | Learn from Foundation Model: Fruit Detection Model without Manual Annotation | Yanan Wang et.al. | 2411.16196 | null |
2024-11-25 | CIA: Controllable Image Augmentation Framework Based on Stable Diffusion | Mohamed Benkedadra et.al. | 2411.16128 | null |
2024-11-25 | You only thermoelastically deform once: Point Absorber Detection in LIGO Test Masses with YOLO | Simon R. Goode et.al. | 2411.16104 | null |
2024-11-25 | Leverage Task Context for Object Affordance Ranking | Haojie Huang et.al. | 2411.16082 | null |
2024-11-22 | A Real-Time DETR Approach to Bangladesh Road Object Detection for Autonomous Vehicles | Irfan Nafiz Shahan et.al. | 2411.15110 | null |
2024-11-22 | MSSF: A 4D Radar and Camera Fusion Framework With Multi-Stage Sampling for 3D Object Detection in Autonomous Driving | Hongsi Liu et.al. | 2411.15016 | null |
2024-11-22 | VisionPAD: A Vision-Centric Pre-training Paradigm for Autonomous Driving | Haiming Zhang et.al. | 2411.14716 | null |
2024-11-21 | Unveiling the Hidden: A Comprehensive Evaluation of Underwater Image Enhancement and Its Impact on Object Detection | Ali Awad et.al. | 2411.14626 | null |
2024-11-21 | DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding | Tianhe Ren et.al. | 2411.14347 | link |
2024-11-21 | AnywhereDoor: Multi-Target Backdoor Attacks on Object Detection | Jialin Lu et.al. | 2411.14243 | null |
2024-11-21 | Transforming Static Images Using Generative Models for Video Salient Object Detection | Suhwan Cho et.al. | 2411.13975 | link |
2024-11-21 | Multitask Learning for SAR Ship Detection with Gaussian-Mask Joint Segmentation | Ming Zhao et.al. | 2411.13847 | null |
2024-11-20 | MambaDETR: Query-based Temporal Modeling using State Space Model for Multi-View 3D Object Detection | Tong Ning et.al. | 2411.13628 | null |
2024-11-20 | DIS-Mine: Instance Segmentation for Disaster-Awareness in Poor-Light Condition in Underground Mines | Mizanur Rahman Jewel et.al. | 2411.13544 | null |
2024-11-20 | A Resource Efficient Fusion Network for Object Detection in Bird's-Eye View using Camera and Raw Radar Data | Kavin Chandrasekaran et.al. | 2411.13311 | link |
2024-11-20 | VADet: Multi-frame LiDAR 3D Object Detection using Variable Aggregation | Chengjie Huang et.al. | 2411.13186 | null |
2024-11-20 | RAW-Diffusion: RGB-Guided Diffusion Models for High-Fidelity RAW Image Generation | Christoph Reinders et.al. | 2411.13150 | link |
2024-11-20 | YCB-LUMA: YCB Object Dataset with Luminance Keying for Object Localization | Thomas Pöllabauer et.al. | 2411.13149 | link |
2024-11-20 | Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension | Yongdong Luo et.al. | 2411.13093 | link |
2024-11-20 | Bounding-box Watermarking: Defense against Model Extraction Attacks on Object Detectors | Satoru Koda et.al. | 2411.13047 | null |
2024-11-20 | Collaborative Feature-Logits Contrastive Learning for Open-Set Semi-Supervised Object Detection | Xinhao Zhong et.al. | 2411.13001 | null |
2024-11-19 | Maps from Motion (MfM): Generating 2D Semantic Maps from Sparse Multi-view Images | Matteo Toso et.al. | 2411.12620 | null |
2024-11-19 | GaussianPretrain: A Simple Unified 3D Gaussian Representation for Visual Pre-training in Autonomous Driving | Shaoqing Xu et.al. | 2411.12452 | null |
2024-11-19 | Physics-Guided Detector for SAR Airplanes | Zhongling Huang et.al. | 2411.12301 | link |
2024-11-18 | Scaling Deep Learning Research with Kubernetes on the NRP Nautilus HyperCluster | J. Alex Hurt et.al. | 2411.12038 | null |
2024-11-18 | LightFFDNets: Lightweight Convolutional Neural Networks for Rapid Facial Forgery Detection | Günel Jabbarlı et.al. | 2411.11826 | null |
2024-11-18 | WoodYOLO: A Novel Object Detector for Wood Species Detection in Microscopic Images | Lars Nieradzik et.al. | 2411.11738 | null |
2024-11-18 | Exploring Emerging Trends and Research Opportunities in Visual Place Recognition | Antonios Gasteratos et.al. | 2411.11481 | null |
2024-11-18 | SL-YOLO: A Stronger and Lighter Drone Target Detection Model | Defan Chen et.al. | 2411.11477 | null |
2024-11-19 | EVT: Efficient View Transformation for Multi-Modal 3D Object Detection | Yongjin Lee et.al. | 2411.10715 | null |
2024-11-15 | Vision Eagle Attention: A New Lens for Advancing Image Classification | Mahmudul Hasan et.al. | 2411.10564 | link |
2024-11-15 | Interactive Image-Based Aphid Counting in Yellow Water Traps under Stirring Actions | Xumin Gao et.al. | 2411.10357 | null |
2024-11-15 | RETR: Multi-View Radar Detection Transformer for Indoor Perception | Ryoma Yataka et.al. | 2411.10293 | null |
2024-11-15 | Visual-Linguistic Agent: Towards Collaborative Contextual Object Reasoning | Jingru Yang et.al. | 2411.10252 | null |
2024-11-15 | Real-Time AI-Driven People Tracking and Counting Using Overhead Cameras | Ishrath Ahamed et.al. | 2411.10072 | null |
2024-11-15 | Diachronic Document Dataset for Semantic Layout Analysis | Thibault Clérice et.al. | 2411.10068 | null |
2024-11-14 | Adversarial Attacks Using Differentiable Rendering: A Survey | Matthew Hull et.al. | 2411.09749 | null |
2024-11-14 | Local-Global Attention: An Adaptive Mechanism for Multi-Scale Feature Integration | Yifan Shao et.al. | 2411.09604 | link |
2024-11-14 | Long-Tailed Object Detection Pre-training: Dynamic Rebalancing Contrastive Learning with Dual Reconstruction | Chen-Long Duan et.al. | 2411.09453 | null |
2024-11-14 | Instruction-Driven Fusion of Infrared-Visible Images: Tailoring for Diverse Downstream Tasks | Zengyi Yang et.al. | 2411.09387 | null |
2024-11-14 | DT-JRD: Deep Transformer based Just Recognizable Difference Prediction Model for Video Coding for Machines | Junqi Liu et.al. | 2411.09308 | null |
2024-11-14 | Cross-Modal Consistency in Multimodal Large Language Models | Xiang Zhang et.al. | 2411.09273 | null |
2024-11-14 | LEAP:D -- A Novel Prompt-based Approach for Domain-Generalized Aerial Object Detection | Chanyeong Park et.al. | 2411.09180 | null |
2024-11-13 | Multimodal Object Detection using Depth and Image Data for Manufacturing Parts | Nazanin Mahjourian et.al. | 2411.09062 | null |
2024-11-13 | DART-LLM: Dependency-Aware Multi-Robot Task Decomposition and Execution using Large Language Models | Yongdong Wang et.al. | 2411.09022 | null |
2024-11-13 | UIFormer: A Unified Transformer-based Framework for Incremental Few-Shot Object Detection and Instance Segmentation | Chengyuan Zhang et.al. | 2411.08569 | null |
2024-11-13 | Methodology for a Statistical Analysis of Influencing Factors on 3D Object Detection Performance | Anton Kuznietsov et.al. | 2411.08482 | null |
2024-11-13 | V2X-R: Cooperative LiDAR-4D Radar Fusion for 3D Object Detection with Denoising Diffusion | Xun Huang et.al. | 2411.08402 | link |
2024-11-12 | Large-scale Remote Sensing Image Target Recognition and Automatic Annotation | Wuzheng Dong et.al. | 2411.07802 | link |
2024-11-12 | Efficient 3D Perception on Multi-Sweep Point Cloud with Gumbel Spatial Pruning | Jianhao Li et.al. | 2411.07742 | null |
2024-11-12 | Depthwise Separable Convolutions with Deep Residual Convolutions | Md Arid Hasan et.al. | 2411.07544 | null |
2024-11-11 | Transformers for Charged Particle Track Reconstruction in High Energy Physics | Samuel Van Stroud et.al. | 2411.07149 | null |
2024-11-11 | Multi-scale Frequency Enhancement Network for Blind Image Deblurring | Yawen Xiang et.al. | 2411.06893 | null |
2024-11-11 | Fast and Efficient Transformer-based Method for Bird's Eye View Instance Prediction | Miguel Antunes-García et.al. | 2411.06851 | link |
2024-11-11 | AV-PedAware: Self-Supervised Audio-Visual Fusion for Dynamic Pedestrian Awareness | Yizhuo Yang et.al. | 2411.06789 | null |
2024-11-11 | United Domain Cognition Network for Salient Object Detection in Optical Remote Sensing Images | Yanguang Sun et.al. | 2411.06703 | link |
2024-11-11 | Track Any Peppers: Weakly Supervised Sweet Pepper Tracking Using VLMs | Jia Syuen Lim et.al. | 2411.06702 | null |
2024-11-11 | LFSamba: Marry SAM with Mamba for Light Field Salient Object Detection | Zhengyi Liu et.al. | 2411.06652 | null |
2024-11-09 | Robust Detection of LLM-Generated Text: A Comparative Analysis | Yongye Su et.al. | 2411.06248 | null |
2024-11-09 | LSSInst: Improving Geometric Modeling in LSS-Based BEV Perception with Instance Representation | Weijie Ma et.al. | 2411.06173 | link |
2024-11-09 | AI-Compass: A Comprehensive and Effective Multi-module Testing Tool for AI Systems | Zhiyu Zhu et.al. | 2411.06146 | null |
2024-11-08 | Open-set object detection: towards unified problem formulation and benchmarking | Hejer Ammar et.al. | 2411.05564 | null |
2024-11-08 | ZOPP: A Framework of Zero-shot Offboard Panoptic Perception for Autonomous Driving | Tao Ma et.al. | 2411.05311 | null |
2024-11-08 | SimpleBEV: Improved LiDAR-Camera Fusion Architecture for 3D Object Detection | Yun Zhao et.al. | 2411.05292 | null |
2024-11-07 | On the Inherent Robustness of One-Stage Object Detection against Out-of-Distribution Data | Aitor Martinez-Seras et.al. | 2411.04586 | null |
2024-11-07 | l0-Regularized Sparse Coding-based Interpretable Network for Multi-Modal Image Fusion | Gargi Panda et.al. | 2411.04519 | null |
2024-11-07 | Pose2Trajectory: Using Transformers on Body Pose to Predict Tennis Player's Trajectory | Ali K. AlShami et.al. | 2411.04501 | null |
2024-11-07 | SuperQ-GRASP: Superquadrics-based Grasp Pose Estimation on Larger Objects for Mobile-Manipulation | Xun Tu et.al. | 2411.04386 | null |
2024-11-07 | UEVAVD: A Dataset for Developing UAV's Eye View Active Object Detection | Xinhua Jiang et.al. | 2411.04348 | null |
2024-11-07 | GazeGen: Gaze-Driven User Interaction for Visual Content Generation | He-Yen Hsieh et.al. | 2411.04335 | null |
2024-11-06 | An Enhancement of Haar Cascade Algorithm Applied to Face Recognition for Gate Pass Security | Clarence A. Antipona et.al. | 2411.03831 | null |
2024-11-06 | Understanding the Effects of Human-written Paraphrases in LLM-generated Text Detection | Hiu Ting Lau et.al. | 2411.03806 | link |
2024-11-06 | Efficient Fourier Filtering Network with Contrastive Learning for UAV-based Unaligned Bi-modal Salient Object Detection | Pengfei Lyu et.al. | 2411.03728 | link |
2024-11-06 | Estimation of Psychosocial Work Environment Exposures Through Video Object Detection. Proof of Concept Using CCTV Footage | Claus D. Hansen et.al. | 2411.03724 | null |
2024-11-06 | Hybrid Attention for Robust RGB-T Pedestrian Detection in Real-World Conditions | Arunkumar Rathinam et.al. | 2411.03576 | null |
2024-11-05 | An Application-Agnostic Automatic Target Recognition System Using Vision Language Models | Anthony Palladino et.al. | 2411.03491 | null |
2024-11-05 | Self-supervised cross-modality learning for uncertainty-aware object detection and recognition in applications which lack pre-labelled training data | Irum Mehboob et.al. | 2411.03082 | null |
2024-11-05 | CRT-Fusion: Camera, Radar, Temporal Fusion Using Motion Information for 3D Object Detection | Jisong Kim et.al. | 2411.03013 | null |
2024-11-05 | Centerness-based Instance-aware Knowledge Distillation with Task-wise Mutual Lifting for Object Detection on Drone Imagery | Bowei Du et.al. | 2411.02861 | null |
2024-11-05 | Correlation of Object Detection Performance with Visual Saliency and Depth Estimation | Matthias Bartolo et.al. | 2411.02844 | link |
2024-11-05 | ERUP-YOLO: Enhancing Object Detection Robustness for Adverse Weather Condition by Unified Image-Adaptive Processing | Yuka Ogino et.al. | 2411.02799 | null |
2024-11-05 | Real-Time Text Detection with Similar Mask in Traffic, Industrial, and Natural Scenes | Xu Han et.al. | 2411.02794 | link |
2024-11-05 | Efficient Feature Aggregation and Scale-Aware Regression for Monocular 3D Object Detection | Yifan Wang et.al. | 2411.02747 | null |
2024-11-05 | Analysis of Multi-epoch JWST Images of | Zijian Zhang et.al. | 2411.02729 | null |
2024-11-04 | Intelligent Video Recording Optimization using Activity Detection for Surveillance Systems | Youssef Elmir et.al. | 2411.02632 | null |
2024-11-04 | SIRA: Scalable Inter-frame Relation and Association for Radar Perception | Ryoma Yataka et.al. | 2411.02220 | null |
2024-11-04 | Advanced computer vision for extracting georeferenced vehicle trajectories from drone imagery | Robert Fonod et.al. | 2411.02136 | null |
2024-11-04 | Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation | Yan Li et.al. | 2411.02057 | link |
2024-11-04 | V-CAS: A Realtime Vehicle Anti Collision System Using Vision Transformer on Multi-Camera Streams | Muhammad Waqas Ashraf et.al. | 2411.01963 | null |
2024-11-04 | Exploiting Contextual Uncertainty of Visual Data for Efficient Training of Deep Models | Sharat Agarwal et.al. | 2411.01925 | null |
2024-11-04 | LiDAttack: Robust Black-box Attack on LiDAR-based Object Detection | Jinyin Chen et.al. | 2411.01889 | link |
2024-11-03 | ROAD-Waymo: Action Awareness at Scale for Autonomous Driving | Salman Khan et.al. | 2411.01683 | null |
2024-11-03 | OSAD: Open-Set Aircraft Detection in SAR Images | Xiayang Xiao et.al. | 2411.01597 | null |
2024-11-03 | One for All: Multi-Domain Joint Training for Point Cloud Based 3D Object Detection | Zhenyu Wang et.al. | 2411.01584 | null |
2024-11-03 | A Visual Question Answering Method for SAR Ship: Breaking the Requirement for Multimodal Dataset Construction and Model Fine-Tuning | Fei Wang et.al. | 2411.01445 | null |
2024-10-31 | ImOV3D: Learning Open-Vocabulary Point Clouds 3D Object Detection from Only 2D Images | Timing Yang et.al. | 2410.24001 | link |
2024-10-31 | Localization, balance and affinity: a stronger multifaceted collaborative salient object detector in remote sensing images | Yakun Xie et.al. | 2410.23991 | null |
2024-10-31 | Uncertainty Estimation for 3D Object Detection via Evidential Learning | Nikita Durasov et.al. | 2410.23910 | null |
2024-10-31 | From Web Data to Real Fields: Low-Cost Unsupervised Domain Adaptation for Agricultural Robots | Vasileios Tzouras et.al. | 2410.23906 | null |
2024-10-31 | Open-Set 3D object detection in LiDAR data as an Out-of-Distribution problem | Louis Soum-Fontez et.al. | 2410.23767 | null |
2024-10-31 | DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios | Junchao Wu et.al. | 2410.23746 | link |
2024-10-31 | GigaCheck: Detecting LLM-generated Content | Irina Tolstykh et.al. | 2410.23728 | null |
2024-10-31 | Context-Aware Token Selection and Packing for Enhanced Vision Transformer | Tianyi Zhang et.al. | 2410.23608 | null |
2024-10-30 | EMMA: End-to-End Multimodal Model for Autonomous Driving | Jyh-Jing Hwang et.al. | 2410.23262 | null |
2024-10-30 | S3PT: Scene Semantics and Structure Guided Clustering to Boost Self-Supervised Pre-Training for Autonomous Driving | Maciej K. Wozniak et.al. | 2410.23085 | null |
2024-10-30 | First Place Solution to the ECCV 2024 ROAD++ Challenge @ ROAD++ Spatiotemporal Agent Detection 2024 | Tengfei Zhang et.al. | 2410.23077 | null |
2024-10-30 | AdaptiveISP: Learning an Adaptive Image Signal Processor for Object Detection | Yujin Wang et.al. | 2410.22939 | null |
2024-10-30 | YOLOv11 for Vehicle Detection: Advancements, Performance, and Applications in Intelligent Transportation Systems | Mujadded Al Rabbani Alif et.al. | 2410.22898 | null |
2024-10-29 | Unified Domain Generalization and Adaptation for Multi-View 3D Object Detection | Gyusam Chang et.al. | 2410.22461 | null |
2024-10-29 | Lighten CARAFE: Dynamic Lightweight Upsampling with Guided Reassemble Kernels | Ruigang Fu et.al. | 2410.22139 | link |
2024-10-29 | Data Generation for Hardware-Friendly Post-Training Quantization | Lior Dikstein et.al. | 2410.22110 | null |
2024-10-29 | Cognitive Semantic Augmentation LEO Satellite Networks for Earth Observation | Hong-fu Chou et.al. | 2410.21916 | null |
2024-10-29 | PK-YOLO: Pretrained Knowledge Guided YOLO for Brain Tumor Detection in Multiplanar MRI Slices | Ming Kang et.al. | 2410.21822 | link |
2024-10-28 | MVSDet: Multi-View Indoor 3D Object Detection via Efficient Plane Sweeps | Yating Xu et.al. | 2410.21566 | link |
2024-10-28 | TACO: Adversarial Camouflage Optimization on Trucks to Fool Object Detectors | Adonisz Dimitriu et.al. | 2410.21443 | null |
2024-10-28 | Joint Audio-Visual Idling Vehicle Detection with Streamlined Input Dependencies | Xiwen Li et.al. | 2410.21170 | null |
2024-10-28 | Synthetica: Large Scale Synthetic Data for Robot Perception | Ritvik Singh et.al. | 2410.21153 | null |
2024-10-28 | DeTeCtive: Detecting AI-generated Text via Multi-Level Contrastive Learning | Xun Guo et.al. | 2410.20964 | null |
2024-10-28 | IndraEye: Infrared Electro-Optical UAV-based Perception Dataset for Robust Downstream Tasks | Manjunath D et.al. | 2410.20953 | null |
2024-10-28 | SparseTem: Boosting the Efficiency of CNN-Based Video Encoders by Exploiting Temporal Continuity | Kunyun Wang et.al. | 2410.20790 | null |
2024-10-27 | Sebica: Lightweight Spatial and Efficient Bidirectional Channel Attention Super Resolution Network | Chongxiao Liu et.al. | 2410.20546 | null |
2024-10-27 | Guidance Disentanglement Network for Optics-Guided Thermal UAV Image Super-Resolution | Zhicheng Zhao et.al. | 2410.20466 | link |
2024-10-27 | Open-Vocabulary Object Detection via Language Hierarchy | Jiaxing Huang et.al. | 2410.20371 | null |
2024-10-27 | Historical Test-time Prompt Tuning for Vision Foundation Models | Jingyi Zhang et.al. | 2410.20346 | null |
2024-10-25 | OReole-FM: successes and challenges toward billion-parameter foundation models for high-resolution satellite imagery | Philipe Dias et.al. | 2410.19965 | null |
2024-10-25 | MetaTrading: An Immersion-Aware Model Trading Framework for Vehicular Metaverse Services | Hongjia Wu et.al. | 2410.19665 | null |
2024-10-25 | Frozen-DETR: Enhancing DETR with Image Understanding from Frozen Foundation Models | Shenghao Fu et.al. | 2410.19635 | null |
2024-10-25 | MonoDGP: Monocular 3D Object Detection with Decoupled-Query and Geometry-Error Priors | Fanqi Pu et.al. | 2410.19590 | null |
2024-10-25 | DECADE: Towards Designing Efficient-yet-Accurate Distance Estimation Modules for Collision Avoidance in Mobile Advanced Driver Assistance Systems | Muhammad Zaeem Shahzad et.al. | 2410.19336 | null |
2024-10-25 | In-Simulation Testing of Deep Learning Vision Models in Autonomous Robotic Manipulators | Dmytro Humeniuk et.al. | 2410.19277 | null |
2024-10-24 | HUE Dataset: High-Resolution Event and Frame Sequences for Low-Light Vision | Burak Ercan et.al. | 2410.19164 | null |
2024-10-24 | Optimizing Edge Offloading Decisions for Object Detection | Jiaming Qiu et.al. | 2410.18919 | link |
2024-10-24 | You Only Look Around: Learning Illumination Invariant Feature for Low-light Object Detection | Mingbo Hong et.al. | 2410.18398 | null |
2024-10-24 | Thermal Chameleon: Task-Adaptive Tone-mapping for Radiometric Thermal-Infrared images | Dong-Guw Lee et.al. | 2410.18340 | link |
2024-10-23 | KhmerST: A Low-Resource Khmer Scene Text Detection and Recognition Benchmark | Vannkinh Nom et.al. | 2410.18277 | null |
2024-10-23 | Automated Defect Detection and Grading of Piarom Dates Using Deep Learning | Nasrin Azimi et.al. | 2410.18208 | null |
2024-10-23 | DREB-Net: Dual-stream Restoration Embedding Blur-feature Fusion Network for High-mobility UAV Object Detection | Qingpeng Li et.al. | 2410.17822 | link |
2024-10-23 | YOLO-Vehicle-Pro: A Cloud-Edge Collaborative Framework for Object Detection in Autonomous Driving under Adverse Weather Conditions | Xiguang Li et.al. | 2410.17734 | null |
2024-10-23 | YOLOv11: An Overview of the Key Architectural Enhancements | Rahima Khanam et.al. | 2410.17725 | null |
2024-10-23 | PlantCamo: Plant Camouflage Detection | Jinyu Yang et.al. | 2410.17598 | link |
2024-10-23 | OVT-B: A New Large-Scale Benchmark for Open-Vocabulary Multi-Object Tracking | Haiji Liang et.al. | 2410.17534 | link |
2024-10-22 | EPContrast: Effective Point-level Contrastive Learning for Large-scale Point Cloud Understanding | Zhiyi Pan et.al. | 2410.17207 | null |
2024-10-22 | YOLO-TS: Real-Time Traffic Sign Detection with Enhanced Accuracy Using Optimized Receptive Fields and Anchor-Free Fusion | Junzhou Chen et.al. | 2410.17144 | null |
2024-10-22 | FlightAR: AR Flight Assistance Interface with Multiple Video Streams and Object Detection Aimed at Immersive Drone Control | Oleg Sautenkov et.al. | 2410.16943 | null |
2024-10-22 | AttriPrompter: Auto-Prompting with Attribute Semantics for Zero-shot Nuclei Detection via Visual-Language Pre-trained Models | Yongjian Wu et.al. | 2410.16820 | link |
2024-10-22 | DSORT-MCU: Detecting Small Objects in Real-Time on Microcontroller Units | Liam Boyle et.al. | 2410.16769 | null |
2024-10-22 | DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model | Zhixiong Nan et.al. | 2410.16707 | null |
2024-10-22 | Fire and Smoke Detection with Burning Intensity Representation | Xiaoyi Han et.al. | 2410.16642 | link |
2024-10-21 | Griffon-G: Bridging Vision-Language and Vision-Centric Tasks via Large Multimodal Models | Yufei Zhan et.al. | 2410.16163 | link |
2024-10-21 | Multi-Sensor Fusion for UAV Classification Based on Feature Maps of Image and Radar Data | Nikos Sakellariou et.al. | 2410.16089 | null |
2024-10-21 | Few-shot target-driven instance detection based on open-vocabulary object detection models | Ben Crulis et.al. | 2410.16028 | null |
2024-10-21 | How Important are Data Augmentations to Close the Domain Gap for Object Detection in Orbit? | Maximilian Ulmer et.al. | 2410.15766 | null |
2024-10-21 | P-YOLOv8: Efficient and Accurate Real-Time Detection of Distracted Driving | Mohamed R. Elshamy et.al. | 2410.15602 | null |
2024-10-21 | Deep Learning and Machine Learning -- Object Detection and Semantic Segmentation: From Theory to Applications | Jintao Ren et.al. | 2410.15584 | null |
2024-10-21 | Online Pseudo-Label Unified Object Detection for Multiple Datasets Training | XiaoJun Tang et.al. | 2410.15569 | null |
2024-10-20 | TrackMe: A Simple and Effective Multiple Object Tracking Annotation Tool | Thinh Phan et.al. | 2410.15518 | null |
2024-10-20 | YOLO-RD: Introducing Relevant and Compact Explicit Knowledge to YOLO by Retriever-Dictionary | Hao-Tang Tsui et.al. | 2410.15346 | null |
2024-10-20 | Open-vocabulary vs. Closed-set: Best Practice for Few-shot Object Detection Considering Text Describability | Yusuke Hosoya et.al. | 2410.15315 | null |
2024-10-18 | MultiOrg: A Multi-rater Organoid-detection Dataset | Christina Bukas et.al. | 2410.14612 | null |
2024-10-18 | Beyond Binary: Towards Fine-Grained LLM-Generated Text Detection via Role Recognition and Involvement Measurement | Zihao Cheng et.al. | 2410.14259 | null |
2024-10-18 | Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech | Shuwei He et.al. | 2410.14101 | link |
2024-10-18 | Enhancing In-vehicle Multiple Object Tracking Systems with Embeddable Ising Machines | Kosuke Tatsumura et.al. | 2410.14093 | null |
2024-10-17 | FaceSaliencyAug: Mitigating Geographic, Gender and Stereotypical Biases via Saliency-Based Data Augmentation | Teerath Kumar et.al. | 2410.14070 | null |
2024-10-17 | Spatiotemporal Object Detection for Improved Aerial Vehicle Detection in Traffic Monitoring | Kristina Telegraph et.al. | 2410.13616 | null |
2024-10-17 | RemoteDet-Mamba: A Hybrid Mamba-CNN Network for Multi-modal Object Detection in Remote Sensing Images | Kejun Ren et.al. | 2410.13532 | null |
2024-10-16 | Syn2Real Domain Generalization for Underwater Mine-like Object Detection Using Side-Scan Sonar | Aayush Agrawal et.al. | 2410.12953 | null |
2024-10-16 | MambaBEV: An efficient 3D detection model with Mamba2 | Zihan You et.al. | 2410.12673 | null |
2024-10-16 | On the Risk of Evidence Pollution for Malicious Social Text Detection in the Era of LLMs | Herun Wan et.al. | 2410.12600 | null |
2024-10-16 | Cocoon: Robust Multi-Modal Perception with Uncertainty-Aware Sensor Fusion | Minkyoung Cho et.al. | 2410.12592 | null |
2024-10-16 | Feature Augmentation for Self-supervised Contrastive Learning: A Closer Look | Yong Zhang et.al. | 2410.12396 | null |
2024-10-16 | Real-time Stereo-based 3D Object Detection for Streaming Perception | Changcai Li et.al. | 2410.12394 | link |
2024-10-16 | Context-Infused Visual Grounding for Art | Selina Khan et.al. | 2410.12369 | link |
2024-10-16 | Fusion from Decomposition: A Self-Supervised Approach for Image Fusion and Beyond | Pengwei Liang et.al. | 2410.12274 | null |
2024-10-16 | Optimizing YOLOv5s Object Detection through Knowledge Distillation algorithm | Guanming Huang et.al. | 2410.12259 | null |
2024-10-16 | SAM-Guided Masked Token Prediction for 3D Scene Understanding | Zhimin Chen et.al. | 2410.12158 | null |
2024-10-16 | Unveiling the Limits of Alignment: Multi-modal Dynamic Local Fusion Network and A Benchmark for Unaligned RGBT Video Object Detection | Qishun Wang et.al. | 2410.12143 | null |
2024-10-15 | Fractal Calibration for long-tailed object detection | Konstantinos Panagiotis Alexandridis et.al. | 2410.11774 | null |
2024-10-15 | POLO -- Point-based, multi-class animal detection | Giacomo May et.al. | 2410.11741 | null |
2024-10-15 | YOLO-ELA: Efficient Local Attention Modeling for High-Performance Real-Time Insulator Defect Detection | Olalekan Akindele et.al. | 2410.11727 | null |
2024-10-15 | SeaDATE: Remedy Dual-Attention Transformer with Semantic Alignment via Contrast Learning for Multimodal Object Detection | Shuhan Dong et.al. | 2410.11358 | null |
2024-10-15 | Open World Object Detection: A Survey | Yiming Li et.al. | 2410.11301 | null |
2024-10-15 | Representation Similarity: A Better Guidance of DNN Layer Sharing for Edge Computing without Training | Bryan Bo Cao et.al. | 2410.11233 | null |
2024-10-15 | TEOcc: Radar-camera Multi-modal Occupancy Prediction via Temporal Enhancement | Zhiwei Lin et.al. | 2410.11228 | null |
2024-10-15 | CVCP-Fusion: On Implicit Depth Estimation for 3D Bounding Box Prediction | Pranav Gupta et.al. | 2410.11211 | link |
2024-10-15 | Multiview Scene Graph | Juexiao Zhang et.al. | 2410.11187 | null |
2024-10-14 | UAV3D: A Large-scale 3D Perception Benchmark for Unmanned Aerial Vehicles | Hui Ye et.al. | 2410.11125 | null |
2024-10-14 | ROSAR: An Adversarial Re-Training Framework for Robust Side-Scan Sonar Object Detection | Martin Aubard et.al. | 2410.10554 | link |
2024-10-14 | Learning to Ground VLMs without Forgetting | Aritra Bhowmik et.al. | 2410.10491 | null |
2024-10-14 | SMART-TRACK: A Novel Kalman Filter-Guided Sensor Fusion For Robust UAV Object Tracking in Dynamic Environments | Khaled Gabr et.al. | 2410.10409 | null |
2024-10-14 | V2M: Visual 2-Dimensional Mamba for Image Representation Learning | Chengkun Wang et.al. | 2410.10382 | link |
2024-10-14 | GlobalMamba: Global Image Serialization for Vision Mamba | Chengkun Wang et.al. | 2410.10316 | link |
2024-10-14 | ROA-BEV: 2D Region-Oriented Attention for BEV-based 3D Object | Jiwei Chen et.al. | 2410.10298 | null |
2024-10-14 | Out-of-Bounding-Box Triggers: A Stealthy Approach to Cheat Object Detectors | Tao Lin et.al. | 2410.10091 | link |
2024-10-15 | Optimizing Waste Management with Advanced Object Detection for Garbage Classification | Everest Z. Kuang et.al. | 2410.09975 | null |
2024-10-13 | EITNet: An IoT-Enhanced Framework for Real-Time Basketball Action Recognition | Jingyu Liu et.al. | 2410.09954 | null |
2024-10-13 | LoLI-Street: Benchmarking Low-Light Image Enhancement and Beyond | Md Tanvir Islam et.al. | 2410.09831 | link |
2024-10-11 | DA-Ada: Learning Domain-Aware Adapter for Domain Adaptive Object Detection | Haochen Li et.al. | 2410.09004 | null |
2024-10-11 | LIME-Eval: Rethinking Low-light Image Enhancement Evaluation via Object Detection | Mingjia Li et.al. | 2410.08810 | null |
2024-10-11 | Hespi: A pipeline for automatically detecting information from herbarium specimen sheets | Robert Turnbull et.al. | 2410.08740 | null |
2024-10-11 | MMLF: Multi-modal Multi-class Late Fusion for Object Detection with Uncertainty Estimation | Qihang Yang et.al. | 2410.08739 | null |
2024-10-11 | Boosting Open-Vocabulary Object Detection by Handling Background Samples | Ruizhe Zeng et.al. | 2410.08645 | null |
2024-10-11 | DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention | Nguyen Huu Bao Long et.al. | 2410.08582 | link |
2024-10-11 | VOVTrack: Exploring the Potentiality in Videos for Open-Vocabulary Object Tracking | Zekun Qian et.al. | 2410.08529 | null |
2024-10-10 | Are We Ready for Real-Time LiDAR Semantic Segmentation in Autonomous Driving? | Samir Abou Haidar et.al. | 2410.08365 | null |
2024-10-10 | PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection | Botao Ren et.al. | 2410.08210 | null |
2024-10-10 | Robust AI-Generated Text Detection by Restricted Embeddings | Kristian Kuznetsov et.al. | 2410.08113 | null |
2024-10-10 | Dynamic Object Catching with Quadruped Robot Front Legs | André Schakkal et.al. | 2410.08065 | null |
2024-10-10 | HeightFormer: A Semantic Alignment Monocular 3D Object Detection Method from Roadside Perspective | Pei Liu et.al. | 2410.07758 | null |
2024-10-10 | O1O: Grouping of Known Classes to Identify Unknown Objects as Odd-One-Out | Mısra Yavuz et.al. | 2410.07514 | null |
2024-10-09 | Progressive Multi-Modal Fusion for Robust 3D Object Detection | Rohit Mohan et.al. | 2410.07475 | null |
2024-10-09 | Self-Supervised Learning for Real-World Object Detection: a Survey | Alina Ciocarlan et.al. | 2410.07442 | null |
2024-10-09 | Robust infrared small target detection using self-supervised and a contrario paradigms | Alina Ciocarlan et.al. | 2410.07437 | null |
2024-10-09 | SurANet: Surrounding-Aware Network for Concealed Object Detection via Highly-Efficient Interactive Contrastive Learning Strategy | Yuhan Kang et.al. | 2410.06842 | link |
2024-10-09 | Rethinking the Evaluation of Visible and Infrared Image Fusion | Dayan Guan et.al. | 2410.06811 | link |
2024-10-09 | QuadMamba: Learning Quadtree-based Selective Scan for Visual State Space Model | Fei Xie et.al. | 2410.06806 | null |
2024-10-09 | QuadBEV: An Efficient Quadruple-Task Perception Framework via Bird's-Eye-View Representation | Yuxin Li et.al. | 2410.06516 | null |
2024-10-08 | Adver-City: Open-Source Multi-Modal Dataset for Collaborative Perception Under Adverse Weather Conditions | Mateus Karvat et.al. | 2410.06380 | null |
2024-10-08 | Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach | Sha Guo et.al. | 2410.06149 | null |
2024-10-08 | Training-free LLM-generated Text Detection by Mining Token Probability Sequences | Yihuai Xu et.al. | 2410.06072 | null |
2024-10-08 | Training-Free Open-Ended Object Detection and Segmentation via Attention as Prompts | Zhiwei Lin et.al. | 2410.05963 | null |
2024-10-08 | Learning Gaussian Data Augmentation in Feature Space for One-shot Object Detection in Manga | Takara Taniguchi et.al. | 2410.05935 | null |
2024-10-08 | Unobserved Object Detection using Generative Models | Subhransu S. Bhattacharjee et.al. | 2410.05869 | null |
2024-10-07 | Real-Time Truly-Coupled Lidar-Inertial Motion Correction and Spatiotemporal Dynamic Object Detection | Cedric Le Gentil et.al. | 2410.05152 | null |
2024-10-07 | Human-in-the-loop Reasoning For Traffic Sign Detection: Collaborative Approach Yolo With Video-llava | Mehdi Azarafza et.al. | 2410.05096 | null |
2024-10-07 | Improving Object Detection via Local-global Contrastive Learning | Danai Triantafyllidou et.al. | 2410.05058 | null |
2024-10-07 | Windshield Integration of Thermal and Color Fusion for Automatic Emergency Braking in Low Visibility Conditions | Gabriel Jobert et.al. | 2410.04928 | null |
2024-10-07 | Improved detection of discarded fish species through BoxAL active learning | Maria Sokolova et.al. | 2410.04880 | link |
2024-10-06 | Learning De-Biased Representations for Remote-Sensing Imagery | Zichen Tian et.al. | 2410.04546 | link |
2024-10-05 | AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text | Ximing Lu et.al. | 2410.04265 | null |
2024-10-05 | ETHcavation: A Dataset and Pipeline for Panoptic Scene Understanding and Object Tracking in Dynamic Construction Environments | Lorenzo Terenzi et.al. | 2410.04250 | null |
2024-10-05 | Fast Object Detection with a Machine Learning Edge Device | Richard C. Rodriguez et.al. | 2410.04173 | null |
2024-10-05 | Robust Task-Oriented Communication Framework for Real-Time Collaborative Vision Perception | Zhengru Fang et.al. | 2410.04168 | null |
2024-10-04 | DRAFTS: A Deep Learning-Based Radio Fast Transient Search Pipeline | Yong-Kun Zhang et.al. | 2410.03200 | null |
2024-10-03 | Is Your Paper Being Reviewed by an LLM? Investigating AI Text Detectability in Peer Review | Sungduk Yu et.al. | 2410.03019 | null |
2024-10-04 | Learning 3D Perception from Others' Predictions | Jinsu Yoo et.al. | 2410.02646 | null |
2024-10-02 | Enhancing Screen Time Identification in Children with a Multi-View Vision Language Model and Screen Time Tracker | Xinlong Hou et.al. | 2410.01966 | null |
2024-10-02 | 3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection | Yang Cao et.al. | 2410.01647 | link |
2024-10-02 | Gaussian-Det: Learning Closed-Surface Gaussians for 3D Object Detection | Hongru Yan et.al. | 2410.01404 | null |
2024-10-02 | Finetuning Pre-trained Model with Limited Data for LiDAR-based 3D Object Detection by Bridging Domain Gaps | Jiyun Jang et.al. | 2410.01319 | null |
2024-10-02 | Panopticus: Omnidirectional 3D Object Detection on Resource-constrained Edge Devices | Jeho Lee et.al. | 2410.01270 | null |
2024-10-02 | High and Low Resolution Tradeoffs in Roadside Multimodal Sensing | Shaozu Ding et.al. | 2410.01250 | null |
2024-10-02 | Perceptual Piercing: Human Visual Cue-based Object Detection in Low Visibility Conditions | Ashutosh Kumar et.al. | 2410.01225 | link |
2024-10-02 | A versatile machine learning workflow for high-throughput analysis of supported metal catalyst particles | Arda Genc et.al. | 2410.01213 | link |
2024-10-01 | Synthetic imagery for fuzzy object detection: A comparative study | Siavash H. Khajavi et.al. | 2410.01124 | null |
2024-10-01 | Generating Seamless Virtual Immunohistochemical Whole Slide Images with Content and Color Consistency | Sitong Liu et.al. | 2410.01072 | null |
2024-10-01 | ARPOV: Expanding Visualization of Object Detection in AR with Panoramic Mosaic Stitching | Erin McGowan et.al. | 2410.01055 | null |
2024-09-30 | Accelerating Non-Maximum Suppression: A Graph Theory Perspective | King-Siong Si et.al. | 2409.20520 | link |
2024-09-30 | NUTRIVISION: A System for Automatic Diet Management in Smart Healthcare | Madhumita Veeramreddy et.al. | 2409.20508 | null |
2024-09-30 | Navigating Threats: A Survey of Physical Adversarial Attacks on LiDAR Perception Systems in Autonomous Vehicles | Amira Guesmi et.al. | 2409.20426 | null |
2024-09-30 | Training a Computer Vision Model for Commercial Bakeries with Primarily Synthetic Images | Thomas H. Schmitt et.al. | 2409.20122 | null |
2024-09-30 | GearTrack: Automating 6D Pose Estimation | Yu Deng et.al. | 2409.19986 | null |
2024-09-30 | TSdetector: Temporal-Spatial Self-correction Collaborative Learning for Colonoscopy Video Detection | Kaini Wang et.al. | 2409.19983 | null |
2024-09-30 | DAOcc: 3D Object Detection Assisted Multi-Sensor Fusion for 3D Occupancy Prediction | Zhen Yang et.al. | 2409.19972 | link |
2024-09-30 | HazyDet: Open-source Benchmark for Drone-view Object Detection with Depth-cues in Hazy Scenes | Changfeng Feng et.al. | 2409.19833 | link |
2024-09-29 | Applying the Lower-Biased Teacher Model in Semi-Supervised Object Detection | Shuang Wang et.al. | 2409.19703 | null |
2024-09-29 | OrientedFormer: An End-to-End Transformer-Based Oriented Object Detector in Remote Sensing Images | Jiaqi Zhao et.al. | 2409.19648 | link |
2024-09-27 | Spectral Wavelet Dropout: Regularization in the Wavelet Domain | Rinor Cakaj et.al. | 2409.18951 | null |
2024-09-27 | MCUBench: A Benchmark of Tiny Object Detectors on MCUs | Sudhakar Sah et.al. | 2409.18866 | link |
2024-09-27 | A Novel Unified Architecture for Low-Shot Counting by Detection and Segmentation | Jer Pelhan et.al. | 2409.18686 | null |
2024-09-27 | Query matching for spatio-temporal action detection with query-based object detector | Shimon Hori et.al. | 2409.18408 | null |
2024-09-26 | Efficient Microscopic Image Instance Segmentation for Food Crystal Quality Control | Xiaoyu Ji et.al. | 2409.18291 | null |
2024-09-26 | Advancing Object Detection in Transportation with Multimodal Large Language Models (MLLMs): A Comprehensive Review and Empirical Testing | Huthaifa I. Ashqar et.al. | 2409.18286 | null |
2024-09-26 | GSON: A Group-based Social Navigation Framework with Large Multimodal Model | Shangyi Luo et.al. | 2409.18084 | null |
2024-09-27 | A New Dataset for Monocular Depth Estimation Under Viewpoint Shifts | Aurel Pjetri et.al. | 2409.17851 | null |
2024-09-26 | Scene Understanding in Pick-and-Place Tasks: Analyzing Transformations Between Initial and Final Scenes | Seraj Ghasemi et.al. | 2409.17720 | null |
2024-09-26 | SLO-Aware Task Offloading within Collaborative Vehicle Platoons | Boris Sedlak et.al. | 2409.17667 | null |
2024-09-26 | CAMOT: Camera Angle-aware Multi-Object Tracking | Felix Limanta et.al. | 2409.17533 | null |
2024-09-25 | Transient Adversarial 3D Projection Attacks on Object Detection in Autonomous Driving | Ce Zhou et.al. | 2409.17403 | null |
2024-09-25 | AgRegNet: A Deep Regression Network for Flower and Fruit Density Estimation, Localization, and Counting in Orchards | Uddhav Bhattarai et.al. | 2409.17400 | null |
2024-09-25 | Energy-Efficient & Real-Time Computer Vision with Intelligent Skipping via Reconfigurable CMOS Image Sensors | Md Abdullah-Al Kaiser et.al. | 2409.17341 | null |
2024-09-25 | BitQ: Tailoring Block Floating Point Precision for Improved DNN Efficiency on Resource-Constrained Devices | Yongqi Xu et.al. | 2409.17093 | link |
2024-09-25 | EventHDR: from Event to High-Speed HDR Videos and Beyond | Yunhao Zou et.al. | 2409.17029 | null |
2024-09-25 | Focus Entirety and Perceive Environment for Arbitrary-Shaped Text Detection | Xu Han et.al. | 2409.16827 | null |
2024-09-25 | XAI-guided Insulator Anomaly Detection for Imbalanced Datasets | Maximilian Andreas Hoefler et.al. | 2409.16821 | null |
2024-09-25 | Spotlight Text Detector: Spotlight on Candidate Regions Like a Camera | Xu Han et.al. | 2409.16820 | null |
2024-09-25 | Benchmarking Deep Learning Models for Object Detection on Edge Computing Devices | Daghash K. Alqahtani et.al. | 2409.16808 | null |
2024-09-25 | Pix2Next: Leveraging Vision Foundation Models for RGB to NIR Image Translation | Youngwan Jin et.al. | 2409.16706 | null |
2024-09-25 | TSBP: Improving Object Detection in Histology Images via Test-time Self-guided Bounding-box Propagation | Tingting Yang et.al. | 2409.16678 | link |
2024-09-25 | Source-Free Domain Adaptation for YOLO Object Detection | Simon Varailhon et.al. | 2409.16538 | null |
2024-09-24 | Real-Time Detection of Electronic Components in Waste Printed Circuit Boards: A Transformer-Based Approach | Muhammad Mohsin et.al. | 2409.16496 | null |
2024-09-24 | Tiny Robotics Dataset and Benchmark for Continual Object Detection | Francesco Pasti et.al. | 2409.16215 | link |
2024-09-24 | Seeing Faces in Things: A Model and Dataset for Pareidolia | Mark Hamilton et.al. | 2409.16143 | null |
2024-09-24 | HA-FGOVD: Highlighting Fine-grained Attributes via Explicit Linear Composition for Open-Vocabulary Object Detection | Yuqi Ma et.al. | 2409.16136 | null |
2024-09-24 | Neuromorphic Drone Detection: an Event-RGB Multimodal Approach | Gabriele Magrini et.al. | 2409.16099 | null |
2024-09-24 | Open-World Object Detection with Instance Representation Learning | Sunoh Lee et.al. | 2409.16073 | null |
2024-09-24 | Towards Robust Object Detection: Identifying and Removing Backdoors via Module Inconsistency Analysis | Xianda Zhang et.al. | 2409.16057 | null |
2024-09-24 | Zero-Shot Detection of AI-Generated Images | Davide Cozzolino et.al. | 2409.15875 | null |
2024-09-24 | Automated Assessment of Multimodal Answer Sheets in the STEM domain | Rajlaxmi Patil et.al. | 2409.15749 | null |
2024-09-24 | Real-Time Pedestrian Detection on IoT Edge Devices: A Lightweight Deep Learning Approach | Muhammad Dany Alfikri et.al. | 2409.15740 | null |
2024-09-24 | PDT: Uav Target Detection Dataset for Pests and Diseases Tree | Mingle Zhou et.al. | 2409.15679 | link |
2024-09-18 | Applications of Knowledge Distillation in Remote Sensing: A Survey | Yassine Himeur et.al. | 2409.12111 | null |
2024-09-18 | Agglomerative Token Clustering | Joakim Bruslund Haurum et.al. | 2409.11923 | null |
2024-09-18 | RockTrack: A 3D Robust Multi-Camera-Ken Multi-Object Tracking Framework | Xiaoyu Li et.al. | 2409.11749 | null |
2024-09-17 | Open-Set Semantic Uncertainty Aware Metric-Semantic Graph Matching | Kurran Singh et.al. | 2409.11555 | null |
2024-09-17 | VALO: A Versatile Anytime Framework for LiDAR-based Object Detection Deep Neural Networks | Ahmet Soyyigit et.al. | 2409.11542 | link |
2024-09-17 | STCMOT: Spatio-Temporal Cohesion Learning for UAV-Based Multiple Object Tracking | Jianbo Ma et.al. | 2409.11234 | link |
2024-09-19 | Vision foundation models: can they be applied to astrophysics data? | E. Lastufka et.al. | 2409.11175 | null |
2024-09-17 | UltimateDO: An Efficient Framework to Marry Occupancy Prediction with 3D Object Detection via Channel2height | Zichen Yu et.al. | 2409.11160 | null |
2024-09-17 | Unleashing the Potential of Mamba: Boosting a LiDAR 3D Sparse Detector by Using Cross-Model Knowledge Distillation | Rui Yu et.al. | 2409.11018 | null |
2024-09-17 | TrajSSL: Trajectory-Enhanced Semi-Supervised 3D Object Detection | Philip Jacobson et.al. | 2409.10901 | null |
2024-09-18 | Context-Dependent Interactable Graphical User Interface Element Detection for Spatial Computing Applications | Shuqing Li et.al. | 2409.10811 | null |
2024-09-16 | Online Learning via Memory: Retrieval-Augmented Detector Adaptation | Yanan Jian et.al. | 2409.10716 | null |
2024-09-16 | CoMamba: Real-time Cooperative Perception Unlocked with State Space Models | Jinlong Li et.al. | 2409.10699 | null |
2024-09-16 | Point2Graph: An End-to-end Point Cloud-based 3D Open-Vocabulary Scene Graph for Robot Navigation | Yifan Xu et.al. | 2409.10350 | null |
2024-09-16 | Performance of Human Annotators in Object Detection and Segmentation of Remotely Sensed Data | Roni Blushtein-Livnon et.al. | 2409.10272 | null |
2024-09-16 | Self-Updating Vehicle Monitoring Framework Employing Distributed Acoustic Sensing towards Real-World Settings | Xi Wang et.al. | 2409.10259 | null |
2024-09-16 | DAE-Fuse: An Adaptive Discriminative Autoencoder for Multi-Modality Image Fusion | Yuchen Guo et.al. | 2409.10080 | null |
2024-09-16 | Towards Physically-Realizable Adversarial Attacks in Embodied Vision Navigation | Meng Chen et.al. | 2409.10071 | link |
2024-09-16 | LithoHoD: A Litho Simulator-Powered Framework for IC Layout Hotspot Detection | Hao-Chiang Shao et.al. | 2409.10021 | null |
2024-09-16 | Comprehensive Study on Sentiment Analysis: From Rule-based to modern LLM based system | Shailja Gupta et.al. | 2409.09989 | null |
2024-09-15 | Tracking Virtual Meetings in the Wild: Re-identification in Multi-Participant Virtual Meetings | Oriel Perl et.al. | 2409.09841 | null |
2024-09-15 | Template-based Multi-Domain Face Recognition | Anirudh Nanduri et.al. | 2409.09832 | null |
2024-09-15 | PersonaMark: Personalized LLM watermarking for model protection and user attribution | Yuehan Zhang et.al. | 2409.09739 | null |
2024-09-13 | Interactive Masked Image Modeling for Multimodal Object Detection in Remote Sensing | Minh-Duc Vu et.al. | 2409.08885 | null |
2024-09-13 | Direct-CP: Directed Collaborative Perception for Connected and Autonomous Vehicles via Proactive Attention | Yihang Tao et.al. | [240 |