The four-volume set, LNAI 16597–16600, constitutes the proceedings of the 30th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2026, held in Hong Kong, China, during June 9–12, 2026.
The 184 full papers presented in this book were carefully reviewed and selected from 728 submissions.
The program featured three tracks: the Main Track, the Survey Track, and the Special Track on LLMs for Data Science.
The Main Track continued its tradition of being the premier forum for the presentation of research results and experience reports on knowledge discovery, data science, and machine learning.
The Survey Track, first introduced in 2025, promotes the dissemination of insightful survey papers. Such papers are intended to provide a structured synthesis of a particular topic in data mining, including but not limited to theoretical foundations of mining, inference, and learning; big data technologies; and security, privacy, and integrity, for the benefit of junior researchers and of experts from other research fields.
Because rapid advances in Large Language Models (LLMs) have opened new avenues for innovation and research across various domains, particularly in data science, we introduced the Special Track on LLMs for Data Science. The track aimed to explore the transformative potential of LLMs for data science, bringing together researchers, practitioners, and industry experts to discuss the latest developments, challenges, and opportunities in this rapidly growing area.
Table of Contents:
.- AVER: Adversarial Variational Enhanced Representation Architecture for Abstractive Multi-Document Summarization.
.- TheSelective: Dual Affinity–Guided Diffusion for Selective Molecular Generation.
.- A Novel Hierarchical Multi-Agent System for Payments Using LLMs.
.- Rethinking Unlearnable Examples in the Context of the Right to be Forgotten.
.- Divergence-Based Similarity Function for Multi-View Contrastive Learning.
.- RectiCast: Rectifying Distribution Shift in Cascaded Precipitation Nowcasting.
.- DeltaAug: A Directional Semantic Data Augmentation Framework Guided by Differential Reasoning.
.- Counteracting Popularity Bias Amplification in Bundle Recommendations with Latent Factor Constraints.
.- Neuro-Symbolic Learning for Predictive Process Monitoring via Two-Stage Logic Tensor Networks with Rule Pruning.
.- EmbryoCST: A Stage-Aware ResNet with Layered CNN-3D Swin Transformer for Embryo Stage Classification.
.- PMME: Spatio-Temporal Few-shot Learning via Pattern Matching with Memory Enhancement.
.- Semantic Steer Chain: Efficient and Precise Black-Box Jailbreaking through Adaptive Approximation.
.- AutoPrep-MM: Reinforcement Learning for Multimodal Data Preprocessing with Graph-based Interaction Modeling.
.- Meta-RL with Shared Representations Enables Fast Adaptation in Energy Systems.
.- EarlyShield: Early-Stage Screening for Robust Personalized Federated Learning.
.- R^2R: A Route-to-Rerank Post-Training Framework for Multi-Domain Decoder-Only Rerankers.
.- HIGR: Hierarchical Iterative Graph Reasoner for Document-Level Event Causality Identification.
.- Robust Volume Ratio: A Metric for Fine-Grained Evaluation of Tree Ensemble Robustness.
.- MJOS: A Multi-Stage Joint Optimization Strategy for Convolutional Neural Network Compression.
.- FIGDA: Feature Importance Guided Learning Model for Domain-Adaptive Anomaly Detection.
.- Unified Loss-Level Regularization for Spike-Aware and Anti-Lag Time Series Forecasting.
.- BatterySurAD: A Dataset for Anomaly Detection on Cylindrical Battery Surfaces with Spatial Zone Annotations.
.- MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information.
.- Mosaic: an Accurate and Efficient Kernel-based Multivariate Time Series Classifier.
.- Schema-Agnostic Feature Extractor using LLM and Transformer for RUL Prediction.
.- Recommending under-represented influential researchers using geographically aware contrastive learning.
.- TS-Unlearn: A Dual-Objective Unlearning Framework for Time Series Forecasting.
.- Subgraph Plug-in Boosts up Graph Neural Networks.
.- FedCKD: A Knowledge Distillation Approach to Cross-client Learning in Federated Learning with Label-exclusive Datasets.
.- Distillation-based Scenario-Adaptive Mixture-of-Experts for the Matching Stage of Multi-scenario Recommendation.
.- Tri-ABL: Enhancing ABL Using Three Learners.
.- Unsupervised Symbolic Anomaly Detection.
.- Diverse and Task-Specific Data Selection For Instruction Tuning.
.- Noise-Aware Multi-inference Self-regulation for Self-supervised denoising.
.- DrIM: Context-Driven Nearest Neighbor Imputation using Language Representation.
.- LLM-Guided Role Assignment and Coordination in Multi-Agent Reinforcement Learning.
.- ContinualCropBank: Object-Level Replay for Semi-Supervised Online Continual Object Detection.
.- Unlearning Spurious Concepts Enhances the Robustness of Concept-based Explainable Models in Skin Cancer Diagnosis.
.- ELC U-Mamba: Efficient Linear-Chunked Mamba for Coal Microscopic Image Segmentation.
.- SGF-Net: Fusing SMILES, Graph, and Fingerprints for Molecular Property Prediction.
.- HiMoE-CCTC: A Hierarchical Mixture-of-Experts Framework for Citizen Complaint Text Classification.
.- SAM-3D-MSF: Parameter-Efficient Adaptation of Segment Anything Model for 3D Tooth CBCT Segmentation.
.- Relation Prototype Driven Multimodal Knowledge Graph Completion.
.- Anomaly Edge Detection in Dynamic Graphs Based on Hyperbolic Graph Neural Networks.
.- Risk-aware Skill-coverage Hybrid Workforce Configuration on Social Networks.
.- Channel Dependence, Limited Lookback Windows, and the Simplicity of Datasets: How Biased is Time Series Forecasting?.
.- MultiADC: Advanced Antibody-Drug Conjugate Activity Prediction through Multi-scale Feature Fusion.