A Survey on Evaluation of Large Language Models
Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu, Linyi Yang, Kaijie Zhu, Hao Chen, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang, Wei Ye, Yue Zhang, Yi Chang, Philip S. Yu, Qiang Yang, Xing Xie
Large language models (LLMs) are gaining increasing popularity in both academia and industry, owing to their unprecedented performance in various applications. As LLMs continue to play a vital role in both research and daily use, their evaluation becomes ...
Deep Learning in Single-cell Analysis
Dylan Molho, Jiayuan Ding, Wenzhuo Tang, Zhaoheng Li, Hongzhi Wen, Yixin Wang, Julian Venegas, Wei Jin, Renming Liu, Runze Su, Patrick Danaher, Robert Yang, Yu Leo Lei, Yuying Xie, Jiliang Tang
Single-cell technologies are revolutionizing the entire field of biology. The large volumes of data generated by single-cell technologies are high dimensional, sparse, and heterogeneous and have complicated dependency structures, making analyses using ...
MGRR-Net: Multi-level Graph Relational Reasoning Network for Facial Action Unit Detection
The Facial Action Coding System (FACS) encodes the action units (AUs) in facial images, which has attracted extensive research attention due to its wide use in facial expression analysis. Many methods that perform well on automatic facial action unit (AU) ...
Bayesian Strategy Networks Based Soft Actor-Critic Learning
A strategy refers to the rules by which an agent chooses among the available actions to achieve its goals. Adopting reasonable strategies is challenging but crucial for an intelligent agent with limited resources working in hazardous, unstructured, and dynamic ...
Internal Rehearsals for a Reconfigurable Robot to Improve Area Coverage Performance
Reconfigurable robots are deployed for applications demanding area coverage, such as cleaning and inspections. Reconfiguration per context, considering beyond a small set of predefined shapes, is crucial for area coverage performance. However, the ...
Guidelines for the Regularization of Gammas in Batch Normalization for Deep Residual Networks
L2 regularization for weights in neural networks is widely used as a standard training trick. In addition to weights, the use of batch normalization involves an additional trainable parameter γ, which acts as a scaling factor. However, L2 regularization ...
Multimodal Dialogue Systems via Capturing Context-aware Dependencies and Ordinal Information of Semantic Elements
The topic of multimodal conversation systems has recently garnered significant attention across various industries, including travel and retail, among others. While pioneering works in this field have shown promising performance, they often focus solely ...
CACTUS: A Comprehensive Abstraction and Classification Tool for Uncovering Structures
The availability of large datasets is providing the impetus for many current artificial intelligence developments. However, specific challenges arise in developing solutions that exploit small datasets, mainly due to practical and cost-effective ...
Advancing Attribution-Based Neural Network Explainability through Relative Absolute Magnitude Layer-Wise Relevance Propagation and Multi-Component Evaluation
Recent advances in deep neural network performance have led to the development of new state-of-the-art approaches in numerous areas. However, the black-box nature of neural networks often prohibits their use in areas where model explainability and model ...
Learning Cross-modality Interaction for Robust Depth Perception of Autonomous Driving
As one of the fundamental tasks of autonomous driving, depth perception aims to perceive physical objects in three dimensions and to judge their distances away from the ego vehicle. Although great efforts have been made for depth perception, LiDAR-based ...
Tapestry of Time and Actions: Modeling Human Activity Sequences Using Temporal Point Process Flows
Human beings always engage in a vast range of activities and tasks that demonstrate their ability to adapt to different scenarios. These activities can range from the simplest daily routines, like walking and sitting, to multi-level complex endeavors such ...
Deconfounded Cross-modal Matching for Content-based Micro-video Background Music Recommendation
Object-oriented micro-video background music recommendation is a complicated task where the matching degree between videos and background music is a major issue. However, music selections in user-generated content (UGC) are prone to selection bias caused ...
MHGCN+: Multiplex Heterogeneous Graph Convolutional Network
Heterogeneous graph convolutional networks have gained great popularity in tackling various network analytical tasks on heterogeneous graph data, ranging from link prediction to node classification. However, most existing works ignore the relation ...
A Game-theoretic Framework for Privacy-preserving Federated Learning
In federated learning, benign participants aim to optimize a global model collaboratively. However, the risk of privacy leakage cannot be ignored in the presence of semi-honest adversaries. Existing research has focused either on designing protection ...
Self-supervised Bipartite Graph Representation Learning: A Dirichlet Max-margin Matrix Factorization Approach
Bipartite graph representation learning aims to obtain node embeddings by compressing sparse vectorized representations of interactions between two types of nodes, e.g., users and items. Incorporating structural attributes among homogeneous nodes, such as ...
A Meta-Learning Framework for Tuning Parameters of Protection Mechanisms in Trustworthy Federated Learning
Trustworthy federated learning typically leverages protection mechanisms to guarantee privacy. However, protection mechanisms inevitably introduce utility loss or efficiency reduction while protecting data privacy. Therefore, protection mechanisms and ...
Ensuring Fairness and Gradient Privacy in Personalized Heterogeneous Federated Learning
With the increasing tension between the conflicting requirements of making large amounts of data available for effective machine learning-based analysis and ensuring their privacy, the paradigm of federated learning has emerged, a distributed machine ...
FedCMD: A Federated Cross-modal Knowledge Distillation for Drivers’ Emotion Recognition
Emotion recognition has attracted a lot of interest in recent years in various application areas such as healthcare and autonomous driving. Existing approaches to emotion recognition are based on visual, speech, or psychophysiological signals. However, ...
Perceiving Actions via Temporal Video Frame Pairs
Video action recognition aims at classifying the action category in given videos. In general, semantic-relevant video frame pairs reflect significant action patterns such as object appearance variation and abstract temporal concepts like speed, rhythm, ...
Score-based Graph Learning for Urban Flow Prediction
Accurate urban flow prediction (UFP) is crucial for a range of smart city applications such as traffic management, urban planning, and risk assessment. To capture the intrinsic characteristics of urban flow, recent efforts have utilized spatial and ...
HydraGAN: A Cooperative Agent Model for Multi-Objective Data Generation
Generative adversarial networks have become a de facto approach to generate synthetic data points that resemble their real counterparts. We tackle the situation where the realism of individual samples is not the sole criterion for synthetic data ...
Quintuple-based Representation Learning for Bipartite Heterogeneous Networks
Recent years have seen rapid progress in network representation learning, which removes the need for burdensome feature engineering and facilitates downstream network-based tasks. In reality, networks often exhibit heterogeneity, which means there may ...
Analysing Utterances in LLM-Based User Simulation for Conversational Search
Clarifying underlying user information needs by asking clarifying questions is an important feature of modern conversational search systems. However, evaluation of such systems through answering prompted clarifying questions requires significant human ...