Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xiyuan Wang

Be.FM: Open Foundation Models for Human Behavior

May 29, 2025

Yutong Xie, Zhuoheng Li, Xiyuan Wang, Yijun Pan, Qijia Liu, Xingzhi Cui, Kuang-Yu Lo, Ruoyi Gao, Xingjian Zhang, Jin Huang(+3 more)

Abstract:Despite their success in numerous fields, the potential of foundation models for modeling and understanding human behavior remains largely unexplored. We introduce Be.FM, one of the first open foundation models designed for human behavior modeling. Built upon open-source large language models and fine-tuned on a diverse range of behavioral data, Be.FM can be used to understand and predict human decision-making. We construct a comprehensive set of benchmark tasks for testing the capabilities of behavioral foundation models. Our results demonstrate that Be.FM can predict behaviors, infer characteristics of individuals and populations, generate insights about contexts, and apply behavioral science knowledge.

Via

Access Paper or Ask Questions

OCN: Effectively Utilizing Higher-Order Common Neighbors for Better Link Prediction

May 26, 2025

Juntong Wang, Xiyuan Wang, Muhan Zhang

Abstract:Common Neighbors (CNs) and their higher-order variants are important pairwise features widely used in state-of-the-art link prediction methods. However, existing methods often struggle with the repetition across different orders of CNs and fail to fully leverage their potential. We identify that these limitations stem from two key issues: redundancy and over-smoothing in high-order common neighbors. To address these challenges, we design orthogonalization to eliminate redundancy between different-order CNs and normalization to mitigate over-smoothing. By combining these two techniques, we propose Orthogonal Common Neighbor (OCN), a novel approach that significantly outperforms the strongest baselines by an average of 7.7% on popular link prediction benchmarks. A thorough theoretical analysis is provided to support our method. Ablation studies also verify the effectiveness of our orthogonalization and normalization techniques.

* 35 pages, 10 figures

Via

Access Paper or Ask Questions

Griffin: Towards a Graph-Centric Relational Database Foundation Model

May 08, 2025

Yanbo Wang, Xiyuan Wang, Quan Gan, Minjie Wang, Qibin Yang, David Wipf, Muhan Zhang

Abstract:We introduce Griffin, the first foundation model attemptation designed specifically for Relational Databases (RDBs). Unlike previous smaller models focused on single RDB tasks, Griffin unifies the data encoder and task decoder to handle diverse tasks. Additionally, we enhance the architecture by incorporating a cross-attention module and a novel aggregator. Griffin utilizes pretraining on both single-table and RDB datasets, employing advanced encoders for categorical, numerical, and metadata features, along with innovative components such as cross-attention modules and enhanced message-passing neural networks (MPNNs) to capture the complexities of relational data. Evaluated on large-scale, heterogeneous, and temporal graphs extracted from RDBs across various domains (spanning over 150 million nodes), Griffin demonstrates superior or comparable performance to individually trained models, excels in low-data scenarios, and shows strong transferability with similarity and diversity in pretraining across new datasets and tasks, highlighting its potential as a universally applicable foundation model for RDBs. Code available at https://github.com/yanxwb/Griffin.

Via

Access Paper or Ask Questions

Influence Maximization in Temporal Social Networks with a Cold-Start Problem: A Supervised Approach

Apr 15, 2025

Laixin Xie, Ying Zhang, Xiyuan Wang, Shiyi Liu, Shenghan Gao, Xingxing Xing, Wei Wan, Haipeng Zhang, Quan Li

Abstract:Influence Maximization (IM) in temporal graphs focuses on identifying influential "seeds" that are pivotal for maximizing network expansion. We advocate defining these seeds through Influence Propagation Paths (IPPs), which is essential for scaling up the network. Our focus lies in efficiently labeling IPPs and accurately predicting these seeds, while addressing the often-overlooked cold-start issue prevalent in temporal networks. Our strategy introduces a motif-based labeling method and a tensorized Temporal Graph Network (TGN) tailored for multi-relational temporal graphs, bolstering prediction accuracy and computational efficiency. Moreover, we augment cold-start nodes with new neighbors from historical data sharing similar IPPs. The recommendation system within an online team-based gaming environment presents subtle impact on the social network, forming multi-relational (i.e., weak and strong) temporal graphs for our empirical IM study. We conduct offline experiments to assess prediction accuracy and model training efficiency, complemented by online A/B testing to validate practical network growth and the effectiveness in addressing the cold-start issue.

* Accepted by ICWSM 2025

Via

Access Paper or Ask Questions

LIFT: Improving Long Context Understanding of Large Language Models through Long Input Fine-Tuning

Feb 20, 2025

Yansheng Mao, Yufei Xu, Jiaqi Li, Fanxu Meng, Haotong Yang, Zilong Zheng, Xiyuan Wang, Muhan Zhang

Abstract:Long context understanding remains challenging for large language models due to their limited context windows. This paper presents Long Input Fine-Tuning (LIFT), a novel framework for long-context modeling that can improve the long-context performance of arbitrary (short-context) LLMs by dynamically adapting model parameters based on the long input. Importantly, LIFT, rather than endlessly extending the context window size to accommodate increasingly longer inputs in context, chooses to store and absorb the long input in parameter. By fine-tuning the long input into model parameters, LIFT allows short-context LLMs to answer questions even when the required information is not provided in the context during inference. Furthermore, to enhance LIFT performance while maintaining the original in-context learning (ICL) capabilities, we introduce Gated Memory, a specialized attention adapter that automatically balances long input memorization and ICL. We provide a comprehensive analysis of the strengths and limitations of LIFT on long context understanding, offering valuable directions for future research.

* arXiv admin note: text overlap with arXiv:2412.13626

Via

Access Paper or Ask Questions

Using Random Noise Equivariantly to Boost Graph Neural Networks Universally

Feb 04, 2025

Xiyuan Wang, Muhan Zhang

Abstract:Recent advances in Graph Neural Networks (GNNs) have explored the potential of random noise as an input feature to enhance expressivity across diverse tasks. However, naively incorporating noise can degrade performance, while architectures tailored to exploit noise for specific tasks excel yet lack broad applicability. This paper tackles these issues by laying down a theoretical framework that elucidates the increased sample complexity when introducing random noise into GNNs without careful design. We further propose Equivariant Noise GNN (ENGNN), a novel architecture that harnesses the symmetrical properties of noise to mitigate sample complexity and bolster generalization. Our experiments demonstrate that using noise equivariantly significantly enhances performance on node-level, link-level, subgraph, and graph-level tasks and achieves comparable performance to models designed for specific tasks, thereby offering a general method to boost expressivity across various graph tasks.

* Under review

Via

Access Paper or Ask Questions

Do Graph Diffusion Models Accurately Capture and Generate Substructure Distributions?

Feb 04, 2025

Xiyuan Wang, Yewei Liu, Lexi Pang, Siwei Chen, Muhan Zhang

Abstract:Diffusion models have gained popularity in graph generation tasks; however, the extent of their expressivity concerning the graph distributions they can learn is not fully understood. Unlike models in other domains, popular backbones for graph diffusion models, such as Graph Transformers, do not possess universal expressivity to accurately model the distribution scores of complex graph data. Our work addresses this limitation by focusing on the frequency of specific substructures as a key characteristic of target graph distributions. When evaluating existing models using this metric, we find that they fail to maintain the distribution of substructure counts observed in the training set when generating new graphs. To address this issue, we establish a theoretical connection between the expressivity of Graph Neural Networks (GNNs) and the overall performance of graph diffusion models, demonstrating that more expressive GNN backbones can better capture complex distribution patterns. By integrating advanced GNNs into the backbone architecture, we achieve significant improvements in substructure generation.

* Under Review

Via

Access Paper or Ask Questions

Exact Acceleration of Subgraph Graph Neural Networks by Eliminating Computation Redundancy

Dec 24, 2024

Qian Tao, Xiyuan Wang, Muhan Zhang, Shuxian Hu, Wenyuan Yu, Jingren Zhou

Figure 1 for Exact Acceleration of Subgraph Graph Neural Networks by Eliminating Computation Redundancy

Figure 2 for Exact Acceleration of Subgraph Graph Neural Networks by Eliminating Computation Redundancy

Figure 3 for Exact Acceleration of Subgraph Graph Neural Networks by Eliminating Computation Redundancy

Figure 4 for Exact Acceleration of Subgraph Graph Neural Networks by Eliminating Computation Redundancy

Abstract:Graph neural networks (GNNs) have become a prevalent framework for graph tasks. Many recent studies have proposed the use of graph convolution methods over the numerous subgraphs of each graph, a concept known as subgraph graph neural networks (subgraph GNNs), to enhance GNNs' ability to distinguish non-isomorphic graphs. To maximize the expressiveness, subgraph GNNs often require each subgraph to have equal size to the original graph. Despite their impressive performance, subgraph GNNs face challenges due to the vast number and large size of subgraphs which lead to a surge in training data, resulting in both storage and computational inefficiencies. In response to this problem, this paper introduces Ego-Nets-Fit-All (ENFA), a model that uniformly takes the smaller ego nets as subgraphs, thereby providing greater storage and computational efficiency, while at the same time guarantees identical outputs to the original subgraph GNNs even taking the whole graph as subgraphs. The key is to identify and eliminate the redundant computation among subgraphs. For example, a node $v_i$ may appear in multiple subgraphs but is far away from all of their centers (the unsymmetric part between subgraphs). Therefore, its first few rounds of message passing within each subgraph can be computed once in the original graph instead of being computed multiple times within each subgraph. Such strategy enables our ENFA to accelerate subgraph GNNs in an exact way, unlike previous sampling approaches that often lose the performance. Extensive experiments across various datasets reveal that compared with the conventional subgraph GNNs, ENFA can reduce storage space by 29.0% to 84.5% and improve training efficiency by up to 1.66x.

Via

Access Paper or Ask Questions

How Different AI Chatbots Behave? Benchmarking Large Language Models in Behavioral Economics Games

Dec 16, 2024

Yutong Xie, Yiyao Liu, Zhuang Ma, Lin Shi, Xiyuan Wang, Walter Yuan, Matthew O. Jackson, Qiaozhu Mei

Abstract:The deployment of large language models (LLMs) in diverse applications requires a thorough understanding of their decision-making strategies and behavioral patterns. As a supplement to a recent study on the behavioral Turing test, this paper presents a comprehensive analysis of five leading LLM-based chatbot families as they navigate a series of behavioral economics games. By benchmarking these AI chatbots, we aim to uncover and document both common and distinct behavioral patterns across a range of scenarios. The findings provide valuable insights into the strategic preferences of each LLM, highlighting potential implications for their deployment in critical decision-making roles.

* Presented at The First Workshop on AI Behavioral Science (AIBS 2024)

Via

Access Paper or Ask Questions

GL-Fusion: Rethinking the Combination of Graph Neural Network and Large Language model

Dec 08, 2024

Haotong Yang, Xiyuan Wang, Qian Tao, Shuxian Hu, Zhouchen Lin, Muhan Zhang

Figure 1 for GL-Fusion: Rethinking the Combination of Graph Neural Network and Large Language model

Figure 2 for GL-Fusion: Rethinking the Combination of Graph Neural Network and Large Language model

Figure 3 for GL-Fusion: Rethinking the Combination of Graph Neural Network and Large Language model

Figure 4 for GL-Fusion: Rethinking the Combination of Graph Neural Network and Large Language model

Abstract:Recent research on integrating Large Language Models (LLMs) with Graph Neural Networks (GNNs) typically follows two approaches: LLM-centered models, which convert graph data into tokens for LLM processing, and GNN-centered models, which use LLMs to encode text features into node and edge representations for GNN input. LLM-centered models often struggle to capture graph structures effectively, while GNN-centered models compress variable-length textual data into fixed-size vectors, limiting their ability to understand complex semantics. Additionally, GNN-centered approaches require converting tasks into a uniform, manually-designed format, restricting them to classification tasks and preventing language output. To address these limitations, we introduce a new architecture that deeply integrates GNN with LLM, featuring three key innovations: (1) Structure-Aware Transformers, which incorporate GNN's message-passing capabilities directly into LLM's transformer layers, allowing simultaneous processing of textual and structural information and generating outputs from both GNN and LLM; (2) Graph-Text Cross-Attention, which processes full, uncompressed text from graph nodes and edges, ensuring complete semantic integration; and (3) GNN-LLM Twin Predictor, enabling LLM's flexible autoregressive generation alongside GNN's scalable one-pass prediction. GL-Fusion achieves outstand performance on various tasks. Notably, it achieves state-of-the-art performance on OGBN-Arxiv and OGBG-Code2.

* under review

Via

Access Paper or Ask Questions