An Analytical Emotion Framework of Rumour Threads on Social Media
Rui Xing, Boyang Sun, Kun Zhang, Preslav Nakov, Timothy Baldwin and Jey Han Lau
(To Appear) In Proceedings of MisD@ICWSM 2025, Copenhagen, Denmark.
Rumours in online social media pose significant risks to modern society, motivating the need to better understand how they develop. We focus on the interface between emotion and rumours in threaded discourses, building on a surprisingly sparse literature that has largely examined a single aspect of emotion within the original rumour posts themselves, overlooking comparative differences between rumours and non-rumours. In this work, we take a step further and provide a comprehensive analytical emotion framework with multi-aspect emotion detection, contrasting rumour and non-rumour threads and providing both correlation and causal analysis of emotions. We apply our framework to existing widely-used rumour datasets to better understand emotion dynamics in online social media threads. Our framework reveals that rumours trigger more negative emotions (e.g., anger, fear, pessimism), while non-rumours evoke more positive ones. Emotions are contagious: rumours spread negativity, while non-rumours spread positivity. Causal analysis shows that surprise bridges rumours and other emotions; pessimism stems from sadness and fear, while optimism arises from joy and love.
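As a flavour of the correlation side of the analysis, here is a toy sketch (not the paper's pipeline): compare the relative frequency of reply emotions under rumour versus non-rumour source posts. The `threads` structure and labels below are invented; in the paper, labels come from multi-aspect emotion detection over real datasets.

```python
# Illustrative only: compare reply-emotion profiles of rumour vs. non-rumour
# threads. The data here is made up; real labels would come from a
# multi-aspect emotion classifier run over the rumour datasets.
from collections import Counter

threads = [
    {"is_rumour": True,  "reply_emotions": ["anger", "fear", "surprise"]},
    {"is_rumour": False, "reply_emotions": ["joy", "optimism", "joy"]},
]

def emotion_profile(threads, rumour: bool) -> dict[str, float]:
    """Relative frequency of each reply emotion within one thread type."""
    counts = Counter(e for t in threads if t["is_rumour"] == rumour
                     for e in t["reply_emotions"])
    total = sum(counts.values())
    return {e: c / total for e, c in counts.items()}

print(emotion_profile(threads, rumour=True))   # negative emotions dominate
print(emotion_profile(threads, rumour=False))  # positive emotions dominate
```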
[paper / code]
|
Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph
Roman Vashurin, Ekaterina Fadeeva, Artem Vazhentsev, Lyudmila Rvanova, Daniil Vasilev, Akim Tsvigun, Sergey Petrakov, Rui Xing, Abdelrahman Sadallah, Kirill Grishchenkov, Alexander Panchenko, Timothy Baldwin, Preslav Nakov, Maxim Panov, Artem Shelmanov
Transactions of the Association for Computational Linguistics (2025)
The rapid proliferation of large language models (LLMs) has stimulated researchers to seek effective and efficient approaches to deal with LLM hallucinations and low-quality outputs. Uncertainty quantification (UQ) is a key element of machine learning applications in dealing with such challenges. However, research to date on UQ for LLMs has been fragmented in terms of techniques and evaluation methodologies. In this work, we address this issue by introducing a novel benchmark that implements a collection of state-of-the-art UQ baselines and offers an environment for controllable and consistent evaluation of novel UQ techniques over various text generation tasks. Our benchmark also supports the assessment of confidence normalization methods in terms of their ability to provide interpretable scores. Using our benchmark, we conduct a large-scale empirical investigation of UQ and normalization techniques across eleven tasks, identifying the most effective approaches.
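To give a flavour of the baselines such a benchmark covers, here is a minimal sketch of mean token entropy, a standard white-box UQ score: higher average entropy over the generated tokens signals higher predictive uncertainty. It is written against the Hugging Face transformers API rather than LM-Polygraph's own, and the model and prompt are placeholders.

```python
# A minimal sketch of one common white-box UQ baseline: mean entropy of the
# next-token distribution over a generated continuation. Model and prompt
# are placeholders, not taken from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def mean_token_entropy(prompt: str, max_new_tokens: int = 30) -> float:
    """Greedily generate a continuation and return the average entropy
    (in nats) of the model's distribution at each generated position."""
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,
            output_scores=True,
            return_dict_in_generate=True,
        )
    entropies = []
    for step_logits in out.scores:  # one logits tensor per generated token
        probs = torch.softmax(step_logits[0], dim=-1)
        entropies.append(-(probs * probs.clamp_min(1e-12).log()).sum().item())
    return sum(entropies) / len(entropies)

print(mean_token_entropy("The capital of Australia is"))
```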
[paper / code]
|
Evaluating Evidence Attribution in Generated Fact Checking Explanations
Rui Xing, Timothy Baldwin and Jey Han Lau
In Proceedings of NAACL HLT 2025, Albuquerque, New Mexico.
Automated fact-checking systems often struggle with trustworthiness, as their generated explanations can include hallucinations. In this work, we explore evidence attribution for fact-checking explanation generation. We introduce a novel evaluation protocol, citation masking and recovery, to assess attribution quality in generated explanations. We implement our protocol using both human annotators and automatic annotators, and find that LLM annotation correlates with human annotation, suggesting that attribution assessment can be automated. Finally, our experiments reveal that: (1) the best-performing LLMs still generate explanations with inaccurate attributions; and (2) human-curated evidence is essential for generating better explanations.
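To make the protocol concrete, here is an illustrative sketch (my reading of it, not the authors' code): citation markers in a generated explanation are masked, an annotator re-assigns each masked slot to an evidence passage, and the recovered citations are scored against the originals. The word-overlap "annotator" below is a stand-in for the human and LLM annotators used in the paper.

```python
# Toy sketch of citation masking and recovery. All data is invented.
import re

def mask_citations(explanation: str):
    """Replace [n] markers with [MASK] and return the gold citation ids."""
    gold = [int(m) for m in re.findall(r"\[(\d+)\]", explanation)]
    return re.sub(r"\[\d+\]", "[MASK]", explanation), gold

def recover(masked: str, evidence: dict[int, str]) -> list[int]:
    """Stand-in annotator: for each sentence with a mask, pick the evidence
    passage with the highest word overlap."""
    recovered = []
    for sent in masked.split(". "):
        if "[MASK]" not in sent:
            continue
        words = set(sent.lower().split())
        best = max(evidence,
                   key=lambda i: len(words & set(evidence[i].lower().split())))
        recovered.append(best)
    return recovered

evidence = {1: "The city recorded its hottest day in 2019.",
            2: "Sea levels rose 20 cm over the last century."}
text = "The hottest day was recorded in 2019 [1]. Sea levels rose 20 cm [2]."
masked, gold = mask_citations(text)
pred = recover(masked, evidence)
accuracy = sum(p == g for p, g in zip(pred, gold)) / len(gold)
print(masked, pred, accuracy)  # perfect recovery = well-attributed explanation
```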
[paper / code]
|
FIRE: Fact-checking with Iterative Retrieval and Verification
Zhuohan Xie, Rui Xing, Yuxia Wang, Jiahui Geng, Hasan Iqbal, Dhruv Sahnan, Iryna Gurevych and Preslav Nakov
In Findings of NAACL HLT 2025, Albuquerque, New Mexico.
Fact-checking long-form text is challenging, and it is therefore common practice to break it down into multiple atomic claims. The typical approach to fact-checking these atomic claims involves retrieving a fixed number of pieces of evidence, followed by a verification step. However, this method is usually not cost-effective, as it underutilizes the verification model's internal knowledge of the claim and fails to replicate the iterative reasoning process in human search strategies. To address these limitations, we propose FIRE, a novel agent-based framework that integrates evidence retrieval and claim verification in an iterative manner. Specifically, FIRE employs a unified mechanism to decide whether to provide a final answer or generate a subsequent search query, based on its confidence in the current judgment. We compare FIRE with other strong fact-checking frameworks and find that it achieves slightly better performance while reducing large language model (LLM) costs by an average of 7.6 times and search costs by 16.5 times. These results indicate that FIRE holds promise for application in large-scale fact-checking operations.
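A minimal sketch of the answer-or-search loop described above, under my own simplifications: `judge` stands in for the LLM that either commits to a verdict or proposes the next query based on its confidence, and `search` for the retrieval backend. Neither reflects the authors' actual implementation.

```python
# Hypothetical sketch of an iterative retrieve-or-verify loop in the spirit
# of the framework described above; `judge` and `search` are caller-supplied.
def fire_verify(claim: str, judge, search, max_steps: int = 5) -> str:
    """Iteratively verify `claim`. At each step `judge` returns either
    ("answer", verdict) when confident, or ("search", query) to gather
    more evidence, which is appended to the running context."""
    evidence: list[str] = []
    for _ in range(max_steps):
        action, payload = judge(claim, evidence)
        if action == "answer":            # confident: stop early, saving calls
            return payload
        evidence.extend(search(payload))  # otherwise run the proposed query
    # fall back to a forced verdict once the step budget is exhausted
    return judge(claim, evidence, force_answer=True)[1]
```

The early exit is what drives the cost savings the abstract reports: when the model is already confident, no further retrieval or LLM calls are spent.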
[paper / code]
|
Automatic Explanation Generation For Climate Science Claims
Rui Xing, Shraey Bhatia, Timothy Baldwin and Jey Han Lau
In Proceedings of the 20th Annual Workshop of the Australasian Language Technology Association (ALTA 2022)
Climate change is an existential threat to humanity, and the proliferation of unsubstantiated claims relating to climate science is manipulating public perception, motivating the need for fact-checking in climate science. In this work, we draw on recent work that uses retrieval-augmented generation for veracity prediction and explanation generation, framing explanation generation as a query-focused multi-document summarization task.
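As a rough illustration of the query-focused multi-document summarization framing (not the paper's code): the claim serves as the query, retrieved evidence documents are concatenated into the context, and a generator is prompted to summarize them into an explanation. The function name and documents below are invented.

```python
# Hypothetical prompt assembly for claim-focused multi-document
# summarization; any seq2seq or instruction-tuned model could consume this.
def build_explanation_prompt(claim: str, documents: list[str]) -> str:
    context = "\n\n".join(f"Document {i + 1}: {d}"
                          for i, d in enumerate(documents))
    return (f"{context}\n\n"
            f"Claim: {claim}\n"
            f"Summarise the documents above into a short explanation of "
            f"whether the claim is supported.")

docs = ["Global mean temperature has risen about 1.1 °C since "
        "pre-industrial times.",
        "Multiple independent records confirm the warming trend."]
print(build_explanation_prompt("The planet is not warming.", docs))
```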
[paper / code]
|
BioRel: towards large-scale biomedical relation extraction
Rui Xing and Jie Luo
BMC Bioinformatics. 2020 Dec 16;21(Suppl 16):543.
We construct BioRel, a large-scale dataset for biomedical relation extraction, using the Unified Medical Language System (UMLS) as the knowledge base and Medline as the corpus.
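For illustration, a toy sketch of the distant-supervision alignment behind datasets of this kind: any sentence mentioning both entities of a knowledge-base triple is labelled with that triple's relation. The miniature KB and sentence are invented; the actual dataset aligns UMLS triples against Medline.

```python
# Toy distant-supervision labelling. The KB entry and sentence are made up.
kb = {("aspirin", "headache"): "may_treat"}

def distant_label(sentence: str):
    """Label a sentence with a KB relation if both entities co-occur.
    Note the weakness: co-occurrence is not evidence, so labels are noisy."""
    s = sentence.lower()
    for (head, tail), relation in kb.items():
        if head in s and tail in s:
            return head, tail, relation
    return None

print(distant_label("Aspirin is commonly taken to relieve a headache."))
```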
[paper / data]
|
Distant Supervised Relation Extraction with Separate Head-Tail CNN
Rui Xing and Jie Luo
In Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019) at EMNLP
Distant supervised relation extraction is an efficient and effective strategy for finding relations between entities in text. However, it inevitably suffers from the mislabeling problem, and the resulting noisy data hinders performance. In this paper, we propose the Separate Head-Tail Convolutional Neural Network (SHTCNN), a novel neural relation extraction framework, to alleviate this issue.
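A minimal PyTorch sketch of the separate head-tail idea as I read it from the abstract: the token sequence is split into head-side and tail-side segments, each encoded by its own convolution before pooling and concatenation. Dimensions and the splitting rule are assumptions, not the paper's specification.

```python
# Illustrative sketch only; not the authors' architecture or hyperparameters.
import torch
import torch.nn as nn

class SeparateHeadTailCNN(nn.Module):
    def __init__(self, emb_dim: int = 50, filters: int = 100, n_rel: int = 5):
        super().__init__()
        # separate convolutions for the head-side and tail-side segments
        self.conv_head = nn.Conv1d(emb_dim, filters, kernel_size=3, padding=1)
        self.conv_tail = nn.Conv1d(emb_dim, filters, kernel_size=3, padding=1)
        self.out = nn.Linear(2 * filters, n_rel)

    def forward(self, x: torch.Tensor, split: int) -> torch.Tensor:
        # x: (batch, seq_len, emb_dim); `split` separates the two segments
        head, tail = x[:, :split], x[:, split:]
        h = self.conv_head(head.transpose(1, 2)).max(dim=2).values  # max-pool
        t = self.conv_tail(tail.transpose(1, 2)).max(dim=2).values
        return self.out(torch.cat([h, t], dim=1))  # relation logits

model = SeparateHeadTailCNN()
logits = model(torch.randn(2, 20, 50), split=8)
print(logits.shape)  # torch.Size([2, 5])
```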
[paper / code]
|
2022: Melbourne Plus digital credential for community engagement, University of Melbourne |
2022: ALTA Student Travelling Scholarship, Australasian Language Technology Association |
2016: Honorable Mention in Mathematical Contest in Modeling, the Consortium for Mathematics and Its Application |
2015, 2016: National Scholarship, Ministry of Education of China |
Tutor |
2024: Undergraduate Research Internship Program (MBZUAI) |
2023 Semester 1: Natural Language Processing COMP90042 (University of Melbourne) |
2023 Semester 1: Statistical Machine Learning COMP90051 (University of Melbourne) |
2022 Semester 2: Statistical Machine Learning COMP90051 (University of Melbourne) |
2022 Semester 1: Natural Language Processing COMP90042 (University of Melbourne) |
2023-present: reviewer, ACL, EMNLP, NAACL, COLING |
2023-2024: ALTA student representative |
2022-2024: student organizer, University of Melbourne NLP Reading Group |
2023: technology chair, ALTA 2023 Workshop |
Volunteer
|
Volunteering has given me many valuable and enjoyable experiences. Here are some selected events I enjoyed. |
|
Hobbies
|
I like swimming and table tennis. I also go hiking and cycling from time to time. |
I am fond of playing and creating games. Pokémon is my favourite.
Copyright © 2025 Rui Xing
|
Template from here.