Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Posts

ConflictScore: Measuring How Language Models Handle Conflicting Evidence

4 minute read

Published: March 02, 2026

TL;DR: Existing “factuality/faithfulness” metrics usually ask: is the answer supported by the evidence?
ConflictScore asks a sharper question: what if the evidence set itself disagrees, and the model acts overconfident anyway?
We introduce a claim-level metric (CS-C, CS-R) and a benchmark (ConflictBench), and show that conflict-aware regeneration improves truthfulness on TruthfulQA.

Portfolio

Open Domain QA with Conflicting Contexts

For 25% of unambiguous open-domain questions, the contexts retrieved using Google Search can conflict with one another.

Information Pollution & Multi-Perspective Search

Perspectives-oriented search engine and multi-perspective news editorial corpus.

News Framing

Detecting frames in news headlines and analyzing framing trends surrounding US gun violence.

Publications

Detecting Frames in News Headlines and Its Application to Analyzing News Framing Trends Surrounding US Gun Violence

Published in CoNLL 2019, 2019

Detecting frames in news headlines and analyzing framing trends surrounding US gun violence.

Recommended citation: Siyi Liu, Lei Guo, Kate Mays, Margrit Betke, Derry Tanti Wijaya. "Detecting Frames in News Headlines and Its Application to Analyzing News Framing Trends Surrounding US Gun Violence." CoNLL 2019.
Download Paper

Learning to Mirror Speaking Styles Incrementally

Published on arXiv, 2020

Learning to mirror speaking styles incrementally using neural approaches.

Recommended citation: Siyi Liu*, Ziang Leng*, Derry Wijaya. "Learning to Mirror Speaking Styles Incrementally." arXiv.
Download Paper

MultiOpEd: A Corpus of Multi-Perspective News Editorials

Published in NAACL 2021, 2021

A corpus of multi-perspective news editorials for studying opinion diversity.

Recommended citation: Siyi Liu, Sihao Chen, Xander Uyttendaele, Dan Roth. "MultiOpEd: A Corpus of Multi-Perspective News Editorials." NAACL 2021.
Download Paper | Download Slides

Design Challenges for a Multi-Perspective Search Engine

Published in NAACL 2022 Findings, 2022

Designing a search engine that presents multiple perspectives on controversial topics.

Recommended citation: Sihao Chen*, Siyi Liu*, Xander Uyttendaele, Yi Zhang, William Bruno, Dan Roth. "Design Challenges for a Multi-Perspective Search Engine." NAACL 2022 Findings.
Download Paper

Open-Domain Event Graph Induction for Mitigating Framing Bias

Published on arXiv, 2023

Open-domain event graph induction for mitigating framing bias in news.

Recommended citation: Siyi Liu, Hongming Zhang, Hongwei Wang, Kaiqiang Song, Dan Roth, Dong Yu. "Open-Domain Event Graph Induction for Mitigating Framing Bias." arXiv.
Download Paper

Using LLM for Improving Key Event Discovery: Temporal-Guided News Stream Clustering with Event Summaries

Published in EMNLP 2023, 2023

Temporal-guided news stream clustering with event summaries using LLMs.

Recommended citation: Nishanth Nakshatri, Siyi Liu, Sihao Chen, Daniel Hopkins, Dan Roth, Dan Goldwasser. "Using LLM for Improving Key Event Discovery: Temporal-Guided News Stream Clustering with Event Summaries." EMNLP 2023.
Download Paper

Towards Long Context Hallucination Detection

Published in NAACL 2025 Findings, 2025

Automatic hallucination detection for long context documents.

Recommended citation: Siyi Liu, Kishaloy Halder, et al. "Towards Long Context Hallucination Detection." NAACL 2025 Findings.
Download Paper

Open Domain Question Answering with Conflicting Contexts

Published in NAACL 2025 Findings, 2025

For 25% of unambiguous open-domain questions, the contexts retrieved using Google Search can conflict with one another.

Recommended citation: Siyi Liu, Qiang Ning, et al. "Open Domain Question Answering with Conflicting Contexts." NAACL 2025 Findings.
Download Paper

DeeptraceReward: Learning Human-Perceived Fakeness in Generated Videos with Multimodal LLMs

Published in NeurIPS GenProCC Workshop 2025, 2025

Learning human-perceived fakeness in generated videos using multimodal LLMs.

Recommended citation: Xingyu Fu, Siyi Liu, et al. "DeeptraceReward: Learning Human-Perceived Fakeness in Generated Videos with Multimodal LLMs." NeurIPS GenProCC Workshop 2025.
Download Paper

ConflictScore: Measuring How Language Models Handle Conflicting Evidence

In submission, 2026

We propose ConflictScore, a metric for measuring how language models handle conflicting evidence.

Recommended citation: Siyi Liu, Patrick Xia, et al. "ConflictScore: Measuring How Language Models Handle Conflicting Evidence." In submission.
Download Paper

Siyi Liu
