L3X: Long Object List Extraction from Long Documents

Relation extraction methods often prioritize high precision at the expense of recall. However, high recall is crucial for populating long lists of object entities that stand in a specific relation with a given subject. In long texts, cues for relevant objects can be spread across many passages, which makes extracting long lists challenging. We present the L3X method, which tackles this problem in two stages: (1) recall-oriented generation using a large language model with judicious techniques for retrieval augmentation, and (2) precision-oriented scrutinization to validate or prune candidates.

This demo is both an exploratory tool for researchers and a showcase of the capabilities of L3X for information extraction. It's meant to provide an intuitive way to interact with the datasets, compare model configurations, and highlight the recall-oriented approach of our research.


Problem Example


L3X Framework


Interactive Demo

Try out our interactive demo to extract long object lists from long documents using L3X.

Stage 1

This stage focuses on recall-oriented generation using LLMs in retrieval-augmented generation (RAG) mode, with iterative probing in an ensemble mode. In the L3X variants with RAG, the input is the subject, the relation, and passages retrieved using various queries/prompts, while the non-RAG mode takes the subject, the relation, and few-shot examples for in-context inference. The output shown is the list of objects, along with their generation frequencies, obtained at the end of top-k iterations and five queries in the ensemble. The following configurations are available:

  • LLM-only (without retrieval)
  • L3X-def (passages ranked and retrieved using the default retriever scores)
  • L3X-amp (passages retrieved through pseudo-relevance feedback re-ranking)
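The generation frequencies mentioned above can be pictured as a count of how many ensemble runs produced each object. The following is a minimal sketch, not the L3X implementation: the function name `aggregate_generations`, the case-insensitive normalization, and the toy object names are all assumptions made for illustration.

```python
from collections import Counter

def aggregate_generations(runs):
    """Count in how many runs each object was generated.

    `runs` holds one object list per LLM call (hypothetically, one per
    query/iteration combination). Names are stripped and case-folded so
    that trivial surface variants are merged; an object repeated within
    a single run is counted only once for that run.
    """
    counts = Counter()
    surface = {}  # normalized key -> first surface form seen
    for objects in runs:
        seen = set()
        for name in objects:
            key = name.strip().casefold()
            if not key or key in seen:
                continue
            seen.add(key)
            counts[key] += 1
            surface.setdefault(key, name.strip())
    # Highest frequency first; ties broken alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(surface[key], freq) for key, freq in ranked]

# Toy data: three runs with made-up object names.
runs = [
    ["Object A", "Object B"],
    ["object a", "Object C"],
    ["Object A", "Object B", "Object B"],
]
ranked = aggregate_generations(runs)
for name, freq in ranked:
    print(f"{name}\t{freq}")
```

Objects generated in many runs end up at the top of the list, which is what makes the frequency usable as a rough confidence signal in later filtering.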


Stage 1 (ensemble) illustrative examples

Config | Dataset | Book | Subject | Relation | Link | Remark
LLM-only | Books | A Song of Ice and Fire | All | All | LLM-only extraction | LLM-only extraction for A Song of Ice and Fire, over all test subjects and relations. Averaged recall on this book, the longest and most popular in our dataset, is 60%.
L3X-def | Books | A Song of Ice and Fire | All | All | L3X-def extraction | L3X-def extraction for A Song of Ice and Fire, over all test subjects and relations. Averaged recall on this book, the longest and most popular in our dataset, is 87%.
L3X-amp | Books | A Song of Ice and Fire | All | All | L3X-amp extraction | L3X-amp extraction for A Song of Ice and Fire, over all test subjects and relations. Averaged recall on this book, the longest and most popular in our dataset, is 88%.
LLM-only | Web | - | All | All | LLM-only extraction | LLM-only extraction across all test subjects and relations. The averaged recall is 42%.
L3X-def | Web | - | All | All | L3X-def extraction | L3X-def extraction across all test subjects and relations. The averaged recall is 70%.
L3X-amp | Web | - | All | All | L3X-amp extraction | L3X-amp extraction across all test subjects and relations. The averaged recall is 56%.
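The averaged recall figures in the table above aggregate over many subject-relation pairs. As a sketch of one plausible way to compute such a number, the code below takes the unweighted mean of per-pair recall (macro-averaging). The page does not state the averaging scheme, so this is an assumption, as are the function names and the toy data.

```python
def recall(predicted, gold):
    """Fraction of gold objects that appear among the predictions."""
    gold_set = {g.casefold() for g in gold}
    if not gold_set:
        return None  # recall is undefined without gold objects
    pred_set = {p.casefold() for p in predicted}
    return len(pred_set & gold_set) / len(gold_set)

def averaged_recall(predictions, ground_truth):
    """Unweighted mean of per-pair recall over all (subject, relation) pairs."""
    scores = []
    for pair, gold in ground_truth.items():
        r = recall(predictions.get(pair, []), gold)
        if r is not None:
            scores.append(r)
    return sum(scores) / len(scores) if scores else 0.0

# Toy data with made-up subjects and objects.
ground_truth = {
    ("subject_1", "family"): ["a", "b", "c", "d"],
    ("subject_2", "enemy"): ["x", "y"],
}
predictions = {
    ("subject_1", "family"): ["a", "b", "z"],
    ("subject_2", "enemy"): ["x", "y"],
}
avg = averaged_recall(predictions, ground_truth)
print(f"averaged recall: {avg:.2f}")
```

With macro-averaging, a subject with a short gold list weighs as much as one with a long list; a micro-average over all gold objects would be a different, equally defensible choice.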

Stage 1 (drill-down) illustrative examples

Config | Dataset | Book | Subject | Relation | Link | Remark
L3X-amp | Books | A Song of Ice and Fire | Arya Stark | enemy | opponents of Arya Stark | The passage (ID: 2962) is one of the high-quality passages with direct cues for the opponent relation, retrieved via pseudo-relevance feedback.
L3X-amp | Books | The Void Trilogy | Oscar Monroe | family | family members of Oscar Monroe | The passage (ID: 4440) is one of the high-quality passages with direct cues for the family relation, retrieved via pseudo-relevance feedback. Without passages, LLM-only yields zero recall.
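Both remarks above refer to passages retrieved via pseudo-relevance feedback. The sketch below illustrates the generic idea only, not the re-ranking used by L3X-amp: the top passages under the initial scores are treated as relevant, their frequent terms expand the query, and all passages are re-scored against the expanded query. The term-overlap scoring, the parameter names, and the toy passages are assumptions.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase alphanumeric tokens; no stop-word removal in this sketch."""
    return re.findall(r"[a-z0-9]+", text.lower())

def prf_rerank(query, passages, initial_scores, feedback_k=2, expansion_terms=3):
    """Return passage indices re-ranked with a simple feedback loop."""
    # 1. Treat the top-scoring passages as pseudo-relevant.
    order = sorted(range(len(passages)), key=lambda i: -initial_scores[i])
    feedback = order[:feedback_k]
    # 2. Expand the query with frequent terms from those passages.
    query_terms = set(tokenize(query))
    term_counts = Counter(
        t for i in feedback for t in tokenize(passages[i]) if t not in query_terms
    )
    expanded = query_terms | {t for t, _ in term_counts.most_common(expansion_terms)}
    # 3. Re-score every passage by its overlap with the expanded query.
    def score(i):
        return len(set(tokenize(passages[i])) & expanded) / len(expanded)
    return sorted(range(len(passages)), key=lambda i: (-score(i), i))

# Toy data: made-up passages and initial retriever scores.
passages = [
    "alice fought her enemy bob",
    "bob attacked alice, her enemy",
    "bob and carol plotted together",
    "the weather was mild",
]
initial_scores = [0.9, 0.8, 0.1, 0.2]
reranked = prf_rerank("alice enemy", passages, initial_scores)
print(reranked)
```

In this toy run the third passage never mentions the query terms, yet it moves above the unrelated fourth passage because it shares a term with the pseudo-relevant ones. That is the effect feedback is meant to have: surfacing passages with indirect cues.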

Stage 1 (side-by-side comparison) illustrative examples

Config A | Config B | Dataset | Book | Subject | Relation | Link | Remark
LLM-only | L3X-amp | Books | A Song of Ice and Fire | Arya Stark | enemy | enemy for Arya Stark | L3X-amp has much higher recall than the baseline LLM-only method.
L3X-def | L3X-amp | Books | The Girl with the Dragon Tattoo | Henrik Vanger | family | family of Henrik Vanger | L3X-amp yields more true positives in the high-frequency category, resulting in higher recall at fixed precision.
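The last remark compares configurations by recall at fixed precision. One common way to compute such a metric, sketched here under the assumption that candidates are ranked by generation frequency and are unique, is to scan the ranked list and keep the best recall among all prefixes whose precision meets the target. The function name and the toy data are hypothetical.

```python
def recall_at_precision(ranked, gold, min_precision):
    """Best recall over all prefixes of `ranked` with precision >= min_precision.

    `ranked` lists unique candidate objects, most confident first.
    """
    gold_set = set(gold)
    if not gold_set:
        raise ValueError("gold list must not be empty")
    best = 0.0
    hits = 0
    for k, obj in enumerate(ranked, start=1):
        if obj in gold_set:
            hits += 1
        precision = hits / k
        if precision >= min_precision:
            best = max(best, hits / len(gold_set))
    return best

# Toy data: 'x' and 'y' are false positives, 'e' is never generated.
ranked = ["a", "b", "x", "c", "y", "d"]
gold = ["a", "b", "c", "d", "e"]
r_at_p = recall_at_precision(ranked, gold, min_precision=0.75)
print(f"recall at precision >= 0.75: {r_at_p:.2f}")
```

A configuration that places more true positives near the top of the ranking reaches a longer prefix before precision drops below the target, and therefore scores higher on this metric.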

Stage 2 (ensemble) illustrative examples

Config Dataset Book Subject Relation Link Remark
Config | Dataset | Book | Subject | Relation | Link | Remark
LLM-only | Books | A Song of Ice and Fire | All | All | thresholding on LLM-only | Thresholding (t=0.7) on LLM-only extractions for A Song of Ice and Fire, for all test subjects and relations.
L3X-def | Books | A Song of Ice and Fire | All | All | thresholding on L3X-def | Thresholding (t=0.7) on L3X-def extractions for A Song of Ice and Fire, for all test subjects and relations.
L3X-amp | Books | A Song of Ice and Fire | All | All | thresholding on L3X-amp | Thresholding (t=0.7) on L3X-amp extractions for A Song of Ice and Fire, for all test subjects and relations.
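Thresholding as a precision-oriented filter can be sketched as follows. The page does not say which score the threshold t=0.7 applies to, so this example simply assumes each candidate carries a score in [0, 1] and is kept when the score is at least t. The function names and the toy scores are hypothetical.

```python
def threshold_candidates(scored, t=0.7):
    """Keep candidates whose score is at least t (scores assumed in [0, 1])."""
    return [obj for obj, score in scored if score >= t]

def precision_recall(predicted, gold):
    """Precision and recall of a predicted list against a gold list."""
    gold_set = set(gold)
    hits = sum(1 for p in predicted if p in gold_set)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold_set) if gold_set else 0.0
    return precision, recall

# Toy data: 'obj_x' is a false positive, 'obj_c' a low-scored true positive.
scored = [("obj_a", 0.95), ("obj_b", 0.80), ("obj_x", 0.55), ("obj_c", 0.40)]
gold = ["obj_a", "obj_b", "obj_c"]

before = precision_recall([obj for obj, _ in scored], gold)
kept = threshold_candidates(scored, t=0.7)
after = precision_recall(kept, gold)
print("before:", before)
print("after: ", after)
```

The toy run shows the expected trade-off: pruning removes the false positive and raises precision, but it also drops a correct low-scored object and lowers recall.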

Datasets

We curated two datasets, covering fiction books and web documents. The books dataset consists of 11 books or book series and covers 8 long-tailed relations. The web dataset covers ca. 10 million web documents sampled from the C4 corpus (Dodge et al., 2021), focusing on 3 long-tailed factual relations.


Reference

Recall Them All: Retrieval-Augmented Language Models for Long Object List Extraction from Long Documents
Authors: Sneha Singhania, Simon Razniewski, Gerhard Weikum