L3X: Long Object List Extraction from Long Documents
Relation extraction methods from text often prioritize high precision but at the expense of recall. However, high recall is crucial for populating long lists of object entities that stand in a specific relation with a given subject. In long texts, cues for relevant objects can be spread across many passages, posing a challenge for extracting long lists. We present the L3X method which tackles this problem in two stages: (1) recall-oriented generation using a large language model with judicious techniques for retrieval augmentation, and (2) precision-oriented scrutinization to validate or prune candidates.
This demo is both an exploratory tool for researchers and a showcase of the capabilities of L3X for information extraction. It's meant to provide an intuitive way to interact with the datasets, compare model configurations, and highlight the recall-oriented approach of our research.
Problem Example
L3X Framework
Interactive Demo
Try out our interactive demo to extract long object lists from long documents using L3X.
Stage 1
This stage focuses on recall-oriented generation using LLMs in retrieval-augmented generation (RAG) mode, with iterative probing in an ensemble mode. In the L3X variants with RAG, the input is the subject, the relation, and passages retrieved using various queries/prompts, while the non-RAG mode takes the subject, the relation, and few-shot examples for in-context inference. The output shown is the list of objects, along with their generation frequencies, obtained after the top-k iterations and five queries in the ensemble. The following configurations are available:
- LLM-only (without retrieval)
- L3X-def (passages ranked and retrieved using the default retriever scores)
- L3X-amp (passages retrieved through pseudo-relevance feedback re-ranking)
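
The generation frequencies reported by the ensemble can be pictured as a vote count over repeated runs. The following is a minimal sketch, not the L3X implementation: the function name `aggregate_ensemble`, the input layout (one object list per prompt or iteration), and the case/whitespace normalization are all assumptions made for illustration.

```python
from collections import Counter


def aggregate_ensemble(runs):
    """Count in how many runs each object was generated.

    `runs` is a list of object lists, one per ensemble run (hypothetical
    layout). Names are normalized by case and whitespace only.
    """
    counts = Counter()
    for objects in runs:
        # Deduplicate within a run so each run casts at most one vote per object.
        counts.update({" ".join(name.lower().split()) for name in objects})
    # Sort by descending frequency, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


runs = [
    ["Cersei Lannister", "The Hound"],
    ["cersei lannister", "Joffrey"],
    ["The Hound", "Cersei  Lannister"],
]
frequencies = aggregate_ensemble(runs)
print(frequencies)
```

Objects generated in many runs end up at the top of the list, which is the kind of signal a later precision-oriented stage could use.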
Stage 1 (ensemble) illustrative examples
| Config | Dataset | Book | Subject | Relation | Link | Remark |
|---|---|---|---|---|---|---|
| LLM-only | Books | A Song of Ice and Fire | All | All | LLM-only extraction | LLM-only extraction on the A Song of Ice and Fire book, for all test subjects and relations. On this book, the longest and most popular in our dataset, the averaged recall is 60%. |
| L3X-def | Books | A Song of Ice and Fire | All | All | L3X-def extraction | L3X-def extraction on the A Song of Ice and Fire book, for all test subjects and relations. On this book, the longest and most popular in our dataset, the averaged recall is 87%. |
| L3X-amp | Books | A Song of Ice and Fire | All | All | L3X-amp extraction | L3X-amp extraction on the A Song of Ice and Fire book, for all test subjects and relations. On this book, the longest and most popular in our dataset, the averaged recall is 88%. |
| LLM-only | Web | - | All | All | LLM-only extraction | LLM-only extraction across all test subjects and relations. The averaged recall is 42%. |
| L3X-def | Web | - | All | All | L3X-def extraction | L3X-def extraction across all test subjects and relations. The averaged recall is 70%. |
| L3X-amp | Web | - | All | All | L3X-amp extraction | L3X-amp extraction across all test subjects and relations. The averaged recall is 56%. |
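
To make the "averaged recall" figures concrete, here is a small sketch of one plausible way to compute them. It assumes a macro-average over subject-relation test cases with exact string matching; the averaging scheme and the function names `recall` and `averaged_recall` are assumptions, since the page does not spell them out.

```python
def recall(predicted, gold):
    """Fraction of gold objects that appear among the predicted objects."""
    gold = set(gold)
    if not gold:
        return 0.0
    return len(gold & set(predicted)) / len(gold)


def averaged_recall(cases):
    """Macro-average recall over (predicted, gold) pairs, one per test case."""
    scores = [recall(predicted, gold) for predicted, gold in cases]
    return sum(scores) / len(scores)


# Toy data: two hypothetical subject-relation test cases.
cases = [
    (["a", "b"], ["a", "b", "c", "d"]),  # recall 0.5
    (["x"], ["x"]),                      # recall 1.0
]
print(averaged_recall(cases))  # 0.75
```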
Stage 1 (drill-down) illustrative examples
| Config | Dataset | Book | Subject | Relation | Link | Remark |
|---|---|---|---|---|---|---|
| L3X-amp | Books | A Song of Ice and Fire | Arya Stark | enemy | opponents of Arya Stark | The passage (ID: 2962) is one of the high-quality passages with direct cues for the opponent relation, retrieved via pseudo-relevance feedback. |
| L3X-amp | Books | The Void Trilogy | Oscar Monroe | family | family members of Oscar Monroe | The passage (ID: 4440) is one of the high-quality passages with direct cues for the family relation, retrieved via pseudo-relevance feedback. Without passages, LLM-only leads to zero recall. |
Stage 1 (side-by-side comparison) illustrative examples
| Config A | Config B | Dataset | Book | Subject | Relation | Link | Remark |
|---|---|---|---|---|---|---|---|
| LLM-only | L3X-amp | Books | A Song of Ice and Fire | Arya Stark | enemy | enemy for Arya Stark | L3X-amp has much higher recall than the baseline LLM-only method. |
| L3X-def | L3X-amp | Books | The Girl with the Dragon Tattoo | Henrik Vanger | family | family of Henrik Vanger | L3X-amp yields more true positives in the high-frequency category, resulting in higher recall at fixed precision. |
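
"Recall at fixed precision" can be read as: rank the candidates, then find the longest ranked prefix whose precision still meets a target, and report the recall of that prefix. The sketch below illustrates that reading; using the generation frequency as the ranking score, and the function name `recall_at_precision`, are assumptions for illustration only.

```python
def recall_at_precision(scored, gold, min_precision):
    """Best recall over ranked prefixes whose precision is >= min_precision.

    `scored` is a list of (object, score) pairs, e.g. generation frequencies.
    """
    gold = set(gold)
    if not gold:
        return 0.0
    ranked = sorted(scored, key=lambda pair: -pair[1])
    best, hits = 0.0, 0
    for k, (obj, _score) in enumerate(ranked, start=1):
        if obj in gold:
            hits += 1
        # Precision of the top-k prefix is hits / k.
        if hits / k >= min_precision:
            best = max(best, hits / len(gold))
    return best


scored = [("a", 5), ("x", 4), ("b", 3), ("c", 1)]
gold = ["a", "b", "c", "d"]
print(recall_at_precision(scored, gold, 0.7))  # 0.75
print(recall_at_precision(scored, gold, 0.8))  # 0.25
```

A configuration that places more true positives among the high-frequency candidates keeps precision high for longer prefixes, which raises this number.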
Stage 2 (ensemble) illustrative examples
| Config | Dataset | Book | Subject | Relation | Link | Remark |
|---|---|---|---|---|---|---|
| LLM-only | Books | A Song of Ice and Fire | All | All | thresholding on LLM-only | Thresholding (t=0.7) on LLM-only extractions from the A Song of Ice and Fire book, for all test subjects and relations. |
| L3X-def | Books | A Song of Ice and Fire | All | All | thresholding on L3X-def | Thresholding (t=0.7) on L3X-def extractions from the A Song of Ice and Fire book, for all test subjects and relations. |
| L3X-amp | Books | A Song of Ice and Fire | All | All | thresholding on L3X-amp | Thresholding (t=0.7) on L3X-amp extractions from the A Song of Ice and Fire book, for all test subjects and relations. |
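
The thresholding step in these examples can be sketched as a simple filter over scored candidates. This is an illustration under stated assumptions: each candidate is taken to carry a confidence score in [0, 1], the comparison is taken to be inclusive, and the name `prune_candidates` is hypothetical. How L3X actually derives the scores is not described on this page.

```python
def prune_candidates(candidates, threshold=0.7):
    """Keep objects whose score is at least `threshold` (assumed inclusive)."""
    return [obj for obj, score in candidates if score >= threshold]


# Toy candidates with made-up scores, for illustration only.
candidates = [("Cersei Lannister", 0.95), ("Joffrey", 0.70), ("Hot Pie", 0.40)]
kept = prune_candidates(candidates)
print(kept)  # ['Cersei Lannister', 'Joffrey']
```

Raising the threshold trades recall for precision, which is why Stage 2 runs after the recall-oriented Stage 1 rather than replacing it.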
Datasets
We curated two datasets, covering fiction books and web documents. The books dataset consists of 11 books or book series, and addresses 8 relations of long-tailed nature. The web dataset covers ca. 10 million web documents sampled from the C4 corpus (Dodge et al., 2021), focusing on 3 long-tailed factual relations.
Reference
Recall Them All: Retrieval-Augmented Language Models for Long Object List Extraction from Long Documents
Authors: Sneha Singhania, Simon Razniewski, Gerhard Weikum