Comprehensive NER extraction from 68,798 document chunks
Search and filter extracted entities. Browse persons, organizations, and locations with source citations. Note: entity counts include noise from NER extraction.
Explore Entities →
Interactive force-directed graph showing relationships between top entities. Drag nodes, zoom, and search.
View Network →
Chronologically sorted events (1990-2025) with participants and source documents. Dates parsed from multiple formats.
View Timeline →
Searchable table of 16,169 financial transactions with amounts, parties, and purposes.
View Financial →
Interactive world map showing geographic references in documents. Click markers to see source documents.
View Map →
UMAP visualization of document embeddings showing semantic similarity clusters across all 69,290 documents.
View Clusters →
Breakdown of document types and volumes across the corpus.
View Distribution →
Source: House Oversight Committee Epstein Files Release (November 2025)
Processing: LLM-based Named Entity Recognition using Claude 3 Haiku via AWS Bedrock
Extraction: Structured extraction of persons, organizations, locations, dates, financial records, and relationships
Deduplication: Fuzzy string matching with blocking optimization (85% similarity threshold)
Victim Protection: Known victim names redacted from public outputs
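
The deduplication step listed above (fuzzy string matching with blocking and an 85% similarity threshold) can be illustrated with a minimal sketch. This is not the project's actual code: the similarity measure (Python's standard-library `difflib.SequenceMatcher`), the blocking key (first character of the normalized name), and the function names `normalize`, `blocking_key`, and `dedupe` are all assumptions chosen for illustration. Only the 0.85 threshold comes from the description above.

```python
from collections import defaultdict
from difflib import SequenceMatcher

THRESHOLD = 0.85  # the 85% similarity threshold quoted in the methodology


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(name.lower().split())


def blocking_key(name: str) -> str:
    """Hypothetical blocking key: first character of the normalized name."""
    return normalize(name)[:1]


def dedupe(names: list[str]) -> list[list[str]]:
    """Group names whose similarity to a group's first member meets THRESHOLD.

    Blocking means names are only compared within the same block, which
    avoids comparing every name against every other name.
    """
    blocks = defaultdict(list)
    for name in names:
        blocks[blocking_key(name)].append(name)

    groups = []
    for block in blocks.values():
        block_groups = []
        for name in block:
            for group in block_groups:
                ratio = SequenceMatcher(
                    None, normalize(group[0]), normalize(name)
                ).ratio()
                if ratio >= THRESHOLD:
                    group.append(name)  # close enough: treat as a duplicate
                    break
            else:
                block_groups.append([name])  # no match: start a new group
        groups.extend(block_groups)
    return groups


if __name__ == "__main__":
    sample = ["Acme Corporation", "ACME Corporaton", "Apex Holdings", "Zenith Bank"]
    for group in dedupe(sample):
        print(group)
```

One trade-off of this particular blocking key is that two spellings differing in their first character are never compared, so some true duplicates would be missed. A production pipeline would likely use a more forgiving key, which may be one source of the noise in entity counts noted earlier.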