Epstein Files: Entity Analysis

Comprehensive NER extraction from 68,798 document chunks

68,798 Documents Processed
10,004 Connections Found
16,169 Financial Records
24,712 Timeline Events

Entity Explorer

Search and filter extracted entities. Browse persons, organizations, and locations with source citations. Note: entity counts include noise from NER extraction.

Explore Entities →

Entity Network

Interactive force-directed graph showing relationships between top entities. Drag nodes, zoom, and search.

View Network →

Event Timeline

Events from 1990 to 2025, sorted chronologically, with participants and source documents. Dates are parsed from multiple source formats.

View Timeline →
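
As an illustration of how dates in multiple formats could be normalized before sorting, here is a minimal sketch. The format list and the `parse_date` and `sort_events` helpers are assumptions made for this example; the formats the pipeline actually handles are not listed on this page.

```python
from datetime import date, datetime
from typing import Optional

# Hypothetical format list; the real pipeline's formats are not documented here.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%d %B %Y")


def parse_date(text: str) -> Optional[date]:
    """Try each known format in turn; return None if none of them match."""
    cleaned = text.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue
    return None


def sort_events(events: list[dict]) -> list[dict]:
    """Sort events by parsed date, placing events with unparseable dates last."""
    def key(event: dict):
        parsed = parse_date(event["date"])
        return (parsed is None, parsed or date.max)
    return sorted(events, key=key)
```

Keeping unparseable dates at the end, rather than dropping them, leaves those events available for manual review.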

Financial Records

Searchable table of 16,169 financial transactions with amounts, parties, and purposes.

View Financial →

Location Map

Interactive world map showing geographic references in documents. Click markers to see source documents.

View Map →

Embedding Clusters

UMAP visualization of document embeddings showing semantic similarity clusters across all 69,290 documents.

View Clusters →

Document Distribution

Breakdown of document types and volumes across the corpus.

View Distribution →

Methodology

Source: House Oversight Committee Epstein Files Release (November 2025)

Processing: LLM-based Named Entity Recognition using Claude 3 Haiku via AWS Bedrock

Extraction: Structured extraction of persons, organizations, locations, dates, financial records, and relationships

Deduplication: Fuzzy string matching with blocking optimization (85% similarity threshold)

Victim Protection: Known victim names redacted from public outputs
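
The deduplication step can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: it assumes `difflib.SequenceMatcher` as the similarity measure and a first-character blocking key, neither of which is confirmed by this page. Only the 85% threshold comes from the methodology above.

```python
from collections import defaultdict
from difflib import SequenceMatcher

THRESHOLD = 0.85  # similarity threshold stated in the methodology


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(name.lower().split())


def block_key(name: str) -> str:
    """Hypothetical blocking key: first character of the normalized name."""
    return normalize(name)[:1]


def deduplicate(names: list[str]) -> list[list[str]]:
    """Group names into clusters of likely duplicates.

    Blocking restricts comparisons to names that share a key, which avoids
    comparing every name against every other name in the corpus.
    """
    blocks: dict[str, list[str]] = defaultdict(list)
    for name in names:
        blocks[block_key(name)].append(name)

    clusters: list[list[str]] = []
    for members in blocks.values():
        block_clusters: list[list[str]] = []
        for name in members:
            for cluster in block_clusters:
                # Compare against the first member, treated as the representative.
                representative = normalize(cluster[0])
                score = SequenceMatcher(None, normalize(name), representative).ratio()
                if score >= THRESHOLD:
                    cluster.append(name)
                    break
            else:
                block_clusters.append([name])
        clusters.extend(block_clusters)
    return clusters
```

A first-character key is a crude choice: variants that differ in their first letter never get compared. A production pipeline would likely use a more forgiving key, such as a phonetic code or shared tokens.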

Full methodology documentation →