Humboldt-Universität zu Berlin · Digital History

Discipline-Oriented Retrieval-Augmented Generation

A framework for redesigning RAG systems around historical methodology — preserving source sovereignty, interpretive transparency, and temporal sensitivity where standard architectures undermine them.


Workshop: Hands-on with our UN-corpus RAG system · Berlin · 17 July 2026

Noah Kim-Baumann · Torsten Hiltmann · Professur Digital History

The Problem

Standard RAG wasn't built for historical research

RAG systems are designed for factual question-answering — find the relevant passages, generate the answer. Historical scholarship demands something different: source criticism before interpretation, temporal sensitivity across decades of discourse, and transparent collaborative reasoning rather than seamless answers.

Standard RAG

Seamless pipeline

Query → retrieval → generation in one step. Source selection is a technical optimisation hidden from the researcher. Similarity-based ranking favours recent vocabulary. No built-in space for source criticism. Output presented as answers.

HistoRAG

Structured research process

Two separated phases restore the historian's workflow: a Heuristik phase for source discovery and evaluation, followed by an Analyse phase for interpretation. The researcher curates what enters computational reading. Outputs are Zwischentexte (interpretive proposals, not conclusions).

Design Interventions

Three architectural commitments

Drawing on Agre's Critical Technical Practice, we embed disciplinary values into system architecture rather than accepting computational defaults as neutral.

① Intervention

Separated Retrieval & Generation

Heuristik → Analyse

Formally decouples corpus construction from interpretation. Researchers examine, critique, and curate retrieved sources before any computational "reading" begins, thus restoring the heuristic phase that standard RAG eliminates.
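The separation can be pictured as a gate between the two phases: nothing reaches the interpretation prompt unless the researcher has accepted it. The following is a minimal sketch under our own assumptions, not the HistoRAG code; the names `Chunk`, `CuratedCorpus`, `curate`, and `build_analysis_prompt` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    chunk_id: str
    text: str
    year: int
    accepted: bool = False  # set from the researcher's decision, never by retrieval
    note: str = ""          # researcher's source-critical remark


@dataclass
class CuratedCorpus:
    chunks: list[Chunk] = field(default_factory=list)


def curate(retrieved: list[Chunk], decisions: dict[str, tuple[bool, str]]) -> CuratedCorpus:
    """Heuristik phase: apply the researcher's accept/reject decisions.

    Chunks without an explicit decision are treated as rejected.
    """
    kept = []
    for chunk in retrieved:
        accepted, note = decisions.get(chunk.chunk_id, (False, ""))
        chunk.accepted, chunk.note = accepted, note
        if accepted:
            kept.append(chunk)
    return CuratedCorpus(kept)


def build_analysis_prompt(corpus: CuratedCorpus, research_question: str) -> str:
    """Analyse phase: only curated chunks are passed on to the model."""
    if not corpus.chunks:
        raise ValueError("no curated sources: run the Heuristik phase first")
    sources = "\n".join(f"[{c.chunk_id}, {c.year}] {c.text}" for c in corpus.chunks)
    return f"Research question: {research_question}\n\nSources:\n{sources}"
```

The research question is passed in separately from any retrieval query, mirroring the distinction the framework draws between the two.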

② Intervention

Temporal Windowing

Kontinuitätsannahme

Enforces proportional retrieval across time periods. Left unchecked, similarity-based search embeds presentist bias, privileging sources whose vocabulary matches modern query terms while suppressing formative periods where concepts emerged.
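One simple way to picture temporal windowing is a per-window quota: instead of taking the globally most similar hits, take the best few from each time window. The sketch below assumes an equal quota per window and fixed-width windows; both are our simplifications, and the function name and parameters are hypothetical.

```python
from collections import defaultdict


def windowed_top_k(hits, window_years=5, per_window=2, start_year=1950):
    """Select the best `per_window` hits from each time window.

    hits: iterable of (chunk_id, year, similarity) tuples.
    Returns the selected hits grouped in chronological window order.
    """
    buckets = defaultdict(list)
    for chunk_id, year, score in hits:
        # Map each year to the first year of its window.
        window_start = start_year + ((year - start_year) // window_years) * window_years
        buckets[window_start].append((chunk_id, year, score))

    selected = []
    for window_start in sorted(buckets):
        best = sorted(buckets[window_start], key=lambda h: h[2], reverse=True)
        selected.extend(best[:per_window])
    return selected
```

With hits whose highest similarity scores all fall in the 1970s, a global top-3 would return only 1970s sources, while `windowed_top_k(hits, window_years=10, per_window=1)` returns one source per decade.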

③ Intervention

LLM-as-Judge

Quellenbeleg

Post-retrieval evaluation against researcher-defined criteria. Turns algorithmic selection from a black box into a transparent, argumentative process with scored justifications that can be reviewed and contested.
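A judge step of this kind might look roughly as follows. This is a sketch under stated assumptions: the 1 to 10 scale, the JSON reply format, and the names `Judgement`, `build_judge_prompt`, and `judge_chunk` are all hypothetical, and `ask_model` stands in for whatever model client an instance uses.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Judgement:
    chunk_id: str
    score: int
    justification: str


def build_judge_prompt(chunk_text: str, criteria: list[str]) -> str:
    """Turn researcher-defined criteria into an explicit evaluation prompt."""
    bullet_list = "\n".join(f"- {c}" for c in criteria)
    return (
        "Evaluate the source excerpt against these criteria:\n"
        f"{bullet_list}\n\n"
        f"Excerpt:\n{chunk_text}\n\n"
        'Reply as JSON: {"score": <integer 1-10>, "justification": "<one or two sentences>"}'
    )


def judge_chunk(chunk_id: str, chunk_text: str, criteria: list[str],
                ask_model: Callable[[str], str]) -> Judgement:
    """Ask the model for a score and a justification, and validate the reply."""
    raw = ask_model(build_judge_prompt(chunk_text, criteria))
    data = json.loads(raw)
    score = int(data["score"])
    if not 1 <= score <= 10:
        raise ValueError(f"score out of range: {score}")
    return Judgement(chunk_id, score, str(data["justification"]))
```

Keeping the justification next to the score is what lets a researcher review and contest an individual ranking decision.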

Architecture

Two-phase pipeline

HistoRAG separates the RAG pipeline into distinct phases, each with explicit researcher control points. The architecture is transferable: each implementation configures its own chunking, embedding, and evaluation criteria, and can add further features for its particular use case.

Phase 1 — Heuristik (Source Discovery & Evaluation)

Corpus

source documents + metadata

Chunking & Embedding

configurable sizes → vector store

Semantic Retrieval

cosine similarity · HNSW index · FastText expansion

Temporal Windowing

proportional retrieval per time window

LLM-as-Judge Evaluation

scored justifications · researcher-defined criteria

Ranked & Evaluated Chunks

scores + justifications + full provenance
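The Semantic Retrieval step in Phase 1 ranks chunks by cosine similarity between a query vector and chunk vectors. As a minimal illustration, here is an exhaustive (brute-force) version in plain Python; an HNSW index approximates the same ranking without comparing against every chunk. The function names and the toy vectors are ours, not taken from any instance.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # define similarity with a zero vector as 0
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=3):
    """Return the k (chunk_id, similarity) pairs most similar to the query.

    chunk_vecs: dict mapping chunk_id to its embedding vector.
    """
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in chunk_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy usage with made-up two-dimensional vectors.
example = top_k([1.0, 0.0], {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}, k=2)
```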

Phase 2 — Analyse (Interpretation & Synthesis)

Curated Corpus

researcher-selected chunks + metadata

+ research questions (≠ retrieval queries)

LLM-Assisted Interpretation

thematic synthesis · pattern recognition · multi-model

Zwischentexte

interpretive judgments, not conclusions

Historian Validation & Development

verify citations · contest interpretations · supply context

Historiographic Analysis

source-grounded · transparent reasoning

Verstehen remains with the historian

Concept

Zwischentexte

HistoRAG generates what we term Zwischentexte (intermediate texts). These are not answers but interpretive proposals: they lie between retrieved sources and historical argument, offering first proposals for interpretation that the historian can verify, contest, and develop.

"The central question for LLMs in digital humanities is not whether machines can 'read' but how we design systems that make their interpretive interventions visible and contestable, thereby preserving the scholar's epistemic agency throughout."

Case Study

SPIEGELragged

Our first implementation of HistoRAG, applied to computerisation discourse in Der Spiegel (1950–1979). It tracks how West German society's understanding of automation evolved: from "Elektronenhirn" to "Computer" to "EDV," and from euphoria to anxiety.

102,189

articles in corpus

years covered (1950–1979)

~200k

embedded chunks

ρ = 0.275

similarity ↔ relevance correlation
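The page does not say which coefficient ρ denotes or how relevance was measured. Assuming it is Spearman's rank correlation between similarity scores and relevance scores for the same chunks, such a figure could be computed as sketched below; the function names are ours.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(similarities, relevance_scores):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(similarities), average_ranks(relevance_scores))
```

A low positive value on this measure would mean that the chunks ranked most similar by the embedding are only weakly the ones rated most relevant.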

Not settled conclusions but Zwischentexte — interpretive proposals the system surfaced for the historian to verify, contest, or discard:

1964 as a possible earlier rupture

The system proposed that public anxiety about automation surfaced around 1964 — earlier than the canonical 1978 "Computer-Revolution" account — through reader letters that keyword search misses. A hypothesis to test, not a verdict.

Rationalisierung as semantic battleground

The same term appears to carry opposed meanings by speaker position — efficiency for management, existential threat for workers — a pattern legible only at corpus scale.

Class migration of anxiety

A proposal that technological anxiety moved upward through the class structure over time, becoming socially charged once it reached the discourse-producing classes.

Implementations

HistoRAG Instances

HistoRAG is a transferable framework. Each instance configures the architecture for a specific corpus and research context.

SPIEGELragged

Der Spiegel, 1949–1999

Computerisation discourse across the full archive — the paper's primary case study uses a clean 1950–1979 window (102,189 articles). Its corpus-trained FastText layer surfaces period-specific vocabulary (e.g. Elektronenhirn) a modern researcher wouldn't know to search.

Closed · private research corpus, not open for access requests

UNragged

UN debates, 1946–2024

Every member state's annual address to the General Assembly — 10,969 speeches from ~200 countries — fused at query time with UN Security Council resolutions. The corpus behind our 17 July workshop.

Invite-only · request access

Abgeordnetenhaus Berlin

West Berlin · Drucksachen, 1954–1989

~19,000 parliamentary papers (Drucksachen) of West Berlin's Abgeordnetenhaus, Wahlperioden 1–10 — motions, written questions, bills and reports from the walled city, distinct from East Berlin's Stadtverordnetenversammlung.

Invite-only · request access

Bundestag

Federal Republic (Bonn), 1949–1990

The West German federal parliament — plenary debates (Plenarprotokolle) and printed papers (Drucksachen: bills, motions, written questions) fused in one search. The Bonn-era counterpart to the GDR's Volkskammer, and to Berlin's city parliament above.

Invite-only · request access

CypherRAG

Cypherpunk mailing list, 1992–2000

~98,000 emails from the foundational archive of digital-privacy and crypto-anarchism — May, Hughes, Finney and others. Distinctive feature: thread-aware retrieval that reconstructs whole conversations, not just isolated messages.

Invite-only · request access
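How CypherRAG reconstructs conversations is not described here. One plausible approach, sketched under the assumption that each email carries a message ID and a reply-to reference, is to follow reply links back to a root message and group by that root. The field names `id`, `in_reply_to`, `date`, and `body` are hypothetical.

```python
from collections import defaultdict


def build_threads(messages):
    """Group messages into threads keyed by the ID of their root message.

    messages: list of dicts with 'id', 'in_reply_to' (or None), 'date', 'body'.
    Returns {root_id: [messages sorted by date]}.
    """
    by_id = {m["id"]: m for m in messages}

    def root_of(msg):
        seen = set()  # stops the walk if reply links form a cycle
        while msg["in_reply_to"] in by_id and msg["id"] not in seen:
            seen.add(msg["id"])
            msg = by_id[msg["in_reply_to"]]
        return msg["id"]

    threads = defaultdict(list)
    for m in messages:
        threads[root_of(m)].append(m)
    for thread in threads.values():
        thread.sort(key=lambda m: m["date"])
    return dict(threads)


def expand_hit_to_thread(hit_id, threads):
    """Given the ID of one retrieved message, return its whole conversation."""
    for thread in threads.values():
        if any(m["id"] == hit_id for m in thread):
            return thread
    return []
```

A reply whose parent is missing from the archive becomes the root of its own thread in this sketch, which keeps orphaned messages retrievable.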

Every instance sits behind a sign-in; none is open to the public. SPIEGELragged is a closed, non-public research corpus and is not available for access requests. The other four instances are invite-only: to request access for your own work, write to [email protected]; we add your address to the allowlist and send you the instance link. Researchers connect their own model API key (OpenAI, Anthropic, Google, or DeepSeek) or use a locally hosted, open-weight Qwen model.

Workshop · Berlin

Hands-on with HistoRAG

Work directly with a purpose-built RAG system on the UN General Debate Corpus — the speeches of every UN member state since 1946 — and explore how scholarly sovereignty can be developed and preserved when researching with large language models. For researchers across the humanities, cultural, and social sciences. No programming required; just bring a laptop.

„Souverän forschen mit KI — Kritische Quellenarbeit mit großen Sprachmodellen (LLMs) am Beispiel des UN General Debate Corpus"

Date: Friday, 17 July 2026 · 09:00–13:00
Place: Dorotheenstraße 26, Berlin-Mitte · Room 117 (Flexpool)
Seats: Limited to 25; early registration recommended
Host: Professur Digital History, Humboldt-Universität zu Berlin

Can't make it to Berlin? The same format is offered at DH2026 (Daejeon, 27–31 July 2026) by Torsten Hiltmann and Noah Kim-Baumann.

Publication

Read the Paper

HistoRAG: Embedding Historical Methodology in Retrieval-Augmented Generation Through Critical Technical Practice

Noah J. Kim-Baumann & Torsten Hiltmann · 2026

The framework paper — open-access preprint on arXiv, June 2026.

Read the preprint (arXiv) ↗

Kim-Baumann, N. J. & Hiltmann, T. (2026). HistoRAG: Embedding
Historical Methodology in Retrieval-Augmented Generation Through
Critical Technical Practice. arXiv:2606.18103.

A companion executable-notebook article on the SPIEGELragged case study is under review at the Journal for Digital History.

View JDH notebook on GitHub ↗


Developed at Humboldt-Universität zu Berlin

Noah Kim-Baumann

Torsten Hiltmann
