Berlin-Brandenburg Academy of Sciences and Humanities: CLARIN FCS/SRU Endpoints

Corpora available for retrieval via CLARIN Federated Content Search

1.

Berliner Zeitung (1994–2005)

Berliner Zeitung (1994–2005) – Newspaper archive at the Berlin-Brandenburg Academy of Sciences and Humanities.

237,093,180 tokens · 851,194 documents · deu

FCS/SRU
2.

Corpus C4

Joint text corpus drawn from the Digital Dictionary of 20th-Century German (DWDS), the Austrian Academy Corpus (AAC), the Korpus Südtirol, and the Schweizer Textkorpus (CHTK).

49,167,148 tokens · 23,790 documents · deu

FCS/SRU
3.

Polytechnical Journal (Dingler Online)

Text corpus of the Polytechnical Journal (Dingler Online) at the BBAW CLARIN-D Service Centre.

77,565,567 tokens · 42,394 documents · deu

FCS/SRU
4.

German Text Archive (DTA)

German Text Archive (DTA) – Basis for a reference corpus of the New High German language.

243,321,687 tokens · 5,337 documents · deu

FCS/SRU
5.

Die Grenzboten (“Messengers from the Borders”)

Die Grenzboten (“Messengers from the Borders”) (1841–1922) – digitized periodical from the archives of the State and University Library Bremen.

89,235,665 tokens · 311 documents · deu

FCS/SRU
6.

Dortmund Chat Corpus

Dortmund Chat Corpus – Basis for and aid to linguistic investigations of synchronous internet-based communication.

1,003,458 tokens · 470 documents · deu

FCS/SRU
7.

DWDS-Kernkorpus

DWDS-Kernkorpus – balanced corpus of 20th-century German at the Berlin-Brandenburg Academy of Sciences and Humanities.

121,397,601 tokens · 79,116 documents · deu

FCS/SRU
8.

Reference Corpus of Middle High German (ReM)

Reference Corpus of Middle High German (ReM) – a corpus of diplomatically transcribed and annotated Middle High German texts (1050–1350).

2,519,981 tokens · 398 documents · deu

FCS/SRU
9.

Tagesspiegel

Tagesspiegel – Newspaper archive at the Berlin-Brandenburg Academy of Sciences and Humanities.

156,436,295 tokens · 330,790 documents · deu

FCS/SRU
10.

Text+Berg (1864–1900)

Text+Berg corpus (Yearbooks and the current issues of the journal “Die Alpen”).

6,517,111 tokens · 36 documents · deu

FCS/SRU
11.

Text+Berg (1901–2015)

Text+Berg corpus (Yearbooks and the current issues of the journal “Die Alpen”).

22,157,813 tokens · 113 documents · deu

FCS/SRU
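
Each FCS/SRU link above points to an SRU endpoint that can be queried over HTTP. The following is a minimal sketch of how such a request URL might be assembled, not a definitive client. The endpoint address `https://example.org/fcs/sru` is a placeholder (this listing does not reproduce the actual URLs), the helper name `build_search_url` is hypothetical, and SRU version 1.2 is an assumption; a given endpoint may expect a different version.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Placeholder address: substitute the real URL behind one of the FCS/SRU links.
EXAMPLE_ENDPOINT = "https://example.org/fcs/sru"


def build_search_url(endpoint, query, start_record=1, maximum_records=10):
    """Build an SRU searchRetrieve request URL (hypothetical helper).

    The parameter names are the standard SRU request parameters;
    the version value "1.2" is an assumption about the endpoint.
    """
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": query,                      # simple term query
        "startRecord": start_record,         # 1-based offset of the first hit
        "maximumRecords": maximum_records,   # page size
    }
    # urlencode percent-encodes the query string, including non-ASCII terms.
    return endpoint + "?" + urlencode(params)


url = build_search_url(EXAMPLE_ENDPOINT, "Zeitung", maximum_records=5)
print(url)

# Round-trip check: decode the query string back into its parameters.
print(parse_qs(urlsplit(url).query))
```

The resulting URL can be fetched with any HTTP client; the response is an XML document that would still need to be parsed to extract the hits.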