Abstract
The complete Corpus of Historical American English in word, lemma, and part-of-speech data format.
The TAR archive contains 85 zip files of historical linguistic data across multiple genres (academic, fiction, magazine, newspaper, and television and movie sources) from 1820 to 2010.
Texts are separated by a line with ## and the textID.
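The separator convention above can be illustrated with a minimal Python sketch. It assumes, beyond what this record states, that each token line holds tab-separated word, lemma, and part-of-speech columns and that the textID follows the ## marker directly on the same line; the function name iter_texts is hypothetical, and the files should be inspected before relying on it.

```python
from typing import Iterable, Iterator, List, Tuple

Token = Tuple[str, str, str]  # (word, lemma, part of speech)


def iter_texts(lines: Iterable[str]) -> Iterator[Tuple[str, List[Token]]]:
    """Group token lines into texts, splitting on '##<textID>' marker lines.

    Assumption (not confirmed by the record): token lines are tab-separated
    with at least three columns in the order word, lemma, part of speech.
    """
    text_id = None
    tokens: List[Token] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("##"):
            # A marker line closes the previous text and opens a new one.
            if text_id is not None:
                yield text_id, tokens
            text_id = line[2:].strip()
            tokens = []
            continue
        if text_id is None or not line.strip():
            # Skip blank lines and anything before the first marker.
            continue
        parts = line.split("\t")
        if len(parts) >= 3:
            tokens.append((parts[0], parts[1], parts[2]))
    if text_id is not None:
        yield text_id, tokens
```

The generator accepts any iterable of lines, so it can be fed an open file object from an extracted zip member without loading the whole file into memory.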
File Format
.tar
File Size (MB)
1912.85
Creation Date
5-4-2021
Recommended Citation
Davies, Mark. (2010) The Corpus of Historical American English (COHA). Available online at https://www.english-corpora.org/coha/.
License Restrictions
Corpus data is subject to access and use restrictions, including:
- Data cannot be distributed outside Gonzaga
- Access limited to restricted login or password
- Data cannot be used to create software or products for sale or consumption
- Data is for research; substantial portions (50,000 words or more) cannot be made available to undergraduates
- Any publications or products based on the data should reference the source of the data (see Citation Information)