Abstract

Sample of the Corpus of Contemporary American English (COCA) in its database format for linguistic text data. The data come from spoken language, fiction, magazines, newspapers, academic writing, movie and television subtitles, blogs, and web pages.

This archive includes 8 .txt files of sample data.

See Full-text corpus data for more information on how to use the database format.

File Format

.zip

File Size (MB)

46.9

Creation Date

1-10-2020

Deposit Date

7-11-2024

Recommended Citation

Davies, Mark. (2008-) The Corpus of Contemporary American English (COCA). Available online at https://www.english-corpora.org/coca/.

License Restrictions

Corpora data is subject to access and use restrictions, including:

Data cannot be distributed outside Gonzaga
Access limited to restricted login or password
Data cannot be used to create software or products for sale or consumption
Data is for research, and substantial portions (50,000 words or more) cannot be made available to undergraduates
Any publications or products based on the data should reference the source of the data (see Citation Information)

See the full limitations at Restrictions on use of the corpora.

