Corpus compilation: Representativeness and the CORPOBRAS
Abstract
This paper discusses an important parameter in corpus design and compilation: representativeness. This parameter concerns the need for corpora to include texts that represent a variety of language uses, so that comprehensive descriptions can be developed. The paper also presents a corpus of Brazilian Portuguese, CORPOBRAS, which comprises 27 discourse genres and is guided by the representativeness parameter. Finally, the paper lists several corpus-based studies that draw upon CORPOBRAS data.
Key words: CORPOBRAS, corpus linguistics, genre variation, representativeness, oral and written discourse.
License
I grant the journal Calidoscópio the right of first publication of my article, licensed under a Creative Commons Attribution license (which allows the work to be shared with acknowledgment of its authorship and of its initial publication in this journal).
I confirm that my article is not being submitted to another publication and has not been published in its entirety in another journal. I take full responsibility for its originality, and I will also bear responsibility for any claims by third parties concerning the authorship of the article.
I also agree that the manuscript will be submitted according to the journal’s publication rules described above.