The effect of corpus size in predicting reaction time in a basic word recognition task: Moving on from Kučera and Francis
This paper discusses corpus size in detail.
KF (Brown Corpus)
The Kučera and Francis (1967; hereafter KF) word frequency (WF) count is derived from a corpus of American English compiled at Brown University. The WF count was published by Kučera and Francis in book form; the corpus (referred to as the Brown corpus) and the WF count are also available from the publisher on computer media. The Brown corpus consists of 1,014,000 words of text in 500 samples of roughly 2,000 words each. A more recent version of this book (Francis & Kučera, 1982; hereafter FK) is also based on the Brown corpus. Both versions contain two frequency lists: an alphabetical list and a rank-ordered list. In the KF version, the alphabetical and rank-ordered lists include all 50,406 word entries. A slightly more limited approach to listing items was taken in the FK version, owing to the additional room required by listing the different grammatical classes separately: approximately 36,000 items are listed in the alphabetical list, and the rank-ordered table lists 5,996 words, all of them items that reached a certain frequency threshold. A very useful feature of the more recent version is that the lists are tagged by grammatical class. In the experiments reported in this paper, we used Form C of the machine-readable version of the FK WF count.
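As a rough illustration of the two list types described above, the following minimal sketch builds an alphabetical list and a thresholded rank-ordered list from a toy token sequence. The function name `frequency_lists`, the parameter `min_count`, and the toy sentence are all hypothetical; the actual frequency threshold used for the FK rank-ordered table is not given in the text, so the value below is only a placeholder, and this is not the procedure used to produce the published counts.

```python
from collections import Counter


def frequency_lists(tokens, min_count=1):
    """Return (alphabetical, rank_ordered) word frequency lists.

    min_count is a hypothetical threshold: only words occurring at least
    this often appear in the rank-ordered list.
    """
    counts = Counter(tokens)

    # Alphabetical list: every word type, sorted by spelling.
    alphabetical = sorted(counts.items())

    # Rank-ordered list: words at or above the threshold, most frequent
    # first, with ties broken alphabetically.
    rank_ordered = sorted(
        ((word, n) for word, n in counts.items() if n >= min_count),
        key=lambda item: (-item[1], item[0]),
    )
    return alphabetical, rank_ordered


# Toy data standing in for a corpus (not taken from the Brown corpus).
toy_tokens = "the cat saw the dog and the dog saw a bird".split()
alphabetical, rank_ordered = frequency_lists(toy_tokens, min_count=2)

# Average sample size implied by the figures quoted in the text.
mean_sample_size = 1_014_000 / 500
```

With a threshold applied, the rank-ordered list is shorter than the alphabetical one, which mirrors the difference between the roughly 36,000 alphabetical items and the 5,996 rank-ordered words in the FK version.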