Affiliation:
1. University of New Mexico, Albuquerque, NM, USA
2. Brigham Young University, Provo, USA
Abstract
Studies of word predictability in context show that words in English tend to be shorter if they are predictable from the next word and, to a lesser extent, if they are predictable from the previous word. Some studies distinguish function words from content words, but otherwise they have not considered grammatical factors, treating all two-word sequences as comparable. Because function words are highly frequent, words occurring with them have low predictability. The highest predictability occurs in bigrams consisting of two content words. Using the Buckeye corpus, we show that content word bigrams from different constructions vary widely in predictability, with adjective–noun and noun–noun sequences (content words within a noun phrase) having the highest scores. It is known that in adjective–noun sequences, the vowel of the adjective is shorter than in other positions. We study noun–noun sequences within the noun phrase and show that the first noun is shorter than in other contexts. It follows that, in many cases, the shorter duration of the first word when it is predictable from the second is due to the noun phrase construction and not necessarily to the regulation of duration corresponding to predictable versus unpredictable information. We conclude that predictability studies must consider the constructions in which words occur.