To keep students from developing a negative attitude toward learning, contemporary teachers are expected to abandon spoon-feeding in the English classroom and adopt micro-class teaching, which emphasizes an approachable, student-centered style and creates an efficient, short, and focused teaching space that meets students' learning needs. In this study, short-video description technology is applied to college English teaching, and a new attention-based model for generating natural-language descriptions of short videos is designed. The video feature sequence may be out of sync with the generated word sequence; that is, the order in which objects and actions appear in the video may differ from the order in which they are mentioned in the description sentence.
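To illustrate how an attention mechanism can handle a video feature sequence that is out of sync with the word sequence, the following is a minimal sketch of soft attention over frame features. It is not the model designed in this article: the dot-product scoring function, the names `attend`, `frame_features`, and `decoder_state`, and the toy vectors are all assumptions made for illustration. At each decoding step, every frame is scored against the current decoder state, so the word being generated can draw on any frame regardless of its position in the video.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(frame_features, decoder_state):
    """Soft attention over video frames (hypothetical helper).

    frame_features: list of per-frame feature vectors (lists of floats).
    decoder_state:  current decoder hidden state, same dimension.
    Scoring by dot product is an assumption; the source does not
    specify the scoring function.
    """
    scores = [
        sum(f_d * h_d for f_d, h_d in zip(frame, decoder_state))
        for frame in frame_features
    ]
    weights = softmax(scores)
    dim = len(frame_features[0])
    # The context vector is the attention-weighted sum of frame features.
    context = [
        sum(w * frame[d] for w, frame in zip(weights, frame_features))
        for d in range(dim)
    ]
    return weights, context


# Toy example: three frames with 2-dimensional features.
frames = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
state = [0.0, 2.0]  # this decoder state is most similar to the second frame
weights, context = attend(frames, state)
```

Because the weights are recomputed for every generated word, the decoder is not forced to consume frames in temporal order, which is the property that makes attention a natural fit for the misalignment described above.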