Getting a handle on a Hansard with Python and NLTK, or how to tame the linguistic picture of British politics with NLP

Author:

Gagarin S. N. (1)

Affiliation:

1. MGIMO University

Abstract

The article proposes an optimised starter set of basic Python and NLTK (Natural Language Toolkit) methods that are essential for analysing massive textual corpora in research on linguistic images of the world. The need to specify and detail these applied techniques stems from the nature and scope of the challenges that contemporary cognitive linguistics and lexicology face in the analysis of unstructured big data. Their viability and practical value are demonstrated in a series of illustrative examples in which they are applied to continuous parallel diachronic Hansard corpora that capture the discourse of both chambers of the British Parliament in the years 2006-2023 and jointly amount to over a third of a billion tokens.
The article suggests that the methods it outlines and classifies form an indispensable minimum of IT competences capable of substantially raising both the overall quality of research and its competitive edge. The proposed toolkit includes an essential set of instruments for target vocabulary processing and for the assessment and visualisation of word and phrase frequency and collocation.
The author presumes that, urged by the need to keep abreast of prevailing trends, the contemporary Russian researcher of linguistic images of the world is highly likely to find themselves compelled at some point to embrace the quantitative analysis methods made possible by combining Python and NLTK. Among its many and varied benefits, this combination would arguably help them design and customise research protocols and adapt them with ease and versatility.
Lastly and most importantly, the author suggests that Python and NLTK skills may serve as a comfortable gateway towards eventually upgrading one’s linguistic research to cutting-edge global standards of technological sophistication and marketability.
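
As a rough illustration of the kind of word frequency and collocation assessment the abstract mentions, the sketch below counts tokens and scores adjacent word pairs by pointwise mutual information (PMI). It is not the author's code: it deliberately uses only the Python standard library as a stand-in for the corresponding NLTK classes (`nltk.FreqDist` and `nltk.collocations.BigramCollocationFinder`), and the sample text, the `tokenize` and `bigram_pmi` helpers, and the `min_count` threshold are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for a Hansard extract; not taken from the article's corpora.
SAMPLE = (
    "The Prime Minister answered the question. "
    "The Prime Minister thanked the honourable Member. "
    "The honourable Member asked the Prime Minister again."
)


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters only."""
    return re.findall(r"[a-z]+", text.lower())


def bigram_pmi(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information.

    Pairs seen fewer than `min_count` times are skipped, because PMI
    overrates rare pairs. Pairs are counted across sentence boundaries,
    a simplification of this sketch.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_tokens = len(tokens)
    n_bigrams = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n_tokens
        p_w2 = unigrams[w2] / n_tokens
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


tokens = tokenize(SAMPLE)
frequencies = Counter(tokens)        # word frequency table
collocations = bigram_pmi(tokens)    # PMI score per frequent bigram
top_pairs = sorted(collocations, key=collocations.get, reverse=True)

print(frequencies.most_common(3))
print(top_pairs[:3])
```

On a real corpus the same two steps would typically be preceded by stop-word filtering and lemmatisation, which this sketch omits.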

Publisher

MGIMO University

