Affiliation:
1. The Chinese University of Hong Kong
Abstract
This paper introduces the Lannang Corpus (LanCorp), a public 375,000-word collection of raw and transcribed
recordings of Lannang languages spoken in metropolitan Manila, annotated with part-of-speech tags and linked to 40
types of sociolinguistic metadata. It begins with an overview of LanCorp (e.g. design, formats, accessibility). It then
shows, through several examples, how the corpus can be used for variationist sociolinguistic research, using Lánnang-uè data
as a case study. The findings from the exploratory studies indicate that Lannang languages are influenced by sociolinguistic
factors, demonstrating the intricate nature of the Sino-Philippine sociolinguistic ecology. Due to its large size, sociolinguistic
metadata, and various formats, LanCorp can be used to study Lannang languages in general and how they are used by specific social
groups. It enables scholars to investigate multilingual interactions across a wide range of sociolinguistic factors, furthering the
field of Sino-Philippine (socio)linguistics.
Publisher
John Benjamins Publishing Company
Subject
Linguistics and Language, Language and Linguistics