Applying Text Analytics to the Mind-section Literature of the Tibetan Tradition of the Great Perfection

Published: 2021-04-08 Issue: 2 Volume: 20 Page: 1-32
ISSN: 2375-4699
Container-title: ACM Transactions on Asian and Low-Resource Language Information Processing
Language: en
Short-container-title: ACM Trans. Asian Low-Resour. Lang. Inf. Process.

Author:

Krishna Ravi¹, Mu Norman², Keutzer Kurt¹

Affiliation:

1. Electrical Engineering and Computer Science, University of California at Berkeley, Berkeley, California, United States of America

2. Google Inc., Mountain View, CA, United States of America

Abstract

Over the past decade, a mixture of optical character recognition and manual input has produced a growing corpus of Tibetan literature available as e-texts in Unicode format. With the creation of such a corpus, the techniques of text analytics that have been applied in the analysis of English and other modern languages may now be applied to Tibetan. In this work, we narrow our focus to a modest portion of that literature, the Mind-section portion of the literature of the Tibetan tradition of the Great Perfection. Here, we use text analytics tools based on machine learning techniques to investigate a number of questions of interest to scholars of this and related traditions of the Great Perfection. It has been necessary for us to participate in all portions of this process: corpora identification and text edition selection, rendering the texts as e-texts in Unicode using both optical character recognition and manual entry, data cleaning and transformation, implementation of software for text analysis, and interpretation of results. For this reason, we hope this study can serve as a model for work on other low-resource languages that is just beginning to approach the problem of providing text analytics for those languages.

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3392047

Reference: 70 articles.

1. Jean-Luc Achard. 1997. L'Essence Perlée du Secret. Brepols, Turnhout, Belgium.

2. What Writing Does and How It Does It

3. Stylometric analysis of Chinese Buddhist texts—Do different Chinese translations of the Gaṇḍavyūha reflect stylistic features that are typical for their age;Bingenheimer Marcus;J. Japan. Assoc. Dig. Human.,2017

Cited by 2 articles.

1. Visualization and Interactive Design of Cultural Heritage Information;Communications in Computer and Information Science;2024

2. An Effective Learning Evaluation Method Based on Text Data with Real-time Attribution - A Case Study for Mathematical Class with Students of Junior Middle School in China;ACM Transactions on Asian and Low-Resource Language Information Processing;2023-03-10