Applying Text Analytics to the Mind-section Literature of the Tibetan Tradition of the Great Perfection

Author:

Krishna Ravi (1), Mu Norman (2), Keutzer Kurt (1)

Affiliation:

1. Electrical Engineering and Computer Science, University of California at Berkeley, Berkeley, California, United States of America

2. Google Inc., Mountain View, CA, United States of America

Abstract

Over the past decade, a mixture of optical character recognition and manual input has produced a growing corpus of Tibetan literature available as e-texts in Unicode format. With the creation of such a corpus, the techniques of text analytics that have been applied in the analysis of English and other modern languages may now be applied to Tibetan. In this work, we narrow our focus to a modest portion of that literature, the Mind-section portion of the literature of the Tibetan tradition of the Great Perfection. We use text analytics tools based on machine learning techniques to investigate a number of questions of interest to scholars of this and related traditions of the Great Perfection. It has been necessary for us to participate in all portions of this process: corpora identification and text edition selection, rendering the text as e-texts in Unicode using both optical character recognition and manual entry, data cleaning and transformation, implementation of software for text analysis, and interpretation of results. For this reason, we hope this study can serve as a model for other low-resource languages that are just beginning to approach the problem of providing text analytics for their language.

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

