Affiliation:
1. The University of Tokyo
Abstract
Probabilistic context-free grammars (PCFGs), which are commonly used to generate trees randomly, have been well analyzed theoretically, leading to applications in various domains. Despite their utility, the distributions that the grammar can express are limited to those in which the distribution of a subtree depends only on its root and not on its context. This limitation presents a challenge for modeling various real-world phenomena, such as natural languages. To overcome this limitation, a probabilistic context-sensitive grammar (PCSG) is introduced, where the distribution of a subtree depends on its context. Numerical analysis of a PCSG reveals that the distribution of a symbol does not constitute a qualitative difference from that in the context-free case, but mutual information does. Furthermore, a novel metric introduced to directly quantify the breaking of this limitation detects a distinct difference between PCFGs and PCSGs. This metric, applicable to an arbitrary distribution of a tree, allows for further investigation and characterization of various tree structures that PCFGs cannot express.
Published by the American Physical Society
2024
Funder
Japan Society for the Promotion of Science
Japan Science and Technology Corporation
University of Tokyo
Publisher
American Physical Society (APS)
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献