Efficient Computation of Expectations under Spanning Tree Distributions

Author:

Ran Zmigrod (1), Tim Vieira (2), Ryan Cotterell (3, 4)

Affiliation:

1. University of Cambridge, United Kingdom. rz279@cam.ac.uk

2. Johns Hopkins University, United States. tim.f.vieira@gmail.com

3. University of Cambridge, United Kingdom

4. ETH Zürich, Switzerland. ryan.cotterell@inf.ethz.ch

Abstract

We give a general framework for inference in spanning tree models. We propose unified algorithms for the important cases of first-order expectations and second-order expectations in edge-factored, non-projective spanning-tree models. Our algorithms exploit a fundamental connection between gradients and expectations, which allows us to derive efficient algorithms. These algorithms are easy to implement with or without automatic differentiation software. We motivate the development of our framework with several cautionary tales of previous research, which has developed numerous inefficient algorithms for computing expectations and their gradients. We demonstrate how our framework efficiently computes several quantities with known algorithms, including the expected attachment score, entropy, and generalized expectation criteria. As a bonus, we give algorithms for quantities that are missing in the literature, including the KL divergence. In all cases, our approach matches the efficiency of existing algorithms and, in several cases, reduces the runtime complexity by a factor of the sentence length. We validate the implementation of our framework through runtime experiments. We find our algorithms are up to 15 and 9 times faster than previous algorithms for computing the Shannon entropy and the gradient of the generalized expectation objective, respectively.
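
The connection between gradients and expectations mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes the directed Matrix-Tree Theorem with node 0 as the root and trees in which the root may have several children, so that the normalizer Z is the determinant of a graph Laplacian and each edge marginal equals the edge weight times the derivative of log Z with respect to that weight. The function names (`clean`, `laplacian`, `log_partition`, `edge_marginals`, `entropy`, `brute_force`) are hypothetical, and the gradient is written out analytically with NumPy rather than obtained from automatic differentiation. The entropy serves as an example of a first-order expectation, since log p(t) is additive over edges.

```python
import itertools
import numpy as np


def clean(W):
    """Copy W, zeroing self-loops and edges into the root (node 0)."""
    A = np.array(W, dtype=float)
    np.fill_diagonal(A, 0.0)
    A[:, 0] = 0.0
    return A


def laplacian(A):
    """Laplacian over non-root nodes 1..n (assumption: multi-root trees allowed)."""
    L = -A[1:, 1:]                              # off-diagonal: -A[i, j]
    np.fill_diagonal(L, A[:, 1:].sum(axis=0))   # diagonal: total weight entering j
    return L


def log_partition(W):
    """log Z, where Z sums the edge-weight products of all trees rooted at 0."""
    _, logdet = np.linalg.slogdet(laplacian(clean(W)))
    return logdet


def edge_marginals(W):
    """mu[i, j] = W[i, j] * d log Z / d W[i, j], using d log det(L) / dL = inv(L).T."""
    A = clean(W)
    B = np.linalg.inv(laplacian(A))
    d = np.diag(B)
    grad = np.zeros_like(A)
    grad[0, 1:] = d                    # a root edge only touches L[j, j]
    grad[1:, 1:] = d[None, :] - B.T    # edge i -> j touches L[j, j] and L[i, j]
    return A * grad


def entropy(W):
    """Shannon entropy as a first-order expectation: log Z - sum mu * log W."""
    A = clean(W)
    mu = edge_marginals(W)
    mask = A > 0
    return log_partition(W) - np.sum(mu[mask] * np.log(A[mask]))


def reaches_root(j, heads):
    """Follow head pointers from j; False if a cycle is hit before node 0."""
    seen = set()
    while j != 0:
        if j in seen:
            return False
        seen.add(j)
        j = heads[j - 1]
    return True


def brute_force(W):
    """Exponential-time enumeration of all trees, for checking small inputs only."""
    A = clean(W)
    n = A.shape[0] - 1
    trees = []
    for heads in itertools.product(range(n + 1), repeat=n):  # heads[j-1] = head of j
        if all(reaches_root(j, heads) for j in range(1, n + 1)):
            weight = np.prod([A[h, j] for j, h in enumerate(heads, start=1)])
            trees.append((heads, weight))
    Z = sum(w for _, w in trees)
    mu, H = np.zeros_like(A), 0.0
    for heads, w in trees:
        p = w / Z
        H -= p * np.log(p)
        for j, h in enumerate(heads, start=1):
            mu[h, j] += p
    return Z, mu, H


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.uniform(0.5, 2.0, size=(4, 4))   # root plus 3 words, arbitrary weights
    print(edge_marginals(W))
    print(entropy(W), brute_force(W)[2])
```

In this sketch the marginals and entropy cost one matrix inverse and one log-determinant, whereas the brute-force check enumerates every tree and is only feasible for a handful of nodes.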

Publisher

MIT Press - Journals

Subject

Artificial Intelligence, Computer Science Applications, Linguistics and Language, Human-Computer Interaction, Communication

