Optimizations for the EcoPod field identification tool

Published:2008-03-17 Issue:1 Volume:9 Page:
ISSN:1471-2105
Container-title:BMC Bioinformatics
language:en
Short-container-title:BMC Bioinformatics

Author:

Manoharan Aswath, Stamberger Jeannie, Yu YuanYuan, Paepcke Andreas

Abstract

Background: We sketch our species identification tool for palm-sized computers that helps knowledgeable observers with census activities. An algorithm turns an identification matrix into a minimal-length series of questions that guide the operator towards identification. Historic observation data from the census geographic area helps minimize question volume. We explore how much historic data is required to boost performance, and whether the use of history negatively impacts identification of rare species. We also explore how characteristics of the matrix interact with the algorithm, and how best to predict the probability of observing a previously unseen species.

Results: Point counts of birds taken at Stanford University's Jasper Ridge Biological Preserve between 2000 and 2005 were used to examine the algorithm. A computer identified species by correctly answering the algorithm's questions and counting them. We also explored how the character density of the key matrix and the theoretical minimum number of questions for each bird in the matrix influenced the algorithm. Our investigation of the required probability smoothing determined whether Laplace smoothing of observation probabilities was sufficient, or whether the more complex Good-Turing technique was required.

Conclusion: Historic data improved identification speed, but only for the 25% most frequently observed birds. For rare birds the history-based algorithms did not impose a noticeable penalty in the number of questions required for identification. For our dataset neither the age of the historic data nor the number of observation years impacted the algorithm. Density of characters for different taxa in the identification matrix did not impact the algorithms. Intrinsic differences in identifying different birds did affect the algorithm, but the differences affected the baseline method of not using historic data to exactly the same degree. We found that Laplace smoothing performed better for rare species than Simple Good-Turing, and that, contrary to expectation, the technique did not then adversely affect identification performance for frequently observed birds.
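To make the smoothing step mentioned in the abstract concrete, here is a minimal sketch of Laplace (add-one) smoothing applied to historic observation counts. It is the textbook formulation, not necessarily the exact one used in the paper; the function name `laplace_probabilities`, the `alpha` parameter, and the species counts are hypothetical and chosen only for illustration.

```python
def laplace_probabilities(counts, species, alpha=1.0):
    """Return add-alpha smoothed observation probabilities.

    counts  -- dict mapping species name to historic observation count
    species -- every species in the identification matrix, including
               those never observed in the historic data
    alpha   -- pseudo-count added to each species (1.0 = Laplace)
    """
    total = sum(counts.get(s, 0) for s in species)
    denominator = total + alpha * len(species)
    return {s: (counts.get(s, 0) + alpha) / denominator for s in species}


# Hypothetical counts: "C" is in the matrix but was never observed.
historic_counts = {"A": 8, "B": 2}
matrix_species = ["A", "B", "C"]

probabilities = laplace_probabilities(historic_counts, matrix_species)
for name in matrix_species:
    print(name, round(probabilities[name], 4))
```

Because every species receives the pseudo-count, a previously unseen species keeps a small non-zero probability, so a history-weighted question ordering can still reach it.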

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Structural Biology

Link

https://link.springer.com/content/pdf/10.1186/1471-2105-9-150.pdf

References (39 articles)

1. Olson DM, Dinerstein E, Powell GVN, Wikramanayake ED: Conservation Biology for the Biodiversity Crisis. Conservation Biology 2002, 16(1):1–3. 10.1046/j.1523-1739.2002.01612.x

2. Luck GW, Ricketts TH, Daily GC, Imhoff M: Alleviating spatial conflict between people and biodiversity. Proceedings of the National Academy of Sciences 2004, 101(1):182–186. 10.1073/pnas.2237148100

3. Hughes JB, Daily GC, Ehrlich PR: Population Diversity: Its Extent and Extinction. Science 1997, 278: 689. 10.1126/science.278.5338.689

4. Swengel AB: Population fluctuations of the Monarch (Danaus plexippus) in the 4th of July Butterfly Count 1977–1994. American Midland Naturalist 1995, 134: 205–214. 10.2307/2426291

5. McLaughlin JF, Hellmann JJ, Boggs CL, Ehrlich PR: Climate change hastens population extinctions. Proceedings of the National Academy of Sciences of the United States of America 2002, 99: 6070–6074. 10.1073/pnas.052131199