Bioscience Horizons Advance Access originally published online on February 18, 2009
Bioscience Horizons 2009 2(1):90-96; doi:10.1093/biohorizons/hzp007
Insights into the development of online plant identification keys based on literature review: an exemplar electronic key to Australian Drosera
University of Reading, Reading RG6 6AA, UK
* Corresponding author: University of Reading, Reading RG6 6AA, UK. Tel: +44 01183787189. Email: redrinkwater{at}googlemail.com
Supervisor: Dr Alastair Culham, University of Reading, Reading RG6 6AA, UK.
Abstract
Keys are traditionally created from data gathered from real observations, which can be a complex and time-consuming process. With a large number of texts containing written descriptions of species, data needed for the creation of a key are already available. A number of published works on Drosera were chosen and the data in these were used to create an interactive, web-based key. The types of data used were evaluated for ease of use and accuracy and the effectiveness of purely literature-based data was tested. The resulting key highlighted a number of problems of using characters and states which have not been observed, as well as showing that literature-based research can lead to a workable key.
Key words: interactive keys, Drosera, web-based key, multi-access
We instinctively wish to identify and classify the organisms that form part of the world around us. At the most basic level, we need to be able to identify which organisms we can eat, which creatures are predators and what makes a good fuel; things essential to survival.1
In traditional cultures, plants and their uses are learnt, rather than recorded, but this has been lost from most modern cultures. Through ethnobotanical studies of different tribes, we can see that their use of the environment around them, and their knowledge of it, is much greater than that of the average person in a western culture. This knowledge is divided throughout the tribe, with different members knowing the names of what they use, e.g. women will know food and medicinal plants, whereas the shaman will know magic plants.2
One of the easiest ways of identifying plants, when the total choice of species is small (<100), is direct comparison with images or specimens. Little prior knowledge is required, and identification normally involves matching an image from a book, e.g. Wild Flowers of Britain,3 to the specimen in question. Although easy to do when choices are limited, for instance by geographic region, this technique can be time-consuming and can result in misidentification, as the images provided in a book may not show the details required for a thorough identification. The books used tend not to include all species present and often show only the most common forms, sometimes making this method of identification somewhat like a lucky dip. Identification keys usually give a more accurate identification, because they can include highly specific descriptive terms that will separate superficially similar species consistently. That is why identification via keys is the usual scientific approach. These keys have two main forms: dichotomous and multi-access. Dichotomous keys are the traditional form of key and can be found in many field guides and floras, e.g. New Flora of the British Isles.4
There are two different styles of dichotomous key: bracketed and indented. These differ only in the layout of option choices (couplets), not in how the couplets are constructed. Couplets generally offer two options, with the best keys having each option describe half the overall number of taxa. In general, the couplets have two or more correlating characters, helping to guard against character inconstancy, difficulties of observation or the absence of a character. The user is then led to another couplet, which again aims to divide the taxa in half. Taxa are keyed out as the key progresses, with the couplets ultimately leading to a name, rather than to the next couplet. An error in this process, or variability in the appearance of the specimen, can result in a misidentification; however, the author can most readily allow for this by placing the taxon name within the key several times, each placement accounting for a different form of the species.5 This must be limited, as it increases the size and complexity of the key and therefore the average number of couplets needed to reach an answer, consequently increasing the chances of making an error.
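The way a dichotomous key leads from couplet to couplet can be sketched in Python. This is a minimal illustration only: the `Couplet` class, the `key_out` function, the leads and the four taxon names are all hypothetical and are not taken from any published key.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass
class Couplet:
    """One couplet: a lead statement and where each choice goes next."""
    lead: str
    if_true: Union["Couplet", str]   # next couplet, or a taxon name
    if_false: Union["Couplet", str]


def key_out(start: Couplet, answers: Dict[str, bool]) -> Tuple[str, int]:
    """Follow couplets until a taxon name is reached; return it with the step count."""
    node: Union[Couplet, str] = start
    steps = 0
    while isinstance(node, Couplet):
        node = node.if_true if answers[node.lead] else node.if_false
        steps += 1
    return node, steps


# Hypothetical four-taxon key: each couplet splits the remaining taxa in half.
KEY = Couplet(
    "Leaves in a basal rosette",
    Couplet("Petals white", "Taxon A", "Taxon B"),
    Couplet("Stem climbing", "Taxon C", "Taxon D"),
)

result = key_out(KEY, {"Leaves in a basal rosette": False, "Stem climbing": True})
```

Because each couplet halves the candidates, four taxa are separated in two steps. A single wrong answer sends the user down the wrong branch, which is why an author may place a variable taxon at more than one endpoint.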
Multi-access keys are much more flexible because the user chooses the order of questions. Options are not presented as couplets but as a question with several possible answers. They can be in the form of a written key, e.g. the Epilobium key in Stace,6 but they are most commonly used in computer-based interactive keys, which are available online, e.g. Tools for Plant Identification,7 and on CD. Such keys often include algorithms that optimize the order of questions presented and re-optimize this order each time a question is answered. As each question is answered, the number of taxa remaining is reduced until only one, or a few, remain. Just as with dichotomous keys, a wrong character choice can lead to misidentification, and it is often harder to spot within a multi-access key, as question choice is user dependent and errors will not be flagged up, as may happen in a dichotomous key. The strengths of this approach are that users can choose to answer only those questions they are most able to answer reliably, can often backtrack or change single answers and know how they have progressed in reducing the number of possible taxa.
With both dichotomous and multi-access keys, it is always advisable to use descriptions and images of the taxa to ensure that the identification made is correct. Here picture books can be of great use, as can images provided with the key. One such example is The Wild Flower Key.8
Interactive keys can be either dichotomous or multi-access, with the latter being more common. They use computer programs in which the user enters the attributes of the specimen. The program eliminates taxa whose attributes do not match those entered, until only one taxon remains.5 Interactive dichotomous keys simply provide an alternative way of offering sequential questions to print and paper but save the reader having to find the next question.
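The elimination step described above can be sketched in a few lines of Python. This is a minimal illustration, not Intkey's implementation: the data matrix, the taxon names and the function `eliminate` are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical data matrix: taxon -> character -> set of states coded for it.
Matrix = Dict[str, Dict[str, Set[str]]]

MATRIX: Matrix = {
    "Taxon A": {"habit": {"rosette"}, "petal colour": {"white", "pink"}},
    "Taxon B": {"habit": {"rosette"}, "petal colour": {"orange"}},
    "Taxon C": {"habit": {"climbing"}, "petal colour": {"white"}},
    "Taxon D": {"habit": {"erect"}},  # petal colour not coded
}


def eliminate(matrix: Matrix, observed: Dict[str, str]) -> Set[str]:
    """Return taxa compatible with every observed character state.

    A taxon with no coding for a character is kept, since missing data
    cannot contradict the observation.
    """
    remaining = set(matrix)
    for character, state in observed.items():
        remaining = {
            taxon for taxon in remaining
            if character not in matrix[taxon] or state in matrix[taxon][character]
        }
    return remaining


# The order in which characters are entered does not matter.
after_one = eliminate(MATRIX, {"petal colour": "white"})
after_two = eliminate(MATRIX, {"petal colour": "white", "habit": "rosette"})
```

Here one answer leaves three candidate taxa and a second answer leaves one. Keeping taxa with uncoded characters is a design choice in this sketch; it avoids rejecting a taxon merely because its description is incomplete.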
The main problem with interactive multi-access keys is that the user-driven question choice, based on characters that can easily be observed, may not make a significant contribution to the identification in relation to the number of questions answered.9
Interactive multi-access keys have several advantages over the traditional paper-based key:5
- Unrestricted character use: any characters can be used in any order. Characters that are absent from the specimen, or questions that the user finds difficult to interpret, can be avoided.
- Character choice deletion and changing: characters that have been chosen by the user can be deleted or changed during the identification process if they appear to be causing problems with the identification.
- Error tolerance: a correct identification can be made even with some errors.
- Locating errors: the program should be capable of identifying user and/or data entry errors during the identification process, such as a measurement entered that is outside all possible values given.
- Numeric characters: these can be used without dividing them up into ranges.
- Easy updating: the key is easily maintained and updated by making changes to the data matrix behind the key.
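Three of these advantages (error tolerance, locating errors and numeric characters) can be illustrated with a short Python sketch. It is written under stated assumptions and does not reproduce how any particular program works: the measurement ranges, the taxon names and the functions `out_of_range` and `tolerant_match` are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical numeric coding: taxon -> character -> (minimum, maximum) in mm.
RANGES: Dict[str, Dict[str, Tuple[float, float]]] = {
    "Taxon A": {"petal length": (4.0, 6.0), "rosette diameter": (10.0, 20.0)},
    "Taxon B": {"petal length": (5.0, 9.0), "rosette diameter": (25.0, 40.0)},
    "Taxon C": {"petal length": (12.0, 15.0), "rosette diameter": (30.0, 60.0)},
}


def out_of_range(ranges, character: str, value: float) -> bool:
    """True if the value lies outside every coded range, a likely entry error."""
    spans = [chars[character] for chars in ranges.values() if character in chars]
    return not any(low <= value <= high for low, high in spans)


def tolerant_match(ranges, observed: Dict[str, float], tolerance: int = 0):
    """Keep taxa that disagree with the observations at most `tolerance` times."""
    kept = {}
    for taxon, chars in ranges.items():
        mismatches = sum(
            1 for character, value in observed.items()
            if character in chars
            and not chars[character][0] <= value <= chars[character][1]
        )
        if mismatches <= tolerance:
            kept[taxon] = mismatches
    return kept


observed = {"petal length": 5.5, "rosette diameter": 22.0}
strict = tolerant_match(RANGES, observed, tolerance=0)   # no taxon survives
lenient = tolerant_match(RANGES, observed, tolerance=1)  # Taxon A and Taxon B survive
flagged = out_of_range(RANGES, "rosette diameter", 22.0)
```

With no tolerance, the one measurement that fits no coded range removes every taxon. Allowing a single mismatch keeps the two closest candidates, and the out-of-range check points to the measurement that caused the problem.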
There has been extensive debate in the literature as to whether dichotomous or multi-access keys are the better format to use. In a comparison of dichotomous, hypertext and multi-access keys,9 it was found that students made a higher number of correct identifications using a multi-access key and found it easier to use than the paper-based key. The ready access to computers in the lab and in the field (e.g. netbooks) now makes the use of interactive keys practical in situations where previously a paper-based key has been the only option.
There are several different programs for the construction of interactive keys; seven are shown in Table 1. A more comprehensive list can be found online on the DELTA-Intkey website10 and in a recent paper by Evans Walter and Winterton.11
The program suite chosen for this project was the DELTA editor12, 13 and Intkey.12, 14–16 The primary reasons for this were its widespread use and the fact that it is free for non-commercial purposes. This is an advantage as it allows the key to be expanded by others at a later date. Nine key construction programs based on the DELTA format are listed on the DELTA-Intkey website.17 The format was designed with the hope of eliminating many of the problems found with other systems, where character coding is restricted by the requirements of particular programs. As a result of these restrictions, the types of data that can be represented and the number of other programs that can use these data are limited. DELTA was also designed to allow for easy use by people, rather than for convenience in computer programming.5 This has also meant that the DELTA format has been adopted by Biodiversity Information Standards (TDWG) as a standard for data exchange.
Drosera (Droseraceae) is a large genus with around 200 species. They can be found in almost every country of the world, with the majority being found in Australia (approximately 140 species). They grow in areas with wet, low-nutrient soil, although some species have evolved to survive hot, dry summers and grow only during the wetter, cooler winters.18 They are a very diverse group, ranging from plants only a few millimetres across to plants the size of a small shrub, and they display a large range of leaf morphologies. Their flowers also vary greatly in both colour and size; in some species the flower is as big as, if not bigger than, the plant itself.
Correct identification of all Drosera species is important for conservation as many of their habitats are under threat from development and environmental change. In Australia and South Africa, the expansion of cities, e.g. Perth and Cape Town, is beginning to threaten habitats, as is the draining of land for agriculture and forestry. Pollution is modifying habitats through increasing the nutrient content of the soils, making it possible for new species to come into these areas and out-compete the Drosera species. Droughts, especially in Australia, are also causing concern, as they are threatening to dry up the wet areas in which Drosera are found. There is also the added threat of collection pressure, especially of the more unusual, endemic species by carnivorous plant enthusiasts.
There is currently no interactive key available online to Australian species of Drosera, although there are several Australian-based projects working towards online identification of the flora,19 including Flora of Australia Online,20 which will provide a key to the genera.
A comprehensive descriptive enumeration of species of Australian Drosera has been published by Lowrie in a series of three books on Australian carnivorous plants.21–23 This has since been supplemented by a series of papers outlining new species and changes in the taxonomy of those species.24–29
The aim here is to create a user-friendly online interactive multi-access key to the Drosera of Australia, based on the published work. The types of characters will be evaluated and ways in which to communicate these to the users will be examined. Finally, the project will help to evaluate how effective literature-based review is for constructing character lists.
The key was tested using live specimens, from a collection held at Reading University. People with different levels of botanical knowledge were asked to identify a selection of specimens and were asked to comment on how they found the key to use. Five specimens were examined by a novice and two were examined by an experienced botanist.
The novice was asked to use the key in the two different modes within Intkey: Best order and Natural order. The Best order function uses an algorithm to determine which questions will give the biggest split of the remaining taxa, and so should increase the efficiency of the key if all of the top recommended questions can be answered. The Natural order function arranges the characters in the order in which they were created in the DELTA file; this mode does not eliminate characters which are not applicable to the remaining taxa.
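The idea behind ranking questions by how well they split the remaining taxa can be sketched as follows. This is not Intkey's algorithm, which is not described here; it is a simple worst-case heuristic chosen for illustration, and the matrix, taxon names and functions `worst_case` and `best_order` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set

# Hypothetical matrix with one state per taxon per character, for simplicity.
MATRIX: Dict[str, Dict[str, str]] = {
    "Taxon A": {"habit": "rosette", "petal colour": "white", "stipules": "present"},
    "Taxon B": {"habit": "rosette", "petal colour": "pink", "stipules": "present"},
    "Taxon C": {"habit": "climbing", "petal colour": "orange", "stipules": "present"},
    "Taxon D": {"habit": "erect", "petal colour": "white", "stipules": "absent"},
}


def worst_case(matrix, taxa: Set[str], character: str) -> int:
    """Largest number of taxa that could remain after answering `character`."""
    counts = Counter(matrix[taxon][character] for taxon in taxa)
    return max(counts.values())


def best_order(matrix, taxa: Set[str], characters: List[str]) -> List[str]:
    """Rank characters so that those leaving the fewest taxa (worst case) come first.

    Python's sort is stable, so ties keep their natural (input) order.
    """
    return sorted(characters, key=lambda c: worst_case(matrix, taxa, c))


natural = ["stipules", "habit", "petal colour"]      # order of creation
first = best_order(MATRIX, set(MATRIX), natural)     # ranking with all taxa present
# After the user answers habit = rosette, only two taxa remain; re-rank.
second = best_order(MATRIX, {"Taxon A", "Taxon B"}, natural)
```

In this toy matrix the stipules character is asked last at the start because three of the four taxa share one state. Once only Taxon A and Taxon B remain, petal colour becomes the only character that separates them, so it moves to the top of the list.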
Notes were made on the number of questions answered to reach identification, whether a correct identification was made, which questions were answered, whether there was a difference between Best and Natural order and whether the guidance given by notes and images was of use.
Further testing was carried out based on the descriptions in Lowrie's books, to ensure that correct data had been entered. This was carried out at multiple stages throughout the construction of the key.
As a result of this testing process, faults found in the key were corrected and changes recommended by users were made.
Results for the novice user were as follows. The first plant examined was Drosera dichrosepala; this was confirmed by the collection owner, who also noted that it was a larger form. This caused a few problems with reaching a correct identification, as the petals and basal rosette were larger than the values entered in the key. These two features were left out of the identification characters.
Identification was successfully made in eight questions using the Best order function (Table 2) and in 10 questions using the Natural order function (Table 3).
The second plant examined was labelled as Drosera parvula, although this was incorrect and the plant was correctly identified by the collection owner as Drosera auriculata. This plant was in flower; however, the flowers were not open.
Identification was successfully made based on three questions using the Best order function (Table 4) and on four questions using the Natural order function (Table 5).
The third plant tested was Drosera aliciae, a non-Australian species. The researcher was unable to identify this successfully using the key, thus demonstrating that the key avoids false identification of species not included in it.
The fourth plant tested was labelled as Drosera macrantha subsp. eremaea; this was confirmed by the collection owner. This specimen was not in flower, but the remains of a flower were present.
Identification was made based on six questions using the Best order function (Table 6) and on seven questions using the Natural order function (Table 7).
The final plant examined was labelled as Drosera paleacea and this was confirmed by the collection owner. The plant was in flower. It keyed out as D. paleacea subsp. paleacea using the key.
The identification was made in five questions using the Best order function (Table 8) and in six questions using the Natural order function (Table 9).
Results for the experienced botanist using the Best order function were as follows. The first species identified was Drosera adelae, which was successfully identified in three steps. However, the basal lamina shape described by Lowrie in the books disagreed with the tester's observation, and so the species was lost from the list of remaining taxa when this character was used.
The second species identified was Drosera binata. A large number of problems were found during this test. Owing to a coding error, the species was recorded as lacking a basal rosette when it has one. It was recorded from the books as having no stipules, although these are present, and the lamina lengths given in the books seem to be too small.
The finished key covered 134 species, as many as could be found within the scope of the project, and used 113 characters.
The creation of a satisfactory character list and the relevant states for these was a complex task, with the data at first being taken directly from the literature and entered for each species. Once this had been done, several characters had an unwieldy number of states, and so the number of states had to be reduced. Despite this, there were still several characters where it was considered that the states were too complex. In most cases, these issues could have been resolved by splitting the character into two or three separate characters. For example, petal colour has three main areas of colouration: the base, the main body of the petal and the tip. In the key, these were kept as a single character, leading to 20 different states, with several colours being repeated with different variations: white, white blushed pink, white with pink veins, white with red spot near base. If this character had been separated out, it would have created a more user-friendly set of characters and states, as well as allowing for more accurate coding of the character.
One area that caused a large number of problems when coding the key, and which also proved problematic in the testing, was the variety of gland types and their different arrangements on petioles, sepals and other areas of the plant. There was a lot of debate as to how many could be distinguished from each other, and whether users, especially novices, would be able to make a clear enough distinction between them. The number of different descriptive gland types included in the key could be a consequence of inconsistent use of botanical terms in published work or it could be a true reflection of the gland variation within Drosera. Writing accurate and clear descriptions of the glands was also complicated by their being poorly represented in the drawings that accompanied the descriptions and by there being little information on more complex gland descriptions in the botanical dictionaries and glossaries used.30, 31 It was also noted that users, especially the novice, avoided these more complex questions in favour of easier and clearer characters, perhaps suggesting that it is unnecessary for these characters to be included in the key.
The characters selected for inclusion in this key were based on the descriptions for each species, as written by Lowrie. As a result, it was assumed that these particular characters were considered good, i.e. not subject to wide variation32 (as they are the basis for the species description). It was also assumed that the details given by Lowrie, e.g. measurements, were accurate; however, to allow for some level of variation, ±10% was included for each, especially as many gave only a single value. For characters which used colours, descriptions such as yellowish-green were coded as yellow and green, and descriptions such as dark red were coded as red. Both were to allow for people's different perceptions of colour and to remove the need for a reference set of colours for the user to compare with, e.g. the RHS colour chart.33 With some species (e.g. Drosera pulchella) this became quite complex, as there is a large level of flower colour variation within a species, and each variant had to be coded (Fig. 1).
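The two coding rules described above (widening measurements by 10% and reducing colour descriptions to basic colours) can be sketched in Python. This is an illustration under assumptions, not the procedure actually used to build the key: the lookup tables, the function names `code_colour` and `widen`, and the choice to apply the 10% to each end of the published range are all hypothetical.

```python
from typing import Set, Tuple

# Hypothetical lookup tables; the key's real colour vocabulary is not given here.
MODIFIERS = {"dark", "light", "pale", "deep"}
STEMS = {"yellowish": "yellow", "greenish": "green", "reddish": "red", "pinkish": "pink"}


def code_colour(description: str) -> Set[str]:
    """Reduce a colour description to its basic colour states."""
    words = description.lower().replace("-", " ").split()
    return {STEMS.get(word, word) for word in words if word not in MODIFIERS}


def widen(low: float, high: float, tolerance: float = 0.10) -> Tuple[float, float]:
    """Extend a measurement range by the tolerance at both ends.

    A single published value is passed as low == high.
    """
    return low - low * tolerance, high + high * tolerance


compound = code_colour("yellowish-green")  # two states: yellow and green
shade = code_colour("dark red")            # one state: red
single_value = widen(8.0, 8.0)             # a lone 8 mm value becomes a range
```

Coding a compound colour as both of its components means a user who sees the flower as yellow and a user who sees it as green will both retain the taxon, at the cost of making the character slightly less discriminating.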
Some of the software was quite difficult to use, even with the user guide. As a result of this, some aspects of the final key, e.g. the images, took longer than necessary to add due to confusion about how to add them to the key and issues when the data had to be re-exported (due to data updates), once images had been attached.
While working to create a key only from literature was in general effective, its use as the only source of primary data (and that of only one person) limited the accuracy and effectiveness of the key. The books on Australian Drosera show a noticeable change in descriptive style between the first volume and the following two. All three volumes lack some detail, with different levels of information being provided for individual species. These books are aimed at a predominantly amateur market and therefore avoid the rather dry but standardized style of formal botanical works. In the papers that have been published since the release of the third volume (1998), there has again been a change in the quality of the descriptions, with more consistency between the descriptions. The papers have also included some features which were absent from the majority of those in the books, e.g. seeds, and so these are absent from the key.
The testing carried out using the key showed that it has the potential to be a useful resource for identifying Australian Drosera, including very morphologically similar species. The key should be used to guide the user to descriptions, specimens and images available for the taxa, to confirm the identification made using the key. This is thought to be best practice, as it can provide a lot of extra information to the user.
This book-based exercise demonstrates that a workable key to a large genus can be written using data gleaned from published literature. The interactive, multi-access key requires very few questions to be answered before a successful identification is made, compared with a dichotomous key. Allowing the user to choose which questions to answer, while guided by the identification software, allows both experienced and inexperienced users to succeed in identifying difficult species. To ensure such keys are fully inclusive of natural variation, the initial literature-based review should ultimately be tested against, and enhanced by, reference to living and preserved specimens. The key to Australian Drosera is available via the University of Reading (http://www.reading.ac.uk/biologicalsciences/about/staff/a-culham.asp).
References
- Stace CA. Plant Taxonomy and Biosystematics (1989) 2nd ed. Cambridge: Cambridge University Press.
- Prance G. The ethnobotany of the Guaraní: the Yaboti Biosphere Reserve, Misiones, Argentina. (2008) Lecture.
- Phillips R. Wild Flowers of Britain (1977) London: Pan Books Ltd.
- Stace CA. New Flora of the British Isles (1991) Cambridge: Cambridge University Press.
- Dallwitz MJ, Paine TA, Zurcher EJ. Interactive identification using the internet. (2007) DELTA—DEscription Language for TAxonomy: http://delta-intkey.com/www/netid.htm (retrieved 12 January 2008).
- Stace CA. Epilobium. In: New Flora of the British Isles—CA Stace, ed. (1991) Cambridge: Cambridge University Press. 524–525.
- Barkworth M. Tools for plant identification. http://utc.usu.edu/keys/ (retrieved January 2009).
- Rose F, O'Reilly C. The Wild Flower Key (Revised Edition)—How to Identify Wild Plants, Trees and Shrubs in Britain and Ireland (2006) London: Warne & Co.
- Morse DR, Tardivel GM, Spicer JA. Comparison of the effectiveness of a dichotomous key and a multi-access key to woodlice. (1996) http://www.cs.kent.ac.uk/pubs/1996/44/ (retrieved 27 February 2008).
- Dallwitz MJ. Programs for interactive identification and retrieval (updated 20 May 2008). (2008) http://delta-intkey.com/www/idprogs.htm (retrieved April 2008).
- Evans Walter D, Winterton S. Keys and the crisis in taxonomy: extinction or reinvention. Annu Rev Entomol (2007) 52:193–208.
- Dallwitz MJ. A general system for coding taxonomic descriptions. Taxon (1980) 29:41–46.
- Dallwitz MJ, Paine TA, Zurcher EJ. User's Guide to the DELTA Editor (1999) http://delta-intkey.com (retrieved October 2007).
- Dallwitz MJ, Paine TA, Zurcher EJ. User's Guide to the DELTA System: a General System for Processing Taxonomic Descriptions (1993) 4th ed. http://delta-intkey.com (retrieved October 2007).
- Dallwitz MJ, Paine TA, Zurcher EJ. User's Guide to Intkey: A Program for Interactive Identification and Information Retrieval (1995) http://delta-intkey.com (retrieved October 2007).
- Dallwitz MJ, Paine TA, Zurcher EJ. Principles of interactive keys. (2000) http://delta-intkey.com (retrieved October 2007).
- Dallwitz MJ. DELTA—DEscription Language for TAxonomy. (2007) http://delta-intkey.com/ (retrieved January 2009).
- D'Amato P. The Savage Garden (1998) Berkeley, CA: Ten Speed Press.
- Florabase: the Western Australia flora. http://florabase.dec.wa.gov.au/ (retrieved April 2008).
- Flora of Australia Online. Australian Government: Department of the Environment, Water, Heritage and the Arts: http://www.environment.gov.au/biodiversity/abrs/online-resources/flora/main/index.html (retrieved April 2008).
- Lowrie A. Carnivorous Plants of Australia. (1987) 1. Nedlands, Western Australia: University of Australia Press.
- Lowrie A. Carnivorous Plants of Australia. (1989) 2. Nedlands, Western Australia: University of Australia Press.
- Lowrie A. Carnivorous Plants of Australia. (1998) 3. Nedlands, Western Australia: University of Australia Press.
- Lowrie A. A taxonomic revision of Drosera section Stolonifera (Droseraceae), from south-west Western Australia. Nuytsia (2005) 15:355–393.
- Lowrie A. A taxonomic review of the yellow-flowered tuberous species of Drosera (Droseraceae) from south-west Western Australia. Nuytsia (1999) 13:73–87.
- Lowrie A, Conran JG. A revision of the Drosera omissa/D. nitidula complex (Droseraceae) from south-west Western Australia. Taxon (2007) 56:533–544.
- Lowrie A. New species in Drosera section Lasiocephala (Droseraceae) from tropical northern Australia. Nuytsia (1996) 11:55–69.
- Lowrie A. Drosera pedicellaris (Droseraceae), a new species from south-west Western Australia. Nuytsia (2002) 15:59–62.
- Mann P. Drosera gibsonii (Droseraceae), a new Pygmy Drosera from south-west Western Australia. Nuytsia (2007) 16:321–323.
- Allaby M. A Dictionary of Plant Sciences (2006) Oxford: Oxford University Press.
- Harris JG, Woolf Harris M. Plant Identification Terminology: an Illustrated Glossary (1994) Springlake, Utah: Spring Lake Publishing.
- Heywood VH, Moore DM. Current Concepts in Plant Taxonomy (Systematics Association Special Volume). (1984) London: Published for the Systematics Association by Academic Press.
- RHS colour chart. http://www.rhs.org.uk/Learning/Publications/pubs_library_colourchart.htm (retrieved September 2008).
