The WordCorr Project



Background

Summary: Tracing systematic similarities among languages leads to an understanding of their history. Unfortunately, it requires painstaking tabulations that are massive, inherently complex, and error-prone in the way they are traditionally carried out.

The WordCorr project, with support from the National Science Foundation of the U.S. government, extends the capabilities of comparative linguists by matching their expertise in making pattern recognition judgments with a computational tool that organizes all their data in response to those judgments. This division of labor allows linguists to stay on top of greater quantities of data from more speech varieties than has been possible before, because it provides an information technology infrastructure for their discipline.

The computer does not do the analysis itself. Instead, it organizes the data according to each linguist's informed decisions in a way that rapidly reveals contrasts and complementarities among correspondence sets, a key step in arriving at a systematic analysis. WordCorr also highlights and organizes information about the anomalies that invariably have to be accounted for. It allows different levels of detail to be worked on, simultaneously if desired. The project involves constructing a specialized tool for linguists by

Why WordCorr? In the tabulation phase of comparative linguistics, the ratio of bookkeeping time to thinking time has long stood at more than 200 to 1. Word processors and spreadsheets have brought the ratio down a little, but not much. The WordCorr design allows a linguist to concentrate on the data, because all the bookkeeping is handled in seconds, without error and without diverting attention from the main thought process.
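To give a sense of the bookkeeping involved, here is a minimal Python sketch of tabulating correspondence sets from forms a linguist has already aligned. It is an illustration only, not WordCorr's actual code or data model: the function name `tabulate`, the variety labels A, B, and C, and the toy word forms are all invented for this example and come from no real language.

```python
from collections import defaultdict

# Invented toy data (an assumption for illustration): for each gloss, the
# linguist has aligned the segments of each variety column by column.
# "-" marks a gap in the alignment.
aligned = {
    "stone": {
        "A": ["b", "a", "t", "u"],
        "B": ["v", "a", "t", "u"],
        "C": ["b", "a", "t", "-"],
    },
    "moon": {
        "A": ["b", "u", "l", "a", "n"],
        "B": ["v", "u", "l", "a", "n"],
        "C": ["b", "u", "l", "a", "n"],
    },
}


def tabulate(aligned_forms, varieties):
    """Group aligned columns into correspondence sets.

    Returns a dict that maps each tuple of segments (one per variety, in the
    order given by `varieties`) to the list of (gloss, position) pairs that
    attest that correspondence.
    """
    table = defaultdict(list)
    for gloss, forms in aligned_forms.items():
        # zip(*) walks the alignment one column at a time.
        columns = zip(*(forms[v] for v in varieties))
        for position, column in enumerate(columns):
            table[column].append((gloss, position))
    return dict(table)


if __name__ == "__main__":
    for segments, attestations in tabulate(aligned, ["A", "B", "C"]).items():
        print(" : ".join(segments), "<-", attestations)
```

In this toy data the correspondence b : v : b is attested at the start of both words, which is the kind of recurring pattern a linguist looks for. The alignment judgments stay with the linguist; the code only keeps the tally.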

This helps transform comparative linguistics from drudgery illuminated now and then by a flash of insight into a manageable endeavor that should attract more people. WordCorr also assists in organizing the presentation of evidence for hypotheses about how language families have developed. It allows a large number of speech varieties to be compared at once.

Because of the necessity for close attention to minutiae, truly collegial research by teams of scholars is hard to achieve. Usually one scholar keeps in his or her head most of the options and uncertainties of the developing analysis, and assistants may be limited to compiling data and checking out specific lines of thought the lead scholar wants traced.

WordCorr, however, is designed so that each member of a team may follow through several alternative analyses, show them to colleagues, and discuss them freely until a consensus emerges, because all the alternatives are accessible through the database.

Broader Impact: The team research aspect allows classes in the comparative method to be conducted using exactly the same software the students will eventually use for their own fieldwork. By making educationally useful data sets available to the public at large via the Internet, WordCorr may attract people into linguistics. It may also help informed citizens to realize that languages other than their own are an intricate and beautiful heritage, not something to be despised or stamped out.



For problems or questions regarding this website, contact khamasak@users.sourceforge.net.
Last updated: Jan 01, 1970


Sponsors:

SourceForge.net; Data House, Inc.; University of Hawaii; NSF; SIL