Background

The WordCorr Project

Summary: Tracing systematic similarities among languages leads to an understanding of their history. Unfortunately, it requires painstaking tabulations that are massive, inherently complex, and error-prone in the way they are traditionally carried out.

The WordCorr project, with support from the National Science Foundation of the U.S. government, extends the capabilities of comparative linguists by matching their expertise in making pattern recognition judgments with a computational tool that organizes all their data in response to those judgments. This division of labor allows linguists to stay on top of greater quantities of data from more speech varieties than has been possible before, because it provides an information technology infrastructure for their discipline.

The computer does not do the analysis itself. Instead, it organizes the data according to each linguist's informed decisions in a way that rapidly shows up contrasts and complementarities among correspondence sets, a key step in arriving at a systematic analysis. WordCorr also highlights and organizes information about the anomalies that invariably have to be accounted for. It allows different levels of detail to be worked on, simultaneously if desired. The project involves constructing a specialized tool for linguists by:

Enabling collaboration between Dr. Joseph E. Grimes and his graduate assistants at the University of Hawaii and the software specialists at Data House, Honolulu, to produce a robust, user-friendly, and secure tool. The tool has a master database accessible to linguists anywhere in the world where an Internet connection is available, and it scales down so that one person can work alone with an extract from the database during field work where there is no Internet access.
Providing means for a team of scholars, or a professor and class, to discuss each other's views of the analysis, using a chat-like facility on the Internet or a local network.
Organizing conceptual and technical help for linguists who may be so remotely situated that they lack access both to consultation with colleagues and to technical advice about WordCorr itself.
Compiling sample word list collections to exploit WordCorr as an educational tool by treating a class as a kind of research team.
Training linguists and graduate students to exploit WordCorr's speed and consistency in order to jump-start comparative studies that combine field data with data from the linguistic literature.
Contributing to the Open Language Archives Community and to Linguist List's collection of materials from endangered languages.

Why WordCorr? In the tabulation phase of comparative linguistics, the ratio of bookkeeping time to thinking time has long stood at more than 200 to 1. Word processors and spreadsheets have brought the ratio down a little, but not much. The WordCorr design allows a linguist to concentrate on the data, because all the bookkeeping is handled in seconds, without error and without diverting attention from the main thought process.
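To give a feel for the kind of bookkeeping involved, here is a minimal sketch in Python. It is not WordCorr's actual implementation: the data are invented, the three speech varieties A, B, and C are made up, and the names ALIGNED and tabulate are hypothetical. It assumes the linguist has already decided how the segments of each word line up, and it only groups the resulting columns into correspondence sets together with the words that attest them.

```python
from collections import defaultdict

# Hypothetical, invented data: for each gloss, the linguist's aligned
# segments in three made-up speech varieties. "-" marks a gap.
ALIGNED = {
    "stone": {"A": ["p", "a", "t", "u"], "B": ["f", "a", "t", "u"], "C": ["p", "a", "t", "-"]},
    "five":  {"A": ["l", "i", "m", "a"], "B": ["l", "i", "m", "a"], "C": ["l", "i", "m", "-"]},
    "seven": {"A": ["p", "i", "t", "u"], "B": ["f", "i", "t", "u"], "C": ["p", "i", "t", "-"]},
}


def tabulate(aligned, varieties):
    """Group aligned positions into correspondence sets.

    Returns a dict that maps each tuple of corresponding segments
    (one segment per variety) to the (gloss, position) pairs attesting it.
    """
    table = defaultdict(list)
    for gloss, forms in aligned.items():
        # Each column holds the segments that the linguist lined up
        # at one position across all varieties.
        columns = zip(*(forms[v] for v in varieties))
        for position, segments in enumerate(columns):
            table[segments].append((gloss, position))
    return dict(table)


table = tabulate(ALIGNED, ["A", "B", "C"])

# List the correspondence sets, best attested first.
for segments, witnesses in sorted(table.items(), key=lambda kv: -len(kv[1])):
    print(" : ".join(segments), witnesses)
```

Even in this toy table, the set p : f : p is attested by two words, and the final vowels of A and B regularly correspond to a gap in C. Keeping such tallies up to date by hand over hundreds of words and many varieties is the work that dominates the traditional method.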

This helps transform comparative linguistics from drudgery illuminated now and then by a flash of insight into a doable endeavor that more people should be attracted to. WordCorr also assists in organizing the presentation of evidence for hypotheses about how language families have developed. It allows a large number of speech varieties to be compared at once.

Because of the necessity for close attention to minutiae, truly collegial research by teams of scholars is hard to achieve. Usually one scholar keeps in his or her head most of the options and uncertainties of the developing analysis, and assistants may be limited to compiling data and checking out specific lines of thought the lead scholar wants traced.

WordCorr, however, is designed so that each member of a team may follow through several alternate analyses, show them to colleagues, and discuss them freely until a consensus emerges, because all the alternatives are accessible through the database.

Broader Impact: The team research aspect allows classes in the comparative method to be conducted using exactly the same software the students will eventually use for their own field work. By making educationally useful data sets available to the public at large via the Internet, WordCorr may attract people into linguistics. It may also help informed citizens to realize that languages other than their own are an intricate and beautiful heritage, not something to be despised or stamped out.

For problems or questions regarding this web site, contact khamasak@users.sourceforge.net.
