University of Chicago (UC) researchers are studying natural language morphology in an attempt to develop computers that are better at understanding human language.
The researchers are using the Research Computing Center's (RCC) Midway supercomputing cluster to analyze corpora, which are standard bodies of written language that can contain billions of words taken from many different genres of writing. "A typical scenario for us is that, given some raw data, we have some intuition about certain patterns in the data, and we collaborate with RCC to create visualization tools to display data in a way that enables us to explore these patterns," says UC researcher Jackson Lee.
For a given word, the visualization shows which words occur most often immediately before and after it in a natural language corpus. "The construction of this visualization tool grew out of the observation that overall word distribution patterns are sensitive to the specific distribution of individual words, and we need a tool to 'see' what the grammar of a given word really looks like," Lee says.
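The kind of count that could sit behind such a display can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' tool: the function name `neighbor_counts`, the whitespace tokenization, and the toy sentence are all hypothetical, and a real corpus would need proper tokenization and far more data.

```python
from collections import Counter


def neighbor_counts(tokens, target):
    """Count the words that appear immediately before and after `target`.

    Hypothetical helper for illustration; not part of the tool described
    in the article.
    """
    before, after = Counter(), Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        if i > 0:
            before[tokens[i - 1]] += 1   # left neighbor
        if i + 1 < len(tokens):
            after[tokens[i + 1]] += 1    # right neighbor
    return before, after


# Toy corpus (made up for this sketch, not the researchers' data).
text = "the dog saw the cat and the cat saw the dog"
tokens = text.lower().split()

before, after = neighbor_counts(tokens, "cat")
print("most common before 'cat':", before.most_common(3))
print("most common after 'cat':", after.most_common(3))
```

Ranking the two counters by frequency gives the left and right context of a word, which is the raw material a visualization of this sort would draw on.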
He notes that a better understanding of natural language morphology can lead to better-designed human-machine interfaces and better ways to search large databases.
From UChicago News (IL)
Abstracts Copyright © 2015 Information Inc., Bethesda, Maryland, USA