When designing a new engine or airplane wing, engineers can apply theories that have withstood years of scientific scrutiny, such as the Laws of Thermodynamics or Newton's Laws of Motion. To what theories, if any, can artificial intelligence (AI) researchers and technology pioneers turn when designing neural networks or algorithms? We asked experts from the fields of computer science, theoretical physics, and philosophy for their insights.
The Encyclopaedia Britannica defines a scientific theory as a "systematic ideational structure of broad scope, conceived by the human imagination, that encompasses a family of empirical (experiential) laws regarding regularities existing in objects and events, both observed and posited."
Several main theories of AI do exist, according to Thomas G. Dietterich, Distinguished Professor (Emeritus) of computer science at Oregon State University.
The earliest came in reaction to behaviorist psychology and was developed by John McCarthy of Stanford University in 1959. A second is based on the statistical decision theory introduced by John von Neumann and Oskar Morgenstern in 1944, and a third, based on cognitive modeling, builds on the work of John Anderson, Richard King Mellon Professor of Psychology and Computer Science at Carnegie Mellon University.
A fourth theory of AI focusing on Machine Learning (ML) should be divided into Learning by Imitation and Learning by Optimization, says Dietterich.
ML faces a theoretical crisis, according to Dietterich. "Experimental work in deep neural networks has far outpaced theoretical analysis, so we have deep learning methods that give exciting results, but for which we lack good theory."
Dietterich believes we need many different theories of AI: a theory of what it means for a system to be intelligent (a "definitional" theory); theories of what architectures and mechanisms can produce intelligent systems (a "design and engineering" theory); and a theory of how an intelligent system can operate successfully in a dangerous and open world. Dietterich describes the last of these as a "survival/robustness" theory.
"We have some definitional and design theories, but we lack a survival/robustness theory," he says.
Nick Bostrom, an applied ethics professor at Oxford University and director of Oxford's Future of Humanity Institute, suggests probability theory and decision theory "constitute fairly general theoretical foundations for AI." However, he sees value in gaining better theoretical insight into how different AI methods work. This would provide "more value, in the long run, than in eking out another decimal point of performance on some benchmark by fine-tuning parameters," he says.
However, Bostrom adds that theoretical insight need not always take the form of theorems or one big theory of AI. "Often, the real juice comes from clear but informal verbal explanations of the phenomena we observe when we run our algorithms," he says.
Adriano Barra, a theoretical physicist at the University of Salento, Italy, acknowledges that some theories, or "hints of theories," exist, but says they are unfinished. "We have a million learning rules, but no mathematical theory that explains them all." As a result, researchers can achieve practical working results with AI, but no global understanding.
People working in AI nowadays are not always aware of, or interested in, theories from decades ago, says Barra. "They have to solve problems for the bank, for security, for cryptography. They don't care about theories; they just want to be paid."
AI is often developed randomly, says Barra, citing neural network design. Without theory, researchers try a certain number of neurons in the first processing layer, then ask how much larger the second layer should be. They keep trying different numbers of neurons and adjusting until the network works.
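The trial-and-error loop described above can be sketched in a few lines. This is a minimal illustration only, assuming a toy XOR task and a tiny pure-Python network with two hidden layers; the function names (`train_and_score`, `trial_and_error`), the candidate layer sizes, and the training settings are all hypothetical choices made for this example, not anything used by the researchers quoted here.

```python
import itertools
import math
import random

# Toy dataset (XOR); a stand-in for a real task, chosen only for illustration.
DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]


def init_layer(n_in, n_out, rng):
    # One row per neuron: n_in weights followed by a bias term.
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward(layers, x):
    # Returns the activations of every layer, starting with the input.
    acts = [x]
    for i, layer in enumerate(layers):
        out = []
        for row in layer:
            z = sum(w * a for w, a in zip(row, acts[-1])) + row[-1]
            if i == len(layers) - 1:
                z = max(-60.0, min(60.0, z))  # guard against overflow in exp
                out.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid output
            else:
                out.append(math.tanh(z))  # tanh hidden units
        acts.append(out)
    return acts


def train_and_score(h1, h2, epochs=500, lr=0.5, seed=0):
    # Train a 2-h1-h2-1 network with plain SGD and return its final
    # mean squared error on the toy data.
    rng = random.Random(seed)
    sizes = [2, h1, h2, 1]
    layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
    for _ in range(epochs):
        for x, t in DATA:
            acts = forward(layers, x)
            y = acts[-1][0]
            deltas = [(y - t) * y * (1.0 - y)]  # squared-error gradient at the output
            for li in range(len(layers) - 1, -1, -1):
                layer, a_in = layers[li], acts[li]
                prev = []
                if li > 0:
                    # Push the error down one layer before changing the weights.
                    prev = [
                        sum(d * row[j] for d, row in zip(deltas, layer)) * (1.0 - a_in[j] ** 2)
                        for j in range(len(a_in))
                    ]
                for d, row in zip(deltas, layer):
                    for j, a in enumerate(a_in):
                        row[j] -= lr * d * a
                    row[-1] -= lr * d
                deltas = prev
    return sum((forward(layers, x)[-1][0] - t) ** 2 for x, t in DATA) / len(DATA)


def trial_and_error(first_sizes, second_sizes):
    # Try every pairing of layer sizes and keep whichever scores best.
    results = {}
    for h1, h2 in itertools.product(first_sizes, second_sizes):
        results[(h1, h2)] = train_and_score(h1, h2)
    best = min(results, key=results.get)
    return best, results


best, results = trial_and_error([2, 4, 8], [2, 4, 8])
print("best layer sizes:", best, "error:", results[best])
```

Even in this toy setting, three candidate sizes per layer mean nine complete training runs, and each extra layer or candidate size multiplies that count.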
Barra argues that trial and error is grossly inefficient. In contrast, with theory, you save time and energy, and "we don't have power to waste," he says. "If we had a theory that could provide guidance on how to design and train our networks, we would probably be able to save a lot of electricity, and carbon emissions."
Soo-Young Lee, director of KI for Artificial Intelligence at the Korea Advanced Institute of Science and Technology (KAIST), points out that even if your network architecture is imperfect, you can achieve success with good training data.
"Learning requires calculation and memory, so current success in deep learning comes mainly from two reasons: big data and faster computers. But that's not the only requirement; the third is theory," says Lee.
Why worry about theory if you get results without it? Working without it is inefficient, says Lee, echoing Barra's concerns. "Training data is getting bigger every second." To use that data efficiently, Lee says, we need theories of network architecture and of learning algorithms.
Where will future theories of AI come from? A theory based on computational models drawn from cognitive neuroscience looks likely, suggests Lee. Barra, who works at this intersection of mathematics and neurobiology, says, "We need a theory of AI and I believe that the statistical mechanics of complex systems [such as mammalian brains] can be the robust mathematical tool to address this task."
Dietterich points to continued collaboration with fields such as economics, psychology, and statistics. AI researchers should also work closely with engineers and biologists to develop a survival/robustness theory, he says.
Ultrafast CPUs and GPUs, big data, and advances in quantum computing will continue to fuel AI's dizzying advance. Time will tell whether theory can catch up.
Karen Emslie is a location-independent freelance journalist and essayist.