
Communications of the ACM

Europe Region special section: Hot topics

Toward a Broad AI



Despite big successes in artificial intelligence (AI) and deep learning, critical assessments have been made of current deep learning methods.8 Deep learning is data hungry, has limited knowledge-transfer capabilities, does not adapt quickly to changing tasks or distributions, and insufficiently incorporates world or prior knowledge.1,3,8,14 While deep learning excels on natural language processing and vision benchmarks, it often underperforms in real-world applications. Deep learning models have been shown to fail on new data, in new applications, in deployments in the wild, and under stress tests.4,5,7,13,15 Consequently, practitioners harbor doubts about these models and hesitate to employ them in real-world applications.




Current AI research strives to overcome these criticisms and limitations of deep learning. AI research, and machine learning in particular, aims at a new level of AI, a "broad AI," with considerably enhanced and broader capabilities for skill acquisition and problem solving.3 We contrast "broad AI" with "narrow AI," the AI systems currently deployed. A broad AI considerably surpasses a narrow AI in the following essential properties: knowledge transfer and interaction, adaptability and robustness, abstraction and advanced reasoning, and efficiency (as illustrated in the accompanying figure). A broad AI is a sophisticated and adaptive system that successfully performs any cognitive task by virtue of its sensory perception, previous experience, and learned skills.

Figure. Hierarchical model of cognitive abilities of AI systems.3

To improve adaptability and robustness, a broad AI utilizes few-shot learning and self-supervised learning with contrastive learning, and it processes sensory inputs using context and memory. Few-shot learning trains models with only a small amount of data by exploiting prior knowledge or previous experience. Few-shot learning has a plethora of real-world applications, for example, when learned models must quickly adapt to new situations: new customers, new products, new processes, new workflows, or new sensory inputs.
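
As a minimal sketch of this idea, consider a prototypical-network-style classifier (plain NumPy; the embedding function and all names are illustrative assumptions rather than a specific published method). The prior knowledge lives in a fixed embedding; adapting to a new class then only requires averaging a handful of embedded support examples, with no gradient updates.

    import numpy as np

    def embed(x):
        # Hypothetical embedding; in practice a pretrained network
        # supplies the prior knowledge that few-shot learning exploits.
        return x

    def few_shot_classify(support_x, support_y, query_x, n_classes):
        """Prototypical-network-style few-shot classification.
        support_x: (n_support, d) few labeled examples
        support_y: (n_support,) integer class labels
        query_x:   (n_query, d) unlabeled queries
        """
        z_support = embed(support_x)
        z_query = embed(query_x)
        # One prototype per class: the mean embedding of its support examples.
        prototypes = np.stack([z_support[support_y == c].mean(axis=0)
                               for c in range(n_classes)])
        # Assign each query to the class of the nearest prototype.
        dists = ((z_query[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
        return dists.argmin(axis=1)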

With the advent of large corpora of unlabeled data in vision and language, self-supervised learning based on contrastive learning has become very popular. Either views of images are contrasted with views of other images, or text descriptions of images are contrasted with text descriptions of other images. Contrastive Language-Image Pre-training (CLIP)10 yielded very impressive results in zero-shot transfer learning. The CLIP model has the potential to become one of the most important foundation models.2 A model with high zero-shot transfer performance is highly adaptive and very robust; it is therefore expected to perform well when deployed in real-world applications and to be trusted by practitioners.
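
The following sketch shows, under the assumption of L2-normalized image and text embeddings (all names are illustrative), the symmetric InfoNCE-style objective behind CLIP-type contrastive training and how the resulting shared embedding space enables zero-shot classification:

    import numpy as np

    def clip_style_loss(image_emb, text_emb, temperature=0.07):
        """Symmetric contrastive loss over a batch of matched image-text pairs.
        image_emb, text_emb: (batch, d), assumed L2-normalized.
        The i-th image and i-th text are the positive pair; all other
        pairings in the batch act as negatives.
        """
        logits = image_emb @ text_emb.T / temperature  # (batch, batch)
        labels = np.arange(len(logits))

        def cross_entropy(l):
            l = l - l.max(axis=1, keepdims=True)       # numerical stability
            log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
            return -log_probs[labels, labels].mean()   # diagonal = positives

        # Both directions: image-to-text and text-to-image.
        return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))

    def zero_shot_classify(image_emb, class_text_emb):
        # Zero-shot transfer: pick the class whose embedded text
        # description best matches the image embedding.
        return (image_emb @ class_text_emb.T).argmax(axis=1)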

A broad AI should process its input by using context and previous experiences. Conceptual short-term memory9 is a notion from cognitive science which states that humans, when perceiving a stimulus, immediately associate it with information stored in long-term memory. Like humans, machine learning and AI methods should "activate a large amount of potentially pertinent information,"9 which is stored in episodic or long-term memories. Modern Hopfield networks11,12,16 are very promising in this regard: they reveal the covariance structures in the data, thereby making deep learning more robust. If features co-occur in the data, then modern Hopfield networks amplify this co-occurrence in the samples they retrieve. Modern Hopfield networks are a remedy for learning methods that suffer from the "explaining away" problem. Explaining away occurs when confirming one cause of an observed event prevents a method from finding alternative causes. Explaining away is one reason for shortcut learning5 and the Clever Hans phenomenon.7 Modern Hopfield networks avoid explaining away via the enriched covariance structure.
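
A minimal sketch of the retrieval step of a modern Hopfield network, following the continuous update rule of Ramsauer et al.11,12 (variable names are illustrative): the new state is a softmax-weighted combination of all stored patterns, which is how the co-occurrence of features is amplified in retrieved samples.

    import numpy as np

    def softmax(v):
        v = v - v.max()  # numerical stability
        e = np.exp(v)
        return e / e.sum()

    def hopfield_retrieve(X, xi, beta=8.0, n_steps=1):
        """Modern Hopfield network retrieval: xi_new = X^T softmax(beta * X xi).
        X:  (n_patterns, d) stored patterns (one per row)
        xi: (d,) state/query pattern
        One update step usually suffices for retrieval.
        """
        for _ in range(n_steps):
            xi = X.T @ softmax(beta * (X @ xi))
        return xi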

Graph neural networks (GNNs) are a very promising research direction, as they operate on graph structures in which nodes and edges are associated with labels and characteristics. GNNs are the predominant models of neural-symbolic computing.6 They describe the properties of molecules, simulate social networks, and predict future states in physical and engineering applications with particle-particle interactions.
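
As an illustration, here is a single GCN-style message-passing layer, one common GNN variant (a sketch with illustrative names, not a specific system from the text): each node updates its features by aggregating degree-normalized features from its neighbors along the graph's edges.

    import numpy as np

    def gcn_layer(A, H, W):
        """One graph-convolution (message-passing) step.
        A: (n, n) adjacency matrix, assumed to include self-loops
        H: (n, d_in) node features
        W: (d_in, d_out) learned weight matrix
        """
        deg = A.sum(axis=1)
        D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
        A_hat = D_inv_sqrt @ A @ D_inv_sqrt    # symmetric normalization
        # Each node's new feature: normalized neighbor aggregation,
        # linear transform, then a ReLU nonlinearity.
        return np.maximum(0.0, A_hat @ H @ W)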


Europe's Opportunity for a Broad AI

The most promising approach to a broad AI is a neuro-symbolic AI, that is, a bilateral AI that combines methods from symbolic and sub-symbolic AI. In contrast to other regions, Europe has strong research groups in both symbolic and sub-symbolic AI, and therefore has the unprecedented opportunity to make a fundamental contribution to the next level of AI, a broad AI.




AI researchers should strive for a broad AI with considerably enhanced and broader capabilities for skill acquisition and problem solving by means of bilateral AI approaches that combine symbolic and sub-symbolic AI.


References

1. Bengio, Y., LeCun, Y., and Hinton, G. Turing Lecture: Deep learning for AI. Commun. ACM 64, 7 (July 2021), 58–65; doi:10.1145/3448250

2. Bommasani, R. et al. On the Opportunities and Risks of Foundation Models (2021); arXiv:2108.07258.

3. Chollet, F. On the Measure of Intelligence (2019); arXiv:1911.01547.

4. D'Amour, A. et al. Underspecification Presents Challenges for Credibility in Modern Machine Learning (2020); arXiv:2011.03395.

5. Geirhos, R., Jacobsen, J.-H., Michaelis, C., Zemel, R.S., Brendel, W., Bethge, M., and Wichmann, F.A. Shortcut Learning in Deep Neural Networks (2020); arXiv:2004.07780.

6. Lamb, L.C., Garcez, A., Gori, M., Prates, M., Avelar, P., and Vardi, M. Graph Neural Networks Meet Neural-Symbolic Computing: A Survey and Perspective (2020); arXiv:2003.00330.

7. Lapuschkin, S. et al. Unmasking Clever Hans predictors and assessing what machines really learn. Nature Communications 10 (2019).

8. Marcus, G. Deep Learning: A Critical Appraisal (2018); arXiv:1801.00631.

9. Potter, M. Conceptual short term memory in perception and thought. Frontiers in Psychology 3 (2012), 113.

10. Radford, A. et al. Learning transferable visual models from natural language supervision. In Proceedings of the 38th Intern. Conf. Machine Learning, 2021.

11. Ramsauer, H. et al. Hopfield Networks is All You Need (2020); arXiv:2008.02217.

12. Ramsauer, H. et al. Hopfield networks is all you need. In Proceedings of the 2021 Intern. Conf. Learning Representations; https://openreview.net/forum?id=tL89RnzIiCd.

13. Recht, B., Roelofs, R., Schmidt, L., and Shankar, V. Do ImageNet classifiers generalize to ImageNet? In Proceedings of the 36th Intern. Conf. Machine Learning 97 (2019), 5389–5400.

14. Sutton, R. The Bitter Lesson (2019); http://www.incompleteideas.net/IncIdeas/BitterLesson.html

15. Taori, R., Dave, A., Shankar, V., Carlini, N., Recht, B., and Schmidt, L. Measuring robustness to natural distribution shifts in image classification. In Proceedings of the 33rd Conf. Advances in Neural Information Processing Systems. Curran Associates, Inc., 2020, 18583–18599.

16. Widrich, M. et al. Modern Hopfield networks and attention for immune repertoire classification. Advances in Neural Information Processing Systems 33 (2020); https://bit.ly/3JPpI5y.


Author

Sepp Hochreiter is a professor at Johannes Kepler Universität Linz, Austria.


Copyright held by author/owner. Publication rights licensed to ACM.
Request permission to publish from [email protected]


