
Communications of the ACM

Viewpoint

The Growing Cost of Deep Learning for Source Code


Illustration: data values on a colorful graph. Credit: Ozz Design

Recent years have seen a steep increase in the use of artificial intelligence methods in software engineering (AI+SE) research. The combination of these two fields has unlocked remarkable new abilities: Lachaux et al.'s recent work on unsupervised machine translation of programming languages [15], for instance, learns to generate Java methods from C++ with over 80% accuracy, without curated examples. This would surely have sounded like a vision of a distant future just a decade ago, but such quick progress is indicative of the substantial and unique potential of deep learning for software engineering tasks and domains.

Yet these abilities come at a price. The "secret ingredient" is data, as epitomized by Lachaux et al.'s work, which uses 163 billion tokens across three programming languages. For perspective, this is not just nearly 100 times the size of virtually all prior datasets in the AI+SE field; the estimated cost of training this model also runs to tens of thousands of dollars. And even that is a drop in the bucket compared to what is next: training the new state of the art in language models, GPT-3 [2], costs on the order of millions of dollars. This may be a small price to pay for Facebook, where Lachaux et al.'s research was conducted, or OpenAI (GPT-3), but this explosive growth in the cost of achieving the state of the art has left the ability to train and test such models limited to a select few large technology companies, and well beyond the resources of virtually all academic labs. It is reasonable, then, to worry that a continuation of this trend will stifle some of the innovative capacity of academic labs and leave much of the future of AI-based SE research in the hands of elite industry labs. This Viewpoint is a call to action, in which we discuss the current trends and their importance for our field, and propose solutions.


 
