Deep learning, the artificial-intelligence technology that powers voice assistants, autonomous cars, and Go champions, relies on complicated "neural network" software arranged in layers. A deep-learning system can live on a single computer, but the biggest ones are spread over thousands of machines wired together into "clusters," which sometimes live at large data centers, like those operated by Google. In a big cluster, as many as forty-eight pizza-box-size servers slide into a rack as tall as a person; these racks stand in rows, filling buildings the size of warehouses. The neural networks in such systems can tackle daunting problems, but they also face clear challenges. A network spread across a cluster is like a brain that's been scattered around a room and wired together. Electrons move fast, but, even so, cross-chip communication is slow, and uses extravagant amounts of energy.
Eric Vishria, a general partner at Benchmark, a venture-capital firm in San Francisco, first came to understand this problem in the spring of 2016, while listening to a presentation from a new computer-chip company called Cerebras Systems. Benchmark is known for having made early investments in companies such as Twitter, Uber, and eBay—that is, in software, not hardware. The firm looks at about two hundred startup pitches a year, and invests in maybe one. "We're in this kissing-a-thousand-frogs kind of game," Vishria told me. As the presentation started, he had already decided to toss the frog back. "I'm, like, Why did I agree to this? We're not gonna do a hardware investment," he recalled thinking. "This is so dumb."
Andrew Feldman, Cerebras's co-founder, began his slide deck with a cover slide, then a team slide, catching Vishria's attention: the talent was impressive. Then Feldman compared two kinds of computer chips. First, he looked at graphics-processing units, or G.P.U.s—chips designed for creating 3-D images. For a variety of reasons, today's machine-learning systems depend on these graphics chips. Next, he looked at central processing units, or C.P.U.s—the general-purpose chips that do most of the work on a typical computer. "Slide 3 was something along the lines of, 'G.P.U.s actually suck for deep learning—they just happen to be a hundred times better than C.P.U.s,' " Vishria recalled. "And, as soon as he said it, I was, like, facepalm. Of course! Of course!" Cerebras was proposing a new kind of chip—one built not for graphics but for A.I. specifically.
Vishria had grown used to hearing pitches from companies that planned to use deep learning for cybersecurity, medical imaging, chatbots, and other applications. After the Cerebras presentation, he talked with engineers at some of the companies that Benchmark had helped fund, including Zillow, Uber, and Stitch Fix; they told him that they were struggling with A.I. because "training" the neural networks took too long. Google had begun using super-fast "tensor-processing units," or T.P.U.s—special chips it had designed for artificial intelligence. Vishria knew that a gold rush was under way, and that someone had to build the picks and shovels.
That year, Benchmark and Foundation Capital, another venture-capital company, led a twenty-seven-million-dollar round of investment in Cerebras, which has since raised close to half a billion dollars. Other companies are also making so-called A.I. accelerators; Cerebras's competitors—Groq, Graphcore, and SambaNova—have raised more than two billion dollars in capital combined. But Cerebras's approach is unique. Instead of making chips in the usual way—by printing dozens of them onto a large wafer of silicon, cutting them out of the wafer, and then wiring them to one another—the company has made one giant "wafer-scale" chip. A typical computer chip is the size of a fingernail. Cerebras's is the size of a dinner plate. It is the largest computer chip in the world.
From The New Yorker