Current processor trends of integrating more cores with SIMD units have made it more difficult to extract performance from applications. It is believed that traditional... Nadathur Satish, Changkyu Kim, Jatin Chhugani, Hideki Saito, Rakesh Krishnaiyer, Mikhail Smelyanskiy, Milind Girkar, Pradeep Dubey From Communications of the ACM | May 2015
Specialization improves energy-efficiency in computing but only makes economic sense if there is significant demand. A balance can often be found by designing... Trevor Mudge From Communications of the ACM | April 2015
We present the Convolution Engine (CE), a programmable processor specialized for the convolution-like data-flow prevalent in computational photography, computer... Wajahat Qadeer, Rehan Hameed, Ofer Shacham, Preethi Venkatesan, Christos Kozyrakis, Mark Horowitz From Communications of the ACM | April 2015
"Neural Acceleration for General-Purpose Approximate Programs" demonstrates the significant advantages in cost, power, and latency through approximate computing... Ravi Nair From Communications of the ACM | January 2015
This paper describes a new approach that uses machine learning-based transformations to accelerate approximation-tolerant programs. Hadi Esmaeilzadeh, Adrian Sampson, Luis Ceze, Doug Burger From Communications of the ACM | January 2015
As GPUs have become mainstream parallel processing engines, many applications targeting GPUs now have data locality more amenable to traditional caching. The... Stephen W. Keckler From Communications of the ACM | December 2014
This paper studies the effect of accelerating highly parallel workloads with significant locality on a massively multithreaded GPU. Timothy G. Rogers, Mike O'Connor, Tor M. Aamodt From Communications of the ACM | December 2014
"Dissection: A New Paradigm for Solving Bicomposite Search Problems," by Itai Dinur, Orr Dunkelman, Nathan Keller, and Adi Shamir, presents an elegant new algorithm... Bart Preneel From Communications of the ACM | October 2014
In this paper, we introduce the new notion of bicomposite search problems, and show that they can be solved with improved combinations of time and space complexities... Itai Dinur, Orr Dunkelman, Nathan Keller, Adi Shamir From Communications of the ACM | October 2014
Having multiple Wi-Fi Access Points with an overlapping coverage area operating on the same frequency may not be a problem anymore. Konstantina (Dina) Papagiannaki From Communications of the ACM | July 2014
JMB, a joint multiuser beamforming system, enables independent access points (APs) to beamform their signals and communicate with their clients on the same channel... Hariharan Rahul, Swarun Kumar, Dina Katabi From Communications of the ACM | July 2014
An ideal scheme for password storage would enable a password with more than 20 bits of randomness to be input and output from the brain of a human being who is... Ari Juels, Bonnie Wong From Communications of the ACM | May 2014
We present a defense against coercion attacks using the concept of implicit learning from cognitive psychology. We use a carefully crafted computer game to allow... Hristo Bojinov, Daniel Sanchez, Paul Reber, Dan Boneh, Patrick Lincoln From Communications of the ACM | May 2014
Today's smartphone operating systems frequently fail to provide users with adequate control over and visibility into how third-party applications use their privacy... William Enck, Peter Gilbert, Byung-Gon Chun, Landon P. Cox, Jaeyeon Jung, Patrick McDaniel, Anmol N. Sheth From Communications of the ACM | March 2014
Moore's Law has been the mainstay of semiconductor electronics since the invention of the transistor and its application to the integrated circuit. Implicit in... Subramanian S. Iyer From Communications of the ACM | January 2014
Three-dimensional integrated circuit (3D IC) with through-silicon-via (TSV) is believed to offer new levels of efficiency, power, performance, and form-factor advantages... Moongon Jung, Joydeep Mitra, David Z. Pan, Sung Kyu Lim From Communications of the ACM | January 2014
In quite a tour de force, the authors of the following paper have built a provably correct real-time garbage collector for reconfigurable hardware (field programmable... Eliot Moss From Communications of the ACM | December 2013
We present a garbage collector synthesized directly to hardware, capable of collecting a heap of uniform objects completely concurrently. These heaps are composed... David F. Bacon, Perry Cheng, Sunil Shukla From Communications of the ACM | December 2013
Exponentially increasing transistor integration also demands more interconnections, which have started hitting fundamental limits. The Centip3De design demonstrates... Shekhar Borkar From Communications of the ACM | November 2013