Parallel computation is making a comeback after a quarter century of neglect. Past research can be put to quick use today.
Interesting article. We should definitely learn from the past. But I also think there are some differences between older parallel systems and new ones.
I think this claim is a problem: "We can completely avoid the cache consistency problem by never writing to shared data." For many of the things we want to do with multicore processors (like graphics, or games, which is my area), we *do* need to write to shared data. A lot of the proposed solutions carry this restriction: they require shared data to be read-only or write-once. That is sometimes possible, but for the real-world applications we work on it is often unworkable. The framebuffer in graphics, for example, is writeable shared memory.
So I think we are still left with a problem we don't have a good answer to: data that is both shared and modified.
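To make the contrast concrete, here is a minimal sketch in Rust (purely illustrative, not a design from the article; the names `texture` and `framebuffer` are placeholders): data that is written once and then only read can be shared across threads with no coordination, while a mutable framebuffer-like buffer needs explicit synchronization.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Write-once sharing: the texture is filled in before any thread reads it,
    // so readers need no locking and no consistency protocol.
    let texture: Arc<Vec<u8>> = Arc::new(vec![0x7f; 1024]);

    // Mutable sharing: a framebuffer-like buffer that every task writes into.
    // Here the writes have to be coordinated (a Mutex in this sketch).
    let framebuffer: Arc<Mutex<Vec<u8>>> = Arc::new(Mutex::new(vec![0u8; 1024]));

    let handles: Vec<_> = (0..4usize)
        .map(|id| {
            let texture = Arc::clone(&texture);
            let framebuffer = Arc::clone(&framebuffer);
            thread::spawn(move || {
                // Reading write-once data needs no coordination at all.
                let sample = texture[id * 16];
                // Writing shared data is where the consistency problem comes back.
                let mut fb = framebuffer.lock().unwrap();
                fb[id] = sample;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    println!("first pixels: {:?}", &framebuffer.lock().unwrap()[..4]);
}
```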
The other difference has to do with the lifetime of tasks. Older systems ran long-lived tasks, and that is still often true of HPC today. But on the media-rich devices that use multicore, tasks are often much shorter-lived, which raises the relative cost of starting and stopping them. There is also the problem that we can duplicate processor cores but cannot necessarily afford to duplicate the memory chips, so memory bandwidth becomes a bigger constraint.
So, there are some fundamental differences today, which mean some of the past solutions aren't always appropriate.
Our comment about cache consistency was part of a discussion about designs that enable reproducible results from very large systems of partially ordered tasks operating in a common virtual memory. Some memories, such as the framebuffer noted here, are not part of the virtual address space. We saw some very promising designs of virtual memory and database systems that used write-once memory to avoid consistency problems within the shared address space. Some of these issues may be tough today because designers of multicore systems did not set reproducible results in large task systems as an objective. I hate to see people overlooking the good results of past research just because it is "old" and did not anticipate today's environment.
The question of designing large task systems for determinacy is separate from the question of whether we can afford large task systems. When tasks cannot be started cheaply, we won't be able to afford large systems of those tasks. That should motivate interest in how to reduce task startup costs. There was a lot of research on that too, with many successful designs, which we did not cover in our article.
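One illustration of the kind of design that keeps short-lived tasks affordable, sketched here in Rust for illustration only (not one of the historical designs referred to above): start a small fixed set of worker threads once and feed them tasks over a channel, so the thread-startup cost is paid once rather than once per task.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// A task is just a boxed closure; real systems would carry more context.
type Task = Box<dyn FnOnce() + Send + 'static>;

fn main() {
    let (sender, receiver) = mpsc::channel::<Task>();
    let receiver = Arc::new(Mutex::new(receiver));

    // Pay thread-creation cost once, for a small fixed set of workers...
    let workers: Vec<_> = (0..4)
        .map(|_| {
            let receiver = Arc::clone(&receiver);
            thread::spawn(move || loop {
                // Each worker pulls tasks until the channel is closed.
                let task = match receiver.lock().unwrap().recv() {
                    Ok(task) => task,
                    Err(_) => break, // sender dropped: no more tasks
                };
                task();
            })
        })
        .collect();

    // ...then submit many short-lived tasks cheaply
    // (one allocation and one channel send per task).
    for i in 0..1000 {
        sender
            .send(Box::new(move || {
                let _ = i * i; // stand-in for a small unit of work
            }))
            .unwrap();
    }

    drop(sender); // close the channel so the workers exit
    for worker in workers {
        worker.join().unwrap();
    }
}
```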