Search
Now showing items 1-10 of 16
LimitLESS Directories: A Scalable Cache Coherence Scheme
(1991-06)
Caches enhance the performance of multiprocessors by reducing network traffic and average memory access latency. However, cache-based systems must address the problem of cache coherence. We propose the LimitLESS directory ...
Memory Assignment for Multiprocessor Caches Through Graph Coloring
(1992-02)
It has become apparent that the achieved performance of multiprocessors is heavily dependent upon the quality of the available compilers. In this paper we are concerned with compile-time techniques that can be used to ...
Compile-time Techniques for Processor Allocation in Macro Dataflow Graphs for Multiprocessors
(1992-06)
When compiling a program consisting of multiple nested loops for execution on a multiprocessor, processor allocation is the problem of determining the number of processors over which to partition each nested loop. This paper ...
Experience with Fine-grain Synchronization in MIMD Machines for Preconditioned Conjugate Gradient
(1992-10)
This paper discusses our experience with fine-grain synchronization for the preconditioned conjugate gradient method using the modified incomplete Cholesky factorization of the coefficient matrix as a preconditioner. This ...
Modeling Multiprogrammed Caches
This paper presents a simple, yet accurate, model for multiprogrammed caches and validates it against trace-driven simulation. The model takes into account nonstationary behavior of processes and process sharing. By making ...
Baring it all to Software: The Raw Machine
(1997-03)
Rapid advances in technology force a quest for computer architectures that exploit new opportunities and shed existing mechanisms that do not scale. Current architectures, such as hardware scheduled superscalars, are ...
The MIT Alewife Machine: A Large-scale Distributed-memory Multiprocessor
(1991-06)
The Alewife multiprocessor project focuses on the architecture and design of a large-scale parallel machine. The machine uses a low-dimension direct interconnection network to provide scalable communication bandwidth, ...
Software-Extended Coherent Shared Memory: Performance and Cost
(1993-10)
This paper evaluates the tradeoffs involved when designing a directory-based protocol that implements coherent shared memory through a combination of hardware and software mechanisms. The fundamental design decisions involve ...
UDM: User Direct Messaging for General-Purpose Multiprocessing
(1996-03)
User Direct Messaging (UDM) allows user-level, processor-to-processor messaging to coexist with general multiprogramming and virtual memory. Direct messaging, where processors launch and receive messages in tens of cycles ...
How to Build Scalable On-Chip ILP Networks for a Decentralized Architecture
(2000-04)
The era of billion transistors-on-a-chip is creating a completely different set of design constraints, forcing radically new microprocessor architecture designs. This paper examines a few of the possible microarchitectures ...