Hierarchical Compilation of Macro Dataflow Graphs for Multiprocessors with Local Memory
(1992-10)
This paper presents a hierarchical approach for compiling macro dataflow graphs for multiprocessors with local memory. Macro dataflow graphs comprise several nodes (or macro operations) that must be executed subject to ...
Integrating Message-passing and Shared-memory: Early Experience
(1992-10)
This paper discusses some of the issues involved in implementing a shared-address space programming model on large-scale, distributed-memory multiprocessors. Because message-passing mechanisms are much more efficient than ...
Column-associative Caches: A Technique for Reducing the Miss Rate of Direct-mapped Caches
(1993-11)
Direct-mapped caches are a popular design choice for high-performance processors; unfortunately, direct-mapped caches suffer systematic interference misses when more than one address maps into the same cache set. This paper ...
Virtual Wires: Overcoming Pin Limitations in FPGA-based Logic Emulators
(1992-11)
Existing FPGA-based logic emulators suffer from limited inter-chip communication bandwidth, resulting in low gate utilization (10-20 percent). This resource imbalance increases the number of chips needed to emulate a ...
The Sensitivity of Communication Mechanisms to Bandwidth and Latency
The goal of this paper is to gain insight into the relative performance of communication mechanisms as bisection bandwidth and network latency vary. We compare shared memory with and without prefetching, message passing ...
Shared Memory Versus Message Passing for Iterative Solution of Sparse, Irregular Problems
(1996-10)
The benefits of hardware support for shared memory versus those for message passing are difficult to evaluate without an in-depth study of real applications on a common platform. We evaluate the communication mechanisms of ...
Maps: a Compiler-Managed Memory System for RAW Machines
(1998-07)
Microprocessors of the next decade and beyond will be built using VLSI chips employing billions of transistors. In this generation of microprocessors, achieving a high level of parallelism at a reasonable clock speed will ...
Exploring Optimal Cost-Performance Designs for RAW processors
(1998-06)
The semiconductor industry roadmap projects that advances in VLSI technology will permit more than one billion transistors on a chip by the year 2010. The MIT Raw microprocessor is a proposed architecture that strives to ...
FUGU: Implementing Translation and Protection in a Multiuser, Multimodel Multiprocessor
(1994-10)
Multimodel multiprocessors provide both shared memory and message passing primitives to the user for efficient communication. In a multiuser machine, translation permits machine resources to be virtualized and protection ...
Stream Algorithms and Architecture
(2003-03)
Wire-exposed, programmable microarchitectures including Trips [11], Smart Memories [8], and Raw [13] offer an opportunity to schedule instruction execution and data movement explicitly. This paper proposes stream algorithms, ...