Now showing items 1-10 of 46
Smartlocks: Self-Aware Synchronization through Lock Acquisition Scheduling
(2009-11-09)
As multicore processors become increasingly prevalent, system complexity is skyrocketing. The advent of the asymmetric multicore compounds this -- it is no longer practical for an average programmer to balance the system ...
Power-Aware Computing with Dynamic Knobs
(2010-05-14)
We present PowerDial, a system for dynamically adapting application behavior to execute successfully in the face of load and power fluctuations. PowerDial transforms static configuration parameters into dynamic knobs that ...
Hierarchical Compilation of Macro Dataflow Graphs for Multiprocessors with Local Memory
(1992-10)
This paper presents a hierarchical approach for compiling macro dataflow graphs for multiprocessors with local memory. Macro dataflow graphs comprise several nodes (or macro operations) that must be executed subject to ...
Integrating Message-passing and Shared-memory: Early Experience
(1992-10)
This paper discusses some of the issues involved in implementing a shared-address space programming model on large-scale, distributed-memory multiprocessors. Because message-passing mechanisms are much more efficient than ...
Column-associative Caches: A Technique for Reducing the Miss Rate of Direct-mapped Caches
(1993-11)
Direct-mapped caches are a popular design choice for high-performance processors; unfortunately, direct-mapped caches suffer systematic interference misses when more than one address maps into the same cache set. This paper ...
Virtual Wires: Overcoming Pin Limitations in FPGA-based Logic Emulators
(1992-11)
Existing FPGA-based logic emulators suffer from limited inter-chip communication bandwidth, resulting in low gate utilization (10 to 20 percent). This resource imbalance increases the number of chips needed to emulate a ...
A Stream Algorithm for the SVD
(2003-10-22)
We present a stream algorithm for the Singular-Value Decomposition (SVD) of an M × N matrix A. Our algorithm trades speed of numerical convergence for parallelism, and derives from a one-sided, cyclic-by-rows Hestenes SVD. ...
Partitioning Strategies for Concurrent Programming
(2009-06-16)
This work presents four partitioning strategies, or patterns, useful for decomposing a serial application into multiple concurrently executing parts. These partitioning strategies augment the commonly used task and data ...
Energy Scalability of On-Chip Interconnection Networks in Multicore Architectures
(2008-11-11)
On-chip interconnection networks (OCNs) such as point-to-point networks and buses form the communication backbone in systems-on-a-chip, multicore processors, and tiled processors. OCNs can consume significant portions of ...
SEEC: A Framework for Self-aware Computing
(2010-10-13)
As the complexity of computing systems increases, application programmers must be experts in their application domain and have the systems knowledge required to address the problems that arise from parallelism, power, ...