GLORIA — GEOMAR Library Ocean Research Information Access

1

Online Resource

Virtual Hierarchies

Marty, Michael R. ; Hill, Mark D.

Institute of Electrical and Electronics Engineers (IEEE) ; 2008

In: IEEE Micro Vol. 28, No. 1 ( 2008-1), p. 99-109

In: IEEE Micro, Institute of Electrical and Electronics Engineers (IEEE), Vol. 28, No. 1 ( 2008-1), p. 99-109

Type of Medium: Online Resource

ISSN: 0272-1732

DOI: 10.1109/MM.2008.19

RVK: SQ 1100

Language: Unknown

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Publication Date: 2008

ZDB-ID: 2027750-7

2

Online Resource

Range Translations for Fast Virtual Memory

Gandhi, Jayneel ; Karakostas, Vasileios ; Ayar, Furkan ; [et al.]

Institute of Electrical and Electronics Engineers (IEEE) ; 2016

In: IEEE Micro Vol. 36, No. 3 ( 2016-5), p. 118-126

In: IEEE Micro, Institute of Electrical and Electronics Engineers (IEEE), Vol. 36, No. 3 ( 2016-5), p. 118-126

Type of Medium: Online Resource

ISSN: 0272-1732 , 1937-4143

DOI: 10.1109/MM.40

DOI: 10.1109/MM.2016.10

RVK: SQ 1100

Language: Unknown

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Publication Date: 2016

ZDB-ID: 2027750-7

3

Online Resource

Performance Pathologies in Hardware Transactional Memory

Bobba, Jayaram ; Moore, Kevin E. ; Volos, Haris ; [et al.]

Institute of Electrical and Electronics Engineers (IEEE) ; 2008

In: IEEE Micro Vol. 28, No. 1 ( 2008-1), p. 32-41

In: IEEE Micro, Institute of Electrical and Electronics Engineers (IEEE), Vol. 28, No. 1 ( 2008-1), p. 32-41

Type of Medium: Online Resource

ISSN: 0272-1732 , 1937-4143

DOI: 10.1109/MM.40

DOI: 10.1109/MM.2008.11

RVK: SQ 1100

Language: Unknown

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Publication Date: 2008

ZDB-ID: 2027750-7

4

Online Resource

Efficient virtual memory for big memory servers

Basu, Arkaprava ; Gandhi, Jayneel ; Chang, Jichuan ; [et al.]

Association for Computing Machinery (ACM) ; 2013

In: ACM SIGARCH Computer Architecture News Vol. 41, No. 3 ( 2013-06-26), p. 237-248

In: ACM SIGARCH Computer Architecture News, Association for Computing Machinery (ACM), Vol. 41, No. 3 ( 2013-06-26), p. 237-248

Abstract: Our analysis shows that many "big-memory" server workloads, such as databases, in-memory caches, and graph analytics, pay a high cost for page-based virtual memory. They consume as much as 10% of execution cycles on TLB misses, even using large pages. On the other hand, we find that these workloads use read-write permission on most pages, are provisioned not to swap, and rarely benefit from the full flexibility of page-based virtual memory. To remove the TLB miss overhead for big-memory workloads, we propose mapping part of a process's linear virtual address space with a direct segment, while page mapping the rest of the virtual address space. Direct segments use minimal hardware (base, limit, and offset registers per core) to map contiguous virtual memory regions directly to contiguous physical memory. They eliminate the possibility of TLB misses for key data structures such as database buffer pools and in-memory key-value stores. Memory mapped by a direct segment may be converted back to paging when needed. We prototype direct-segment software support for x86-64 in Linux and emulate direct-segment hardware. For our workloads, direct segments eliminate almost all TLB misses and reduce the execution time wasted on TLB misses to less than 0.5%.
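
Note: the base/limit/offset translation described in the abstract can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the authors' hardware or Linux prototype: the class name `DirectSegmentMMU`, the dictionary-based page table, and the 4 KiB page size are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumption: 4 KiB base pages for the paged region


@dataclass
class DirectSegmentMMU:
    """Toy model: one direct segment plus a fallback page table."""
    base: int    # first virtual address covered by the direct segment
    limit: int   # first virtual address past the direct segment
    offset: int  # added to a virtual address to form the physical address
    # Hypothetical page table: virtual page number -> physical frame number.
    page_table: dict = field(default_factory=dict)

    def translate(self, vaddr: int) -> int:
        # Inside [base, limit): a single add, so no TLB lookup is modeled.
        if self.base <= vaddr < self.limit:
            return vaddr + self.offset
        # Outside the segment: ordinary page-based translation.
        vpn, page_off = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.page_table:
            raise KeyError(f"page fault at {vaddr:#x}")
        return self.page_table[vpn] * PAGE_SIZE + page_off


mmu = DirectSegmentMMU(
    base=0x4000_0000,
    limit=0x8000_0000,
    offset=0x1_0000_0000,
    page_table={0x10: 0x99},
)
segment_pa = mmu.translate(0x4000_1234)  # served by the direct segment
paged_pa = mmu.translate(0x10123)        # served by the page table
```

In this model an address inside the segment never consults the page table, which mirrors the abstract's point that a direct segment removes the possibility of TLB misses for the data it covers.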

Type of Medium: Online Resource

ISSN: 0163-5964

DOI: 10.1145/2508148.2485943

RVK: SS 1985

Language: English

Publisher: Association for Computing Machinery (ACM)

Publication Date: 2013

ZDB-ID: 2088489-8

ZDB-ID: 186012-4

5

Online Resource

Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset

Martin, Milo M. K. ; Sorin, Daniel J. ; Beckmann, Bradford M. ; [et al.]

Association for Computing Machinery (ACM) ; 2005

In: ACM SIGARCH Computer Architecture News Vol. 33, No. 4 ( 2005-11), p. 92-99

In: ACM SIGARCH Computer Architecture News, Association for Computing Machinery (ACM), Vol. 33, No. 4 ( 2005-11), p. 92-99

Abstract: The Wisconsin Multifacet Project has created a simulation toolset to characterize and evaluate the performance of multiprocessor hardware systems commonly used as database and web servers. We leverage an existing full-system functional simulation infrastructure (Simics [14]) as the basis around which to build a set of timing simulator modules for modeling the timing of the memory system and microprocessors. This simulator infrastructure enables us to run architectural experiments using a suite of scaled-down commercial workloads [3]. To enable other researchers to more easily perform such research, we have released these timing simulator modules as the Multifacet General Execution-driven Multiprocessor Simulator (GEMS) Toolset, release 1.0, under GNU GPL [9].

Type of Medium: Online Resource

ISSN: 0163-5964

DOI: 10.1145/1105734.1105747

RVK: SS 1985

Language: English

Publisher: Association for Computing Machinery (ACM)

Publication Date: 2005

ZDB-ID: 2088489-8

ZDB-ID: 186012-4

6

Online Resource

Virtual hierarchies to support server consolidation

Marty, Michael R. ; Hill, Mark D.

Association for Computing Machinery (ACM) ; 2007

In: ACM SIGARCH Computer Architecture News Vol. 35, No. 2 ( 2007-06-09), p. 46-56

In: ACM SIGARCH Computer Architecture News, Association for Computing Machinery (ACM), Vol. 35, No. 2 ( 2007-06-09), p. 46-56

Abstract: Server consolidation is becoming an increasingly popular technique to manage and utilize systems. This paper develops CMP memory systems for server consolidation where most sharing occurs within Virtual Machines (VMs). Our memory systems maximize shared memory accesses serviced within a VM, minimize interference among separate VMs, facilitate dynamic reassignment of VMs to processors and memory, and support content-based page sharing among VMs. We begin with a tiled architecture where each of 64 tiles contains a processor, private L1 caches, and an L2 bank. First, we reveal why single-level directory designs fail to meet workload consolidation goals. Second, we develop the paper's central idea of imposing a two-level virtual (or logical) coherence hierarchy on a physically flat CMP that harmonizes with VM assignment. Third, we show that the best of our two virtual hierarchy (VH) variants performs 12-58% better than the best alternative flat directory protocol when consolidating Apache, OLTP, and Zeus commercial workloads on our simulated 64-core CMP.

Type of Medium: Online Resource

ISSN: 0163-5964

DOI: 10.1145/1273440.1250670

RVK: SS 1985

Language: English

Publisher: Association for Computing Machinery (ACM)

Publication Date: 2007

ZDB-ID: 2088489-8

ZDB-ID: 186012-4

7

Online Resource

Reducing memory reference energy with opportunistic virtual caching

Basu, Arkaprava ; Hill, Mark D. ; Swift, Michael M.

Association for Computing Machinery (ACM) ; 2012

In: ACM SIGARCH Computer Architecture News Vol. 40, No. 3 ( 2012-09-05), p. 297-308

In: ACM SIGARCH Computer Architecture News, Association for Computing Machinery (ACM), Vol. 40, No. 3 ( 2012-09-05), p. 297-308

Abstract: Most modern cores perform a highly associative translation lookaside buffer (TLB) lookup on every memory access. These designs often hide the TLB lookup latency by overlapping it with L1 cache access, but this overlap does not hide the power dissipated by TLB lookups. It can even exacerbate the power dissipation by requiring a higher-associativity L1 cache. With today's concern for power dissipation, designs could instead adopt a virtual L1 cache, wherein TLB access power is dissipated only after L1 cache misses. Unfortunately, virtual caches have compatibility issues, such as supporting writeable synonyms and x86's physical page table walker. This work proposes an Opportunistic Virtual Cache (OVC) that exposes virtual caching as a dynamic optimization by allowing some memory blocks to be cached with virtual addresses and others with physical addresses. OVC relies on small OS changes to signal which pages can use virtual caching (e.g., no writeable synonyms), but defaults to physical caching for compatibility. We show OVC's promise with analysis that finds virtual cache problems exist, but are dynamically rare. We change 240 lines in Linux 2.6.28 to enable OVC. On experiments with Parsec and commercial workloads, the resulting system saves 94-99% of TLB lookup energy and nearly 23% of L1 cache dynamic lookup energy.

Type of Medium: Online Resource

ISSN: 0163-5964

DOI: 10.1145/2366231.2337194

RVK: SS 1985

Language: English

Publisher: Association for Computing Machinery (ACM)

Publication Date: 2012

ZDB-ID: 2088489-8

ZDB-ID: 186012-4

8

Online Resource

Agile paging : exceeding the best of nested and shadow paging

Gandhi, Jayneel ; Hill, Mark D. ; Swift, Michael M.

Association for Computing Machinery (ACM) ; 2016

In: ACM SIGARCH Computer Architecture News Vol. 44, No. 3 ( 2016-10-12), p. 707-718

In: ACM SIGARCH Computer Architecture News, Association for Computing Machinery (ACM), Vol. 44, No. 3 ( 2016-10-12), p. 707-718

Abstract: Virtualization provides benefits for many workloads, but the overheads of virtualizing memory are not universally low. The cost comes from managing two levels of address translation, one in the guest virtual machine (VM) and the other in the host virtual machine monitor (VMM), with either nested or shadow paging. Nested paging directly performs a two-level page walk that makes TLB misses slower than unvirtualized native, but enables fast page table changes. Alternatively, shadow paging restores native TLB miss speeds, but requires costly VMM intervention on page table updates. This paper proposes agile paging that combines both techniques and exceeds the best of both. A virtualized page walk starts with shadow paging and optionally switches in the same page walk to nested paging where frequent page table updates would cause costly VMM interventions. Agile paging enables most TLB misses to be handled as fast as native while most page table changes avoid VMM intervention. It requires modest changes to hardware (e.g., demark when to switch) and VMM policies (e.g., predict good switching opportunities). We emulate the proposed hardware and prototype the software in Linux with KVM on x86-64. Agile paging performs more than 12% better than the best of the two techniques and comes within 4% of native execution for all workloads.

Type of Medium: Online Resource

ISSN: 0163-5964

DOI: 10.1145/3007787.3001212

RVK: SS 1985

Language: English

Publisher: Association for Computing Machinery (ACM)

Publication Date: 2016

ZDB-ID: 2088489-8

ZDB-ID: 186012-4

9

Online Resource

An Analysis of Persistent Memory Use with WHISPER

Nalli, Sanketh ; Haria, Swapnil ; Hill, Mark D. ; [et al.]

Association for Computing Machinery (ACM) ; 2017

In: ACM SIGARCH Computer Architecture News Vol. 45, No. 1 ( 2017-05-11), p. 135-148

In: ACM SIGARCH Computer Architecture News, Association for Computing Machinery (ACM), Vol. 45, No. 1 ( 2017-05-11), p. 135-148

Abstract: Emerging non-volatile memory (NVM) technologies promise durability with read and write latencies comparable to volatile memory (DRAM). We define Persistent Memory (PM) as NVM accessed with byte addressability at low latency via normal memory instructions. Persistent-memory applications ensure the consistency of persistent data by inserting ordering points between writes to PM, allowing the construction of higher-level transaction mechanisms. An epoch is a set of writes to PM between ordering points. To put systems research in PM on a firmer footing, we developed and analyzed a PM benchmark suite called WHISPER (Wisconsin-HP Labs Suite for Persistence) that comprises ten PM applications we gathered to cover all current interfaces to PM. A quantitative analysis reveals several insights: (a) only 4% of writes in PM-aware applications are to PM and the rest are to volatile memory, (b) software transactions are often implemented with 5 to 50 ordering points, (c) 75% of epochs update exactly one 64B cache line, and (d) 80% of epochs from the same thread depend on previous epochs from the same thread, while few epochs depend on epochs from other threads. Based on our analysis, we propose the Hands-off Persistence System (HOPS) to track updates to PM in hardware. Current hardware design requires applications to force data to PM as each epoch ends. HOPS provides high-level ISA primitives for applications to express durability and ordering constraints separately and enforces them automatically, while achieving 24.3% better performance over current approaches to persistence.
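
Note: the abstract's definition of an epoch (the PM writes between two ordering points) and its per-epoch cache-line statistic can be made concrete with a short trace-processing sketch. This is an illustration only, not the WHISPER tooling: the trace format (tuples of address and size, with a `"fence"` string marking an ordering point) and the function names are hypothetical, and a trailing run of writes with no closing ordering point is counted here as one open epoch.

```python
CACHE_LINE = 64   # bytes, matching the 64B line size named in the abstract
FENCE = "fence"   # hypothetical marker for an ordering point in the trace


def split_epochs(trace):
    """Group PM writes into epochs; each epoch is a set of cache-line indices."""
    epochs = []
    current = set()
    for event in trace:
        if event == FENCE:
            # An ordering point closes the current epoch, if it has any writes.
            if current:
                epochs.append(current)
                current = set()
        else:
            addr, size = event  # assumes size >= 1
            first = addr // CACHE_LINE
            last = (addr + size - 1) // CACHE_LINE
            current.update(range(first, last + 1))
    if current:
        epochs.append(current)  # open epoch at the end of the trace
    return epochs


def single_line_fraction(epochs):
    """Fraction of epochs that update exactly one cache line."""
    if not epochs:
        return 0.0
    return sum(1 for lines in epochs if len(lines) == 1) / len(epochs)


trace = [(0, 8), (8, 8), FENCE, (60, 8), FENCE, FENCE, (128, 64), FENCE]
epochs = split_epochs(trace)
fraction = single_line_fraction(epochs)
```

In the sample trace, the first and third epochs each stay within one line, while the write at address 60 straddles two lines, so two of the three epochs are single-line epochs.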

Type of Medium: Online Resource

ISSN: 0163-5964

DOI: 10.1145/3093337.3037730

RVK: SS 1985

Language: English

Publisher: Association for Computing Machinery (ACM)

Publication Date: 2017

ZDB-ID: 2088489-8

ZDB-ID: 186012-4

10

Online Resource

Performance pathologies in hardware transactional memory

Bobba, Jayaram ; Moore, Kevin E. ; Volos, Haris ; [et al.]

Association for Computing Machinery (ACM) ; 2007

In: ACM SIGARCH Computer Architecture News Vol. 35, No. 2 ( 2007-06-09), p. 81-91

In: ACM SIGARCH Computer Architecture News, Association for Computing Machinery (ACM), Vol. 35, No. 2 ( 2007-06-09), p. 81-91

Abstract: Hardware Transactional Memory (HTM) systems reflect choices from three key design dimensions: conflict detection, version management, and conflict resolution. Previously proposed HTMs represent three points in this design space: lazy conflict detection, lazy version management, committer wins (LL); eager conflict detection, lazy version management, requester wins (EL); and eager conflict detection, eager version management, and requester stalls with conservative deadlock avoidance (EE). To isolate the effects of these high-level design decisions, we develop a common framework that abstracts away differences in cache write policies, interconnects, and ISA to compare these three design points. Not surprisingly, the relative performance of these systems depends on the workload. Under light transactional loads they perform similarly, but under heavy loads they differ by up to 80%. None of the systems performs best on all of our benchmarks. We identify seven performance pathologies (interactions between workload and system that degrade performance) as the root cause of many performance differences: FriendlyFire, StarvingWriter, SerializedCommit, FutileStall, StarvingElder, RestartConvoy, and DuelingUpgrades. We discuss when and on which systems these pathologies can occur and show that they actually manifest within TM workloads. The insight provided by these pathologies motivated four enhanced systems that often significantly reduce transactional memory overhead. Importantly, by avoiding transaction pathologies, each enhanced system performs well across our suite of benchmarks.

Type of Medium: Online Resource

ISSN: 0163-5964

DOI: 10.1145/1273440.1250674

RVK: SS 1985

Language: English

Publisher: Association for Computing Machinery (ACM)

Publication Date: 2007

ZDB-ID: 2088489-8

ZDB-ID: 186012-4
