GLORIA

GEOMAR Library Ocean Research Information Access

  • 1
    Online Resource
    Institute of Electrical and Electronics Engineers (IEEE); 2008
    In: IEEE Micro, Institute of Electrical and Electronics Engineers (IEEE), Vol. 28, No. 1 (2008-1), p. 99-109
    Type of Medium: Online Resource
    ISSN: 0272-1732
    Language: Unknown
    Publisher: Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2008
    ZDB ID: 2027750-7
  • 2
    Online Resource
    Institute of Electrical and Electronics Engineers (IEEE); 2016
    In: IEEE Micro, Institute of Electrical and Electronics Engineers (IEEE), Vol. 36, No. 3 (2016-5), p. 118-126
    Type of Medium: Online Resource
    ISSN: 0272-1732, 1937-4143
    Language: Unknown
    Publisher: Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016
    ZDB ID: 2027750-7
  • 3
    Online Resource
    Institute of Electrical and Electronics Engineers (IEEE); 2008
    In: IEEE Micro, Institute of Electrical and Electronics Engineers (IEEE), Vol. 28, No. 1 (2008-1), p. 32-41
    Type of Medium: Online Resource
    ISSN: 0272-1732, 1937-4143
    Language: Unknown
    Publisher: Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2008
    ZDB ID: 2027750-7
  • 4
    Online Resource
    Association for Computing Machinery (ACM); 2013
    In: ACM SIGARCH Computer Architecture News, Association for Computing Machinery (ACM), Vol. 41, No. 3 (2013-06-26), p. 237-248
    Abstract: Our analysis shows that many "big-memory" server workloads, such as databases, in-memory caches, and graph analytics, pay a high cost for page-based virtual memory. They consume as much as 10% of execution cycles on TLB misses, even using large pages. On the other hand, we find that these workloads use read-write permission on most pages, are provisioned not to swap, and rarely benefit from the full flexibility of page-based virtual memory. To remove the TLB miss overhead for big-memory workloads, we propose mapping part of a process's linear virtual address space with a direct segment, while page mapping the rest of the virtual address space. Direct segments use minimal hardware (base, limit and offset registers per core) to map contiguous virtual memory regions directly to contiguous physical memory. They eliminate the possibility of TLB misses for key data structures such as database buffer pools and in-memory key-value stores. Memory mapped by a direct segment may be converted back to paging when needed. We prototype direct-segment software support for x86-64 in Linux and emulate direct-segment hardware. For our workloads, direct segments eliminate almost all TLB misses and reduce the execution time wasted on TLB misses to less than 0.5%.
    Type of Medium: Online Resource
    ISSN: 0163-5964
    Language: English
    Publisher: Association for Computing Machinery (ACM)
    Publication Date: 2013
    ZDB ID: 2088489-8
    ZDB ID: 186012-4
  • 5
    Online Resource
    Association for Computing Machinery (ACM); 2005
    In: ACM SIGARCH Computer Architecture News, Association for Computing Machinery (ACM), Vol. 33, No. 4 (2005-11), p. 92-99
    Abstract: The Wisconsin Multifacet Project has created a simulation toolset to characterize and evaluate the performance of multiprocessor hardware systems commonly used as database and web servers. We leverage an existing full-system functional simulation infrastructure (Simics [14]) as the basis around which to build a set of timing simulator modules for modeling the timing of the memory system and microprocessors. This simulator infrastructure enables us to run architectural experiments using a suite of scaled-down commercial workloads [3]. To enable other researchers to more easily perform such research, we have released these timing simulator modules as the Multifacet General Execution-driven Multiprocessor Simulator (GEMS) Toolset, release 1.0, under GNU GPL [9].
    Type of Medium: Online Resource
    ISSN: 0163-5964
    Language: English
    Publisher: Association for Computing Machinery (ACM)
    Publication Date: 2005
    ZDB ID: 2088489-8
    ZDB ID: 186012-4
  • 6
    Online Resource
    Association for Computing Machinery (ACM); 2007
    In: ACM SIGARCH Computer Architecture News, Association for Computing Machinery (ACM), Vol. 35, No. 2 (2007-06-09), p. 46-56
    Abstract: Server consolidation is becoming an increasingly popular technique to manage and utilize systems. This paper develops CMP memory systems for server consolidation where most sharing occurs within Virtual Machines (VMs). Our memory systems maximize shared memory accesses serviced within a VM, minimize interference among separate VMs, facilitate dynamic reassignment of VMs to processors and memory, and support content-based page sharing among VMs. We begin with a tiled architecture where each of 64 tiles contains a processor, private L1 caches, and an L2 bank. First, we reveal why single-level directory designs fail to meet workload consolidation goals. Second, we develop the paper's central idea of imposing a two-level virtual (or logical) coherence hierarchy on a physically flat CMP that harmonizes with VM assignment. Third, we show that the best of our two virtual hierarchy (VH) variants performs 12-58% better than the best alternative flat directory protocol when consolidating Apache, OLTP, and Zeus commercial workloads on our simulated 64-core CMP.
    Type of Medium: Online Resource
    ISSN: 0163-5964
    Language: English
    Publisher: Association for Computing Machinery (ACM)
    Publication Date: 2007
    ZDB ID: 2088489-8
    ZDB ID: 186012-4
  • 7
    Online Resource
    Association for Computing Machinery (ACM); 2012
    In: ACM SIGARCH Computer Architecture News, Association for Computing Machinery (ACM), Vol. 40, No. 3 (2012-09-05), p. 297-308
    Abstract: Most modern cores perform a highly-associative translation lookaside buffer (TLB) lookup on every memory access. These designs often hide the TLB lookup latency by overlapping it with L1 cache access, but this overlap does not hide the power dissipated by TLB lookups. It can even exacerbate the power dissipation by requiring a higher-associativity L1 cache. With today's concern for power dissipation, designs could instead adopt a virtual L1 cache, wherein TLB access power is dissipated only after L1 cache misses. Unfortunately, virtual caches have compatibility issues, such as supporting writeable synonyms and x86's physical page table walker. This work proposes an Opportunistic Virtual Cache (OVC) that exposes virtual caching as a dynamic optimization by allowing some memory blocks to be cached with virtual addresses and others with physical addresses. OVC relies on small OS changes to signal which pages can use virtual caching (e.g., no writeable synonyms), but defaults to physical caching for compatibility. We show OVC's promise with analysis that finds virtual cache problems exist, but are dynamically rare. We change 240 lines in Linux 2.6.28 to enable OVC. On experiments with Parsec and commercial workloads, the resulting system saves 94-99% of TLB lookup energy and nearly 23% of L1 cache dynamic lookup energy.
    Type of Medium: Online Resource
    ISSN: 0163-5964
    Language: English
    Publisher: Association for Computing Machinery (ACM)
    Publication Date: 2012
    ZDB ID: 2088489-8
    ZDB ID: 186012-4
  • 8
    Online Resource
    Association for Computing Machinery (ACM); 2016
    In: ACM SIGARCH Computer Architecture News, Association for Computing Machinery (ACM), Vol. 44, No. 3 (2016-10-12), p. 707-718
    Abstract: Virtualization provides benefits for many workloads, but the overheads of virtualizing memory are not universally low. The cost comes from managing two levels of address translation, one in the guest virtual machine (VM) and the other in the host virtual machine monitor (VMM), with either nested or shadow paging. Nested paging directly performs a two-level page walk that makes TLB misses slower than unvirtualized native, but enables fast page table changes. Alternatively, shadow paging restores native TLB miss speeds, but requires costly VMM intervention on page table updates. This paper proposes agile paging that combines both techniques and exceeds the best of both. A virtualized page walk starts with shadow paging and optionally switches in the same page walk to nested paging where frequent page table updates would cause costly VMM interventions. Agile paging enables most TLB misses to be handled as fast as native while most page table changes avoid VMM intervention. It requires modest changes to hardware (e.g., demark when to switch) and VMM policies (e.g., predict good switching opportunities). We emulate the proposed hardware and prototype the software in Linux with KVM on x86-64. Agile paging performs more than 12% better than the best of the two techniques and comes within 4% of native execution for all workloads.
    Type of Medium: Online Resource
    ISSN: 0163-5964
    Language: English
    Publisher: Association for Computing Machinery (ACM)
    Publication Date: 2016
    ZDB ID: 2088489-8
    ZDB ID: 186012-4
  • 9
    Online Resource
    Association for Computing Machinery (ACM); 2017
    In: ACM SIGARCH Computer Architecture News, Association for Computing Machinery (ACM), Vol. 45, No. 1 (2017-05-11), p. 135-148
    Abstract: Emerging non-volatile memory (NVM) technologies promise durability with read and write latencies comparable to volatile memory (DRAM). We define Persistent Memory (PM) as NVM accessed with byte addressability at low latency via normal memory instructions. Persistent-memory applications ensure the consistency of persistent data by inserting ordering points between writes to PM, allowing the construction of higher-level transaction mechanisms. An epoch is a set of writes to PM between ordering points. To put systems research in PM on a firmer footing, we developed and analyzed a PM benchmark suite called WHISPER (Wisconsin-HP Labs Suite for Persistence) that comprises ten PM applications we gathered to cover all current interfaces to PM. A quantitative analysis reveals several insights: (a) only 4% of writes in PM-aware applications are to PM and the rest are to volatile memory, (b) software transactions are often implemented with 5 to 50 ordering points, (c) 75% of epochs update exactly one 64B cache line, (d) 80% of epochs from the same thread depend on previous epochs from the same thread, while few epochs depend on epochs from other threads. Based on our analysis, we propose the Hands-off Persistence System (HOPS) to track updates to PM in hardware. Current hardware design requires applications to force data to PM as each epoch ends. HOPS provides high-level ISA primitives for applications to express durability and ordering constraints separately and enforces them automatically, while achieving 24.3% better performance over current approaches to persistence.
    Type of Medium: Online Resource
    ISSN: 0163-5964
    Language: English
    Publisher: Association for Computing Machinery (ACM)
    Publication Date: 2017
    ZDB ID: 2088489-8
    ZDB ID: 186012-4
  • 10
    Online Resource
    Association for Computing Machinery (ACM); 2007
    In: ACM SIGARCH Computer Architecture News, Association for Computing Machinery (ACM), Vol. 35, No. 2 (2007-06-09), p. 81-91
    Abstract: Hardware Transactional Memory (HTM) systems reflect choices from three key design dimensions: conflict detection, version management, and conflict resolution. Previously proposed HTMs represent three points in this design space: lazy conflict detection, lazy version management, committer wins (LL); eager conflict detection, lazy version management, requester wins (EL); and eager conflict detection, eager version management, and requester stalls with conservative deadlock avoidance (EE). To isolate the effects of these high-level design decisions, we develop a common framework that abstracts away differences in cache write policies, interconnects, and ISA to compare these three design points. Not surprisingly, the relative performance of these systems depends on the workload. Under light transactional loads they perform similarly, but under heavy loads they differ by up to 80%. None of the systems performs best on all of our benchmarks. We identify seven performance pathologies (interactions between workload and system that degrade performance) as the root cause of many performance differences: FriendlyFire, StarvingWriter, SerializedCommit, FutileStall, StarvingElder, RestartConvoy, and DuelingUpgrades. We discuss when and on which systems these pathologies can occur and show that they actually manifest within TM workloads. The insight provided by these pathologies motivated four enhanced systems that often significantly reduce transactional memory overhead. Importantly, by avoiding transaction pathologies, each enhanced system performs well across our suite of benchmarks.
    Type of Medium: Online Resource
    ISSN: 0163-5964
    Language: English
    Publisher: Association for Computing Machinery (ACM)
    Publication Date: 2007
    ZDB ID: 2088489-8
    ZDB ID: 186012-4