GLORIA

GEOMAR Library Ocean Research Information Access

  • 1
    Online Resource
    Association for Computing Machinery (ACM) ; 2017
    In: ACM SIGARCH Computer Architecture News, Association for Computing Machinery (ACM), Vol. 45, No. 1 (2017-05-11), p. 765-777
    Abstract: NUMA (non-uniform memory access) servers are commonly used in high-performance computing and datacenters. Within each server, a processor-interconnect (e.g., Intel QPI, AMD HyperTransport) is used to communicate between the different sockets or nodes. In this work, we explore the impact of the processor-interconnect on overall performance, in particular the performance unfairness caused by processor-interconnect arbitration. It is well known that locally-fair arbitration does not guarantee globally-fair bandwidth sharing as closer nodes receive more bandwidth in a multi-hop network. However, this work demonstrates that the opposite can occur in a commodity NUMA server where remote nodes receive higher bandwidth (and perform better). We analyze this problem and identify that this occurs because of external concentration used in router micro-architectures for processor-interconnects without globally-aware arbitration. While accessing remote memory can occur in any NUMA system, performance unfairness (or performance variation) is more critical in cloud computing and virtual machines with shared resources. We demonstrate how this unfairness creates significant performance variation when a workload is executed on the Xen virtualization platform. We then provide analysis using synthetic workloads to better understand the source of unfairness and eliminate the impact of other shared resources, including the shared last-level cache and main memory. To provide fairness, we propose a novel, history-based arbitration that tracks the history of arbitration grants made in the previous history window. A weighted arbitration is done based on the history to provide global fairness. Through simulations, we show our proposed history-based arbitration can provide global fairness and minimize the processor-interconnect performance unfairness at low cost.
    Type of Medium: Online Resource
    ISSN: 0163-5964
    Language: English
    Publisher: Association for Computing Machinery (ACM)
    Publication Date: 2017
    ZDB-ID: 2088489-8
    ZDB-ID: 186012-4
  • 2
    Online Resource
    Institute of Electrical and Electronics Engineers (IEEE) ; 2020
    In: IEEE Transactions on Multimedia, Institute of Electrical and Electronics Engineers (IEEE), Vol. 22, No. 4 (2020-4), p. 980-991
    Type of Medium: Online Resource
    ISSN: 1520-9210 , 1941-0077
    Language: Unknown
    Publisher: Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2020
    ZDB-ID: 2033070-4
  • 3
    Online Resource
    Oxford University Press (OUP) ; 2018
    In: The Computer Journal, Oxford University Press (OUP), Vol. 61, No. 2 (2018-02-01), p. 264-272
    Type of Medium: Online Resource
    ISSN: 0010-4620 , 1460-2067
    Language: English
    Publisher: Oxford University Press (OUP)
    Publication Date: 2018
    ZDB-ID: 1477172-X
  • 4
    Online Resource
    Institute of Electrical and Electronics Engineers (IEEE) ; 2015
    In: IEEE Transactions on Visualization and Computer Graphics, Institute of Electrical and Electronics Engineers (IEEE), Vol. 21, No. 3 (2015-3-1), p. 389-401
    Type of Medium: Online Resource
    ISSN: 1077-2626
    Language: Unknown
    Publisher: Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015
    ZDB-ID: 2027333-2
  • 5
    Online Resource
    Elsevier BV ; 2011
    In: Theoretical Computer Science, Elsevier BV, Vol. 412, No. 35 (2011-08), p. 4636-4649
    Type of Medium: Online Resource
    ISSN: 0304-3975
    Language: English
    Publisher: Elsevier BV
    Publication Date: 2011
    ZDB-ID: 193706-6
    ZDB-ID: 1466347-8
  • 6
    Online Resource
    Association for Computing Machinery (ACM) ; 2013
    In: ACM SIGARCH Computer Architecture News, Association for Computing Machinery (ACM), Vol. 41, No. 3 (2013-06-26), p. 380-391
    Abstract: DRAM has been a de facto standard for main memory, and advances in process technology have led to a rapid increase in its capacity and bandwidth. In contrast, its random access latency has remained relatively stagnant, as it is still around 100 CPU clock cycles. Modern computer systems rely on caches or other latency tolerance techniques to lower the average access latency. However, not all applications have ample parallelism or locality that would help hide or reduce the latency. Moreover, applications' demands for memory space continue to grow, while the capacity gap between last-level caches and main memory is unlikely to shrink. Consequently, reducing the main-memory latency is important for application performance. Unfortunately, previous proposals have not adequately addressed this problem, as they have focused only on improving the bandwidth and capacity or reduced the latency at the cost of significant area overhead. We propose asymmetric DRAM bank organizations to reduce the average main-memory access latency. We first analyze the access and cycle times of a modern DRAM device to identify key delay components for latency reduction. Then we reorganize a subset of DRAM banks to reduce their access and cycle times by half with low area overhead. By synergistically combining these reorganized DRAM banks with support for non-uniform bank accesses, we introduce a novel DRAM bank organization with center high-aspect-ratio mats called CHARM. Experiments on a simulated chip-multiprocessor system show that CHARM improves both the instructions per cycle and system-wide energy-delay product up to 21% and 32%, respectively, with only a 3% increase in die area.
    Type of Medium: Online Resource
    ISSN: 0163-5964
    Language: English
    Publisher: Association for Computing Machinery (ACM)
    Publication Date: 2013
    ZDB-ID: 2088489-8
    ZDB-ID: 186012-4