Abstractions for C++ code optimizations in parallel high-performance applications

dc.contributor.author: Klepl, Jiří
dc.contributor.author: Šmelko, Adam
dc.contributor.author: Rozsypal, Lukáš
dc.contributor.author: Kruliš, Martin
dc.date.accessioned: 2024-11-19T10:10:52Z
dc.date.available: 2024-11-19T10:10:52Z
dc.date.issued: 2024
dc.identifier.uri: https://hdl.handle.net/20.500.14178/2705
dc.description.abstract: Many computational problems consider memory throughput a performance bottleneck, especially in the domain of parallel computing. Software needs to be attuned to hardware features like cache architectures or concurrent memory banks to reach a decent level of performance efficiency. This can be achieved by selecting the right memory layouts for data structures or changing the order of data structure traversal. In this work, we present an abstraction for traversing a set of regular data structures (e.g., multidimensional arrays) that allows the design of traversal-agnostic algorithms. Such algorithms can easily optimize for memory performance and employ semi-automated parallelization or autotuning without altering their internal code. We also add an abstraction for autotuning that allows defining tuning parameters in one place and removes boilerplate code. The proposed solution was implemented as an extension of the Noarr library that simplifies a layout-agnostic design of regular data structures. It is implemented entirely using C++ template meta-programming without any nonstandard dependencies, so it is fully compatible with existing compilers, including CUDA NVCC or Intel DPC++. We evaluate the performance and expressiveness of our approach on the Polybench-C benchmarks. [en]
dc.language.iso: en
dc.relation.url: https://doi.org/10.1016/j.parco.2024.103096
dc.rights: Creative Commons Uveďte původ 4.0 International [cs]
dc.rights: Creative Commons Attribution 4.0 International [en]
dc.title: Abstractions for C++ code optimizations in parallel high-performance applications [en]
dcterms.accessRights: openAccess
dcterms.license: https://creativecommons.org/licenses/by/4.0/legalcode
dc.date.updated: 2025-03-13T06:13:02Z
dc.subject.keyword: Autotuning [en]
dc.subject.keyword: Code optimization [en]
dc.subject.keyword: Parallel programming [en]
dc.subject.keyword: Plain C++ [en]
dc.subject.keyword: Regular data structure [en]
dc.subject.keyword: Traversal [en]
dc.identifier.eissn: 1872-7336
dc.relation.fundingReference: info:eu-repo/grantAgreement/MSM//SVV260698
dc.relation.fundingReference: info:eu-repo/grantAgreement/UK/GAUK/GAUK269723
dc.date.embargoStartDate: 2025-03-13
dc.type.obd: 73
dc.type.version: info:eu-repo/semantics/publishedVersion
dc.identifier.doi: 10.1016/j.parco.2024.103096
dc.identifier.utWos: 001299241000001
dc.identifier.eidScopus: 2-s2.0-85201453256
dc.identifier.obd: 650922
dc.subject.rivPrimary: 10000::10200::10201
dc.relation.datasetUrl: https://doi.org/10.5281/zenodo.12687162
dcterms.isPartOf.name: Parallel Computing
dcterms.isPartOf.issn: 0167-8191
dcterms.isPartOf.journalYear: 2024
dcterms.isPartOf.journalVolume: 121
dcterms.isPartOf.journalIssue: September 2024
uk.faculty.primaryId: 116
uk.faculty.primaryName: Matematicko-fyzikální fakulta [cs]
uk.faculty.primaryName: Faculty of Mathematics and Physics [en]
uk.department.primaryId: 1850
uk.department.primaryName: Katedra distribuovaných a spolehlivých systémů [cs]
uk.department.primaryName: Department of Distributed and Dependable Systems [en]
dc.type.obdHierarchyCs: ČLÁNEK V ČASOPISU::článek v časopisu::původní článek [cs]
dc.type.obdHierarchyEn: JOURNAL ARTICLE::journal article::original article [en]
dc.type.obdHierarchyCode: 73::152::206 [en]
uk.displayTitle: Abstractions for C++ code optimizations in parallel high-performance applications [en]


