, HPF: High Performance Fortran
, HPX: High Performance ParalleX
, MPI: Message Passing Interface
, Polyhedral Benchmark Suite
Representation Extraction Tool for C-Based High Level Languages, 2014. ,
, , 2017.
, Compilers: Principles, Techniques, and Tools, 1986.
Control flow analysis, Proceedings of a Symposium on Compiler Optimization, pp.1-19, 1970. ,
PENCIL: A Platform-Neutral Compute Intermediate Language for Accelerator Programming, Proceedings of the 24th International Conference on Parallel Architectures and Compilation Techniques, p.15, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01257236
PENCIL Language Specification, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01154812
Opening polyhedral compiler's black box, Proceedings of the 2016 International Symposium on Code Generation and Optimization, pp.128-138, 2016. ,
Code Generation in the Polyhedral Model is Easier than You Think, Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT'04), 2004. ,
URL : https://hal.archives-ouvertes.fr/hal-00017260
Polyhedral Analysis for the OpenMP Programmer, Proceedings of the 7th International Conference on OpenMP in the Petascale Era, pp.37-53, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00752626
Synthesis of high-performance parallel programs for a class of ab initio quantum chemistry models, Proceedings of the IEEE, vol.93, issue.2, pp.276-292, 2005. ,
VOBLA: A Vehicle for Optimized Basic Linear Algebra, SIGPLAN Notices, vol.49, pp.115-124, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-01508181
Intermediate Representation for Heterogeneous Multi-Core: A Survey, VLSI Systems, Architecture, Technology and Applications, pp.1-6, 2015. ,
Extending GCC with a Multi-Grain Parallelism Adaptation Framework for MPSoCs, GCC for Research Opportunities Workshop, 2010. ,
Kimble: a Hierarchical Intermediate Representation for Multi-Grain Parallelism, Proceedings of the Workshop on Intermediate Representations, pp.21-28, 2011. ,
Using an Intermediate Representation to Map Workloads on Heterogeneous Parallel Systems, PDP'16, pp.811-819, 2016. ,
Theano: a CPU and GPU math expression compiler, Proceedings of the Python for Scientific Computing Conference (SciPy), 2010. ,
Extending OpenMP for NUMA Machines, Proceedings of the 2000 ACM/IEEE Conference on Supercomputing, 2000. ,
Compiling Affine Loop Nests for Distributed-memory Parallel Architectures, Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, vol.33, p.12, 2013. ,
A practical automatic polyhedral program optimization system, ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2008. ,
hwloc: a Generic Framework for Managing Hardware Affinities in HPC Applications, PDP 2010 - The 18th Euromicro International Conference on Parallel, Distributed and Network-Based Computing, 2010. ,
ForestGOMP: An efficient OpenMP environment for NUMA architectures, International Journal of Parallel Programming, vol.38, pp.418-439, 2010. ,
A Heterogeneous Parallel Framework for Domain-Specific Languages, Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, pp.89-100, 2011. ,
Software-based speculative parallelism, 3rd ACM Workshop on Feedback-Directed and Dynamic Optimization, p.3, 1998. ,
, Proceedings of the 9th International Conference on Principles and Practice of Programming in Java, pp.51-61, 2011.
Static Single Assignment Form for Message-Passing Programs, Int. J. Parallel Program, vol.29, pp.139-184, 2001. ,
Polyhedral Transformations of Explicitly Parallel Programs, Proceedings of the Fifth International Workshop on Polyhedral Compilation Techniques, p.15, 2015. ,
Static Data Race Detection for SPMD Programs via an Extended Polyhedral Representation, Proceedings of the Sixth International Workshop on Polyhedral Compilation Techniques, p.16, 2016. ,
CHiLL: A framework for composing high-level loop transformations, 2008. ,
Typesafe abstractions for tensor operations (short paper), Proceedings of the 8th ACM SIGPLAN International Symposium on Scala, pp.45-50, 2017. ,
DOI : 10.1145/3136000.3136001
URL : http://arxiv.org/pdf/1710.06892
TVM: end-to-end optimization stack for deep learning, 2018. ,
Stream Compilation for Real-Time Embedded Multicore Systems, Proceedings of the 7th Annual IEEE/ACM International Symposium on Code Generation and Optimization, pp.210-220, 2009. ,
DOI : 10.1109/cgo.2009.27
URL : http://cccp.eecs.umich.edu/papers/ychoi-cgo09.pdf
Static Analysis of OpenStream Programs, Proceedings of the Sixth International Workshop on Polyhedral Compilation Techniques, p.16, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01251845
A Polyhedral Approach to Ease the Composition of Program Transformations, pp.292-303, 2004. ,
URL : https://hal.archives-ouvertes.fr/hal-01257301
Facilitating the search for compositions of program transformations, Proceedings of the 19th Annual International Conference on Supercomputing, pp.151-160, 2005. ,
URL : https://hal.archives-ouvertes.fr/hal-01257296
Array SSA for Explicitly Parallel Programs, Proceedings of the 5th International Euro-Par Conference on Parallel Processing, pp.383-390, 1999. ,
Efficiently Computing Static Single Assignment Form and the Control Dependence Graph, ACM Trans. Program. Lang. Syst, vol.13, pp.451-490, 1991. ,
Liveness Analysis in Explicitly Parallel Programs, Proceedings of the Sixth International Workshop on Polyhedral Compilation Techniques, p.16, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01251843
Traffic management: A holistic approach to memory placement on NUMA systems, Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems, pp.381-394, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00945758
Memory access coalescing: A technique for eliminating redundant memory accesses, Proceedings of the ACM SIGPLAN 1994 Conference on Programming Language Design and Implementation, pp.186-195, 1994. ,
High-order and high accurate CFD methods and their applications for complex grid problems, Communications in Computational Physics, vol.11, pp.1081-1102, 2012. ,
A Language for the Compact Representation of Multiple Program Versions, pp.136-151, 2006. ,
URL : https://hal.archives-ouvertes.fr/hal-00141067
Xfor: Filling the gap between automatic loop optimization and peak performance, 14th International Symposium on Parallel and Distributed Computing, pp.100-109, 2015. ,
DOI : 10.1109/ispdc.2015.19
URL : https://hal.archives-ouvertes.fr/hal-01155144
, Polyhedron Model. Springer US, pp.1581-1592, 2011.
The Program Dependence Graph and Its Use in Optimization, ACM Trans. Program. Lang. Syst, vol.9, issue.3, pp.319-349, 1987. ,
DOI : 10.1007/3-540-12925-1_33
URL : http://www.cs.utexas.edu/users/less/reading/spring00/ferrante.pdf
Enabling High-performance Memory Migration for Multithreaded Applications on LINUX, Proceedings of the 2009 IEEE International Symposium on Parallel&Distributed Processing, pp.1-9, 2009. ,
URL : https://hal.archives-ouvertes.fr/inria-00358172
Polly - Polyhedral Optimization in LLVM, Proceedings of the Sixth International Workshop on Polyhedral Compilation Techniques, p.11, 2011. ,
Data layout transformation for stencil computations on short-vector SIMD architectures, Proceedings of the 20th International Conference on Compiler Construction: Part of the Joint European Conferences on Theory and Practice of Software, pp.225-245, 2011. ,
DOI : 10.1007/978-3-642-19861-8_13
URL : http://users.ece.cmu.edu/~franzf/papers/cc2011.pdf
Enabling Locality-aware Computations in OpenMP, Sci. Program, vol.18, pp.169-181, 2010. ,
Fast Static Condensation for the Helmholtz Equation in a Spectral-Element Discretization, pp.371-380, 2016. ,
Factorizing the factorization - a spectral-element solver for elliptic equations with linear operation count, Journal of Computational Physics, vol.346, pp.437-448, 2017. ,
Analysis and tuning of libtensor framework on multicore architectures, 21st International Conference on High Performance Computing, HiPC, pp.1-10, 2014. ,
Dynamic and speculative polyhedral parallelization using compiler-generated skeletons, Int. J. Parallel Program, vol.42, pp.529-545, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-00825738
INSPIRE: The Insieme Parallel Intermediate Representation, Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, pp.7-18, 2013. ,
Automatic speculative POLyhedral Loop Optimizer, Proceedings of the Sixth International Workshop on Polyhedral Compilation Techniques, IMPACT '17 ,
URL : https://hal.archives-ouvertes.fr/hal-01533692
Optimizing Compilers for Modern Architectures: A Dependence-based Approach, 2002. ,
Automatic Resource-Constrained Static Task Parallelization, 2013. ,
URL : https://hal.archives-ouvertes.fr/pastel-00935483
SPIRE : A Methodology for Sequential to Parallel Intermediate Representation Extension, HiPEAC Computing Systems Week, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00823324
LLVM Parallel Intermediate Representation: Design and Evaluation Using OpenSHMEM Communications, Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, vol.2, pp.1-2, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01254368
The tensor algebra compiler, Proc. ACM Program. Lang, vol.1, p.29, 2017. ,
Loo.py: Transformation-based code generation for GPUs and CPUs, Proceedings of ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming, vol.82, p.87, 2014. ,
Embedded processor design challenges, pp.171-187, 2002. ,
Array SSA Form and Its Use in Parallelization, Proceedings of the 25th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pp.107-120, 1998. ,
A safe approximate algorithm for interprocedural aliasing, Proceedings of the ACM SIGPLAN 1992 Conference on Programming Language Design and Implementation, pp.235-248, 1992. ,
Basic Compiler Algorithms for Parallel Programs, Proceedings of the Seventh ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp.1-12, 1999. ,
Static Nonconcurrency Analysis of OpenMP Programs, OpenMP Shared Memory Parallel Programming, pp.36-50, 2008. ,
POSH: A TLS compiler that exploits program structure, Proceedings of the Eleventh ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp.158-167, 2006. ,
PolyLib: A library for manipulating parameterized polyhedra, 1999. ,
COFFEE: an optimizing compiler for finite element local assembly, 2014. ,
Matching Memory Access Patterns and Data Placement for NUMA Systems, Proceedings of the Tenth International Symposium on Code Generation and Optimization, pp.230-241, 2012. ,
A Library for Portable and Composable Data Locality Optimizations for NUMA Systems, Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp.227-238, 2015. ,
, , pp.1756-1765, 2011.
Reconciling Languages, Runtimes, Compilation and Optimizations for Streaming Applications, 2013. ,
URL : https://hal.archives-ouvertes.fr/tel-00840333
Erbium: A Deterministic, Concurrent Intermediate Representation to Map Data-flow Tasks to Scalable, Persistent Streaming Processes, Proceedings of the 2010 International Conference on Compilers, Architectures and Synthesis for Embedded Systems, pp.11-20, 2010. ,
URL : https://hal.archives-ouvertes.fr/inria-00551510
Locality-Aware Task Scheduling and Data Distribution for OpenMP Programs on NUMA Systems and Manycore Processors, Scientific Programming, p.2015, 2015. ,
Optimizing cache access: A tool for source-to-source transformations and real-life compiler tests, Euro-Par 2004 Parallel Processing, pp.72-81, 2004. ,
Abstractions for Specifying Sparse Matrix Data Transformations, Proceedings of the Eighth International Workshop on Polyhedral Compilation Techniques, 2018. ,
CIL: Intermediate language and tools for analysis and transformation of C programs, Proceedings of the 11th International Conference on Compiler Construction, pp.213-228, 2002. ,
Concurrent SSA Form in the Presence of Mutual Exclusion, Proceedings of the 1998 International Conference on Parallel Processing, p.356, 1998. ,
PLASMA: Portable Programming for SIMD Heterogeneous Accelerators, Proceedings of the Workshop on Language, Compiler, and Architecture Support for GPGPU, 2010. ,
On Simplifying and Optimizing Message Passing Programs: a Compiler and Runtime-Based Approach, 2011. ,
Exact dependence analysis for increased communication overlap, Proceedings of the 19th European Conference on Recent Advances in the Message Passing Interface, pp.89-99, 2012. ,
MPC: A Unified Parallel Runtime for Clusters of NUMA Machines, Proceedings of the 14th International Euro-Par Conference on Parallel Processing, pp.78-88, 2008. ,
Program Transformations and Memory Architecture Optimizations for High-Level Synthesis of Hardware Accelerators, 2010. ,
URL : https://hal.archives-ouvertes.fr/tel-00544349
Preserving High-Level Semantics of Parallel Programming Annotations Through the Compilation Flow of Optimizing Compilers, Proceedings of the 15th Workshop on Compilers for Parallel Computers, p.10, 2010. ,
URL : https://hal.archives-ouvertes.fr/inria-00551518
Expressiveness and Data-flow Compilation of OpenMP Streaming Programs, ACM Transactions on Architecture and Code Optimization (TACO), vol.9, p.25, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00710409
GRAPHITE: Polyhedral Analyses and Optimizations for GCC, Proceedings of the 2006 GCC Developers Summit, 2006. ,
Improving Memory Affinity of Geophysics Applications on NUMA Platforms Using Minas, High Performance Computing for Computational Science - VECPAR, pp.279-292, 2010. ,
Polyhedral optimization of TensorFlow computation graphs, Proceedings of the 6th Workshop on Extreme-scale Programming Tools at The International Conference for High Performance Computing, Networking, Storage and Analysis, 2017. ,
Static Condensation, pp.47-70, 2004. ,
Halide: A language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines, Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation, pp.519-530, 2013. ,
Modeling of languages for tensor manipulation, 2018. ,
CFDlang: High-level code generation for high-order methods in fluid dynamics, Proceedings of the Real World Domain Specific Languages Workshop, vol.5, pp.1-5, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01857925
A Programming Language Interface to Describe Transformations and Code Generation, pp.136-150, 2011. ,
Region Array SSA, Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques, pp.43-52, 2006. ,
Analysis and optimization of explicitly parallel programs using the parallel program graph representation, Proceedings of the 10th International Workshop on Languages and Compilers for Parallel Computing, pp.94-113, 1998. ,
Parallel Program Graphs and Their Classification, Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing, pp.633-655, 1994. ,
Embedding fork-join parallelism into LLVM's intermediate representation, Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp.249-265, 2017. ,
Chunking Parallel Loops in the Presence of Synchronization, Proceedings of the 23rd International Conference on Supercomputing, pp.181-192, 2009. ,
Program Flow Graph Construction for Static Analysis of Explicitly Parallel Message-Passing Programs, Army Research Laboratory, 2000. ,
Program generation for small-scale linear algebra applications, Proceedings of the 2018 International Symposium on Code Generation and Optimization, pp.327-339, 2018. ,
A basic linear algebra compiler, Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization, vol.23, p.32, 2014. ,
A basic linear algebra compiler for structured matrices, International Symposium on Code Generation and Optimization (CGO), pp.117-127, 2016. ,
TTC: A tensor transposition compiler for multiple architectures, 2016. ,
Static Single Assignment for Explicitly Parallel Programs, Proceedings of the 20th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pp.260-272, 1993. ,
Analyzing Programs with Explicit Parallelism, Languages and Compilers for Parallel Computing, vol.589, pp.405-419, 1992. ,
Intermediate Representations in Imperative Compilers: A Survey, ACM Computing Surveys, vol.45, issue.3, p.27, 2013. ,
Lift: A functional data-parallel IR for high-performance GPU code generation, Proceedings of the 2017 International Symposium on Code Generation and Optimization, pp.74-85, 2017. ,
Extended SSA with factored use-def chains to support optimization and parallelism, Proceedings of the Twenty-Seventh Hawaii International Conference on System Sciences, vol.2, pp.43-52, 1994. ,
Data-flow Analysis for MPI Programs, International Conference on Parallel Processing, pp.175-184, 2006. ,
More data locality for static control programs on NUMA architectures, Proceedings of the 7th International Workshop on Polyhedral Compilation Techniques, p.17, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01529354
Insieme-RS: A Compiler-supported Parallel Runtime System, 2013. ,
NWChem: A comprehensive and scalable open-source solution for large scale molecular simulations, Computer Physics Communications, vol.181, pp.1477-1489, 2010. ,
Tensor comprehensions: Framework-agnostic high-performance machine learning abstractions, 2018. ,
Polyhedral Extraction Tool, Proceedings of the Sixth International Workshop on Polyhedral Compilation Techniques, p.12, 2012. ,
An Integer Set Library for the Polyhedral Model, Mathematical Software (ICMS'10), vol.6327, pp.299-302, 2010. ,
Counting Affine Calculator and Applications, Proceedings of the Sixth International Workshop on Polyhedral Compilation Techniques, p.11, 2011. ,
Polyhedral Parallel Code Generation for CUDA, ACM Trans. Archit. Code Optim, vol.9, p.23, 2013. ,
Loop Tiling for Parallelism, 2000. ,
POET: Parameterized optimizations for empirical tuning, IEEE International Parallel and Distributed Processing Symposium, pp.1-8, 2007. ,
Array Dataflow Analysis for Polyhedral X10 Programs, Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp.23-34, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00761537
Intermediate Language Extensions for Parallelism, Proceedings of the Compilation of the Co-located Workshops on DSM'11, TMC'11, AGERE! 2011, AOOPES'11, NEAT'11, & VMIL'11, pp.329-340, 2011. ,
Using algebraic transformations to optimize expression evaluation in scientific code, Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques, p.376, 1998. ,