an Audio Signal Processing Language ,
Programming Language and a Compilation Infrastructure ,
A single-chip, 1.6-billion, 16-b MAC/s multiprocessor DSP, IEEE Journal of Solid-State Circuits, vol.35, issue.3, pp.412-424, 2000. ,
A comparison of list schedules for parallel processing systems, Communications of the ACM, vol.17, issue.12, pp.685-690, 1974. ,
DOI : 10.1145/361604.361619
Shared memory consistency models: a tutorial, Computer, vol.29, issue.12, pp.66-76, 1996. ,
DOI : 10.1109/2.546611
Performance Characterization of a Hierarchical MPI Implementation on Large-scale Distributed-memory Platforms, 2009 International Conference on Parallel Processing, pp.132-139, 2009. ,
DOI : 10.1109/ICPP.2009.51
The Fortress Language Specification, 2007. ,
Automatic translation of FORTRAN programs to vector form, ACM Transactions on Programming Languages and Systems, vol.9, issue.4, pp.491-542, 1987. ,
DOI : 10.1145/29873.29875
Communication Optimization and Code Generation for Distributed Memory Machines, Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation, PLDI '93, pp.126-138, 1993. ,
Static Compilation Analysis for Host-Accelerator Communication Optimization, 24th Int. Workshop on Languages and Compilers for Parallel Computing (LCPC), Fort Collins, 2011. ,
DOI : 10.1007/978-3-642-36036-7_16
URL : https://hal.archives-ouvertes.fr/hal-00743496
A Modular Static Analysis Approach to Affine Loop Invariants Detection, Electronic Notes in Theoretical Computer Science, vol.267, issue.1, pp.3-16, 2010. ,
DOI : 10.1016/j.entcs.2010.09.002
URL : https://hal.archives-ouvertes.fr/hal-00586338
PIPS: a Workbench for Interprocedural Program Analyses and Parallelization, Meeting on data parallel languages and compilers for portable parallel computing, 1994. ,
Scanning Polyhedra with DO Loops, Proceedings of the third ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP '91, pp.39-50, 1991. ,
DOI : 10.1145/109625.109631
URL : https://hal.archives-ouvertes.fr/hal-00752774
Extending OpenMP to Survive the Heterogeneous Multi-Core Era, International Journal of Parallel Programming, vol.41, issue.1, pp.440-459, 2010. ,
DOI : 10.1007/s10766-010-0135-4
Large-scale simulation of elastic wave propagation in heterogeneous media on parallel computers, Computer Methods in Applied Mechanics and Engineering, vol.152, issue.1-2, pp.85-102, 1998. ,
DOI : 10.1016/S0045-7825(97)00183-7
Programming Distributed Memory Sytems Using OpenMP, 2007 IEEE International Parallel and Distributed Processing Symposium, pp.1-8, 2007. ,
DOI : 10.1109/IPDPS.2007.370397
Kimble: a Hierarchical Intermediate Representation for Multi-Grain Parallelism, Proceedings of the Workshop on Intermediate Representations, pp.21-28, 2011. ,
A survey of multicore processors, IEEE Signal Processing Magazine, vol.26, issue.6, pp.26-37, 2009. ,
DOI : 10.1109/MSP.2009.934110
Cilk: An Efficient Multithreaded Runtime System, Journal of Parallel and Distributed Computing, pp.207-216, 1995. ,
DOI : 10.1006/jpdc.1996.0107
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.3175
Automatic Distributed Memory Code Generation using the Polyhedral Framework, Indian Institute of Science, 2011. ,
Lévy Flights in Dobe Ju/'hoansi Foraging Patterns, Human Ecology, vol.15, issue.3, pp.129-138, 2007. ,
DOI : 10.1007/s10745-006-9083-4
HELIX, Proceedings of the Tenth International Symposium on Code Generation and Optimization, CGO '12, pp.84-93, 2012. ,
DOI : 10.1145/2259016.2259028
Habanero-Java, Proceedings of the 9th International Conference on Principles and Practice of Programming in Java, PPPJ '11, 2011. ,
DOI : 10.1145/2093157.2093165
Stream Compilation for Real-Time Embedded Multicore Systems, 2009 International Symposium on Code Generation and Optimization, pp.210-220, 2009. ,
DOI : 10.1109/CGO.2009.27
Triplet: A clustering scheduling algorithm for heterogeneous systems, Proceedings International Conference on Parallel Processing Workshops, pp.231-236, 2001. ,
DOI : 10.1109/ICPPW.2001.951956
URL : https://hal.archives-ouvertes.fr/inria-00100488
Data and Process Abstraction in PIPS Internal Representation, Proceedings of the Workshop on Intermediate Representations, pp.77-84, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00744291
Analyses de Régions de Tableaux et Applications, 1996. ,
Interprocedural Array Region Analyses, International Journal of Parallel Programming, vol.2, issue.3, pp.513-546, 1996. ,
DOI : 10.1007/BF03356758
URL : https://hal.archives-ouvertes.fr/hal-00752611
Importance of Simulations for Nuclear and Aeronautical Inspections with Ultrasonic and Eddy Current Testing, Simulation in NDT, 2010. ,
Efficiently computing static single assignment form and the control dependence graph, ACM Transactions on Programming Languages and Systems, vol.13, issue.4, pp.451-490, 1991. ,
DOI : 10.1145/115372.115320
GATS 1.0, Proceedings of the 2005 conference on Genetic and evolutionary computation, GECCO '05, pp.2209-2210, 2005. ,
DOI : 10.1145/1068009.1068378
Modeling the Weather with a Data Flow Supercomputer, IEEE Transactions on Computers, vol.33, issue.7, pp.592-603, 1984. ,
DOI : 10.1109/TC.1984.5009332
Formal Derivation of Strongly Correct Parallel Programs, 1977. ,
Polynômes arithmétiques et méthode de polyèdres en combinatoire, International Series of Numerical Mathematics, p.35, 1977. ,
Dataflow analysis of array and scalar references, International Journal of Parallel Programming, vol.24, issue.4, 1991. ,
DOI : 10.1007/BF01407931
The program dependence graph and its use in optimization, ACM Transactions on Programming Languages and Systems, vol.9, issue.3, pp.319-349, 1987. ,
DOI : 10.1145/24039.24041
Some Computer Organizations and Their Effectiveness, IEEE Transactions on Computers, vol.21, issue.9, pp.948-960, 1972. ,
DOI : 10.1109/TC.1972.5009071
Computers and Intractability: A Guide to the Theory of NP-Completeness, 1990. ,
Clustering task graphs for message passing architectures, ACM SIGARCH Computer Architecture News, vol.18, issue.3, pp.447-456, 1990. ,
DOI : 10.1145/255129.255188
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.49.1744
Automatic extraction of functional parallelism from ordinary programs, IEEE Transactions on Parallel and Distributed Systems, vol.3, issue.2, pp.166-178, 1992. ,
DOI : 10.1109/71.127258
Message-passing code generation for non-rectangular tiling transformations, Parallel Computing, vol.32, issue.10, 2006. ,
DOI : 10.1016/j.parco.2006.07.003
Graph Visualization Software ,
A simple adaptive algorithm for real-time processing in antenna arrays, Proceedings of the IEEE, vol.57, issue.10, pp.1696-1704, 1969. ,
DOI : 10.1109/PROC.1969.7385
Generating Efficient Parallel Programs for Distributed Memory Systems, 2013. ,
A Combined Corner and Edge Detector, Proceedings of the Alvey Vision Conference 1988, pp.147-151, 1988. ,
DOI : 10.5244/C.2.23
Efficient construction of program dependence graphs, ACM SIGSOFT Software Engineering Notes, vol.18, issue.3, pp.160-170, 1993. ,
DOI : 10.1145/174146.154268
A 48-Core IA-32 message-passing processor with DVFS in 45nm CMOS, 2010 IEEE International Solid-State Circuits Conference, (ISSCC), pp.108-109, 2010. ,
DOI : 10.1109/ISSCC.2010.5434077
Parallelization of DOALL and DOACROSS Loops - a Survey, Emphasizing Parallel Programming Techniques, pp.53-103, 1997. ,
DOI : 10.1016/S0065-2458(08)60706-8
Insieme - an Optimization System for OpenMP, MPI and OpenCL Programs, 2011. ,
Semantical Interprocedural Parallelization: An Overview of the PIPS Project, ICS, pp.244-251, 1991. ,
URL : https://hal.archives-ouvertes.fr/hal-00984684
Optimal Partitioning Scheme for Wavefront/Systolic Array Processors, Proceedings of IEEE Symposium on Circuits and Systems, pp.940-943, 1986. ,
Newgen: A Language Independent Program Generator, 1989. ,
A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs, SIAM Journal on Scientific Computing, vol.20, issue.1, 1998. ,
DOI : 10.1137/S1064827595287997
A Multi-Grain Parallelizing Compilation Scheme for OSCAR (Optimally Scheduled Advanced Multiprocessor), Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing, pp.283-297, 1992. ,
High-level microprogramming: an optimizing C compiler for a processing element of a CAD accelerator, Proceedings of the 23rd Annual Workshop and Symposium, MICRO 23: Microprogramming and Microarchitecture, pp.97-106, 1990. ,
DOI : 10.1109/MICRO.1990.151431
Parallelizing with BDSC, a resource-constrained scheduling algorithm for shared and distributed memory systems, MINES ParisTech, 2012. ,
DOI : 10.1016/j.parco.2014.11.004
URL : https://hal.archives-ouvertes.fr/hal-01097328
Task Parallelism and Data Distribution: An Overview of Explicit Parallel Programming Languages, Lecture Notes in Computer Science, vol.7760, pp.174-189, 2012. ,
DOI : 10.1007/978-3-642-37658-0_12
URL : https://hal.archives-ouvertes.fr/hal-00742536
SPIRE: A Methodology for Sequential to Parallel Intermediate Representation Extension, Proceedings of the 17th Workshop on Compilers for Parallel Computing, CPC'13, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00823324
Scheduling for heterogeneous Systems using constrained critical paths, Parallel Computing, vol.38, issue.4-5, pp.175-193, 2012. ,
DOI : 10.1016/j.parco.2012.01.001
Duplication Scheduling Heuristics (DSH): A New Precedence Task Scheduler for Parallel Processor Systems, 1987. ,
Atomic Vector Operations on Chip Multiprocessors, SIGARCH Comput. Archit. News, vol.36, issue.3, pp.441-452, 2008. ,
Benchmarking the task graph scheduling algorithms, Proceedings of the First Merged International Parallel Processing Symposium and Symposium on Parallel and Distributed Processing, pp.531-537, 1998. ,
DOI : 10.1109/IPPS.1998.669967
Static scheduling algorithms for allocating directed task graphs to multiprocessors, ACM Computing Surveys, vol.31, issue.4, pp.406-471, 1999. ,
DOI : 10.1145/344588.344618
Transactional memory, Communications of the ACM, vol.51, issue.7, pp.80-88, 2008. ,
DOI : 10.1145/1364782.1364800
OpenMP to GPGPU: a Compiler Framework for Automatic Translation and Optimization, Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming, PPoPP '09, pp.101-110, 2009. ,
Convex Invariant Refinement by Control Node Splitting: a Heuristic Approach, Electronic Notes in Theoretical Computer Science, vol.288, pp.49-59, 2012. ,
DOI : 10.1016/j.entcs.2012.10.007
URL : https://hal.archives-ouvertes.fr/hal-00833344
GENERIC and GIMPLE: a New Tree Representation for Entire Functions, GCC developers summit 2003, pp.171-180, 2003. ,
STEP: A Distributed OpenMP for Coarse-Grain Parallelism Tool, Proceedings of the 4th international conference on OpenMP in a new era of parallelism, IWOMP'08, pp.83-99, 2008. ,
DOI : 10.1007/978-3-540-79561-2_8
URL : https://hal.archives-ouvertes.fr/hal-01373120
Partitioning and Mapping Algorithms into Fixed Size Systolic Arrays, IEEE Transactions on Computers, vol.35, issue.1, pp.1-12, 1986. ,
DOI : 10.1109/TC.1986.1676652
Exascale Algorithms for Generalized MPI_Comm_split, Recent Advances in the Message Passing Interface, pp.9-18, 2011. ,
DOI : 10.1007/978-3-642-24449-0_4
Cramming More Components Onto Integrated Circuits, Proceedings of the IEEE, vol.86, issue.1, pp.82-85, 1998. ,
DOI : 10.1109/JPROC.1998.658762
A Transformation Framework for Optimizing Task-Parallel Programs, ACM Transactions on Programming Languages and Systems, vol.35, issue.1, pp.1-3, 2013. ,
DOI : 10.1145/2450136.2450138
Automatic partitioning of signal processing programs for symmetric multiprocessors, Proceedings of the 1996 Conference on Parallel Architectures and Compilation Technique, pp.269-280, 1996. ,
DOI : 10.1109/PACT.1996.552675
OpenMP and Automatic Parallelization in GCC, Proceedings of the GCC Developers Summit, 2006. ,
Adding Automatic Parallelization to Faust, Linux Audio Conference, 2009. ,
Advanced compiler optimizations for supercomputers, Communications of the ACM, vol.29, issue.12, pp.1184-1201, 1986. ,
DOI : 10.1145/7902.7904
PLASMA: Portable Programming for SIMD Heterogeneous Accelerators, Workshop on Language, Compiler, and Architecture Support for GPGPU, 2010. ,
PolyLib: A Library of Polyhedral Functions ,
A stream-computing extension to OpenMP, Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers, HiPEAC '11, pp.5-14, 2011. ,
DOI : 10.1145/1944862.1944867
URL : https://hal.archives-ouvertes.fr/hal-00659411
OpenStream, ACM Transactions on Architecture and Code Optimization, vol.9, issue.4, article 53, pp.1-25, 2013. ,
DOI : 10.1145/2400682.2400712
URL : https://hal.archives-ouvertes.fr/hal-00786675
Optimisation multi-niveau d'une application de traitement d'images sur machines parallèles, 2012. ,
Parallelization Schemes for Memory Optimization on the Cell Processor: A Case Study on the Harris Corner Detector, Transactions on High-Performance Embedded Architectures and Compilers III, pp.177-200, 2011. ,
DOI : 10.1007/s10766-007-0034-5
URL : https://hal.archives-ouvertes.fr/hal-00753708
Synchronization using counting semaphores, Proceedings of the 2nd international conference on Supercomputing, ICS '88, pp.627-637, 1988. ,
DOI : 10.1145/55364.55426
Partitioning and Scheduling Parallel Programs for Multiprocessors, 1989. ,
COMP 322: Principles of Parallel Programming, 2009. ,
Parallel Program Graphs and their classification, Lecture Notes in Computer Science, vol.768, pp.633-655, 1993. ,
DOI : 10.1007/3-540-57659-2_36
Phasers, Proceedings of the 22nd annual international conference on Supercomputing, ICS '08, pp.277-288, 2008. ,
DOI : 10.1145/1375527.1375568
A Scheduling Algorithm to Optimize Parallel Processes, 2008 International Conference of the Chilean Computer Science Society, pp.73-78, 2008. ,
DOI : 10.1109/SCCC.2008.8
A scheduling algorithm to optimize real-world applications, 24th International Conference on Distributed Computing Systems Workshops, 2004. Proceedings., pp.858-862, 2004. ,
DOI : 10.1109/ICDCSW.2004.1284133
Performance-effective and low-complexity task scheduling for heterogeneous computing, IEEE Transactions on Parallel and Distributed Systems, vol.13, issue.3, pp.260-274, 2002. ,
DOI : 10.1109/71.993206
Transitive Closures of Affine Integer Tuple Relations and Their Overapproximations, Proceedings of the 18th International Conference on Static Analysis, pp.216-232, 2011. ,
DOI : 10.1007/978-3-642-02658-4_44
URL : https://hal.archives-ouvertes.fr/hal-00645221
Hypertool: a programming aid for message-passing systems, IEEE Transactions on Parallel and Distributed Systems, vol.1, issue.3, pp.330-343, 1990. ,
DOI : 10.1109/71.80160
PYRROS: Static Task Scheduling and Code Generation for Message Passing Multiprocessors, Proceedings of the 6th International Conference on Supercomputing, ICS '92, pp.428-437, 1992. ,
DSC: scheduling parallel tasks on an unbounded number of processors, IEEE Transactions on Parallel and Distributed Systems, vol.5, issue.9, pp.951-967, 1994. ,
DOI : 10.1109/71.308533
Cuckoo Search via Lévy Flights, 2009 World Congress on Nature & Biologically Inspired Computing (NaBIC), pp.210-214, 2009. ,
DOI : 10.1109/nabic.2009.5393690
Productivity and Performance Using Partitioned Global Address Space Languages, Proceedings of the 2007 International Workshop on Parallel Symbolic Computation, PASCO '07, pp.24-32, 2007. ,
DOI : 10.1145/1278177.1278183
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.126.6770
Intermediate language extensions for parallelism, Proceedings of the compilation of the co-located workshops on DSM'11, TMC'11, AGERE!'11, AOOPES'11, NEAT'11, & VMIL'11, SPLASH '11 Workshops, pp.329-340, 2011. ,
DOI : 10.1145/2095050.2095103