diff options
Diffstat (limited to 'share/doc/gccint/LTO-Overview.html')
-rw-r--r-- | share/doc/gccint/LTO-Overview.html | 209 |
1 files changed, 209 insertions, 0 deletions
diff --git a/share/doc/gccint/LTO-Overview.html b/share/doc/gccint/LTO-Overview.html new file mode 100644 index 0000000..f44fc43 --- /dev/null +++ b/share/doc/gccint/LTO-Overview.html @@ -0,0 +1,209 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> +<html> +<!-- Copyright (C) 1988-2023 Free Software Foundation, Inc. + +Permission is granted to copy, distribute and/or modify this document +under the terms of the GNU Free Documentation License, Version 1.3 or +any later version published by the Free Software Foundation; with the +Invariant Sections being "Funding Free Software", the Front-Cover +Texts being (a) (see below), and with the Back-Cover Texts being (b) +(see below). A copy of the license is included in the section entitled +"GNU Free Documentation License". + +(a) The FSF's Front-Cover Text is: + +A GNU Manual + +(b) The FSF's Back-Cover Text is: + +You have freedom to copy and modify this GNU Manual, like GNU + software. Copies published by the Free Software Foundation raise + funds for GNU development. --> +<!-- Created by GNU Texinfo 5.1, http://www.gnu.org/software/texinfo/ --> +<head> +<title>GNU Compiler Collection (GCC) Internals: LTO Overview</title> + +<meta name="description" content="GNU Compiler Collection (GCC) Internals: LTO Overview"> +<meta name="keywords" content="GNU Compiler Collection (GCC) Internals: LTO Overview"> +<meta name="resource-type" content="document"> +<meta name="distribution" content="global"> +<meta name="Generator" content="makeinfo"> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8"> +<link href="index.html#Top" rel="start" title="Top"> +<link href="Option-Index.html#Option-Index" rel="index" title="Option Index"> +<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents"> +<link href="LTO.html#LTO" rel="up" title="LTO"> +<link href="LTO-object-file-layout.html#LTO-object-file-layout" rel="next" title="LTO object file layout"> +<link href="LTO.html#LTO" rel="previous" title="LTO"> +<style type="text/css"> +<!-- +a.summary-letter {text-decoration: none} +blockquote.smallquotation {font-size: smaller} +div.display {margin-left: 3.2em} +div.example {margin-left: 3.2em} +div.indentedblock {margin-left: 3.2em} +div.lisp {margin-left: 3.2em} +div.smalldisplay {margin-left: 3.2em} +div.smallexample {margin-left: 3.2em} +div.smallindentedblock {margin-left: 3.2em; font-size: smaller} +div.smalllisp {margin-left: 3.2em} +kbd {font-style:oblique} +pre.display {font-family: inherit} +pre.format {font-family: inherit} +pre.menu-comment {font-family: serif} +pre.menu-preformatted {font-family: serif} +pre.smalldisplay {font-family: inherit; font-size: smaller} +pre.smallexample {font-size: smaller} +pre.smallformat {font-family: inherit; font-size: smaller} +pre.smalllisp {font-size: smaller} +span.nocodebreak {white-space:nowrap} +span.nolinebreak {white-space:nowrap} +span.roman {font-family:serif; font-weight:normal} +span.sansserif {font-family:sans-serif; font-weight:normal} +ul.no-bullet {list-style: none} +--> +</style> + + +</head> + +<body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000"> +<a name="LTO-Overview"></a> +<div class="header"> +<p> +Next: <a href="LTO-object-file-layout.html#LTO-object-file-layout" accesskey="n" rel="next">LTO object file layout</a>, Up: <a href="LTO.html#LTO" accesskey="u" rel="up">LTO</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p> +</div> +<hr> +<a name="Design-Overview"></a> +<h3 class="section">25.1 Design Overview</h3> + +<p>Link time optimization is implemented as a GCC front end for a +bytecode representation of GIMPLE that is emitted in special sections +of <code>.o</code> files. Currently, LTO support is enabled in most +ELF-based systems, as well as darwin, cygwin and mingw systems. +</p> +<p>By default, object files generated with LTO support contain only GIMPLE +bytecode. Such objects are called “slim”, and they require that +tools like <code>ar</code> and <code>nm</code> understand symbol tables of LTO +sections. For most targets these tools have been extended to use the +plugin infrastructure, so GCC can support “slim” objects consisting +of the intermediate code alone. +</p> +<p>GIMPLE bytecode could also be saved alongside final object code if +the <samp>-ffat-lto-objects</samp> option is passed, or if no plugin support +is detected for <code>ar</code> and <code>nm</code> when GCC is configured. It makes +the object files generated with LTO support larger than regular object +files. This “fat” object format allows to ship one set of fat +objects which could be used both for development and the production of +optimized builds. A, perhaps surprising, side effect of this feature +is that any mistake in the toolchain leads to LTO information not +being used (e.g. an older <code>libtool</code> calling <code>ld</code> directly). +This is both an advantage, as the system is more robust, and a +disadvantage, as the user is not informed that the optimization has +been disabled. +</p> +<p>At the highest level, LTO splits the compiler in two. The first half +(the “writer”) produces a streaming representation of all the +internal data structures needed to optimize and generate code. This +includes declarations, types, the callgraph and the GIMPLE representation +of function bodies. +</p> +<p>When <samp>-flto</samp> is given during compilation of a source file, the +pass manager executes all the passes in <code>all_lto_gen_passes</code>. +Currently, this phase is composed of two IPA passes: +</p> +<ul> +<li> <code>pass_ipa_lto_gimple_out</code> +This pass executes the function <code>lto_output</code> in +<samp>lto-streamer-out.cc</samp>, which traverses the call graph encoding +every reachable declaration, type and function. This generates a +memory representation of all the file sections described below. + +</li><li> <code>pass_ipa_lto_finish_out</code> +This pass executes the function <code>produce_asm_for_decls</code> in +<samp>lto-streamer-out.cc</samp>, which takes the memory image built in the +previous pass and encodes it in the corresponding ELF file sections. +</li></ul> + +<p>The second half of LTO support is the “reader”. This is implemented +as the GCC front end <samp>lto1</samp> in <samp>lto/lto.cc</samp>. When +<samp>collect2</samp> detects a link set of <code>.o</code>/<code>.a</code> files with +LTO information and the <samp>-flto</samp> is enabled, it invokes +<samp>lto1</samp> which reads the set of files and aggregates them into a +single translation unit for optimization. The main entry point for +the reader is <samp>lto/lto.cc</samp>:<code>lto_main</code>. +</p> +<a name="LTO-modes-of-operation"></a> +<h4 class="subsection">25.1.1 LTO modes of operation</h4> + +<p>One of the main goals of the GCC link-time infrastructure was to allow +effective compilation of large programs. For this reason GCC implements two +link-time compilation modes. +</p> +<ol> +<li> <em>LTO mode</em>, in which the whole program is read into the +compiler at link-time and optimized in a similar way as if it +were a single source-level compilation unit. + +</li><li> <em>WHOPR or partitioned mode</em>, designed to utilize multiple +CPUs and/or a distributed compilation environment to quickly link +large applications. WHOPR stands for WHOle Program optimizeR (not to +be confused with the semantics of <samp>-fwhole-program</samp>). It +partitions the aggregated callgraph from many different <code>.o</code> +files and distributes the compilation of the sub-graphs to different +CPUs. + +<p>Note that distributed compilation is not implemented yet, but since +the parallelism is facilitated via generating a <code>Makefile</code>, it +would be easy to implement. +</p></li></ol> + +<p>WHOPR splits LTO into three main stages: +</p><ol> +<li> Local generation (LGEN) +This stage executes in parallel. Every file in the program is compiled +into the intermediate language and packaged together with the local +call-graph and summary information. This stage is the same for both +the LTO and WHOPR compilation mode. + +</li><li> Whole Program Analysis (WPA) +WPA is performed sequentially. The global call-graph is generated, and +a global analysis procedure makes transformation decisions. The global +call-graph is partitioned to facilitate parallel optimization during +phase 3. The results of the WPA stage are stored into new object files +which contain the partitions of program expressed in the intermediate +language and the optimization decisions. + +</li><li> Local transformations (LTRANS) +This stage executes in parallel. All the decisions made during phase 2 +are implemented locally in each partitioned object file, and the final +object code is generated. Optimizations which cannot be decided +efficiently during the phase 2 may be performed on the local +call-graph partitions. +</li></ol> + +<p>WHOPR can be seen as an extension of the usual LTO mode of +compilation. In LTO, WPA and LTRANS are executed within a single +execution of the compiler, after the whole program has been read into +memory. +</p> +<p>When compiling in WHOPR mode, the callgraph is partitioned during +the WPA stage. The whole program is split into a given number of +partitions of roughly the same size. The compiler tries to +minimize the number of references which cross partition boundaries. +The main advantage of WHOPR is to allow the parallel execution of +LTRANS stages, which are the most time-consuming part of the +compilation process. Additionally, it avoids the need to load the +whole program into memory. +</p> + +<hr> +<div class="header"> +<p> +Next: <a href="LTO-object-file-layout.html#LTO-object-file-layout" accesskey="n" rel="next">LTO object file layout</a>, Up: <a href="LTO.html#LTO" accesskey="u" rel="up">LTO</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p> +</div> + + + +</body> +</html> |