diff options
Diffstat (limited to 'share/doc/gccint/Insn-Splitting.html')
-rw-r--r-- | share/doc/gccint/Insn-Splitting.html | 458 |
1 files changed, 458 insertions, 0 deletions
diff --git a/share/doc/gccint/Insn-Splitting.html b/share/doc/gccint/Insn-Splitting.html new file mode 100644 index 0000000..8bfce4b --- /dev/null +++ b/share/doc/gccint/Insn-Splitting.html @@ -0,0 +1,458 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> +<html> +<!-- Copyright (C) 1988-2023 Free Software Foundation, Inc. + +Permission is granted to copy, distribute and/or modify this document +under the terms of the GNU Free Documentation License, Version 1.3 or +any later version published by the Free Software Foundation; with the +Invariant Sections being "Funding Free Software", the Front-Cover +Texts being (a) (see below), and with the Back-Cover Texts being (b) +(see below). A copy of the license is included in the section entitled +"GNU Free Documentation License". + +(a) The FSF's Front-Cover Text is: + +A GNU Manual + +(b) The FSF's Back-Cover Text is: + +You have freedom to copy and modify this GNU Manual, like GNU + software. Copies published by the Free Software Foundation raise + funds for GNU development. --> +<!-- Created by GNU Texinfo 5.1, http://www.gnu.org/software/texinfo/ --> +<head> +<title>GNU Compiler Collection (GCC) Internals: Insn Splitting</title> + +<meta name="description" content="GNU Compiler Collection (GCC) Internals: Insn Splitting"> +<meta name="keywords" content="GNU Compiler Collection (GCC) Internals: Insn Splitting"> +<meta name="resource-type" content="document"> +<meta name="distribution" content="global"> +<meta name="Generator" content="makeinfo"> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8"> +<link href="index.html#Top" rel="start" title="Top"> +<link href="Option-Index.html#Option-Index" rel="index" title="Option Index"> +<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents"> +<link href="Machine-Desc.html#Machine-Desc" rel="up" title="Machine Desc"> +<link href="Including-Patterns.html#Including-Patterns" rel="next" title="Including Patterns"> +<link href="Expander-Definitions.html#Expander-Definitions" rel="previous" title="Expander Definitions"> +<style type="text/css"> +<!-- +a.summary-letter {text-decoration: none} +blockquote.smallquotation {font-size: smaller} +div.display {margin-left: 3.2em} +div.example {margin-left: 3.2em} +div.indentedblock {margin-left: 3.2em} +div.lisp {margin-left: 3.2em} +div.smalldisplay {margin-left: 3.2em} +div.smallexample {margin-left: 3.2em} +div.smallindentedblock {margin-left: 3.2em; font-size: smaller} +div.smalllisp {margin-left: 3.2em} +kbd {font-style:oblique} +pre.display {font-family: inherit} +pre.format {font-family: inherit} +pre.menu-comment {font-family: serif} +pre.menu-preformatted {font-family: serif} +pre.smalldisplay {font-family: inherit; font-size: smaller} +pre.smallexample {font-size: smaller} +pre.smallformat {font-family: inherit; font-size: smaller} +pre.smalllisp {font-size: smaller} +span.nocodebreak {white-space:nowrap} +span.nolinebreak {white-space:nowrap} +span.roman {font-family:serif; font-weight:normal} +span.sansserif {font-family:sans-serif; font-weight:normal} +ul.no-bullet {list-style: none} +--> +</style> + + +</head> + +<body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000"> +<a name="Insn-Splitting"></a> +<div class="header"> +<p> +Next: <a href="Including-Patterns.html#Including-Patterns" accesskey="n" rel="next">Including Patterns</a>, Previous: <a href="Expander-Definitions.html#Expander-Definitions" accesskey="p" rel="previous">Expander Definitions</a>, Up: <a href="Machine-Desc.html#Machine-Desc" accesskey="u" rel="up">Machine Desc</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p> +</div> +<hr> +<a name="Defining-How-to-Split-Instructions"></a> +<h3 class="section">17.16 Defining How to Split Instructions</h3> +<a name="index-insn-splitting"></a> +<a name="index-instruction-splitting"></a> +<a name="index-splitting-instructions"></a> + +<p>There are two cases where you should specify how to split a pattern +into multiple insns. On machines that have instructions requiring +delay slots (see <a href="Delay-Slots.html#Delay-Slots">Delay Slots</a>) or that have instructions whose +output is not available for multiple cycles (see <a href="Processor-pipeline-description.html#Processor-pipeline-description">Processor pipeline description</a>), the compiler phases that optimize these cases need to +be able to move insns into one-instruction delay slots. However, some +insns may generate more than one machine instruction. These insns +cannot be placed into a delay slot. +</p> +<p>Often you can rewrite the single insn as a list of individual insns, +each corresponding to one machine instruction. The disadvantage of +doing so is that it will cause the compilation to be slower and require +more space. If the resulting insns are too complex, it may also +suppress some optimizations. The compiler splits the insn if there is a +reason to believe that it might improve instruction or delay slot +scheduling. +</p> +<p>The insn combiner phase also splits putative insns. If three insns are +merged into one insn with a complex expression that cannot be matched by +some <code>define_insn</code> pattern, the combiner phase attempts to split +the complex pattern into two insns that are recognized. Usually it can +break the complex pattern into two patterns by splitting out some +subexpression. However, in some other cases, such as performing an +addition of a large constant in two insns on a RISC machine, the way to +split the addition into two insns is machine-dependent. +</p> +<a name="index-define_005fsplit"></a> +<p>The <code>define_split</code> definition tells the compiler how to split a +complex insn into several simpler insns. It looks like this: +</p> +<div class="smallexample"> +<pre class="smallexample">(define_split + [<var>insn-pattern</var>] + "<var>condition</var>" + [<var>new-insn-pattern-1</var> + <var>new-insn-pattern-2</var> + …] + "<var>preparation-statements</var>") +</pre></div> + +<p><var>insn-pattern</var> is a pattern that needs to be split and +<var>condition</var> is the final condition to be tested, as in a +<code>define_insn</code>. When an insn matching <var>insn-pattern</var> and +satisfying <var>condition</var> is found, it is replaced in the insn list +with the insns given by <var>new-insn-pattern-1</var>, +<var>new-insn-pattern-2</var>, etc. +</p> +<p>The <var>preparation-statements</var> are similar to those statements that +are specified for <code>define_expand</code> (see <a href="Expander-Definitions.html#Expander-Definitions">Expander Definitions</a>) +and are executed before the new RTL is generated to prepare for the +generated code or emit some insns whose pattern is not fixed. Unlike +those in <code>define_expand</code>, however, these statements must not +generate any new pseudo-registers. Once reload has completed, they also +must not allocate any space in the stack frame. +</p> +<p>There are two special macros defined for use in the preparation statements: +<code>DONE</code> and <code>FAIL</code>. Use them with a following semicolon, +as a statement. +</p> +<dl compact="compact"> +<dd> +<a name="index-DONE-1"></a> +</dd> +<dt><code>DONE</code></dt> +<dd><p>Use the <code>DONE</code> macro to end RTL generation for the splitter. The +only RTL insns generated as replacement for the matched input insn will +be those already emitted by explicit calls to <code>emit_insn</code> within +the preparation statements; the replacement pattern is not used. +</p> +<a name="index-FAIL-1"></a> +</dd> +<dt><code>FAIL</code></dt> +<dd><p>Make the <code>define_split</code> fail on this occasion. When a <code>define_split</code> +fails, it means that the splitter was not truly available for the inputs +it was given, and the input insn will not be split. +</p></dd> +</dl> + +<p>If the preparation falls through (invokes neither <code>DONE</code> nor +<code>FAIL</code>), then the <code>define_split</code> uses the replacement +template. +</p> +<p>Patterns are matched against <var>insn-pattern</var> in two different +circumstances. If an insn needs to be split for delay slot scheduling +or insn scheduling, the insn is already known to be valid, which means +that it must have been matched by some <code>define_insn</code> and, if +<code>reload_completed</code> is nonzero, is known to satisfy the constraints +of that <code>define_insn</code>. In that case, the new insn patterns must +also be insns that are matched by some <code>define_insn</code> and, if +<code>reload_completed</code> is nonzero, must also satisfy the constraints +of those definitions. +</p> +<p>As an example of this usage of <code>define_split</code>, consider the following +example from <samp>a29k.md</samp>, which splits a <code>sign_extend</code> from +<code>HImode</code> to <code>SImode</code> into a pair of shift insns: +</p> +<div class="smallexample"> +<pre class="smallexample">(define_split + [(set (match_operand:SI 0 "gen_reg_operand" "") + (sign_extend:SI (match_operand:HI 1 "gen_reg_operand" "")))] + "" + [(set (match_dup 0) + (ashift:SI (match_dup 1) + (const_int 16))) + (set (match_dup 0) + (ashiftrt:SI (match_dup 0) + (const_int 16)))] + " +{ operands[1] = gen_lowpart (SImode, operands[1]); }") +</pre></div> + +<p>When the combiner phase tries to split an insn pattern, it is always the +case that the pattern is <em>not</em> matched by any <code>define_insn</code>. +The combiner pass first tries to split a single <code>set</code> expression +and then the same <code>set</code> expression inside a <code>parallel</code>, but +followed by a <code>clobber</code> of a pseudo-reg to use as a scratch +register. In these cases, the combiner expects exactly one or two new insn +patterns to be generated. It will verify that these patterns match some +<code>define_insn</code> definitions, so you need not do this test in the +<code>define_split</code> (of course, there is no point in writing a +<code>define_split</code> that will never produce insns that match). +</p> +<p>Here is an example of this use of <code>define_split</code>, taken from +<samp>rs6000.md</samp>: +</p> +<div class="smallexample"> +<pre class="smallexample">(define_split + [(set (match_operand:SI 0 "gen_reg_operand" "") + (plus:SI (match_operand:SI 1 "gen_reg_operand" "") + (match_operand:SI 2 "non_add_cint_operand" "")))] + "" + [(set (match_dup 0) (plus:SI (match_dup 1) (match_dup 3))) + (set (match_dup 0) (plus:SI (match_dup 0) (match_dup 4)))] +" +{ + int low = INTVAL (operands[2]) & 0xffff; + int high = (unsigned) INTVAL (operands[2]) >> 16; + + if (low & 0x8000) + high++, low |= 0xffff0000; + + operands[3] = GEN_INT (high << 16); + operands[4] = GEN_INT (low); +}") +</pre></div> + +<p>Here the predicate <code>non_add_cint_operand</code> matches any +<code>const_int</code> that is <em>not</em> a valid operand of a single add +insn. The add with the smaller displacement is written so that it +can be substituted into the address of a subsequent operation. +</p> +<p>An example that uses a scratch register, from the same file, generates +an equality comparison of a register and a large constant: +</p> +<div class="smallexample"> +<pre class="smallexample">(define_split + [(set (match_operand:CC 0 "cc_reg_operand" "") + (compare:CC (match_operand:SI 1 "gen_reg_operand" "") + (match_operand:SI 2 "non_short_cint_operand" ""))) + (clobber (match_operand:SI 3 "gen_reg_operand" ""))] + "find_single_use (operands[0], insn, 0) + && (GET_CODE (*find_single_use (operands[0], insn, 0)) == EQ + || GET_CODE (*find_single_use (operands[0], insn, 0)) == NE)" + [(set (match_dup 3) (xor:SI (match_dup 1) (match_dup 4))) + (set (match_dup 0) (compare:CC (match_dup 3) (match_dup 5)))] + " +{ + /* <span class="roman">Get the constant we are comparing against, C, and see what it + looks like sign-extended to 16 bits. Then see what constant + could be XOR’ed with C to get the sign-extended value.</span> */ + + int c = INTVAL (operands[2]); + int sextc = (c << 16) >> 16; + int xorv = c ^ sextc; + + operands[4] = GEN_INT (xorv); + operands[5] = GEN_INT (sextc); +}") +</pre></div> + +<p>To avoid confusion, don’t write a single <code>define_split</code> that +accepts some insns that match some <code>define_insn</code> as well as some +insns that don’t. Instead, write two separate <code>define_split</code> +definitions, one for the insns that are valid and one for the insns that +are not valid. +</p> +<p>The splitter is allowed to split jump instructions into a sequence of jumps or +create new jumps while splitting non-jump instructions. As the control flow +graph and branch prediction information needs to be updated after the splitter +runs, several restrictions apply. +</p> +<p>Splitting of a jump instruction into a sequence that has another jump +instruction to the same label is always valid, as the compiler expects +identical behavior of the new jump. When the new sequence contains multiple +jump instructions or new labels, more assistance is needed. The splitter is +permitted to create only unconditional jumps, or simple conditional jump +instructions. Additionally it must attach a <code>REG_BR_PROB</code> note to each +conditional jump. A global variable <code>split_branch_probability</code> holds the +probability of the original branch in case it was a simple conditional jump, +-1 otherwise. To simplify recomputing of edge frequencies, the new +sequence is permitted to have only forward jumps to the newly-created labels. +</p> +<a name="index-define_005finsn_005fand_005fsplit"></a> +<p>For the common case where the pattern of a define_split exactly matches the +pattern of a define_insn, use <code>define_insn_and_split</code>. It looks like +this: +</p> +<div class="smallexample"> +<pre class="smallexample">(define_insn_and_split + [<var>insn-pattern</var>] + "<var>condition</var>" + "<var>output-template</var>" + "<var>split-condition</var>" + [<var>new-insn-pattern-1</var> + <var>new-insn-pattern-2</var> + …] + "<var>preparation-statements</var>" + [<var>insn-attributes</var>]) + +</pre></div> + +<p><var>insn-pattern</var>, <var>condition</var>, <var>output-template</var>, and +<var>insn-attributes</var> are used as in <code>define_insn</code>. The +<var>new-insn-pattern</var> vector and the <var>preparation-statements</var> are used as +in a <code>define_split</code>. The <var>split-condition</var> is also used as in +<code>define_split</code>, with the additional behavior that if the condition starts +with ‘<samp>&&</samp>’, the condition used for the split will be the constructed as a +logical “and” of the split condition with the insn condition. For example, +from i386.md: +</p> +<div class="smallexample"> +<pre class="smallexample">(define_insn_and_split "zero_extendhisi2_and" + [(set (match_operand:SI 0 "register_operand" "=r") + (zero_extend:SI (match_operand:HI 1 "register_operand" "0"))) + (clobber (reg:CC 17))] + "TARGET_ZERO_EXTEND_WITH_AND && !optimize_size" + "#" + "&& reload_completed" + [(parallel [(set (match_dup 0) + (and:SI (match_dup 0) (const_int 65535))) + (clobber (reg:CC 17))])] + "" + [(set_attr "type" "alu1")]) + +</pre></div> + +<p>In this case, the actual split condition will be +‘<samp>TARGET_ZERO_EXTEND_WITH_AND && !optimize_size && reload_completed</samp>’. +</p> +<p>The <code>define_insn_and_split</code> construction provides exactly the same +functionality as two separate <code>define_insn</code> and <code>define_split</code> +patterns. It exists for compactness, and as a maintenance tool to prevent +having to ensure the two patterns’ templates match. +</p> +<a name="index-define_005finsn_005fand_005frewrite"></a> +<p>It is sometimes useful to have a <code>define_insn_and_split</code> +that replaces specific operands of an instruction but leaves the +rest of the instruction pattern unchanged. You can do this directly +with a <code>define_insn_and_split</code>, but it requires a +<var>new-insn-pattern-1</var> that repeats most of the original <var>insn-pattern</var>. +There is also the complication that an implicit <code>parallel</code> in +<var>insn-pattern</var> must become an explicit <code>parallel</code> in +<var>new-insn-pattern-1</var>, which is easy to overlook. +A simpler alternative is to use <code>define_insn_and_rewrite</code>, which +is a form of <code>define_insn_and_split</code> that automatically generates +<var>new-insn-pattern-1</var> by replacing each <code>match_operand</code> +in <var>insn-pattern</var> with a corresponding <code>match_dup</code>, and each +<code>match_operator</code> in the pattern with a corresponding <code>match_op_dup</code>. +The arguments are otherwise identical to <code>define_insn_and_split</code>: +</p> +<div class="smallexample"> +<pre class="smallexample">(define_insn_and_rewrite + [<var>insn-pattern</var>] + "<var>condition</var>" + "<var>output-template</var>" + "<var>split-condition</var>" + "<var>preparation-statements</var>" + [<var>insn-attributes</var>]) +</pre></div> + +<p>The <code>match_dup</code>s and <code>match_op_dup</code>s in the new +instruction pattern use any new operand values that the +<var>preparation-statements</var> store in the <code>operands</code> array, +as for a normal <code>define_insn_and_split</code>. <var>preparation-statements</var> +can also emit additional instructions before the new instruction. +They can even emit an entirely different sequence of instructions and +use <code>DONE</code> to avoid emitting a new form of the original +instruction. +</p> +<p>The split in a <code>define_insn_and_rewrite</code> is only intended +to apply to existing instructions that match <var>insn-pattern</var>. +<var>split-condition</var> must therefore start with <code>&&</code>, +so that the split condition applies on top of <var>condition</var>. +</p> +<p>Here is an example from the AArch64 SVE port, in which operand 1 is +known to be equivalent to an all-true constant and isn’t used by the +output template: +</p> +<div class="smallexample"> +<pre class="smallexample">(define_insn_and_rewrite "*while_ult<GPI:mode><PRED_ALL:mode>_cc" + [(set (reg:CC CC_REGNUM) + (compare:CC + (unspec:SI [(match_operand:PRED_ALL 1) + (unspec:PRED_ALL + [(match_operand:GPI 2 "aarch64_reg_or_zero" "rZ") + (match_operand:GPI 3 "aarch64_reg_or_zero" "rZ")] + UNSPEC_WHILE_LO)] + UNSPEC_PTEST_PTRUE) + (const_int 0))) + (set (match_operand:PRED_ALL 0 "register_operand" "=Upa") + (unspec:PRED_ALL [(match_dup 2) + (match_dup 3)] + UNSPEC_WHILE_LO))] + "TARGET_SVE" + "whilelo\t%0.<PRED_ALL:Vetype>, %<w>2, %<w>3" + ;; Force the compiler to drop the unused predicate operand, so that we + ;; don't have an unnecessary PTRUE. + "&& !CONSTANT_P (operands[1])" + { + operands[1] = CONSTM1_RTX (<MODE>mode); + } +) +</pre></div> + +<p>The splitter in this case simply replaces operand 1 with the constant +value that it is known to have. The equivalent <code>define_insn_and_split</code> +would be: +</p> +<div class="smallexample"> +<pre class="smallexample">(define_insn_and_split "*while_ult<GPI:mode><PRED_ALL:mode>_cc" + [(set (reg:CC CC_REGNUM) + (compare:CC + (unspec:SI [(match_operand:PRED_ALL 1) + (unspec:PRED_ALL + [(match_operand:GPI 2 "aarch64_reg_or_zero" "rZ") + (match_operand:GPI 3 "aarch64_reg_or_zero" "rZ")] + UNSPEC_WHILE_LO)] + UNSPEC_PTEST_PTRUE) + (const_int 0))) + (set (match_operand:PRED_ALL 0 "register_operand" "=Upa") + (unspec:PRED_ALL [(match_dup 2) + (match_dup 3)] + UNSPEC_WHILE_LO))] + "TARGET_SVE" + "whilelo\t%0.<PRED_ALL:Vetype>, %<w>2, %<w>3" + ;; Force the compiler to drop the unused predicate operand, so that we + ;; don't have an unnecessary PTRUE. + "&& !CONSTANT_P (operands[1])" + [(parallel + [(set (reg:CC CC_REGNUM) + (compare:CC + (unspec:SI [(match_dup 1) + (unspec:PRED_ALL [(match_dup 2) + (match_dup 3)] + UNSPEC_WHILE_LO)] + UNSPEC_PTEST_PTRUE) + (const_int 0))) + (set (match_dup 0) + (unspec:PRED_ALL [(match_dup 2) + (match_dup 3)] + UNSPEC_WHILE_LO))])] + { + operands[1] = CONSTM1_RTX (<MODE>mode); + } +) +</pre></div> + +<hr> +<div class="header"> +<p> +Next: <a href="Including-Patterns.html#Including-Patterns" accesskey="n" rel="next">Including Patterns</a>, Previous: <a href="Expander-Definitions.html#Expander-Definitions" accesskey="p" rel="previous">Expander Definitions</a>, Up: <a href="Machine-Desc.html#Machine-Desc" accesskey="u" rel="up">Machine Desc</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p> +</div> + + + +</body> +</html> |