author | alk3pInjection <webmaster@raspii.tech> | 2024-02-04 16:16:35 +0800 |
---|---|---|
committer | alk3pInjection <webmaster@raspii.tech> | 2024-02-04 16:16:35 +0800 |
commit | abdaadbcae30fe0c9a66c7516798279fdfd97750 (patch) | |
tree | 00a54a6e25601e43876d03c1a4a12a749d4a914c /share/doc/cppinternals/Token-Spacing.html |
https://developer.arm.com/downloads/-/arm-gnu-toolchain-downloads
Change-Id: I7303388733328cd98ab9aa3c30236db67f2e9e9c
Diffstat (limited to 'share/doc/cppinternals/Token-Spacing.html')
-rw-r--r-- | share/doc/cppinternals/Token-Spacing.html | 203 |
1 files changed, 203 insertions, 0 deletions
diff --git a/share/doc/cppinternals/Token-Spacing.html b/share/doc/cppinternals/Token-Spacing.html new file mode 100644 index 0000000..8da69cb --- /dev/null +++ b/share/doc/cppinternals/Token-Spacing.html @@ -0,0 +1,203 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> +<html> +<!-- Created by GNU Texinfo 5.1, http://www.gnu.org/software/texinfo/ --> +<head> +<title>The GNU C Preprocessor Internals: Token Spacing</title> + +<meta name="description" content="The GNU C Preprocessor Internals: Token Spacing"> +<meta name="keywords" content="The GNU C Preprocessor Internals: Token Spacing"> +<meta name="resource-type" content="document"> +<meta name="distribution" content="global"> +<meta name="Generator" content="makeinfo"> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8"> +<link href="index.html#Top" rel="start" title="Top"> +<link href="Concept-Index.html#Concept-Index" rel="index" title="Concept Index"> +<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents"> +<link href="index.html#Top" rel="up" title="Top"> +<link href="Line-Numbering.html#Line-Numbering" rel="next" title="Line Numbering"> +<link href="Macro-Expansion.html#Macro-Expansion" rel="previous" title="Macro Expansion"> +<style type="text/css"> +<!-- +a.summary-letter {text-decoration: none} +blockquote.smallquotation {font-size: smaller} +div.display {margin-left: 3.2em} +div.example {margin-left: 3.2em} +div.indentedblock {margin-left: 3.2em} +div.lisp {margin-left: 3.2em} +div.smalldisplay {margin-left: 3.2em} +div.smallexample {margin-left: 3.2em} +div.smallindentedblock {margin-left: 3.2em; font-size: smaller} +div.smalllisp {margin-left: 3.2em} +kbd {font-style:oblique} +pre.display {font-family: inherit} +pre.format {font-family: inherit} +pre.menu-comment {font-family: serif} +pre.menu-preformatted {font-family: serif} +pre.smalldisplay {font-family: inherit; font-size: smaller} +pre.smallexample 
{font-size: smaller} +pre.smallformat {font-family: inherit; font-size: smaller} +pre.smalllisp {font-size: smaller} +span.nocodebreak {white-space:nowrap} +span.nolinebreak {white-space:nowrap} +span.roman {font-family:serif; font-weight:normal} +span.sansserif {font-family:sans-serif; font-weight:normal} +ul.no-bullet {list-style: none} +--> +</style> + + +</head> + +<body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000"> +<a name="Token-Spacing"></a> +<div class="header"> +<p> +Next: <a href="Line-Numbering.html#Line-Numbering" accesskey="n" rel="next">Line Numbering</a>, Previous: <a href="Macro-Expansion.html#Macro-Expansion" accesskey="p" rel="previous">Macro Expansion</a>, Up: <a href="index.html#Top" accesskey="u" rel="up">Top</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html#Concept-Index" title="Index" rel="index">Index</a>]</p> +</div> +<hr> +<a name="Token-Spacing-1"></a> +<h2 class="unnumbered">Token Spacing</h2> +<a name="index-paste-avoidance"></a> +<a name="index-spacing"></a> +<a name="index-token-spacing"></a> + +<p>First, consider an issue that only concerns the stand-alone +preprocessor: there needs to be a guarantee that re-reading its preprocessed +output results in an identical token stream. Without taking special +measures, this might not be the case because of macro substitution. +For example: +</p> +<div class="smallexample"> +<pre class="smallexample">#define PLUS + +#define EMPTY +#define f(x) =x= ++PLUS -EMPTY- PLUS+ f(=) + → + + - - + + = = = +<em>not</em> + → ++ -- ++ === +</pre></div> + +<p>One solution would be to simply insert a space between all adjacent +tokens. However, we would like to keep space insertion to a minimum, +both for aesthetic reasons and because it causes problems for people who +still try to abuse the preprocessor for things like Fortran source and +Makefiles. 
+</p> +<p>For now, just notice that when tokens are added (or removed, as shown by +the <code>EMPTY</code> example) from the original lexed token stream, we need +to check for accidental token pasting. We call this <em>paste +avoidance</em>. Token addition and removal can only occur because of macro +expansion, but accidental pasting can occur in many places: both before +and after each macro replacement, each argument replacement, and +additionally each token created by the ‘<samp>#</samp>’ and ‘<samp>##</samp>’ operators. +</p> +<p>Look at how the preprocessor gets whitespace output correct +normally. The <code>cpp_token</code> structure contains a flags byte, and one +of those flags is <code>PREV_WHITE</code>. This is flagged by the lexer, and +indicates that the token was preceded by whitespace of some form other +than a new line. The stand-alone preprocessor can use this flag to +decide whether to insert a space between tokens in the output. +</p> +<p>Now consider the result of the following macro expansion: +</p> +<div class="smallexample"> +<pre class="smallexample">#define add(x, y, z) x + y +z; +sum = add (1,2, 3); + → sum = 1 + 2 +3; +</pre></div> + +<p>The interesting thing here is that the tokens ‘<samp>1</samp>’ and ‘<samp>2</samp>’ are +output with a preceding space, and ‘<samp>3</samp>’ is output without a +preceding space, but when lexed none of these tokens had that property. +Careful consideration reveals that ‘<samp>1</samp>’ gets its preceding +whitespace from the space preceding ‘<samp>add</samp>’ in the macro invocation, +<em>not</em> replacement list. ‘<samp>2</samp>’ gets its whitespace from the +space preceding the parameter ‘<samp>y</samp>’ in the macro replacement list, +and ‘<samp>3</samp>’ has no preceding space because parameter ‘<samp>z</samp>’ has none +in the replacement list. 
+</p> +<p>Once lexed, tokens are effectively fixed and cannot be altered, since +pointers to them might be held in many places, in particular by +in-progress macro expansions. So instead of modifying the two tokens +above, the preprocessor inserts a special token, which I call a +<em>padding token</em>, into the token stream to indicate that spacing of +the subsequent token is special. The preprocessor inserts padding +tokens in front of every macro expansion and expanded macro argument. +These point to a <em>source token</em> from which the subsequent real token +should inherit its spacing. In the above example, the source tokens are +‘<samp>add</samp>’ in the macro invocation, and ‘<samp>y</samp>’ and ‘<samp>z</samp>’ in the +macro replacement list, respectively. +</p> +<p>It is quite easy to get multiple padding tokens in a row, for example if +a macro’s first replacement token expands straight into another macro. +</p> +<div class="smallexample"> +<pre class="smallexample">#define foo bar +#define bar baz +[foo] + → [baz] +</pre></div> + +<p>Here, two padding tokens are generated with sources the ‘<samp>foo</samp>’ token +between the brackets, and the ‘<samp>bar</samp>’ token from foo’s replacement +list, respectively. Clearly the first padding token is the one to +use, so the output code should contain a rule that the first +padding token in a sequence is the one that matters. +</p> +<p>But what if a macro expansion is left? Adjusting the above +example slightly: +</p> +<div class="smallexample"> +<pre class="smallexample">#define foo bar +#define bar EMPTY baz +#define EMPTY +[foo] EMPTY; + → [ baz] ; +</pre></div> + +<p>As shown, now there should be a space before ‘<samp>baz</samp>’ and the +semicolon in the output. +</p> +<p>The rules we decided above fail for ‘<samp>baz</samp>’: we generate three +padding tokens, one per macro invocation, before the token ‘<samp>baz</samp>’. 
+We would then have it take its spacing from the first of these, which +carries source token ‘<samp>foo</samp>’ with no leading space. +</p> +<p>It is vital that cpplib get spacing correct in these examples since any +of these macro expansions could be stringized, where spacing matters. +</p> +<p>So, this demonstrates that not just entering macro and argument +expansions, but also leaving them, requires special handling. I made +cpplib insert a padding token with a <code>NULL</code> source token when +leaving macro expansions, as well as after each replaced argument in a +macro’s replacement list. It also inserts appropriate padding tokens on +either side of tokens created by the ‘<samp>#</samp>’ and ‘<samp>##</samp>’ operators. +I expanded the rule so that, if we see a padding token with a +<code>NULL</code> source token, <em>and</em> the source token currently in +effect has no leading space, then we behave as if we have seen no padding +tokens at all. A quick check shows this rule will then get the above +example correct as well. +</p> +<p>Now a relationship with paste avoidance is apparent: we have to be +careful about paste avoidance in exactly the same locations we have +padding tokens in order to get white space correct. This makes +implementation of paste avoidance easy: wherever the stand-alone +preprocessor is fixing up spacing because of padding tokens, and it +turns out that no space is needed, it has to take the extra step to +check that a space is not needed after all to avoid an accidental paste. +The function <code>cpp_avoid_paste</code> advises whether a space is required +between two consecutive tokens. To avoid excessive spacing, it tries +hard to only require a space if one is likely to be necessary, but for +reasons of efficiency it is slightly conservative and might recommend a +space where one is not strictly needed. 
+</p> +<hr> +<div class="header"> +<p> +Next: <a href="Line-Numbering.html#Line-Numbering" accesskey="n" rel="next">Line Numbering</a>, Previous: <a href="Macro-Expansion.html#Macro-Expansion" accesskey="p" rel="previous">Macro Expansion</a>, Up: <a href="index.html#Top" accesskey="u" rel="up">Top</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html#Concept-Index" title="Index" rel="index">Index</a>]</p> +</div> + + + +</body> +</html>
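Reviewer's note: the padding-token rule described in the added page can be sketched in a few lines. This is a minimal illustration in Python, not cpplib's code. The names `Token`, `Padding` and `render` are hypothetical, `prev_white` stands in for the `PREV_WHITE` flag, and a `None` source stands in for a `NULL` source token. The sketch covers only spacing inheritance and leaves out the paste check that `cpp_avoid_paste` performs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    """A real token (hypothetical model, not cpplib's cpp_token)."""
    text: str
    prev_white: bool = False  # stands in for the PREV_WHITE flag


@dataclass(frozen=True)
class Padding:
    """A padding token; source=None stands in for a NULL source token."""
    source: Optional[Token]


def render(stream):
    """Print a token stream, letting padding tokens decide the spacing."""
    out = []
    source = None  # source token currently in effect, if any
    for item in stream:
        if isinstance(item, Padding):
            # The first source in a run wins. A NULL padding resets the
            # run when the source in effect has no leading space.
            if source is None or (not source.prev_white and item.source is None):
                source = item.source
            continue
        # A real token inherits its spacing from the source in effect,
        # or falls back to its own flag when there is none.
        donor = source if source is not None else item
        if donor.prev_white and out:
            out.append(" ")
        out.append(item.text)
        source = None  # a real token ends the run of padding tokens
    return "".join(out)


# Stream assumed for "[foo] EMPTY;" with foo -> bar, bar -> EMPTY baz.
foo, bar, empty_in_bar = Token("foo"), Token("bar"), Token("EMPTY")
empty_spaced = Token("EMPTY", prev_white=True)
stream = [
    Token("["),
    Padding(foo), Padding(bar), Padding(empty_in_bar),  # entering three macros
    Padding(None),                                      # leaving EMPTY
    Token("baz", prev_white=True),
    Padding(None), Padding(None),                       # leaving bar, then foo
    Token("]"),
    Padding(empty_spaced), Padding(None),               # second EMPTY
    Token(";"),
]
print(render(stream))  # [ baz] ;
```

Without the reset on a `NULL` padding, `baz` would take its spacing from `foo` and lose its leading space, which is the failure the page describes.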