Diffstat (limited to 'share/doc/gcc/Nvidia-PTX-Options.html')
-rw-r--r-- share/doc/gcc/Nvidia-PTX-Options.html 193
1 files changed, 193 insertions, 0 deletions
diff --git a/share/doc/gcc/Nvidia-PTX-Options.html b/share/doc/gcc/Nvidia-PTX-Options.html
new file mode 100644
index 0000000..3d9453c
--- /dev/null
+++ b/share/doc/gcc/Nvidia-PTX-Options.html
@@ -0,0 +1,193 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html>
+<!-- This file documents the use of the GNU compilers.
+
+Copyright (C) 1988-2023 Free Software Foundation, Inc.
+
+Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.3 or
+any later version published by the Free Software Foundation; with the
+Invariant Sections being "Funding Free Software", the Front-Cover
+Texts being (a) (see below), and with the Back-Cover Texts being (b)
+(see below). A copy of the license is included in the section entitled
+"GNU Free Documentation License".
+
+(a) The FSF's Front-Cover Text is:
+
+A GNU Manual
+
+(b) The FSF's Back-Cover Text is:
+
+You have freedom to copy and modify this GNU Manual, like GNU
+ software. Copies published by the Free Software Foundation raise
+ funds for GNU development. -->
+<!-- Created by GNU Texinfo 5.1, http://www.gnu.org/software/texinfo/ -->
+<head>
+<title>Using the GNU Compiler Collection (GCC): Nvidia PTX Options</title>
+
+<meta name="description" content="Using the GNU Compiler Collection (GCC): Nvidia PTX Options">
+<meta name="keywords" content="Using the GNU Compiler Collection (GCC): Nvidia PTX Options">
+<meta name="resource-type" content="document">
+<meta name="distribution" content="global">
+<meta name="Generator" content="makeinfo">
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<link href="index.html#Top" rel="start" title="Top">
+<link href="Indices.html#Indices" rel="index" title="Indices">
+<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
+<link href="Submodel-Options.html#Submodel-Options" rel="up" title="Submodel Options">
+<link href="OpenRISC-Options.html#OpenRISC-Options" rel="next" title="OpenRISC Options">
+<link href="Nios-II-Options.html#Nios-II-Options" rel="previous" title="Nios II Options">
+<style type="text/css">
+<!--
+a.summary-letter {text-decoration: none}
+blockquote.smallquotation {font-size: smaller}
+div.display {margin-left: 3.2em}
+div.example {margin-left: 3.2em}
+div.indentedblock {margin-left: 3.2em}
+div.lisp {margin-left: 3.2em}
+div.smalldisplay {margin-left: 3.2em}
+div.smallexample {margin-left: 3.2em}
+div.smallindentedblock {margin-left: 3.2em; font-size: smaller}
+div.smalllisp {margin-left: 3.2em}
+kbd {font-style:oblique}
+pre.display {font-family: inherit}
+pre.format {font-family: inherit}
+pre.menu-comment {font-family: serif}
+pre.menu-preformatted {font-family: serif}
+pre.smalldisplay {font-family: inherit; font-size: smaller}
+pre.smallexample {font-size: smaller}
+pre.smallformat {font-family: inherit; font-size: smaller}
+pre.smalllisp {font-size: smaller}
+span.nocodebreak {white-space:nowrap}
+span.nolinebreak {white-space:nowrap}
+span.roman {font-family:serif; font-weight:normal}
+span.sansserif {font-family:sans-serif; font-weight:normal}
+ul.no-bullet {list-style: none}
+-->
+</style>
+
+
+</head>
+
+<body lang="en_US" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000">
+<a name="Nvidia-PTX-Options"></a>
+<div class="header">
+<p>
+Next: <a href="OpenRISC-Options.html#OpenRISC-Options" accesskey="n" rel="next">OpenRISC Options</a>, Previous: <a href="Nios-II-Options.html#Nios-II-Options" accesskey="p" rel="previous">Nios II Options</a>, Up: <a href="Submodel-Options.html#Submodel-Options" accesskey="u" rel="up">Submodel Options</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Indices.html#Indices" title="Index" rel="index">Index</a>]</p>
+</div>
+<hr>
+<a name="Nvidia-PTX-Options-1"></a>
+<h4 class="subsection">3.19.35 Nvidia PTX Options</h4>
+<a name="index-Nvidia-PTX-options"></a>
+<a name="index-nvptx-options"></a>
+
+<p>These options are defined for Nvidia PTX:
+</p>
+<dl compact="compact">
+<dd>
+<a name="index-m64"></a>
+</dd>
+<dt><code>-m64</code></dt>
+<dd><p>Ignored, but preserved for backward compatibility. Only the 64-bit ABI is
+supported.
+</p>
+<a name="index-march-12"></a>
+</dd>
+<dt><code>-march=<var>architecture-string</var></code></dt>
+<dd><p>Generate code for the specified PTX ISA target architecture
+(e.g. &lsquo;<samp>sm_35</samp>&rsquo;). Valid architecture strings are &lsquo;<samp>sm_30</samp>&rsquo;,
+&lsquo;<samp>sm_35</samp>&rsquo;, &lsquo;<samp>sm_53</samp>&rsquo;, &lsquo;<samp>sm_70</samp>&rsquo;, &lsquo;<samp>sm_75</samp>&rsquo; and
+&lsquo;<samp>sm_80</samp>&rsquo;.
+The default depends on how the compiler has been configured; see
+<samp>--with-arch</samp>.
+</p>
+<p>This option sets the value of the preprocessor macro
+<code>__PTX_SM__</code>; for instance, for &lsquo;<samp>sm_35</samp>&rsquo;, it has the value
+&lsquo;<samp>350</samp>&rsquo;.
+</p>
+<a name="index-misa"></a>
+</dd>
+<dt><code>-misa=<var>architecture-string</var></code></dt>
+<dd><p>Alias of <samp>-march=</samp>.
+</p>
+<a name="index-march-13"></a>
+</dd>
+<dt><code>-march-map=<var>architecture-string</var></code></dt>
+<dd><p>Select the closest available <samp>-march=</samp> value that is not more
+capable than <var>architecture-string</var>. For instance, for <samp>-march-map=sm_50</samp> select
+<samp>-march=sm_35</samp>, and for <samp>-march-map=sm_53</samp> select
+<samp>-march=sm_53</samp>.
+</p>
+<a name="index-mptx"></a>
+</dd>
+<dt><code>-mptx=<var>version-string</var></code></dt>
+<dd><p>Generate code for the specified PTX ISA version (e.g. &lsquo;<samp>7.0</samp>&rsquo;).
+Valid version strings include &lsquo;<samp>3.1</samp>&rsquo;, &lsquo;<samp>6.0</samp>&rsquo;, &lsquo;<samp>6.3</samp>&rsquo;, and
+&lsquo;<samp>7.0</samp>&rsquo;. The default PTX ISA version is 6.0, unless a higher
+version is required by the PTX ISA target architecture specified via
+option <samp>-march=</samp>.
+</p>
+<p>This option sets the values of the preprocessor macros
+<code>__PTX_ISA_VERSION_MAJOR__</code> and <code>__PTX_ISA_VERSION_MINOR__</code>;
+for instance, for &lsquo;<samp>3.1</samp>&rsquo; the macros have the values &lsquo;<samp>3</samp>&rsquo; and
+&lsquo;<samp>1</samp>&rsquo;, respectively.
+</p>
+<a name="index-mmainkernel"></a>
+</dd>
+<dt><code>-mmainkernel</code></dt>
+<dd><p>Link in code for a <code>__main</code> kernel. This is for stand-alone execution rather than
+offloading execution.
+</p>
+<a name="index-moptimize"></a>
+</dd>
+<dt><code>-moptimize</code></dt>
+<dd><p>Apply partitioned execution optimizations. This is the default when any
+level of optimization is selected.
+</p>
+<a name="index-msoft_002dstack"></a>
+</dd>
+<dt><code>-msoft-stack</code></dt>
+<dd><p>Generate code that does not use <code>.local</code> memory
+directly for stack storage. Instead, a per-warp stack pointer is
+maintained explicitly. This enables variable-length stack allocation (with
+variable-length arrays or <code>alloca</code>), and when global memory is used for
+underlying storage, makes it possible to access automatic variables from other
+threads, or with atomic instructions. This code generation variant is used
+for OpenMP offloading, but the option is exposed on its own for the purpose
+of testing the compiler; to generate code suitable for linking into programs
+using OpenMP offloading, use option <samp>-mgomp</samp>.
+</p>
+<a name="index-muniform_002dsimt"></a>
+</dd>
+<dt><code>-muniform-simt</code></dt>
+<dd><p>Switch to a code generation variant that allows all threads in each
+warp to execute, while maintaining memory state and side effects as if only one thread
+in each warp were active outside of OpenMP SIMD regions. All atomic operations
+and calls to the runtime (<code>malloc</code>, <code>free</code>, <code>vprintf</code>) are conditionally executed (if and only if
+the current lane index equals the master lane index), and the register being
+assigned is copied via a shuffle instruction from the master lane. Outside of
+SIMD regions lane 0 is the master; inside, each thread sees itself as the
+master. The shared memory array <code>int __nvptx_uni[]</code> stores all-zeros or
+all-ones bitmasks for each warp, indicating the current mode (0 outside of SIMD
+regions). Each thread can bitwise-and the bitmask at position <code>tid.y</code>
+with the current lane index to compute the master lane index.
+</p>
+<a name="index-mgomp"></a>
+</dd>
+<dt><code>-mgomp</code></dt>
+<dd><p>Generate code for use in OpenMP offloading: enables the <samp>-msoft-stack</samp> and
+<samp>-muniform-simt</samp> options, and selects the corresponding multilib variant.
+</p>
+</dd>
+</dl>
+
+<hr>
+<div class="header">
+<p>
+Next: <a href="OpenRISC-Options.html#OpenRISC-Options" accesskey="n" rel="next">OpenRISC Options</a>, Previous: <a href="Nios-II-Options.html#Nios-II-Options" accesskey="p" rel="previous">Nios II Options</a>, Up: <a href="Submodel-Options.html#Submodel-Options" accesskey="u" rel="up">Submodel Options</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Indices.html#Indices" title="Index" rel="index">Index</a>]</p>
+</div>
+
+
+
+</body>
+</html>
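
Reviewer note on the page added above: three of the documented rules are easy to misread, so here is a rough host-side C sketch of them. It is not GCC's implementation. All function names (`sm_number`, `ptx_sm_value`, `march_map`, `master_lane`, `compiled_ptx_sm`) are hypothetical. The sketch assumes that "not more capable" follows the numeric order of the `sm_NN` suffix, that `__PTX_SM__` is always the suffix times ten (the page only states the `sm_35` to `350` case), and that `-march-map=` has no result below `sm_30` (the page does not say what happens there).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Architecture strings the page lists as valid for -march=, ascending. */
static const char *const valid_archs[] = {
    "sm_30", "sm_35", "sm_53", "sm_70", "sm_75", "sm_80"
};

/* Hypothetical helper: parse "sm_NN" into NN, or return -1 if malformed. */
int sm_number(const char *arch)
{
    if (arch == NULL || strncmp(arch, "sm_", 3) != 0
        || !isdigit((unsigned char)arch[3]))
        return -1;
    char *end;
    long n = strtol(arch + 3, &end, 10);
    return (*end == '\0' && n > 0 && n < 1000) ? (int)n : -1;
}

/* Value __PTX_SM__ would take for an architecture string.
   Assumption: suffix times ten, generalized from sm_35 -> 350. */
int ptx_sm_value(const char *arch)
{
    int n = sm_number(arch);
    return n < 0 ? -1 : n * 10;
}

/* -march-map= rule as documented: the closest listed -march= value that
   is not more capable than the request. Returns NULL when nothing in the
   list qualifies (assumption; the page does not cover that case). */
const char *march_map(const char *requested)
{
    int want = sm_number(requested);
    const char *best = NULL;
    for (size_t i = 0; i < sizeof valid_archs / sizeof valid_archs[0]; i++) {
        if (sm_number(valid_archs[i]) <= want)
            best = valid_archs[i];   /* list is ascending, so keep the last fit */
    }
    return best;
}

/* -muniform-simt master lane: the per-warp bitmask is all-zeros outside
   SIMD regions (master is lane 0) and all-ones inside (each lane is its
   own master), so a bitwise AND with the lane index selects the master. */
unsigned master_lane(unsigned mask, unsigned lane)
{
    assert(mask == 0u || mask == ~0u);
    return mask & lane;
}

/* Reads the real macro when built by an nvptx compiler; 0 on other hosts. */
int compiled_ptx_sm(void)
{
#ifdef __PTX_SM__
    return __PTX_SM__;
#else
    return 0;
#endif
}
```

With these definitions, `march_map("sm_50")` yields `"sm_35"` and `march_map("sm_53")` yields `"sm_53"`, matching the two examples given for `-march-map=`.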