Diffstat (limited to 'share/doc/gcc/x86-Built_002din-Functions.html')
-rw-r--r-- | share/doc/gcc/x86-Built_002din-Functions.html | 1857 |
1 files changed, 1857 insertions, 0 deletions
diff --git a/share/doc/gcc/x86-Built_002din-Functions.html b/share/doc/gcc/x86-Built_002din-Functions.html new file mode 100644 index 0000000..5db0876 --- /dev/null +++ b/share/doc/gcc/x86-Built_002din-Functions.html @@ -0,0 +1,1857 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> +<html> +<!-- This file documents the use of the GNU compilers. + +Copyright (C) 1988-2023 Free Software Foundation, Inc. + +Permission is granted to copy, distribute and/or modify this document +under the terms of the GNU Free Documentation License, Version 1.3 or +any later version published by the Free Software Foundation; with the +Invariant Sections being "Funding Free Software", the Front-Cover +Texts being (a) (see below), and with the Back-Cover Texts being (b) +(see below). A copy of the license is included in the section entitled +"GNU Free Documentation License". + +(a) The FSF's Front-Cover Text is: + +A GNU Manual + +(b) The FSF's Back-Cover Text is: + +You have freedom to copy and modify this GNU Manual, like GNU + software. Copies published by the Free Software Foundation raise + funds for GNU development. 
--> +<!-- Created by GNU Texinfo 5.1, http://www.gnu.org/software/texinfo/ --> +<head> +<title>Using the GNU Compiler Collection (GCC): x86 Built-in Functions</title> + +<meta name="description" content="Using the GNU Compiler Collection (GCC): x86 Built-in Functions"> +<meta name="keywords" content="Using the GNU Compiler Collection (GCC): x86 Built-in Functions"> +<meta name="resource-type" content="document"> +<meta name="distribution" content="global"> +<meta name="Generator" content="makeinfo"> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8"> +<link href="index.html#Top" rel="start" title="Top"> +<link href="Indices.html#Indices" rel="index" title="Indices"> +<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents"> +<link href="Target-Builtins.html#Target-Builtins" rel="up" title="Target Builtins"> +<link href="x86-transactional-memory-intrinsics.html#x86-transactional-memory-intrinsics" rel="next" title="x86 transactional memory intrinsics"> +<link href="TI-C6X-Built_002din-Functions.html#TI-C6X-Built_002din-Functions" rel="previous" title="TI C6X Built-in Functions"> +<style type="text/css"> +<!-- +a.summary-letter {text-decoration: none} +blockquote.smallquotation {font-size: smaller} +div.display {margin-left: 3.2em} +div.example {margin-left: 3.2em} +div.indentedblock {margin-left: 3.2em} +div.lisp {margin-left: 3.2em} +div.smalldisplay {margin-left: 3.2em} +div.smallexample {margin-left: 3.2em} +div.smallindentedblock {margin-left: 3.2em; font-size: smaller} +div.smalllisp {margin-left: 3.2em} +kbd {font-style:oblique} +pre.display {font-family: inherit} +pre.format {font-family: inherit} +pre.menu-comment {font-family: serif} +pre.menu-preformatted {font-family: serif} +pre.smalldisplay {font-family: inherit; font-size: smaller} +pre.smallexample {font-size: smaller} +pre.smallformat {font-family: inherit; font-size: smaller} +pre.smalllisp {font-size: smaller} +span.nocodebreak {white-space:nowrap} 
+span.nolinebreak {white-space:nowrap} +span.roman {font-family:serif; font-weight:normal} +span.sansserif {font-family:sans-serif; font-weight:normal} +ul.no-bullet {list-style: none} +--> +</style> + + +</head> + +<body lang="en_US" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000"> +<a name="x86-Built_002din-Functions"></a> +<div class="header"> +<p> +Next: <a href="x86-transactional-memory-intrinsics.html#x86-transactional-memory-intrinsics" accesskey="n" rel="next">x86 transactional memory intrinsics</a>, Previous: <a href="TI-C6X-Built_002din-Functions.html#TI-C6X-Built_002din-Functions" accesskey="p" rel="previous">TI C6X Built-in Functions</a>, Up: <a href="Target-Builtins.html#Target-Builtins" accesskey="u" rel="up">Target Builtins</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Indices.html#Indices" title="Index" rel="index">Index</a>]</p> +</div> +<hr> +<a name="x86-Built_002din-Functions-1"></a> +<h4 class="subsection">6.60.35 x86 Built-in Functions</h4> + +<p>These built-in functions are available for the x86-32 and x86-64 family +of computers, depending on the command-line switches used. +</p> +<p>If you specify command-line switches such as <samp>-msse</samp>, +the compiler could use the extended instruction sets even if the built-ins +are not used explicitly in the program. For this reason, applications +that perform run-time CPU detection must compile separate files for each +supported architecture, using the appropriate flags. In particular, +the file containing the CPU detection code should be compiled without +these options. +</p> +<p>The following machine modes are available for use with MMX built-in functions +(see <a href="Vector-Extensions.html#Vector-Extensions">Vector Extensions</a>): <code>V2SI</code> for a vector of two 32-bit integers, +<code>V4HI</code> for a vector of four 16-bit integers, and <code>V8QI</code> for a +vector of eight 8-bit integers. 
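+</p>
+<p>As a minimal sketch of how these modes appear in C source, assuming only
+GCC's generic vector extensions and calling no MMX built-in, vector types of
+the matching width can be declared with the <code>vector_size</code> attribute.
+The typedef names follow the prototypes listed in this section;
+<code>add_v2si</code> and <code>lane_v2si</code> are hypothetical helpers introduced
+only for illustration.
+</p>
+<div class="smallexample">
+<pre class="smallexample">/* 64-bit vector types matching the V2SI, V4HI and V8QI modes
+   (assumes 32-bit int and 16-bit short, as on x86).  */
+typedef int v2si __attribute__ ((vector_size (8)));
+typedef short v4hi __attribute__ ((vector_size (8)));
+typedef char v8qi __attribute__ ((vector_size (8)));
+
+/* Element-wise addition; the compiler chooses the instructions.  */
+static v2si
+add_v2si (v2si a, v2si b)
+{
+  return a + b;
+}
+
+/* Read element I of a v2si value.  */
+static int
+lane_v2si (v2si v, int i)
+{
+  return v[i];
+}
+</pre></div>
+
+<p>With these definitions, <code>add_v2si ((v2si) {1, 2}, (v2si) {3, 4})</code>
+yields a vector whose elements are 4 and 6.
+</p>
+<p>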
Some of the built-in functions operate on +MMX registers as a whole 64-bit entity; these use <code>V1DI</code> as their mode. +</p> +<p>If 3DNow! extensions are enabled, <code>V2SF</code> is used as a mode for a vector +of two 32-bit floating-point values. +</p> +<p>If SSE extensions are enabled, <code>V4SF</code> is used for a vector of four 32-bit +floating-point values. Some instructions use a vector of four 32-bit +integers; these use <code>V4SI</code>. Finally, some instructions operate on an +entire vector register, interpreting it as a 128-bit integer; these use mode +<code>TI</code>. +</p> +<p>The x86-32 and x86-64 family of processors uses additional built-in +functions for efficient use of <code>TF</code> (<code>__float128</code>) 128-bit +floating-point and <code>TC</code> 128-bit complex floating-point values. +</p> +<p>The following floating-point built-in functions are always available: +</p> +<dl> +<dt><a name="index-_005f_005fbuiltin_005ffabsq"></a>Built-in Function: <em>__float128</em> <strong>__builtin_fabsq</strong> <em>(__float128 <var>x</var>)</em></dt> +<dd><p>Computes the absolute value of <var>x</var>. +</p></dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fcopysignq"></a>Built-in Function: <em>__float128</em> <strong>__builtin_copysignq</strong> <em>(__float128 <var>x</var>, __float128 <var>y</var>)</em></dt> +<dd><p>Copies the sign of <var>y</var> into <var>x</var> and returns the new value of +<var>x</var>. +</p></dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005finfq"></a>Built-in Function: <em>__float128</em> <strong>__builtin_infq</strong> <em>(void)</em></dt> +<dd><p>Similar to <code>__builtin_inf</code>, except the return type is <code>__float128</code>. 
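+</p>
+<p>The three built-in functions documented so far can be combined as in the
+following hedged sketch, which assumes an x86 target compiled with GCC;
+<code>negative_infinity_q</code> and <code>is_infinite_q</code> are hypothetical helper
+names, not part of GCC.
+</p>
+<div class="smallexample">
+<pre class="smallexample">/* Negative infinity, built by copying a negative sign onto +Inf.  */
+static __float128
+negative_infinity_q (void)
+{
+  return __builtin_copysignq (__builtin_infq (), (__float128) -1.0);
+}
+
+/* Non-zero if X is positive or negative infinity.  */
+static int
+is_infinite_q (__float128 x)
+{
+  return __builtin_fabsq (x) == __builtin_infq ();
+}
+</pre></div>
+
+<p>Under these assumptions, <code>is_infinite_q (negative_infinity_q ())</code> is
+non-zero, while <code>is_infinite_q (1.0)</code> is zero.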
+</p></dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fhuge_005fvalq"></a>Built-in Function: <em>__float128</em> <strong>__builtin_huge_valq</strong> <em>(void)</em></dt> +<dd><p>Similar to <code>__builtin_huge_val</code>, except the return type is <code>__float128</code>. +</p></dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fnanq"></a>Built-in Function: <em>__float128</em> <strong>__builtin_nanq</strong> <em>(void)</em></dt> +<dd><p>Similar to <code>__builtin_nan</code>, except the return type is <code>__float128</code>. +</p></dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fnansq"></a>Built-in Function: <em>__float128</em> <strong>__builtin_nansq</strong> <em>(void)</em></dt> +<dd><p>Similar to <code>__builtin_nans</code>, except the return type is <code>__float128</code>. +</p></dd></dl> + +<p>The following built-in function is always available. +</p> +<dl> +<dt><a name="index-_005f_005fbuiltin_005fia32_005fpause"></a>Built-in Function: <em>void</em> <strong>__builtin_ia32_pause</strong> <em>(void)</em></dt> +<dd><p>Generates the <code>pause</code> machine instruction with a compiler memory +barrier. +</p></dd></dl> + +<p>The following built-in functions are always available and can be used to +check the target platform type. +</p> +<dl> +<dt><a name="index-_005f_005fbuiltin_005fcpu_005finit-1"></a>Built-in Function: <em>void</em> <strong>__builtin_cpu_init</strong> <em>(void)</em></dt> +<dd><p>This function runs the CPU detection code to check the type of CPU and the +features supported. This built-in function needs to be invoked along with the built-in functions +to check CPU type and features, <code>__builtin_cpu_is</code> and +<code>__builtin_cpu_supports</code>, only when used in a function that is +executed before any constructors are called. The CPU detection code is +automatically executed in a very high priority constructor. 
+</p> +<p>For example, this function has to be used in <code>ifunc</code> resolvers that +check for CPU type using the built-in functions <code>__builtin_cpu_is</code> +and <code>__builtin_cpu_supports</code>, or in constructors on targets that +don’t support constructor priority. +</p><div class="smallexample"> +<pre class="smallexample"> +static void (*resolve_memcpy (void)) (void) +{ + // ifunc resolvers fire before constructors, explicitly call the init + // function. + __builtin_cpu_init (); + if (__builtin_cpu_supports ("ssse3")) + return ssse3_memcpy; // super fast memcpy with ssse3 instructions. + else + return default_memcpy; +} + +void *memcpy (void *, const void *, size_t) + __attribute__ ((ifunc ("resolve_memcpy"))); +</pre></div> + +</dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fcpu_005fis-1"></a>Built-in Function: <em>int</em> <strong>__builtin_cpu_is</strong> <em>(const char *<var>cpuname</var>)</em></dt> +<dd><p>This function returns a positive integer if the run-time CPU +is of type <var>cpuname</var> +and returns <code>0</code> otherwise. The following CPU names can be detected: +</p> +<dl compact="compact"> +<dt>‘<samp>amd</samp>’</dt> +<dd><p>AMD CPU. +</p> +</dd> +<dt>‘<samp>intel</samp>’</dt> +<dd><p>Intel CPU. +</p> +</dd> +<dt>‘<samp>atom</samp>’</dt> +<dd><p>Intel Atom CPU. +</p> +</dd> +<dt>‘<samp>slm</samp>’</dt> +<dd><p>Intel Silvermont CPU. +</p> +</dd> +<dt>‘<samp>core2</samp>’</dt> +<dd><p>Intel Core 2 CPU. +</p> +</dd> +<dt>‘<samp>corei7</samp>’</dt> +<dd><p>Intel Core i7 CPU. +</p> +</dd> +<dt>‘<samp>nehalem</samp>’</dt> +<dd><p>Intel Core i7 Nehalem CPU. +</p> +</dd> +<dt>‘<samp>westmere</samp>’</dt> +<dd><p>Intel Core i7 Westmere CPU. +</p> +</dd> +<dt>‘<samp>sandybridge</samp>’</dt> +<dd><p>Intel Core i7 Sandy Bridge CPU. +</p> +</dd> +<dt>‘<samp>ivybridge</samp>’</dt> +<dd><p>Intel Core i7 Ivy Bridge CPU. +</p> +</dd> +<dt>‘<samp>haswell</samp>’</dt> +<dd><p>Intel Core i7 Haswell CPU. 
+</p> +</dd> +<dt>‘<samp>broadwell</samp>’</dt> +<dd><p>Intel Core i7 Broadwell CPU. +</p> +</dd> +<dt>‘<samp>skylake</samp>’</dt> +<dd><p>Intel Core i7 Skylake CPU. +</p> +</dd> +<dt>‘<samp>skylake-avx512</samp>’</dt> +<dd><p>Intel Core i7 Skylake AVX512 CPU. +</p> +</dd> +<dt>‘<samp>cannonlake</samp>’</dt> +<dd><p>Intel Core i7 Cannon Lake CPU. +</p> +</dd> +<dt>‘<samp>icelake-client</samp>’</dt> +<dd><p>Intel Core i7 Ice Lake Client CPU. +</p> +</dd> +<dt>‘<samp>icelake-server</samp>’</dt> +<dd><p>Intel Core i7 Ice Lake Server CPU. +</p> +</dd> +<dt>‘<samp>cascadelake</samp>’</dt> +<dd><p>Intel Core i7 Cascadelake CPU. +</p> +</dd> +<dt>‘<samp>tigerlake</samp>’</dt> +<dd><p>Intel Core i7 Tigerlake CPU. +</p> +</dd> +<dt>‘<samp>cooperlake</samp>’</dt> +<dd><p>Intel Core i7 Cooperlake CPU. +</p> +</dd> +<dt>‘<samp>sapphirerapids</samp>’</dt> +<dd><p>Intel Core i7 sapphirerapids CPU. +</p> +</dd> +<dt>‘<samp>alderlake</samp>’</dt> +<dd><p>Intel Core i7 Alderlake CPU. +</p> +</dd> +<dt>‘<samp>rocketlake</samp>’</dt> +<dd><p>Intel Core i7 Rocketlake CPU. +</p> +</dd> +<dt>‘<samp>graniterapids</samp>’</dt> +<dd><p>Intel Core i7 graniterapids CPU. +</p> +</dd> +<dt>‘<samp>graniterapids-d</samp>’</dt> +<dd><p>Intel Core i7 graniterapids D CPU. +</p> +</dd> +<dt>‘<samp>bonnell</samp>’</dt> +<dd><p>Intel Atom Bonnell CPU. +</p> +</dd> +<dt>‘<samp>silvermont</samp>’</dt> +<dd><p>Intel Atom Silvermont CPU. +</p> +</dd> +<dt>‘<samp>goldmont</samp>’</dt> +<dd><p>Intel Atom Goldmont CPU. +</p> +</dd> +<dt>‘<samp>goldmont-plus</samp>’</dt> +<dd><p>Intel Atom Goldmont Plus CPU. +</p> +</dd> +<dt>‘<samp>tremont</samp>’</dt> +<dd><p>Intel Atom Tremont CPU. +</p> +</dd> +<dt>‘<samp>sierraforest</samp>’</dt> +<dd><p>Intel Atom Sierra Forest CPU. +</p> +</dd> +<dt>‘<samp>grandridge</samp>’</dt> +<dd><p>Intel Atom Grand Ridge CPU. +</p> +</dd> +<dt>‘<samp>knl</samp>’</dt> +<dd><p>Intel Knights Landing CPU. +</p> +</dd> +<dt>‘<samp>knm</samp>’</dt> +<dd><p>Intel Knights Mill CPU. 
+</p> +</dd> +<dt>‘<samp>lujiazui</samp>’</dt> +<dd><p>ZHAOXIN lujiazui CPU. +</p> +</dd> +<dt>‘<samp>amdfam10h</samp>’</dt> +<dd><p>AMD Family 10h CPU. +</p> +</dd> +<dt>‘<samp>barcelona</samp>’</dt> +<dd><p>AMD Family 10h Barcelona CPU. +</p> +</dd> +<dt>‘<samp>shanghai</samp>’</dt> +<dd><p>AMD Family 10h Shanghai CPU. +</p> +</dd> +<dt>‘<samp>istanbul</samp>’</dt> +<dd><p>AMD Family 10h Istanbul CPU. +</p> +</dd> +<dt>‘<samp>btver1</samp>’</dt> +<dd><p>AMD Family 14h CPU. +</p> +</dd> +<dt>‘<samp>amdfam15h</samp>’</dt> +<dd><p>AMD Family 15h CPU. +</p> +</dd> +<dt>‘<samp>bdver1</samp>’</dt> +<dd><p>AMD Family 15h Bulldozer version 1. +</p> +</dd> +<dt>‘<samp>bdver2</samp>’</dt> +<dd><p>AMD Family 15h Bulldozer version 2. +</p> +</dd> +<dt>‘<samp>bdver3</samp>’</dt> +<dd><p>AMD Family 15h Bulldozer version 3. +</p> +</dd> +<dt>‘<samp>bdver4</samp>’</dt> +<dd><p>AMD Family 15h Bulldozer version 4. +</p> +</dd> +<dt>‘<samp>btver2</samp>’</dt> +<dd><p>AMD Family 16h CPU. +</p> +</dd> +<dt>‘<samp>amdfam17h</samp>’</dt> +<dd><p>AMD Family 17h CPU. +</p> +</dd> +<dt>‘<samp>znver1</samp>’</dt> +<dd><p>AMD Family 17h Zen version 1. +</p> +</dd> +<dt>‘<samp>znver2</samp>’</dt> +<dd><p>AMD Family 17h Zen version 2. +</p> +</dd> +<dt>‘<samp>amdfam19h</samp>’</dt> +<dd><p>AMD Family 19h CPU. +</p> +</dd> +<dt>‘<samp>znver3</samp>’</dt> +<dd><p>AMD Family 19h Zen version 3. +</p> +</dd> +<dt>‘<samp>znver4</samp>’</dt> +<dd><p>AMD Family 19h Zen version 4. +</p></dd> +</dl> + +<p>Here is an example: +</p><div class="smallexample"> +<pre class="smallexample">if (__builtin_cpu_is ("corei7")) + { + do_corei7 (); // Core i7 specific implementation. + } +else + { + do_generic (); // Generic implementation. 
+ } +</pre></div> +</dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fcpu_005fsupports-1"></a>Built-in Function: <em>int</em> <strong>__builtin_cpu_supports</strong> <em>(const char *<var>feature</var>)</em></dt> +<dd><p>This function returns a positive integer if the run-time CPU +supports <var>feature</var> +and returns <code>0</code> otherwise. The following features can be detected: +</p> +<dl compact="compact"> +<dt>‘<samp>cmov</samp>’</dt> +<dd><p>CMOV instruction. +</p></dd> +<dt>‘<samp>mmx</samp>’</dt> +<dd><p>MMX instructions. +</p></dd> +<dt>‘<samp>popcnt</samp>’</dt> +<dd><p>POPCNT instruction. +</p></dd> +<dt>‘<samp>sse</samp>’</dt> +<dd><p>SSE instructions. +</p></dd> +<dt>‘<samp>sse2</samp>’</dt> +<dd><p>SSE2 instructions. +</p></dd> +<dt>‘<samp>sse3</samp>’</dt> +<dd><p>SSE3 instructions. +</p></dd> +<dt>‘<samp>ssse3</samp>’</dt> +<dd><p>SSSE3 instructions. +</p></dd> +<dt>‘<samp>sse4.1</samp>’</dt> +<dd><p>SSE4.1 instructions. +</p></dd> +<dt>‘<samp>sse4.2</samp>’</dt> +<dd><p>SSE4.2 instructions. +</p></dd> +<dt>‘<samp>avx</samp>’</dt> +<dd><p>AVX instructions. +</p></dd> +<dt>‘<samp>avx2</samp>’</dt> +<dd><p>AVX2 instructions. +</p></dd> +<dt>‘<samp>sse4a</samp>’</dt> +<dd><p>SSE4A instructions. +</p></dd> +<dt>‘<samp>fma4</samp>’</dt> +<dd><p>FMA4 instructions. +</p></dd> +<dt>‘<samp>xop</samp>’</dt> +<dd><p>XOP instructions. +</p></dd> +<dt>‘<samp>fma</samp>’</dt> +<dd><p>FMA instructions. +</p></dd> +<dt>‘<samp>avx512f</samp>’</dt> +<dd><p>AVX512F instructions. +</p></dd> +<dt>‘<samp>bmi</samp>’</dt> +<dd><p>BMI instructions. +</p></dd> +<dt>‘<samp>bmi2</samp>’</dt> +<dd><p>BMI2 instructions. +</p></dd> +<dt>‘<samp>aes</samp>’</dt> +<dd><p>AES instructions. +</p></dd> +<dt>‘<samp>pclmul</samp>’</dt> +<dd><p>PCLMUL instructions. +</p></dd> +<dt>‘<samp>avx512vl</samp>’</dt> +<dd><p>AVX512VL instructions. +</p></dd> +<dt>‘<samp>avx512bw</samp>’</dt> +<dd><p>AVX512BW instructions. 
+</p></dd> +<dt>‘<samp>avx512dq</samp>’</dt> +<dd><p>AVX512DQ instructions. +</p></dd> +<dt>‘<samp>avx512cd</samp>’</dt> +<dd><p>AVX512CD instructions. +</p></dd> +<dt>‘<samp>avx512er</samp>’</dt> +<dd><p>AVX512ER instructions. +</p></dd> +<dt>‘<samp>avx512pf</samp>’</dt> +<dd><p>AVX512PF instructions. +</p></dd> +<dt>‘<samp>avx512vbmi</samp>’</dt> +<dd><p>AVX512VBMI instructions. +</p></dd> +<dt>‘<samp>avx512ifma</samp>’</dt> +<dd><p>AVX512IFMA instructions. +</p></dd> +<dt>‘<samp>avx5124vnniw</samp>’</dt> +<dd><p>AVX5124VNNIW instructions. +</p></dd> +<dt>‘<samp>avx5124fmaps</samp>’</dt> +<dd><p>AVX5124FMAPS instructions. +</p></dd> +<dt>‘<samp>avx512vpopcntdq</samp>’</dt> +<dd><p>AVX512VPOPCNTDQ instructions. +</p></dd> +<dt>‘<samp>avx512vbmi2</samp>’</dt> +<dd><p>AVX512VBMI2 instructions. +</p></dd> +<dt>‘<samp>gfni</samp>’</dt> +<dd><p>GFNI instructions. +</p></dd> +<dt>‘<samp>vpclmulqdq</samp>’</dt> +<dd><p>VPCLMULQDQ instructions. +</p></dd> +<dt>‘<samp>avx512vnni</samp>’</dt> +<dd><p>AVX512VNNI instructions. +</p></dd> +<dt>‘<samp>avx512bitalg</samp>’</dt> +<dd><p>AVX512BITALG instructions. +</p></dd> +<dt>‘<samp>x86-64</samp>’</dt> +<dd><p>Baseline x86-64 microarchitecture level (as defined in x86-64 psABI). +</p></dd> +<dt>‘<samp>x86-64-v2</samp>’</dt> +<dd><p>x86-64-v2 microarchitecture level. +</p></dd> +<dt>‘<samp>x86-64-v3</samp>’</dt> +<dd><p>x86-64-v3 microarchitecture level. +</p></dd> +<dt>‘<samp>x86-64-v4</samp>’</dt> +<dd><p>x86-64-v4 microarchitecture level. +</p> + +</dd> +</dl> + +<p>Here is an example: +</p><div class="smallexample"> +<pre class="smallexample">if (__builtin_cpu_supports ("popcnt")) + { + asm("popcnt %1,%0" : "=r"(count) : "rm"(n) : "cc"); + } +else + { + count = generic_countbits (n); //generic implementation. + } +</pre></div> +</dd></dl> + +<p>The following built-in functions are made available by <samp>-mmmx</samp>. +All of them generate the machine instruction that is part of the name. 
+</p> +<div class="smallexample"> +<pre class="smallexample">v8qi __builtin_ia32_paddb (v8qi, v8qi); +v4hi __builtin_ia32_paddw (v4hi, v4hi); +v2si __builtin_ia32_paddd (v2si, v2si); +v8qi __builtin_ia32_psubb (v8qi, v8qi); +v4hi __builtin_ia32_psubw (v4hi, v4hi); +v2si __builtin_ia32_psubd (v2si, v2si); +v8qi __builtin_ia32_paddsb (v8qi, v8qi); +v4hi __builtin_ia32_paddsw (v4hi, v4hi); +v8qi __builtin_ia32_psubsb (v8qi, v8qi); +v4hi __builtin_ia32_psubsw (v4hi, v4hi); +v8qi __builtin_ia32_paddusb (v8qi, v8qi); +v4hi __builtin_ia32_paddusw (v4hi, v4hi); +v8qi __builtin_ia32_psubusb (v8qi, v8qi); +v4hi __builtin_ia32_psubusw (v4hi, v4hi); +v4hi __builtin_ia32_pmullw (v4hi, v4hi); +v4hi __builtin_ia32_pmulhw (v4hi, v4hi); +di __builtin_ia32_pand (di, di); +di __builtin_ia32_pandn (di,di); +di __builtin_ia32_por (di, di); +di __builtin_ia32_pxor (di, di); +v8qi __builtin_ia32_pcmpeqb (v8qi, v8qi); +v4hi __builtin_ia32_pcmpeqw (v4hi, v4hi); +v2si __builtin_ia32_pcmpeqd (v2si, v2si); +v8qi __builtin_ia32_pcmpgtb (v8qi, v8qi); +v4hi __builtin_ia32_pcmpgtw (v4hi, v4hi); +v2si __builtin_ia32_pcmpgtd (v2si, v2si); +v8qi __builtin_ia32_punpckhbw (v8qi, v8qi); +v4hi __builtin_ia32_punpckhwd (v4hi, v4hi); +v2si __builtin_ia32_punpckhdq (v2si, v2si); +v8qi __builtin_ia32_punpcklbw (v8qi, v8qi); +v4hi __builtin_ia32_punpcklwd (v4hi, v4hi); +v2si __builtin_ia32_punpckldq (v2si, v2si); +v8qi __builtin_ia32_packsswb (v4hi, v4hi); +v4hi __builtin_ia32_packssdw (v2si, v2si); +v8qi __builtin_ia32_packuswb (v4hi, v4hi); + +v4hi __builtin_ia32_psllw (v4hi, v4hi); +v2si __builtin_ia32_pslld (v2si, v2si); +v1di __builtin_ia32_psllq (v1di, v1di); +v4hi __builtin_ia32_psrlw (v4hi, v4hi); +v2si __builtin_ia32_psrld (v2si, v2si); +v1di __builtin_ia32_psrlq (v1di, v1di); +v4hi __builtin_ia32_psraw (v4hi, v4hi); +v2si __builtin_ia32_psrad (v2si, v2si); +v4hi __builtin_ia32_psllwi (v4hi, int); +v2si __builtin_ia32_pslldi (v2si, int); +v1di __builtin_ia32_psllqi (v1di, int); +v4hi 
__builtin_ia32_psrlwi (v4hi, int); +v2si __builtin_ia32_psrldi (v2si, int); +v1di __builtin_ia32_psrlqi (v1di, int); +v4hi __builtin_ia32_psrawi (v4hi, int); +v2si __builtin_ia32_psradi (v2si, int); +</pre></div> + +<p>The following built-in functions are made available either with +<samp>-msse</samp>, or with <samp>-m3dnowa</samp>. All of them generate +the machine instruction that is part of the name. +</p> +<div class="smallexample"> +<pre class="smallexample">v4hi __builtin_ia32_pmulhuw (v4hi, v4hi); +v8qi __builtin_ia32_pavgb (v8qi, v8qi); +v4hi __builtin_ia32_pavgw (v4hi, v4hi); +v1di __builtin_ia32_psadbw (v8qi, v8qi); +v8qi __builtin_ia32_pmaxub (v8qi, v8qi); +v4hi __builtin_ia32_pmaxsw (v4hi, v4hi); +v8qi __builtin_ia32_pminub (v8qi, v8qi); +v4hi __builtin_ia32_pminsw (v4hi, v4hi); +int __builtin_ia32_pmovmskb (v8qi); +void __builtin_ia32_maskmovq (v8qi, v8qi, char *); +void __builtin_ia32_movntq (di *, di); +void __builtin_ia32_sfence (void); +</pre></div> + +<p>The following built-in functions are available when <samp>-msse</samp> is used. +All of them generate the machine instruction that is part of the name. 
+</p> +<div class="smallexample"> +<pre class="smallexample">int __builtin_ia32_comieq (v4sf, v4sf); +int __builtin_ia32_comineq (v4sf, v4sf); +int __builtin_ia32_comilt (v4sf, v4sf); +int __builtin_ia32_comile (v4sf, v4sf); +int __builtin_ia32_comigt (v4sf, v4sf); +int __builtin_ia32_comige (v4sf, v4sf); +int __builtin_ia32_ucomieq (v4sf, v4sf); +int __builtin_ia32_ucomineq (v4sf, v4sf); +int __builtin_ia32_ucomilt (v4sf, v4sf); +int __builtin_ia32_ucomile (v4sf, v4sf); +int __builtin_ia32_ucomigt (v4sf, v4sf); +int __builtin_ia32_ucomige (v4sf, v4sf); +v4sf __builtin_ia32_addps (v4sf, v4sf); +v4sf __builtin_ia32_subps (v4sf, v4sf); +v4sf __builtin_ia32_mulps (v4sf, v4sf); +v4sf __builtin_ia32_divps (v4sf, v4sf); +v4sf __builtin_ia32_addss (v4sf, v4sf); +v4sf __builtin_ia32_subss (v4sf, v4sf); +v4sf __builtin_ia32_mulss (v4sf, v4sf); +v4sf __builtin_ia32_divss (v4sf, v4sf); +v4sf __builtin_ia32_cmpeqps (v4sf, v4sf); +v4sf __builtin_ia32_cmpltps (v4sf, v4sf); +v4sf __builtin_ia32_cmpleps (v4sf, v4sf); +v4sf __builtin_ia32_cmpgtps (v4sf, v4sf); +v4sf __builtin_ia32_cmpgeps (v4sf, v4sf); +v4sf __builtin_ia32_cmpunordps (v4sf, v4sf); +v4sf __builtin_ia32_cmpneqps (v4sf, v4sf); +v4sf __builtin_ia32_cmpnltps (v4sf, v4sf); +v4sf __builtin_ia32_cmpnleps (v4sf, v4sf); +v4sf __builtin_ia32_cmpngtps (v4sf, v4sf); +v4sf __builtin_ia32_cmpngeps (v4sf, v4sf); +v4sf __builtin_ia32_cmpordps (v4sf, v4sf); +v4sf __builtin_ia32_cmpeqss (v4sf, v4sf); +v4sf __builtin_ia32_cmpltss (v4sf, v4sf); +v4sf __builtin_ia32_cmpless (v4sf, v4sf); +v4sf __builtin_ia32_cmpunordss (v4sf, v4sf); +v4sf __builtin_ia32_cmpneqss (v4sf, v4sf); +v4sf __builtin_ia32_cmpnltss (v4sf, v4sf); +v4sf __builtin_ia32_cmpnless (v4sf, v4sf); +v4sf __builtin_ia32_cmpordss (v4sf, v4sf); +v4sf __builtin_ia32_maxps (v4sf, v4sf); +v4sf __builtin_ia32_maxss (v4sf, v4sf); +v4sf __builtin_ia32_minps (v4sf, v4sf); +v4sf __builtin_ia32_minss (v4sf, v4sf); +v4sf __builtin_ia32_andps (v4sf, v4sf); +v4sf __builtin_ia32_andnps 
(v4sf, v4sf); +v4sf __builtin_ia32_orps (v4sf, v4sf); +v4sf __builtin_ia32_xorps (v4sf, v4sf); +v4sf __builtin_ia32_movss (v4sf, v4sf); +v4sf __builtin_ia32_movhlps (v4sf, v4sf); +v4sf __builtin_ia32_movlhps (v4sf, v4sf); +v4sf __builtin_ia32_unpckhps (v4sf, v4sf); +v4sf __builtin_ia32_unpcklps (v4sf, v4sf); +v4sf __builtin_ia32_cvtpi2ps (v4sf, v2si); +v4sf __builtin_ia32_cvtsi2ss (v4sf, int); +v2si __builtin_ia32_cvtps2pi (v4sf); +int __builtin_ia32_cvtss2si (v4sf); +v2si __builtin_ia32_cvttps2pi (v4sf); +int __builtin_ia32_cvttss2si (v4sf); +v4sf __builtin_ia32_rcpps (v4sf); +v4sf __builtin_ia32_rsqrtps (v4sf); +v4sf __builtin_ia32_sqrtps (v4sf); +v4sf __builtin_ia32_rcpss (v4sf); +v4sf __builtin_ia32_rsqrtss (v4sf); +v4sf __builtin_ia32_sqrtss (v4sf); +v4sf __builtin_ia32_shufps (v4sf, v4sf, int); +void __builtin_ia32_movntps (float *, v4sf); +int __builtin_ia32_movmskps (v4sf); +</pre></div> + +<p>The following built-in functions are available when <samp>-msse</samp> is used. +</p> +<dl> +<dt><a name="index-_005f_005fbuiltin_005fia32_005floadups"></a>Built-in Function: <em>v4sf</em> <strong>__builtin_ia32_loadups</strong> <em>(float *)</em></dt> +<dd><p>Generates the <code>movups</code> machine instruction as a load from memory. +</p></dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fia32_005fstoreups"></a>Built-in Function: <em>void</em> <strong>__builtin_ia32_storeups</strong> <em>(float *, v4sf)</em></dt> +<dd><p>Generates the <code>movups</code> machine instruction as a store to memory. +</p></dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fia32_005floadss"></a>Built-in Function: <em>v4sf</em> <strong>__builtin_ia32_loadss</strong> <em>(float *)</em></dt> +<dd><p>Generates the <code>movss</code> machine instruction as a load from memory. 
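+</p>
+<p>A hedged sketch of the unaligned load and store built-ins documented
+above, combined with <code>__builtin_ia32_addps</code> from the preceding list.
+It assumes GCC on an x86 target with <samp>-msse</samp> enabled; the
+<code>v4sf</code> typedef and the helper <code>add4_unaligned</code> are illustrative
+names, not part of GCC.
+</p>
+<div class="smallexample">
+<pre class="smallexample">typedef float v4sf __attribute__ ((vector_size (16)));
+
+/* Store a[i] + b[i] into dst[i] for i = 0..3.  The unaligned built-ins
+   are used, so none of the pointers needs 16-byte alignment.  */
+static void
+add4_unaligned (float *dst, float *a, float *b)
+{
+  v4sf va = __builtin_ia32_loadups (a);
+  v4sf vb = __builtin_ia32_loadups (b);
+  __builtin_ia32_storeups (dst, __builtin_ia32_addps (va, vb));
+}
+</pre></div>
+
+<p>For inputs <code>{1, 2, 3, 4}</code> and <code>{10, 20, 30, 40}</code>, the destination
+array receives <code>{11, 22, 33, 44}</code>.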
+</p></dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fia32_005floadhps"></a>Built-in Function: <em>v4sf</em> <strong>__builtin_ia32_loadhps</strong> <em>(v4sf, const v2sf *)</em></dt> +<dd><p>Generates the <code>movhps</code> machine instruction as a load from memory. +</p></dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fia32_005floadlps"></a>Built-in Function: <em>v4sf</em> <strong>__builtin_ia32_loadlps</strong> <em>(v4sf, const v2sf *)</em></dt> +<dd><p>Generates the <code>movlps</code> machine instruction as a load from memory. +</p></dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fia32_005fstorehps"></a>Built-in Function: <em>void</em> <strong>__builtin_ia32_storehps</strong> <em>(v2sf *, v4sf)</em></dt> +<dd><p>Generates the <code>movhps</code> machine instruction as a store to memory. +</p></dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fia32_005fstorelps"></a>Built-in Function: <em>void</em> <strong>__builtin_ia32_storelps</strong> <em>(v2sf *, v4sf)</em></dt> +<dd><p>Generates the <code>movlps</code> machine instruction as a store to memory. +</p></dd></dl> + +<p>The following built-in functions are available when <samp>-msse2</samp> is used. +All of them generate the machine instruction that is part of the name. 
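+</p>
+<p>As a hedged sketch of how two of the built-in functions in this group
+combine, the following counts the elements in which one vector compares
+less than another. It assumes GCC on an x86 target with <samp>-msse2</samp>
+enabled; the <code>v2df</code> typedef and the helper <code>count_less_than</code> are
+illustrative names, not part of GCC.
+</p>
+<div class="smallexample">
+<pre class="smallexample">typedef double v2df __attribute__ ((vector_size (16)));
+
+/* Number of elements (0, 1 or 2) in which A is less than B.  */
+static int
+count_less_than (v2df a, v2df b)
+{
+  /* Each element of MASK is all ones where the comparison holds.  */
+  v2df mask = __builtin_ia32_cmpltpd (a, b);
+  /* Gather the sign bit of each element into the low two bits.  */
+  int bits = __builtin_ia32_movmskpd (mask);
+  return __builtin_popcount (bits);
+}
+</pre></div>
+
+<p>For example, comparing <code>{1.0, 5.0}</code> with <code>{2.0, 3.0}</code> gives 1,
+because only the first element is smaller.
+</p>
+<p>The prototypes of the <samp>-msse2</samp> built-in functions are as follows.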
+</p> +<div class="smallexample"> +<pre class="smallexample">int __builtin_ia32_comisdeq (v2df, v2df); +int __builtin_ia32_comisdlt (v2df, v2df); +int __builtin_ia32_comisdle (v2df, v2df); +int __builtin_ia32_comisdgt (v2df, v2df); +int __builtin_ia32_comisdge (v2df, v2df); +int __builtin_ia32_comisdneq (v2df, v2df); +int __builtin_ia32_ucomisdeq (v2df, v2df); +int __builtin_ia32_ucomisdlt (v2df, v2df); +int __builtin_ia32_ucomisdle (v2df, v2df); +int __builtin_ia32_ucomisdgt (v2df, v2df); +int __builtin_ia32_ucomisdge (v2df, v2df); +int __builtin_ia32_ucomisdneq (v2df, v2df); +v2df __builtin_ia32_cmpeqpd (v2df, v2df); +v2df __builtin_ia32_cmpltpd (v2df, v2df); +v2df __builtin_ia32_cmplepd (v2df, v2df); +v2df __builtin_ia32_cmpgtpd (v2df, v2df); +v2df __builtin_ia32_cmpgepd (v2df, v2df); +v2df __builtin_ia32_cmpunordpd (v2df, v2df); +v2df __builtin_ia32_cmpneqpd (v2df, v2df); +v2df __builtin_ia32_cmpnltpd (v2df, v2df); +v2df __builtin_ia32_cmpnlepd (v2df, v2df); +v2df __builtin_ia32_cmpngtpd (v2df, v2df); +v2df __builtin_ia32_cmpngepd (v2df, v2df); +v2df __builtin_ia32_cmpordpd (v2df, v2df); +v2df __builtin_ia32_cmpeqsd (v2df, v2df); +v2df __builtin_ia32_cmpltsd (v2df, v2df); +v2df __builtin_ia32_cmplesd (v2df, v2df); +v2df __builtin_ia32_cmpunordsd (v2df, v2df); +v2df __builtin_ia32_cmpneqsd (v2df, v2df); +v2df __builtin_ia32_cmpnltsd (v2df, v2df); +v2df __builtin_ia32_cmpnlesd (v2df, v2df); +v2df __builtin_ia32_cmpordsd (v2df, v2df); +v2di __builtin_ia32_paddq (v2di, v2di); +v2di __builtin_ia32_psubq (v2di, v2di); +v2df __builtin_ia32_addpd (v2df, v2df); +v2df __builtin_ia32_subpd (v2df, v2df); +v2df __builtin_ia32_mulpd (v2df, v2df); +v2df __builtin_ia32_divpd (v2df, v2df); +v2df __builtin_ia32_addsd (v2df, v2df); +v2df __builtin_ia32_subsd (v2df, v2df); +v2df __builtin_ia32_mulsd (v2df, v2df); +v2df __builtin_ia32_divsd (v2df, v2df); +v2df __builtin_ia32_minpd (v2df, v2df); +v2df __builtin_ia32_maxpd (v2df, v2df); +v2df __builtin_ia32_minsd (v2df, v2df); +v2df 
__builtin_ia32_maxsd (v2df, v2df); +v2df __builtin_ia32_andpd (v2df, v2df); +v2df __builtin_ia32_andnpd (v2df, v2df); +v2df __builtin_ia32_orpd (v2df, v2df); +v2df __builtin_ia32_xorpd (v2df, v2df); +v2df __builtin_ia32_movsd (v2df, v2df); +v2df __builtin_ia32_unpckhpd (v2df, v2df); +v2df __builtin_ia32_unpcklpd (v2df, v2df); +v16qi __builtin_ia32_paddb128 (v16qi, v16qi); +v8hi __builtin_ia32_paddw128 (v8hi, v8hi); +v4si __builtin_ia32_paddd128 (v4si, v4si); +v2di __builtin_ia32_paddq128 (v2di, v2di); +v16qi __builtin_ia32_psubb128 (v16qi, v16qi); +v8hi __builtin_ia32_psubw128 (v8hi, v8hi); +v4si __builtin_ia32_psubd128 (v4si, v4si); +v2di __builtin_ia32_psubq128 (v2di, v2di); +v8hi __builtin_ia32_pmullw128 (v8hi, v8hi); +v8hi __builtin_ia32_pmulhw128 (v8hi, v8hi); +v2di __builtin_ia32_pand128 (v2di, v2di); +v2di __builtin_ia32_pandn128 (v2di, v2di); +v2di __builtin_ia32_por128 (v2di, v2di); +v2di __builtin_ia32_pxor128 (v2di, v2di); +v16qi __builtin_ia32_pavgb128 (v16qi, v16qi); +v8hi __builtin_ia32_pavgw128 (v8hi, v8hi); +v16qi __builtin_ia32_pcmpeqb128 (v16qi, v16qi); +v8hi __builtin_ia32_pcmpeqw128 (v8hi, v8hi); +v4si __builtin_ia32_pcmpeqd128 (v4si, v4si); +v16qi __builtin_ia32_pcmpgtb128 (v16qi, v16qi); +v8hi __builtin_ia32_pcmpgtw128 (v8hi, v8hi); +v4si __builtin_ia32_pcmpgtd128 (v4si, v4si); +v16qi __builtin_ia32_pmaxub128 (v16qi, v16qi); +v8hi __builtin_ia32_pmaxsw128 (v8hi, v8hi); +v16qi __builtin_ia32_pminub128 (v16qi, v16qi); +v8hi __builtin_ia32_pminsw128 (v8hi, v8hi); +v16qi __builtin_ia32_punpckhbw128 (v16qi, v16qi); +v8hi __builtin_ia32_punpckhwd128 (v8hi, v8hi); +v4si __builtin_ia32_punpckhdq128 (v4si, v4si); +v2di __builtin_ia32_punpckhqdq128 (v2di, v2di); +v16qi __builtin_ia32_punpcklbw128 (v16qi, v16qi); +v8hi __builtin_ia32_punpcklwd128 (v8hi, v8hi); +v4si __builtin_ia32_punpckldq128 (v4si, v4si); +v2di __builtin_ia32_punpcklqdq128 (v2di, v2di); +v16qi __builtin_ia32_packsswb128 (v8hi, v8hi); +v8hi __builtin_ia32_packssdw128 (v4si, v4si); 
+v16qi __builtin_ia32_packuswb128 (v8hi, v8hi); +v8hi __builtin_ia32_pmulhuw128 (v8hi, v8hi); +void __builtin_ia32_maskmovdqu (v16qi, v16qi); +v2df __builtin_ia32_loadupd (double *); +void __builtin_ia32_storeupd (double *, v2df); +v2df __builtin_ia32_loadhpd (v2df, double const *); +v2df __builtin_ia32_loadlpd (v2df, double const *); +int __builtin_ia32_movmskpd (v2df); +int __builtin_ia32_pmovmskb128 (v16qi); +void __builtin_ia32_movnti (int *, int); +void __builtin_ia32_movnti64 (long long int *, long long int); +void __builtin_ia32_movntpd (double *, v2df); +void __builtin_ia32_movntdq (v2df *, v2df); +v4si __builtin_ia32_pshufd (v4si, int); +v8hi __builtin_ia32_pshuflw (v8hi, int); +v8hi __builtin_ia32_pshufhw (v8hi, int); +v2di __builtin_ia32_psadbw128 (v16qi, v16qi); +v2df __builtin_ia32_sqrtpd (v2df); +v2df __builtin_ia32_sqrtsd (v2df); +v2df __builtin_ia32_shufpd (v2df, v2df, int); +v2df __builtin_ia32_cvtdq2pd (v4si); +v4sf __builtin_ia32_cvtdq2ps (v4si); +v4si __builtin_ia32_cvtpd2dq (v2df); +v2si __builtin_ia32_cvtpd2pi (v2df); +v4sf __builtin_ia32_cvtpd2ps (v2df); +v4si __builtin_ia32_cvttpd2dq (v2df); +v2si __builtin_ia32_cvttpd2pi (v2df); +v2df __builtin_ia32_cvtpi2pd (v2si); +int __builtin_ia32_cvtsd2si (v2df); +int __builtin_ia32_cvttsd2si (v2df); +long long __builtin_ia32_cvtsd2si64 (v2df); +long long __builtin_ia32_cvttsd2si64 (v2df); +v4si __builtin_ia32_cvtps2dq (v4sf); +v2df __builtin_ia32_cvtps2pd (v4sf); +v4si __builtin_ia32_cvttps2dq (v4sf); +v2df __builtin_ia32_cvtsi2sd (v2df, int); +v2df __builtin_ia32_cvtsi642sd (v2df, long long); +v4sf __builtin_ia32_cvtsd2ss (v4sf, v2df); +v2df __builtin_ia32_cvtss2sd (v2df, v4sf); +void __builtin_ia32_clflush (const void *); +void __builtin_ia32_lfence (void); +void __builtin_ia32_mfence (void); +v16qi __builtin_ia32_loaddqu (const char *); +void __builtin_ia32_storedqu (char *, v16qi); +v1di __builtin_ia32_pmuludq (v2si, v2si); +v2di __builtin_ia32_pmuludq128 (v4si, v4si); +v8hi 
__builtin_ia32_psllw128 (v8hi, v8hi); +v4si __builtin_ia32_pslld128 (v4si, v4si); +v2di __builtin_ia32_psllq128 (v2di, v2di); +v8hi __builtin_ia32_psrlw128 (v8hi, v8hi); +v4si __builtin_ia32_psrld128 (v4si, v4si); +v2di __builtin_ia32_psrlq128 (v2di, v2di); +v8hi __builtin_ia32_psraw128 (v8hi, v8hi); +v4si __builtin_ia32_psrad128 (v4si, v4si); +v2di __builtin_ia32_pslldqi128 (v2di, int); +v8hi __builtin_ia32_psllwi128 (v8hi, int); +v4si __builtin_ia32_pslldi128 (v4si, int); +v2di __builtin_ia32_psllqi128 (v2di, int); +v2di __builtin_ia32_psrldqi128 (v2di, int); +v8hi __builtin_ia32_psrlwi128 (v8hi, int); +v4si __builtin_ia32_psrldi128 (v4si, int); +v2di __builtin_ia32_psrlqi128 (v2di, int); +v8hi __builtin_ia32_psrawi128 (v8hi, int); +v4si __builtin_ia32_psradi128 (v4si, int); +v4si __builtin_ia32_pmaddwd128 (v8hi, v8hi); +v2di __builtin_ia32_movq128 (v2di); +</pre></div> + +<p>The following built-in functions are available when <samp>-msse3</samp> is used. +All of them generate the machine instruction that is part of the name. +</p> +<div class="smallexample"> +<pre class="smallexample">v2df __builtin_ia32_addsubpd (v2df, v2df); +v4sf __builtin_ia32_addsubps (v4sf, v4sf); +v2df __builtin_ia32_haddpd (v2df, v2df); +v4sf __builtin_ia32_haddps (v4sf, v4sf); +v2df __builtin_ia32_hsubpd (v2df, v2df); +v4sf __builtin_ia32_hsubps (v4sf, v4sf); +v16qi __builtin_ia32_lddqu (char const *); +void __builtin_ia32_monitor (void *, unsigned int, unsigned int); +v4sf __builtin_ia32_movshdup (v4sf); +v4sf __builtin_ia32_movsldup (v4sf); +void __builtin_ia32_mwait (unsigned int, unsigned int); +</pre></div> + +<p>The following built-in functions are available when <samp>-mssse3</samp> is used. +All of them generate the machine instruction that is part of the name. 
+</p> +<div class="smallexample"> +<pre class="smallexample">v2si __builtin_ia32_phaddd (v2si, v2si); +v4hi __builtin_ia32_phaddw (v4hi, v4hi); +v4hi __builtin_ia32_phaddsw (v4hi, v4hi); +v2si __builtin_ia32_phsubd (v2si, v2si); +v4hi __builtin_ia32_phsubw (v4hi, v4hi); +v4hi __builtin_ia32_phsubsw (v4hi, v4hi); +v4hi __builtin_ia32_pmaddubsw (v8qi, v8qi); +v4hi __builtin_ia32_pmulhrsw (v4hi, v4hi); +v8qi __builtin_ia32_pshufb (v8qi, v8qi); +v8qi __builtin_ia32_psignb (v8qi, v8qi); +v2si __builtin_ia32_psignd (v2si, v2si); +v4hi __builtin_ia32_psignw (v4hi, v4hi); +v1di __builtin_ia32_palignr (v1di, v1di, int); +v8qi __builtin_ia32_pabsb (v8qi); +v2si __builtin_ia32_pabsd (v2si); +v4hi __builtin_ia32_pabsw (v4hi); +</pre></div> + +<p>The following 128-bit built-in functions are also available when <samp>-mssse3</samp> is used. +All of them generate the machine instruction that is part of the name. +</p> +<div class="smallexample"> +<pre class="smallexample">v4si __builtin_ia32_phaddd128 (v4si, v4si); +v8hi __builtin_ia32_phaddw128 (v8hi, v8hi); +v8hi __builtin_ia32_phaddsw128 (v8hi, v8hi); +v4si __builtin_ia32_phsubd128 (v4si, v4si); +v8hi __builtin_ia32_phsubw128 (v8hi, v8hi); +v8hi __builtin_ia32_phsubsw128 (v8hi, v8hi); +v8hi __builtin_ia32_pmaddubsw128 (v16qi, v16qi); +v8hi __builtin_ia32_pmulhrsw128 (v8hi, v8hi); +v16qi __builtin_ia32_pshufb128 (v16qi, v16qi); +v16qi __builtin_ia32_psignb128 (v16qi, v16qi); +v4si __builtin_ia32_psignd128 (v4si, v4si); +v8hi __builtin_ia32_psignw128 (v8hi, v8hi); +v2di __builtin_ia32_palignr128 (v2di, v2di, int); +v16qi __builtin_ia32_pabsb128 (v16qi); +v4si __builtin_ia32_pabsd128 (v4si); +v8hi __builtin_ia32_pabsw128 (v8hi); +</pre></div> + +<p>The following built-in functions are available when <samp>-msse4.1</samp> is +used. All of them generate the machine instruction that is part of the +name. 
+</p> +<div class="smallexample"> +<pre class="smallexample">v2df __builtin_ia32_blendpd (v2df, v2df, const int); +v4sf __builtin_ia32_blendps (v4sf, v4sf, const int); +v2df __builtin_ia32_blendvpd (v2df, v2df, v2df); +v4sf __builtin_ia32_blendvps (v4sf, v4sf, v4sf); +v2df __builtin_ia32_dppd (v2df, v2df, const int); +v4sf __builtin_ia32_dpps (v4sf, v4sf, const int); +v4sf __builtin_ia32_insertps128 (v4sf, v4sf, const int); +v2di __builtin_ia32_movntdqa (v2di *); +v16qi __builtin_ia32_mpsadbw128 (v16qi, v16qi, const int); +v8hi __builtin_ia32_packusdw128 (v4si, v4si); +v16qi __builtin_ia32_pblendvb128 (v16qi, v16qi, v16qi); +v8hi __builtin_ia32_pblendw128 (v8hi, v8hi, const int); +v2di __builtin_ia32_pcmpeqq (v2di, v2di); +v8hi __builtin_ia32_phminposuw128 (v8hi); +v16qi __builtin_ia32_pmaxsb128 (v16qi, v16qi); +v4si __builtin_ia32_pmaxsd128 (v4si, v4si); +v4si __builtin_ia32_pmaxud128 (v4si, v4si); +v8hi __builtin_ia32_pmaxuw128 (v8hi, v8hi); +v16qi __builtin_ia32_pminsb128 (v16qi, v16qi); +v4si __builtin_ia32_pminsd128 (v4si, v4si); +v4si __builtin_ia32_pminud128 (v4si, v4si); +v8hi __builtin_ia32_pminuw128 (v8hi, v8hi); +v4si __builtin_ia32_pmovsxbd128 (v16qi); +v2di __builtin_ia32_pmovsxbq128 (v16qi); +v8hi __builtin_ia32_pmovsxbw128 (v16qi); +v2di __builtin_ia32_pmovsxdq128 (v4si); +v4si __builtin_ia32_pmovsxwd128 (v8hi); +v2di __builtin_ia32_pmovsxwq128 (v8hi); +v4si __builtin_ia32_pmovzxbd128 (v16qi); +v2di __builtin_ia32_pmovzxbq128 (v16qi); +v8hi __builtin_ia32_pmovzxbw128 (v16qi); +v2di __builtin_ia32_pmovzxdq128 (v4si); +v4si __builtin_ia32_pmovzxwd128 (v8hi); +v2di __builtin_ia32_pmovzxwq128 (v8hi); +v2di __builtin_ia32_pmuldq128 (v4si, v4si); +v4si __builtin_ia32_pmulld128 (v4si, v4si); +int __builtin_ia32_ptestc128 (v2di, v2di); +int __builtin_ia32_ptestnzc128 (v2di, v2di); +int __builtin_ia32_ptestz128 (v2di, v2di); +v2df __builtin_ia32_roundpd (v2df, const int); +v4sf __builtin_ia32_roundps (v4sf, const int); +v2df __builtin_ia32_roundsd (v2df, 
v2df, const int); +v4sf __builtin_ia32_roundss (v4sf, v4sf, const int); +</pre></div> + +<p>The following built-in functions are available when <samp>-msse4.1</samp> is +used. +</p> +<dl> +<dt><a name="index-_005f_005fbuiltin_005fia32_005fvec_005fset_005fv4sf"></a>Built-in Function: <em>v4sf</em> <strong>__builtin_ia32_vec_set_v4sf</strong> <em>(v4sf, float, const int)</em></dt> +<dd><p>Generates the <code>insertps</code> machine instruction. +</p></dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fia32_005fvec_005fext_005fv16qi"></a>Built-in Function: <em>int</em> <strong>__builtin_ia32_vec_ext_v16qi</strong> <em>(v16qi, const int)</em></dt> +<dd><p>Generates the <code>pextrb</code> machine instruction. +</p></dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fia32_005fvec_005fset_005fv16qi"></a>Built-in Function: <em>v16qi</em> <strong>__builtin_ia32_vec_set_v16qi</strong> <em>(v16qi, int, const int)</em></dt> +<dd><p>Generates the <code>pinsrb</code> machine instruction. +</p></dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fia32_005fvec_005fset_005fv4si"></a>Built-in Function: <em>v4si</em> <strong>__builtin_ia32_vec_set_v4si</strong> <em>(v4si, int, const int)</em></dt> +<dd><p>Generates the <code>pinsrd</code> machine instruction. +</p></dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fia32_005fvec_005fset_005fv2di"></a>Built-in Function: <em>v2di</em> <strong>__builtin_ia32_vec_set_v2di</strong> <em>(v2di, long long, const int)</em></dt> +<dd><p>Generates the <code>pinsrq</code> machine instruction in 64-bit mode. +</p></dd></dl> + +<p>The following built-in functions are changed to generate new SSE4.1 +instructions when <samp>-msse4.1</samp> is used. +</p> +<dl> +<dt><a name="index-_005f_005fbuiltin_005fia32_005fvec_005fext_005fv4sf"></a>Built-in Function: <em>float</em> <strong>__builtin_ia32_vec_ext_v4sf</strong> <em>(v4sf, const int)</em></dt> +<dd><p>Generates the <code>extractps</code> machine instruction. 
+</p></dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fia32_005fvec_005fext_005fv4si"></a>Built-in Function: <em>int</em> <strong>__builtin_ia32_vec_ext_v4si</strong> <em>(v4si, const int)</em></dt> +<dd><p>Generates the <code>pextrd</code> machine instruction. +</p></dd></dl> + +<dl> +<dt><a name="index-long-2"></a>Built-in Function: <em>long long</em> <strong>__builtin_ia32_vec_ext_v2di</strong> <em>(v2di, const int)</em></dt> +<dd><p>Generates the <code>pextrq</code> machine instruction in 64-bit mode. +</p></dd></dl> + +<p>The following built-in functions are available when <samp>-msse4.2</samp> is +used. All of them generate the machine instruction that is part of the +name. +</p> +<div class="smallexample"> +<pre class="smallexample">v16qi __builtin_ia32_pcmpestrm128 (v16qi, int, v16qi, int, const int); +int __builtin_ia32_pcmpestri128 (v16qi, int, v16qi, int, const int); +int __builtin_ia32_pcmpestria128 (v16qi, int, v16qi, int, const int); +int __builtin_ia32_pcmpestric128 (v16qi, int, v16qi, int, const int); +int __builtin_ia32_pcmpestrio128 (v16qi, int, v16qi, int, const int); +int __builtin_ia32_pcmpestris128 (v16qi, int, v16qi, int, const int); +int __builtin_ia32_pcmpestriz128 (v16qi, int, v16qi, int, const int); +v16qi __builtin_ia32_pcmpistrm128 (v16qi, v16qi, const int); +int __builtin_ia32_pcmpistri128 (v16qi, v16qi, const int); +int __builtin_ia32_pcmpistria128 (v16qi, v16qi, const int); +int __builtin_ia32_pcmpistric128 (v16qi, v16qi, const int); +int __builtin_ia32_pcmpistrio128 (v16qi, v16qi, const int); +int __builtin_ia32_pcmpistris128 (v16qi, v16qi, const int); +int __builtin_ia32_pcmpistriz128 (v16qi, v16qi, const int); +v2di __builtin_ia32_pcmpgtq (v2di, v2di); +</pre></div> + +<p>The following built-in functions are available when <samp>-msse4.2</samp> is +used. 
+</p> +<dl> +<dt><a name="index-int"></a>Built-in Function: <em>unsigned int</em> <strong>__builtin_ia32_crc32qi</strong> <em>(unsigned int, unsigned char)</em></dt> +<dd><p>Generates the <code>crc32b</code> machine instruction. +</p></dd></dl> + +<dl> +<dt><a name="index-int-1"></a>Built-in Function: <em>unsigned int</em> <strong>__builtin_ia32_crc32hi</strong> <em>(unsigned int, unsigned short)</em></dt> +<dd><p>Generates the <code>crc32w</code> machine instruction. +</p></dd></dl> + +<dl> +<dt><a name="index-int-2"></a>Built-in Function: <em>unsigned int</em> <strong>__builtin_ia32_crc32si</strong> <em>(unsigned int, unsigned int)</em></dt> +<dd><p>Generates the <code>crc32l</code> machine instruction. +</p></dd></dl> + +<dl> +<dt><a name="index-long-3"></a>Built-in Function: <em>unsigned long long</em> <strong>__builtin_ia32_crc32di</strong> <em>(unsigned long long, unsigned long long)</em></dt> +<dd><p>Generates the <code>crc32q</code> machine instruction. +</p></dd></dl> + +<p>The following built-in functions are changed to generate new SSE4.2 +instructions when <samp>-msse4.2</samp> is used. +</p> +<dl> +<dt><a name="index-_005f_005fbuiltin_005fpopcount-1"></a>Built-in Function: <em>int</em> <strong>__builtin_popcount</strong> <em>(unsigned int)</em></dt> +<dd><p>Generates the <code>popcntl</code> machine instruction. +</p></dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fpopcountl-1"></a>Built-in Function: <em>int</em> <strong>__builtin_popcountl</strong> <em>(unsigned long)</em></dt> +<dd><p>Generates the <code>popcntl</code> or <code>popcntq</code> machine instruction, +depending on the size of <code>unsigned long</code>. +</p></dd></dl> + +<dl> +<dt><a name="index-_005f_005fbuiltin_005fpopcountll-1"></a>Built-in Function: <em>int</em> <strong>__builtin_popcountll</strong> <em>(unsigned long long)</em></dt> +<dd><p>Generates the <code>popcntq</code> machine instruction. 
+</p></dd></dl> + +<p>The following built-in functions are available when <samp>-mavx</samp> is +used. All of them generate the machine instruction that is part of the +name. +</p> +<div class="smallexample"> +<pre class="smallexample">v4df __builtin_ia32_addpd256 (v4df,v4df); +v8sf __builtin_ia32_addps256 (v8sf,v8sf); +v4df __builtin_ia32_addsubpd256 (v4df,v4df); +v8sf __builtin_ia32_addsubps256 (v8sf,v8sf); +v4df __builtin_ia32_andnpd256 (v4df,v4df); +v8sf __builtin_ia32_andnps256 (v8sf,v8sf); +v4df __builtin_ia32_andpd256 (v4df,v4df); +v8sf __builtin_ia32_andps256 (v8sf,v8sf); +v4df __builtin_ia32_blendpd256 (v4df,v4df,int); +v8sf __builtin_ia32_blendps256 (v8sf,v8sf,int); +v4df __builtin_ia32_blendvpd256 (v4df,v4df,v4df); +v8sf __builtin_ia32_blendvps256 (v8sf,v8sf,v8sf); +v2df __builtin_ia32_cmppd (v2df,v2df,int); +v4df __builtin_ia32_cmppd256 (v4df,v4df,int); +v4sf __builtin_ia32_cmpps (v4sf,v4sf,int); +v8sf __builtin_ia32_cmpps256 (v8sf,v8sf,int); +v2df __builtin_ia32_cmpsd (v2df,v2df,int); +v4sf __builtin_ia32_cmpss (v4sf,v4sf,int); +v4df __builtin_ia32_cvtdq2pd256 (v4si); +v8sf __builtin_ia32_cvtdq2ps256 (v8si); +v4si __builtin_ia32_cvtpd2dq256 (v4df); +v4sf __builtin_ia32_cvtpd2ps256 (v4df); +v8si __builtin_ia32_cvtps2dq256 (v8sf); +v4df __builtin_ia32_cvtps2pd256 (v4sf); +v4si __builtin_ia32_cvttpd2dq256 (v4df); +v8si __builtin_ia32_cvttps2dq256 (v8sf); +v4df __builtin_ia32_divpd256 (v4df,v4df); +v8sf __builtin_ia32_divps256 (v8sf,v8sf); +v8sf __builtin_ia32_dpps256 (v8sf,v8sf,int); +v4df __builtin_ia32_haddpd256 (v4df,v4df); +v8sf __builtin_ia32_haddps256 (v8sf,v8sf); +v4df __builtin_ia32_hsubpd256 (v4df,v4df); +v8sf __builtin_ia32_hsubps256 (v8sf,v8sf); +v32qi __builtin_ia32_lddqu256 (pcchar); +v32qi __builtin_ia32_loaddqu256 (pcchar); +v4df __builtin_ia32_loadupd256 (pcdouble); +v8sf __builtin_ia32_loadups256 (pcfloat); +v2df __builtin_ia32_maskloadpd (pcv2df,v2df); +v4df __builtin_ia32_maskloadpd256 (pcv4df,v4df); +v4sf __builtin_ia32_maskloadps 
(pcv4sf,v4sf); +v8sf __builtin_ia32_maskloadps256 (pcv8sf,v8sf); +void __builtin_ia32_maskstorepd (pv2df,v2df,v2df); +void __builtin_ia32_maskstorepd256 (pv4df,v4df,v4df); +void __builtin_ia32_maskstoreps (pv4sf,v4sf,v4sf); +void __builtin_ia32_maskstoreps256 (pv8sf,v8sf,v8sf); +v4df __builtin_ia32_maxpd256 (v4df,v4df); +v8sf __builtin_ia32_maxps256 (v8sf,v8sf); +v4df __builtin_ia32_minpd256 (v4df,v4df); +v8sf __builtin_ia32_minps256 (v8sf,v8sf); +v4df __builtin_ia32_movddup256 (v4df); +int __builtin_ia32_movmskpd256 (v4df); +int __builtin_ia32_movmskps256 (v8sf); +v8sf __builtin_ia32_movshdup256 (v8sf); +v8sf __builtin_ia32_movsldup256 (v8sf); +v4df __builtin_ia32_mulpd256 (v4df,v4df); +v8sf __builtin_ia32_mulps256 (v8sf,v8sf); +v4df __builtin_ia32_orpd256 (v4df,v4df); +v8sf __builtin_ia32_orps256 (v8sf,v8sf); +v2df __builtin_ia32_pd_pd256 (v4df); +v4df __builtin_ia32_pd256_pd (v2df); +v4sf __builtin_ia32_ps_ps256 (v8sf); +v8sf __builtin_ia32_ps256_ps (v4sf); +int __builtin_ia32_ptestc256 (v4di,v4di,ptest); +int __builtin_ia32_ptestnzc256 (v4di,v4di,ptest); +int __builtin_ia32_ptestz256 (v4di,v4di,ptest); +v8sf __builtin_ia32_rcpps256 (v8sf); +v4df __builtin_ia32_roundpd256 (v4df,int); +v8sf __builtin_ia32_roundps256 (v8sf,int); +v8sf __builtin_ia32_rsqrtps_nr256 (v8sf); +v8sf __builtin_ia32_rsqrtps256 (v8sf); +v4df __builtin_ia32_shufpd256 (v4df,v4df,int); +v8sf __builtin_ia32_shufps256 (v8sf,v8sf,int); +v4si __builtin_ia32_si_si256 (v8si); +v8si __builtin_ia32_si256_si (v4si); +v4df __builtin_ia32_sqrtpd256 (v4df); +v8sf __builtin_ia32_sqrtps_nr256 (v8sf); +v8sf __builtin_ia32_sqrtps256 (v8sf); +void __builtin_ia32_storedqu256 (pchar,v32qi); +void __builtin_ia32_storeupd256 (pdouble,v4df); +void __builtin_ia32_storeups256 (pfloat,v8sf); +v4df __builtin_ia32_subpd256 (v4df,v4df); +v8sf __builtin_ia32_subps256 (v8sf,v8sf); +v4df __builtin_ia32_unpckhpd256 (v4df,v4df); +v8sf __builtin_ia32_unpckhps256 (v8sf,v8sf); +v4df __builtin_ia32_unpcklpd256 (v4df,v4df); +v8sf 
__builtin_ia32_unpcklps256 (v8sf,v8sf); +v4df __builtin_ia32_vbroadcastf128_pd256 (pcv2df); +v8sf __builtin_ia32_vbroadcastf128_ps256 (pcv4sf); +v4df __builtin_ia32_vbroadcastsd256 (pcdouble); +v4sf __builtin_ia32_vbroadcastss (pcfloat); +v8sf __builtin_ia32_vbroadcastss256 (pcfloat); +v2df __builtin_ia32_vextractf128_pd256 (v4df,int); +v4sf __builtin_ia32_vextractf128_ps256 (v8sf,int); +v4si __builtin_ia32_vextractf128_si256 (v8si,int); +v4df __builtin_ia32_vinsertf128_pd256 (v4df,v2df,int); +v8sf __builtin_ia32_vinsertf128_ps256 (v8sf,v4sf,int); +v8si __builtin_ia32_vinsertf128_si256 (v8si,v4si,int); +v4df __builtin_ia32_vperm2f128_pd256 (v4df,v4df,int); +v8sf __builtin_ia32_vperm2f128_ps256 (v8sf,v8sf,int); +v8si __builtin_ia32_vperm2f128_si256 (v8si,v8si,int); +v2df __builtin_ia32_vpermil2pd (v2df,v2df,v2di,int); +v4df __builtin_ia32_vpermil2pd256 (v4df,v4df,v4di,int); +v4sf __builtin_ia32_vpermil2ps (v4sf,v4sf,v4si,int); +v8sf __builtin_ia32_vpermil2ps256 (v8sf,v8sf,v8si,int); +v2df __builtin_ia32_vpermilpd (v2df,int); +v4df __builtin_ia32_vpermilpd256 (v4df,int); +v4sf __builtin_ia32_vpermilps (v4sf,int); +v8sf __builtin_ia32_vpermilps256 (v8sf,int); +v2df __builtin_ia32_vpermilvarpd (v2df,v2di); +v4df __builtin_ia32_vpermilvarpd256 (v4df,v4di); +v4sf __builtin_ia32_vpermilvarps (v4sf,v4si); +v8sf __builtin_ia32_vpermilvarps256 (v8sf,v8si); +int __builtin_ia32_vtestcpd (v2df,v2df,ptest); +int __builtin_ia32_vtestcpd256 (v4df,v4df,ptest); +int __builtin_ia32_vtestcps (v4sf,v4sf,ptest); +int __builtin_ia32_vtestcps256 (v8sf,v8sf,ptest); +int __builtin_ia32_vtestnzcpd (v2df,v2df,ptest); +int __builtin_ia32_vtestnzcpd256 (v4df,v4df,ptest); +int __builtin_ia32_vtestnzcps (v4sf,v4sf,ptest); +int __builtin_ia32_vtestnzcps256 (v8sf,v8sf,ptest); +int __builtin_ia32_vtestzpd (v2df,v2df,ptest); +int __builtin_ia32_vtestzpd256 (v4df,v4df,ptest); +int __builtin_ia32_vtestzps (v4sf,v4sf,ptest); +int __builtin_ia32_vtestzps256 (v8sf,v8sf,ptest); +void 
__builtin_ia32_vzeroall (void); +void __builtin_ia32_vzeroupper (void); +v4df __builtin_ia32_xorpd256 (v4df,v4df); +v8sf __builtin_ia32_xorps256 (v8sf,v8sf); +</pre></div> + +<p>The following built-in functions are available when <samp>-mavx2</samp> is +used. All of them generate the machine instruction that is part of the +name. +</p> +<div class="smallexample"> +<pre class="smallexample">v32qi __builtin_ia32_mpsadbw256 (v32qi,v32qi,int); +v32qi __builtin_ia32_pabsb256 (v32qi); +v16hi __builtin_ia32_pabsw256 (v16hi); +v8si __builtin_ia32_pabsd256 (v8si); +v16hi __builtin_ia32_packssdw256 (v8si,v8si); +v32qi __builtin_ia32_packsswb256 (v16hi,v16hi); +v16hi __builtin_ia32_packusdw256 (v8si,v8si); +v32qi __builtin_ia32_packuswb256 (v16hi,v16hi); +v32qi __builtin_ia32_paddb256 (v32qi,v32qi); +v16hi __builtin_ia32_paddw256 (v16hi,v16hi); +v8si __builtin_ia32_paddd256 (v8si,v8si); +v4di __builtin_ia32_paddq256 (v4di,v4di); +v32qi __builtin_ia32_paddsb256 (v32qi,v32qi); +v16hi __builtin_ia32_paddsw256 (v16hi,v16hi); +v32qi __builtin_ia32_paddusb256 (v32qi,v32qi); +v16hi __builtin_ia32_paddusw256 (v16hi,v16hi); +v4di __builtin_ia32_palignr256 (v4di,v4di,int); +v4di __builtin_ia32_andsi256 (v4di,v4di); +v4di __builtin_ia32_andnotsi256 (v4di,v4di); +v32qi __builtin_ia32_pavgb256 (v32qi,v32qi); +v16hi __builtin_ia32_pavgw256 (v16hi,v16hi); +v32qi __builtin_ia32_pblendvb256 (v32qi,v32qi,v32qi); +v16hi __builtin_ia32_pblendw256 (v16hi,v16hi,int); +v32qi __builtin_ia32_pcmpeqb256 (v32qi,v32qi); +v16hi __builtin_ia32_pcmpeqw256 (v16hi,v16hi); +v8si __builtin_ia32_pcmpeqd256 (v8si,v8si); +v4di __builtin_ia32_pcmpeqq256 (v4di,v4di); +v32qi __builtin_ia32_pcmpgtb256 (v32qi,v32qi); +v16hi __builtin_ia32_pcmpgtw256 (v16hi,v16hi); +v8si __builtin_ia32_pcmpgtd256 (v8si,v8si); +v4di __builtin_ia32_pcmpgtq256 (v4di,v4di); +v16hi __builtin_ia32_phaddw256 (v16hi,v16hi); +v8si __builtin_ia32_phaddd256 (v8si,v8si); +v16hi __builtin_ia32_phaddsw256 (v16hi,v16hi); +v16hi 
__builtin_ia32_phsubw256 (v16hi,v16hi); +v8si __builtin_ia32_phsubd256 (v8si,v8si); +v16hi __builtin_ia32_phsubsw256 (v16hi,v16hi); +v32qi __builtin_ia32_pmaddubsw256 (v32qi,v32qi); +v16hi __builtin_ia32_pmaddwd256 (v16hi,v16hi); +v32qi __builtin_ia32_pmaxsb256 (v32qi,v32qi); +v16hi __builtin_ia32_pmaxsw256 (v16hi,v16hi); +v8si __builtin_ia32_pmaxsd256 (v8si,v8si); +v32qi __builtin_ia32_pmaxub256 (v32qi,v32qi); +v16hi __builtin_ia32_pmaxuw256 (v16hi,v16hi); +v8si __builtin_ia32_pmaxud256 (v8si,v8si); +v32qi __builtin_ia32_pminsb256 (v32qi,v32qi); +v16hi __builtin_ia32_pminsw256 (v16hi,v16hi); +v8si __builtin_ia32_pminsd256 (v8si,v8si); +v32qi __builtin_ia32_pminub256 (v32qi,v32qi); +v16hi __builtin_ia32_pminuw256 (v16hi,v16hi); +v8si __builtin_ia32_pminud256 (v8si,v8si); +int __builtin_ia32_pmovmskb256 (v32qi); +v16hi __builtin_ia32_pmovsxbw256 (v16qi); +v8si __builtin_ia32_pmovsxbd256 (v16qi); +v4di __builtin_ia32_pmovsxbq256 (v16qi); +v8si __builtin_ia32_pmovsxwd256 (v8hi); +v4di __builtin_ia32_pmovsxwq256 (v8hi); +v4di __builtin_ia32_pmovsxdq256 (v4si); +v16hi __builtin_ia32_pmovzxbw256 (v16qi); +v8si __builtin_ia32_pmovzxbd256 (v16qi); +v4di __builtin_ia32_pmovzxbq256 (v16qi); +v8si __builtin_ia32_pmovzxwd256 (v8hi); +v4di __builtin_ia32_pmovzxwq256 (v8hi); +v4di __builtin_ia32_pmovzxdq256 (v4si); +v4di __builtin_ia32_pmuldq256 (v8si,v8si); +v16hi __builtin_ia32_pmulhrsw256 (v16hi, v16hi); +v16hi __builtin_ia32_pmulhuw256 (v16hi,v16hi); +v16hi __builtin_ia32_pmulhw256 (v16hi,v16hi); +v16hi __builtin_ia32_pmullw256 (v16hi,v16hi); +v8si __builtin_ia32_pmulld256 (v8si,v8si); +v4di __builtin_ia32_pmuludq256 (v8si,v8si); +v4di __builtin_ia32_por256 (v4di,v4di); +v16hi __builtin_ia32_psadbw256 (v32qi,v32qi); +v32qi __builtin_ia32_pshufb256 (v32qi,v32qi); +v8si __builtin_ia32_pshufd256 (v8si,int); +v16hi __builtin_ia32_pshufhw256 (v16hi,int); +v16hi __builtin_ia32_pshuflw256 (v16hi,int); +v32qi __builtin_ia32_psignb256 (v32qi,v32qi); +v16hi __builtin_ia32_psignw256 
(v16hi,v16hi); +v8si __builtin_ia32_psignd256 (v8si,v8si); +v4di __builtin_ia32_pslldqi256 (v4di,int); +v16hi __builtin_ia32_psllwi256 (v16hi,int); +v16hi __builtin_ia32_psllw256 (v16hi,v8hi); +v8si __builtin_ia32_pslldi256 (v8si,int); +v8si __builtin_ia32_pslld256 (v8si,v4si); +v4di __builtin_ia32_psllqi256 (v4di,int); +v4di __builtin_ia32_psllq256 (v4di,v2di); +v16hi __builtin_ia32_psrawi256 (v16hi,int); +v16hi __builtin_ia32_psraw256 (v16hi,v8hi); +v8si __builtin_ia32_psradi256 (v8si,int); +v8si __builtin_ia32_psrad256 (v8si,v4si); +v4di __builtin_ia32_psrldqi256 (v4di, int); +v16hi __builtin_ia32_psrlwi256 (v16hi,int); +v16hi __builtin_ia32_psrlw256 (v16hi,v8hi); +v8si __builtin_ia32_psrldi256 (v8si,int); +v8si __builtin_ia32_psrld256 (v8si,v4si); +v4di __builtin_ia32_psrlqi256 (v4di,int); +v4di __builtin_ia32_psrlq256 (v4di,v2di); +v32qi __builtin_ia32_psubb256 (v32qi,v32qi); +v16hi __builtin_ia32_psubw256 (v16hi,v16hi); +v8si __builtin_ia32_psubd256 (v8si,v8si); +v4di __builtin_ia32_psubq256 (v4di,v4di); +v32qi __builtin_ia32_psubsb256 (v32qi,v32qi); +v16hi __builtin_ia32_psubsw256 (v16hi,v16hi); +v32qi __builtin_ia32_psubusb256 (v32qi,v32qi); +v16hi __builtin_ia32_psubusw256 (v16hi,v16hi); +v32qi __builtin_ia32_punpckhbw256 (v32qi,v32qi); +v16hi __builtin_ia32_punpckhwd256 (v16hi,v16hi); +v8si __builtin_ia32_punpckhdq256 (v8si,v8si); +v4di __builtin_ia32_punpckhqdq256 (v4di,v4di); +v32qi __builtin_ia32_punpcklbw256 (v32qi,v32qi); +v16hi __builtin_ia32_punpcklwd256 (v16hi,v16hi); +v8si __builtin_ia32_punpckldq256 (v8si,v8si); +v4di __builtin_ia32_punpcklqdq256 (v4di,v4di); +v4di __builtin_ia32_pxor256 (v4di,v4di); +v4di __builtin_ia32_movntdqa256 (pv4di); +v4sf __builtin_ia32_vbroadcastss_ps (v4sf); +v8sf __builtin_ia32_vbroadcastss_ps256 (v4sf); +v4df __builtin_ia32_vbroadcastsd_pd256 (v2df); +v4di __builtin_ia32_vbroadcastsi256 (v2di); +v4si __builtin_ia32_pblendd128 (v4si,v4si,int); +v8si __builtin_ia32_pblendd256 (v8si,v8si,int); +v32qi __builtin_ia32_pbroadcastb256 
(v16qi); +v16hi __builtin_ia32_pbroadcastw256 (v8hi); +v8si __builtin_ia32_pbroadcastd256 (v4si); +v4di __builtin_ia32_pbroadcastq256 (v2di); +v16qi __builtin_ia32_pbroadcastb128 (v16qi); +v8hi __builtin_ia32_pbroadcastw128 (v8hi); +v4si __builtin_ia32_pbroadcastd128 (v4si); +v2di __builtin_ia32_pbroadcastq128 (v2di); +v8si __builtin_ia32_permvarsi256 (v8si,v8si); +v4df __builtin_ia32_permdf256 (v4df,int); +v8sf __builtin_ia32_permvarsf256 (v8sf,v8sf); +v4di __builtin_ia32_permdi256 (v4di,int); +v4di __builtin_ia32_permti256 (v4di,v4di,int); +v4di __builtin_ia32_extract128i256 (v4di,int); +v4di __builtin_ia32_insert128i256 (v4di,v2di,int); +v8si __builtin_ia32_maskloadd256 (pcv8si,v8si); +v4di __builtin_ia32_maskloadq256 (pcv4di,v4di); +v4si __builtin_ia32_maskloadd (pcv4si,v4si); +v2di __builtin_ia32_maskloadq (pcv2di,v2di); +void __builtin_ia32_maskstored256 (pv8si,v8si,v8si); +void __builtin_ia32_maskstoreq256 (pv4di,v4di,v4di); +void __builtin_ia32_maskstored (pv4si,v4si,v4si); +void __builtin_ia32_maskstoreq (pv2di,v2di,v2di); +v8si __builtin_ia32_psllv8si (v8si,v8si); +v4si __builtin_ia32_psllv4si (v4si,v4si); +v4di __builtin_ia32_psllv4di (v4di,v4di); +v2di __builtin_ia32_psllv2di (v2di,v2di); +v8si __builtin_ia32_psrav8si (v8si,v8si); +v4si __builtin_ia32_psrav4si (v4si,v4si); +v8si __builtin_ia32_psrlv8si (v8si,v8si); +v4si __builtin_ia32_psrlv4si (v4si,v4si); +v4di __builtin_ia32_psrlv4di (v4di,v4di); +v2di __builtin_ia32_psrlv2di (v2di,v2di); +v2df __builtin_ia32_gathersiv2df (v2df, pcdouble,v4si,v2df,int); +v4df __builtin_ia32_gathersiv4df (v4df, pcdouble,v4si,v4df,int); +v2df __builtin_ia32_gatherdiv2df (v2df, pcdouble,v2di,v2df,int); +v4df __builtin_ia32_gatherdiv4df (v4df, pcdouble,v4di,v4df,int); +v4sf __builtin_ia32_gathersiv4sf (v4sf, pcfloat,v4si,v4sf,int); +v8sf __builtin_ia32_gathersiv8sf (v8sf, pcfloat,v8si,v8sf,int); +v4sf __builtin_ia32_gatherdiv4sf (v4sf, pcfloat,v2di,v4sf,int); +v4sf __builtin_ia32_gatherdiv4sf256 (v4sf, 
pcfloat,v4di,v4sf,int); +v2di __builtin_ia32_gathersiv2di (v2di, pcint64,v4si,v2di,int); +v4di __builtin_ia32_gathersiv4di (v4di, pcint64,v4si,v4di,int); +v2di __builtin_ia32_gatherdiv2di (v2di, pcint64,v2di,v2di,int); +v4di __builtin_ia32_gatherdiv4di (v4di, pcint64,v4di,v4di,int); +v4si __builtin_ia32_gathersiv4si (v4si, pcint,v4si,v4si,int); +v8si __builtin_ia32_gathersiv8si (v8si, pcint,v8si,v8si,int); +v4si __builtin_ia32_gatherdiv4si (v4si, pcint,v2di,v4si,int); +v4si __builtin_ia32_gatherdiv4si256 (v4si, pcint,v4di,v4si,int); +</pre></div> + +<p>The following built-in functions are available when <samp>-maes</samp> is +used. All of them generate the machine instruction that is part of the +name. +</p> +<div class="smallexample"> +<pre class="smallexample">v2di __builtin_ia32_aesenc128 (v2di, v2di); +v2di __builtin_ia32_aesenclast128 (v2di, v2di); +v2di __builtin_ia32_aesdec128 (v2di, v2di); +v2di __builtin_ia32_aesdeclast128 (v2di, v2di); +v2di __builtin_ia32_aeskeygenassist128 (v2di, const int); +v2di __builtin_ia32_aesimc128 (v2di); +</pre></div> + +<p>The following built-in function is available when <samp>-mpclmul</samp> is +used. +</p> +<dl> +<dt><a name="index-_005f_005fbuiltin_005fia32_005fpclmulqdq128"></a>Built-in Function: <em>v2di</em> <strong>__builtin_ia32_pclmulqdq128</strong> <em>(v2di, v2di, const int)</em></dt> +<dd><p>Generates the <code>pclmulqdq</code> machine instruction. +</p></dd></dl> + +<p>The following built-in functions are available when <samp>-mfsgsbase</samp> is +used. All of them generate the machine instruction that is part of the +name. 
+</p> +<div class="smallexample"> +<pre class="smallexample">unsigned int __builtin_ia32_rdfsbase32 (void); +unsigned long long __builtin_ia32_rdfsbase64 (void); +unsigned int __builtin_ia32_rdgsbase32 (void); +unsigned long long __builtin_ia32_rdgsbase64 (void); +void _writefsbase_u32 (unsigned int); +void _writefsbase_u64 (unsigned long long); +void _writegsbase_u32 (unsigned int); +void _writegsbase_u64 (unsigned long long); +</pre></div> + +<p>The following built-in functions are available when <samp>-mrdrnd</samp> is +used. All of them generate the machine instruction that is part of the +name. +</p> +<div class="smallexample"> +<pre class="smallexample">unsigned int __builtin_ia32_rdrand16_step (unsigned short *); +unsigned int __builtin_ia32_rdrand32_step (unsigned int *); +unsigned int __builtin_ia32_rdrand64_step (unsigned long long *); +</pre></div> + +<p>The following built-in functions are available when <samp>-mptwrite</samp> is +used. All of them generate the machine instruction that is part of the +name. +</p> +<div class="smallexample"> +<pre class="smallexample">void __builtin_ia32_ptwrite32 (unsigned); +void __builtin_ia32_ptwrite64 (unsigned long long); +</pre></div> + +<p>The following built-in functions are available when <samp>-msse4a</samp> is used. +All of them generate the machine instruction that is part of the name. +</p> +<div class="smallexample"> +<pre class="smallexample">void __builtin_ia32_movntsd (double *, v2df); +void __builtin_ia32_movntss (float *, v4sf); +v2di __builtin_ia32_extrq (v2di, v16qi); +v2di __builtin_ia32_extrqi (v2di, const unsigned int, const unsigned int); +v2di __builtin_ia32_insertq (v2di, v2di); +v2di __builtin_ia32_insertqi (v2di, v2di, const unsigned int, const unsigned int); +</pre></div> + +<p>The following built-in functions are available when <samp>-mxop</samp> is used. 
+</p><div class="smallexample"> +<pre class="smallexample">v2df __builtin_ia32_vfrczpd (v2df); +v4sf __builtin_ia32_vfrczps (v4sf); +v2df __builtin_ia32_vfrczsd (v2df); +v4sf __builtin_ia32_vfrczss (v4sf); +v4df __builtin_ia32_vfrczpd256 (v4df); +v8sf __builtin_ia32_vfrczps256 (v8sf); +v2di __builtin_ia32_vpcmov (v2di, v2di, v2di); +v2di __builtin_ia32_vpcmov_v2di (v2di, v2di, v2di); +v4si __builtin_ia32_vpcmov_v4si (v4si, v4si, v4si); +v8hi __builtin_ia32_vpcmov_v8hi (v8hi, v8hi, v8hi); +v16qi __builtin_ia32_vpcmov_v16qi (v16qi, v16qi, v16qi); +v2df __builtin_ia32_vpcmov_v2df (v2df, v2df, v2df); +v4sf __builtin_ia32_vpcmov_v4sf (v4sf, v4sf, v4sf); +v4di __builtin_ia32_vpcmov_v4di256 (v4di, v4di, v4di); +v8si __builtin_ia32_vpcmov_v8si256 (v8si, v8si, v8si); +v16hi __builtin_ia32_vpcmov_v16hi256 (v16hi, v16hi, v16hi); +v32qi __builtin_ia32_vpcmov_v32qi256 (v32qi, v32qi, v32qi); +v4df __builtin_ia32_vpcmov_v4df256 (v4df, v4df, v4df); +v8sf __builtin_ia32_vpcmov_v8sf256 (v8sf, v8sf, v8sf); +v16qi __builtin_ia32_vpcomeqb (v16qi, v16qi); +v8hi __builtin_ia32_vpcomeqw (v8hi, v8hi); +v4si __builtin_ia32_vpcomeqd (v4si, v4si); +v2di __builtin_ia32_vpcomeqq (v2di, v2di); +v16qi __builtin_ia32_vpcomequb (v16qi, v16qi); +v4si __builtin_ia32_vpcomequd (v4si, v4si); +v2di __builtin_ia32_vpcomequq (v2di, v2di); +v8hi __builtin_ia32_vpcomequw (v8hi, v8hi); +v8hi __builtin_ia32_vpcomeqw (v8hi, v8hi); +v16qi __builtin_ia32_vpcomfalseb (v16qi, v16qi); +v4si __builtin_ia32_vpcomfalsed (v4si, v4si); +v2di __builtin_ia32_vpcomfalseq (v2di, v2di); +v16qi __builtin_ia32_vpcomfalseub (v16qi, v16qi); +v4si __builtin_ia32_vpcomfalseud (v4si, v4si); +v2di __builtin_ia32_vpcomfalseuq (v2di, v2di); +v8hi __builtin_ia32_vpcomfalseuw (v8hi, v8hi); +v8hi __builtin_ia32_vpcomfalsew (v8hi, v8hi); +v16qi __builtin_ia32_vpcomgeb (v16qi, v16qi); +v4si __builtin_ia32_vpcomged (v4si, v4si); +v2di __builtin_ia32_vpcomgeq (v2di, v2di); +v16qi __builtin_ia32_vpcomgeub (v16qi, v16qi); +v4si 
__builtin_ia32_vpcomgeud (v4si, v4si); +v2di __builtin_ia32_vpcomgeuq (v2di, v2di); +v8hi __builtin_ia32_vpcomgeuw (v8hi, v8hi); +v8hi __builtin_ia32_vpcomgew (v8hi, v8hi); +v16qi __builtin_ia32_vpcomgtb (v16qi, v16qi); +v4si __builtin_ia32_vpcomgtd (v4si, v4si); +v2di __builtin_ia32_vpcomgtq (v2di, v2di); +v16qi __builtin_ia32_vpcomgtub (v16qi, v16qi); +v4si __builtin_ia32_vpcomgtud (v4si, v4si); +v2di __builtin_ia32_vpcomgtuq (v2di, v2di); +v8hi __builtin_ia32_vpcomgtuw (v8hi, v8hi); +v8hi __builtin_ia32_vpcomgtw (v8hi, v8hi); +v16qi __builtin_ia32_vpcomleb (v16qi, v16qi); +v4si __builtin_ia32_vpcomled (v4si, v4si); +v2di __builtin_ia32_vpcomleq (v2di, v2di); +v16qi __builtin_ia32_vpcomleub (v16qi, v16qi); +v4si __builtin_ia32_vpcomleud (v4si, v4si); +v2di __builtin_ia32_vpcomleuq (v2di, v2di); +v8hi __builtin_ia32_vpcomleuw (v8hi, v8hi); +v8hi __builtin_ia32_vpcomlew (v8hi, v8hi); +v16qi __builtin_ia32_vpcomltb (v16qi, v16qi); +v4si __builtin_ia32_vpcomltd (v4si, v4si); +v2di __builtin_ia32_vpcomltq (v2di, v2di); +v16qi __builtin_ia32_vpcomltub (v16qi, v16qi); +v4si __builtin_ia32_vpcomltud (v4si, v4si); +v2di __builtin_ia32_vpcomltuq (v2di, v2di); +v8hi __builtin_ia32_vpcomltuw (v8hi, v8hi); +v8hi __builtin_ia32_vpcomltw (v8hi, v8hi); +v16qi __builtin_ia32_vpcomneb (v16qi, v16qi); +v4si __builtin_ia32_vpcomned (v4si, v4si); +v2di __builtin_ia32_vpcomneq (v2di, v2di); +v16qi __builtin_ia32_vpcomneub (v16qi, v16qi); +v4si __builtin_ia32_vpcomneud (v4si, v4si); +v2di __builtin_ia32_vpcomneuq (v2di, v2di); +v8hi __builtin_ia32_vpcomneuw (v8hi, v8hi); +v8hi __builtin_ia32_vpcomnew (v8hi, v8hi); +v16qi __builtin_ia32_vpcomtrueb (v16qi, v16qi); +v4si __builtin_ia32_vpcomtrued (v4si, v4si); +v2di __builtin_ia32_vpcomtrueq (v2di, v2di); +v16qi __builtin_ia32_vpcomtrueub (v16qi, v16qi); +v4si __builtin_ia32_vpcomtrueud (v4si, v4si); +v2di __builtin_ia32_vpcomtrueuq (v2di, v2di); +v8hi __builtin_ia32_vpcomtrueuw (v8hi, v8hi); +v8hi __builtin_ia32_vpcomtruew (v8hi, v8hi); 
+v4si __builtin_ia32_vphaddbd (v16qi); +v2di __builtin_ia32_vphaddbq (v16qi); +v8hi __builtin_ia32_vphaddbw (v16qi); +v2di __builtin_ia32_vphadddq (v4si); +v4si __builtin_ia32_vphaddubd (v16qi); +v2di __builtin_ia32_vphaddubq (v16qi); +v8hi __builtin_ia32_vphaddubw (v16qi); +v2di __builtin_ia32_vphaddudq (v4si); +v4si __builtin_ia32_vphadduwd (v8hi); +v2di __builtin_ia32_vphadduwq (v8hi); +v4si __builtin_ia32_vphaddwd (v8hi); +v2di __builtin_ia32_vphaddwq (v8hi); +v8hi __builtin_ia32_vphsubbw (v16qi); +v2di __builtin_ia32_vphsubdq (v4si); +v4si __builtin_ia32_vphsubwd (v8hi); +v4si __builtin_ia32_vpmacsdd (v4si, v4si, v4si); +v2di __builtin_ia32_vpmacsdqh (v4si, v4si, v2di); +v2di __builtin_ia32_vpmacsdql (v4si, v4si, v2di); +v4si __builtin_ia32_vpmacssdd (v4si, v4si, v4si); +v2di __builtin_ia32_vpmacssdqh (v4si, v4si, v2di); +v2di __builtin_ia32_vpmacssdql (v4si, v4si, v2di); +v4si __builtin_ia32_vpmacsswd (v8hi, v8hi, v4si); +v8hi __builtin_ia32_vpmacssww (v8hi, v8hi, v8hi); +v4si __builtin_ia32_vpmacswd (v8hi, v8hi, v4si); +v8hi __builtin_ia32_vpmacsww (v8hi, v8hi, v8hi); +v4si __builtin_ia32_vpmadcsswd (v8hi, v8hi, v4si); +v4si __builtin_ia32_vpmadcswd (v8hi, v8hi, v4si); +v16qi __builtin_ia32_vpperm (v16qi, v16qi, v16qi); +v16qi __builtin_ia32_vprotb (v16qi, v16qi); +v4si __builtin_ia32_vprotd (v4si, v4si); +v2di __builtin_ia32_vprotq (v2di, v2di); +v8hi __builtin_ia32_vprotw (v8hi, v8hi); +v16qi __builtin_ia32_vpshab (v16qi, v16qi); +v4si __builtin_ia32_vpshad (v4si, v4si); +v2di __builtin_ia32_vpshaq (v2di, v2di); +v8hi __builtin_ia32_vpshaw (v8hi, v8hi); +v16qi __builtin_ia32_vpshlb (v16qi, v16qi); +v4si __builtin_ia32_vpshld (v4si, v4si); +v2di __builtin_ia32_vpshlq (v2di, v2di); +v8hi __builtin_ia32_vpshlw (v8hi, v8hi); +</pre></div> + +<p>The following built-in functions are available when <samp>-mfma4</samp> is used. +All of them generate the machine instruction that is part of the name. 
+</p> +<div class="smallexample"> +<pre class="smallexample">v2df __builtin_ia32_vfmaddpd (v2df, v2df, v2df); +v4sf __builtin_ia32_vfmaddps (v4sf, v4sf, v4sf); +v2df __builtin_ia32_vfmaddsd (v2df, v2df, v2df); +v4sf __builtin_ia32_vfmaddss (v4sf, v4sf, v4sf); +v2df __builtin_ia32_vfmsubpd (v2df, v2df, v2df); +v4sf __builtin_ia32_vfmsubps (v4sf, v4sf, v4sf); +v2df __builtin_ia32_vfmsubsd (v2df, v2df, v2df); +v4sf __builtin_ia32_vfmsubss (v4sf, v4sf, v4sf); +v2df __builtin_ia32_vfnmaddpd (v2df, v2df, v2df); +v4sf __builtin_ia32_vfnmaddps (v4sf, v4sf, v4sf); +v2df __builtin_ia32_vfnmaddsd (v2df, v2df, v2df); +v4sf __builtin_ia32_vfnmaddss (v4sf, v4sf, v4sf); +v2df __builtin_ia32_vfnmsubpd (v2df, v2df, v2df); +v4sf __builtin_ia32_vfnmsubps (v4sf, v4sf, v4sf); +v2df __builtin_ia32_vfnmsubsd (v2df, v2df, v2df); +v4sf __builtin_ia32_vfnmsubss (v4sf, v4sf, v4sf); +v2df __builtin_ia32_vfmaddsubpd (v2df, v2df, v2df); +v4sf __builtin_ia32_vfmaddsubps (v4sf, v4sf, v4sf); +v2df __builtin_ia32_vfmsubaddpd (v2df, v2df, v2df); +v4sf __builtin_ia32_vfmsubaddps (v4sf, v4sf, v4sf); +v4df __builtin_ia32_vfmaddpd256 (v4df, v4df, v4df); +v8sf __builtin_ia32_vfmaddps256 (v8sf, v8sf, v8sf); +v4df __builtin_ia32_vfmsubpd256 (v4df, v4df, v4df); +v8sf __builtin_ia32_vfmsubps256 (v8sf, v8sf, v8sf); +v4df __builtin_ia32_vfnmaddpd256 (v4df, v4df, v4df); +v8sf __builtin_ia32_vfnmaddps256 (v8sf, v8sf, v8sf); +v4df __builtin_ia32_vfnmsubpd256 (v4df, v4df, v4df); +v8sf __builtin_ia32_vfnmsubps256 (v8sf, v8sf, v8sf); +v4df __builtin_ia32_vfmaddsubpd256 (v4df, v4df, v4df); +v8sf __builtin_ia32_vfmaddsubps256 (v8sf, v8sf, v8sf); +v4df __builtin_ia32_vfmsubaddpd256 (v4df, v4df, v4df); +v8sf __builtin_ia32_vfmsubaddps256 (v8sf, v8sf, v8sf); + +</pre></div> + +<p>The following built-in functions are available when <samp>-mlwp</samp> is used. 
+</p> +<div class="smallexample"> +<pre class="smallexample">void __builtin_ia32_llwpcb16 (void *); +void __builtin_ia32_llwpcb32 (void *); +void __builtin_ia32_llwpcb64 (void *); +void * __builtin_ia32_llwpcb16 (void); +void * __builtin_ia32_llwpcb32 (void); +void * __builtin_ia32_llwpcb64 (void); +void __builtin_ia32_lwpval16 (unsigned short, unsigned int, unsigned short); +void __builtin_ia32_lwpval32 (unsigned int, unsigned int, unsigned int); +void __builtin_ia32_lwpval64 (unsigned __int64, unsigned int, unsigned int); +unsigned char __builtin_ia32_lwpins16 (unsigned short, unsigned int, unsigned short); +unsigned char __builtin_ia32_lwpins32 (unsigned int, unsigned int, unsigned int); +unsigned char __builtin_ia32_lwpins64 (unsigned __int64, unsigned int, unsigned int); +</pre></div> + +<p>The following built-in functions are available when <samp>-mbmi</samp> is used. +All of them generate the machine instruction that is part of the name. +</p><div class="smallexample"> +<pre class="smallexample">unsigned int __builtin_ia32_bextr_u32(unsigned int, unsigned int); +unsigned long long __builtin_ia32_bextr_u64 (unsigned long long, unsigned long long); +</pre></div> + +<p>The following built-in functions are available when <samp>-mbmi2</samp> is used. +All of them generate the machine instruction that is part of the name. +</p><div class="smallexample"> +<pre class="smallexample">unsigned int _bzhi_u32 (unsigned int, unsigned int); +unsigned int _pdep_u32 (unsigned int, unsigned int); +unsigned int _pext_u32 (unsigned int, unsigned int); +unsigned long long _bzhi_u64 (unsigned long long, unsigned long long); +unsigned long long _pdep_u64 (unsigned long long, unsigned long long); +unsigned long long _pext_u64 (unsigned long long, unsigned long long); +</pre></div> + +<p>The following built-in functions are available when <samp>-mlzcnt</samp> is used. +All of them generate the machine instruction that is part of the name. 
+</p><div class="smallexample"> +<pre class="smallexample">unsigned short __builtin_ia32_lzcnt_u16(unsigned short); +unsigned int __builtin_ia32_lzcnt_u32(unsigned int); +unsigned long long __builtin_ia32_lzcnt_u64 (unsigned long long); +</pre></div> + +<p>The following built-in functions are available when <samp>-mfxsr</samp> is used. +All of them generate the machine instruction that is part of the name. +</p><div class="smallexample"> +<pre class="smallexample">void __builtin_ia32_fxsave (void *); +void __builtin_ia32_fxrstor (void *); +void __builtin_ia32_fxsave64 (void *); +void __builtin_ia32_fxrstor64 (void *); +</pre></div> + +<p>The following built-in functions are available when <samp>-mxsave</samp> is used. +All of them generate the machine instruction that is part of the name. +</p><div class="smallexample"> +<pre class="smallexample">void __builtin_ia32_xsave (void *, long long); +void __builtin_ia32_xrstor (void *, long long); +void __builtin_ia32_xsave64 (void *, long long); +void __builtin_ia32_xrstor64 (void *, long long); +</pre></div> + +<p>The following built-in functions are available when <samp>-mxsaveopt</samp> is used. +All of them generate the machine instruction that is part of the name. +</p><div class="smallexample"> +<pre class="smallexample">void __builtin_ia32_xsaveopt (void *, long long); +void __builtin_ia32_xsaveopt64 (void *, long long); +</pre></div> + +<p>The following built-in functions are available when <samp>-mtbm</samp> is used. +Both of them generate the immediate form of the bextr machine instruction. +</p><div class="smallexample"> +<pre class="smallexample">unsigned int __builtin_ia32_bextri_u32 (unsigned int, + const unsigned int); +unsigned long long __builtin_ia32_bextri_u64 (unsigned long long, + const unsigned long long); +</pre></div> + + +<p>The following built-in functions are available when <samp>-m3dnow</samp> is used. +All of them generate the machine instruction that is part of the name. 
+</p> +<div class="smallexample"> +<pre class="smallexample">void __builtin_ia32_femms (void); +v8qi __builtin_ia32_pavgusb (v8qi, v8qi); +v2si __builtin_ia32_pf2id (v2sf); +v2sf __builtin_ia32_pfacc (v2sf, v2sf); +v2sf __builtin_ia32_pfadd (v2sf, v2sf); +v2si __builtin_ia32_pfcmpeq (v2sf, v2sf); +v2si __builtin_ia32_pfcmpge (v2sf, v2sf); +v2si __builtin_ia32_pfcmpgt (v2sf, v2sf); +v2sf __builtin_ia32_pfmax (v2sf, v2sf); +v2sf __builtin_ia32_pfmin (v2sf, v2sf); +v2sf __builtin_ia32_pfmul (v2sf, v2sf); +v2sf __builtin_ia32_pfrcp (v2sf); +v2sf __builtin_ia32_pfrcpit1 (v2sf, v2sf); +v2sf __builtin_ia32_pfrcpit2 (v2sf, v2sf); +v2sf __builtin_ia32_pfrsqrt (v2sf); +v2sf __builtin_ia32_pfsub (v2sf, v2sf); +v2sf __builtin_ia32_pfsubr (v2sf, v2sf); +v2sf __builtin_ia32_pi2fd (v2si); +v4hi __builtin_ia32_pmulhrw (v4hi, v4hi); +</pre></div> + +<p>The following built-in functions are available when <samp>-m3dnowa</samp> is used. +All of them generate the machine instruction that is part of the name. +</p> +<div class="smallexample"> +<pre class="smallexample">v2si __builtin_ia32_pf2iw (v2sf); +v2sf __builtin_ia32_pfnacc (v2sf, v2sf); +v2sf __builtin_ia32_pfpnacc (v2sf, v2sf); +v2sf __builtin_ia32_pi2fw (v2si); +v2sf __builtin_ia32_pswapdsf (v2sf); +v2si __builtin_ia32_pswapdsi (v2si); +</pre></div> + +<p>The following built-in functions are available when <samp>-mrtm</samp> is used. +They are used for restricted transactional memory. These are the internal +low-level functions. Normally the functions in +<a href="x86-transactional-memory-intrinsics.html#x86-transactional-memory-intrinsics">x86 transactional memory intrinsics</a> should be used instead. +</p> +<div class="smallexample"> +<pre class="smallexample">int __builtin_ia32_xbegin (); +void __builtin_ia32_xend (); +void __builtin_ia32_xabort (status); +int __builtin_ia32_xtest (); +</pre></div> + +<p>The following built-in functions are available when <samp>-mmwaitx</samp> is used. 
+All of them generate the machine instruction that is part of the name. +</p><div class="smallexample"> +<pre class="smallexample">void __builtin_ia32_monitorx (void *, unsigned int, unsigned int); +void __builtin_ia32_mwaitx (unsigned int, unsigned int, unsigned int); +</pre></div> + +<p>The following built-in functions are available when <samp>-mclzero</samp> is used. +All of them generate the machine instruction that is part of the name. +</p><div class="smallexample"> +<pre class="smallexample">void __builtin_ia32_clzero (void *); +</pre></div> + +<p>The following built-in functions are available when <samp>-mpku</samp> is used. +They generate reads and writes to PKRU. +</p><div class="smallexample"> +<pre class="smallexample">void __builtin_ia32_wrpkru (unsigned int); +unsigned int __builtin_ia32_rdpkru (); +</pre></div> + +<p>The following built-in functions are available when the +<samp>-mshstk</samp> option is used. They support shadow stack +machine instructions from Intel Control-flow Enforcement Technology (CET). +Each built-in function generates the machine instruction that is part +of the function’s name. These are the internal low-level functions. +Normally the functions in <a href="x86-control_002dflow-protection-intrinsics.html#x86-control_002dflow-protection-intrinsics">x86 control-flow protection intrinsics</a> +should be used instead. 
+</p> +<div class="smallexample"> +<pre class="smallexample">unsigned int __builtin_ia32_rdsspd (void); +unsigned long long __builtin_ia32_rdsspq (void); +void __builtin_ia32_incsspd (unsigned int); +void __builtin_ia32_incsspq (unsigned long long); +void __builtin_ia32_saveprevssp(void); +void __builtin_ia32_rstorssp(void *); +void __builtin_ia32_wrssd(unsigned int, void *); +void __builtin_ia32_wrssq(unsigned long long, void *); +void __builtin_ia32_wrussd(unsigned int, void *); +void __builtin_ia32_wrussq(unsigned long long, void *); +void __builtin_ia32_setssbsy(void); +void __builtin_ia32_clrssbsy(void *); +</pre></div> + +<hr> +<div class="header"> +<p> +Next: <a href="x86-transactional-memory-intrinsics.html#x86-transactional-memory-intrinsics" accesskey="n" rel="next">x86 transactional memory intrinsics</a>, Previous: <a href="TI-C6X-Built_002din-Functions.html#TI-C6X-Built_002din-Functions" accesskey="p" rel="previous">TI C6X Built-in Functions</a>, Up: <a href="Target-Builtins.html#Target-Builtins" accesskey="u" rel="up">Target Builtins</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Indices.html#Indices" title="Index" rel="index">Index</a>]</p> +</div> + + + +</body> +</html> |