Age | Commit message (Collapse) | Author |
|
If a float or double argument needs to be passed in core
register to a @CriticalNative method due to soft-float
native ABI, insert a fake call to Float.floatToRawIntBits()
or Double.doubleToRawLongBits() to satisfy type checks in
the compiler.
We cannot do that for intrinsics that expect those inputs in
actual FP registers, so we still prevent such intrinsics
from using `kCallCriticalNative`. This should be irrelevant
if an actual intrinsic implementation is emitted. There are
currently two unimplemented intrinsics that are affected by
the carve-out, namely MathRoundDouble and FP16ToHalf, and
four intrinsics implemented only when ARMv8A is supported,
namely MathRint, MathRoundFloat, MathCeil and MathFloor.
Test: testrunner.py --target --32 -t 178-app-image-native-method
Bug: 112189621
Change-Id: Id14ef4f49f8a0e6489f97dc9588c0e6a5c122632
|
|
A pattern seen in libcore and SPECjvm2008 workloads is a pair of HRem/HDiv
having the same dividend and divisor. The code generator processes
them separately and generates duplicated instructions calculating HDiv.
This CL adds detection of such a pattern to the instruction simplifier.
This optimization affects HInductionVarAnalysis and HLoopOptimization
preventing some loop optimizations. To avoid this the instruction simplifier
has the loop_friendly mode which means not to optimize HRems if they are in a loop.
A microbenchmark run on Pixel 3 shows the following improvements:
| little cores | big cores
arm32 Int32 | +21% | +40%
arm32 Int64 | +46% | +44%
arm64 Int32 | +27% | +14%
arm64 Int64 | +33% | +27%
Test: 411-checker-instruct-simplifier-hrem
Test: test.py --host --optimizing --jit --gtest --interpreter
Test: test.py --target --optimizing --jit --interpreter
Test: run-gtests.sh
Change-Id: I376a1bd299d7fe10acad46771236edd5f85dfe56
|
|
Make LSA a helper class, not an optimization pass. Move all
its allocations to ScopedArenaAllocator to reduce the peak
memory usage a little bit.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I7fc634abe732d22c99005921ffecac5207bcf05f
|
|
This avoids passing the `VariableSizedHandleScope*` argument
around and eliminates HGraph::inexact_object_rti_ and its
initialization. The latter shall allow running Optimizing
gtests that do not require type information without creating
a Runtime in future. (To be implemented in a separate CL.)
Test: m test-art-host-gtest
Test: testrunner.py --host --optmizing
Test: aosp_taimen-userdebug boots.
Change-Id: I36fe9bc556c6d610d644c8c14cc74c9985a14d64
|
|
Test: aosp_taimen-userdebug boots.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 147346243
Change-Id: I97fdc15e568ae3fe390efb1da690343025f84944
|
|
This reverts commit e2727154f25e0db9a5bb92af494d8e47b181dfcf.
Reason for revert: Breaks ASAN tests (ODR violation).
Bug: 142365358
Change-Id: I38103d74a1297256c81d90872b6902ff1e9ef7a4
|
|
Make symbols in compiler/optimizing hidden by a namespace
attribute. The unit intrinsic_objects.{h,cc} is excluded as
it is needed by dex2oat.
As the symbols are no longer exported, gtests are now linked
with the static version of the libartd-compiler library.
libart-compiler.so size:
- before:
arm: 2396152
arm64: 3345280
- after:
arm: 2016176 (-371KiB, -15.9%)
arm64: 2874480 (-460KiB, -14.1%)
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Bug: 142365358
Change-Id: I1fb04a33351f53f00b389a1642e81a68e40912a8
|
|
Preparation for moving CompilerDriver and other stuff
from libart-compiler.so to dex2oat.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: Ic221ebca4b8c79dd1549316921ace655f2e3f0fe
|
|
Treat verification results and image classes as mutable
only in CompilerDriver::PreCompile(), and treat them as
immutable during compilation, accessed through the
CompilerOptions. This severs the dependency of the inliner
on the CompilerDriver.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I594a0213ca6a5003c19b4bd488af98db4358d51d
|
|
This patch performs instruction simplification to
generate instructions andn, blsmsk and blsr on
cpus that have avx2.
Test: test.py --host --64, test-art-host-gtest
Change-Id: Ie41a1b99ac2980f1e9f6a831a7d639bc3e248f0f
Signed-off-by: Shalini Salomi Bodapati <shalini.salomi.bodapati@intel.com>
|
|
Instead just recognize the intrinsic when creating an invoke
instruction.
Also remove some old code related to compiler driver sharpening.
Test: test.py
Change-Id: Iecb668f30e95034970fcf57160ca12092c9c610d
|
|
Make the last sharpening helper (methods) like the other
helpers: being invoked by the instruction builder.
Test: test.py
Change-Id: Ic80a454f9b59b0b4ef7825590b24402500ba851c
|
|
This reverts commit 61908880e6565acfadbafe93fa64de000014f1a6.
Reason for revert: By failing to round multiply results, it does not follow Java rounding rules.
Change-Id: Ic0ef08691bef266c9f8d91973e596e09ff3307c6
|
|
This patch adds a new cpu vaiant named kabylake and performs
instruction simplification to generate VectorMulitplyAccumulate.
Test: ./test.py --host --64
Change-Id: Ie6cc882dadf1322dd4d3ae49bfdb600b0c447765
Signed-off-by: Gupta Kumar, Sanjiv <sanjiv.kumar.gupta@intel.com>
|
|
Rationale:
The change introduces actual conditional passes
(dependence on inliner). This ensures more
cases are optimized downstream without
needlessly introducing compile-time.
NOTE:
Some checker tests needed to be rewritten
due to subtle changes in the phase ordering.
No optimizations were harmed in the process,
though.
Bug: b/78171933, b/74026074
Test: test-art-host,target
Change-Id: I335260df780e14ba1f22499ad74d79060c7be44d
|
|
Rationale:
The change adds a return value to Run() in preparation of
conditional pass execution. The value returned by Run() is
best effort, returning false means no optimizations were
applied or no useful information was obtained. I filled
in a few cases with more exact information, others
still just return true. In addition, it integrates inlining
as a regular pass, avoiding the ugly "break" into
optimizations1 and optimziations2.
Bug: b/78171933, b/74026074
Test: test-art-host,target
Change-Id: Ia39c5c83c01dcd79841e4b623917d61c754cf075
|
|
Rationale:
Refactors the way we set up optimization passes
in the compiler into a more centralized approach.
The refactoring also found some "holes" in the
existing mechanism (missing string lookup in
the debugging mechanism, or inablity to set
alternative name for optimizations that may repeat).
Bug: 64538565
Test: test-art-host test-art-target
Change-Id: Ie5e0b70f67ac5acc706db91f64612dff0e561f83
|
|
Remove all copies of 'MaybeRecordStat', replacing them with a single
OptimizingCompilerStats::MaybeRecordStat helper.
Change-Id: I83b96b41439dccece3eee2e159b18c95336ea933
|
|
This change introduces new dex2oat switch --run-passes=. This switch
accepts path to a text file with names of passes to run.
Compiler will run optimization passes specified in the file rather
then the default ones.
There is no verification implemented on the compiler side. It is user's
responsibility to provide a list of passes that leads to successful
generation of correct code. Care should be taken to prepare a list
that satisfies all dependencies between optimizations.
We only take control of the optional optimizations. Codegen (builder),
and all passes required for register allocation will run unaffected
by this mechanism.
Change-Id: Ic3694e53515fefcc5ce6f28d9371776b5afcbb4f
|
|
This adds the ability to track where we allocate memory
when the kArenaAllocatorCountAllocations flag is turned on.
Also move some allocations from native heap to the Arena
and remove some unnecessary utilities.
Bug: 23736311
Change-Id: I1aaef3fd405d1de444fe9e618b1ce7ecef07ade3
|
|
Rationale:
Types (int, float etc.) and access type (field vs. array)
can be used to disambiguate write/read side-effects analysis.
This directly improves e.g. dead code elimination and licm.
Change-Id: I371f6909a3f42bda13190a03f04c4a867bde1d06
|
|
This patch refactors the way GraphChecker is invoked, utilizing the
same scoping mechanism as pass timing and graph visualizer. Therefore,
GraphChecker will now run not just after instances of HOptimization
but after the builders and reg alloc, too.
Change-Id: I8173b98b79afa95e1fcbf3ac9630a873d7f6c1d4
|
|
This should reduce the stack size needed by the
OptimizingCompiler::CompileOptimized() which was very
close to our limits for clang builds, causing repeated
build breakages on otherwise healthy changes:
art/compiler/optimizing/optimizing_compiler.cc:395:37:
error: stack frame size of 1760 bytes in function
'art::OptimizingCompiler::CompileOptimized'
[-Werror,-Wframe-larger-than=]
Change-Id: I2f4ab0235f4eac61823a4a320bb4fe78942a23c2
|
|
Adds a new pass which finds all unreachable blocks, typically due to
simplifying an if-condition to a constant, and removes them from the
graph. The patch also slightly generalizes the graph-transforming
operations.
Change-Id: Iff7c97f1d10b52886f3cd7401689ebe1bfdbf456
|
|
Until the global CFLAGS are fixed, add Wunused. Fix declarations
in the optimizing compiler.
Change-Id: Ic4553f08e809dc54f3d82af57ac592622c98e000
|
|
- propagate reference types between instructions
- remove checked casts when possible
- add StackHandleScopeCollection to manage an arbitrary number of stack
handles (see comments)
Change-Id: I31200067c5e7375a5ea8e2f873c4374ebdb5ee60
|
|
This patch refactors the way HGraph objects are created, moving the
instantiation out of the Builder class and creating the CodeGenerator
earlier. The patch uses this to build a single interface for printing
timings info and dumping the CFG.
Change-Id: I2eb63eabf28e2d0f5cdc7affaa690c3a4b1bdd21
|
|
Change-Id: I9c8afb0a58ef45e568576015473cbfd5f011c242
|
|
Change-Id: Icbab884d2dfd71656347368b424cb35cbf524051
|
|
Move existing optimizations to it.
Change-Id: I3b43f9997faf4ed8875162e3a3abdf99375478dd
|
|
This reverts commit 1ddbf6d4b37979a9f11a203c12befd5ae8b65df4.
Change-Id: I110a14668d1564ee0604dc958b91394b40da89fc
|
|
This reverts commit bf9cd7ba2118a75f5aa9b56241c4d5fa00dedeb8.
Change-Id: I0a483446666c9c24c45925a5fc199debdefd8b3e
|
|
- Add art::HOptimization.
- Rename art::ConstantPropagation to art::HConstantFolding in
compiler/optimizing/constant_folding.h to avoid name
clashes with a class of the same name in
compiler/dex/post_opt_passes.h.
- Rename art::DeadCodeElimination to
art::HDeadCodeElimination for consistency reasons.
- Have art::HDeadCodeElimination and art::HConstantFolding
derive from art::HOptimization.
- Start to use these optimizations in
art:OptimizingCompiler::TryCompile.
Change-Id: Iaab350c122d87b2333b3760312b15c0592d7e010
|