Age | Commit message (Collapse) | Author |
|
- Code refactoring to dissociate CHA attempts and devirtualization
attemps
- Only devirtualize (currently invoke-interface -> invoke-virtual) when
the target can be statically resolved. The benefits for CHA and inline
caches are less clear.
Test: test.py
Change-Id: I2d41cef8143ab1ce66b2c2e149674eaf228d15a3
|
|
This reverts commit 791df7a161ecfa28eb69862a4bc285282463b960.
This unreverts commit fc1ce4e8be0d977e3d41699f5ec746d68f63c024.
This unreverts commit b8686ce4c93eba7192ed7ef89e7ffd9f3aa6cd07.
We incorrectly failed to include PredicatedInstanceFieldGet in a few
conditions, including a DCHECK. This caused tests to fail under the
read-barrier-table-lookup configuration.
Reason for revert: Fixed 2 incorrect checks
Bug: 67037140
Test: ./art/test/testrunner/run_build_test_target.py -j70 art-gtest-read-barrier-table-lookup
Change-Id: I32b01b29fb32077fb5074e7c77a0226bd1fcaab4
|
|
This reverts commit fc1ce4e8be0d977e3d41699f5ec746d68f63c024.
Bug: 67037140
Reason for revert: Fails read-barrier-table-lookup tests.
Change-Id: I373867c728789bc14a4370b93a045481167d5f76
|
|
This reverts commit 47ac53100303e7e864b7f6d65f17b23088ccf1d6.
There was a bug in LSE where we would incorrectly record the
shadow$_monitor_ field as not having a default initial value. This
caused partial LSE to be unable to compile the Object.identityHashCode
function, causing crashes. This issue was fixed in a parent CL. Also
updated all Offsets in LSE_test to be outside of the object header
regardless of configuration.
Test: ./test.py --host
Bug: 67037140
Reason for revert: Fixed issue with shadow$_monitor_ field and offsets
Change-Id: I4fb2afff4d410da818db38ed833927dfc0f6be33
|
|
This reverts commit b8686ce4c93eba7192ed7ef89e7ffd9f3aa6cd07.
Bug: 67037140
Reason for revert: Fails a few tests.
Change-Id: Icf0635bffbfbba93bf0a5b854a9582c418198136
|
|
Add partial load-store elimination to the LSE pass. Partial LSE will
move object allocations which only escape along certain execution
paths closer to the escape point and allow more values to be
eliminated. It does this by creating new predicated load and store
instructions that are used when an object has only escaped some of the
time. In cases where the object has not escaped a default value will
be used.
Test: ./test.py --host
Test: ./test.py --target
Bug: 67037140
Change-Id: Idde67eb59ec90de79747cde17b552eec05b58497
|
|
We incorrectly handled merging unknowns in some situations.
Specifically in cases where we are unable to materialize loop-phis we
could end up with PureUnknowns which could end up hiding stores that
need to be kept.
In an unrelated issue we were incorrectly considering some values as
escapes when live at the point of an invoke. Since
SearchPhiPlaceholdersForKeptStores used a more precise notion of
escapes we could end up removing stores without being able to replace
the values.
This reverts commit 2316b3a0779f3721a78681f5c70ed6624ecaebef.
This unreverts commit b6837f0350ff66c13582b0e94178dd5ca283ff0a
This reverts commit fe270426c8a2a69a8f669339e83b86fbf40e25a1.
This unreverts commit bb6cda60e4418c0ab557ea4090e046bed8206763.
Bug: 67037140
Bug: 173120044
Reason for revert: Fixed issue causing incorrect store elimination
Test: ./test.py --host
Test: Boot cuttlefish
atest FrameworksServicesTests:com.android.server.job.BackgroundRestrictionsTest#testPowerWhiteList
Change-Id: I2ebae9ccfaf5169d551c5019b547589d0fce1dc9
|
|
This reverts commit b6837f0350ff66c13582b0e94178dd5ca283ff0a
This unreverts commit fe270426c8a2a69a8f669339e83b86fbf40e25a1.
This rereverts commit bb6cda60e4418c0ab557ea4090e046bed8206763.
Bug: 67037140
Bug: 173120044
Reason for revert: Git-blame seems to point to the CL as cause of
b/173120044. Revert during investigation.
Change-Id: I46f557ce79c15f07f4e77aacded1926b192754c3
|
|
A ScopedArenaAllocator in a single test was accidentally loaded using
operator new which is not supported. This caused a memory leak.
This reverts commit fe270426c8a2a69a8f669339e83b86fbf40e25a1.
This unreverts commit bb6cda60e4418c0ab557ea4090e046bed8206763.
Bug: 67037140
Reason for revert: Fixed memory leak in
LoadStoreAnalysisTest.PartialEscape test case
Test: SANITIZE_HOST=address ASAN_OPTIONS=detect_leaks=0 m test-art-host-gtest-dependencies
Run art_compiler_tests
Change-Id: I34fa2079df946ae54b8c91fa771a44d56438a719
|
|
This reverts commit bb6cda60e4418c0ab557ea4090e046bed8206763.
Bug: 67037140
Reason for revert: memory leak detected in the test.
Change-Id: I81cc2f61494e96964d8be40389eddcd7c66c9266
|
|
This is the first piece of partial LSE for art. This CL adds analysis
tools needed to implement partial LSE. More immediately, it improves
LSE so that it will remove stores that are provably non-observable
based on the location they occur. For example:
```
Foo o = new Foo();
if (xyz) {
check(foo);
foo.x++;
} else {
foo.x = 12;
}
return foo.x;
```
The store of 12 can be removed because the only escape in this method
is unreachable and was not executed by the point we reach the store.
The main purpose of this CL is to add the analysis tools needed to
implement partial Load-Store elimination. Namely it includes tracking
of which blocks are escaping and the groups of blocks that we cannot
remove allocations from.
The actual impact of this change is incredibly minor, being triggered
only once in a AOSP code. go/lem shows only minor effects to
compile-time and no effect on the compiled code. See
go/lem-allight-partial-lse-2 for numbers. Compile time shows an
average of 1.4% regression (max regression is 7% with 0.2 noise).
This CL adds a new 'reachability' concept to the HGraph. If this has
been calculated it allows one to quickly query whether there is any
execution path containing two blocks in a given order. This is used to
define a notion of sections of graph from which the escape of some
allocation is inevitable.
Test: art_compiler_tests
Test: treehugger
Bug: 67037140
Change-Id: I0edc8d6b73f7dd329cb1ea7923080a0abe913ea6
|
|
Pass enums by value instead of const reference.
Do not generate operator<< sources for headers that have no
enums or no declarations of operator<<. Do not define the
operator<< for flag enums; these were unused anyway.
Add generated operator<< for some enums in nodes.h . Change
the operator<< for ComparisonBias so that the graph
visualizer can use it but do not use the generated
operator<< yet as that would require changing checker tests.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: Ifd4c455c2fa921a9668c966a13068d43b9c6e173
|
|
Added reasons for polymorphic invoke, custom, and unresolved.
Added a counter for the total number of inline attempts.
Test: run dex2oat on APK with --dump-stats
Change-Id: I57aa83dc7ac5fa8897b0c197f416baf46fbe9d53
|
|
This reverts commit e2727154f25e0db9a5bb92af494d8e47b181dfcf.
Reason for revert: Breaks ASAN tests (ODR violation).
Bug: 142365358
Change-Id: I38103d74a1297256c81d90872b6902ff1e9ef7a4
|
|
Make symbols in compiler/optimizing hidden by a namespace
attribute. The unit intrinsic_objects.{h,cc} is excluded as
it is needed by dex2oat.
As the symbols are no longer exported, gtests are now linked
with the static version of the libartd-compiler library.
libart-compiler.so size:
- before:
arm: 2396152
arm64: 3345280
- after:
arm: 2016176 (-371KiB, -15.9%)
arm64: 2874480 (-460KiB, -14.1%)
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Bug: 142365358
Change-Id: I1fb04a33351f53f00b389a1642e81a68e40912a8
|
|
We currently don't handle this in the stack map, where we only encode
one stack slot for a dex register.
Bug: 136698025
Test: 721-osr
Change-Id: Ib395ed1165387ad5446a463c307cc0a45e365885
|
|
Remove over-broad use in headers. Fix up transitive includes.
Bug: 119869270
Test: mmma art
Change-Id: I518fa7c8bee014b260818fca1fbde6ec47d126da
|
|
Move the analysis after redundant phi and dead phi elimination,
knowing that only graphs with irreducible loops may still have
a phi as input of the invoke. In such a case, we bail.
bug: 112537407
Test: 563-checker-fake-string
Change-Id: Ib9eefa4ce905b7fb418ca9b2a3c26ea4df74ce8f
|
|
We wrongly assumed only irreducible loops could lead
to this situation, but any loop can actually be in between
a String NewInstance and a String.<init>.
bug: 109666561
Test: 563-checker-fakestring
Change-Id: I018a22f7e22c15e140252544415f51d544f7cc13
|
|
The current code doesn't work when dex register aliases.
bug: 78493232
Test: m
Change-Id: I1c296f6dc914388844ae5eb7d84f3bd7d81e1f87
|
|
The compiler stats have their own dex2oat parameter, the restriction
to debug build or verbose logging is antiquated.
Test: m
Change-Id: Idcbe5753bb2149a9694e39d7fa6ba7902e9c7810
|
|
Disabled the build time flag. (No image version bump needed.)
Bug: 26687569
Bug: 64692057
Bug: 76420366
This reverts commit 3fbd3ad99fad077e5c760e7238bcd55b07d4c06e.
Change-Id: I5d83c4ce8a7331c435d5155ac6e0ce1c77d60004
|
|
This reverts commit 3f41323cc9da335e9aa4f3fbad90a86caa82ee4d.
Reason for revert: Fails sporadically.
Bug: 26687569
Bug: 64692057
Bug: 76420366
Change-Id: I84d1e9e46c58aeecf17591ff71fbac6a1e583909
|
|
Add extra output for debugging failures and re-enable
the bitstring type checks.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Test: testrunner.py --host -t 670-bitstring-type-check
Test: Pixel 2 XL boots.
Test: testrunner.py --target --optimizing --jit
Test: testrunner.py --target -t 670-bitstring-type-check
Bug: 64692057
Bug: 26687569
This reverts commit bff7a52e2c6c9e988c3ed1f12a2da0fa5fd37cfb.
Change-Id: I090e241983f3ac6ed8394d842e17716087d169ac
|
|
Enforce the layering that code in runtime/base should not depend on
runtime by separating it into libartbase. Some of the code in
runtime/base depends on the Runtime class, so it cannot be moved yet.
Also, some of the tests depend on CommonRuntimeTest, which itself needs
to be factored (in a subsequent CL).
Bug: 22322814
Test: make -j 50 checkbuild
make -j 50 test-art-host
Change-Id: I8b096c1e2542f829eb456b4b057c71421b77d7e2
|
|
Bug: 64692057
Bug: 71853552
Bug: 26687569
This reverts commit eb0ebed72432b3c6b8c7b38f8937d7ba736f4567.
Change-Id: I7daeaa077960ba41b2ed42bc47f17501621be4be
|
|
We guard the use of this feature with a compile-time flag,
set to true in this CL.
Boot image size for aosp_taimen-userdebug in AOSP master:
- before:
arm boot*.oat: 63604740
arm64 boot*.oat: 74237864
- after:
arm boot*.oat: 63531172 (-72KiB, -0.1%)
arm64 boot*.oat: 74135008 (-100KiB, -0.1%)
The new TypeCheckBenchmark yields the following changes
using the little cores of taimen fixed at 1.4016GHz:
32-bit 64-bit
timeCheckCastLevel1ToLevel1 11.48->15.80 11.47->15.78
timeCheckCastLevel2ToLevel1 15.08->15.79 15.08->15.79
timeCheckCastLevel3ToLevel1 19.01->15.82 17.94->15.81
timeCheckCastLevel9ToLevel1 42.55->15.79 42.63->15.81
timeCheckCastLevel9ToLevel2 39.70->14.36 39.70->14.35
timeInstanceOfLevel1ToLevel1 13.74->17.93 13.76->17.95
timeInstanceOfLevel2ToLevel1 17.02->17.95 16.99->17.93
timeInstanceOfLevel3ToLevel1 24.03->17.95 24.45->17.95
timeInstanceOfLevel9ToLevel1 47.13->17.95 47.14->18.00
timeInstanceOfLevel9ToLevel2 44.19->16.52 44.27->16.51
This suggests that the bitstring typecheck should not be
used for exact type checks which would be equivalent to the
"Level1ToLevel1" benchmark. Whether the implementation is
a beneficial replacement for the kClassHierarchyCheck and
kAbstractClassCheck on average depends on how many levels
from the target class (or Object for a negative result) is
a typical object's class.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Test: testrunner.py --host -t 670-bitstring-type-check
Test: Pixel 2 XL boots.
Test: testrunner.py --target --optimizing --jit
Test: testrunner.py --target -t 670-bitstring-type-check
Bug: 64692057
Bug: 71853552
Bug: 26687569
Change-Id: I538d7e036b5a8ae2cc3fe77662a5903d74854562
|
|
Rationale:
With simple dex bytecode analysis, the inliner marks methods
that always throw to help subsequent code sinking. This reduces
overhead of non-nullable enforcing calls found in e.g the Kotlin
runtime library (1%-2% improvement on tree microbenchmark, about
5% on Denis' benchmark).
Test: test-art-host test-art-target
Change-Id: I45348f049721476828eb5443738021720d2857c0
|
|
Replace wherever possible. ART's base/logging is now mainly VLOG
and initialization code that is unnecessary to pull in and makes
changes to verbose logging more painful than they have to be.
Test: m test-art-host
Change-Id: I3e3a4672ba5b621e57590a526c7d1c8b749e4f6e
|
|
Add statistics for intrinsic and native stub compilation
and JIT failing to allocate memory for committing the
code. Clean up recording of compilation statistics.
New statistics when building aosp_taimen-userdebug boot
image with --dump-stats:
Attempted compilation of 94304 methods: 99.99% (94295) compiled.
OptStat#AttemptBytecodeCompilation: 89487
OptStat#AttemptIntrinsicCompilation: 160
OptStat#CompiledNativeStub: 4733
OptStat#CompiledIntrinsic: 84
OptStat#CompiledBytecode: 89478
...
where 94304=89487+4733+84 and 94295=89478+4733+84.
Test: testrunner.py -b --host --optimizing
Test: Manually inspect output of building boot image
with --dump-stats.
Bug: 69627511
Change-Id: I15eb2b062a96f09a7721948bcc77b83ee4f18efd
|
|
|
|
Introduce a new "Constructor Fence Redundancy Elimination" pass.
The pass currently performs local optimization only, i.e. within instructions
in the same basic block.
All constructor fences preceding a publish (e.g. store, invoke) get
merged into one instruction.
==============
OptStat#ConstructorFenceGeneratedNew: 43825
OptStat#ConstructorFenceGeneratedFinal: 17631 <+++
OptStat#ConstructorFenceRemovedLSE: 164
OptStat#ConstructorFenceRemovedPFRA: 9391
OptStat#ConstructorFenceRemovedCFRE: 16133 <---
Removes ~91.5% of the 'final' constructor fences in RitzBenchmark:
(We do not distinguish the exact reason that a fence was created, so
it's possible some "new" fences were also removed.)
==============
Test: art/test/run-test --host --optimizing 476-checker-ctor-fence-redun-elim
Bug: 36656456
Change-Id: I8020217b448ad96ce9b7640aa312ae784690ad99
|
|
Rationale:
Provides a (somewhat crude) quantative way to detect changes in
loop vectorization and idiom recognition (e.g. by means of market
scans, or just inspecting the same application before/after a change).
Test: market scan
Change-Id: Ic85938ba2b33c967de3159742c60301454a152a0
|
|
Statistics are attributed as follows:
Added because:
* HNewInstances requires a HConstructorFence following it.
* HReturn requires a HConstructorFence (for final fields) preceding it.
Removed because:
* Optimized in Load-Store-Elimination.
* Optimized in Prepare-For-Register-Allocation.
Test: art/test.py
Bug: 36656456
Change-Id: Ic119441c5151a5a840fc6532b411340e2d68e5eb
|
|
Remove all copies of 'MaybeRecordStat', replacing them with a single
OptimizingCompilerStats::MaybeRecordStat helper.
Change-Id: I83b96b41439dccece3eee2e159b18c95336ea933
|
|
- Change from a depth limit to a total number of HInstructions
inlined limit. Remove the dex2oat depth limit argument.
- Add more stats to diagnose reasons for not inlining.
- Clean up logging to easily parse output.
Individual Ritz benchmarks improve from 3 to 10%.
No change in other heuristics. There was already an instruction budget.
Note that the instruction budget is rarely hit in the "apps" I've tried
with.
Compile-times improve from 5 to 15%.
Code size go from 4% increase (Gms) to 1% decrease (Docs).
bug:35724239
test: test-art-host test-art-target
Change-Id: I5a35c4bd826cf21fead77859709553c5b57608d6
|
|
|
|
Small example of what the optimization does:
Object o = new Object();
if (test) {
throw new Error(o.toString());
}
will be turned into (note that the first user of 'o'
is the 'new Error' allocation which has 'o' in its
environment):
if (test) {
Object o = new Obect();
throw new Error(o.toString());
}
There are other examples in 639-checker-code-sinking.
Ritz individual benchmarks improve on art-jit-cc from
5% (EvaluateComplexFormulas) to 23% (MoveFunctionColumn)
on all platforms.
Test: 639-checker-code-sinking
Test: test-art-host
Test: borg job run
Test: libcore + jdwp
bug:35634932
bug:30933338
Change-Id: Ib99c00c93fe76ffffb17afffb5a0e30a14310652
|
|
This is a follow-up to
https://android-review.googlesource.com/343265
where we replaced Atomic<> with std::atomic<> because
Atomic<> was hiding std::atomic<>::operator=(). However,
while the default constructor of Atomic<> initializes the
value, the default constructor of std::atomic<> does not.
Test: m valgrind-test-art-host
Bug: 34053922
Change-Id: Iff2b38a7b28ee2d114991b60e3c40a33425bfc48
|
|
Stats from callee graph builder were not merged into main
stats and stats for callee graph optimizations were counted
even when the callee graph was eventually rejected.
Allocate the callee graph statistics on the arena.
Measured compilation of a big app using heaptrack:
bytes allocated in total (ignoring deallocations): 3.77GB -> 3.37GB
calls to allocation functions: 10650510 -> 8203129
Test: testrunner.py --host
Test: Stats change in the expected direction for an app.
Bug: 34053922
Change-Id: I605280d262b86af14b847acf3bb6dc077b749cc0
|
|
The class linker now tracks whether a method has a single implementation
and if so, the JIT compiler will try to devirtualize a virtual call for
the method into a direct call. If the single-implementation assumption
is violated due to additional class linking, compiled code that makes the
assumption is invalidated. Deoptimization is triggered for compiled code
live on stack. Instead of patching return pc's on stack, a CHA guard is
added which checks a hidden should_deoptimize flag for deoptimization.
This approach limits the number of deoptimization points.
This CL does not devirtualize abstract/interface method invocation.
Slides on CHA:
https://docs.google.com/a/google.com/presentation/d/1Ax6cabP1vM44aLOaJU3B26n5fTE9w5YU-1CRevIDsBc/edit?usp=sharing
Change-Id: I18bf716a601b6413b46312e925a6ad9e4008efa4
Test: ART_TEST_JIT=true m test-art-host/target-run-test test-art-host-gtest
|
|
Run it in the dead code elimination phase, as it relates to
creating dead branches.
From 0.04 to 0.07% less code size framework/gms/docs/fb (70K saved on fb)
3%-5% runtime performance improvements on Richards/DeltaBlue/Ritz.
Compile-time is mixed, so in the noise (from 2% slower to 1% faster).
test:611-checker-simplify-if
Change-Id: Ife8b7882d57b5481f5ca9dc163beba655d7e78bf
|
|
Second CL in the series of merging HGraphBuilder and SsaBuilder. This
patch refactors the builders so that dominator tree can be built
before any HInstructions are generated. This puts the SsaBuilder
removal of HLoadLocals/HStoreLocals straight after HGraphBuilder's
HInstruction generation phase. Next CL will therefore be able to
merge them.
This patch also adds util classes for iterating bytecode and switch
tables which allowed to simplify the code.
Bug: 27894376
Change-Id: Ic425d298b2e6e7980481ed697230b1a0b7904526
|
|
This removes redundant code from the generators and allows for easier
stat recording.
Change-Id: Iccd4368f9e9d87a6fecb863dee4e2145c97851c4
|
|
- report the max size of arena alloc
- report how many virtual or interface invokes were inlined
Change-Id: I82f154a8e25b5e3890181a1aa11346cdc3f93e37
|
|
This patch adds support for the --dump-stats facility with some
optimizations
and fixes all build issues introduced by the patch:
I68751b119a030952a11057cb651a3c63e87e73ea (which got reverted)
Change-Id: I5af1f2a8cced0a1a55c2bb4d8c88e6f0a24ec879
Signed-off-by: Jean-Philippe Halimi <jean-philippe.halimi@intel.com>
|
|
Change-Id: I7c01f4494bac8498accc0f087044ec509fee4c98
|
|
Add HClassTableGet to fetch an ArtMethod from the vtable or imt,
and compare it to the only method the profiling saw.
Change-Id: I76afd3689178f10e3be048aa3ac9a97c6f63295d
|
|
So we don't fallback to the interpreter in the presence of
irreducible loops.
Implications:
- A loop pre-header does not necessarily dominate a loop header.
- Non-constant redundant phis will be kept in loop headers, to
satisfy our linear scan register allocation algorithm.
- while-graph optimizations, such as gvn, licm, lse, and dce
need to know when they are dealing with irreducible loops.
Change-Id: I2cea8934ce0b40162d215353497c7f77d6c9137e
|
|
Just like aget(-wide), the value operand of aput(-wide) bytecode
instructions can be both int/long and float/double. This patch builds
on the previous mechanism for resolving type of ArrayGets to type the
values of ArraySets based on the reference type of the array.
Bug: 22538329
Change-Id: Ic86abbb58de146692de04476b555010b6fcdd8b6
|