Age | Commit message (Collapse) | Author |
|
If PartialLSE encounters an instanceof or check-cast before the
escapes it could not correctly handle it. Due to how java language
code is typically developed and compiled this is generally not a
problem but could lead to incorrect codegen on release builds or
DCHECK failures on debug builds. This fixes the issues by (1) causing
partial LSE to consider check-cast and instance-ofs to be escaping.
This also updates the instruction simplifier to be much more
aggressive in removing instance-of and check-casts.
Test: ./test.py --host
Bug: 186041085
Change-Id: Ia513c4210a87a0dfa92f10adc530e17ee631d006
|
|
In cases where all targets of a HPredicatedInstanceFieldGet
instruction are known to not be null the simplifier would attempt to
replace the default value with a null instruction. This would cause a
null-pointer dereference. Correct the simplifier to handle this case
correctly.
Moved some LSE test helper functions to CommonCompilerTestHelper to
avoid duplicating code.
Fixed an incorrect (though until now unused) constructor for
HPredicatatedInstanceFieldGet (the default value and target we
swapped).
Test: ./test.py --host
Test: ./art/tools/compile_jars.py --profile-file bad-compile.txt ~/imgur.apk
Bug: 183942773
Change-Id: I66f4ce37d768d5e457047a3f80bd4cb9aa4546a3
|
|
This reverts commit 791df7a161ecfa28eb69862a4bc285282463b960.
This unreverts commit fc1ce4e8be0d977e3d41699f5ec746d68f63c024.
This unreverts commit b8686ce4c93eba7192ed7ef89e7ffd9f3aa6cd07.
We incorrectly failed to include PredicatedInstanceFieldGet in a few
conditions, including a DCHECK. This caused tests to fail under the
read-barrier-table-lookup configuration.
Reason for revert: Fixed 2 incorrect checks
Bug: 67037140
Test: ./art/test/testrunner/run_build_test_target.py -j70 art-gtest-read-barrier-table-lookup
Change-Id: I32b01b29fb32077fb5074e7c77a0226bd1fcaab4
|
|
This reverts commit fc1ce4e8be0d977e3d41699f5ec746d68f63c024.
Bug: 67037140
Reason for revert: Fails read-barrier-table-lookup tests.
Change-Id: I373867c728789bc14a4370b93a045481167d5f76
|
|
This reverts commit 47ac53100303e7e864b7f6d65f17b23088ccf1d6.
There was a bug in LSE where we would incorrectly record the
shadow$_monitor_ field as not having a default initial value. This
caused partial LSE to be unable to compile the Object.identityHashCode
function, causing crashes. This issue was fixed in a parent CL. Also
updated all Offsets in LSE_test to be outside of the object header
regardless of configuration.
Test: ./test.py --host
Bug: 67037140
Reason for revert: Fixed issue with shadow$_monitor_ field and offsets
Change-Id: I4fb2afff4d410da818db38ed833927dfc0f6be33
|
|
This reverts commit b8686ce4c93eba7192ed7ef89e7ffd9f3aa6cd07.
Bug: 67037140
Reason for revert: Fails a few tests.
Change-Id: Icf0635bffbfbba93bf0a5b854a9582c418198136
|
|
Add partial load-store elimination to the LSE pass. Partial LSE will
move object allocations which only escape along certain execution
paths closer to the escape point and allow more values to be
eliminated. It does this by creating new predicated load and store
instructions that are used when an object has only escaped some of the
time. In cases where the object has not escaped a default value will
be used.
Test: ./test.py --host
Test: ./test.py --target
Bug: 67037140
Change-Id: Idde67eb59ec90de79747cde17b552eec05b58497
|
|
We incorrectly handled merging unknowns in some situations.
Specifically in cases where we are unable to materialize loop-phis we
could end up with PureUnknowns which could end up hiding stores that
need to be kept.
In an unrelated issue we were incorrectly considering some values as
escapes when live at the point of an invoke. Since
SearchPhiPlaceholdersForKeptStores used a more precise notion of
escapes we could end up removing stores without being able to replace
the values.
This reverts commit 2316b3a0779f3721a78681f5c70ed6624ecaebef.
This unreverts commit b6837f0350ff66c13582b0e94178dd5ca283ff0a
This reverts commit fe270426c8a2a69a8f669339e83b86fbf40e25a1.
This unreverts commit bb6cda60e4418c0ab557ea4090e046bed8206763.
Bug: 67037140
Bug: 173120044
Reason for revert: Fixed issue causing incorrect store elimination
Test: ./test.py --host
Test: Boot cuttlefish
atest FrameworksServicesTests:com.android.server.job.BackgroundRestrictionsTest#testPowerWhiteList
Change-Id: I2ebae9ccfaf5169d551c5019b547589d0fce1dc9
|
|
This reverts commit b6837f0350ff66c13582b0e94178dd5ca283ff0a
This unreverts commit fe270426c8a2a69a8f669339e83b86fbf40e25a1.
This rereverts commit bb6cda60e4418c0ab557ea4090e046bed8206763.
Bug: 67037140
Bug: 173120044
Reason for revert: Git-blame seems to point to the CL as cause of
b/173120044. Revert during investigation.
Change-Id: I46f557ce79c15f07f4e77aacded1926b192754c3
|
|
A ScopedArenaAllocator in a single test was accidentally loaded using
operator new which is not supported. This caused a memory leak.
This reverts commit fe270426c8a2a69a8f669339e83b86fbf40e25a1.
This unreverts commit bb6cda60e4418c0ab557ea4090e046bed8206763.
Bug: 67037140
Reason for revert: Fixed memory leak in
LoadStoreAnalysisTest.PartialEscape test case
Test: SANITIZE_HOST=address ASAN_OPTIONS=detect_leaks=0 m test-art-host-gtest-dependencies
Run art_compiler_tests
Change-Id: I34fa2079df946ae54b8c91fa771a44d56438a719
|
|
This reverts commit bb6cda60e4418c0ab557ea4090e046bed8206763.
Bug: 67037140
Reason for revert: memory leak detected in the test.
Change-Id: I81cc2f61494e96964d8be40389eddcd7c66c9266
|
|
This is the first piece of partial LSE for art. This CL adds analysis
tools needed to implement partial LSE. More immediately, it improves
LSE so that it will remove stores that are provably non-observable
based on the location they occur. For example:
```
Foo o = new Foo();
if (xyz) {
check(foo);
foo.x++;
} else {
foo.x = 12;
}
return foo.x;
```
The store of 12 can be removed because the only escape in this method
is unreachable and was not executed by the point we reach the store.
The main purpose of this CL is to add the analysis tools needed to
implement partial Load-Store elimination. Namely it includes tracking
of which blocks are escaping and the groups of blocks that we cannot
remove allocations from.
The actual impact of this change is incredibly minor, being triggered
only once in a AOSP code. go/lem shows only minor effects to
compile-time and no effect on the compiled code. See
go/lem-allight-partial-lse-2 for numbers. Compile time shows an
average of 1.4% regression (max regression is 7% with 0.2 noise).
This CL adds a new 'reachability' concept to the HGraph. If this has
been calculated it allows one to quickly query whether there is any
execution path containing two blocks in a given order. This is used to
define a notion of sections of graph from which the escape of some
allocation is inevitable.
Test: art_compiler_tests
Test: treehugger
Bug: 67037140
Change-Id: I0edc8d6b73f7dd329cb1ea7923080a0abe913ea6
|
|
In cases where an array has overlapping accesses on separate
iterations LSE could get confused and incorrectly believe removing
array stores is safe even when later iterations rely on the store
occurring. For example consider the following code:
```
int do_cal(int len) {
if (len < 5) {
return -1;
}
int w[] = new w[len];
int t = 0;
for (int i = 5; i < w.length; i++) {
w[i] = please_interleave(w[i - 1], w[i - 5]);
t = please_select(w[i], i);
}
return t;
}
```
We would either need to materialize 5 PHIs to hold the values
`w[i - 1], ..., w[i - 5]` or avoid removing the write to `w[i]`. Our
LSE pass is unable to do the former and would (incorrectly) fail to
recognize that, by not being able to determine the values of `w[i -
1]` and `w[i-5]` that the later store to `w[i]` must be preserved.
Bug: 168446366
Test: ./test.py --host
Change-Id: I89772c8bf49ebf6de70f86bd68484e14bd189406
|
|
Second attempt at this, which fixes the asan failures.
Remove test_per_src since it is not supported by atest.
Replace it with gtest_isolate which is transparent to atest,
and which still allows us to run tests in parallel.
The size of test binaries halves (from 1GB to 0.5GB).
Test run-time on host is unchanged.
Test run-time on target is 4x faster (tested on walleye).
Added a gtest_main.cc with the gtest isolated main function,
and ART-specific initialization.
Bug: 147819342
Test: m test-art-host-gtest
Test: art/tools/run-gtests.sh
Test: art/test/testrunner/run_build_test_target.py art-gtest-asan
Change-Id: I515c911bb7d44285495802fc66cd732fc8e6d8df
|
|
The only Optimizing test that actually needs a Runtime is
the ReferenceTypePropagationTest, so we make it subclass
CommonCompilerTest explicitly and change OptimizingUnitTest
to subclass CommonArtTest for the other tests.
On host, each test that initializes the Runtime takes ~220ms
more than without initializing the Runtime. For example, the
ConstantFoldingTest that has 10 individual tests previously
took over 2.2s to run but without the Runtime initialization
it takes around 3-5ms. On target, running 32-bit gtests on
taimen with run-gtests.sh (single-threaded) goes from
~28m47s to ~26m13s, a reduction of ~9%.
Test: m test-art-host-gtest
Test: run-gtests.sh
Change-Id: I43e50ed58e52cc0ad04cdb4d39801bfbae840a3d
|
|
And merge all functionality into OptimizingUnitTest.
Test: m test-art-host-gtest
Change-Id: I69a4e8c489462700ec0eb9ed93d5cdbdb6147f1a
|
|
This avoids passing the `VariableSizedHandleScope*` argument
around and eliminates HGraph::inexact_object_rti_ and its
initialization. The latter shall allow running Optimizing
gtests that do not require type information without creating
a Runtime in future. (To be implemented in a separate CL.)
Test: m test-art-host-gtest
Test: testrunner.py --host --optmizing
Test: aosp_taimen-userdebug boots.
Change-Id: I36fe9bc556c6d610d644c8c14cc74c9985a14d64
|
|
This reverts commit 8103e479d8f8447584582b2b70752029f7087776.
Reason for revert: asan run fails in multiple ways
Test: ran ./art/test/testrunner/run_build_test_target.py art-gtest-asan
Change-Id: Ib9f2887436a664b64c6410f56a25ae2dd0e0aab4
|
|
Remove test_per_src since it is not supported by atest.
Replace it with gtest_isolate which is transparent to atest,
and which still allows us to run tests in parallel.
The size of test binaries halves (from 1GB to 0.5GB).
Test run-time on host is unchanged.
Test run-time on target is 4x faster (tested on walleye).
Bug: 147819342
Test: m test-art-host-gtest
Test: art/tools/run-gtests.sh
Change-Id: Id295af00d08b24baa2e421b0f3313df0b2e56fe9
|
|
GraphChecker should be always used in gtests where it is possible.
Currently only ImprovedOptimizingUnitTest allows unit tests to use
GraphChecker. Unit tests based on OptimizingUnitTest cannot use this
functionality.
Another issue is that GraphChecker has reference type information
checks which unit tests cannot satisfy.
The CL resolves the issues by:
* Adding a public GraphChecker::SetRefTypeInfoCheckEnabled.
* Adding a private OptimizingUnitTestHelper::CheckGraph(HGraph* graph,
bool check_ref_type_info).
* Adding a public OptimizingUnitTestHelper::CheckGraph(graph) to perform
all checks.
* Adding a public
OptimizingUnitTestHelper::CheckGraphSkipRefTypeInfoChecks(graph) to
perform all checks but reference type information checks.
* Updating ImprovedOptimizingUnitTest::CheckGraph to use
OptimizingUnitTestHelper::CheckGraph.
To demonstrate how the new API can be used, unit tests for the
Load-Store-Analysis pass are updated.
Test: test.py --host --optimizing --jit --gtest
Test: test.py --target --optimizing --jit
Test: run-gtests.sh
Change-Id: I7ca0983e66d9904073f0d711b3de96cccfe42746
|
|
Subclasses of ImprovedOptimizingUnitTest might need a different number
of graph parameters. Currently only a parameter is defined and created.
This CL adds ImprovedOptimizingUnitTest::CreateParameters which
subclasses can override to create as many parameters as they need. All
created parameters are added to the entry basic block. The default
implementation of ImprovedOptimizingUnitTest::CreateParameters does
nothing.
Test: run-gtests.sh
Change-Id: I2c6a58232e36d3562fc2bc0cdc289dd739094a73
|
|
This reverts commit e2727154f25e0db9a5bb92af494d8e47b181dfcf.
Reason for revert: Breaks ASAN tests (ODR violation).
Bug: 142365358
Change-Id: I38103d74a1297256c81d90872b6902ff1e9ef7a4
|
|
Make symbols in compiler/optimizing hidden by a namespace
attribute. The unit intrinsic_objects.{h,cc} is excluded as
it is needed by dex2oat.
As the symbols are no longer exported, gtests are now linked
with the static version of the libartd-compiler library.
libart-compiler.so size:
- before:
arm: 2396152
arm64: 3345280
- after:
arm: 2016176 (-371KiB, -15.9%)
arm64: 2874480 (-460KiB, -14.1%)
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Bug: 142365358
Change-Id: I1fb04a33351f53f00b389a1642e81a68e40912a8
|
|
Separating out the structs from DexFile allows them to be forward-
declared, which reduces the need to include the dex_file header.
Bug: 119869270
Test: m
Change-Id: I32dde5a632884bca7435cd584b4a81883de2e7b4
|
|
Handles compiler.
Bug: 116054210
Test: WITH_TIDY=1 mmma art
Change-Id: I5cdfe73c31ac39144838a2736146b71de037425e
|
|
This reverts commit b095f022a9683a9123018c01e22595cf969fd88b.
Reason for revert: Caused huge interpreter performance regression.
Change-Id: I0f27f8f234d315807695362bf679ef47f68723f7
|
|
|
|
Make sure that HSelectGenerator doesn't hoist instructions which
can throw. Currently this doesn't happen due to
SideEffect::CanTriggerGC however this side effect is to be removed
for some instructions.
Test: select_generator_test.
Test: test-art-host, test-art-target.
Change-Id: I996f6cbdcee4987a36079d387a7b74b326881ab6
|
|
Avoid bare pointers in DexFileLoader APIs, which caused clang-tidy
issues and other problems.
Bug: none
Test: build and boot
Change-Id: Ic277bc83af1997774b42c55d3d631ec940b9c015
|
|
Make ArenaPool an abstract base class and leave MallocArenaPool
implementation with it. This enables arena_allocator to be free
of MemMap, Mutex, etc., in preparation to move the remaining collections
out of runtime/base to libartbase/base.
Bug: 22322814
Test: make -j 50 test-art-host
build and boot
Change-Id: Ief84dcbfb749165d9bc82000c6b8f96f93052422
|
|
Previously, the code item was not necessarily 32 bit aligned. This
caused bus errors on armv7.
Also create a real dexfile object instead of casting 0 initialized
memory to a dex file pointer. We just got lucky before that the cdex
boolean was false.
Test: test-art-target-gtest
Bug: 63756964
Bug: 71605148
Change-Id: Ic7199f2b97bbd421de1d702efa5c6531ff45c022
|
|
Change constructor to use a reference to a dex file.
Remove duplicated logic for GetCodeItemSize.
Bug: 63756964
Test: test-art-host
Change-Id: I69af8b93abdf6bdfa4454e16db8f4e75883bca46
|
|
Move all the DexFile related source to a common subdirectory dex/ of
runtime.
Bug: 71361973
Test: make -j 50 test-art-host
Change-Id: I59e984ed660b93e0776556308be3d653722f5223
|
|
Make code item fields private and use accessors. Added a hand full of
friend classes to reduce the size of the change.
Changed default to be nullable and removed CreateNullable.
CreateNullable was a bad API since it defaulted to the unsafe, may
add a CreateNonNullable if it's important for performance.
Motivation:
Have a different layout for code items in cdex.
Bug: 63756964
Test: test-art-host-gtest
Test: test/testrunner/testrunner.py --host
Test: art/tools/run-jdwp-tests.sh '--mode=host' '--variant=X32' --debug
Change-Id: I42bc7435e20358682075cb6de52713b595f95bf9
|
|
When compiling an intrinsic method, generate a graph that
invokes the same method and try to compile it. If the call
is actually intrinsified (or simplified to other HIR) and
yields a leaf method, use the result of this compilation
attempt, otherwise compile the actual code or JNI stub.
Note that CodeGenerator::CreateThrowingSlowPathLocations()
actually marks the locations as kNoCall if the throw is not
in a catch block, thus considering some throwing methods
(for example, String.charAt()) as leaf methods.
We would ideally want to use the intrinsic codegen for all
intrinsics that do not generate a slow-path call to the
default implementation. Relying on the leaf method is
suboptimal as we're missing out on methods that do other
types of calls, for example runtime calls. This shall be
fixed in a subsequent CL.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 67717501
Change-Id: I640fda7c22d4ff494b5ff77ebec3b7f5f75af652
|
|
Cleanup errors from upstream cpplint in preparation
for moving art's cpplint fork to upstream tip-of-tree cpplint.
Test: cd art && mm
Bug: 68951293
Change-Id: I15faed4594cbcb8399850f8bdee39d42c0c5b956
|
|
Memory needed to compile the two most expensive methods for
aosp_angler-userdebug boot image:
BatteryStats.dumpCheckinLocked() : 21.1MiB -> 20.2MiB
BatteryStats.dumpLocked(): 42.0MiB -> 40.3MiB
This is because all the memory previously used by the graph
builder is reused by later passes.
And finish the "arena"->"allocator" renaming; make renamed
allocator pointers that are members of classes const when
appropriate (and make a few more members around them const).
Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 64312607
Change-Id: Ia50aafc80c05941ae5b96984ba4f31ed4c78255e
|
|
Memory needed to compile the two most expensive methods for
aosp_angler-userdebug boot image:
BatteryStats.dumpCheckinLocked() : 25.1MiB -> 21.1MiB
BatteryStats.dumpLocked(): 49.6MiB -> 42.0MiB
This is because all the memory previously used by Scheduler
is reused by the register allocator; the register allocator
has a higher peak usage of the ArenaStack.
And continue the "arena"->"allocator" renaming.
Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 64312607
Change-Id: Idfd79a9901552b5147ec0bf591cb38120de86b01
|
|
Passes using local ArenaAllocator were hiding their memory
usage from the allocation counting, making it difficult to
track down where memory was used. Using ScopedArenaAllocator
reveals the memory usage.
This changes the HGraph constructor which requires a lot of
changes in tests. Refactor these tests to limit the amount
of work needed the next time we change that constructor.
Test: m test-art-host-gtest
Test: testrunner.py --host
Test: Build with kArenaAllocatorCountAllocations = true.
Bug: 64312607
Change-Id: I34939e4086b500d6e827ff3ef2211d1a421ac91a
|
|
Replace most uses of the runtime's Primitive in compiler
with a new class DataType. This prepares for introducing
new types, such as Uint8, that the runtime does not need
to know about.
Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 23964345
Change-Id: Iec2ad82454eec678fffcd8279a9746b90feb9b0c
|
|
Let clang-format reorder the header includes.
Derived with:
* .clang-format:
BasedOnStyle: Google
IncludeIsMainRegex: '(_test|-inl)?$'
* Steps:
find . -name '*.cc' -o -name '*.h' | xargs sed -i.bak -e 's/^#include/ #include/' ; git commit -a -m 'ART: Include cleanup'
git-clang-format -style=file HEAD^
manual inspection
git commit -a --amend
Test: mmma art
Change-Id: Ia963a8ce3ce5f96b5e78acd587e26908c7a70d02
|
|
Remove the illegal optimization that destroyed constructor barriers
after inlining invoke-super constructor calls.
---
According to JLS 7.5.1,
"Note that if one constructor invokes another constructor, and the
invoked constructor sets a final field, the freeze for the final field
takes place at the end of the invoked constructor."
This means if an object is published (stored to a location potentially
visible to another thread) inside of an outer constructor, all final
field stores from any inner constructors must be visible to other
threads.
Test: art/test.py
Bug: 37001605
Change-Id: I3b55f6c628ff1773dab88022a6475d50a1a6f906
|
|
This commit adds a new `HInstructionScheduling` pass that performs
basic scheduling on the `HGraph`.
Currently, scheduling is performed at the block level, so no
`HInstruction` ever leaves its block in this pass.
The scheduling process iterates through blocks in the graph. For
blocks that we can and want to schedule:
1) Build a dependency graph for instructions. It includes data
dependencies (inputs/uses), but also environment dependencies and
side-effect dependencies.
2) Schedule the dependency graph. This is a topological sort of the
dependency graph, using heuristics to decide what node to schedule
first when there are multiple candidates. Currently the heuristics
only consider instruction latencies and schedule first the
instructions that are on the critical path.
Test: m test-art-host
Test: m test-art-target
Change-Id: Iec103177d4f059666d7c9626e5770531fbc5ccdc
|
|
VariableSizedHandleScope's internal handle scopes are not pushed
directly on the thread. This means that it is safe to intermix with
other types of handle scopes.
Added test.
Test: clean-oat-host && test-art-host
Change-Id: Id2fd1155788428f394d49615d337d9134824c8f0
|
|
Also fixed inclusion of -inl.h files in .h files by adding
scoped_object_access-inl.h and scoped_fast_natvie_object_access-inl.h
Changed AddLocalReference / Decode to use ObjPtr.
Changed libartbenchmark to be debug to avoid linkage errors.
Bug: 31113334
Test: test-art-host
Change-Id: I4d2e160483a29d21e1e0e440585ed328b9811483
|
|
This patch merges the instruction-building phases from HGraphBuilder
and SsaBuilder into a single HInstructionBuilder class. As a result,
it is not necessary to generate HLocal, HLoadLocal and HStoreLocal
instructions any more, as the builder produces SSA form directly.
Saves 5-15% of arena-allocated memory (see bug for more data):
GMS 20.46MB => 19.26MB (-5.86%)
Maps 24.12MB => 21.47MB (-10.98%)
YouTube 28.60MB => 26.01MB (-9.05%)
This CL fixed an issue with parsing quickened instructions.
Bug: 27894376
Bug: 27998571
Bug: 27995065
Change-Id: I20dbe1bf2d0fe296377478db98cb86cba695e694
|
|
Bug: 27995065
This reverts commit e3ff7b293be2a6791fe9d135d660c0cffe4bd73f.
Change-Id: I5363c7ce18f47fd422c15eed5423a345a57249d8
|
|
This patch merges the instruction-building phases from HGraphBuilder
and SsaBuilder into a single HInstructionBuilder class. As a result,
it is not necessary to generate HLocal, HLoadLocal and HStoreLocal
instructions any more, as the builder produces SSA form directly.
Saves 5-15% of arena-allocated memory (see bug for more data):
GMS 20.46MB => 19.26MB (-5.86%)
Maps 24.12MB => 21.47MB (-10.98%)
YouTube 28.60MB => 26.01MB (-9.05%)
Bug: 27894376
Change-Id: Iefe28d40600c169c5d306fd2c77034ae19476d90
|
|
Second CL in the series of merging HGraphBuilder and SsaBuilder. This
patch refactors the builders so that dominator tree can be built
before any HInstructions are generated. This puts the SsaBuilder
removal of HLoadLocals/HStoreLocals straight after HGraphBuilder's
HInstruction generation phase. Next CL will therefore be able to
merge them.
This patch also adds util classes for iterating bytecode and switch
tables which allowed to simplify the code.
Bug: 27894376
Change-Id: Ic425d298b2e6e7980481ed697230b1a0b7904526
|
|
So long, old friend.
Change-Id: I0241c798a34b92bf994fed83888da67d6e7f1891
|