| Age | Commit message | Author |
|
Reuse the memory previously allocated on the ArenaStack by
optimization passes.
This CL handles only the architecture-independent codegen
and slow paths; architecture-dependent codegen allocations
will be moved to the ScopedArenaAllocator in a follow-up.
Memory needed to compile the two most expensive methods for
aosp_angler-userdebug boot image:
BatteryStats.dumpCheckinLocked() : 19.6MiB -> 18.5MiB (-1189KiB)
BatteryStats.dumpLocked(): 39.3MiB -> 37.0MiB (-2379KiB)
Also move definitions of functions that use bit_vector-inl.h
from bit_vector.h to bit_vector-inl.h.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 64312607
Change-Id: I84688c3a5a95bf90f56bd3a150bc31fedc95f29c
|
|
Memory needed to compile the two most expensive methods for
aosp_angler-userdebug boot image:
BatteryStats.dumpCheckinLocked() : 21.1MiB -> 20.2MiB
BatteryStats.dumpLocked(): 42.0MiB -> 40.3MiB
This is because all the memory previously used by the graph
builder is reused by later passes.
And finish the "arena"->"allocator" renaming; make renamed
allocator pointers that are members of classes const when
appropriate (and make a few more members around them const).
Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 64312607
Change-Id: Ia50aafc80c05941ae5b96984ba4f31ed4c78255e
|
|
Memory needed to compile the two most expensive methods for
aosp_angler-userdebug boot image:
BatteryStats.dumpCheckinLocked() : 25.1MiB -> 21.1MiB
BatteryStats.dumpLocked(): 49.6MiB -> 42.0MiB
This is because all the memory previously used by Scheduler
is reused by the register allocator; the register allocator
has a higher peak usage of the ArenaStack.
And continue the "arena"->"allocator" renaming.
Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 64312607
Change-Id: Idfd79a9901552b5147ec0bf591cb38120de86b01
|
|
Passes using local ArenaAllocator were hiding their memory
usage from the allocation counting, making it difficult to
track down where memory was used. Using ScopedArenaAllocator
reveals the memory usage.
This changes the HGraph constructor which requires a lot of
changes in tests. Refactor these tests to limit the amount
of work needed the next time we change that constructor.
Test: m test-art-host-gtest
Test: testrunner.py --host
Test: Build with kArenaAllocatorCountAllocations = true.
Bug: 64312607
Change-Id: I34939e4086b500d6e827ff3ef2211d1a421ac91a
|
|
For array accesses the element address has the following structure:
Address = CONST_OFFSET + base_addr + (index << ELEM_SHIFT)
The address part (index << ELEM_SHIFT) can be shared across array
accesses with the same data type and index.
For example, in the following loop 5 accesses can share address
computation:
  void foo(int[] a, int[] b, int[] c) {
    for (i...) {
      a[i] = a[i] + 5;
      b[i] = b[i] + c[i];
    }
  }
Test: test-art-host, test-art-target
Change-Id: Id09fa782934aad4ee47669275e7e1a4d7d23b0fa
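As a rough illustration of the address arithmetic described in this message (not the code the compiler actually emits), the sketch below computes the element addresses for one loop iteration and reuses a single scaled index. The class name, the `CONST_OFFSET` value, and the base addresses are assumptions made for the example.

```java
// Illustrative only: models the address formula from the commit message.
// CONST_OFFSET, ELEM_SHIFT and all base addresses are assumed values.
final class SharedIndexSketch {
    static final long CONST_OFFSET = 12; // hypothetical offset of the array data
    static final int ELEM_SHIFT = 2;     // int elements are 4 bytes wide

    // Full computation, as it would be repeated for every access.
    static long naiveAddress(long baseAddr, long index) {
        return CONST_OFFSET + baseAddr + (index << ELEM_SHIFT);
    }

    // Same address, but the scaled index is passed in precomputed.
    static long sharedAddress(long baseAddr, long scaledIndex) {
        return CONST_OFFSET + baseAddr + scaledIndex;
    }

    // Addresses of a[i], b[i] and c[i] for one iteration of the loop.
    static long[] iterationAddresses(long a, long b, long c, long i) {
        long scaled = i << ELEM_SHIFT; // computed once, shared by all accesses
        return new long[] {
            sharedAddress(a, scaled),
            sharedAddress(b, scaled),
            sharedAddress(c, scaled)
        };
    }
}
```

Only `index << ELEM_SHIFT` is shared; each array still adds its own base address and the constant offset.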
|
|
Rationale:
As decided after the MIPS change, this change unifies our
six code generators again a bit (we cannot move it into
the generic path, since arm likes to run the simplifier
first). Generally the GVN does some last minute cleanup
(such as finding CSE in the runtime tests generated
by dynamic BCE). I started a golem run to find impact.
Test: test-art-host test-art-target
Change-Id: Ib4098c5bae2269e71fee95cc31e3662d3aa47f6a
|
|
|
|
Replace most uses of the runtime's Primitive in compiler
with a new class DataType. This prepares for introducing
new types, such as Uint8, that the runtime does not need
to know about.
Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 23964345
Change-Id: Iec2ad82454eec678fffcd8279a9746b90feb9b0c
|
|
Test: mma test-art-host-gtest
Test: mma test-art-target-gtest in QEMU
Test: ./testrunner.py --target --optimizing in QEMU
Change-Id: Ie3c6b29b9125ff8aef888c3574bdb0ab96574bd4
|
|
Move LinkerPatch to compiler/linker/linker_patch.h .
Move SrcMapElem to compiler/debug/src_map_elem.h .
Introduce compiled_method-inl.h to reduce the number
of `#include`s in compiled_method.h .
Test: m test-art-host-gtest
Test: testrunner.py --host
Change-Id: Id211cdf94a63ad265bf4709f1a5e06dffbe30f64
|
|
This shifts some code from the libart-compiler.so to dex2oat
and reduces memory needed for JIT. We also avoid loading the
libart-dexlayout.so for JIT but the memory savings are
minimal (one shared clean page, two shared dirty pages and
some per-app kernel mmap data) as the code has never been
needed in memory by JIT.
aosp_angler-userdebug file sizes (stripped):
lib64/libart-compiler.so: 2989112 -> 2671888 (-310KiB)
lib/libart-compiler.so: 2160816 -> 1939276 (-216KiB)
bin/dex2oat: 141868 -> 368808 (+222KiB)
LOAD/executable elf mapping sizes:
lib64/libart-compiler.so: 2866308 -> 2555500 (-304KiB)
lib/libart-compiler.so: 2050960 -> 1834836 (-211KiB)
bin/dex2oat: 129316 -> 345916 (+212KiB)
Test: m test-art-host-gtest
Test: testrunner.py --host
Test: cd art/; mma; cd -
Change-Id: If62f02847a6cbb208eaf7e1f3e91af4663fa4a5f
|
|
Remove unused Quick compiler flag.
Remove support for arm32 soft-float code (which is no longer
supported by our compiler).
Test: m
Change-Id: I38b16291d90094dbf26776923a46afbf8de53f20
|
|
Add debug info for method call thunks (currently unused) and
Baker read barrier thunks. Refactor debug info generation
for trampolines and record their sizes; change their names
to start with upper-case letters, so that they can be easily
generated as `#fn_name`.
This improved debug info must be generated by `dex2oat -g`;
the debug info generated by `oatdump --symbolize` remains
the same as before, except for the renamed trampolines and
an adjustment for "code delta", i.e. the Thumb mode bit.
Cortex-A53 erratum 843419 workaround thunks are not covered
by this CL.
Test: Manual; run-test --gdb -Xcompiler-option -g 160, pull
symbols for gdbclient, break in the introspection
entrypoint, check that gdb knows the new symbols
(and disassembles them) and `backtrace` works when
setting $pc to an address in the thunk.
Bug: 36141117
Change-Id: Id224b72cfa7a0628799c7db65e66e24c8517aabf
|
|
|
|
Introduce a new "Constructor Fence Redundancy Elimination" pass.
The pass currently performs local optimization only, i.e. within instructions
in the same basic block.
All constructor fences preceding a publish (e.g. store, invoke) get
merged into one instruction.
==============
OptStat#ConstructorFenceGeneratedNew: 43825
OptStat#ConstructorFenceGeneratedFinal: 17631 <+++
OptStat#ConstructorFenceRemovedLSE: 164
OptStat#ConstructorFenceRemovedPFRA: 9391
OptStat#ConstructorFenceRemovedCFRE: 16133 <---
Removes ~91.5% of the 'final' constructor fences in RitzBenchmark.
(We do not distinguish the exact reason that a fence was created, so
it's possible some "new" fences were also removed.)
==============
Test: art/test/run-test --host --optimizing 476-checker-ctor-fence-redun-elim
Bug: 36656456
Change-Id: I8020217b448ad96ce9b7640aa312ae784690ad99
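A source-level sketch of the kind of pattern this description seems to target: two objects with final fields are constructed in the same basic block and neither escapes before the first store. All names are hypothetical, and whether the pass merges the fences in a given case depends on inlining and the surrounding passes.

```java
// Hypothetical example; the comments describe the intent of the pass as stated
// in the commit message, not verified compiler output.
final class FenceSketch {
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Publishing locations potentially visible to other threads.
    static Point first;
    static Point second;

    static void build() {
        Point a = new Point(1, 2); // a constructor fence follows this construction
        Point b = new Point(3, 4); // and another one follows this construction
        // Neither object has escaped yet. The store below is the first publish,
        // so one fence placed before it can cover both objects.
        first = a;
        second = b;
    }
}
```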
|
|
Test: market scan.
Change-Id: I58b23b8d254883f30619ea3602d34bf93618d432
|
|
|
|
* changes:
optimizing: Add statistics for # of constructor fences added/removed
optimizing: Refactor statistics to use OptimizingCompilerStats helper
|
|
Statistics are attributed as follows:
Added because:
* HNewInstance requires a HConstructorFence following it.
* HReturn requires a HConstructorFence (for final fields) preceding it.
Removed because:
* Optimized in Load-Store-Elimination.
* Optimized in Prepare-For-Register-Allocation.
Test: art/test.py
Bug: 36656456
Change-Id: Ic119441c5151a5a840fc6532b411340e2d68e5eb
|
|
Remove all copies of 'MaybeRecordStat', replacing them with a single
OptimizingCompilerStats::MaybeRecordStat helper.
Change-Id: I83b96b41439dccece3eee2e159b18c95336ea933
|
|
Generate run-time code in the Optimizing compiler checking that
the Marking Register's value matches `self.tls32_.is.gc_marking`
in debug mode (on target; and on host with JIT, or with AOT when
compiling the core image). If a check fails, abort.
Test: m test-art-target
Test: m test-art-target with tree built with ART_USE_READ_BARRIER=false
Test: ARM64 device boot test with libartd.
Bug: 37707231
Change-Id: Ie9b322b22b3d26654a06821e1db71dbda3c43061
|
|
The select generator currently only inserts select instructions
if there is a diamond shape with a phi.
This change extends the select generator to also deal with the
pattern:
  if (condition) {
    movable instruction 0
    return value0
  } else {
    movable instruction 1
    return value1
  }
which it turns into:
  movable instruction 0
  movable instruction 1
  return select (value0, value1, condition)
Test: 592-checker-regression-bool-input
Change-Id: Iac50fb181dc2c9b7619f28977298662bc09fc0e1
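The transformation can be mimicked at the source level. The sketch below is an assumption-laden illustration, not compiler output: `branchy` has the two-return shape from the message, and `selected` is the hand-written equivalent with both movable computations hoisted and a single conditional select. The names and the arithmetic are made up.

```java
// Hypothetical source-level analogue of the pattern handled by the select generator.
final class SelectSketch {
    // Shape before the transformation: each branch computes a value and returns it.
    static int branchy(boolean condition, int a, int b) {
        if (condition) {
            int value0 = a + 1; // movable instruction 0 (no side effects)
            return value0;
        } else {
            int value1 = b * 2; // movable instruction 1 (no side effects)
            return value1;
        }
    }

    // Shape after the transformation: both values are computed unconditionally,
    // then one of them is selected and returned.
    static int selected(boolean condition, int a, int b) {
        int value0 = a + 1;
        int value1 = b * 2;
        return condition ? value0 : value1;
    }
}
```

Hoisting is only safe because both computations are side-effect free and cannot throw, which is presumably what "movable" means in the message.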
|
|
Flakiness observed on the bots.
Revert "Jit Code Cache instruction pipeline flushing"
This reverts commit 56fe32eecd4f25237e66811fd766355a07908d22.
Revert "ARM64: More JIT Code Cache maintenace"
This reverts commit 17272ab679c9b5f5dac8754ac070b78b15271c27.
Revert "ARM64: JIT Code Cache maintenance"
This reverts commit 3ecac070ad55d433bbcbe11e21f4b44ab178effe.
Revert "Change flush order in JIT code cache"
This reverts commit 43ce5f82dae4dc5eebcf40e54b81ccd96eb5fba3.
Revert "Separate rw from rx views of jit code cache"
This reverts commit d1dbb74e5946fe6c6098a541012932e1e9dd3115.
Test: art/test.py --target --64
Bug: 64527643
Bug: 62356545
Change-Id: Ifa10ac77a60ee96e8cb68881bade4d6b4f828714
|
|
|
|
Let clang-format reorder the header includes.
Derived with:
* .clang-format:
BasedOnStyle: Google
IncludeIsMainRegex: '(_test|-inl)?$'
* Steps:
find . -name '*.cc' -o -name '*.h' | xargs sed -i.bak -e 's/^#include/ #include/' ; git commit -a -m 'ART: Include cleanup'
git-clang-format -style=file HEAD^
manual inspection
git commit -a --amend
Test: mmma art
Change-Id: Ia963a8ce3ce5f96b5e78acd587e26908c7a70d02
|
|
Restores instruction pipeline flushing on all cores following crashes
on ARMv7 with dual JIT code page mappings. We were inadvertently
toggling permission on a non-executable page rather than an executable one.
Removes the data cache flush for roots data and replaces it with a
sequentially consistent barrier.
Fix MemMap::RemapAtEnd() when all pages are given out. To meet
invariants checked in the destructor, the base pointer needs to be
assigned as nullptr when this happens.
Bug: 63833411
Bug: 62332932
Test: art/test.py --target
Change-Id: I705cf5a3c80e78c4e912ea3d2c3c4aa89dee26bb
|
|
To avoid effects of concurrent method entrypoints update,
just pass the logger to the JIT compiler, which will invoke
it directly with the pointer to the newly allocated code.
Test: test.py --trace
Change-Id: I5fbcd7cbc948b7d46c98c1545d6e530fb1190602
|
|
Test: m test-art-host-gtest
Test: testrunner.py --host
Test: testrunner.py --target
Test: Nexus 6P boots.
Test: Build aosp_mips64-userdebug.
Bug: 30627598
Change-Id: I0e54fdd2e91e983d475b7a04d40815ba89ae3d4f
|
|
|
|
In preparation for adding ArtMethod entries to the .bss
section, add direct PC-relative pointers to methods so that
the number of needed .bss entries for boot image is small.
Test: m test-art-host-gtest
Test: testrunner.py --host
Test: testrunner.py --target on Nexus 6P
Test: Nexus 6P boots.
Test: Build aosp_mips64-userdebug
Bug: 30627598
Change-Id: Ia89f5f9975b741ddac2816e1570077ba4b4c020f
|
|
This CL separates load store analysis from the LSE pass.
The load and store analysis in the LSE pass records information
about heap memory accesses for arrays and fields.
Such information can also be used by other optimizations, such as
the instruction scheduling pass, which can eliminate side-effect
dependencies between memory accesses to different locations.
Test: m test-art-host
Test: m test-art-target
Test: m test-art-host-gtest-load_store_analysis_test
Test: 530-checker-lse
Change-Id: I353a2b9a03b19bfa0e7ef07716d60bd4254c7ea7
|
|
Performance improvements on various benchmarks with this CL:
benchmarks       improvements
-----------------------------
algorithm        1%
benchmarksgame   2%
caffeinemark     2%
math             3%
stanford         4%
Tested on ARM Cortex-A53 CPU.
The code size impact is negligible.
Test: m test-art-host
Test: m test-art-target
Change-Id: I314c90c09ce27e3d224fc686ef73c7d94a6b5a2c
|
|
Move enumerations to their own header. Move the compiler interface (of what
the compiler can tolerate) into its own header. Replace or remove
method_verifier.h where possible.
Test: mmma art
Change-Id: I075fcb10b02b6c1c760daad31cb18eaa42067b6d
|
|
Remove the illegal optimization that destroyed constructor barriers
after inlining invoke-super constructor calls.
---
According to JLS 17.5.1,
"Note that if one constructor invokes another constructor, and the
invoked constructor sets a final field, the freeze for the final field
takes place at the end of the invoked constructor."
This means if an object is published (stored to a location potentially
visible to another thread) inside of an outer constructor, all final
field stores from any inner constructors must be visible to other
threads.
Test: art/test.py
Bug: 37001605
Change-Id: I3b55f6c628ff1773dab88022a6475d50a1a6f906
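A minimal Java sketch of the situation the message describes, with hypothetical class names: the inner (super) constructor sets a final field, and the outer constructor publishes `this` afterwards. The freeze for the field happens when the inner constructor exits, so the barrier at that point must survive inlining.

```java
// Hypothetical example of publishing from an outer constructor.
final class FreezeSketch {
    static class Base {
        final int value;
        Base() {
            value = 42;
            // The freeze for `value` takes place when this constructor exits.
        }
    }

    static class Derived extends Base {
        // Location potentially visible to other threads.
        static Derived published;

        Derived() {
            super();
            // Published inside the outer constructor. A thread that reads
            // `published` must still observe value == 42.
            published = this;
        }
    }

    static int readPublished() {
        new Derived();
        return Derived.published.value;
    }
}
```

A single-threaded run cannot show the visibility problem; the example only shows the code shape for which the inlined super constructor's barrier has to be kept.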
|
|
This enables checker tests, as well as compiler_driver_test and
reflection_test for MIPS32 and MIPS64.
Test: mma test-art-host-gtest
Test: mma test-art-target-gtest in QEMU (MIPS64)
Test: ./testrunner.py --optimizing --target in QEMU (MIPS64)
Change-Id: Ic6fe5b17f7f2cd7e38e12fef25afccf9358b80e0
|
|
This is the core functionality. Further improvements
will be done separately.
This also adds/moves memory barriers where they belong and
removes the UnsafeGetLongVolatile and UnsafePutLongVolatile
MIPS32 intrinsics as they need to load/store a pair of
registers atomically, which is not supported directly by
the CPU.
Test: booted MIPS32R2 in QEMU
Test: test-art-target-run-test
Test: booted MIPS64 (with 2nd arch MIPS32R6) in QEMU
Test: "testrunner.py --target --optimizing -j1"
Test: same MIPS64 boot/test with ART_READ_BARRIER_TYPE=TABLELOOKUP
Test: "testrunner.py --target --optimizing --32 -j2" on CI20
Test: same CI20 test with ART_READ_BARRIER_TYPE=TABLELOOKUP
Change-Id: I0ff91525fefba3ec1cc019f50316478a888acced
|
|
- Change from a depth limit to a limit on the total number of
inlined HInstructions. Remove the dex2oat depth limit argument.
- Add more stats to diagnose reasons for not inlining.
- Clean up logging to easily parse output.
Individual Ritz benchmarks improve by 3% to 10%.
No change in other heuristics. There was already an instruction budget.
Note that the instruction budget is rarely hit in the "apps" I've
tried.
Compile times improve by 5% to 15%.
Code size changes range from a 4% increase (Gms) to a 1% decrease (Docs).
bug:35724239
test: test-art-host test-art-target
Change-Id: I5a35c4bd826cf21fead77859709553c5b57608d6
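To make the difference from a depth limit concrete, here is a toy model of a total-instruction budget. It is not ART's inliner; the class, method names, and numbers are all hypothetical. Each successful inline consumes budget in proportion to the callee's size, regardless of how deep in the call chain it sits.

```java
// Toy model of a "total inlined instructions" limit; all names are hypothetical.
final class InlineBudgetSketch {
    private int remaining;

    InlineBudgetSketch(int totalInstructionBudget) {
        this.remaining = totalInstructionBudget;
    }

    // Accept the callee only if its instruction count still fits in the budget.
    boolean tryInline(int calleeInstructionCount) {
        if (calleeInstructionCount > remaining) {
            return false; // a real compiler would record a "not inlined" statistic here
        }
        remaining -= calleeInstructionCount;
        return true;
    }

    int remaining() {
        return remaining;
    }
}
```

With a budget of 100, a 60-instruction callee is accepted, a subsequent 50-instruction callee is rejected, and a 40-instruction callee still fits.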
|
|
No longer used. SrcMapElem is still used by elf_debug_line_writer.h.
Address previous comments from aog/351387.
Test: make
Change-Id: Ib1525168b14889abbdc78ba20c64f3223b140a51
|
|
The method info data is stored separately from the code info to
reduce oat size by improving deduplication of stack maps.
To reduce code size, this moves the invoke info and inline info
method indices to this table.
Oat size for a large app (arm64): 77746816 -> 74023552 (-4.8%)
Average oat size reduction for golem (arm64): 2%
Repurposed unused SrcMapElem deduping to be for MethodInfo.
TODO: Delete SrcMapElem in a follow up CL.
Bug: 36124906
Test: clean-oat-host && test-art-host-run-test
Change-Id: I2241362e728389030b959f42161ce817cf6e2009
|
|
Fixed ImageWriter to write class table also if it contains
only boot class loader classes. Added a regression test and
added extra checks for debug-build to verify that dex cache
types from app image are also in the class table. Removed
some unnecessary debug output.
Test: 158-app-image-class-table
Bug: 34839984
Bug: 30627598
Bug: 34659969
This reverts commit 0b66d6174bf1f6023f9d36dda8538490b79c2e9f.
Change-Id: I6a747904940c6ebc297f4946feef99dc0adf930c
|
|
For app images, ImageWriter does not add boot image
classes to the app image class table even though it
keeps them in the dex caches. The reason for that is
unknown; the code looks OK.
Bug: 34839984
Bug: 30627598
Bug: 34659969
Also reverts "Improve debugging output for a crash."
This reverts commits
bfb80d25eaeb7a604d5dd25a370e3869e96a33ab,
8dd56fcb3196f466ecaffd445397cb11ef85f89f.
Test: testrunner.py --host
Change-Id: Ic8db128207c07588c7f11563208ae1e85c8b0e84
|
|
|
|
Apps will always call the Object version of arraycopy. When
we can infer the types of the passed arrays, replace the method
being called to be the typed System.arraycopy one.
10% improvement on ExoPlayerBench.
Test: 641-checker-arraycopy
bug: 7103825
Change-Id: I872d7a6e163a4614510ef04ae582eb90ec48b5fa
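The two call sites below illustrate, under assumptions, the distinction the message relies on. Both compile to a call to the `Object` version of `System.arraycopy`, but only in the second are the array types statically known, which is the case where a typed variant could be substituted. The class and method names are hypothetical, and the typed variant itself is not shown.

```java
// Hypothetical call sites; both use the public System.arraycopy(Object, ...) API.
final class ArraycopySketch {
    // The array types are unknown here, so the generic version must check them at run time.
    static void genericCopy(Object src, Object dst, int length) {
        System.arraycopy(src, 0, dst, 0, length);
    }

    // Both arguments are statically char[], so a compiler that can infer this
    // could redirect the call to a char[]-specific implementation.
    static char[] typedCopy(char[] src, int length) {
        char[] dst = new char[length];
        System.arraycopy(src, 0, dst, 0, length);
        return dst;
    }
}
```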
|
|
Rationale:
Break-out CL of ART Vectorizer: number 3.
The purpose is making the original CL smaller
and easier to review.
Bug: 34083438
Test: test-art-host
Change-Id: I7cece807ee4f5fcaeae41f1deed33ac263447b77
|
|
Small example of what the optimization does:
  Object o = new Object();
  if (test) {
    throw new Error(o.toString());
  }
will be turned into (note that the first user of 'o'
is the 'new Error' allocation which has 'o' in its
environment):
  if (test) {
    Object o = new Object();
    throw new Error(o.toString());
  }
There are other examples in 639-checker-code-sinking.
Ritz individual benchmarks improve on art-jit-cc by
5% (EvaluateComplexFormulas) to 23% (MoveFunctionColumn)
on all platforms.
Test: 639-checker-code-sinking
Test: test-art-host
Test: borg job run
Test: libcore + jdwp
bug:35634932
bug:30933338
Change-Id: Ib99c00c93fe76ffffb17afffb5a0e30a14310652
|
|
Added extra output to the abort message to collect more data
when we hit the crash. Added extra check when loading an app
image to verify that the class table isn't already broken.
Test: testrunner.py --host
Bug: 34839984
Bug: 30627598
Bug: 34659969
This reverts commit 5812e20ff7cbc8efa0b8d7486ada2f58840a6ad5.
Change-Id: I9bb442a184c236dcb75b3e42a095f39cd6bee59d
|
|
Get it in line with ObjPtr and prettify our code.
Test: m
Change-Id: I1322e2a9bc7a85d7f2441034a19bf4d807b81a0e
|
|
Assert failing for "earchbox:search":
F zygote64: class_linker.cc:4612] Check failed: handle_scope_iface.Get() != nullptr
Test: m test-art-host
Bug: 34839984
Bug: 30627598
Bug: 34659969
This reverts commit 85c0f2ac03417f5125bc2ff1dab8109859c67d5c.
Change-Id: I39846c20295af5875b0f945be7035c73ded23135
|
|
The reason for the revert was fixed by
https://android-review.googlesource.com/332666 .
We now enable clearing dex cache types in test 155 from that
CL. Also avoid an unnecessary store in LookupResolvedTypes()
and prevent verifier from messing up the dex cache types.
Test: m test-art-host
Bug: 34839984
Bug: 30627598
Bug: 34659969
This reverts commit d16363a93053de0f32252c7897d839a46aff14ae.
Change-Id: Ie8603cfa772e78e648d005b0b6eae59062ae729d
|
|
|