Age | Commit message (Collapse) | Author |
|
Using benchmarks provided by
https://android-review.googlesource.com/1420959
on blueline little cores with fixed frequency 1420800:
before after
GetByteArrayViewInt 27.093 0.024
SetByteArrayViewInt 28.067 0.024
GetByteArrayViewBigEndianInt 27.142 0.026
SetByteArrayViewBigEndianInt 28.040 0.025
Test: testrunner.py --target --64 --optimizing
Bug: 71781600
Change-Id: I604326675042bd63dce8ec15075714003ca9915d
|
|
Prepare for Reference.getReferent() intrinsic implementation
by a refactoring to separate the retrieval of an intrinsic
method's declaring class to its own helper function, rather
than being a part of a larger one.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Test: aosp_blueline-userdebug boots.
Test: run-gtests.sh
Test: testrunner.py --target --optimizing --jit
Bug: 170286013
Change-Id: Ib6c0e55d0c6fcc932999428f21c51afe32ab7ef2
|
|
To avoid doing dex cache lookup, pass the interface method instead. This
costs a few hundred KBs on speed compiled APKs (< 0.5% code size), but
improves performance when hitting a conflict (as seen on dogfood data).
For nterp, we currently pass the conflict method instead of the
interface method. We need to handle default methods before optimizing
it.
This removes our last use of dex cache in compiled code. A follow-up CL
will remove the NeedsDexCacheOfDeclaringClass from HInvokeInterface.
Test: test.py
Change-Id: I3cdd4543ad7d904b3e81950af46a48a48af6991a
|
|
This commit checks if a VarHandle access mode is supported. If not, an
UnsupportedOperationException is raised by calling the runtime to handle it.
I added the polymorphic intrinsics case in the IntrinsicSlowPath
code generation to handle all the eventual exceptions. For now,
none of the operations are actually compiled. If the slow path is
not called, the runtime handles the operation.
Bug: b/65872996
Test: art/test.py --host -r -t 712-varhandle-invocations --32
Test: art/test.py --host --all-compiler -r
Change-Id: I5a637561549b3fdd64fa53e2d7dbf835d3ae0d64
|
|
And use it to clean up code generators.
Also fix CFI in MaybeIncrementHotness() for arm/arm64/x86.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: testrunner.py --host --debuggable --ndebuggable \
--optimizing --jit --jit-on-first-use -t 178
Test: aosp_cf_x86_phone-userdebug boots.
Test: aosp_cf_x86_phone-userdebug/jitzygote boots.
Test: # On blueline:
testrunner.py --target --debuggable --ndebuggable \
--optimizing --jit --jit-on-first-use -t 178
Bug: 112189621
Change-Id: I524e6c3054ffe1b05e2860fd7988cd9995df2963
|
|
Emit direct calls from compiled managed code to the native
code registered with the method, avoiding the JNI stub.
Golem results:
art-opt-cc x86 x86-64 arm arm64
NativeDowncallStaticCritical +12.5% +62.5% +75.9% +41.7%
NativeDowncallStaticCritical6 +55.6% +87.5% +72.1% +35.3%
art-opt x86 x86-64 arm arm64
NativeDowncallStaticCritical +28.6% +85.6% +76.4% +38.4%
NativeDowncallStaticCritical6 +44.6% +44.6% +74.6% +32.2%
Test: Covered by 178-app-image-native-method.
Test: m test-art-host-gtest
Test: testrunner.py --host --debuggable --ndebuggable \
--optimizing --jit --jit-on-first-use
Test: run-gtests.sh
Test: testrunner.py --target --optimizing
Test: testrunner.py --target --debuggable --ndebuggable \
--optimizing --jit --jit-on-first-use -t 178
Test: aosp_cf_x86_phone-userdebug boots.
Test: aosp_cf_x86_phone-userdebug/jitzygote boots.
Bug: 112189621
Change-Id: I8b37da51e8fe0b7bc513bb81b127fe0416068866
|
|
This is a first CL in the series of introducing arm64 SVE support
in ART. The patch splits the codegen functionality into scalar and
vector ones and for the latter introduces NEON and SVE
implementations. SVE one currently is an exact copy of NEON one -
for the sake of testing and an easy diff when the next CL comes
with an actual SVE instructions support.
The patch effectively doesn't change any behavior; NEON mode is
used for vector instructions, tests pass.
Test: test-art-target.
Change-Id: I5f7f2c8218330998e5a733a56f42473526cd58e6
|
|
ART vectorizer assumes that there is single size of SIMD
register used for the whole program. Make this assumption explicit
and refactor the code.
Note: This is a base for the future introduction of SIMD slots of
size other than 8 or 16 bytes.
Test: test-art-target, test-art-host.
Change-Id: Id699d5e3590ca8c655ecd9f9ed4e63f49e3c4f9c
|
|
This reverts commit e2727154f25e0db9a5bb92af494d8e47b181dfcf.
Reason for revert: Breaks ASAN tests (ODR violation).
Bug: 142365358
Change-Id: I38103d74a1297256c81d90872b6902ff1e9ef7a4
|
|
Make symbols in compiler/optimizing hidden by a namespace
attribute. The unit intrinsic_objects.{h,cc} is excluded as
it is needed by dex2oat.
As the symbols are no longer exported, gtests are now linked
with the static version of the libartd-compiler library.
libart-compiler.so size:
- before:
arm: 2396152
arm64: 3345280
- after:
arm: 2016176 (-371KiB, -15.9%)
arm64: 2874480 (-460KiB, -14.1%)
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Bug: 142365358
Change-Id: I1fb04a33351f53f00b389a1642e81a68e40912a8
|
|
MaybeRecordImplicitNullCheck is a function which uses
CodeGenerator::RecordPcInfo() and requires an exact PC. However for ARM32/ARM64,
when CodeGenerator::RecordPcInfo() is used without VIXL special scopes (EmissionCheckScope,
ExactAssemblyScope) there is no guarantee of an exact PC. Without the special scopes VIXL might
emit veneer/literal pools affecting a PC.
The ARM32 code generator has uses of MaybeRecordImplicitNullCheck without the
special scopes.
This CL fixes missing special scopes in the ARM32/ARM64 code generators.
It also changes API to prevent such cases:
1. A variant of CodeGenerator::RecordPcInfo with native_pc as a
parameter is added. The old variant (where Assembler::CodePosition is used) is
kept and documented that Assembler::CodePosition is target-dependent and
might be imprecise.
2. CodeGenerator::MaybeRecordImplicitNullCheck is made virtual. Checks
are added to ARM32/ARM64 code generators that
MaybeRecordImplicitNullCheck is invoked within VIXL special scopes.
Test: test.py --host --optimizing --jit --gtest
Test: test.py --target --optimizing --jit
Test: run-gtests.sh
Change-Id: Ic66c16e7bdf4751cbc19a9de05846fba005b7f55
|
|
For SIMD graphs allocate 64 bit instead of 128 bit on stack for
each FP register to be preserved by the callee in the frame entry
as ABI suggests (currently 64-bit registers are preserved but
more space on stack is allocated).
Note: slow paths still require spilling full 128-bit Q-Registers
for SIMD graphs due to register allocator restrictions.
Test: test-art-target.
Change-Id: Ie0b12e4b769158445f3d0f4562c70d4fb0ea7744
|
|
Some of safepoints don't need to have DexRegisterMap info;
this will decrease the stackmap size.
.oat file size reduction:
- boot.oat: -233 kb (-5.4%)
- boot-framework.oat: -704 kb (-4.9%)
Test: 461-get-reference-vreg, 466-get-live-vreg.
Test: 543-env-long-ref, 616-cha*.
Test: test-art-target, +gc-stress.
Change-Id: Idbad355770e30a30dcf14127642e03ee666878b8
|
|
Recognize appending with StringBuilder and replace the
entire expression with a runtime call that perfoms the
append in a more efficient manner.
For now, require the entire pattern to be in a single block
and be very strict about the StringBuilder environment uses.
Also, do not accept StringBuilder/char[]/Object/float/double
arguments as they throw non-OOME exceptions and/or require a
call from the entrypoint back to a helper function in Java;
these shall be implemented later.
Boot image size for aosp_taimen-userdebug:
- before:
arm/boot*.oat: 19653872
arm64/boot*.oat: 23292784
oat/arm64/services.odex: 22408664
- after:
arm/boot*.oat: 19432184 (-216KiB)
arm64/boot*.oat: 22992488 (-293KiB)
oat/arm64/services.odex: 22376776 (-31KiB)
Note that const-string in compiled boot image methods cannot
throw, but for apps it can and therefore its environment can
prevent the optimization for apps. We could implement either
a simple carve-out for const-string or generic environment
pruning to allow this pattern to be applied more often.
Results for the new StringBuilderAppendBenchmark on taimen:
timeAppendLongStrings: ~700ns -> ~200ns
timeAppendStringAndInt: ~220ns -> ~140ns
timeAppendStrings: ~200ns -> 130ns
Bug: 19575890
Test: 697-checker-string-append
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: aosp_taimen-userdebug boots.
Test: run-gtests.sh
Test: testrunner.py --target --optimizing
Test: vogar --benchmark art/benchmark/stringbuilder-append/src/StringBuilderAppendBenchmark.java
Change-Id: I51789bf299f5219f68ada4c077b6a1d3fe083964
|
|
Separating out the structs from DexFile allows them to be forward-
declared, which reduces the need to include the dex_file header.
Bug: 119869270
Test: m
Change-Id: I32dde5a632884bca7435cd584b4a81883de2e7b4
|
|
Avoid caching the results. Caching was broken for JIT in the
presence of class unloading; entries for unloaded dex files
were leaked and potentially used erroneously with a newly
loaded dex file.
Test: m test-art-host-gtest
Test: testrunner.py --host
Test: Pixel 2 XL boots.
Test: m test-art-target-gtest
Test: testrunner.py --target
Bug: 118808764
Change-Id: Ic1163601170364e060c2e3009752f543c9bb37b7
|
|
And enable test 920-objects that was crashing because
of this bug.
Test: testrunner.py --host --jit-on-first-use -t 920
Test: testrunner.py --host --optimizing
Test: m test-art-host-gtest
Bug: 117638896
Change-Id: I47dc893b121c82de537b3147c86d37a6eecf2d62
|
|
This avoids creating an object on the heap and thus
prevents issues for the 904-object-allocation in the
JIT-at-first-use configuration.
Test: run_build_test_target.py -j48 art-jit-on-first-use
(test 904 passes; test 1935 still failing).
Bug: 116189667
Change-Id: I58c0c8cb2d78edc63dab7d72e69b882abbfb79fd
|
|
Make the last sharpening helper (methods) like the other
helpers: being invoked by the instruction builder.
Test: test.py
Change-Id: Ic80a454f9b59b0b4ef7825590b24402500ba851c
|
|
Test: test-art-host-gtest-stack_map_test
Change-Id: Ife021d03e4e486043ec609f9af8673ace7bde497
|
|
There is no need to treat it specially any more,
because of the de-duplication at BitTable level.
This saves 0.6% of oat file size.
Test: test-art-host-gtest
Change-Id: Ife7927d736243879a41d6f325d49ebf6930a63f6
|
|
Removes CompilerDriver dependency from ImageWriter and
several other classes.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: Pixel 2 XL boots.
Test: m test-art-target-gtest
Test: testrunner.py --target --optimizing
Change-Id: I3c5b8ff73732128b9c4fad9405231a216ea72465
|
|
Add 'Kind' column to stack maps which marks special stack map types,
and use it at run-time to add extra sanity checks.
It will also allow us to binary search the stack maps.
The column increases .oat file by 0.2%.
Test: test-art-host-gtest-stack_map_test
Change-Id: I2a9143afa0e32bb06174604ca81a64c41fed232f
|
|
Add support for the compiler to call into the runtime for
invoke-custom bytecodes.
Bug: 35337872
Test: art/test.py --host -r -t 952
Test: art/test.py --target --64 -r -t 952
Test: art/test.py --target --32 -r -t 952
Change-Id: I821432e7e5248c91b8e1d36c3112974c34171803
|
|
It is not needed and it increases header dependencies.
Test: Build
Change-Id: I51fcef5025defe5e4185e9c4fde18b363194789e
|
|
|
|
This reverts commit 5ae7cdfe5b8da645d1fec61c76176e6a37e78fb9.
Reason for revert: Unknown crash in linker
Change-Id: Ib5646376e2e589a7a6c4d66e72caa1f208a15905
|
|
|
|
Test: m test-art-host-gtest
Change-Id: I26146535f2684ddab3554023f7df571d93a39f88
|
|
Implemented as a runtime call.
Bug: 66890674
Test: art/test.py --target -r -t 979
Test: art/test.py --target --64 -r -t 979
Test: art/test.py --host -r -t 979
Change-Id: I67f461c819a7d528d7455afda8b4a59e9aed381c
|
|
Implemented as a runtime call.
Bug: 66890674
Test: art/test.py --target -r -t 979
Test: art/test.py --target --64 -r -t 979
Test: art/test.py --host -r -t 979
Change-Id: I4b3d3969d455d0198cfe122eea8abd54e0ea20ee
|
|
Remove runtime/globals.h and make clients point to the right globals.h
(libartbase/base/globals.h). Also make within-libartbase includes
relative rather than using base/, etc.
Bug: 22322814
Test: make -j 40 checkbuild
Change-Id: I99de63fc851d48946ab401e2369de944419041c7
|
|
The linker crash (the reason for revert) is flaky and maybe
we shall not see it with this CL now that unrelated source
code has changed.
Test: Rely on TreeHugger
Bug: 36141117
Bug: 77581732
This reverts commit 5806a9ec99b5494b511e84c74f494f0b3a8ebec5.
Change-Id: I3a4a058847dff601681ba391abf45833424fa06d
|
|
Move the remainder of the Arena stuff, plus dumpable and
runtime/*memory_region* to libartbase. More preparation to build
profiling library.
Bug: 22322814
Test: make -j 50 checkbuild
Change-Id: Iaf26d310c89bc58846553281576c18102f5e4122
|
|
Reason for revert: This caused clang linker crash
in several branches.
Bug: 77581732
This reverts commit c9dd2207dfdab42586b1d6a5e7f11cf2fcea3a7a.
Change-Id: I1923809083cf41c4f19e3e60df03ae80517aaedb
|
|
Prepare for experimenting with Baker read barrier marking
introspection entrypoints for JIT.
Test: m test-art-host-gtest
Test: Compare compiled boot*.oat before and after (no diff).
Test: Pixel 2 XL boots.
Bug: 36141117
Change-Id: Idb413a31b158db4bf89a8707ea46dd167a06f110
|
|
Disabled the build time flag. (No image version bump needed.)
Bug: 26687569
Bug: 64692057
Bug: 76420366
This reverts commit 3fbd3ad99fad077e5c760e7238bcd55b07d4c06e.
Change-Id: I5d83c4ce8a7331c435d5155ac6e0ce1c77d60004
|
|
This reverts commit 3f41323cc9da335e9aa4f3fbad90a86caa82ee4d.
Reason for revert: Fails sporadically.
Bug: 26687569
Bug: 64692057
Bug: 76420366
Change-Id: I84d1e9e46c58aeecf17591ff71fbac6a1e583909
|
|
Add extra output for debugging failures and re-enable
the bitstring type checks.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Test: testrunner.py --host -t 670-bitstring-type-check
Test: Pixel 2 XL boots.
Test: testrunner.py --target --optimizing --jit
Test: testrunner.py --target -t 670-bitstring-type-check
Bug: 64692057
Bug: 26687569
This reverts commit bff7a52e2c6c9e988c3ed1f12a2da0fa5fd37cfb.
Change-Id: I090e241983f3ac6ed8394d842e17716087d169ac
|
|
There were several utilities related to building/walking/testing dex
files that were not in libdexfile. This change consolidates these.
Bug: 22322814
Test: make -j 50 test-art-host
Change-Id: Id76e9179d03b8ec7d67f7e0f267121f54f0ec2e0
|
|
For PIC AOT-compiled app, use the .data.bimg.rel.ro to load
the boot image String/Class references instead of using the
mmapped boot image ClassTable and InternTable.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --pictest --npictest
Test: Pixel 2 XL boots.
Test: testrunner.py --target --optimizing --pictest --npictest
Bug: 71526895
Change-Id: Id5703229777aecb589a933a41f92e44d3ec02a3d
|
|
Reuse PatchInfo<> for additional architectures and make the
naming more consistent across architectures. Change the
DexFile reference to pointer in preparation for patching
references to the upcoming .data.bimg.rel.ro section.
Update obsolete comments; instead of referencing dex cache
arrays which were used in the past, reference the .bss and
the .data.bimg.rel.ro which shall be used in upcoming CLs.
Test: Rely on TreeHugger.
Change-Id: I03be4c4118918189e55c62105bb594500c6a42c1
|
|
Bug: 64692057
Bug: 71853552
Bug: 26687569
This reverts commit eb0ebed72432b3c6b8c7b38f8937d7ba736f4567.
Change-Id: I7daeaa077960ba41b2ed42bc47f17501621be4be
|
|
We guard the use of this feature with a compile-time flag,
set to true in this CL.
Boot image size for aosp_taimen-userdebug in AOSP master:
- before:
arm boot*.oat: 63604740
arm64 boot*.oat: 74237864
- after:
arm boot*.oat: 63531172 (-72KiB, -0.1%)
arm64 boot*.oat: 74135008 (-100KiB, -0.1%)
The new TypeCheckBenchmark yields the following changes
using the little cores of taimen fixed at 1.4016GHz:
32-bit 64-bit
timeCheckCastLevel1ToLevel1 11.48->15.80 11.47->15.78
timeCheckCastLevel2ToLevel1 15.08->15.79 15.08->15.79
timeCheckCastLevel3ToLevel1 19.01->15.82 17.94->15.81
timeCheckCastLevel9ToLevel1 42.55->15.79 42.63->15.81
timeCheckCastLevel9ToLevel2 39.70->14.36 39.70->14.35
timeInstanceOfLevel1ToLevel1 13.74->17.93 13.76->17.95
timeInstanceOfLevel2ToLevel1 17.02->17.95 16.99->17.93
timeInstanceOfLevel3ToLevel1 24.03->17.95 24.45->17.95
timeInstanceOfLevel9ToLevel1 47.13->17.95 47.14->18.00
timeInstanceOfLevel9ToLevel2 44.19->16.52 44.27->16.51
This suggests that the bitstring typecheck should not be
used for exact type checks which would be equivalent to the
"Level1ToLevel1" benchmark. Whether the implementation is
a beneficial replacement for the kClassHierarchyCheck and
kAbstractClassCheck on average depends on how many levels
from the target class (or Object for a negative result) is
a typical object's class.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Test: testrunner.py --host -t 670-bitstring-type-check
Test: Pixel 2 XL boots.
Test: testrunner.py --target --optimizing --jit
Test: testrunner.py --target -t 670-bitstring-type-check
Bug: 64692057
Bug: 71853552
Bug: 26687569
Change-Id: I538d7e036b5a8ae2cc3fe77662a5903d74854562
|
|
Avoid read barriers for boot image class InstanceOf. Boot
image classes are non-moveable, so comparing them against
from-space and to-space reference yields the same result.
Change the notion of a "fatal" type check slow path to mean
that the runtime call shall not return by normal path, i.e.
"fatal" now includes certainly throwing in a try-block. This
avoids unnecessary code to restore registers and jump back.
For boot image classes the CheckCast comparisons do not need
read barriers (for the same reason as for InstanceOf), so we
shall not have any false negatives and can treat the check's
slow paths as final in the same cases as in non-CC configs.
Boot image size for aosp_taimen-userdebug in AOSP master:
- before:
arm boot*.oat: 37075460
arm64 boot*.oat: 43431768
- after:
arm boot*.oat: 36894292 (-177KiB, -0.5%)
arm64 boot*.oat: 43201256 (-225KiB, -0.5%)
Also remove some obsolete helpers from CodeGenerator.
Test: Additional test in 603-checker-instanceof.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: Pixel 2 XL boots.
Test: testrunner.py --target --optimizing
Bug: 12687968
Change-Id: Ib1381084e46a10e70320dcc618f0502ad725f0b8
|
|
When compiling an intrinsic method, generate a graph that
invokes the same method and try to compile it. If the call
is actually intrinsified (or simplified to other HIR) and
yields a leaf method, use the result of this compilation
attempt, otherwise compile the actual code or JNI stub.
Note that CodeGenerator::CreateThrowingSlowPathLocations()
actually marks the locations as kNoCall if the throw is not
in a catch block, thus considering some throwing methods
(for example, String.charAt()) as leaf methods.
We would ideally want to use the intrinsic codegen for all
intrinsics that do not generate a slow-path call to the
default implementation. Relying on the leaf method is
suboptimal as we're missing out on methods that do other
types of calls, for example runtime calls. This shall be
fixed in a subsequent CL.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 67717501
Change-Id: I640fda7c22d4ff494b5ff77ebec3b7f5f75af652
|
|
Adding InstructionSet::kLast shall make it easier to encode
the InstructionSet in fewer bits using BitField<>. However,
introducing `kLast` into the `art` namespace is not a good
idea, so we change the InstructionSet to an enum class.
This also uncovered a case of InstructionSet::kNone being
erroneously used instead of vixl32::Condition::None(), so
it's good to remove `kNone` from the `art` namespace.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I6fa6168dfba4ed6da86d021a69c80224f09997a6
|
|
Reuse the memory previously allocated on the ArenaStack by
optimization passes.
This CL handles only the architecture-independent codegen
and slow paths, architecture-dependent codegen allocations
shall be moved to the ScopedArenaAllocator in a follow-up.
Memory needed to compile the two most expensive methods for
aosp_angler-userdebug boot image:
BatteryStats.dumpCheckinLocked() : 19.6MiB -> 18.5MiB (-1189KiB)
BatteryStats.dumpLocked(): 39.3MiB -> 37.0MiB (-2379KiB)
Also move definitions of functions that use bit_vector-inl.h
from bit_vector.h also to bit_vector-inl.h .
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 64312607
Change-Id: I84688c3a5a95bf90f56bd3a150bc31fedc95f29c
|
|
Fixes a bug introduced by
https://android-review.googlesource.com/504041
Test: test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 64312607
Change-Id: I7fd2d55c2a657f736eaed7c94c41d1237ae2ec0b
|
|
Passes using local ArenaAllocator were hiding their memory
usage from the allocation counting, making it difficult to
track down where memory was used. Using ScopedArenaAllocator
reveals the memory usage.
This changes the HGraph constructor which requires a lot of
changes in tests. Refactor these tests to limit the amount
of work needed the next time we change that constructor.
Test: m test-art-host-gtest
Test: testrunner.py --host
Test: Build with kArenaAllocatorCountAllocations = true.
Bug: 64312607
Change-Id: I34939e4086b500d6e827ff3ef2211d1a421ac91a
|