|
This is in preparation of removing it from OatQuickMethodHeader.
Bug: 123510633
Test: m test-art-host-gtest
Test: ./art/test.py -b -r --host
Change-Id: I5c5adb4c040e329b81c1393aa1b80ee017729c8a
|
|
Test: aosp_taimen-userdebug boots.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 147346243
Change-Id: I97fdc15e568ae3fe390efb1da690343025f84944
|
|
This reverts commit e2727154f25e0db9a5bb92af494d8e47b181dfcf.
Reason for revert: Breaks ASAN tests (ODR violation).
Bug: 142365358
Change-Id: I38103d74a1297256c81d90872b6902ff1e9ef7a4
|
|
Make symbols in compiler/optimizing hidden by a namespace
attribute. The unit intrinsic_objects.{h,cc} is excluded as
it is needed by dex2oat.
As the symbols are no longer exported, gtests are now linked
with the static version of the libartd-compiler library.
libart-compiler.so size:
- before:
arm: 2396152
arm64: 3345280
- after:
arm: 2016176 (-371KiB, -15.9%)
arm64: 2874480 (-460KiB, -14.1%)
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Bug: 142365358
Change-Id: I1fb04a33351f53f00b389a1642e81a68e40912a8
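A minimal sketch of how a namespace-level visibility attribute hides symbols
with GCC/Clang; the macro and function names below are hypothetical, not the
ones used in the CL:
  // Symbols defined inside a hidden namespace are not exported from the .so,
  // which lets the linker strip or fold them.
  #define HIDDEN_NS __attribute__((visibility("hidden")))

  namespace art HIDDEN_NS {
  namespace optimizing_sketch {

  int InternalHelper(int x) { return x * 2; }  // not visible to other libraries

  }  // namespace optimizing_sketch
  }  // namespace art

  // An explicitly exported entry point keeps default visibility.
  extern "C" __attribute__((visibility("default")))
  int ExportedEntryPoint(int x) {
    return art::optimizing_sketch::InternalHelper(x);
  }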
|
|
This reverts commit e1412dacbf1d2a809bd1fca658cc8cb8f61f8ee6.
Bug: 123510633
Bug: 127305289
Reason for revert: b/127305289
Change-Id: I54557b05a44777f1fa2c15bde4fa648980f42eed
|
|
This temporarily adds 0.25% to oat file size.
The space will be reclaimed in a follow-up CL.
This reverts commit 8f20a23a35fa6fbe4dcb4ff70268a24dc7fb2a24.
Reason for revert: Reland as-is after CL/903819
Bug: 123510633
Test: DCHECK comparing the two stored code sizes.
Change-Id: Ia3ab31c208948f4996188764fcdcba13d9977d19
|
|
This reverts commit 68efa7b1128486e08ae60cd27181645b27bbd2e4.
Reason for revert: Breaks tests
Change-Id: I28fb143990f58e0d5f0b106bea9d9a159f19297e
|
|
This temporarily adds 0.25% to oat file size.
The space will be reclaimed in a follow-up CL.
Bug: 123510633
Test: DCHECK comparing the two stored code sizes.
Change-Id: I15340824ca637fd075a4cef87771b06cb96bb9f4
|
|
Test: test-art-host-gtest-stack_map_test
Test: test-art-host-gtest-bit_table_test
Change-Id: I15c624d2a70736aeb8422ce5babcef8e8fa82136
|
|
Test: test-art-host-gtest-stack_map_test
Change-Id: Ife021d03e4e486043ec609f9af8673ace7bde497
|
|
Make it possible to share BitTables between CodeInfos.
This saves 1% of .oat file size.
Test: test-art-host-gtest
Change-Id: I14172cba6b65e734b94f8c232f24eeee1fc67113
|
|
Test: test-art-host-gtest
Change-Id: I5ce28973042f9241e72ceb52fc5db472ca571563
|
|
Try to simplify the code using the recently added iterators.
Test: test-art-host-gtest-stack_map_test
Change-Id: I0b9f54df01749ee6ec3a67cfb07ba636a2489c89
|
|
The stored information will be used in follow-up CLs.
This temporarily increases .oat file size by 0.7%.
Test: test-art-host-gtest
Change-Id: Ie7d898b06398ae44287bb1e8153861ab112a216c
|
|
|
|
Test: test-art-host-gtest-stack_map_test
Change-Id: I0abab008159db023d531df69214cd3bb8c0639bd
|
|
Add a 'Kind' column to stack maps which marks special stack map types,
and use it at run-time to add extra sanity checks.
It will also allow us to binary search the stack maps.
The column increases .oat file size by 0.2%.
Test: test-art-host-gtest-stack_map_test
Change-Id: I2a9143afa0e32bb06174604ca81a64c41fed232f
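A sketch of the idea with hypothetical enumerator names; the actual column
values are defined by ART's StackMap:
  #include <cassert>
  #include <cstdint>

  enum class StackMapKind : uint32_t {
    kDefault = 0,  // ordinary safepoint stack map
    kCatch,        // catch-handler entry
    kOsr,          // on-stack-replacement entry
  };

  // With the kind recorded per stack map, runtime lookups can assert that the
  // map they found is of the expected type instead of silently misusing it.
  void CheckIsCatchStackMap(StackMapKind kind) {
    assert(kind == StackMapKind::kCatch);
  }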
|
|
They are no longer needed in the new encoding.
I reuse the local variables in most places to DCHECK the size
of the decoded register map. This has one catch though:
We sometimes omit all dex registers, so the DCHECK should be
done only after checking if the map is empty (if applicable).
Test: test-art-host-gtest-stack_map_test
Change-Id: I94b67029842374bc8eb7c9e5eac76fc93a651f24
|
|
The register maps tend to be similar from stack map to stack map,
so instead of encoding them again, store only the modified ones.
The dex register bitmap now stores the delta - if a register has
been modified since the previous stack map, the bit will be set.
The decoding logic scans backwards through stack maps until it
eventually finds the most recent value of each register.
This CL saves ~2.5% of .oat file size (~10% of stackmap size).
Due to the scan, this makes dex register decoding slower by a factor
of 2.5, but that still beats the old algorithm before the refactoring.
Test: test-art-host-gtest-stack_map_test
Change-Id: Id5217a329eb757954e0c9447f38b05ec34118f84
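A minimal sketch of the backward scan described above, using plain vectors and
hypothetical type names instead of the real bit-packed tables, and limited to
64 vregs for brevity:
  #include <cstddef>
  #include <cstdint>
  #include <vector>

  struct StackMapDelta {
    uint64_t changed_mask;             // bit r set => vreg r was modified here
    std::vector<uint32_t> new_values;  // values for the set bits, in bit order
  };

  // Decode the full register map at stack map `index` by scanning backwards
  // until every register has been resolved (or the first map is reached).
  std::vector<uint32_t> DecodeRegisters(const std::vector<StackMapDelta>& maps,
                                        size_t index, size_t num_vregs) {
    std::vector<uint32_t> result(num_vregs, 0);
    uint64_t remaining = (num_vregs >= 64) ? ~uint64_t{0}
                                           : (uint64_t{1} << num_vregs) - 1;
    for (size_t i = index + 1; i-- > 0 && remaining != 0;) {
      const StackMapDelta& map = maps[i];
      size_t value_index = 0;
      for (size_t r = 0; r < num_vregs; ++r) {
        if ((map.changed_mask >> r) & 1u) {
          if ((remaining >> r) & 1u) {  // the most recent write wins
            result[r] = map.new_values[value_index];
            remaining &= ~(uint64_t{1} << r);
          }
          ++value_index;
        }
      }
    }
    return result;
  }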
|
|
The InlineInfo class actually represented a list of inlining
information for a given stack map, and the depth argument was
used everywhere to select the desired element from the list.
This was verbose and inconsistent with the other classes.
Change the InlineInfo class to represent a single inlining,
and select the desired depth when getting it from CodeInfo.
Test: test-art-host-gtest-stack_map_test
Change-Id: I35b73e6704854f0203f51d4dbdbed5b1d1cd5a3b
|
|
Simplify code by encoding dex register maps using BitTables.
The overall design is unchanged (bitmask+indices+catalogue).
This CL saves ~0.4% of .oat file size.
The dex register map decoding is a factor of 3 faster now
(based on the time to verify the register maps on Arm).
This is not too surprising as the old version was O(n^2).
It also reduces compiler arena memory usage by 11% since the
BitTableBuilder is more memory efficient, we store less
intermediate data, and we deduplicate most data on the fly.
Test: test-art-host-gtest-stack_map_test
Change-Id: Ib703a5ddf7f581280522d589e4a2bfebe53c26a9
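A sketch of the bitmask+indices+catalogue layout mentioned above, with
simplified containers in place of the bit-packed BitTables and hypothetical
type names:
  #include <cstddef>
  #include <cstdint>
  #include <vector>

  struct DexRegisterLocation {  // catalogue entry, deduplicated across maps
    enum class Kind : uint8_t { kNone, kInStack, kInRegister, kConstant };
    Kind kind;
    int32_t value;              // stack offset, register number, or constant
  };

  struct DexRegisterMapSketch {
    uint64_t live_mask;                       // bit r set => vreg r has a location
    std::vector<uint32_t> catalogue_indices;  // one index per set bit
  };

  DexRegisterLocation GetLocation(const std::vector<DexRegisterLocation>& catalogue,
                                  const DexRegisterMapSketch& map,
                                  uint32_t vreg) {
    if (((map.live_mask >> vreg) & 1u) == 0) {
      return DexRegisterLocation{DexRegisterLocation::Kind::kNone, 0};
    }
    // The rank of `vreg` among the set bits selects the index entry.
    uint64_t below = map.live_mask & ((uint64_t{1} << vreg) - 1);
    size_t rank = static_cast<size_t>(__builtin_popcountll(below));
    return catalogue[map.catalogue_indices[rank]];
  }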
|
|
It is invalid to try to encode an improperly aligned PC.
Test: test-art-target-gtest-stack_map_test
Change-Id: I73e7b6225bfee87b0d6161298e19648ee6e1d499
|
|
Store some of the needed decoding state explicitly to avoid passing it
around all the time. The DexRegisterMap class is rewritten in the next CL.
Test: test-art-host-gtest-stack_map_test
Change-Id: Ie268dff2a1c1da2e08f0e6799ae51c30e11f350b
|
|
I need to reduce the StackMapEntry to a POD type so that it
can be used in BitTableBuilder.
Test: test-art-host-gtest-stack_map_test
Change-Id: I5f9ad7fdc9c9405f22669a11aea14f925ef06ef7
|
|
This reverts commit 8b20b5c1f5b454b2f8b8bff492c88724b5002600.
Reason for revert: Retry submit unmodified after fixing the test.
Use BitTable to store the masks as well and move the
deduplication responsibility to the BitTable builders.
Don't generate entries for masks which are all zeros.
This saves 0.2% of .oat file size on both Arm64 and Arm.
Encode registers as (value+shift) due to trailing zeros.
This saves 1.0% of .oat file size on Arm64 and 0.2% on Arm.
Test: test-art-target-gtest-exception_test
Test: test-art-host-gtest-bit_table_test
Test: test-art-host-gtest-stack_map_test
Change-Id: Ib643776dbec3f051cc29cd13ff39e453fab5fae9
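A small sketch of the (value+shift) register-mask encoding: masks often end in
a run of zero bits, so storing the mask shifted down plus the shift count takes
fewer bits than the raw mask (function names hypothetical):
  #include <cstdint>
  #include <utility>

  std::pair<uint32_t, uint32_t> EncodeRegisterMask(uint32_t mask) {
    if (mask == 0) {
      return {0u, 0u};
    }
    uint32_t shift = static_cast<uint32_t>(__builtin_ctz(mask));  // trailing zeros
    return {mask >> shift, shift};
  }

  uint32_t DecodeRegisterMask(uint32_t value, uint32_t shift) {
    return value << shift;
  }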
|
|
This reverts commit ffaf87a429766ed80e6afee5bebea93db539620b.
Reason for revert: Breaks exception_test32 on target
for CMS and heap poisoning configs.
Change-Id: I127c17f693e28211a799f73a50e73105edee7e4c
|
|
Use BitTable to store the masks as well and move the
deduplication responsibility to the BitTable builders.
Don't generate entries for masks which are all zeros.
This saves 0.2% of .oat file size on both Arm64 and Arm.
Encode registers as (value+shift) due to trailing zeros.
This saves 1.0% of .oat file size on Arm64 and 0.2% on Arm.
Test: test-art-host-gtest
Change-Id: I636b7edd49e10e8afc9f2aa385b5980f7ee0e1f1
|
|
Remove most of the code related to handling of bit encodings.
The design is still the same; the encodings are just more implicit.
Most of the complexity is replaced with a single BitTable class,
which is a general-purpose table of tightly bit-packed integers.
It has its own header which stores the bit-encoding of columns,
and that removes the need to handle the encodings explicitly.
Other classes, like StackMap, are accessors into the BitTable,
with named getter methods for the individual columns.
This CL saves ~1% of .oat file size (~4% of stackmap size).
Test: test-art-host-gtest
Change-Id: I7e92683753b0cc376300e3b23d892feac3670890
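A minimal sketch of reading from such a tightly bit-packed table; the header
layout and class shape are simplified and hypothetical, not ART's actual
BitTable:
  #include <cstddef>
  #include <cstdint>
  #include <numeric>
  #include <vector>

  class BitTableSketch {
   public:
    BitTableSketch(std::vector<uint32_t> column_bits, std::vector<uint8_t> data)
        : column_bits_(std::move(column_bits)), data_(std::move(data)) {
      row_bits_ = std::accumulate(column_bits_.begin(), column_bits_.end(), 0u);
    }

    // Named accessors (e.g. a GetNativePcOffset getter) boil down to this.
    uint32_t Get(size_t row, size_t column) const {
      size_t bit = row * row_bits_;
      for (size_t c = 0; c < column; ++c) {
        bit += column_bits_[c];
      }
      return LoadBits(bit, column_bits_[column]);
    }

   private:
    uint32_t LoadBits(size_t bit_offset, uint32_t bit_count) const {
      uint32_t result = 0;
      for (uint32_t i = 0; i < bit_count; ++i) {  // bit-by-bit for clarity
        size_t bit = bit_offset + i;
        uint32_t value = (data_[bit / 8] >> (bit % 8)) & 1u;
        result |= value << i;
      }
      return result;
    }

    std::vector<uint32_t> column_bits_;  // from the table header
    std::vector<uint8_t> data_;          // packed rows
    uint32_t row_bits_ = 0;
  };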
|
|
Make ArenaPool an abstract base class and leave MallocArenaPool
implementation with it. This enables arena_allocator to be free
of MemMap, Mutex, etc., in preparation to move the remaining collections
out of runtime/base to libartbase/base.
Bug: 22322814
Test: make -j 50 test-art-host
build and boot
Change-Id: Ief84dcbfb749165d9bc82000c6b8f96f93052422
|
|
Adding InstructionSet::kLast shall make it easier to encode
the InstructionSet in fewer bits using BitField<>. However,
introducing `kLast` into the `art` namespace is not a good
idea, so we change the InstructionSet to an enum class.
This also uncovered a case of InstructionSet::kNone being
erroneously used instead of vixl32::Condition::None(), so
it's good to remove `kNone` from the `art` namespace.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I6fa6168dfba4ed6da86d021a69c80224f09997a6
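A sketch of why a kLast enumerator helps: the bit width needed to store the
InstructionSet can be derived from it at compile time (enumerator list
abbreviated; BitsToStore is a stand-in helper, not ART's BitField<> itself):
  #include <cstdint>

  enum class InstructionSetSketch : uint32_t {
    kNone = 0,
    kArm,
    kArm64,
    kX86,
    kX86_64,
    kLast = kX86_64,
  };

  constexpr uint32_t BitsToStore(uint32_t value) {
    uint32_t bits = 0;
    while (value != 0) {
      ++bits;
      value >>= 1;
    }
    return bits;
  }

  constexpr uint32_t kIsaBits =
      BitsToStore(static_cast<uint32_t>(InstructionSetSketch::kLast));
  static_assert(kIsaBits == 3, "the enumerators 0..4 fit in 3 bits");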
|
|
Reuse the memory previously allocated on the ArenaStack by
optimization passes.
This CL handles only the architecture-independent codegen
and slow paths, architecture-dependent codegen allocations
shall be moved to the ScopedArenaAllocator in a follow-up.
Memory needed to compile the two most expensive methods for
aosp_angler-userdebug boot image:
BatteryStats.dumpCheckinLocked() : 19.6MiB -> 18.5MiB (-1189KiB)
BatteryStats.dumpLocked(): 39.3MiB -> 37.0MiB (-2379KiB)
Also move definitions of functions that use bit_vector-inl.h
from bit_vector.h to bit_vector-inl.h.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 64312607
Change-Id: I84688c3a5a95bf90f56bd3a150bc31fedc95f29c
|
|
Memory needed to compile the two most expensive methods for
aosp_angler-userdebug boot image:
BatteryStats.dumpCheckinLocked() : 21.1MiB -> 20.2MiB
BatteryStats.dumpLocked(): 42.0MiB -> 40.3MiB
This is because all the memory previously used by the graph
builder is reused by later passes.
And finish the "arena"->"allocator" renaming; make renamed
allocator pointers that are members of classes const when
appropriate (and make a few more members around them const).
Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 64312607
Change-Id: Ia50aafc80c05941ae5b96984ba4f31ed4c78255e
|
|
The method info data is stored separately from the code info to
reduce oat size by improving deduplication of stack maps.
To reduce code size, this moves the invoke info and inline info
method indices to this table.
Oat size for a large app (arm64): 77746816 -> 74023552 (-4.8%)
Average oat size reduction for golem (arm64): 2%
Repurposed unused SrcMapElem deduping to be for MethodInfo.
TODO: Delete SrcMapElem in a follow-up CL.
Bug: 36124906
Test: clean-oat-host && test-art-host-run-test
Change-Id: I2241362e728389030b959f42161ce817cf6e2009
|
|
Invoke info records the invoke type and dex method index for invokes
that may reach artQuickResolutionTrampoline. Having this information
recorded allows the runtime to avoid reading the dex code and pulling
in extra pages.
Code size increase for a large app:
93886360 -> 95811480 (2.05% increase)
Half of the code size increase comes from fewer stack maps being
deduplicated. I suspect there is less deduplication because of the
invoke info method index.
Merged disabled until we measure the RAM savings.
Test: test-art-host, N6P boots
Bug: 34109702
Change-Id: I6c5e4a60675a1d7c76dee0561a12909e4ab6d5d9
|
|
Before, it only deduplicated the normal stack map dex register maps.
Code size for a large app: 93341616 -> 92678040 (-0.7%)
Added test.
Bug: 34621054
Test: test-art-host
Change-Id: I4fab4e40915bfa12cb978edbb3cbc19e2cf00954
|
|
Previously, the table layout was computed in multiple places, such as
stack_map_stream and the getters. This made it difficult to add new
stack map tables and made the code hard to understand.
This change specifies the table layout entirely inside the code info.
Updating the layout only requires changing ComputeTableOffsets.
Changed the stack map inline info offset to be an index, so that the
inline infos are no longer required to be directly after the dex
register table.
Oat file size for a large app: 94459576 -> 93882040 (-0.61%)
Updated oatdump and fixed a bug that incorrectly computed the
register mask bytes.
Bug: 34621054
Test: test-art-host
Change-Id: I3a7f141e09d5a18bce2bc6c9439835244a22016e
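A sketch of the centralization: each table's offset becomes a running sum of
the preceding tables' sizes, computed in a single place (names and types
hypothetical):
  #include <cstddef>
  #include <vector>

  struct TableLayout {
    std::vector<size_t> offsets;  // byte offset of each table within CodeInfo
    size_t total_size = 0;
  };

  TableLayout ComputeTableOffsetsSketch(const std::vector<size_t>& table_sizes) {
    TableLayout layout;
    size_t offset = 0;
    for (size_t size : table_sizes) {
      layout.offsets.push_back(offset);
      offset += size;
    }
    layout.total_size = offset;
    return layout;
  }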
|
|
Data is commonly shared between different stack maps. The register
masks are stored after the stack masks.
Oat size for a large app:
96722288 -> 94485872 (-2.31%)
Average oat size reduction according to golem -3.193%.
Bug: 34621054
Test: test-art-host
Change-Id: I5eacf668992e866d11ddba0c01675038a16cdfb4
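This entry and the stack-mask one below rely on the same deduplication idea,
sketched here with simplified types: each distinct mask is stored once and
referenced by index.
  #include <cstdint>
  #include <map>
  #include <vector>

  class MaskDeduper {
   public:
    // Returns a stable index; identical masks share one stored entry.
    uint32_t Dedupe(uint32_t mask) {
      auto it = index_of_.find(mask);
      if (it != index_of_.end()) {
        return it->second;
      }
      uint32_t index = static_cast<uint32_t>(masks_.size());
      masks_.push_back(mask);
      index_of_.emplace(mask, index);
      return index;
    }

    uint32_t Get(uint32_t index) const { return masks_[index]; }

   private:
    std::vector<uint32_t> masks_;
    std::map<uint32_t, uint32_t> index_of_;
  };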
|
|
The stack masks repeat often enough that it is worth deduplicating
them.
Oat size for a large app:
98143600 -> 96722288 (-1.44%)
Bug: 34621054
Test: test-art-host
Change-Id: If73d51e46066357049d5be2e406ae9a32b7ff1f4
|
|
Saves 0.65% of boot.oat size, probably similar on apps. Added
BitMemoryRegion to avoid requiring adding state to StackMap. Added
test to memory_region_test.
Test: clean-oat-host && test-art-host
Bug: 34621054
Change-Id: I40279c59e262bd5e3c6a9135f83e22b5b6900d68
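A sketch of a bit-granularity view over a byte buffer in the spirit of
BitMemoryRegion; the class below is simplified and not the real ART type:
  #include <cstddef>
  #include <cstdint>

  class BitRegionSketch {
   public:
    BitRegionSketch(const uint8_t* data, size_t bit_start, size_t bit_size)
        : data_(data), bit_start_(bit_start), bit_size_(bit_size) {}

    bool LoadBit(size_t bit_offset) const {
      size_t bit = bit_start_ + bit_offset;  // callers stay byte-agnostic
      return ((data_[bit / 8] >> (bit % 8)) & 1u) != 0;
    }

    size_t size_in_bits() const { return bit_size_; }

   private:
    const uint8_t* data_;
    size_t bit_start_;
    size_t bit_size_;
  };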
|
|
Compress native PC based on instruction alignment. This reduces the
size of stack maps, boot.oat is 0.4% smaller for arm64.
Test: test-art-host, test-art-target, N6P booting
Change-Id: I2b70eecabda88b06fa80a85688fd992070d54278
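A sketch of the alignment-based compression: instruction alignment guarantees
that the low bits of a native PC offset are zero, so they need not be stored.
The shift value below is only an example; the real shift depends on the
instruction set:
  #include <cstdint>

  constexpr uint32_t kPcAlignmentBits = 1;  // e.g. 2-byte aligned instructions

  constexpr uint32_t CompressNativePc(uint32_t pc_offset) {
    return pc_offset >> kPcAlignmentBits;
  }

  constexpr uint32_t DecompressNativePc(uint32_t packed) {
    return packed << kPcAlignmentBits;
  }

  static_assert(DecompressNativePc(CompressNativePc(0x40)) == 0x40, "");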
|
|
Currently done for JIT. Can be extended for AOT and inlined boot
image methods.
Also refactor the lookup of an inlined method at runtime to not
rely on the dex cache, but to look at the class loader tables.
bug: 30933338
test: test-art-host, test-art-target
Change-Id: I58bd4d763b82ab8ca3023742835ac388671d1794
|
|
Use the same approach as we do for stackmaps to reduce the size.
It saves 4.0 MB from non-debuggable boot.oat (AOSP).
It does not affect debuggable boot.oat.
It saves 3.6 MB (of 96.6 MB) from /system/framework/arm/ (GOOG).
It saves 0.6 MB (of 26.7 MB) from /system/framework/oat/arm/ (GOOG).
Field loads from inline-info get around 5% slower.
(based on the time it takes to load all inline-infos from boot.oat)
Change-Id: I67b0fa5eef74c1fdb013680d0231fd44ea696176
|
|
Use only the minimum number of bits required to store stack map data.
For example, if native_pc needs 5 bits and dex_pc needs 3 bits, they
will share the first byte of the stack map entry.
The header is changed to store bit offsets of the fields rather than
byte sizes. Offsets also make it easier to access later fields without
calculating the sum of all previous sizes.
All of the header fields are byte-sized or encoded as ULEB128 instead
of the previous fixed-size encoding. This shrinks it by about half.
It saves 3.6 MB from non-debuggable boot.oat (AOSP).
It saves 3.1 MB from debuggable boot.oat (AOSP).
It saves 2.8 MB (of 99.4 MB) from /system/framework/arm/ (GOOG).
It saves 1.0 MB (of 27.8 MB) from /system/framework/oat/arm/ (GOOG).
Field loads from stackmaps seem to get around 10% faster.
(based on the time it takes to load all stackmap entries from boot.oat)
Bug: 27640410
Change-Id: I8bf0996b4eb24300c1b0dfc6e9d99fe85d04a1b7
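A sketch of the minimum-bit idea: the encoder measures the largest value of
each field, stores only that many bits per entry, and records each field's bit
offset in the header (field and type names hypothetical):
  #include <cstdint>

  constexpr uint32_t MinimumBits(uint32_t max_value) {
    uint32_t bits = 0;
    while (max_value != 0) {
      ++bits;
      max_value >>= 1;
    }
    return bits;
  }

  struct StackMapEncodingSketch {
    uint32_t native_pc_bit_offset = 0;
    uint32_t dex_pc_bit_offset = 0;
    uint32_t total_bits = 0;

    void Compute(uint32_t max_native_pc, uint32_t max_dex_pc) {
      native_pc_bit_offset = 0;
      dex_pc_bit_offset = MinimumBits(max_native_pc);  // fields share bytes
      total_bits = dex_pc_bit_offset + MinimumBits(max_dex_pc);
    }
  };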
|
|
Change-Id: I76a291e6a0ac37f0590d16c7f5b866115588bc55
|
|
Motivation is using arenas in the verifier.
Bug: 10921004
Change-Id: I3c7ed369194b2309a47b12a621e897e0f2f65fcf
|
|
When running Optimized code on 64-bit, the high value of a vreg pair
may be stored in the high 32 bits of a CPU register. This is not
reflected in stack maps, which would encode both the low and high vreg
as kInRegister with the same register number, making it
indistinguishable from two non-wide vregs with the same value in the
lower 32 bits.
Deoptimization deals with this by running the verifier and thus
obtaining vreg pair information, but this would be too slow for
try/catch. This patch therefore adds two new stack map location kinds:
kInRegisterHigh and kInFpuRegisterHigh to differentiate between the
two cases.
Note that this also applies to floating-point registers on x86.
Change-Id: I15092323e56a661673e77bee1f0fca4261374732
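A simplified sketch of the location kinds involved; the two *High kinds tell
the decoder to read the upper 32 bits of the named register (the enum here is
abbreviated, not the full set used by ART):
  #include <cstdint>

  enum class VRegLocationKind : uint8_t {
    kInStack,
    kInRegister,      // low 32 bits of a core register
    kInRegisterHigh,  // high 32 bits of a core register (wide vreg pair)
    kInFpuRegister,
    kInFpuRegisterHigh,
    kConstant,
  };

  // A wide (64-bit) value held in one register is now described as:
  //   low vreg  -> {kInRegister,     reg_no}
  //   high vreg -> {kInRegisterHigh, reg_no}
  // instead of two indistinguishable kInRegister entries.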
|
|
Also shorten NumberOfDexRegisterLocationCatalogEntries to
NumberOfLocationCatalogEntries.
Change-Id: I55f8ec2960ea67e2eb6871a417bd442d0e2810fb
|
|
Operations on CodeInfo and StackMap objects repeatedly read encoding
information from the MemoryRegion. Since these are 3-bit loads of
values that never change, caching them can measurably reduce compile
times.
According to benchmarks, this patch saves 1-3% on armv7, 2-4% on x86,
and 0-1% on x64.
Change-Id: I46b197513601325d8bab562cc80100c00ec28a3b
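A sketch of the caching idea: decode the constant bit widths once into a plain
struct and reuse them for every field access (layout and names hypothetical):
  #include <cstdint>

  struct EncodingCacheSketch {
    uint8_t native_pc_bits;
    uint8_t dex_pc_bits;
    uint8_t register_mask_bits;
  };

  // Decoded once per CodeInfo...
  inline EncodingCacheSketch DecodeOnce(const uint8_t* header) {
    return EncodingCacheSketch{header[0], header[1], header[2]};  // simplified
  }

  // ...then every stack map query reuses the cached widths instead of
  // re-reading small bit fields from the MemoryRegion each time.
  inline uint32_t EntryBitSize(const EncodingCacheSketch& cache) {
    return cache.native_pc_bits + cache.dex_pc_bits + cache.register_mask_bits;
  }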
|
|
StackMap::SetStackMask will currently copy a BitVector into a
MemoryRegion bit by bit. This patch adds a new function for copying the data
with memcpy.
This is resubmission of CL I28d45a590b35a4a854cca2f57db864cf8a081487
but with a fix for a broken test which it revealed.
Change-Id: Ib65aa614d3ab7b5c99c6719fdc8e436466a4213d
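A sketch of the bulk copy for the byte-aligned case, with simplified types; the
real code also has to handle bit-level alignment:
  #include <cstddef>
  #include <cstdint>
  #include <cstring>

  void CopyBitsSketch(uint8_t* dest, const uint32_t* bit_vector_storage,
                      size_t num_bits) {
    size_t num_bytes = (num_bits + 7) / 8;
    std::memcpy(dest, bit_vector_storage, num_bytes);  // one call, no bit loop
  }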
|
|
This will be needed to recover the call stack.
Change-Id: I2fe10785eb1167939c8cce1862b2d7f4066e16ec
|