path: root/compiler/optimizing/code_generator_arm.cc
2016-09-05Merge "Avoid excessive spill slots for slow paths."Treehugger Robot
2016-09-05Avoid excessive spill slots for slow paths.Vladimir Marko
Reducing the frame size makes stack maps smaller as we need fewer bits for stack masks and some dex register locations may use short location kind rather than long. On Nexus 9, AOSP ToT, the boot.oat size reduction is
prebuilt multi-part boot image:
- 32-bit boot.oat: -416KiB (-0.6%)
- 64-bit boot.oat: -635KiB (-0.9%)
prebuilt multi-part boot image with read barrier:
- 32-bit boot.oat: -483KiB (-0.7%)
- 64-bit boot.oat: -703KiB (-0.9%)
on-device built single boot image:
- 32-bit boot.oat: -380KiB (-0.6%)
- 64-bit boot.oat: -632KiB (-0.9%)
on-device built single boot image with read barrier:
- 32-bit boot.oat: -448KiB (-0.6%)
- 64-bit boot.oat: -692KiB (-0.9%)
The other benefit is that at runtime, threads may need fewer pages for their stacks, reducing overall memory usage.
We defer the calculation of the maximum spill size from the main register allocator (linear scan or graph coloring) to the RegisterAllocationResolver and do it based on the live registers at slow path safepoints. The old notion of an artificial slow path safepoint interval is removed as it is no longer needed.
Test: Run ART test suite on host and Nexus 9. Bug: 30212852 Change-Id: I40b3d114e278e2c5807982904fa49bf6642c6275
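The deferred calculation described above can be pictured roughly as follows. This is only a sketch under stated assumptions: `SafepointLiveRegisters`, `PopCount` and `MaxSlowPathSpillSize` are hypothetical names, not the ones used in ART, and the per-register sizes are passed in rather than taken from a real code generator.

```cpp
#include <algorithm>
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one slow path safepoint: bit i of a mask is set
// when register i is live and would have to be saved by the slow path.
struct SafepointLiveRegisters {
  uint32_t core_mask;
  uint32_t fp_mask;
};

// Number of set bits in a 32-bit register mask.
inline size_t PopCount(uint32_t mask) {
  return std::bitset<32>(mask).count();
}

// The frame only needs room for the largest set of registers that is live
// at any single slow path safepoint, not for every register that could
// theoretically be spilled.
size_t MaxSlowPathSpillSize(const std::vector<SafepointLiveRegisters>& safepoints,
                            size_t core_reg_size,
                            size_t fp_reg_size) {
  size_t max_size = 0;
  for (const SafepointLiveRegisters& sp : safepoints) {
    size_t size = PopCount(sp.core_mask) * core_reg_size +
                  PopCount(sp.fp_mask) * fp_reg_size;
    max_size = std::max(max_size, size);
  }
  return max_size;
}
```

With no slow path safepoints the function returns 0, so a method without slow paths reserves no slow path spill area in this model.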
2016-08-31Merge "Add entrypoint type information."Roland Levillain
2016-08-26Merge "ARM/MIPS: Avoid dead dex cache arrays base for intrinsics."Treehugger Robot
2016-08-26Add entrypoint type information.Serban Constantinescu
For some of the runtime calls we do not need to generate stack maps. For example, the Optimizing compiler implements HRem Floating Point by calling libm's fmod(). Since this is a leaf method that does not suspend the execution, we do not need to treat the fmod() invoke as a possible suspend point and thus we do not need to create a stack map for the particular PC. For now conservatively only tag the maths runtime entrypoints with this information. Test: m test-art-target Change-Id: Iab73dcf8047d2edaa7a570113ee792e46ccbc464
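A minimal sketch of the idea, assuming a made-up `RuntimeEntrypoint` enum and a `RequiresStackMap` helper (both hypothetical, chosen for illustration and not taken from the ART sources): only the maths entrypoints are tagged as not needing a stack map, and everything else stays on the conservative default.

```cpp
#include <cstdint>

// Hypothetical subset of runtime entrypoints, for illustration only.
enum class RuntimeEntrypoint : uint8_t {
  kAllocObject,
  kThrowNullPointer,
  kFmod,
  kFmodf,
};

// Leaf maths helpers cannot suspend the thread, so the call site is not a
// possible suspend point and needs no stack map. Any entrypoint that is not
// explicitly listed is conservatively assumed to need one.
constexpr bool RequiresStackMap(RuntimeEntrypoint entrypoint) {
  switch (entrypoint) {
    case RuntimeEntrypoint::kFmod:
    case RuntimeEntrypoint::kFmodf:
      return false;
    default:
      return true;
  }
}
```

A code generator could consult such a predicate after emitting the call and skip recording PC info when it returns false.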
2016-08-26ARM/MIPS: Avoid dead dex cache arrays base for intrinsics.Vladimir Marko
Test: Run ART test suite on host and Nexus 6. Change-Id: Ie2ad70f1e3f125eae5dad53a6384d405e0311505
2016-08-26ARM: Make runtime invokes use InvokeRuntime().Serban Constantinescu
This patch refactors all of the ARM Optimizing compiler runtime invokes to use InvokeRuntime(). It also fixes some misuses of RecordPcInfo(). Change-Id: I722bc2ba95e42ff69ca12c3edc09326e0de2881f
2016-08-26Merge "Re-enable the ArraySet fast path with Baker read barriers."Roland Levillain
2016-08-25Re-enable the ArraySet fast path with Baker read barriers.Roland Levillain
Benchmarks (ARM64) score variations on Nexus 5X with CPU cores clamped at 960000 Hz (aosp_bullhead-userdebug build):
- Ritzperf - average (lower is better): -0.95% (virtually unchanged)
- CaffeineMark - average (higher is better): +2.50% (slightly better)
- DeltaBlue (lower is better): -0.55% (virtually unchanged)
- Richards - average (lower is better): +0.67% (virtually unchanged)
- SciMark2 - average (higher is better): -0.10% (virtually unchanged)
Details about Ritzperf benchmarks with meaningful variations (lower is better):
- GenericCalcActions.MemAllocTest: -5.05% (better)
Details about CaffeineMark benchmarks with meaningful variations (higher is better):
- Method: +16.88% (better)
Details about Richards benchmarks with meaningful variations (lower is better):
- deutsch_acc_interface: +9.86% (worse)
Boot image code size variation on Nexus 5X (aosp_bullhead-userdebug build):
- total ARM64 framework Oat files size change: 105933472 bytes -> 106027680 bytes (+0.09%)
- total ARM framework Oat files size change: 89157936 bytes -> 89239856 bytes (+0.09%)
Test: ART host and target (ARM, ARM64) tests. Bug: 29516974 Bug: 29506760 Bug: 12687968 Change-Id: Ib9e9709712295e17804b8888ac10e3d518ff2e70
2016-08-25ArraySet without type check does not need read barrier.Vladimir Marko
Test: Run ART test suite with ART_USE_READ_BARRIER=true on host and Nexus 9. Bug: 12687968 Change-Id: Ie04a34b2149f4fc6fe995f3e43e76986a3f6330f
2016-08-24Revert "Revert "x86/x86-64: Avoid temporary for read barrier field load.""Vladimir Marko
Fixed the fault handler recognizing the TEST instruction and fault address within the lock word. Added tests to 439-npe. Bug: 29966877 Bug: 12687968 Test: Tested with ART_USE_READ_BARRIER=true on host. Test: Tested with ART_USE_READ_BARRIER=true ART_HEAP_POISONING=true on host. This reverts commit ccf15bca330f9a23337b1a4b5850f7fcc6c1bf15. Change-Id: I8990def5f719c9205bf6e5fdba32027fa82bec50
2016-08-23Revert "x86/x86-64: Avoid temporary for read barrier field load."Vladimir Marko
Fault handler does not recognize the instruction F6 /0 ib TEST r/m8, imm8 so we get crashes instead of NPEs. Bug: 29966877 Bug: 12687968 This reverts commit ccf06d8f19a37432de4a3b768747090adfbd18ec. Change-Id: Ib7db3b59f44c0d3ed5e24a20b6c6ee596a89d709
2016-08-23x86/x86-64: Avoid temporary for read barrier field load.Vladimir Marko
Add TEST instructions for memory and immediate. Use the byte version to avoid a temporary in read barrier field load. Test: Tested with ART_USE_READ_BARRIER=true on host. Test: Tested with ART_USE_READ_BARRIER=true ART_HEAP_POISONING=true on host. Bug: 29966877 Bug: 12687968 Change-Id: Ia415d3c2e1ae1ff6dff11d72bbb7d96d5deed6ee
2016-08-19Merge "ART: Implement a fixed size string dex cache"Mathieu Chartier
2016-08-19Add support for Baker read barriers in SystemArrayCopy intrinsics.Roland Levillain
Benchmarks (ARM64) score variations on Nexus 5X with CPU cores clamped at 960000 Hz (aosp_bullhead-userdebug build):
- Ritzperf - average (lower is better): -3.03% (slightly better)
- CaffeineMark - average (higher is better): +1.26% (slightly better)
- DeltaBlue (lower is better): -10.50% (better)
- Richards - average (lower is better): -3.36% (slightly better)
- SciMark2 - average (higher is better): +0.26% (virtually unchanged)
Details about Ritzperf benchmarks with meaningful variations (lower is better):
- FormulaEvaluationActions.EvaluateAndApplyChanges: -13.26% (better)
- FormulaEvaluationActions.EvaluateCascadingSums: -10.94% (better)
- FormulaEvaluationActions.EvaluateComplexFormulas: -15.50% (better)
- FormulaEvaluationActions.EvaluateFibonacci: -10.41% (better)
- FormulaEvaluationActions.EvaluateLargeSums: +6.02% (worse)
Boot image code size variation on Nexus 5X (aosp_bullhead-userdebug build):
- total ARM64 framework Oat files size change: 107047632 bytes -> 107154128 bytes (+0.10%)
- total ARM framework Oat files size change: 90932028 bytes -> 91009852 bytes (+0.09%)
Test: ART host and target (ARM, ARM64) tests + Nexus 5X boot. Bug: 29516905 Bug: 29506760 Bug: 12687968 Change-Id: I85431368d09965687a0301ae2eb3c991f276ce5d
2016-08-18ART: Implement a fixed size string dex cacheChristina Wadsworth
Previously, the string dex cache had dex_file->NumStringIds() entries, and @ruhler found that only ~1% of that cache was ever getting filled. Since many of these string dex caches were previously 100,000+ indices in length, we were wasting a few hundred KB per app by storing null pointers. The intent of this project was to reduce the space the string dex cache is using, while not regressing on time that much.
This is the first of a few CLs, which implements the new fixed size array and disables the compiled code so it always goes slow path. In four other CLs, I implemented a "medium path" that regresses from the previous "fast path" only a bit in assembly in the entrypoints. @vmarko will introduce new compiled code in the future so that we ultimately won't be regressing on time at all.
Overall, space savings have been confirmed as on the order of 100 KB per application. There is a 4-5% slowdown in art-opt on Golem, and no noticeable slowdown in the interpreter. The opt slowdown should be diminished once the new compiled code is introduced.
Test: m test-art-host Bug: 20323084 Change-Id: Ic654a1fb9c1ae127dde59290bf36a23edb55ca8e
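One way such a fixed size cache can work is sketched below. The slot layout, the modulo indexing and the name `FixedSizeStringCache` are assumptions made for illustration; the commit message does not spell out how the real cache is organised. Each slot remembers which string index it currently holds, so a lookup for a different index that maps to the same slot misses and falls back to the slow path.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical fixed size cache keyed by string index.
template <size_t kSize>
class FixedSizeStringCache {
  static_assert(kSize != 0 && (kSize & (kSize - 1)) == 0,
                "size must be a power of two");

 public:
  // Returns the cached string, or nullptr on a miss (empty slot, or slot
  // currently occupied by a different string index).
  const std::string* Lookup(uint32_t string_idx) const {
    const Slot& slot = slots_[string_idx % kSize];
    if (slot.value != nullptr && slot.string_idx == string_idx) {
      return slot.value;
    }
    return nullptr;
  }

  // Stores the resolved string, evicting whatever shared the slot.
  void Store(uint32_t string_idx, const std::string* value) {
    assert(value != nullptr);
    slots_[string_idx % kSize] = Slot{string_idx, value};
  }

 private:
  struct Slot {
    uint32_t string_idx;
    const std::string* value;
  };

  std::array<Slot, kSize> slots_{};  // Zero-initialised: all slots empty.
};
```

The memory cost is then fixed by `kSize` rather than by the number of string ids in the dex file, at the price of occasional evictions when two indices collide.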
2016-08-15Merge "Revert "Enable IntermediateAddress for primitive arrays with read barriers.""Treehugger Robot
2016-08-15Revert "Enable IntermediateAddress for primitive arrays with read barriers."Roland Levillain
This CL breaks the angler-userdebug build with `ART_USE_READ_BARRIER=true`. Test: Build angler-userdebug with `ART_USE_READ_BARRIER=true`. Bug: 30762467 Bug: 26601270 Bug: 12687968 This reverts commit 12ecf0800d465acdaa3deccd383ff8ed3428a183. Change-Id: Ia2069ac9436d2336311dd8d0f183c02e587586ae
2016-08-12Adjust spacing before NOLINT comments in ART.Roland Levillain
Note that neither clang-tidy nor cpplint.py complain about these style "issues", precisely because of the NOLINT comments. Test: WITH_TIDY=1 WITH_TIDY_CHECKS='-*,misc-macro-parentheses' mmma art Change-Id: Id692fd394ffbd4fe208cbbe4407b4d5e208462bb
2016-08-10Merge "ARM: Embed constants in add/sub-long."Vladimir Marko
2016-08-08Merge "Enable IntermediateAddress for primitive arrays with read barriers."Roland Levillain
2016-08-08Enable IntermediateAddress for primitive arrays with read barriers.Roland Levillain
Test: ART host and target (ARM, ARM64) tests. Bug: 26601270 Bug: 12687968 Change-Id: I6736ba7b1809bece1bf3cd82c69e4f42a0d3c4a7
2016-08-05ARM: Embed constants in add/sub-long.Vladimir Marko
Test: 538-checker-embed-constants Test: Run ART test suite on Nexus 5. Change-Id: Ib9639748c74d5c56dc354a6830987b613b922654
2016-08-04Merge "Change suspend entrypoint to save all registers."Vladimir Marko
2016-08-04Change suspend entrypoint to save all registers.Vladimir Marko
We avoid the need to save/restore registers in slow paths and get significant code size savings. On Nexus 9, AOSP:
- 32-bit boot.oat: -1.4MiB (-1.9%)
- 64-bit boot.oat: -2.0MiB (-2.3%)
- other 32-bit oat files in dalvik-cache: -200KiB (-1.7%)
- other 64-bit oat files in dalvik-cache: -2.3MiB (-2.1%)
Test: Run ART test suite on host and Nexus 9 with gc stress. Bug: 30212852 Change-Id: I7015afc1e7d30341618c9200a3dc9ae277afd134
2016-08-02Merge "ARM: Embed 0.0 in VCMP."Vladimir Marko
2016-08-02ARM: Embed 0.0 in VCMP.Vladimir Marko
Test: Run ART test suite on Nexus 5. Change-Id: I5cbbd98c4d64a4d9213e27adcae929ead5099a39
2016-08-01ART: Convert pointer size to enumAndreas Gampe
Move away from size_t to dedicated enum (class). Bug: 30373134 Bug: 30419309 Test: m test-art-host Change-Id: Id453c330f1065012e7d4f9fc24ac477cc9bb9269
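A rough sketch of what a dedicated pointer size enum buys over a plain size_t. The enumerator names and the `EntrypointOffset` helper are assumptions for illustration, not quotations from the change: with a scoped enum, an arbitrary integer can no longer be passed where a pointer size is expected without an explicit cast.

```cpp
#include <cstddef>

// Hypothetical scoped enum; the underlying value is the size in bytes.
enum class PointerSize : size_t {
  k32 = 4,
  k64 = 8,
};

// Pointer size of the machine this code is compiled for.
constexpr PointerSize kHostPointerSize =
    sizeof(void*) == 8 ? PointerSize::k64 : PointerSize::k32;

// Byte offset of the index-th pointer-sized slot in a table. Passing a raw
// number such as 4 for pointer_size is now a compile error.
constexpr size_t EntrypointOffset(size_t index, PointerSize pointer_size) {
  return index * static_cast<size_t>(pointer_size);
}
```

For example, `EntrypointOffset(3, PointerSize::k64)` yields 24, while `EntrypointOffset(3, 8)` does not compile.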
2016-07-22Do not emit stack maps for runtime calls to ReadBarrierMarkRegX.Roland Levillain
* Boot image code size variation on Nexus 5X (aosp_bullhead-userdebug build):
  - total ARM64 framework Oat files size change: 115584120 bytes -> 109124728 bytes (-5.59%)
  - total ARM framework Oat files size change: 97387728 bytes -> 92517584 bytes (-5.00%)
Test: ART host and target (ARM, ARM64) tests. Bug: 29506760 Bug: 12687968 Change-Id: I979d9fb2b4e09f4c0c7bf33af2cd91750a67f989
2016-07-21Merge "Move caller-saves saving/restoring to ReadBarrierMarkRegX."Roland Levillain
2016-07-21Move caller-saves saving/restoring to ReadBarrierMarkRegX.Roland Levillain
Instead of saving/restoring live caller-save registers before/after the call to read barrier mark entry points ReadBarrierMarkRegX, have these entry points save/restore all the caller-save registers themselves (except register rX, which contains the return value). Also refactor the assembly code of these entry points using macros.
* Boot image code size variation on Nexus 5X (aosp_bullhead-userdebug build):
  - total ARM64 framework Oat files size change: 119196792 bytes -> 115575920 bytes (-3.04%)
  - total ARM framework Oat files size change: 100435212 bytes -> 97621188 bytes (-2.80%)
* Benchmarks (ARM64) score variations on Nexus 5X (aosp_bullhead-userdebug build):
  - RitzPerf (lower is better)
    - average score difference: -2.71%
  - CaffeineMark (higher is better)
    - no real difference for most tests (absolute variation lower than 1%)
    - better score on the "Method" benchmark: score variation 41253 -> 44891 (+8.82%)
Test: ART host and target (ARM, ARM64) tests. Bug: 29506760 Bug: 12687968 Change-Id: I881bf73139a3f1c2bee9ffc6fc8c00f9a392afa6
2016-07-21Merge "ARM: Port instr simplification of array accesses."Vladimir Marko
2016-07-21ARM: Port instr simplification of array accesses.Artem Serov
After changing the addressing mode for array accesses (in https://android-review.googlesource.com/248406) the 'add' instruction that calculates the base address for the array can be shared across accesses to the same array.
Before https://android-review.googlesource.com/248406:
    add IP, r[Array], r[Index0], LSL #2
    ldr r0, [IP, #12]
    add IP, r[Array], r[Index1], LSL #2
    ldr r0, [IP, #12]
Before this CL:
    add IP, r[Array], #12
    ldr r0, [IP, r[Index0], LSL #2]
    add IP, r[Array], #12
    ldr r0, [IP, r[Index1], LSL #2]
After this CL:
    add IP, r[Array], #12
    ldr r0, [IP, r[Index0], LSL #2]
    ldr r0, [IP, r[Index1], LSL #2]
Link to the original optimization: https://android-review.googlesource.com/#/c/127310/ Test: Run ART test suite on Nexus 6. Change-Id: Iee26f9a0a7ca46abb90e3f60d19d22dc8dee4d8f
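The address arithmetic behind the listings above can be checked with a small model. The function names are hypothetical; the constants 12 and 2 are simply the `#12` data offset and `LSL #2` shift from the listings. The point is that the intermediate value `array + 12` does not depend on the index, which is what allows a single 'add' to serve several loads from the same array.

```cpp
#include <cstdint>

constexpr uint32_t kDataOffset = 12;   // '#12' in the listings above
constexpr uint32_t kElementShift = 2;  // 'LSL #2', i.e. 4-byte elements

// Old form: add IP, array, index, LSL #2 ; ldr r0, [IP, #12]
// The intermediate value in IP depends on the index, so it cannot be reused.
constexpr uint32_t OldElementAddress(uint32_t array, uint32_t index) {
  return (array + (index << kElementShift)) + kDataOffset;
}

// New form, step 1: add IP, array, #12
// This depends only on the array, so one 'add' can serve many accesses.
constexpr uint32_t IntermediateAddress(uint32_t array) {
  return array + kDataOffset;
}

// New form, step 2: ldr r0, [IP, index, LSL #2]
constexpr uint32_t NewElementAddress(uint32_t intermediate, uint32_t index) {
  return intermediate + (index << kElementShift);
}
```

Both forms compute the same element address; only the order of the two additions differs.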
2016-07-21Merge "Revert "Revert "Refactor GetIMTIndex"""Treehugger Robot
2016-07-20Merge "ART: Change return types of field access entrypoints"Vladimir Marko
2016-07-20ART: Change return types of field access entrypointsAndreas Gampe
Ensure that return types guarantee full-width data as the compiled code and mterp expect by using size_t and ssize_t. This fixes Clang no longer sign-/zero-extending small return types. Bug: 30232671 Test: m ART_TEST_RUN_TEST_NDEBUG=true ART_TEST_INTERPRETER=true test-art-host-run-test Change-Id: Ic505befc6c94e2dccbc8abf2b13d4c2d662e68d1
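To illustrate the point about full-width return values, here is a minimal sketch with hypothetical function names (the real entrypoints have different names and signatures). Returning a register-wide type makes the sign or zero extension an explicit part of the callee, instead of relying on the compiler to extend a narrow return type.

```cpp
#include <cstddef>
#include <cstdint>
#include <sys/types.h>  // ssize_t (POSIX)

// Hypothetical getter for a signed 8-bit field: converting to ssize_t
// sign-extends the value to the full register width.
ssize_t GetByteFieldWidened(const int8_t* field) {
  return *field;
}

// Hypothetical getter for an unsigned 8-bit field: converting to size_t
// zero-extends the value to the full register width.
size_t GetBooleanFieldWidened(const uint8_t* field) {
  return *field;
}
```

A caller that reads the whole return register then sees a well-defined value in the upper bits in both cases.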
2016-07-20ARM: Change mem address mode for array accesses.Artem Serov
Switch from:
    add IP, r[Array], r[Index], LSL #2
    ldr r0, [IP, #12]
To:
    add IP, r[Array], #12
    ldr r0, [IP, r[Index], LSL #2]
This is a base for the future port of the TryExtractArrayAccessAddress optimization to ARM.
Test: aosp_shamu-userdebug boots and passes "m test-art-target". Change-Id: I6ab01ba3271a8f79599ddd91a6b63cd1b37d2d67
2016-07-19Revert "Revert "Refactor GetIMTIndex""Matthew Gharrity
Originally reverted in order to revert https://android-review.googlesource.com/#/c/244190/ but can now be merged again. This reverts commit d4ceecc85a5aab2ec23ea1bd010692ba8c8aaa0c. Test: m test-art-host Change-Id: Id9205f2b77a378fc0f06088e78c66e81a49f712d
2016-07-14Merge "Introduce more compact ReadBarrierMark slow-paths."Roland Levillain
2016-07-13Fix a bug in ClassTableGet code generation for IMTs.Nicolas Geoffray
Introduced by: https://android-review.googlesource.com/#/c/244980/ Test: 566-polymorphic-inlining, for fixing the x86 crash. Also fixes a performance regression. Bug: 29188168 Change-Id: Id90cb819c88e7ba3db1cb3c50c517a112ab7d784
2016-07-13Introduce more compact ReadBarrierMark slow-paths.Roland Levillain
Replace entry point ReadBarrierMark with 32 ReadBarrierMarkRegX entry points, using register number X as input and output (instead of the standard runtime calling convention) to save two moves in Baker's read barrier mark slow-path code. Test: ART host and target (ARM, ARM64) tests. Bug: 29506760 Bug: 12687968 Change-Id: I73cfb82831cf040b8b018e984163c865cc44ed87
2016-07-12Merge "ARM: Shorter fast-path for read barrier field load."Vladimir Marko
2016-07-12Merge "Rename kCall to kCallOnMainOnly"Roland Levillain
2016-07-12ARM: Shorter fast-path for read barrier field load.Vladimir Marko
Reduces the aosp_hammerhead-userdebug boot.oat by 2.2MiB, i.e. ~2.2%, in the ART_USE_READ_BARRIER=true configuration. Test: Tested with ART_USE_READ_BARRIER=true on Nexus 5. Bug: 29966877 Bug: 12687968 Change-Id: I4454150003e12a1aa7f0cf451627dc1ee9a495ae
2016-07-11Rename kCall to kCallOnMainOnlySerban Constantinescu
This patch renames kCall to kCallOnMainOnly in preparation for the next patch in this series which will be adding kCallOnMainAndSlowPath. Note: With this patch there will be places where we use kCallOnMainOnly even though we call on the slow path too. The next patch in this series will fix that. Test: ART host tests. Change-Id: Iabfdb0901990d163be5d780f3bdd2fab6fa17b32
2016-07-08Merge "Revert "Revert "Optimize IMT"""Jeff Hao
2016-07-07ARM: Remove unnecessary VMOV from float/double-to-int.Vladimir Marko
Test: Run standard ART test suite on Nexus 5. Change-Id: I780fd0cca68f89401d2a114e1022bed498d02979
2016-07-07Revert "Revert "Optimize IMT""Artem Udovichenko
This reverts commit 88f288e3564d79d87c0cd8bb831ec5a791ba4861. Change-Id: I49605d53692cbec1e2622e23ff2893fc51ed4115
2016-06-29Revert "Optimize IMT"Nicolas Geoffray
Bug: 29188168 (for initial CL) Bug: 29778499 (reason for revert) This reverts commit badee9820fcf5dca5f8c46c3215ae1779ee7736e. Change-Id: I32b8463122c3521e233c34ca95c96a5078e88848
2016-06-29Merge "Revert "Refactor GetIMTIndex""Nicolas Geoffray