path: root/compiler/optimizing/optimizing_compiler.cc
2018-02-21Add timestamps to JIT/DEX native debug info.David Srbecky
This is a forward-looking change intended to allow simpleperf to reliably correlate samples and native debug information. I have added the timestamps to both JIT and DEX, and refactored the code in the process to avoid code duplication. Test: testrunner.py -t 137 Change-Id: I45fa4310305aff540e036db9af15a86c5b8b7aff
2018-02-02Create list of open dex files for libbacktrace.David Srbecky
This fixes unwinds after recent changes (oob apks; cdex data sharing). Bug: 72520014 Test: m test-art-host-gtest Change-Id: Ie2a02657b2afbe899acd2e61f0a57d207e688b99
2018-01-24Log JIT mini-debug-info memory usage.David Srbecky
Change-Id: I0ffa6fb466b1635e724b0e782702303b92355408
2018-01-19Refactor jit debugger interface and its ELF creation.David Srbecky
Make it possible to store more than one method per entry, and ref-count the number of live methods per entry. Test: m test-art-host-gtest Change-Id: I45d69185e85e47fbee88a8d1f549ede9875a3c0a
2018-01-13Move debug info offsets into a side tableMathieu Chartier
Add a compact side table for figuring out the debug info offsets for a given method index. This reduces dex size by ~1.2%. The debug table is keyed by method index and stores LEB128-encoded debug info offsets. This means the table is smaller if debug infos are encoded in method index order. To prevent expansion for method indices without debug info, there is a bitmap that specifies whether a method index has a debug info offset. Motivation: Reduce code item size and allow more deduping in the future. Test: test-art-host Bug: 63756964 Change-Id: Ib983e85c1727f58c97676bde275f4a9756314da0
2018-01-13Revert "Revert "Move quickening info logic to its own table""Mathieu Chartier
Bug: 71605148 Bug: 63756964 Test: test-art-target on angler This reverts commit 6716941120ae9f47ba1b8ef8e79820c4b5640350. Change-Id: Ic01ea4e8bb2c1de761fab354c5bbe27290538631
2018-01-12Revert "Move quickening info logic to its own table"Nicolas Geoffray
Bug: 71605148 Bug: 63756964 Seems to fail on armv7. This reverts commit f5245188d9c61f6b90eb30cca0875fbdcc493b15. Change-Id: I37786c04a8260ae3ec4a2cd73710126783c3ae7e
2018-01-11Move quickening info logic to its own tableMathieu Chartier
Added a table that is indexed by dex method index. To prevent size overhead, there is only one slot for every 16 method indices, so up to 15 loop iterations may be needed to get the quickening info for a method. The quickening infos are now prefixed by a LEB128-encoded length. This allows methods that aren't quickened to have only 1.25 bytes of space overhead. The value was picked arbitrarily; there is little advantage to increasing it since the table currently takes only 1 byte per 4 method indices. JIT benchmarks do not regress with the change. There is a net space saving from removing 8 bytes from each quickening info since most scenarios have more quickened methods than compiled methods. For quick access to the table, a 4-byte preheader was added to each dex in the vdex file. Removed logic that stored the quickening info in the CodeItem debug_info_offset field. The change adds a small quicken table for each method index, which means that filters that don't quicken will see a slight increase in size. The worst case scenario is compiling all the methods, which results in a 0.3% larger vdex. The change also disables deduping since the quicken infos need to be in dex method index order. For filters that don't compile most methods, like quicken and speed-profile, there is a space saving. For quicken, the vdex is 2% smaller. Bug: 71605148 Bug: 63756964 Test: test-art-host Change-Id: I89cb679538811369c36b6ac8c40ea93135f813cd
2018-01-08Clean up CodeItemAccessors and Compact/StandardDexFileMathieu Chartier
Change constructor to use a reference to a dex file. Remove duplicated logic for GetCodeItemSize. Bug: 63756964 Test: test-art-host Change-Id: I69af8b93abdf6bdfa4454e16db8f4e75883bca46
2018-01-05Create dex subdirectoryDavid Sehr
Move all the DexFile related source to a common subdirectory dex/ of runtime. Bug: 71361973 Test: make -j 50 test-art-host Change-Id: I59e984ed660b93e0776556308be3d653722f5223
2017-12-22Make CodeItem fields privateMathieu Chartier
Make code item fields private and use accessors. Added a handful of friend classes to reduce the size of the change. Changed the default to be nullable and removed CreateNullable. CreateNullable was a bad API since it defaulted to the unsafe behavior; we may add a CreateNonNullable if it's important for performance. Motivation: Have a different layout for code items in cdex. Bug: 63756964 Test: test-art-host-gtest Test: test/testrunner/testrunner.py --host Test: art/tools/run-jdwp-tests.sh '--mode=host' '--variant=X32' --debug Change-Id: I42bc7435e20358682075cb6de52713b595f95bf9
2017-12-14Add mini-debug-info generation mode for JIT.David Srbecky
This excludes everything that is not needed for backtraces and compresses the resulting ELF file (wrapped in another ELF file). This approximately halves the size of the debug data for JIT. The vast majority of the data is the overhead of the ELF header. We could amortize this by storing more methods per ELF file. It also adds a NOBITS .text section to all debug ELF files, as that seems necessary for gdb to find the symbols. On the other hand, it removes .rodata from debug ELF files. Test: Manually tested that gdb can use this data to unwind. Test: m test-art-host-gtest Test: testrunner.py --optimizing --host Test: testrunner.py -t 137-cfi Change-Id: Ic0a2dfa953cb79973a7b2ae99d32018599e61171
2017-11-30Revert^4 "JIT JNI stubs."Vladimir Marko
The original CL, https://android-review.googlesource.com/513417 , has a bug fixed in the Revert^2, https://android-review.googlesource.com/550579 , and this Revert^4 adds two more fixes: - fix obsolete native method getting interpreter entrypoint in 980-redefine-object, - fix random JIT GC flakiness in 667-jit-jni-stub. Test: testrunner.py --host --prebuild --no-relocate \ --no-image --jit -t 980-redefine-object Bug: 65574695 Bug: 69843562 This reverts commit 056d7756152bb3ced81dd57781be5028428ce2bd. Change-Id: Ic778686168b90e29816fd526e23141dcbe5ea880
2017-11-30Revert "Revert "Revert "JIT JNI stubs."""Nicolas Geoffray
Still seeing occasional failures on 667-jit-jni-stub Bug: 65574695 Bug: 69843562 This reverts commit e7441631a11e2e07ce863255a59ee4de29c6a56f. Change-Id: I3db751679ef7bdf31c933208aaffe4fac749a14b
2017-11-29Revert "Revert "JIT JNI stubs.""Vladimir Marko
The original CL, https://android-review.googlesource.com/513417 , had a bug for class unloading where a read barrier was executed at the wrong time from ConcurrentCopying::MarkingPhase() -> ClassLinker::CleanupClassLoaders() -> ClassLinker::DeleteClassLoader() -> JitCodeCache::RemoveMethodsIn() -> JitCodeCache::JniStubKey::UpdateShorty() -> ArtMethod::GetShorty(). This has been fixed by removing sources of the read barrier from ArtMethod::GetShorty(). Test: testrunner.py --host --prebuild --jit --no-relocate \ --no-image -t 998-redefine-use-after-free Bug: 65574695 Bug: 69843562 This reverts commit 47d31853e16a95393d760e6be2ffeeb0193f94a1. Change-Id: I06e7a15b09d9ff11cde15a7d1529644bfeca15e0
2017-11-29Merge "Clean some dex2oat options."Nicolas Geoffray
2017-11-28Revert "JIT JNI stubs."Vladimir Marko
Seems to break 998-redefine-use-after-free in some --no-image configuration. Bug: 65574695 Bug: 69843562 This reverts commit 3417eaefe4e714c489a6fb0cb89b4810d81bdf4d. Change-Id: I2dd157b931c17c791522ea2544c1982ed3519b86
2017-11-28Clean some dex2oat options.Nicolas Geoffray
Remove dump-passes inherited from Quick days, and move dump-timings and dump-stats to CompilerStats. Test: test.py Change-Id: Ie79be858a141e59dc0b2a87d8cb5a5248a5bc7af
2017-11-28JIT JNI stubs.Vladimir Marko
Allow the JIT compiler to compile JNI stubs and make sure they can be collected once they are not in use anymore. Test: 667-jit-jni-stub Test: Pixel 2 XL boots. Test: m test-art-host-gtest Test: testrunner.py --host --jit Test: testrunner.py --target --jit Bug: 65574695 Change-Id: Idf81f50bcfa68c0c403ad2b49058be62b21b7b1f
2017-11-28ART: Minor refactoring of JNI stub compilation.Vladimir Marko
Introduce JniCompiledMethod to avoid JNI compiler dependency on CompiledMethod. This is done in preparation for compiling JNI stubs in JIT as the CompiledMethod class should be used exclusively for AOT compilation. Test: m test-art-host-gtest Bug: 65574695 Change-Id: I1d047d4aebc55057efb7ed3d39ea65600f5fb6ab
2017-11-27Fix stats reporting over 100% methods compiled.Vladimir Marko
Add statistics for intrinsic and native stub compilation and JIT failing to allocate memory for committing the code. Clean up recording of compilation statistics. New statistics when building aosp_taimen-userdebug boot image with --dump-stats: Attempted compilation of 94304 methods: 99.99% (94295) compiled. OptStat#AttemptBytecodeCompilation: 89487 OptStat#AttemptIntrinsicCompilation: 160 OptStat#CompiledNativeStub: 4733 OptStat#CompiledIntrinsic: 84 OptStat#CompiledBytecode: 89478 ... where 94304=89487+4733+84 and 94295=89478+4733+84. Test: testrunner.py -b --host --optimizing Test: Manually inspect output of building boot image with --dump-stats. Bug: 69627511 Change-Id: I15eb2b062a96f09a7721948bcc77b83ee4f18efd
2017-11-20Refactored optimization passes setup.Aart Bik
Rationale: Refactors the way we set up optimization passes in the compiler into a more centralized approach. The refactoring also found some "holes" in the existing mechanism (a missing string lookup in the debugging mechanism, and the inability to set an alternative name for optimizations that may repeat). Bug: 64538565 Test: test-art-host test-art-target Change-Id: Ie5e0b70f67ac5acc706db91f64612dff0e561f83
2017-11-15Use intrinsic codegen for compiling intrinsic methods.Vladimir Marko
When compiling an intrinsic method, generate a graph that invokes the same method and try to compile it. If the call is actually intrinsified (or simplified to other HIR) and yields a leaf method, use the result of this compilation attempt, otherwise compile the actual code or JNI stub. Note that CodeGenerator::CreateThrowingSlowPathLocations() actually marks the locations as kNoCall if the throw is not in a catch block, thus considering some throwing methods (for example, String.charAt()) as leaf methods. We would ideally want to use the intrinsic codegen for all intrinsics that do not generate a slow-path call to the default implementation. Relying on the leaf method is suboptimal as we're missing out on methods that do other types of calls, for example runtime calls. This shall be fixed in a subsequent CL. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 67717501 Change-Id: I640fda7c22d4ff494b5ff77ebec3b7f5f75af652
2017-11-14Remove kIsVdexEnabled.Nicolas Geoffray
It is now always assumed there is one. Test: test.py Change-Id: I8f3f5c722fb8c4a0f9ad8ea685d1a956bd0ac9ae
2017-11-10Record @{Fast,Critical}Native in method's access flags.Vladimir Marko
Repurpose the old kAccFastNative flag (which wasn't actually used for some time) and define a new kAccCriticalNative flag to record the native method's annotation-based kind. This avoids repeated determination of the kind from GenericJNI. And making two transitions to runnable and back (using the ScopedObjectAccess) from GenericJniMethodEnd() for normal native methods just to determine that we need to transition to runnable was really weird. Since the IsFastNative() function now records the presence of the @FastNative annotation, synchronized @FastNative method calls now avoid thread state transitions. When initializing the Runtime without a boot image, the WellKnownClasses may not yet be initialized, so relax the DCheckNativeAnnotation() to take that into account. Also revert https://android-review.googlesource.com/509715 as the annotation checks are now much faster. Bug: 65574695 Bug: 35644369 Test: m test-art-host-gtest Test: testrunner.py --host Change-Id: I2fc5ba192b9ce710a0e9202977b4f9543e387efe
2017-11-02ART: Make InstructionSet an enum class and add kLast.Vladimir Marko
Adding InstructionSet::kLast shall make it easier to encode the InstructionSet in fewer bits using BitField<>. However, introducing `kLast` into the `art` namespace is not a good idea, so we change the InstructionSet to an enum class. This also uncovered a case of InstructionSet::kNone being erroneously used instead of vixl32::Condition::None(), so it's good to remove `kNone` from the `art` namespace. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I6fa6168dfba4ed6da86d021a69c80224f09997a6
2017-10-17Use ScopedArenaAllocator for code generation.Vladimir Marko
Reuse the memory previously allocated on the ArenaStack by optimization passes. This CL handles only the architecture-independent codegen and slow paths; architecture-dependent codegen allocations shall be moved to the ScopedArenaAllocator in a follow-up. Memory needed to compile the two most expensive methods for aosp_angler-userdebug boot image: BatteryStats.dumpCheckinLocked() : 19.6MiB -> 18.5MiB (-1189KiB) BatteryStats.dumpLocked(): 39.3MiB -> 37.0MiB (-2379KiB) Also move definitions of functions that use bit_vector-inl.h from bit_vector.h to bit_vector-inl.h. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 64312607 Change-Id: I84688c3a5a95bf90f56bd3a150bc31fedc95f29c
2017-10-11Use ScopedArenaAllocator for building HGraph.Vladimir Marko
Memory needed to compile the two most expensive methods for aosp_angler-userdebug boot image: BatteryStats.dumpCheckinLocked() : 21.1MiB -> 20.2MiB BatteryStats.dumpLocked(): 42.0MiB -> 40.3MiB This is because all the memory previously used by the graph builder is reused by later passes. And finish the "arena"->"allocator" renaming; make renamed allocator pointers that are members of classes const when appropriate (and make a few more members around them const). Test: m test-art-host-gtest Test: testrunner.py --host Bug: 64312607 Change-Id: Ia50aafc80c05941ae5b96984ba4f31ed4c78255e
2017-10-09Use ScopedArenaAllocator for register allocation.Vladimir Marko
Memory needed to compile the two most expensive methods for aosp_angler-userdebug boot image: BatteryStats.dumpCheckinLocked() : 25.1MiB -> 21.1MiB BatteryStats.dumpLocked(): 49.6MiB -> 42.0MiB This is because all the memory previously used by Scheduler is reused by the register allocator; the register allocator has a higher peak usage of the ArenaStack. And continue the "arena"->"allocator" renaming. Test: m test-art-host-gtest Test: testrunner.py --host Bug: 64312607 Change-Id: Idfd79a9901552b5147ec0bf591cb38120de86b01
2017-10-06ART: Use ScopedArenaAllocator for pass-local data.Vladimir Marko
Passes using local ArenaAllocator were hiding their memory usage from the allocation counting, making it difficult to track down where memory was used. Using ScopedArenaAllocator reveals the memory usage. This changes the HGraph constructor which requires a lot of changes in tests. Refactor these tests to limit the amount of work needed the next time we change that constructor. Test: m test-art-host-gtest Test: testrunner.py --host Test: Build with kArenaAllocatorCountAllocations = true. Bug: 64312607 Change-Id: I34939e4086b500d6e827ff3ef2211d1a421ac91a
2017-10-05MIPS32R2: Share address computationLena Djokic
For array accesses the element address has the following structure: Address = CONST_OFFSET + base_addr + index << ELEM_SHIFT The address part (index << ELEM_SHIFT) can be shared across array accesses with the same data type and index. For example, in the following loop 5 accesses can share address computation: void foo(int[] a, int[] b, int[] c) { for (i...) { a[i] = a[i] + 5; b[i] = b[i] + c[i]; } } Test: test-art-host, test-art-target Change-Id: Id09fa782934aad4ee47669275e7e1a4d7d23b0fa
2017-09-27Enables GVN for x86 and x86_64.Aart Bik
Rationale: As decided after the MIPS change, this change unifies our six code generators again a bit (we cannot move it into the generic path, since arm likes to run the simplifier first). Generally the GVN does some last minute cleanup (such as finding CSE in the runtime tests generated by dynamic BCE). I started a golem run to find impact. Test: test-art-host test-art-target Change-Id: Ib4098c5bae2269e71fee95cc31e3662d3aa47f6a
2017-09-27Merge "Enables GVN for MIPS32 and MIPS64."Treehugger Robot
2017-09-25ART: Introduce compiler data type.Vladimir Marko
Replace most uses of the runtime's Primitive in compiler with a new class DataType. This prepares for introducing new types, such as Uint8, that the runtime does not need to know about. Test: m test-art-host-gtest Test: testrunner.py --host Bug: 23964345 Change-Id: Iec2ad82454eec678fffcd8279a9746b90feb9b0c
2017-09-21Enables GVN for MIPS32 and MIPS64.Lena Djokic
Test: mma test-art-host-gtest Test: mma test-art-target-gtest in QEMU Test: ./testrunner.py --target --optimizing in QEMU Change-Id: Ie3c6b29b9125ff8aef888c3574bdb0ab96574bd4
2017-09-20Refactor compiled_method.h .Vladimir Marko
Move LinkerPatch to compiler/linker/linker_patch.h . Move SrcMapElem to compiler/debug/src_map_elem.h . Introduce compiled_method-inl.h to reduce the number of `#include`s in compiled_method.h . Test: m test-art-host-gtest Test: testrunner.py --host Change-Id: Id211cdf94a63ad265bf4709f1a5e06dffbe30f64
2017-09-20Refactor linker files from compiler/ to dex2oat/.Vladimir Marko
This shifts some code from the libart-compiler.so to dex2oat and reduces memory needed for JIT. We also avoid loading the libart-dexlayout.so for JIT but the memory savings are minimal (one shared clean page, two shared dirty pages and some per-app kernel mmap data) as the code has never been needed in memory by JIT. aosp_angler-userdebug file sizes (stripped): lib64/libart-compiler.so: 2989112 -> 2671888 (-310KiB) lib/libart-compiler.so: 2160816 -> 1939276 (-216KiB) bin/dex2oat: 141868 -> 368808 (+222KiB) LOAD/executable elf mapping sizes: lib64/libart-compiler.so: 2866308 -> 2555500 (-304KiB) lib/libart-compiler.so: 2050960 -> 1834836 (-211KiB) bin/dex2oat: 129316 -> 345916 (+212KiB) Test: m test-art-host-gtest Test: testrunner.py --host Test: cd art/; mma; cd - Change-Id: If62f02847a6cbb208eaf7e1f3e91af4663fa4a5f
2017-09-18ART: Remove old codeAndreas Gampe
Remove unused Quick compiler flag. Remove support for arm32 soft-float code (which is no longer supported by our compiler). Test: m Change-Id: I38b16291d90094dbf26776923a46afbf8de53f20
2017-09-18Add debug info for link-time generated thunks.Vladimir Marko
Add debug info for method call thunks (currently unused) and Baker read barrier thunks. Refactor debug info generation for trampolines and record their sizes; change their names to start with upper-case letters, so that they can be easily generated as `#fn_name`. This improved debug info must be generated by `dex2oat -g`, the debug info generated by `oatdump --symbolize` remains the same as before, except for the renamed trampolines and an adjustment for "code delta", i.e. the Thumb mode bit. Cortex-A53 erratum 843419 workaround thunks are not covered by this CL. Test: Manual; run-test --gdb -Xcompiler-option -g 160, pull symbols for gdbclient, break in the introspection entrypoint, check that gdb knows the new symbols (and disassembles them) and `backtrace` works when setting $pc to an address in the thunk. Bug: 36141117 Change-Id: Id224b72cfa7a0628799c7db65e66e24c8517aabf
2017-09-08Merge "optimizing: add block-scoped constructor fence merging pass"Treehugger Robot
2017-09-08optimizing: add block-scoped constructor fence merging passIgor Murashkin
Introduce a new "Constructor Fence Redundancy Elimination" pass. The pass currently performs local optimization only, i.e. within instructions in the same basic block. All constructor fences preceding a publish (e.g. store, invoke) get merged into one instruction. ============== OptStat#ConstructorFenceGeneratedNew: 43825 OptStat#ConstructorFenceGeneratedFinal: 17631 <+++ OptStat#ConstructorFenceRemovedLSE: 164 OptStat#ConstructorFenceRemovedPFRA: 9391 OptStat#ConstructorFenceRemovedCFRE: 16133 <--- Removes ~91.5% of the 'final' constructor fences in RitzBenchmark: (We do not distinguish the exact reason that a fence was created, so it's possible some "new" fences were also removed.) ============== Test: art/test/run-test --host --optimizing 476-checker-ctor-fence-redun-elim Bug: 36656456 Change-Id: I8020217b448ad96ce9b7640aa312ae784690ad99
2017-09-06Pass stats into the loop optimization phase.Aart Bik
Test: market scan. Change-Id: I58b23b8d254883f30619ea3602d34bf93618d432
2017-08-14Merge "RFC: Generate select instruction for conditional returns."Nicolas Geoffray
2017-08-11Merge changes Ic119441c,I83b96b41Treehugger Robot
* changes: optimizing: Add statistics for # of constructor fences added/removed optimizing: Refactor statistics to use OptimizingCompilerStats helper
2017-08-11optimizing: Add statistics for # of constructor fences added/removedIgor Murashkin
Statistics are attributed as follows: Added because: * HNewInstances requires a HConstructorFence following it. * HReturn requires a HConstructorFence (for final fields) preceding it. Removed because: * Optimized in Load-Store-Elimination. * Optimized in Prepare-For-Register-Allocation. Test: art/test.py Bug: 36656456 Change-Id: Ic119441c5151a5a840fc6532b411340e2d68e5eb
2017-08-11optimizing: Refactor statistics to use OptimizingCompilerStats helperIgor Murashkin
Remove all copies of 'MaybeRecordStat', replacing them with a single OptimizingCompilerStats::MaybeRecordStat helper. Change-Id: I83b96b41439dccece3eee2e159b18c95336ea933
2017-08-10Instrument ARM64 generated code to check the Marking Register.Roland Levillain
Generate run-time code in the Optimizing compiler checking that the Marking Register's value matches `self.tls32_.is.gc_marking` in debug mode (on target; and on host with JIT, or with AOT when compiling the core image). If a check fails, abort. Test: m test-art-target Test: m test-art-target with tree built with ART_USE_READ_BARRIER=false Test: ARM64 device boot test with libartd. Bug: 37707231 Change-Id: Ie9b322b22b3d26654a06821e1db71dbda3c43061
2017-08-10RFC: Generate select instruction for conditional returns.Mads Ager
The select generator currently only inserts select instructions if there is a diamond shape with a phi. This change extends the select generator to also deal with the pattern: if (condition) { movable instruction 0 return value0 } else { movable instruction 1 return value1 } which it turns into: movable instruction 0 movable instruction 1 return select (value0, value1, condition) Test: 592-checker-regression-bool-input Change-Id: Iac50fb181dc2c9b7619f28977298662bc09fc0e1
2017-08-10Revert recent JIT code cache changesOrion Hodson
Flakiness observed on the bots. Revert "Jit Code Cache instruction pipeline flushing" This reverts commit 56fe32eecd4f25237e66811fd766355a07908d22. Revert "ARM64: More JIT Code Cache maintenace" This reverts commit 17272ab679c9b5f5dac8754ac070b78b15271c27. Revert "ARM64: JIT Code Cache maintenance" This reverts commit 3ecac070ad55d433bbcbe11e21f4b44ab178effe. Revert "Change flush order in JIT code cache" This reverts commit 43ce5f82dae4dc5eebcf40e54b81ccd96eb5fba3. Revert "Separate rw from rx views of jit code cache" This reverts commit d1dbb74e5946fe6c6098a541012932e1e9dd3115. Test: art/test.py --target --64 Bug: 64527643 Bug: 62356545 Change-Id: Ifa10ac77a60ee96e8cb68881bade4d6b4f828714
2017-07-25Merge "Jit Code Cache instruction pipeline flushing"Treehugger Robot