path: root/compiler/optimizing/code_generator_arm.cc
Age / Commit message / Author
2016-01-14Merge "Implement irreducible loop support in optimizing."Nicolas Geoffray
2016-01-14ART: Remove incorrect HFakeString optimizationDavid Brazdil
Simplification of HFakeString assumes that it cannot be used until String.<init> is called which is not true and causes different behaviour between the compiler and the interpreter. This patch removes the optimization together with the HFakeString instruction. Instead, HNewInstance is generated and an empty String allocated until it is replaced with the result of the StringFactory call. This is consistent with the behaviour of the interpreter but is too conservative. A follow-up CL will attempt to optimize out the initial allocation when possible. Bug: 26457745 Bug: 26486014 Change-Id: I7139e37ed00a880715bfc234896a930fde670c44
2016-01-14Implement irreducible loop support in optimizing.Nicolas Geoffray
So we don't fall back to the interpreter in the presence of irreducible loops.
Implications:
- A loop pre-header does not necessarily dominate a loop header.
- Non-constant redundant phis will be kept in loop headers, to satisfy our linear scan register allocation algorithm.
- while-graph optimizations, such as gvn, licm, lse, and dce need to know when they are dealing with irreducible loops.
Change-Id: I2cea8934ce0b40162d215353497c7f77d6c9137e
2016-01-12Reduce code size by sharing slow paths.Aart Bik
Rationale: Sharing identical slow path code reduces code size.
Background: Currently, slow paths with the same dex-pc, same physical register spilling code, and identical stack maps are shared (making this only useful for deopt slow paths). The newly introduced mechanism is sufficiently general to allow future improvements by e.g. allowing different dex-pc (by passing this to runtime) or even the kind of slow paths (by passing runtime addresses to the slowpath).
Change-Id: I819615c47b4fd98440a241f681f93e4fc22d12e0
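The sharing rule described above (same dex-pc, same spill code, identical stack map) can be sketched as a lookup keyed on those three properties. This is only an illustration under assumptions: `SlowPathPool`, `GetOrCreate`, the spill mask, and the byte-vector stack map are hypothetical names and representations, not the ART data structures.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical pool (not ART's API): hands out one slow path id per distinct
// (dex_pc, spill_mask, stack_map) triple, so identical requests share code.
class SlowPathPool {
 public:
  std::size_t GetOrCreate(uint32_t dex_pc,
                          uint32_t spill_mask,
                          const std::vector<uint8_t>& stack_map) {
    auto key = std::make_tuple(dex_pc, spill_mask, stack_map);
    auto it = index_.find(key);
    if (it != index_.end()) {
      return it->second;  // An identical slow path already exists: reuse it.
    }
    std::size_t id = index_.size();  // Otherwise a new slow path is needed.
    index_.emplace(std::move(key), id);
    return id;
  }

  std::size_t size() const { return index_.size(); }

 private:
  std::map<std::tuple<uint32_t, uint32_t, std::vector<uint8_t>>, std::size_t> index_;
};
```

In this model, two requests with the same key get the same id, and changing any one component of the key produces a new slow path.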
2016-01-12Merge "Optimizing/ARM: Fix CmpConstant()."Vladimir Marko
2016-01-11Merge "Generate Nops to ensure that debug stack maps have distinct PC."David Srbecky
2016-01-11Merge "Don't use std::abs on INT_MIN/LONG_MIN, it's undefined."Nicolas Geoffray
2016-01-11Generate Nops to ensure that debug stack maps have distinct PC.David Srbecky
Change-Id: I5740ec958a20d236634b66df0e675382ed5c16fc
2016-01-11Don't use std::abs on INT_MIN/LONG_MIN, it's undefined.Nicolas Geoffray
bug:25494265 Change-Id: I560a3a589b92440020285f9adfdf7c9efb06217c
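For context, `std::abs(INT_MIN)` is undefined because the mathematical result does not fit in the signed type. One common way to avoid this, shown here as a sketch and not necessarily what the change itself does, is to compute the magnitude in unsigned arithmetic. The helper names `AbsOrMinU32` and `AbsOrMinU64` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helpers (not taken from the ART sources): return the magnitude
// of a signed value as an unsigned value. Unsigned negation wraps modulo 2^N,
// so the most negative input is handled without undefined behaviour.
inline uint32_t AbsOrMinU32(int32_t value) {
  uint32_t bits = static_cast<uint32_t>(value);
  return value < 0 ? (0u - bits) : bits;
}

inline uint64_t AbsOrMinU64(int64_t value) {
  uint64_t bits = static_cast<uint64_t>(value);
  return value < 0 ? (UINT64_C(0) - bits) : bits;
}
```

With these helpers, `INT32_MIN` maps to `0x80000000` instead of triggering undefined behaviour.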
2016-01-08Merge "Add a missing implicit null check in the ARM codegen."Roland Levillain
2016-01-08Small implicit null checks refactoring in the ARM codegen.Roland Levillain
Change-Id: I7dccb02cf7ac2f7d8fd1676b03e0b394701fbe3f
2016-01-08Add a missing implicit null check in the ARM codegen.Roland Levillain
The code generated for object ArraySet on ARM used to miss an implicit null check for the array when the assigned value is `null`. This has not been an actual issue so far, as ArraySet instructions have never been using implicit null checks. Note: This CL comes without a regression test, as the code path in question is not used (yet). Change-Id: If3bc85e32802595e635513dfb83ccfcfd8f00d3d
2016-01-08ARM Baker's read barrier fast path implementation.Roland Levillain
Introduce an ARM fast path implementation in Optimizing for Baker's read barriers (for both heap reference loads and GC root loads). The marking phase of the read barrier is performed by a slow path, invoking the runtime entry point artReadBarrierMark. Other read barrier algorithms continue to use the original slow path based implementation, which has been renamed as GenerateReadBarrierSlow/GenerateReadBarrierForRootSlow. Bug: 12687968 Change-Id: Ie7ee85b1b4c0564148270cebdd3cbd4c3da51b3a
2015-12-23Generate more stack maps during native debugging.David Srbecky
Generate an extra stack map at the start of each Java statement. The stack maps are later translated to DWARF, which allows LLDB to set breakpoints and view local variables. Change-Id: If00ab875513308e4a1399d1e12e0fe8934a6f0c3
2015-12-23Rewrite HInstruction::Is/As<type>().Vladimir Marko
Make Is<type>() and As<type>() non-virtual for concrete instruction types, relying on GetKind(), and mark GetKind() as PURE to improve optimization opportunities. This reduces the number of relocations in libart-compiler.so's .rel.dyn section by ~4K, or ~44%, and in .data.rel.ro by ~18K, or ~65%. The file is 96KiB smaller for Nexus 5, including 8KiB reduction of the .text section. Unfortunately, the g++/clang++ __attribute__((pure)) is not strong enough to avoid duplicated virtual calls and we would need the C++ [[pure]] attribute proposed in n3744 instead. To work around this deficiency, we introduce an extra non-virtual indirection for GetKind(), so that the compiler can optimize common expressions such as instruction->IsAdd() || instruction->IsSub() or instruction->IsAdd() && instruction->AsAdd()->... which contain two virtual calls to GetKind() after inlining. Change-Id: I83787de0671a5cb9f5b0a5f4a536cef239d5b401
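The pattern described above can be sketched roughly as follows. This is a reduced illustration with assumed names (`Instruction`, `Kind`, `GetKindInternal`), not the actual HInstruction hierarchy, and it leaves out the PURE attribute. The point is that `IsAdd()` and `AsAdd()` are non-virtual and go through a non-virtual `GetKind()` wrapper around the single virtual call.

```cpp
#include <cstdint>

enum class Kind : uint8_t { kAdd, kSub };

class Add;

class Instruction {
 public:
  virtual ~Instruction() = default;

  // Non-virtual wrapper: the extra indirection around the one virtual call.
  Kind GetKind() const { return GetKindInternal(); }

  // Non-virtual type queries built on GetKind().
  bool IsAdd() const { return GetKind() == Kind::kAdd; }
  bool IsSub() const { return GetKind() == Kind::kSub; }
  const Add* AsAdd() const;

 private:
  virtual Kind GetKindInternal() const = 0;
};

class Add final : public Instruction {
 public:
  Add(int lhs, int rhs) : lhs_(lhs), rhs_(rhs) {}
  int Evaluate() const { return lhs_ + rhs_; }

 private:
  Kind GetKindInternal() const override { return Kind::kAdd; }
  int lhs_;
  int rhs_;
};

class Sub final : public Instruction {
 private:
  Kind GetKindInternal() const override { return Kind::kSub; }
};

inline const Add* Instruction::AsAdd() const {
  // The kind check makes the downcast safe, so static_cast is sufficient.
  return IsAdd() ? static_cast<const Add*>(this) : nullptr;
}
```

An expression such as `insn.IsAdd() && insn.AsAdd()->Evaluate() == 5` then inlines to comparisons on the result of `GetKind()`.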
2015-12-22Optimizing/ARM: Fix CmpConstant().Vladimir Marko
CMN updates flags based on addition of its operands. Do not confuse the "N" suffix with bitwise inversion performed by MVN. Also add more special cases analogous to AddConstant() and use CmpConstant() more in code generator. Change-Id: I0d4571770a3f0fdf162e97d4bde56814098e7246
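The arithmetic behind this distinction can be checked with a small model. The sketch below only models the 32-bit result that the N and Z flags are derived from, not the carry or overflow flags. The function names are illustrative and are not taken from the ARM assembler.

```cpp
#include <cstdint>

// Value whose sign and zero-ness CMP reports: rn - imm (modulo 2^32).
inline uint32_t CmpResult(uint32_t rn, uint32_t imm) { return rn - imm; }

// Value whose sign and zero-ness CMN reports: rn + imm (modulo 2^32).
inline uint32_t CmnResult(uint32_t rn, uint32_t imm) { return rn + imm; }

// CMP rn, #c matches CMN rn, #(-c): the operand is negated, not inverted.
inline bool NegatedOperandMatches(uint32_t rn, uint32_t c) {
  return CmpResult(rn, c) == CmnResult(rn, 0u - c);
}

// Using the bitwise inverse (what MVN would produce) is off by one,
// because ~c == -c - 1.
inline bool InvertedOperandMatches(uint32_t rn, uint32_t c) {
  return CmpResult(rn, c) == CmnResult(rn, ~c);
}
```

For any `rn` and `c`, `NegatedOperandMatches` holds, while `InvertedOperandMatches` never does.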
2015-12-17Merge "Revert "Revert "ART: Reduce the instructions generated by packed switch."""Vladimir Marko
2015-12-17Revert "Revert "ART: Reduce the instructions generated by packed switch.""Vladimir Marko
This reverts commit b4c137630fd2226ad07dfd178ab15725374220f1. The underlying issue was fixed by https://android-review.googlesource.com/188271 . Bug: 26121945 Change-Id: I58b08eb1a9f0a5c861f8cda93522af64bcf63920
2015-12-16Merge "Revert "ART: Reduce the instructions generated by packed switch.""Nicolas Geoffray
2015-12-16Revert "ART: Reduce the instructions generated by packed switch."Nicolas Geoffray
This reverts commit 59f054d98f519a3efa992b1c688eb97bdd8bbf55. bug:26121945 Change-Id: I8a5ad7ef1f1de8d44787c27528fa3f7f5c2e9cd3
2015-12-11Optimizing: Clean up after HRor.Vladimir Marko
Change-Id: I96bd7fa2e8bdccb87a3380d063dad0dd57fed9d7
2015-12-11Merge "Replace rotate patterns and invokes with HRor IR."Vladimir Marko
2015-12-11Replace rotate patterns and invokes with HRor IR.Scott Wakeling
Replace constant and register version bitfield rotate patterns, and rotateRight/Left intrinsic invokes, with new HRor IR.
Where k is constant and r is a register, with the UShr and Shl on either side of a |, +, or ^, the following patterns are replaced:
  x >>> #k OP x << #(reg_size - k)
  x >>> #k OP x << #-k
  x >>> r OP x << (#reg_size - r)
  x >>> (#reg_size - r) OP x << r
  x >>> r OP x << -r
  x >>> -r OP x << r
Implemented for ARM/ARM64 & X86/X86_64. Tests changed to not be inlined to prevent optimization from folding them out. Additional tests added for constant rotate amounts.
Change-Id: I5847d104c0a0348e5792be6c5072ce5090ca2c34
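To see why these shift pairs amount to a rotate, the 32-bit case can be modelled as below. This is a sketch with made-up helper names. It assumes Java shift semantics (the distance is masked to 5 bits for 32-bit values) and only covers the `|` form of OP.

```cpp
#include <cstdint>

// Reference rotate right, with the amount taken modulo 32.
inline uint32_t Ror32(uint32_t x, uint32_t amount) {
  amount &= 31u;
  if (amount == 0u) return x;  // Avoid an undefined C++ shift by 32.
  return (x >> amount) | (x << (32u - amount));
}

// Java-style shifts: only the low 5 bits of the distance are used.
inline uint32_t JavaUShr(uint32_t x, uint32_t distance) { return x >> (distance & 31u); }
inline uint32_t JavaShl(uint32_t x, uint32_t distance) { return x << (distance & 31u); }

// Pattern "x >>> r | x << (reg_size - r)" with reg_size == 32.
inline uint32_t RotatePatternSub(uint32_t x, uint32_t r) {
  return JavaUShr(x, r) | JavaShl(x, 32u - r);
}

// Pattern "x >>> r | x << -r"; unsigned negation stands in for Java's -r.
inline uint32_t RotatePatternNeg(uint32_t x, uint32_t r) {
  return JavaUShr(x, r) | JavaShl(x, 0u - r);
}
```

Both patterns agree with `Ror32` because `32 - r` and `-r` are congruent modulo 32 once the shift distance is masked.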
2015-12-10Don't generate a slow path for strings in the dex cache.Nicolas Geoffray
Change-Id: I1d258f1a89bf0ec7c7ddd134be9215d480f0b09a
2015-12-08ART: Reduce the instructions generated by packed switch.Zheng Xu
Implement Vladimir Marko's suggestion. The new compare/jump series reduces the number of instructions from (2*n+1) to (1.5*n+3).
- Generate normal compare/jump series when numEntries <= 3.
- Generate optimal compare/jump series when numEntries <= threshold.
- Generate jump tables otherwise.
Change-Id: I425547b6787057c7fa84e71f17c145b63b208633
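The selection rule and the two instruction-count estimates quoted above can be written down as a small sketch. The enum, the function names, and the idea of passing the threshold as a parameter are assumptions made for illustration. The message does not state the threshold value, so none is assumed here.

```cpp
#include <cstdint>

enum class SwitchStrategy { kPlainCompareJump, kPairedCompareJump, kJumpTable };

// Mirrors the three cases listed in the message; `threshold` is a parameter
// here because its actual value is not given in the text.
inline SwitchStrategy ChooseStrategy(uint32_t num_entries, uint32_t threshold) {
  if (num_entries <= 3u) return SwitchStrategy::kPlainCompareJump;
  if (num_entries <= threshold) return SwitchStrategy::kPairedCompareJump;
  return SwitchStrategy::kJumpTable;
}

// Estimates from the message, doubled so that 1.5*n stays an integer:
// 2 * (2*n + 1) for the plain series and 2 * (1.5*n + 3) for the new one.
inline uint32_t PlainCountTimes2(uint32_t n) { return 2u * (2u * n + 1u); }
inline uint32_t PairedCountTimes2(uint32_t n) { return 3u * n + 6u; }
```

By these estimates the two series break even at n = 4, and the new series is shorter for any larger n.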
2015-12-02Revert "Revert "Don't use the compiler driver for method resolution.""Nicolas Geoffray
This reverts commit c88ef3a10c474045a3476a02ae75d07ddd3230b7. Change-Id: I0ed88a48b313a8d28bc39fae40631123aadb13ef
2015-12-01Merge "Revert "Don't use the compiler driver for method resolution.""Nicolas Geoffray
2015-12-01Revert "Don't use the compiler driver for method resolution."Nicolas Geoffray
Fails 425 in debuggable mode. This reverts commit 4db0bf9c4db6a09716c3388b7d2f88d534470339. Change-Id: I346df8f75674564fc4fb241c60f23e250fc7f0a7
2015-12-01Merge "Don't use the compiler driver for method resolution."Nicolas Geoffray
2015-12-01Don't use the compiler driver for method resolution.Nicolas Geoffray
The compiler driver makes assumptions that don't hold for the optimizing compiler, and will for example always go to slow path for an invoke-super when there's no verified method. Also fix GenerateInvokeVirtual in the presence of intrinsics. Next change will address some of the TODOs in sharpening.cc. Change-Id: I2b0e543ee9b9bebcadb2d26de29e850c59ad58b9
2015-12-01Optimizing/ARM: Implement kDexCachePcRelative dispatch.Vladimir Marko
Change-Id: I0fe2da50a30a3f62bec8ea01688dd1fec84b1831
2015-11-24Optimize HLoadClass when we know the class is in the cache.Nicolas Geoffray
Change-Id: Iaa74591eed0f2eabc9ba9f9988681d9582faa320
2015-11-24Revamp art::CheckEntrypointTypes uses.Roland Levillain
Change-Id: I6e13e594539e766ed94524ac3282cec292ba91da
2015-11-23Clean up read barrier related comments in Optimizing.Roland Levillain
Bug: 12687968 Change-Id: Idf2e371e01e10d9d32c95b150735e2c96244232e
2015-11-23Merge "Optimizing/ARM: Improve long shifts by 1."Vladimir Marko
2015-11-23Merge "Explicitly add HLoadClass/HClinitCheck for HNewInstance."Nicolas Geoffray
2015-11-20Explicitly add HLoadClass/HClinitCheck for HNewInstance.Nicolas Geoffray
bug:25735083 bug:25173758 Change-Id: Ie81cfa4fa9c47cc025edb291cdedd7af209a03db
2015-11-20Optimizing/ARM: Improve long shifts by 1.Vladimir Marko
Implement long Shl(x,1) as LSLS+ADC, Shr(x,1) as ASR+RRX and UShr(x,1) as LSR+RRX. Remove the simplification substituting Shl(x,1) with ADD(x,x) as it interferes with some other optimizations instead of helping them. And since it didn't help 64-bit architectures anyway, codegen is the correct place for it. This is now implemented for ARM and x86, so only mips32 can be improved. Change-Id: Idd14f23292198b2260189e1497ca5411b21743b3
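The two-instruction sequences named above can be modelled on a pair of 32-bit halves. This is a sketch, not the generated code: `Pair32` and the helper names are hypothetical, and the model assumes the first instruction of each pair sets the carry flag that the second one consumes.

```cpp
#include <cstdint>

struct Pair32 {
  uint32_t lo;
  uint32_t hi;
};

inline Pair32 Split(uint64_t v) {
  return Pair32{static_cast<uint32_t>(v), static_cast<uint32_t>(v >> 32)};
}

inline uint64_t Join(Pair32 p) {
  return (static_cast<uint64_t>(p.hi) << 32) | p.lo;
}

// Shl(x, 1): LSLS lo, lo, #1 puts the old bit 31 of lo into the carry;
// ADC hi, hi, hi then computes hi + hi + carry.
inline Pair32 ShlBy1(Pair32 x) {
  uint32_t carry = x.lo >> 31;
  return Pair32{x.lo << 1, x.hi + x.hi + carry};
}

// Shr(x, 1): an arithmetic shift of hi puts its old bit 0 into the carry;
// RRX lo shifts lo right by one and inserts the carry at bit 31.
inline Pair32 ShrBy1(Pair32 x) {
  uint32_t carry = x.hi & 1u;
  uint32_t hi = (x.hi >> 1) | (x.hi & 0x80000000u);  // Keep the sign bit.
  return Pair32{(carry << 31) | (x.lo >> 1), hi};
}

// UShr(x, 1): same as above, but with a logical shift of hi.
inline Pair32 UShrBy1(Pair32 x) {
  uint32_t carry = x.hi & 1u;
  return Pair32{(carry << 31) | (x.lo >> 1), x.hi >> 1};
}
```

Each helper can be compared against the corresponding 64-bit shift by one using `Split` and `Join`.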
2015-11-19Clean up the special input in HInvokeStaticOrDirect.Vladimir Marko
Change-Id: I4042aefbdac1a8c236d00e2e7145349a64f6486b
2015-11-17ARM read barrier support for concurrent GC in Optimizing.Roland Levillain
This first implementation uses slow paths to instrument heap reference loads and GC root loads for the concurrent copying collector, respectively calling the artReadBarrierSlow and artReadBarrierForRootSlow runtime entry points.
Notes:
- This implementation does not instrument HInvokeVirtual nor HInvokeInterface instructions (for class reference loads), as the corresponding read barriers are not strictly required with the current concurrent copying collector.
- Intrinsics which may eventually call (on slow path) are disabled when read barriers are enabled, as the current slow path infrastructure does not support this case.
- When read barriers are enabled, the code generated for a HArraySet instruction always goes into the array set slow path for object arrays (delegating the operation to the runtime), as we are lacking a mechanism to keep a temporary register live across a runtime call (needed for the instrumentation of type checking code, which requires two successive read barriers).
Bug: 12687968
Change-Id: I92e8db414d029f952c07f3d3a98069e46dfdbc2a
2015-11-17ART: Refactor GenerateTestAndBranchDavid Brazdil
Each code generator implements a method for generating condition evaluation and branching to arbitrary labels. This patch refactors it for better clarity but also to generate fewer jumps when the true branch is the fallthrough successor. This is preliminary work for implementing HSelect. Change-Id: Iaa545a5ecbacb761c5aa241fa69140cf6eb5952f
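The jump-saving idea can be sketched with a toy emitter. Everything here is illustrative: blocks are plain integers, the output is pseudo-assembly strings, and `EmitTestAndBranch` is a hypothetical stand-in for the code generator's method.

```cpp
#include <string>
#include <vector>

// Toy model: emit the branches needed after evaluating a condition, given the
// true/false successors and the block that is laid out next (the fallthrough).
inline std::vector<std::string> EmitTestAndBranch(int true_target,
                                                  int false_target,
                                                  int next_block) {
  std::vector<std::string> code;
  if (true_target == next_block) {
    // The true branch falls through: branch on the inverted condition only.
    if (false_target != next_block) {
      code.push_back("b.inverted block" + std::to_string(false_target));
    }
  } else {
    code.push_back("b.cond block" + std::to_string(true_target));
    // An unconditional jump is needed only if the false branch does not fall through.
    if (false_target != next_block) {
      code.push_back("b block" + std::to_string(false_target));
    }
  }
  return code;
}
```

In this model, a single branch is emitted whenever either successor is the fallthrough block, and two are emitted only when neither is.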
2015-11-11Optimizing: Clean up constant location handling.Vladimir Marko
Locations builder should use ConstantLocation() when the code generator relies on a location to be constant. Code generator should interrogate locations, not inputs, about being const. Change-Id: Ic35bb84aa9f83e0977b151a0430aca6c88f19cf0
2015-11-11Optimizing/ARM: Improve shifts of long values by a constant.Vladimir Marko
Change-Id: Id66ef8cdb9e64306f2be547370b90cc100a3e086
2015-11-05Fix conditional jump over jmp (X86/X86-64/ARM32)Mark Mendell
Optimize the code generation for 'if' statements to jump to the 'false' block if the next block to be generated is the 'true' block. Add an X86-64 test for this case. Note that ARM64 & MIPS64 have not been updated. Change-Id: Iebb1352feb9d3bd0142d8b0621a2e3069a708ea7 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
2015-10-30ART: Arm32 packed-switch jump tablesAndreas Gampe
Add jump table support to the thumb2 assembler. Jump tables are a collection of labels for the case targets, and an anchor label denoting the position of the jump. Use the jump table support to implement packed-switch support for arm32. Add tests for BindTrackedLabel and JumpTable to the thumb2 assembler test. Bug: 24092914 Change-Id: I5c84f193dfebf9e07f48678efc8bd151bb1410dd
2015-10-23Optimizing: Determine invoke-static/-direct dispatch early.Vladimir Marko
Determine the dispatch type of invoke-static/-direct in a special pass right after the type inference. This allows the inliner to pass the "needs dex cache" check and inline more. It also allows the code generator to avoid requesting a register location for the ArtMethod* for kDexCachePcRelative and direct methods. The supported dispatch check also handles situations that the CompilerDriver currently doesn't allow. The cleanup of the CompilerDriver and required changes to Quick will come in a separate change. Change-Id: I3f8e903a119949e95871d8ab0a995f4731a13a07
2015-10-19Merge "Generalize codegen and simplification of deopt."Aart Bik
2015-10-19Generalize codegen and simplification of deopt.Aart Bik
Rationale: the de-opt instruction is very similar to an if, so the existing assumption that it always has a conditional "under the hood" is very unsafe, since optimizations may have replaced conditionals with actual values; this CL generalizes handling of deopt. Change-Id: I1c6cb71fdad2af869fa4714b38417dceed676459
2015-10-15Use ATTRIBUTE_UNUSED more.Roland Levillain
Use it in lieu of UNUSED(), which had some incorrect uses. Change-Id: If247dce58b72056f6eea84968e7196f0b5bef4da
2015-10-15Merge "Added support for unsigned comparisons"Aart Bik