path: root/compiler/optimizing/code_generator.cc
Age | Commit message | Author
2016-06-21 | Replace String.charAt() with HIR. | Vladimir Marko
Replace String.charAt() with HArrayLength, HBoundsCheck and HArrayGet. This allows GVN on the HArrayLength and BCE on the HBoundsCheck as well as using the infrastructure for HArrayGet, i.e. better handling of constant indexes than the old intrinsic and using the HArm64IntermediateAddress. Bug: 28330359 Change-Id: I32bf1da7eeafe82537a60416abf6ac412baa80dc
2016-06-02 | Refactor handling of input records. | Vladimir Marko
Introduce HInstruction::GetInputRecords(), a new virtual function that returns an ArrayRef<> to all input records. Implement all other functions dealing with input records as wrappers around GetInputRecords(). Rewrite functions that previously used multiple virtual calls to deal with input records, especially in loops, to prefetch the ArrayRef<> only once for each instruction. Besides avoiding all the extra calls, this also allows the compiler (clang++) to perform additional optimizations. This speeds up the Nexus 5 boot image compilation by ~0.5s (4% of "Compile Dex File", 2% of dex2oat time) on AOSP ToT. Change-Id: Id8ebe0fb9405e38d918972a11bd724146e4ca578
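The pattern described above, one virtual accessor with every other input helper written as a non-virtual wrapper, can be sketched roughly as follows. This is an illustration only: `Instruction`, `InputRecord`, `InputRecordRef`, `Constant`, `Add` and `SumInputIds` are hypothetical, simplified stand-ins and not the actual ART declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins; not the real ART classes.
struct Instruction;

struct InputRecord {
  const Instruction* instruction;
};

// Minimal non-owning view over contiguous records (stand-in for ArrayRef<>).
struct InputRecordRef {
  const InputRecord* data;
  std::size_t size;
  const InputRecord& operator[](std::size_t i) const {
    assert(i < size);
    return data[i];
  }
};

struct Instruction {
  explicit Instruction(int id) : id(id) {}
  virtual ~Instruction() = default;

  // The single virtual entry point for input access.
  virtual InputRecordRef GetInputRecords() const = 0;

  // Non-virtual wrappers implemented on top of GetInputRecords().
  std::size_t InputCount() const { return GetInputRecords().size; }
  const Instruction* InputAt(std::size_t i) const {
    return GetInputRecords()[i].instruction;
  }

  int id;
};

struct Constant : Instruction {
  using Instruction::Instruction;
  InputRecordRef GetInputRecords() const override { return {nullptr, 0}; }
};

struct Add : Instruction {
  Add(int id, const Instruction* lhs, const Instruction* rhs)
      : Instruction(id), inputs{InputRecord{lhs}, InputRecord{rhs}} {}
  InputRecordRef GetInputRecords() const override {
    return {inputs.data(), inputs.size()};
  }
  std::vector<InputRecord> inputs;
};

// Fetch the view once, then loop without any further virtual calls.
int SumInputIds(const Instruction& instruction) {
  InputRecordRef records = instruction.GetInputRecords();
  int sum = 0;
  for (std::size_t i = 0; i < records.size; ++i) {
    sum += records[i].instruction->id;
  }
  return sum;
}
```

For an `Add` built from two constants, `SumInputIds` makes one virtual call in total, whereas a loop over `InputCount()` and `InputAt(i)` would make one per wrapper call.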
2016-05-16 | Revert "Revert "ART: Reference.getReferent intrinsic for x86 and x86_64"" | Serguei Katkov
This reverts commit 0997d24e67d78f2146ebae2888eda0d7d254789a. ART_HEAP_POISONING=true mode is fixed. Change-Id: I83f6d5c101ea6a86802753f81b3e4348a263fb21 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
2016-05-13 | Merge "Revert "ART: Reference.getReferent intrinsic for x86 and x86_64"" | Nicolas Geoffray
2016-05-13 | Revert "ART: Reference.getReferent intrinsic for x86 and x86_64" | Nicolas Geoffray
Fails heap poisoning configuration. This reverts commit afdc97ebcb4e58afb7cf54d846d30314e6499d83. Change-Id: I50e53756a2b85059b89cfb8950f8c9e2b032743c
2016-05-12 | Merge "ART: Reference.getReferent intrinsic for x86 and x86_64" | Roland Levillain
2016-05-10 | ART: Reference.getReferent intrinsic for x86 and x86_64 | Serguei Katkov
Change-Id: I7a7ac9244847dd80d9fa4e4b5ebc5bf451c628ff Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
2016-05-09 | Intrinsify String.length() and String.isEmpty() as HIR. | Vladimir Marko
Use HArrayLength for String.length() in anticipation of changing the String.charAt() to HBoundsCheck+HArrayGet to allow the existing BCE to seamlessly work for strings. Use HArrayLength+HEqual for String.isEmpty(). We previously relied on inlining but we now want to apply the new intrinsics even when we do not inline, i.e. when compiling debuggable (as is currently the case for boot image) or when we hit inlining limits, i.e. depth, size, or the number of accumulated dex registers. Bug: 28330359 Change-Id: Iab9d2f6d2967bdd930a72eb461f27efe8f37c103
2016-04-15 | Fix: correctly destruct VIXL labels. | Alexandre Rames
Bug: 27505766 Change-Id: I077465e3d308f4331e7a861902e05865f9d99835
2016-04-12 | Allocate code generators on the arena. | Vladimir Marko
Change-Id: If8cf0ee43711f6e13171443e3c057ff370ccfbaa
2016-04-07 | Revert "Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals"" | David Brazdil
This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data):
  GMS 20.46MB => 19.26MB (-5.86%)
  Maps 24.12MB => 21.47MB (-10.98%)
  YouTube 28.60MB => 26.01MB (-9.05%)
This CL fixed an issue with parsing quickened instructions. Bug: 27894376 Bug: 27998571 Bug: 27995065 Change-Id: I20dbe1bf2d0fe296377478db98cb86cba695e694
2016-04-05 | Merge "Clean up OatQuickMethodHeader after Quick removal." | Vladimir Marko
2016-04-04 | Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals" | David Brazdil
Bug: 27995065 This reverts commit e3ff7b293be2a6791fe9d135d660c0cffe4bd73f. Change-Id: I5363c7ce18f47fd422c15eed5423a345a57249d8
2016-04-04 | Clean up OatQuickMethodHeader after Quick removal. | Vladimir Marko
This reduces the size of the pre-header by 8 bytes, reducing oat file size and mmapped .text section size. The memory needed to store a CompiledMethod by dex2oat is also reduced, for 32-bit dex2oat by 8B and for 64-bit dex2oat by 16B. The aosp_flounder-userdebug 32-bit and 64-bit boot.oat are each about 1.1MiB smaller. Disable the broken StubTest.IMT, b/27991555. Change-Id: I05fe45c28c8ffb7a0fa8b1117b969786748b1039
2016-04-04 | Refactor HGraphBuilder and SsaBuilder to remove HLocals | David Brazdil
This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data):
  GMS 20.46MB => 19.26MB (-5.86%)
  Maps 24.12MB => 21.47MB (-10.98%)
  YouTube 28.60MB => 26.01MB (-9.05%)
Bug: 27894376 Change-Id: Iefe28d40600c169c5d306fd2c77034ae19476d90
2016-04-04 | Build dominator tree before generating HInstructions | David Brazdil
Second CL in the series of merging HGraphBuilder and SsaBuilder. This patch refactors the builders so that the dominator tree can be built before any HInstructions are generated. This puts the SsaBuilder removal of HLoadLocals/HStoreLocals straight after HGraphBuilder's HInstruction generation phase. The next CL will therefore be able to merge them. This patch also adds util classes for iterating bytecode and switch tables, which made it possible to simplify the code. Bug: 27894376 Change-Id: Ic425d298b2e6e7980481ed697230b1a0b7904526
2016-04-01 | Merge "Pack stack map entries on bit level to save space." | Calin Juravle
2016-03-31 | Pack stack map entries on bit level to save space. | David Srbecky
Use only the minimum number of bits required to store stack map data. For example, if native_pc needs 5 bits and dex_pc needs 3 bits, they will share the first byte of the stack map entry. The header is changed to store bit offsets of the fields rather than byte sizes. Offsets also make it easier to access later fields without calculating sum of all previous sizes. All of the header fields are byte sized or encoded as ULEB128 instead of the previous fixed size encoding. This shrinks it by about half. It saves 3.6 MB from non-debuggable boot.oat (AOSP). It saves 3.1 MB from debuggable boot.oat (AOSP). It saves 2.8 MB (of 99.4 MB) from /system/framework/arm/ (GOOG). It saves 1.0 MB (of 27.8 MB) from /system/framework/oat/arm/ (GOOG). Field loads from stackmaps seem to get around 10% faster. (based on the time it takes to load all stackmap entries from boot.oat) Bug: 27640410 Change-Id: I8bf0996b4eb24300c1b0dfc6e9d99fe85d04a1b7
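The byte-sharing example in this message (a 5-bit native_pc and a 3-bit dex_pc in one byte, located by bit offsets) can be illustrated with a small sketch. The helper names `StoreBits` and `LoadBits`, the `FieldOffsets` struct and the LSB-first bit order are assumptions made for this illustration; the actual ART stack map encoding may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Write the low `bit_count` bits (at most 32) of `value` into `region`,
// starting at `bit_offset`. Bits are stored LSB-first within each byte
// (an assumption of this sketch). The region grows as needed.
void StoreBits(std::vector<std::uint8_t>& region, std::size_t bit_offset,
               std::size_t bit_count, std::uint32_t value) {
  for (std::size_t i = 0; i < bit_count; ++i) {
    std::size_t bit = bit_offset + i;
    if (region.size() <= bit / 8) {
      region.resize(bit / 8 + 1, 0);
    }
    std::uint8_t mask = static_cast<std::uint8_t>(1u << (bit % 8));
    if ((value >> i) & 1u) {
      region[bit / 8] |= mask;
    } else {
      region[bit / 8] &= static_cast<std::uint8_t>(~mask);
    }
  }
}

// Read `bit_count` bits (at most 32) starting at `bit_offset`.
std::uint32_t LoadBits(const std::vector<std::uint8_t>& region,
                       std::size_t bit_offset, std::size_t bit_count) {
  std::uint32_t value = 0;
  for (std::size_t i = 0; i < bit_count; ++i) {
    std::size_t bit = bit_offset + i;
    std::uint32_t b = (region[bit / 8] >> (bit % 8)) & 1u;
    value |= b << i;
  }
  return value;
}

// Hypothetical header: stores where each field starts (bit offsets), so a
// later field can be read without summing the sizes of earlier fields.
struct FieldOffsets {
  std::size_t native_pc_offset = 0;  // native_pc occupies bits [0, 5)
  std::size_t dex_pc_offset = 5;     // dex_pc occupies bits [5, 8)
  std::size_t entry_bits = 8;        // total bits per entry
};
```

With these offsets, storing native_pc and dex_pc for one entry touches a single byte, and `LoadBits(region, offsets.dex_pc_offset, 3)` reads dex_pc directly from its offset.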
2016-03-28 | ART: Clean up verifier | Andreas Gampe
Clean up verifier post-Quick. Change-Id: I0b05e10dd06edd228fe2068c8afffc4b7d7fdffa
2016-03-21 | Optimizing: Fix register allocator validation memory usage. | Vladimir Marko
Also attribute ArenaBitVector allocations to appropriate passes. This was used to track down the source of the excessive memory allocations. Bug: 27690481 Change-Id: Ib895984cb7c04e24cbc7abbd8322079bab8ab100
2016-03-18 | Merge "Generate native debug stackmaps before calls as well." | David Srbecky
2016-03-17 | Generate native debug stackmaps before calls as well. | David Srbecky
The debugger looks up PC of the call instruction, so the runtime's stackmap is not sufficient since it is at PC after the instruction. Change-Id: I0dd06c0b52e8079ea5d064ea10beb12c93584092
2016-03-16 | Merge "Clean up NullCheck generation and record stats about it." | Calin Juravle
2016-03-16 | Clean up NullCheck generation and record stats about it. | Calin Juravle
This removes redundant code from the generators and allows for easier stat recording. Change-Id: Iccd4368f9e9d87a6fecb863dee4e2145c97851c4
2016-03-15 | Make art::HCompare side effect free. | Roland Levillain
All our back ends implement all comparisons without making a runtime call, so we can mark art::HCompare as a side effect free instruction unconditionally. Change-Id: I9a9e7c09156c642edb6af1fe84408f887e762f2e
2016-03-10 | Avoid generating dead code on frame enter/exit. | Aart Bik
This includes stack operations and, on x86, call/pop to read PC. bug=26997690
Rationale:
(1) If method is fully intrinsified, and makes no calls in slow path or uses special input, no need to require current method.
(2) Invoke instructions with HasPcRelativeDexCache() generate code that reads the PC (call/pop) on x86. However, if the invoke is an intrinsic that is later replaced with actual code, this PC reading code may be dead.
Example X86 (before/after):
  0x0000108c: 83EC0C      sub esp, 12
  0x0000108f: 890424      mov [esp], eax       <-- not needed
  0x00001092: E800000000  call +0 (0x00001097)
  0x00001097: 58          pop eax              <-- dead code to read PC
  0x00001098: F30FB8C1    popcnt eax, ecx
  0x0000109c: F30FB8DA    popcnt ebx, edx
  0x000010a0: 03D8        add ebx, eax
  0x000010a2: 89D8        mov eax, ebx
  0x000010a4: 83C40C      add esp, 12          <-- not needed
  0x000010a7: C3          ret

  0x0000103c: F30FB8C1    popcnt eax, ecx
  0x00001040: F30FB8DA    popcnt ebx, edx
  0x00001044: 03D8        add ebx, eax
  0x00001046: 89D8        mov eax, ebx
  0x00001048: C3          ret
Example ARM64 (before/after):
  0x0000103c: f81e0fe0    str x0, [sp, #-32]!
  0x00001040: f9000ffe    str lr, [sp, #24]
  0x00001044: dac01020    clz x0, x1
  0x00001048: f9400ffe    ldr lr, [sp, #24]
  0x0000104c: 910083ff    add sp, sp, #0x20 (32)
  0x00001050: d65f03c0    ret

  0x0000103c: dac01020    clz x0, x1
  0x00001040: d65f03c0    ret
Change-Id: I8377db80c9a901a08fff4624927cf4a6e585da0c
2016-02-24 | Associate slow paths with the instruction that they belong to. | David Srbecky
Almost all slow paths already know the instruction they belong to; this CL just moves the knowledge to the base class as well. This is needed to be able to get the corresponding dex pc for a slow path, which allows us to generate better native line numbers, which in turn fixes some native debugging stepping issues. Change-Id: I568dbe78a7cea6a43a4a71a014b3ad135782c270
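As a rough sketch of the idea, the base class can own the instruction pointer so that any slow path can report a dex pc without subclass-specific code. The names below (`Instruction`, `SlowPath`, `NullCheckSlowPath`, `kNoDexPc`) are hypothetical and simplified; they are not the actual ART classes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an IR instruction that records its dex pc.
struct Instruction {
  std::uint32_t dex_pc;
};

// Sentinel for slow paths with no associated instruction (an assumption
// of this sketch).
constexpr std::uint32_t kNoDexPc = 0xFFFFFFFFu;

// Base class: holds the instruction so every slow path can be mapped to
// a dex pc in one place.
class SlowPath {
 public:
  explicit SlowPath(const Instruction* instruction)
      : instruction_(instruction) {}
  virtual ~SlowPath() = default;

  const Instruction* GetInstruction() const { return instruction_; }

  std::uint32_t GetDexPc() const {
    return instruction_ != nullptr ? instruction_->dex_pc : kNoDexPc;
  }

 private:
  const Instruction* instruction_;
};

// A concrete slow path no longer needs its own instruction field.
class NullCheckSlowPath : public SlowPath {
 public:
  using SlowPath::SlowPath;
};
```

Code that emits line-number information can then call `GetDexPc()` on any `SlowPath` without knowing its concrete type.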
2016-02-24 | Remove HNativeDebugInfo from start of basic blocks. | David Srbecky
We do not require full environment at the start of basic block. The dex pc contained in basic block is sufficient for line mapping. Change-Id: I5ba9e5f5acbc4a783ad544769f9a73bb33e2bafa
2016-02-12 | ART: Remove HTemporary | David Brazdil
Change-Id: I21b984224370a9ce7a4a13a9652503cfb03c5f03
2016-02-05 | Revert "Revert "Implement on-stack replacement for arm/arm64/x86/x86_64."" | Nicolas Geoffray
This reverts commit bd89a5c556324062b7d841843b039392e84cfaf4. Change-Id: I08d190431520baa7fcec8fbdb444519f25ac8d44
2016-01-18 | ART: Remove Baseline compiler | David Brazdil
We don't need Baseline any more and it hasn't been maintained for a while anyway. Let's remove it. Change-Id: I442ed26855527be2df3c79935403a25b1ee55df6
2016-01-13 | Update `ValidateInvokeRuntime()` and HDivZeroCheck. | Alexandre Rames
Change-Id: I35beab2777a8c83bd508d56966afa1ceff9ee24f
2016-01-11 | Generate Nops to ensure that debug stack maps have distinct PC. | David Srbecky
Change-Id: I5740ec958a20d236634b66df0e675382ed5c16fc
2015-12-10 | Get source mapping table from stack maps. | David Srbecky
Stack maps contain pc to dex mapping. Reuse them instead of maintaining separate map. Change-Id: Iaaec9a6bd2603eace1dfc8f4344087883d88cce3
2015-11-19 | Clean up the special input in HInvokeStaticOrDirect. | Vladimir Marko
Change-Id: I4042aefbdac1a8c236d00e2e7145349a64f6486b
2015-11-15 | x86/x86-64 read barrier support for concurrent GC in Optimizing. | Roland Levillain
This first implementation uses slow paths to instrument heap reference loads and GC root loads for the concurrent copying collector, respectively calling the artReadBarrierSlow and artReadBarrierForRootSlow (new) runtime entry points.
Notes:
- This implementation does not instrument HInvokeVirtual nor HInvokeInterface instructions (for class reference loads), as the corresponding read barriers are not strictly required with the current concurrent copying collector.
- Intrinsics which may eventually call (on slow path) are disabled when read barriers are enabled, as the current slow path infrastructure does not support this case.
- When read barriers are enabled, the code generated for a HArraySet instruction always goes into the array set slow path for object arrays (delegating the operation to the runtime), as we are lacking a mechanism to keep a temporary register live across a runtime call (needed for the instrumentation of type checking code, which requires two successive read barriers).
Bug: 12687968 Change-Id: I14cd6107233c326389120336f93955b28ffbb329
2015-11-12 | Optimizing/X86: PC-relative dex cache array addressing. | Vladimir Marko
Add PC-relative dex cache array addressing for X86 and use it for better invoke-static/-direct dispatch. Also delay the initialization to the PC-relative base until needed. Change-Id: Ib8634d5edce4920cd70172fd13211809cf6948d1
2015-11-05 | Code cleanup to avoid CompilerDriver abstractions in JIT. | Nicolas Geoffray
Avoids allocating a CompiledMethod. Change-Id: I35b4aa0d7c74daba68e827a01e71c300fce3b3bf
2015-10-23 | Optimizing: Determine invoke-static/-direct dispatch early. | Vladimir Marko
Determine the dispatch type of invoke-static/-direct in a special pass right after the type inference. This allows the inliner to pass the "needs dex cache" check and inline more. It also allows the code generator to avoid requesting a register location for the ArtMethod* for kDexCachePcRelative and direct methods. The supported dispatch check also handles situations that the CompilerDriver currently doesn't allow. The cleanup of the CompilerDriver and required changes to Quick will come in a separate change. Change-Id: I3f8e903a119949e95871d8ab0a995f4731a13a07
2015-10-22 | MIPS: Initial version of optimizing compiler for MIPS32 | Goran Jakovljevic
Change-Id: I370388e8d5de52c7001552b513877ef5833aa621
2015-10-13 | Implement System.arraycopy intrinsic for arm. | Nicolas Geoffray
Change-Id: I58ae1af5103e281fe59fbe022b718d6d8f293a5e
2015-10-08 | Add DCHECKs to ArenaVector and ScopedArenaVector. | Vladimir Marko
Implement dchecked_vector<> template that DCHECK()s element access and insert()/emplace()/erase() positions. Change the ArenaVector<> and ScopedArenaVector<> aliases to use the new template instead of std::vector<>. Remove DCHECK()s that have now become unnecessary from the Optimizing compiler. Change-Id: Ib8506bd30d223f68f52bd4476c76d9991acacadc
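A minimal sketch of the checked-container idea follows. `CheckedVector` is a hypothetical, much-reduced illustration rather than the real dchecked_vector<>, and the standard `assert()` stands in for DCHECK(): both are active in debug builds and compiled out in release builds (here via NDEBUG).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reduced wrapper around std::vector<> that checks element
// access and insert()/erase() positions in debug builds.
template <typename T>
class CheckedVector {
 public:
  using iterator = typename std::vector<T>::iterator;

  std::size_t size() const { return data_.size(); }
  iterator begin() { return data_.begin(); }
  iterator end() { return data_.end(); }

  void push_back(const T& value) { data_.push_back(value); }

  T& operator[](std::size_t index) {
    assert(index < data_.size());  // checked element access
    return data_[index];
  }

  const T& operator[](std::size_t index) const {
    assert(index < data_.size());
    return data_[index];
  }

  iterator insert(iterator pos, const T& value) {
    // A valid insert position lies in [begin(), end()].
    assert(data_.begin() <= pos && pos <= data_.end());
    return data_.insert(pos, value);
  }

  iterator erase(iterator pos) {
    // A valid erase position lies in [begin(), end()).
    assert(data_.begin() <= pos && pos < data_.end());
    return data_.erase(pos);
  }

 private:
  std::vector<T> data_;
};
```

Because the checks live in the container, call sites no longer need their own bounds assertions, which matches the removal of now-redundant DCHECK()s described in the message.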
2015-10-06 | Fix location summary for LoadClass | Calin Juravle
Don't request a register for the current method if we're gonna call the runtime. Change-Id: I9760d15108bd95efb2a34e6eacd84b60841781d7
2015-10-06 | Add support for unresolved classes in optimizing. | Calin Juravle
Change-Id: I0e299a81e560eb9cb0737ec46125dffc99333b54
2015-10-02 | Revert "Revert "Support unresolved fields in optimizing" | Calin Juravle
The CL also changes the calling convention for 64bit static field set to use kArg2 instead of kArg1. This allows optimizing to keep the assumptions:
- arm pairs are always of form (even_reg, odd_reg)
- ecx_edx is not used as a register on x86.
This reverts commit e6f49b47b6a4dc9c7684e4483757872cfc7ff1a1. Change-Id: I93159917565824084abc96775f31be1a4249f2f3
2015-09-29 | Optimizing: Tag arena allocations in code generators. | Vladimir Marko
And completely remove the deprecated GrowableArray. Replace GrowableArray with ArenaVector in code generators and related classes and tag arena allocations. Label arrays use direct allocations from ArenaAllocator because Label is non-copyable and non-movable and as such cannot be really held in a container. The GrowableArray never actually constructed them, instead relying on the zero-initialized storage from the arena allocator to be correct. We now actually construct the labels. Also avoid StackMapStream::ComputeDexRegisterMapSize() being passed null references, even though unused. Change-Id: I26a46fdd406b23a3969300a67739d55528df8bf4
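The point about labels can be illustrated with a short sketch: a type that is neither copyable nor movable cannot be stored by a container that relocates elements, but it can be constructed in place in raw arena storage. `Label`, `Arena` and `AllocateLabels` below are hypothetical toy versions, not the ART or assembler classes. The toy label deliberately uses a non-zero initial state so that the effect of running the constructor is visible.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical label: non-copyable and (because the copy operations are
// user-declared) also non-movable.
class Label {
 public:
  Label() : position_(-1) {}  // -1 means "not bound" in this sketch
  Label(const Label&) = delete;
  Label& operator=(const Label&) = delete;

  bool IsBound() const { return position_ >= 0; }
  void BindTo(int position) { position_ = position; }
  int Position() const { return position_; }

 private:
  int position_;
};

// Toy bump allocator; all memory is released when the arena is destroyed.
class Arena {
 public:
  explicit Arena(std::size_t capacity)
      : storage_(new unsigned char[capacity]), capacity_(capacity), used_(0) {}

  void* Alloc(std::size_t bytes, std::size_t alignment) {
    std::size_t start = (used_ + alignment - 1) / alignment * alignment;
    assert(start + bytes <= capacity_);
    used_ = start + bytes;
    return storage_.get() + start;
  }

 private:
  std::unique_ptr<unsigned char[]> storage_;
  std::size_t capacity_;
  std::size_t used_;
};

// Allocate raw storage for `count` labels and construct each one in place
// with placement new, so no copy or move is ever required.
Label* AllocateLabels(Arena& arena, std::size_t count) {
  void* raw = arena.Alloc(count * sizeof(Label), alignof(Label));
  Label* labels = static_cast<Label*>(raw);
  for (std::size_t i = 0; i < count; ++i) {
    new (labels + i) Label();
  }
  return labels;
}
```

Since this toy `Label` is trivially destructible, the sketch never runs destructors; a label type with a non-trivial destructor would need explicit destructor calls before the arena is released.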
2015-09-17 | Revert "Support unresolved fields in optimizing" | Calin Juravle
Breaks debuggable tests. This reverts commit 23a8e35481face09183a24b9d11e505597c75ebb. Change-Id: I8e60b5c8f48525975f25d19e5e8066c1c94bd2e5
2015-09-17 | Support unresolved fields in optimizing | Calin Juravle
Change-Id: I9941fa5fcb6ef0a7a253c7a0b479a44a0210aad4
2015-09-17 | Support unresolved methods in Optimizing | Calin Juravle
Change-Id: If2da02b50d2fa668cd58f134a005f1752e7746b1
2015-09-16 | Merge "Add OptimizingCompilerStats to the CodeGenerator class." | Calin Juravle