path: root/compiler/optimizing/code_generator.h
Age  Commit message  Author
2016-08-01  ART: Convert pointer size to enum  Andreas Gampe
Move away from size_t to dedicated enum (class). Bug: 30373134 Bug: 30419309 Test: m test-art-host Change-Id: Id453c330f1065012e7d4f9fc24ac477cc9bb9269
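As a rough illustration of the idea in this commit message, the sketch below shows how a scoped enum can stand in for a raw size_t pointer size. The names PointerSize, FromBytes and SlotOffset are assumptions made for this example and are not taken from the change itself.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative scoped enum: only the two supported pointer widths exist,
// so an arbitrary size_t can no longer be passed by accident.
enum class PointerSize : std::size_t { k32 = 4, k64 = 8 };

// Hypothetical helper: converts a byte count to the enum, checking that
// the value is one of the supported widths.
inline PointerSize FromBytes(std::size_t bytes) {
  assert(bytes == 4 || bytes == 8);
  return bytes == 4 ? PointerSize::k32 : PointerSize::k64;
}

// Hypothetical helper: byte offset of the index-th pointer-sized slot.
constexpr std::size_t SlotOffset(std::size_t index, PointerSize size) {
  return index * static_cast<std::size_t>(size);
}
```

With this shape, a call such as SlotOffset(3, 16) fails to compile, whereas a plain size_t parameter would have accepted it silently.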
2016-07-28  Merge "Remove two ReadBarrierMarkRegX entrypoints."  Roland Levillain
2016-07-25  Remove two ReadBarrierMarkRegX entrypoints.  Roland Levillain
As entry points ReadBarrierMarkReg30 and ReadBarrierMarkReg31 are undefined on all architectures supporting the read barrier configuration (ARM, ARM64, x86 and x86-64), remove them from the entry point list. Test: ART host and target (ARM, ARM64) tests. Bug: 29506760 Bug: 12687968 Change-Id: I500626e54f00aebfc095b4ef5f81b49fa43f7768
2016-07-22  Do not emit stack maps for runtime calls to ReadBarrierMarkRegX.  Roland Levillain
* Boot image code size variation on Nexus 5X (aosp_bullhead-userdebug build):
  - total ARM64 framework Oat files size change: 115584120 bytes -> 109124728 bytes (-5.59%)
  - total ARM framework Oat files size change: 97387728 bytes -> 92517584 bytes (-5.00%)
Test: ART host and target (ARM, ARM64) tests.
Bug: 29506760
Bug: 12687968
Change-Id: I979d9fb2b4e09f4c0c7bf33af2cd91750a67f989
2016-07-21  Move caller-saves saving/restoring to ReadBarrierMarkRegX.  Roland Levillain
Instead of saving/restoring live caller-save registers before/after the call to read barrier mark entry points ReadBarrierMarkRegX, have these entry points save/restore all the caller-save registers themselves (except register rX, which contains the return value). Also refactor the assembly code of these entry points using macros.
* Boot image code size variation on Nexus 5X (aosp_bullhead-userdebug build):
  - total ARM64 framework Oat files size change: 119196792 bytes -> 115575920 bytes (-3.04%)
  - total ARM framework Oat files size change: 100435212 bytes -> 97621188 bytes (-2.80%)
* Benchmarks (ARM64) score variations on Nexus 5X (aosp_bullhead-userdebug build):
  - RitzPerf (lower is better)
    - average score difference: -2.71%
  - CaffeineMark (higher is better)
    - no real difference for most tests (absolute variation lower than 1%)
    - better score on the "Method" benchmark: score variation 41253 -> 44891 (+8.82%)
Test: ART host and target (ARM, ARM64) tests.
Bug: 29506760
Bug: 12687968
Change-Id: I881bf73139a3f1c2bee9ffc6fc8c00f9a392afa6
2016-07-18  ARM64: Improve code generated to spill/restore for slow paths.  Alexandre Rames
Aligning the accesses allows generating better code.
Before:
  add x16, sp, #0x44 (68)
  stp x0, x1, [x16, #-16]
After:
  stp x0, x1, [sp, #56]
Change-Id: I3e20ad3fa59d00aee4b4d14ea9d59c7cd546509e
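One way to read the before/after sequence above: a 64-bit stp encodes its offset as a signed 7-bit immediate scaled by 8, so the offset must be a multiple of 8 in the range [-512, 504]. The unaligned slot at sp + 68 - 16 = sp + 52 cannot be encoded directly, while the aligned slot at sp + 56 can. The sketch below checks that arithmetic. The function names are made up for this example, and whether the commit rounds offsets in exactly this way is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Rounds value up to the next multiple of alignment (a power of two).
inline std::int64_t RoundUp(std::int64_t value, std::int64_t alignment) {
  assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}

// True if offset fits the scaled signed 7-bit immediate of a 64-bit
// stp/ldp: a multiple of 8 in the range [-512, 504].
inline bool IsEncodableStpOffset(std::int64_t offset) {
  return offset % 8 == 0 && offset >= -512 && offset <= 504;
}
```

Here IsEncodableStpOffset(52) is false, which is why the "before" code needs the extra add into x16, and RoundUp(52, 8) gives the 56 used in the "after" code.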
2016-07-13  Introduce more compact ReadBarrierMark slow-paths.  Roland Levillain
Replace entry point ReadBarrierMark with 32 ReadBarrierMarkRegX entry points, using register number X as input and output (instead of the standard runtime calling convention) to save two moves in Baker's read barrier mark slow-path code. Test: ART host and target (ARM, ARM64) tests. Bug: 29506760 Bug: 12687968 Change-Id: I73cfb82831cf040b8b018e984163c865cc44ed87
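A minimal sketch of how a code generator might pick one of the 32 per-register entry points, assuming they sit in consecutive pointer-sized slots starting at some base offset. That layout, and the names kNumMarkEntrypoints and MarkEntrypointOffset, are assumptions for illustration only. The commit message states only that there is one entry point per register number X.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One entry point per register number X, as described in the commit message.
constexpr std::size_t kNumMarkEntrypoints = 32;

// Hypothetical layout: entry point X lives at base_offset + X * pointer_size.
// The register holding the reference selects the entry point, so no moves
// into and out of the standard argument/return registers are needed.
inline std::int32_t MarkEntrypointOffset(std::int32_t base_offset,
                                         std::size_t reg,
                                         std::size_t pointer_size) {
  assert(reg < kNumMarkEntrypoints);
  return base_offset + static_cast<std::int32_t>(reg * pointer_size);
}
```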
2016-06-21  Merge "Replace String.charAt() with HIR."  Vladimir Marko
2016-06-21  Replace String.charAt() with HIR.  Vladimir Marko
Replace String.charAt() with HArrayLength, HBoundsCheck and HArrayGet. This allows GVN on the HArrayLength and BCE on the HBoundsCheck as well as using the infrastructure for HArrayGet, i.e. better handling of constant indexes than the old intrinsic and using the HArm64IntermediateAddress. Bug: 28330359 Change-Id: I32bf1da7eeafe82537a60416abf6ac412baa80dc
2016-06-21  Improve HLoadClass code generation.  Vladimir Marko
For classes in the boot image, use either direct pointers or PC-relative addresses. For other classes, use PC-relative access to the dex cache arrays for AOT and direct address of the type's dex cache slot for JIT.
For aosp_flounder-userdebug:
  - 32-bit boot.oat: -252KiB (-0.3%)
  - 64-bit boot.oat: -412KiB (-0.4%)
  - 32-bit dalvik cache total: -392KiB (-0.4%)
  - 64-bit dalvik cache total: -2312KiB (-1.0%) (contains more files than the 32-bit dalvik cache)
For aosp_flounder-userdebug forced to compile PIC:
  - 32-bit boot.oat: -124KiB (-0.2%)
  - 64-bit boot.oat: -420KiB (-0.5%)
  - 32-bit dalvik cache total: -136KiB (-0.1%)
  - 64-bit dalvik cache total: -1136KiB (-0.5%) (contains more files than the 32-bit dalvik cache)
Bug: 27950288
Change-Id: I4da991a4b7e53c63c92558b97923d18092acf139
2016-05-16  Revert "Revert "ART: Reference.getReferent intrinsic for x86 and x86_64""  Serguei Katkov
This reverts commit 0997d24e67d78f2146ebae2888eda0d7d254789a. ART_HEAP_POISONING=true mode is fixed. Change-Id: I83f6d5c101ea6a86802753f81b3e4348a263fb21 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
2016-05-13  Merge "Revert "ART: Reference.getReferent intrinsic for x86 and x86_64""  Nicolas Geoffray
2016-05-13  Revert "ART: Reference.getReferent intrinsic for x86 and x86_64"  Nicolas Geoffray
Fails heap poisoning configuration. This reverts commit afdc97ebcb4e58afb7cf54d846d30314e6499d83. Change-Id: I50e53756a2b85059b89cfb8950f8c9e2b032743c
2016-05-12  Merge "ART: Reference.getReferent intrinsic for x86 and x86_64"  Roland Levillain
2016-05-10  ART: Reference.getReferent intrinsic for x86 and x86_64  Serguei Katkov
Change-Id: I7a7ac9244847dd80d9fa4e4b5ebc5bf451c628ff Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
2016-05-09  Intrinsify String.length() and String.isEmpty() as HIR.  Vladimir Marko
Use HArrayLength for String.length() in anticipation of changing the String.charAt() to HBoundsCheck+HArrayGet to allow the existing BCE to seamlessly work for strings. Use HArrayLength+HEqual for String.isEmpty(). We previously relied on inlining but we now want to apply the new intrinsics even when we do not inline, i.e. when compiling debuggable (as is currently the case for boot image) or when we hit inlining limits, i.e. depth, size, or the number of accumulated dex registers. Bug: 28330359 Change-Id: Iab9d2f6d2967bdd930a72eb461f27efe8f37c103
2016-04-15  Fix: correctly destruct VIXL labels.  Alexandre Rames
Bug: 27505766 Change-Id: I077465e3d308f4331e7a861902e05865f9d99835
2016-04-12  Allocate code generators on the arena.  Vladimir Marko
Change-Id: If8cf0ee43711f6e13171443e3c057ff370ccfbaa
2016-04-07  Revert "Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals""  David Brazdil
This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly.
Saves 5-15% of arena-allocated memory (see bug for more data):
  GMS      20.46MB => 19.26MB (-5.86%)
  Maps     24.12MB => 21.47MB (-10.98%)
  YouTube  28.60MB => 26.01MB (-9.05%)
This CL fixed an issue with parsing quickened instructions.
Bug: 27894376
Bug: 27998571
Bug: 27995065
Change-Id: I20dbe1bf2d0fe296377478db98cb86cba695e694
2016-04-04  Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals"  David Brazdil
Bug: 27995065 This reverts commit e3ff7b293be2a6791fe9d135d660c0cffe4bd73f. Change-Id: I5363c7ce18f47fd422c15eed5423a345a57249d8
2016-04-04  Refactor HGraphBuilder and SsaBuilder to remove HLocals  David Brazdil
This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly.
Saves 5-15% of arena-allocated memory (see bug for more data):
  GMS      20.46MB => 19.26MB (-5.86%)
  Maps     24.12MB => 21.47MB (-10.98%)
  YouTube  28.60MB => 26.01MB (-9.05%)
Bug: 27894376
Change-Id: Iefe28d40600c169c5d306fd2c77034ae19476d90
2016-03-29  Optimizing: Improve const-string code generation.  Vladimir Marko
For strings in the boot image, use either direct pointers or PC-relative addresses. For other strings, use PC-relative access to the dex cache arrays for AOT and direct address of the string's dex cache slot for JIT.
For aosp_flounder-userdebug:
  - 32-bit boot.oat: -692KiB (-0.9%)
  - 64-bit boot.oat: -948KiB (-1.1%)
  - 32-bit dalvik cache total: -900KiB (-0.9%)
  - 64-bit dalvik cache total: -3672KiB (-1.5%) (contains more files than the 32-bit dalvik cache)
For aosp_flounder-userdebug forced to compile PIC:
  - 32-bit boot.oat: -380KiB (-0.5%)
  - 64-bit boot.oat: -928KiB (-1.0%)
  - 32-bit dalvik cache total: -468KiB (-0.4%)
  - 64-bit dalvik cache total: -1928KiB (-0.8%) (contains more files than the 32-bit dalvik cache)
Bug: 26884697
Change-Id: Iec7266ce67e6fedc107be78fab2e742a8dab2696
2016-03-18  Merge "Generate native debug stackmaps before calls as well."  David Srbecky
2016-03-17  Generate native debug stackmaps before calls as well.  David Srbecky
The debugger looks up PC of the call instruction, so the runtime's stackmap is not sufficient since it is at PC after the instruction. Change-Id: I0dd06c0b52e8079ea5d064ea10beb12c93584092
2016-03-16  Clean up NullCheck generation and record stats about it.  Calin Juravle
This removes redundant code from the generators and allows for easier stat recording. Change-Id: Iccd4368f9e9d87a6fecb863dee4e2145c97851c4
2016-02-24  Associate slow paths with the instruction that they belong to.  David Srbecky
Almost all slow paths already know the instruction they belong to; this CL just moves the knowledge to the base class as well. This is needed to be able to get the corresponding dex pc for a slow path, which allows us to generate better native line numbers, which in turn fixes some native debugging stepping issues. Change-Id: I568dbe78a7cea6a43a4a71a014b3ad135782c270
2016-02-24  Remove HNativeDebugInfo from start of basic blocks.  David Srbecky
We do not require full environment at the start of basic block. The dex pc contained in basic block is sufficient for line mapping. Change-Id: I5ba9e5f5acbc4a783ad544769f9a73bb33e2bafa
2016-02-12  ART: Remove HTemporary  David Brazdil
Change-Id: I21b984224370a9ce7a4a13a9652503cfb03c5f03
2016-02-05  Revert "Revert "Implement on-stack replacement for arm/arm64/x86/x86_64.""  Nicolas Geoffray
This reverts commit bd89a5c556324062b7d841843b039392e84cfaf4. Change-Id: I08d190431520baa7fcec8fbdb444519f25ac8d44
2016-01-18  ART: Remove Baseline compiler  David Brazdil
We don't need Baseline any more and it hasn't been maintained for a while anyway. Let's remove it. Change-Id: I442ed26855527be2df3c79935403a25b1ee55df6
2016-01-12  Reduce code size by sharing slow paths.  Aart Bik
Rationale: Sharing identical slow path code reduces code size. Background: Currently, slow paths with the same dex-pc, same physical register spilling code, and identical stack maps are shared (making this only useful for deopt slow paths). The newly introduced mechanism is sufficiently general to allow future improvements by e.g. allowing different dex-pc (by passing this to runtime) or even the kind of slow paths (by passing runtime addresses to the slowpath). Change-Id: I819615c47b4fd98440a241f681f93e4fc22d12e0
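The sharing criterion described above (same dex-pc, same register spilling code, identical stack map) can be sketched as a lookup keyed on those three properties. Everything in this example, including SlowPathKey, SlowPathPool and the representation of a stack map as a vector of integers, is a simplified assumption and not the mechanism the commit actually introduces.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical key: two slow paths may share code only if all fields match.
struct SlowPathKey {
  std::uint32_t dex_pc;
  std::uint32_t spill_mask;              // which physical registers are spilled
  std::vector<std::uint32_t> stack_map;  // simplified stand-in for a stack map

  bool operator<(const SlowPathKey& other) const {
    return std::tie(dex_pc, spill_mask, stack_map) <
           std::tie(other.dex_pc, other.spill_mask, other.stack_map);
  }
};

// Hands out one slow-path id per distinct key, so identical requests
// reuse the code that was already emitted.
class SlowPathPool {
 public:
  std::size_t GetOrCreate(const SlowPathKey& key) {
    auto it = index_.find(key);
    if (it != index_.end()) {
      return it->second;  // identical slow path already exists: share it
    }
    const std::size_t id = index_.size();
    const bool inserted = index_.emplace(key, id).second;
    assert(inserted);
    (void)inserted;
    return id;
  }

  std::size_t size() const { return index_.size(); }

 private:
  std::map<SlowPathKey, std::size_t> index_;
};
```

Two requests with the same key return the same id, while changing any one field (for example the dex-pc) yields a new slow path.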
2016-01-11  Generate Nops to ensure that debug stack maps have distinct PC.  David Srbecky
Change-Id: I5740ec958a20d236634b66df0e675382ed5c16fc
2015-12-10  Get source mapping table from stack maps.  David Srbecky
Stack maps contain pc to dex mapping. Reuse them instead of maintaining separate map. Change-Id: Iaaec9a6bd2603eace1dfc8f4344087883d88cce3
2015-11-15  x86/x86-64 read barrier support for concurrent GC in Optimizing.  Roland Levillain
This first implementation uses slow paths to instrument heap reference loads and GC root loads for the concurrent copying collector, respectively calling the artReadBarrierSlow and artReadBarrierForRootSlow (new) runtime entry points.
Notes:
  - This implementation does not instrument HInvokeVirtual nor HInvokeInterface instructions (for class reference loads), as the corresponding read barriers are not strictly required with the current concurrent copying collector.
  - Intrinsics which may eventually call (on slow path) are disabled when read barriers are enabled, as the current slow path infrastructure does not support this case.
  - When read barriers are enabled, the code generated for a HArraySet instruction always goes into the array set slow path for object arrays (delegating the operation to the runtime), as we are lacking a mechanism to keep a temporary register live across a runtime call (needed for the instrumentation of type checking code, which requires two successive read barriers).
Bug: 12687968
Change-Id: I14cd6107233c326389120336f93955b28ffbb329
2015-11-12  Optimizing/X86: PC-relative dex cache array addressing.  Vladimir Marko
Add PC-relative dex cache array addressing for X86 and use it for better invoke-static/-direct dispatch. Also delay the initialization to the PC-relative base until needed. Change-Id: Ib8634d5edce4920cd70172fd13211809cf6948d1
2015-11-05  Code cleanup to avoid CompilerDriver abstractions in JIT.  Nicolas Geoffray
Avoids allocating a CompiledMethod. Change-Id: I35b4aa0d7c74daba68e827a01e71c300fce3b3bf
2015-10-23  Optimizing: Determine invoke-static/-direct dispatch early.  Vladimir Marko
Determine the dispatch type of invoke-static/-direct in a special pass right after the type inference. This allows the inliner to pass the "needs dex cache" check and inline more. It also allows the code generator to avoid requesting a register location for the ArtMethod* for kDexCachePcRelative and direct methods. The supported dispatch check handles also situations that the CompilerDriver currently doesn't allow. The cleanup of the CompilerDriver and required changes to Quick will come in a separate change. Change-Id: I3f8e903a119949e95871d8ab0a995f4731a13a07
2015-10-13  Implement System.arraycopy intrinsic for arm.  Nicolas Geoffray
Change-Id: I58ae1af5103e281fe59fbe022b718d6d8f293a5e
2015-10-08  Optimizing: Clean up after tagging arena allocations.  Vladimir Marko
Change-Id: Id6ee1fe44c4c57d373db7a39530f29a5ca9aee18
2015-10-06  Add support for unresolved classes in optimizing.  Calin Juravle
Change-Id: I0e299a81e560eb9cb0737ec46125dffc99333b54
2015-10-02  Revert "Revert "Support unresolved fields in optimizing"  Calin Juravle
The CL also changes the calling convention for 64bit static field set to use kArg2 instead of kArg1. This allows optimizing to keep the assumptions:
  - arm pairs are always of form (even_reg, odd_reg)
  - ecx_edx is not used as a register on x86.
This reverts commit e6f49b47b6a4dc9c7684e4483757872cfc7ff1a1.
Change-Id: I93159917565824084abc96775f31be1a4249f2f3
2015-09-29  Optimizing: Tag even more arena allocations.  Vladimir Marko
Tag previously "Misc" arena allocations with more specific allocation types. Move some native heap allocations to the arena in BCE. Bug: 23736311 Change-Id: If8ef15a8b614dc3314bdfb35caa23862c9d4d25c
2015-09-29  Optimizing: Tag arena allocations in code generators.  Vladimir Marko
And completely remove the deprecated GrowableArray. Replace GrowableArray with ArenaVector in code generators and related classes and tag arena allocations. Label arrays use direct allocations from ArenaAllocator because Label is non-copyable and non-movable and as such cannot be really held in a container. The GrowableArray never actually constructed them, instead relying on the zero-initialized storage from the arena allocator to be correct. We now actually construct the labels. Also avoid StackMapStream::ComputeDexRegisterMapSize() being passed null references, even though unused. Change-Id: I26a46fdd406b23a3969300a67739d55528df8bf4
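To illustrate the point about labels, the sketch below constructs non-copyable, non-movable objects in place inside raw storage rather than relying on the storage being zero-initialized. The Label type here is a toy stand-in with a non-zero initial state, and ConstructLabels is a hypothetical helper. Neither is the assembler's real label class or the arena allocator's API.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Toy label: non-copyable and non-movable, with an initial state (-1,
// meaning "unbound") that zero-filled memory would not provide.
struct Label {
  Label() : position(-1) {}
  Label(const Label&) = delete;
  Label& operator=(const Label&) = delete;
  int position;
};

// Constructs count labels in place in caller-provided raw storage, which
// must be suitably aligned and at least count * sizeof(Label) bytes.
inline Label* ConstructLabels(void* storage, std::size_t count) {
  assert(storage != nullptr);
  Label* labels = static_cast<Label*>(storage);
  for (std::size_t i = 0; i < count; ++i) {
    new (labels + i) Label();  // placement new: no copy or move required
  }
  return labels;
}
```

Because each object is built directly in its final location, the element type needs neither a copy nor a move constructor, which a general-purpose container would normally require when it grows.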
2015-09-17  ART: Refactor intrinsics slow-paths  Andreas Gampe
Refactor slow paths so that there is a default implementation for common cases (only arm64 with vixl is special). Write a generic intrinsic slow-path that can be reused for the specific architectures. Move helper functions into CodeGenerator so that they are accessible. Change-Id: Ibd788dce432601c6a9f7e6f13eab31f28dcb8550
2015-09-17  Revert "Support unresolved fields in optimizing"  Calin Juravle
Breaks debuggable tests. This reverts commit 23a8e35481face09183a24b9d11e505597c75ebb. Change-Id: I8e60b5c8f48525975f25d19e5e8066c1c94bd2e5
2015-09-17  Support unresolved fields in optimizing  Calin Juravle
Change-Id: I9941fa5fcb6ef0a7a253c7a0b479a44a0210aad4
2015-09-17  Support unresolved methods in Optimizing  Calin Juravle
Change-Id: If2da02b50d2fa668cd58f134a005f1752e7746b1
2015-09-16  Merge "Add OptimizingCompilerStats to the CodeGenerator class."  Calin Juravle
2015-09-16  Optimizing: Tag arena allocations in HGraph.  Vladimir Marko
Replace GrowableArray with ArenaVector in HGraph and related classes HEnvironment, HLoopInformation, HInvoke and HPhi, and tag allocations with new arena allocation types. Change-Id: I3d79897af405b9a1a5b98bfc372e70fe0b3bc40d
2015-09-15  Revert "Revert "ART: Register allocation and runtime support for try/catch""  David Brazdil
The original CL triggered b/24084144 which has been fixed by Ib72e12a018437c404e82f7ad414554c66a4c6f8c. This reverts commit 659562aaf133c41b8d90ec9216c07646f0f14362. Change-Id: Id8980436172457d0fcb276349c4405f7c4110a55