path: root/compiler/optimizing/graph_visualizer.cc
Age | Commit message | Author
2016-07-01 | Create a typedef for HInstruction::GetInputs() return type. | Vladimir Marko
And some other cleanup after https://android-review.googlesource.com/230742 Test: No new tests. ART test suite passed (tested on host). Change-Id: I4743bf17544d0234c6ccb46dd0c1b9aae5c93e17
2016-06-21 | Merge "Replace String.charAt() with HIR." | Vladimir Marko
2016-06-21 | Replace String.charAt() with HIR. | Vladimir Marko
Replace String.charAt() with HArrayLength, HBoundsCheck and HArrayGet. This allows GVN on the HArrayLength and BCE on the HBoundsCheck as well as using the infrastructure for HArrayGet, i.e. better handling of constant indexes than the old intrinsic and using the HArm64IntermediateAddress. Bug: 28330359 Change-Id: I32bf1da7eeafe82537a60416abf6ac412baa80dc
2016-06-21 | Improve HLoadClass code generation. | Vladimir Marko
For classes in the boot image, use either direct pointers or PC-relative addresses. For other classes, use PC-relative access to the dex cache arrays for AOT and direct address of the type's dex cache slot for JIT.
For aosp_flounder-userdebug:
- 32-bit boot.oat: -252KiB (-0.3%)
- 64-bit boot.oat: -412KiB (-0.4%)
- 32-bit dalvik cache total: -392KiB (-0.4%)
- 64-bit dalvik cache total: -2312KiB (-1.0%) (contains more files than the 32-bit dalvik cache)
For aosp_flounder-userdebug forced to compile PIC:
- 32-bit boot.oat: -124KiB (-0.2%)
- 64-bit boot.oat: -420KiB (-0.5%)
- 32-bit dalvik cache total: -136KiB (-0.1%)
- 64-bit dalvik cache total: -1136KiB (-0.5%) (contains more files than the 32-bit dalvik cache)
Bug: 27950288
Change-Id: I4da991a4b7e53c63c92558b97923d18092acf139
2016-06-02 | Refactor handling of input records. | Vladimir Marko
Introduce HInstruction::GetInputRecords(), a new virtual function that returns an ArrayRef<> to all input records. Implement all other functions dealing with input records as wrappers around GetInputRecords(). Rewrite functions that previously used multiple virtual calls to deal with input records, especially in loops, to prefetch the ArrayRef<> only once for each instruction. Besides avoiding all the extra calls, this also allows the compiler (clang++) to perform additional optimizations. This speeds up the Nexus 5 boot image compilation by ~0.5s (4% of "Compile Dex File", 2% of dex2oat time) on AOSP ToT. Change-Id: Id8ebe0fb9405e38d918972a11bd724146e4ca578
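A minimal sketch of the pattern this commit message describes, not the actual ART code: a single virtual accessor returns a view over all input records, and the other accessors are non-virtual wrappers around it. `ArrayView`, `InputRecord`, `Instruction`, `BinaryOp` and `SumInputIds` are hypothetical names chosen for illustration; `ArrayView` only stands in for ART's `ArrayRef<>`.

```cpp
#include <cstddef>

// Hypothetical stand-in for ArrayRef<>: a non-owning view over an array.
template <typename T>
class ArrayView {
 public:
  ArrayView(const T* data, std::size_t size) : data_(data), size_(size) {}
  const T* begin() const { return data_; }
  const T* end() const { return data_ + size_; }
  std::size_t size() const { return size_; }
  const T& operator[](std::size_t i) const { return data_[i]; }

 private:
  const T* data_;
  std::size_t size_;
};

// Hypothetical input record: just an id in this sketch.
struct InputRecord {
  int id;
};

class Instruction {
 public:
  virtual ~Instruction() = default;

  // The only virtual entry point: a view over all input records.
  virtual ArrayView<InputRecord> GetInputRecords() const = 0;

  // Non-virtual wrappers implemented on top of GetInputRecords().
  std::size_t InputCount() const { return GetInputRecords().size(); }
  const InputRecord& InputAt(std::size_t i) const { return GetInputRecords()[i]; }
};

class BinaryOp final : public Instruction {
 public:
  BinaryOp(int lhs, int rhs) : inputs_{{lhs}, {rhs}} {}
  ArrayView<InputRecord> GetInputRecords() const override {
    return ArrayView<InputRecord>(inputs_, 2);
  }

 private:
  InputRecord inputs_[2];
};

// Fetch the view once, then iterate without further virtual calls.
int SumInputIds(const Instruction& insn) {
  int sum = 0;
  for (const InputRecord& record : insn.GetInputRecords()) {
    sum += record.id;
  }
  return sum;
}
```

In a loop written against `InputCount()` and `InputAt(i)`, every iteration would go through the virtual call again; fetching the view once up front is the part the commit message credits for the speed-up.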
2016-05-12 | Merge "Fix oatdump crash on arm64/arm code. Also adds 16 bit literal information." | Aart Bik
2016-05-12 | Fix oatdump crash on arm64/arm code. | Aart Bik
Also adds 16 bit literal information. Rationale: When "run-away" instructions are disassembled, the literal addresses may go out of range, causing oatdump to crash. This CL guards memory access against the full memory range allocated to assembly instructions and data (it is possible but not really necessary to refine this a bit). Out of range arguments are now displayed as (?) to denote the issue, which is a lot nicer than crashing. BUG=28670871 Change-Id: I51e9b6a6a99162546fe31059f14278e8980451c2
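The guard described in this message can be sketched roughly as follows. This is an illustration under assumptions, not the oatdump code: `FormatLiteral16` is a hypothetical helper, and the buffer stands for the memory range allocated to instructions and data.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: formats the 16-bit literal stored at `offset` inside
// the buffer [base, base + size), or returns "(?)" when the read would fall
// outside that buffer.
std::string FormatLiteral16(const std::uint8_t* base, std::size_t size,
                            std::size_t offset) {
  // Check the offset first so that `size - offset` cannot wrap around.
  if (offset > size || size - offset < sizeof(std::uint16_t)) {
    return "(?)";
  }
  std::uint16_t value;
  std::memcpy(&value, base + offset, sizeof(value));  // safe for unaligned data
  return std::to_string(static_cast<unsigned>(value));
}
```

The check is deliberately coarse: it only guarantees that the read stays inside the buffer, which mirrors the message's remark that refining the range is possible but not really necessary.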
2016-05-12 | Fix another case of live_in at irreducible loop entry. | Nicolas Geoffray
GVN was implicitly extending the liveness of an instruction across an irreducible loop. Fix this problem by clearing the value set at loop entries that contain an irreducible loop. bug:28252896 (cherry picked from commit 77ce6430af2709432b22344ed656edd8ec80581b) Change-Id: Ie0121e83b2dfe47bcd184b90a69c0194d13fce54
2016-05-09 | Intrinsify String.length() and String.isEmpty() as HIR. | Vladimir Marko
Use HArrayLength for String.length() in anticipation of changing the String.charAt() to HBoundsCheck+HArrayGet to allow the existing BCE to seamlessly work for strings. Use HArrayLength+HEqual for String.isEmpty(). We previously relied on inlining but we now want to apply the new intrinsics even when we do not inline, i.e. when compiling debuggable (as is currently the case for boot image) or when we hit inlining limits, i.e. depth, size, or the number of accumulated dex registers. Bug: 28330359 Change-Id: Iab9d2f6d2967bdd930a72eb461f27efe8f37c103
2016-04-19 | Use iterators "before" the use node in HUserRecord<>. | Vladimir Marko
Create a new template class IntrusiveForwardList<> that mimics std::forward_list<> except that all allocations are handled externally. This is essentially the same as boost::intrusive::slist<> but since we're not using Boost we have to reinvent the wheel.
Use the new container to replace the HUseList and use the iterators to "before" use nodes in HUserRecord<> to avoid the extra pointer to the previous node which was used exclusively for removing nodes from the list. This reduces the size of the HUseListNode by 25%, 32B to 24B in 64-bit compiler, 16B to 12B in 32-bit compiler. This translates directly to overall memory savings for the 64-bit compiler but due to rounding up of the arena allocations to 8B, we do not get any improvement in the 32-bit compiler.
Compiling the Nexus 5 boot image with the 64-bit dex2oat on host this CL reduces the memory used for compiling the most hungry method, BatteryStats.dumpLocked(), by ~3.3MiB:
Before:
  MEM: used: 47829200, allocated: 48769120, lost: 939920
  Number of arenas allocated: 345, Number of allocations: 815492, avg size: 58
  ...
  UseListNode 13744640
  ...
After:
  MEM: used: 44393040, allocated: 45361248, lost: 968208
  Number of arenas allocated: 319, Number of allocations: 815492, avg size: 54
  ...
  UseListNode 10308480
  ...
Note that while we do not ship the 64-bit dex2oat to the device, the JIT compilation for 64-bit processes is using the 64-bit libart-compiler.
Bug: 28173563
Change-Id: I985eabd4816f845372d8aaa825a1489cf9569208
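To illustrate why a "before" position removes the need for a back pointer, here is a much reduced sketch of an intrusive singly linked list. It is not ART's IntrusiveForwardList<>; `UseNode` and `IntrusiveList` are hypothetical names, and iterators are simplified to raw node pointers.

```cpp
#include <cstddef>

// Hypothetical node type: the link lives inside the element itself, so the
// list never allocates and the caller owns all storage.
struct UseNode {
  explicit UseNode(int v) : value(v) {}
  int value;
  UseNode* next = nullptr;
};

class IntrusiveList {
 public:
  // Sentinel position that sits "before" the first element.
  UseNode* before_begin() { return &head_; }

  // Links `node` in directly after `before`.
  void insert_after(UseNode* before, UseNode* node) {
    node->next = before->next;
    before->next = node;
  }

  // Unlinks the node that follows `before`. Knowing the predecessor is
  // enough, so nodes do not need a pointer to the previous node.
  void erase_after(UseNode* before) {
    UseNode* victim = before->next;
    before->next = victim->next;
    victim->next = nullptr;
  }

  std::size_t size() const {
    std::size_t n = 0;
    for (const UseNode* p = head_.next; p != nullptr; p = p->next) {
      ++n;
    }
    return n;
  }

 private:
  UseNode head_{0};  // sentinel; its value is unused
};
```

A holder that remembers the position before its own node can remove that node in constant time. The cost, which this sketch leaves out, is that stored "before" positions of neighbouring nodes have to be updated whenever a node is inserted or removed next to them.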
2016-03-30 | Merge "Optimizing: Improve const-string code generation." | Vladimir Marko
2016-03-30 | ART: Flush ostream less frequently in GraphVisualizer | David Brazdil
We have seen Checker tests timing out on debug-GC configurations after having switched to Optimizing because its GraphVisualizer makes too many syscalls which the configuration keeps track of. This patch replaces std::endl with "\n" across GraphVisualizer so as to not flush the stream after every line of output. Bug: 27826765 Change-Id: I5e3f1e92f8a84f36d324d56945e2d420b2d36a5d
2016-03-29 | Optimizing: Improve const-string code generation. | Vladimir Marko
For strings in the boot image, use either direct pointers or PC-relative addresses. For other strings, use PC-relative access to the dex cache arrays for AOT and direct address of the string's dex cache slot for JIT.
For aosp_flounder-userdebug:
- 32-bit boot.oat: -692KiB (-0.9%)
- 64-bit boot.oat: -948KiB (-1.1%)
- 32-bit dalvik cache total: -900KiB (-0.9%)
- 64-bit dalvik cache total: -3672KiB (-1.5%) (contains more files than the 32-bit dalvik cache)
For aosp_flounder-userdebug forced to compile PIC:
- 32-bit boot.oat: -380KiB (-0.5%)
- 64-bit boot.oat: -928KiB (-1.0%)
- 32-bit dalvik cache total: -468KiB (-0.4%)
- 64-bit dalvik cache total: -1928KiB (-0.8%) (contains more files than the 32-bit dalvik cache)
Bug: 26884697
Change-Id: Iec7266ce67e6fedc107be78fab2e742a8dab2696
2016-03-24 | ART: Loosen a GraphChecker rule on Boolean inputs | David Brazdil
GraphChecker tries to verify that Boolean inputs are properly typed. This is non-trivial in the presence of simplifying optimizations which capitalize on the fact that a Boolean value is internally represented as an integer. This patch removes the test from GraphChecker. Bug: 27625564 Change-Id: Ic61ea2193765b4578550538e965ca4f80fa4b287
2016-03-23 | Ensure object ArraySet with null value does not need a type check. | Roland Levillain
The art::PrepareForRegisterAllocation visitor can remove an art::BoundType instruction as value input of an art::ArraySet instruction, possibly replacing it with an art::NullConstant. If this happens, remove the need for a type check in this art::ArraySet. Bug: 27638110 Change-Id: I6270f8a8e22822a24d8a5919df427ca9c64d121b
2016-03-11 | Integrate BitwiseNegated into shared framework. | Artem Serov
Share implementation between arm and arm64. Change-Id: I0dd12e772cb23b4c181fd0b1e2a447470b1d8702
2016-02-25 | Optimizing: ARM64 negated bitwise operations simplification | Kevin Brodsky
Use negated instructions on ARM64 to replace [bitwise operation + not] patterns, that is:
    a & ~b (BIC)
    a | ~b (ORN)
    a ^ ~b (EON)
The simplification only happens if the Not is only used by the bitwise operation. It does not happen if both inputs are Not's (this should be handled by a generic simplification applying De Morgan's laws).
Change-Id: I0e112b23fd8b8e10f09bfeff5994508a8ff96e9c
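For reference, the three source-level patterns and the De Morgan case mentioned in this message look like this in plain C++. The function names are hypothetical, and which machine instructions a given compiler selects for them is not something this sketch asserts.

```cpp
#include <cstdint>

// The three [bitwise operation + not] patterns from the commit message.
std::uint32_t Bic(std::uint32_t a, std::uint32_t b) { return a & ~b; }
std::uint32_t Orn(std::uint32_t a, std::uint32_t b) { return a | ~b; }
std::uint32_t Eon(std::uint32_t a, std::uint32_t b) { return a ^ ~b; }

// When both inputs are negated, De Morgan's law gives an equivalent form
// with a single Not: ~a & ~b == ~(a | b).
std::uint32_t NotAndNot(std::uint32_t a, std::uint32_t b) { return ~a & ~b; }
std::uint32_t NotOfOr(std::uint32_t a, std::uint32_t b) { return ~(a | b); }
```

`NotAndNot` and `NotOfOr` return the same value for every input, which is why the message leaves the both-inputs-negated case to a generic simplification.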
2016-02-25 | Revert "Revert "ARM/ARM64: Extend support of instruction combining."" | Artem Udovichenko
This reverts commit 6b5afdd144d2bb3bf994240797834b5666b2cf98. Change-Id: Ic27a10f02e21109503edd64e6d73d1bb0c6a8ac6
2016-02-17 | Extend constant folding to float and double operations. | Roland Levillain
Change-Id: I2837064b2ceea587bc171fc520507f13355292c6
2016-02-15 | ART: Run SsaBuilder from HGraphBuilder | David Brazdil
First step towards merging the two passes, which will later result in HGraphBuilder directly producing SSA form. This CL mostly just updates tests broken by not being able to inspect the pre-SSA form. Using HLocals outside the HGraphBuilder is now deprecated. Bug: 27150508 Change-Id: I00fb6050580f409dcc5aa5b5aa3a536d6e8d759e
2016-02-11 | Fix x86-64 Baker's read barrier fast path for CheckCast. | Roland Levillain
Use an art::x86_64::Label instead of an art::x86_64::NearLabel as end label when emitting code for an HCheckCast instruction, as the range of the latter may sometimes be too short when Baker's read barriers are enabled. Bug: 12687968 Change-Id: Ia9742dce65be7d4fb104688f3c4717b65df1fb54
2016-01-28 | Revert "Revert "Lift the spill at each irreducible loop block restriction."" | Nicolas Geoffray
This reverts commit 2818dbcd75ea9beadcba9d18e2f68523108d0cf5. Change-Id: I92b2b60b4f08f50cacfea4132f1c28cfbd628f1a
2016-01-26 | Some minor simplifications in code and tests. | Aart Bik
Background: This is actually a resubmit of an earlier cl that was reverted because a test was less robust against inlining changes (it assumed a virtual call would never be inlined).
original cl: If8ada79dfd70bea991c11d2b18661b951b6c4cd4
revert cl: I739aaaccd0509d02a62ef01e797a6d45bfe941df
Change-Id: I952680d60ff488874907f066bfdf156a45b409ba
2016-01-26 | Revert "Lift the spill at each irreducible loop block restriction." | Bart Sears
This reverts commit 79e9f43951c3cfa9ab3b0fea93e5bfdfa7aa5950. Change-Id: I0670618b4076e06bd3f6bf8c385abfd1b651393c
2016-01-26 | Lift the spill at each irreducible loop block restriction. | Nicolas Geoffray
It was not intended to have it this way anyway. This also required fixing GetSiblingAt to take interval holes into account, and ConnectSplitSibling to re-materialize a constant or a method. Change-Id: Ia5534a93a5413cd0458a251c022d0b655369502b
2016-01-22 | Revert "Some minor simplifications in code and tests." | Nicolas Geoffray
Fails 530-checker-loops on arm. This reverts commit bf03fcd10a3ffa15468d335f26697b0473e45b36. Change-Id: I739aaaccd0509d02a62ef01e797a6d45bfe941df
2016-01-20 | Some minor simplifications in code and tests. | Aart Bik
Rationale: fell through the cracks of previous "intrinsics" CL. Change-Id: If8ada79dfd70bea991c11d2b18661b951b6c4cd4
2016-01-14 | Implement irreducible loop support in optimizing. | Nicolas Geoffray
So we don't fall back to the interpreter in the presence of irreducible loops.
Implications:
- A loop pre-header does not necessarily dominate a loop header.
- Non-constant redundant phis will be kept in loop headers, to satisfy our linear scan register allocation algorithm.
- Whole-graph optimizations, such as gvn, licm, lse, and dce need to know when they are dealing with irreducible loops.
Change-Id: I2cea8934ce0b40162d215353497c7f77d6c9137e
2015-12-31 | ART: Refactor SsaBuilder for more precise typing info | David Brazdil
This reverts commit 68289a531484d26214e09f1eadd9833531a3bc3c. Now uses Primitive::Is64BitType instead of Primitive::ComponentSize because it was incorrectly optimized by GCC. Bug: 26208284 Bug: 24252151 Bug: 24252100 Bug: 22538329 Bug: 25786318 Change-Id: Ib39f3da2b92bc5be5d76f4240a77567d82c6bebe
2015-12-15 | Revert "ART: Refactor SsaBuilder for more precise typing info" | Alex Light
This reverts commit d9510dfc32349eeb4f2145c801f7ba1d5bccfb12. Bug: 26208284 Bug: 24252151 Bug: 24252100 Bug: 22538329 Bug: 25786318 Change-Id: I5f491becdf076ff51d437d490405ec4e1586c010
2015-12-14 | ART: Refactor SsaBuilder for more precise typing info | David Brazdil
This patch refactors the SsaBuilder to do the following:
1) All phis are constructed live and marked dead if not used or proved to be conflicting.
2) Primitive type propagation, now not a separate pass, identifies conflicting types and marks corresponding phis dead.
3) When compiling --debuggable, DeadPhiHandling used to revive phis which had only environmental uses but did not attempt to resolve conflicts. This pass was removed as obsolete and is now superseded by primitive type propagation (identifying conflicting phis) and SsaDeadPhiElimination (keeping phis live if debuggable + env use).
4) Resolving conflicts requires correct primitive type information on all instructions. This was not the case for ArrayGet instructions which can have ambiguous types in the bytecode. To this end, SsaBuilder now runs reference type propagation and types ArrayGets from the type of the input array.
5) With RTP being run inside the SsaBuilder, it is not necessary to run it as a separate optimization pass. Optimizations can now assume that all instructions of type kPrimNot have reference type info after SsaBuilder (with the exception of NullConstant).
6) Graph now contains a reference type to be assigned to NullConstant. All reference type instructions therefore have RTI, as now enforced by the SsaChecker.
Bug: 24252151 Bug: 24252100 Bug: 22538329 Bug: 25786318
Change-Id: I7a3aee1ff66c82d64b4846611c547af17e91d260
2015-12-02 | Merge "Revert "Revert "Don't use the compiler driver for method resolution.""" | Nicolas Geoffray
2015-12-02 | Revert "Revert "Don't use the compiler driver for method resolution."" | Nicolas Geoffray
This reverts commit c88ef3a10c474045a3476a02ae75d07ddd3230b7. Change-Id: I0ed88a48b313a8d28bc39fae40631123aadb13ef
2015-12-02 | Optimizing: Add checker tests for sharpening. | Vladimir Marko
This is a follow-up to https://android-review.googlesource.com/184116 . Change-Id: Ib03c424fb673afc5ccce15d7d072b7572b47799a
2015-12-01 | Revert "Don't use the compiler driver for method resolution." | Nicolas Geoffray
Fails 425 in debuggable mode. This reverts commit 4db0bf9c4db6a09716c3388b7d2f88d534470339. Change-Id: I346df8f75674564fc4fb241c60f23e250fc7f0a7
2015-12-01 | Don't use the compiler driver for method resolution. | Nicolas Geoffray
The compiler driver makes assumptions that don't hold for the optimizing compiler, and will for example always go to slow path for an invoke-super when there's no verified method. Also fix GenerateInvokeVirtual in the presence of intrinsics. Next change will address some of the TODOs in sharpening.cc. Change-Id: I2b0e543ee9b9bebcadb2d26de29e850c59ad58b9
2015-11-25 | ARM64: Use the shifter operands. | Alexandre Rames
This introduces architecture-specific instruction simplification. On ARM64 we try to merge shifts and sign-extension operations into arithmetic and logical instructions. For example for the Java code
    int res = a + (b << 5);
we would generate
    lsl w3, w2, #5
    add w0, w1, w3
and we now generate
    add w0, w1, w2, lsl #5
Change-Id: Ic03bdff44a1c12e21ddff1b0513bd32a730742b7
2015-11-20 | ARM64: Add support for multiply-accumulate. | Alexandre Rames
Change-Id: I88dc313df520480f3fd16bbabda27f9435d25368
2015-11-19 | Merge "Allow NullConstant to be untyped in GraphVisualiser." | Calin Juravle
2015-11-19 | Allow NullConstant to be untyped in GraphVisualiser. | Mark Mendell
The NullConstant may be added to the graph during other passes that happen between ReferenceTypePropagation and Inliner (e.g. InstructionSimplifier). If the inliner doesn't run or doesn't inline anything, the NullConstant remains untyped. The infrastructure to properly type NullConstants everywhere is too complex to add for the benefits. Bug: 25786318 Change-Id: I904a3e605b57f8cac9936e82f19a4994c7b1a82a Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
2015-11-19 | Fix ClinitCheck pruning. | Vladimir Marko
Make sure we merge the ClinitCheck only with LoadClass and HInvokeStaticOrDirect that is a part of the very same dex instruction. This fixes incorrect stack traces from class initializers (wrong dex pcs).
Rewrite the pruning to do all the ClinitCheck merging when we see the ClinitCheck, instead of merging ClinitCheck into LoadClass and then LoadClass into HInvokeStaticOrDirect. When we later see an HInvokeStaticOrDirect with an explicit check (i.e. not merged), we know that some other instruction is doing the check and the invoke doesn't need to, so we mark it as not requiring the check at all. (Previously it would have been marked as having an implicit check.)
Remove the restriction on merging with inlined invoke static as this is not necessary anymore. This was a workaround for
    X.test():
      invoke-static C.foo() [1]
    C.foo():
      invoke-static C.bar() [2]
After inlining and GVN we have
    X.test():
      LoadClass C (from [1])
      ClinitCheck C (from [1], to be merged to LoadClass)
      InvokeStaticOrDirect C.bar() (from [2])
and the LoadClass must not be merged into the invoke as this would cause the resolution trampoline to see an inlined frame from the not-yet-loaded class C during the stack walk and try to load the class. However, we're not allowed to load new classes at that point, so an attempt to do so leads to an assertion failure. With this CL, LoadClass is not merged when it comes from a different instruction, so we can guarantee that all inlined frames seen by the stack walk in the resolution trampoline belong to already loaded classes.
Change-Id: I2b8da8d4f295355dce17141f0fab2dace126684d
2015-11-11 | Revert "Revert "Run type propagation after inliner only when needed."" | Calin Juravle
This reverts commit 271743601650308c7ac5c7a3ec35025d8130a298. Change-Id: I173e27a0a4d7d54f90ca459eb48d280d1d40ab70
2015-11-10 | ART: Refactor iteration over normal/exceptional successors | David Brazdil
Add helper methods on HBasicBlock which return ArrayRef with the suitable sub-array of the `successors_` list. Change-Id: I66c83bb56f2984d7550bf77c48110af4087515a8
2015-10-26 | Revert "Run type propagation after inliner only when needed." | Calin Juravle
This reverts commit 4e5dd521063beae1706410419f19c7e224db50fe. Change-Id: I0de261d14dd3f71abe05f9bc71744820cf23b937
2015-10-23 | Run type propagation after inliner only when needed. | Calin Juravle
Currently we run a type propagation pass unconditionally after the inliner. This change looks at the returned value (if any) and runs a minimal type propagation only if its type has changed. Change-Id: I0dd72bd481219081e8a978d2632426afc980d73a
2015-10-08 | Merge "Add DCHECKs to ArenaVector and ScopedArenaVector." | Vladimir Marko
2015-10-08 | Make sure classes with different access checks are not GVN-ed | Calin Juravle
Change-Id: I89f72fef3be35a4dd9585d97d03a3150386e0891
2015-10-08 | Add DCHECKs to ArenaVector and ScopedArenaVector. | Vladimir Marko
Implement dchecked_vector<> template that DCHECK()s element access and insert()/emplace()/erase() positions. Change the ArenaVector<> and ScopedArenaVector<> aliases to use the new template instead of std::vector<>. Remove DCHECK()s that have now become unnecessary from the Optimizing compiler. Change-Id: Ib8506bd30d223f68f52bd4476c76d9991acacadc
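The idea of a vector whose accessors check their arguments in debug builds can be sketched as below. This is a reduced illustration, not ART's dchecked_vector<>: `CheckedVector` is a hypothetical name, `assert()` stands in for DCHECK(), and only a few members are covered.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, reduced analogue of a debug-checked vector. The checks use
// assert(), so they compile away when NDEBUG is defined.
template <typename T>
class CheckedVector : private std::vector<T> {
  using Base = std::vector<T>;

 public:
  using Base::Base;  // reuse std::vector's constructors
  using Base::empty;
  using Base::push_back;
  using Base::size;

  T& operator[](std::size_t n) {
    assert(n < size());  // unchecked in std::vector
    return Base::operator[](n);
  }
  const T& operator[](std::size_t n) const {
    assert(n < size());
    return Base::operator[](n);
  }

  T& back() {
    assert(!empty());  // back() on an empty std::vector is undefined
    return Base::back();
  }
};
```

Centralising the checks in the container is what makes per-call-site checks redundant, as the message notes for the Optimizing compiler.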
2015-10-06 | Add support for unresolved classes in optimizing. | Calin Juravle
Change-Id: I0e299a81e560eb9cb0737ec46125dffc99333b54
2015-10-02 | Revert "Revert "Support unresolved fields in optimizing" | Calin Juravle
The CL also changes the calling convention for 64bit static field set to use kArg2 instead of kArg1. This allows optimizing to keep the assumptions:
- arm pairs are always of form (even_reg, odd_reg)
- ecx_edx is not used as a register on x86.
This reverts commit e6f49b47b6a4dc9c7684e4483757872cfc7ff1a1.
Change-Id: I93159917565824084abc96775f31be1a4249f2f3