summaryrefslogtreecommitdiff
path: root/compiler/optimizing/graph_visualizer.cc
AgeCommit message (Collapse)Author
2018-05-11ART: Compiler support for const-method-handleOrion Hodson
Implemented as a runtime call. Bug: 66890674 Test: art/test.py --target -r -t 979 Test: art/test.py --target --64 -r -t 979 Test: art/test.py --host -r -t 979 Change-Id: I67f461c819a7d528d7455afda8b4a59e9aed381c
2018-05-10ART: Compiler support for const-method-typeOrion Hodson
Implemented as a runtime call. Bug: 66890674 Test: art/test.py --target -r -t 979 Test: art/test.py --target --64 -r -t 979 Test: art/test.py --host -r -t 979 Change-Id: I4b3d3969d455d0198cfe122eea8abd54e0ea20ee
2018-03-27Merge "Revert^4 "Compiler changes for bitstring based type checks.""Treehugger Robot
2018-03-27Revert^4 "Compiler changes for bitstring based type checks."Vladimir Marko
Disabled the build time flag. (No image version bump needed.) Bug: 26687569 Bug: 64692057 Bug: 76420366 This reverts commit 3fbd3ad99fad077e5c760e7238bcd55b07d4c06e. Change-Id: I5d83c4ce8a7331c435d5155ac6e0ce1c77d60004
2018-03-26Revert^3 "Compiler changes for bitstring based type checks."Andreas Gampe
This reverts commit 3f41323cc9da335e9aa4f3fbad90a86caa82ee4d. Reason for revert: Fails sporadically. Bug: 26687569 Bug: 64692057 Bug: 76420366 Change-Id: I84d1e9e46c58aeecf17591ff71fbac6a1e583909
2018-03-26ART: Fix infinite recursion for deopt at dex pc 0.Vladimir Marko
Previously, the interpreter checked for dex pc 0 to see if the method was just entered. If we deopt at dex pc 0, the instrumentation would emit an erroneous MethodEnteredEvent and the JIT would have received a MethodEntered() call. For JIT-on-first-use, the method would be compiled the same way as before, leading to the same deopt until stack overflow. We fix this by using a new `from_deoptimize` flag passed by the caller. Test: 680-checker-deopt-dex-pc-0 Test: testrunner.py --host \ --jit --runtime-option=-Xjitthreshold:0 Bug: 62611253 Change-Id: I50b88f15484aeae16e1375a1d80f6563fb9066e7
2018-03-22Revert^2 "Compiler changes for bitstring based type checks."Vladimir Marko
Add extra output for debugging failures and re-enable the bitstring type checks. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing --jit Test: testrunner.py --host -t 670-bitstring-type-check Test: Pixel 2 XL boots. Test: testrunner.py --target --optimizing --jit Test: testrunner.py --target -t 670-bitstring-type-check Bug: 64692057 Bug: 26687569 This reverts commit bff7a52e2c6c9e988c3ed1f12a2da0fa5fd37cfb. Change-Id: I090e241983f3ac6ed8394d842e17716087d169ac
2018-02-21Remove duplication, split testsDavid Sehr
The code move to libdexfile/dex/descriptors_names.cc apparently did not remove the original code from runtime/utils.cc. Fix that duplication and all the header mentions needed. Also, split the test files to go along with the new locations for the code to be tested. Bug: 22322814 Test: make -j 50 checkbuild make -j 50 test-art-host-gtest flash & boot marlin Change-Id: Ie734672c4bca2c647d8016291f910b5608674545
2018-02-07Don't analyze methods with verification errors.Aart Bik
With regression test! Bug: 72874888 Test: test-art-host Change-Id: Icb3ec5dbfa14a1f77da681ba7e100ec9a5ab9ba6
2018-02-01Clean up signed/unsigned in vectorizer.Aart Bik
Rationale: Currently we have some remaining ugliness around signed and unsigned SIMD operations due to lack of kUint32 and kUint64 in the HIR. By "softly" introducing these types, ABS/MIN/MAX/HALVING_ADD/SAD_ACCUMULATE operations can solely rely on the packed data types to distinguish between signed and unsigned operations. Cleaner, and also allows for some code removal in the current loop optimizer. Bug: 72709770 Test: test-art-host test-art-target Change-Id: I68e4cdfba325f622a7256adbe649735569cab2a3
2018-01-25Revert "Compiler changes for bitstring based type checks."Nicolas Geoffray
Bug: 64692057 Bug: 71853552 Bug: 26687569 This reverts commit eb0ebed72432b3c6b8c7b38f8937d7ba736f4567. Change-Id: I7daeaa077960ba41b2ed42bc47f17501621be4be
2018-01-23Compiler changes for bitstring based type checks.Vladimir Marko
We guard the use of this feature with a compile-time flag, set to true in this CL. Boot image size for aosp_taimen-userdebug in AOSP master: - before: arm boot*.oat: 63604740 arm64 boot*.oat: 74237864 - after: arm boot*.oat: 63531172 (-72KiB, -0.1%) arm64 boot*.oat: 74135008 (-100KiB, -0.1%) The new TypeCheckBenchmark yields the following changes using the little cores of taimen fixed at 1.4016GHz: 32-bit 64-bit timeCheckCastLevel1ToLevel1 11.48->15.80 11.47->15.78 timeCheckCastLevel2ToLevel1 15.08->15.79 15.08->15.79 timeCheckCastLevel3ToLevel1 19.01->15.82 17.94->15.81 timeCheckCastLevel9ToLevel1 42.55->15.79 42.63->15.81 timeCheckCastLevel9ToLevel2 39.70->14.36 39.70->14.35 timeInstanceOfLevel1ToLevel1 13.74->17.93 13.76->17.95 timeInstanceOfLevel2ToLevel1 17.02->17.95 16.99->17.93 timeInstanceOfLevel3ToLevel1 24.03->17.95 24.45->17.95 timeInstanceOfLevel9ToLevel1 47.13->17.95 47.14->18.00 timeInstanceOfLevel9ToLevel2 44.19->16.52 44.27->16.51 This suggests that the bitstring typecheck should not be used for exact type checks which would be equivalent to the "Level1ToLevel1" benchmark. Whether the implementation is a beneficial replacement for the kClassHierarchyCheck and kAbstractClassCheck on average depends on how many levels from the target class (or Object for a negative result) is a typical object's class. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing --jit Test: testrunner.py --host -t 670-bitstring-type-check Test: Pixel 2 XL boots. Test: testrunner.py --target --optimizing --jit Test: testrunner.py --target -t 670-bitstring-type-check Bug: 64692057 Bug: 71853552 Bug: 26687569 Change-Id: I538d7e036b5a8ae2cc3fe77662a5903d74854562
2017-11-02ART: Make InstructionSet an enum class and add kLast.Vladimir Marko
Adding InstructionSet::kLast shall make it easier to encode the InstructionSet in fewer bits using BitField<>. However, introducing `kLast` into the `art` namespace is not a good idea, so we change the InstructionSet to an enum class. This also uncovered a case of InstructionSet::kNone being erroneously used instead of vixl32::Condition::None(), so it's good to remove `kNone` from the `art` namespace. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I6fa6168dfba4ed6da86d021a69c80224f09997a6
2017-10-27Alignment optimizations in vectorizer.Aart Bik
Rationale: Since aligned data access is generally better (enables more efficient aligned moves and prevents nasty cache line splits), computing and/or enforcing alignment has been added to the vectorizer: (1) If the initial alignment is known completely and suffices, then a static peeling factor enforces proper alignment. (2) If (1) fails, but the base alignment allows, dynamically peeling until total offset is aligned forces proper aligned access patterns. By using ART conventions only, any forced alignment is preserved over suspends checks where data may move. Note 1: Current allocation convention is just 8 byte alignment on arrays/strings, so only ARM32 benefits. However, all optimizations are implemented in a general way, so moving to a 16 byte alignment will immediately take advantage of any new convention!! Note 2: This CL also exposes how bad the choice of 12 byte offset of arrays really is. Even though the new optimizations fix the misaligned, it requires peeling for the most common case: 0 indexed loops. Therefore, we may even consider moving to a 16 byte offset. Again the optimizations in this CL will immediately take advantage of that new convention!! Test: test-art-host test-art-target Change-Id: Ib6cc0fb68c9433d3771bee573603e64a3a9423ee
2017-10-11Use ScopedArenaAllocator for building HGraph.Vladimir Marko
Memory needed to compile the two most expensive methods for aosp_angler-userdebug boot image: BatteryStats.dumpCheckinLocked() : 21.1MiB -> 20.2MiB BatteryStats.dumpLocked(): 42.0MiB -> 40.3MiB This is because all the memory previously used by the graph builder is reused by later passes. And finish the "arena"->"allocator" renaming; make renamed allocator pointers that are members of classes const when appropriate (and make a few more members around them const). Test: m test-art-host-gtest Test: testrunner.py --host Bug: 64312607 Change-Id: Ia50aafc80c05941ae5b96984ba4f31ed4c78255e
2017-10-03ART: Introduce Uint8 compiler data type.Vladimir Marko
This CL adds all the necessary codegen for the Uint8 type but does not add code transformations that use that code. Vectorization codegens are modified to use Uint8 as the packed type when appropriate. The side effects are now disconnected from the instruction's type after the graph has been built to allow changing HArrayGet/H*FieldGet/HVecLoad to use a type different from the underlying field or array. Note: HArrayGet for String.charAt() is modified to have no side effects whatsoever; Strings are immutable. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing --jit Test: testrunner.py --target --optimizing on Nexus 6P Test: Nexus 6P boots. Bug: 23964345 Change-Id: If2dfffedcfb1f50db24570a1e9bd517b3f17bfd0
2017-09-25ART: Introduce compiler data type.Vladimir Marko
Replace most uses of the runtime's Primitive in compiler with a new class DataType. This prepares for introducing new types, such as Uint8, that the runtime does not need to know about. Test: m test-art-host-gtest Test: testrunner.py --host Bug: 23964345 Change-Id: Iec2ad82454eec678fffcd8279a9746b90feb9b0c
2017-08-30ART: Describe static fields in GraphVisualizer.Vladimir Marko
Test: Rely on TreeHugger. Change-Id: I3388a469a96c665abc51abe2cf7d2b2004db7d78
2017-06-26Don't use the graph's dex file when printing HInvoke.Nicolas Geoffray
It's not the right dex file if the invokes come from inlined methods. Test: manual Change-Id: I4e3fb35e2bddc67510c39e12075c9a5ca0498a3a
2017-06-01Use IntrusiveForwardList<> for Env-/UsePosition.Vladimir Marko
Test: m test-art-host-gtest Test: testrunner.py --host Change-Id: I2b720e2ed8f96303cf80e9daa6d5278bf0c3da2f
2017-05-15Min/max SIMDization support.Aart Bik
Rationale: The more vectorized, the better! Test: test-art-target, test-art-host Change-Id: I758becca5beaa5b97fab2ab70f2e00cb53458703
2017-04-20ARM64: Support MultiplyAccumulate for SIMD.Artem Serov
Test: test-art-host, test-art-target. Change-Id: I06af8415e15352d09d176cae828163cbe99ae7a7
2017-04-19Implement halving add idiom (with checker tests).Aart Bik
Rationale: First of several idioms that map to very efficient SIMD instructions. Note that the is-zero-ext and is-sign-ext are general-purpose utilities that will be widely used in the vectorizer to detect low precision idioms, so expect that code to be shared with many CLs to come. Test: test-art-host, test-art-target Change-Id: If7dc2926c72a2e4b5cea15c44ef68cf5503e9be9
2017-03-28Merge "Make data dependency around HDeoptimize correct."Nicolas Geoffray
2017-03-27Make data dependency around HDeoptimize correct.Nicolas Geoffray
We use HDeoptimize in a few places, but when it comes to data dependency we either: - don't have any (BCE, CHA), in which case we should make sure no code that the deoptimzation guards moves before the HDeoptimize - have one on the receiver (inline cache), in which case we can update the dominated users with the HDeoptimize to get the data dependency correct. bug:35661819 bug:36371709 test: 644-checker-deopt Change-Id: I4820c6710b06939e7f5a59606971693e995fb958
2017-03-23Implement a SIMD spilling slot.Aart Bik
Rationale: The last ART vectorizer break-out CL \O/ This ensures spilling on x86 and x86_4 is correct. Also, it paves the way to wider SIMD on ARM and MIPS. Test: test-art-host Bug: 34083438 Change-Id: I5b27d18c2045f3ab70b64c335423b3ff2a507ac2
2017-02-17ARM: Merge data-processing instructions and shifts/(un)signed extensionsAnton Kirilov
This commit mirrors the work that has already been done for ARM64. Test: m test-art-target-run-test-551-checker-shifter-operand Change-Id: Iec8c1563b035f40f0e18dcffde28d91dc21922f8
2017-01-15Revert "Revert "ART: Compiler support for invoke-polymorphic.""Orion Hodson
This reverts commit 0fb5af1c8287b1ec85c55c306a1c43820c38a337. This takes us back to the original change and attempts to fix the issues encountered: - Adds transition record push/pop around artInvokePolymorphic. - Changes X86/X64 relocations for MacSDK. - Implements MIPS entrypoint for art_quick_invoke_polymorphic. - Corrects size of returned reference in art_quick_invoke_polymorphic on ARM. Bug: 30550796,33191393 Test: art/test/run-test 953 Test: m test-art-run-test Change-Id: Ib6b93e00b37b9d4ab743a3470ab3d77fe857cda8
2017-01-11Revert "ART: Compiler support for invoke-polymorphic."Orion Hodson
This reverts commit 02e3092f8d98f339588e48691db77f227b48ac1e. Reasons for revert: - Breaks MIPS/MIPS64 build. - Fails under GCStress test on x64. - Different x64 build configuration doesn't like relocation. Change-Id: I512555b38165d05f8a07e8aed528f00302061001
2017-01-11ART: Compiler support for invoke-polymorphic.Orion Hodson
Adds basic support to invoke method handles in compiled code. Enables method verification for methods containing invoke-polymorphic. Adds k45cc/k45rc output to Instruction::DumpString() which was found to be missing when enabling verification. Include stack traces in test 957-methodhandle-transforms for failures so they can be easily identified. Bug: 30550796,33191393 Test: art/test/run-test 953 Test: m test-art-run-test Change-Id: Ic9a96ea24906087597d96ad8159a5bc349d06950
2016-10-18Remove mirror:: and ArtMethod deps in utils.{h,cc}David Sehr
The latest chapter in the ongoing saga of attempting to dump a DEX file without having to start a whole runtime instance. This episode finds us removing references to ArtMethod/ArtField/mirror. One aspect of this change that I would like to call out specfically is that the utils versions of the "Pretty*" functions all were written to accept nullptr as an argument. I have split these functions up as follows: 1) an instance method, such as PrettyClass that obviously requires this != nullptr. 2) a static method, that behaves the same way as the util method, but calls the instance method if p != nullptr. This requires using a full class qualifier for the static methods, which isn't exactly beautiful. I have tried to remove as many cases as possible where it was clear p != nullptr. Bug: 22322814 Test: test-art-host Change-Id: I21adee3614aa697aa580cd1b86b72d9206e1cb24
2016-09-23Clean-up sharpening and compiler driver.Nicolas Geoffray
Remove dependency on compiler driver for sharpening and dex2dex (the methods called on the compiler driver were doing unnecessary work), and remove the now unused methods in compiler driver. Also remove test that is now invalid, as sharpening always succeeds. test: m test-art-host m test-art-target Change-Id: I54e91c6839bd5b0b86182f2f43ba5d2c112ef908
2016-08-19ART: Add thread offset printing hook to disassemblerAndreas Gampe
To prepare separation of disassembler from libart, add a function hook to the disassembler options for thread offset name printing. Bug: 15436106 Change-Id: I9e9b7e565ae923952c64026f675ac527b560f51b
2016-08-02Improve the graph visualizer's output for constant locations.Alexandre Rames
Change-Id: I423fb378ee61fb53c3b328fc74f4e95cdef0992a
2016-07-15Rename current register allocator implementationMatthew Gharrity
This will allow a cleaner commit in an upcoming refactoring of register allocation. Test: m test-art-host Change-Id: If420c97b088b3c934411ff83373e024003120746
2016-07-13X86: Use memory to do array range checksMark Mendell
Currently, an HBoundsCheck is fed by an HArrayLength, causing a load of the array length, followed by a register compare. Avoid the load when we can by comparing directly with the array length in memory. Implement this by marking the HArrayLength as 'emitted at use site', and then generating the code in the HBoundsCheck. Only do this replacement when we are the only user of the ArrayLength and it isn't visible to the environment. Handle the special case where the array is 'null' and where an implicit null check can't be eliminated. This code moves the load of the length to the slow code for the failed check, which is what we want. Test: 609-checker-x86-bounds-check Change-Id: I9cdb183301e048234bb0ffeda940eedcf4a655bd Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
2016-07-01Create a typedef for HInstruction::GetInputs() return type.Vladimir Marko
And some other cleanup after https://android-review.googlesource.com/230742 Test: No new tests. ART test suite passed (tested on host). Change-Id: I4743bf17544d0234c6ccb46dd0c1b9aae5c93e17
2016-06-21Merge "Replace String.charAt() with HIR."Vladimir Marko
2016-06-21Replace String.charAt() with HIR.Vladimir Marko
Replace String.charAt() with HArrayLength, HBoundsCheck and HArrayGet. This allows GVN on the HArrayLength and BCE on the HBoundsCheck as well as using the infrastructure for HArrayGet, i.e. better handling of constant indexes than the old intrinsic and using the HArm64IntermediateAddress. Bug: 28330359 Change-Id: I32bf1da7eeafe82537a60416abf6ac412baa80dc
2016-06-21Improve HLoadClass code generation.Vladimir Marko
For classes in the boot image, use either direct pointers or PC-relative addresses. For other classes, use PC-relative access to the dex cache arrays for AOT and direct address of the type's dex cache slot for JIT. For aosp_flounder-userdebug: - 32-bit boot.oat: -252KiB (-0.3%) - 64-bit boot.oat: -412KiB (-0.4%) - 32-bit dalvik cache total: -392KiB (-0.4%) - 64-bit dalvik-cache total: -2312KiB (-1.0%) (contains more files than the 32-bit dalvik cache) For aosp_flounder-userdebug forced to compile PIC: - 32-bit boot.oat: -124KiB (-0.2%) - 64-bit boot.oat: -420KiB (-0.5%) - 32-bit dalvik cache total: -136KiB (-0.1%) - 64-bit dalvik-cache total: -1136KiB (-0.5%) (contains more files than the 32-bit dalvik cache) Bug: 27950288 Change-Id: I4da991a4b7e53c63c92558b97923d18092acf139
2016-06-02Refactor handling of input records.Vladimir Marko
Introduce HInstruction::GetInputRecords(), a new virtual function that returns an ArrayRef<> to all input records. Implement all other functions dealing with input records as wrappers around GetInputRecords(). Rewrite functions that previously used multiple virtual calls to deal with input records, especially in loops, to prefetch the ArrayRef<> only once for each instruction. Besides avoiding all the extra calls, this also allows the compiler (clang++) to perform additional optimizations. This speeds up the Nexus 5 boot image compilation by ~0.5s (4% of "Compile Dex File", 2% of dex2oat time) on AOSP ToT. Change-Id: Id8ebe0fb9405e38d918972a11bd724146e4ca578
2016-05-12Merge "Fix oatdump crash on arm64/arm code. Also adds 16 bit literal ↵Aart Bik
information."
2016-05-12Fix oatdump crash on arm64/arm code.Aart Bik
Also adds 16 bit literal information. Rationale: When "run-away" instructions are disassembled, the literal addresses may go out of range, causing oatdump to crash. This CL guards memory access against the full memory range allocated to assembly instructions and data (it is possible but not really necessary to refine this a bit). Out of range arguments are now displayed as (?) to denote the issue, which is a lot nicer than crashing. BUG=28670871 Change-Id: I51e9b6a6a99162546fe31059f14278e8980451c2
2016-05-12Fix another case of live_in at irreducible loop entry.Nicolas Geoffray
GVN was implicitly extending the liveness of an instruction across an irreducible loop. Fix this problem by clearing the value set at loop entries that contain an irreducible loop. bug:28252896 (cherry picked from commit 77ce6430af2709432b22344ed656edd8ec80581b) Change-Id: Ie0121e83b2dfe47bcd184b90a69c0194d13fce54
2016-05-09Intrinsify String.length() and String.isEmpty() as HIR.Vladimir Marko
Use HArrayLength for String.length() in anticipation of changing the String.charAt() to HBoundsCheck+HArrayGet to allow the existing BCE to seamlessly work for strings. Use HArrayLength+HEqual for String.isEmpty(). We previously relied on inlining but we now want to apply the new intrinsics even when we do not inline, i.e. when compiling debuggable (as is currently the case for boot image) or when we hit inlining limits, i.e. depth, size, or the number of accumulated dex registers. Bug: 28330359 Change-Id: Iab9d2f6d2967bdd930a72eb461f27efe8f37c103
2016-04-19Use iterators "before" the use node in HUserRecord<>.Vladimir Marko
Create a new template class IntrusiveForwardList<> that mimicks std::forward_list<> except that all allocations are handled externally. This is essentially the same as boost::intrusive::slist<> but since we're not using Boost we have to reinvent the wheel. Use the new container to replace the HUseList and use the iterators to "before" use nodes in HUserRecord<> to avoid the extra pointer to the previous node which was used exclusively for removing nodes from the list. This reduces the size of the HUseListNode by 25%, 32B to 24B in 64-bit compiler, 16B to 12B in 32-bit compiler. This translates directly to overall memory savings for the 64-bit compiler but due to rounding up of the arena allocations to 8B, we do not get any improvement in the 32-bit compiler. Compiling the Nexus 5 boot image with the 64-bit dex2oat on host this CL reduces the memory used for compiling the most hungry method, BatteryStats.dumpLocked(), by ~3.3MiB: Before: MEM: used: 47829200, allocated: 48769120, lost: 939920 Number of arenas allocated: 345, Number of allocations: 815492, avg size: 58 ... UseListNode 13744640 ... After: MEM: used: 44393040, allocated: 45361248, lost: 968208 Number of arenas allocated: 319, Number of allocations: 815492, avg size: 54 ... UseListNode 10308480 ... Note that while we do not ship the 64-bit dex2oat to the device, the JIT compilation for 64-bit processes is using the 64-bit libart-compiler. Bug: 28173563 Change-Id: I985eabd4816f845372d8aaa825a1489cf9569208
2016-03-30Merge "Optimizing: Improve const-string code generation."Vladimir Marko
2016-03-30ART: Flush ostream less frequently in GraphVisualizerDavid Brazdil
We have seen Checker tests timing out on debug-GC configurations after having switched to Optimizing because its GraphVisualizer makes too many syscalls which the configuration keeps track of. This patch replaces std::endl with "\n" across GraphVisualizer so as to not flush the stream after every line of output. Bug: 27826765 Change-Id: I5e3f1e92f8a84f36d324d56945e2d420b2d36a5d
2016-03-29Optimizing: Improve const-string code generation.Vladimir Marko
For strings in the boot image, use either direct pointers or pc-relative addresses. For other strings, use PC-relative access to the dex cache arrays for AOT and direct address of the string's dex cache slot for JIT. For aosp_flounder-userdebug: - 32-bit boot.oat: -692KiB (-0.9%) - 64-bit boot.oat: -948KiB (-1.1%) - 32-bit dalvik cache total: -900KiB (-0.9%) - 64-bit dalvik cache total: -3672KiB (-1.5%) (contains more files than the 32-bit dalvik cache) For aosp_flounder-userdebug forced to compile PIC: - 32-bit boot.oat: -380KiB (-0.5%) - 64-bit boot.oat: -928KiB (-1.0%) - 32-bit dalvik cache total: -468KiB (-0.4%) - 64-bit dalvik cache total: -1928KiB (-0.8%) (contains more files than the 32-bit dalvik cache) Bug: 26884697 Change-Id: Iec7266ce67e6fedc107be78fab2e742a8dab2696
2016-03-24ART: Loosen a GraphChecker rule on Boolean inputsDavid Brazdil
GraphChecker tries to verify that Boolean inputs are properly typed. This is non-trivial in the presence of simplifying optimizations which capitalize on the fact that a Boolean value is internally represented as an integer. This patch removes the test from GraphChecker. Bug: 27625564 Change-Id: Ic61ea2193765b4578550538e965ca4f80fa4b287