path: root/compiler/optimizing/optimization.cc
Age | Commit message | Author
2020-08-21 | Improved LSE: Replacing loads with Phis. | Vladimir Marko
Create "Phi placeholders" for tracking heap values that can merge from different values and try to match existing Phis or create new Phis to replace loads.

For Phi placeholders from loop headers we do not know whether they are fed by unknown values through back-edges when processing the loop header, so we delay processing loads that depend on them until we walked the entire graph. We then try to match them with existing instructions (when the location is unchanged in the loop) or Phis or create new Phis if needed. If we find a loop Phi placeholder fed with unknown value from a back-edge, we mark the Phi placeholder unreplaceable and reprocess loads and stores to propagate the unknown value. This can sometimes allow other loads to be replaced.

At the end we re-calculate the heap values to find stores that can be eliminated because they write over the same value.

Golem results:
  art-opt-cc         arm      arm64    x86      x86-64
  CaffeineFloat      +6.7%    +3.0%    +5.9%    +3.8%
  KotlinMicroWhen    +33.7%   +4.8%    +1.8%    +0.6%
  art-opt (more noisy than art-opt-cc)
  CaffeineFloat      +4.1%    +4.4%    +7.8%    +10.5%
  KotlinMicroWhen    +33.6%   +2.0%    +1.8%    +1.8%
The MoveLiteralColumn benchmark seems to gain significantly (up to 22% on art-opt-cc but under 10% on art-opt) but it is very noisy and the results are therefore unreliable.

Insignificant code size changes for aosp_blueline-userdebug:
  - before:
    arm boot*.oat: 15303468
    arm64 boot*.oat: 18184736
    services.odex: 25195944
    grep -c pAllocObject boot.arm64.oatdump.txt: 27213
    grep -c pAllocArray boot.arm64.oatdump.txt: 3620
  - after:
    arm boot*.oat: 15299524 (-4KiB, -0.03%)
    arm64 boot*.oat: 18176528 (-8KiB, -0.05%)
    services.odex: 25191832 (-4KiB, -0.02%)
    grep -c pAllocObject boot.arm64.oatdump.txt: 27206 (-7)
    grep -c pAllocArray boot.arm64.oatdump.txt: 3615 (-5)

Test: New tests in 530-checker-lse.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: blueline-userdebug boots.
Bug: 77906240
Change-Id: Ia9fe0cd3530f9d3941650dfefc00a7f7fd821994
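To make "replacing loads with Phis" concrete, here is a minimal source-level sketch in C++. It is not ART code and does not operate on ART's IR; the names `Point`, `LoadAfterMerge` and `LoadReplacedByPhi` are hypothetical, and the local variable `merged` only stands in for the Phi that merges the two stored values.

```cpp
// Hypothetical illustration only: ART applies this idea to its own IR,
// not to C++ source.
struct Point {
  int x = 0;
};

// Before: p.x is stored on both branches and then loaded after the merge.
int LoadAfterMerge(Point& p, bool flag) {
  if (flag) {
    p.x = 1;
  } else {
    p.x = 2;
  }
  return p.x;  // Load whose value depends on which branch ran.
}

// After: the load is replaced by a value that merges both stored values,
// which is what a Phi expresses in SSA form.
int LoadReplacedByPhi(Point& p, bool flag) {
  int merged;  // Stands in for Phi(1, 2).
  if (flag) {
    p.x = 1;
    merged = 1;
  } else {
    p.x = 2;
    merged = 2;
  }
  return merged;  // No load from p.x is needed here.
}
```

Both functions leave the same value in `p.x` and return the same result; only the final load disappears.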
2020-08-10 | ARM: Allow FP args in core regs for @CriticalNative. | Vladimir Marko
If a float or double argument needs to be passed in core register to a @CriticalNative method due to soft-float native ABI, insert a fake call to Float.floatToRawIntBits() or Double.doubleToRawLongBits() to satisfy type checks in the compiler. We cannot do that for intrinsics that expect those inputs in actual FP registers, so we still prevent such intrinsics from using `kCallCriticalNative`. This should be irrelevant if an actual intrinsic implementation is emitted. There are currently two unimplemented intrinsics that are affected by the carve-out, namely MathRoundDouble and FP16ToHalf, and four intrinsics implemented only when ARMv8A is supported, namely MathRint, MathRoundFloat, MathCeil and MathFloor.

Test: testrunner.py --target --32 -t 178-app-image-native-method
Bug: 112189621
Change-Id: Id14ef4f49f8a0e6489f97dc9588c0e6a5c122632
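The conversions named above reinterpret a floating-point value as an integer with the same bit pattern, which is the form a soft-float ABI passes in a core register. A minimal C++ sketch of that reinterpretation follows; the function names `FloatToRawBits` and `DoubleToRawBits` are hypothetical and this is not how ART implements the fake call.

```cpp
#include <cstdint>
#include <cstring>

// Returns the raw bit pattern of `value`, analogous in effect to Java's
// Float.floatToRawIntBits(). Hypothetical helper, not ART code.
std::uint32_t FloatToRawBits(float value) {
  static_assert(sizeof(std::uint32_t) == sizeof(float), "float must be 32 bits");
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));  // Copy the bits, no numeric conversion.
  return bits;
}

// Same idea for double, analogous in effect to Double.doubleToRawLongBits().
std::uint64_t DoubleToRawBits(double value) {
  static_assert(sizeof(std::uint64_t) == sizeof(double), "double must be 64 bits");
  std::uint64_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}
```

On platforms using IEEE 754 binary32 and binary64, `FloatToRawBits(1.0f)` is `0x3f800000` and `DoubleToRawBits(1.0)` is `0x3ff0000000000000`.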
2020-07-28 | More inclusive language in the runtime | David Srbecky
Test: m
Bug: 161896447
Bug: 161850439
Bug: 161336379
Change-Id: Iabc29fa43b4b5a403699d6bca95e9a2cb8945d77
2020-06-17 | ART: Simplify HRem to reuse existing HDiv | Evgeny Astigeevich
A pattern seen in libcore and SPECjvm2008 workloads is a pair of HRem/HDiv having the same dividend and divisor. The code generator processes them separately and generates duplicated instructions calculating HDiv. This CL adds detection of such a pattern to the instruction simplifier. This optimization affects HInductionVarAnalysis and HLoopOptimization preventing some loop optimizations. To avoid this the instruction simplifier has the loop_friendly mode which means not to optimize HRems if they are in a loop.

A microbenchmark run on Pixel 3 shows the following improvements:
              | little cores | big cores
  arm32 Int32 |     +21%     |   +40%
  arm32 Int64 |     +46%     |   +44%
  arm64 Int32 |     +27%     |   +14%
  arm64 Int64 |     +33%     |   +27%

Test: 411-checker-instruct-simplifier-hrem
Test: test.py --host --optimizing --jit --gtest --interpreter
Test: test.py --target --optimizing --jit --interpreter
Test: run-gtests.sh
Change-Id: I376a1bd299d7fe10acad46771236edd5f85dfe56
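The arithmetic identity that lets a remainder reuse an existing quotient can be sketched in C++ as follows. This is an illustration under assumptions, not the simplifier's code: the names `DivRem` and `DivRemSharedDivision` are hypothetical, and the commit does not spell out the exact instructions the rewritten HRem is lowered to.

```cpp
#include <cassert>
#include <cstdint>

// Quotient and remainder computed with a single division.
struct DivRem {
  std::int32_t quotient;
  std::int32_t remainder;
};

// Hypothetical helper: derives the remainder from the already computed
// quotient via rem = dividend - quotient * divisor.
DivRem DivRemSharedDivision(std::int32_t dividend, std::int32_t divisor) {
  assert(divisor != 0);
  // INT32_MIN / -1 overflows in C++ (undefined behavior), so exclude it here.
  assert(!(dividend == INT32_MIN && divisor == -1));
  const std::int32_t quotient = dividend / divisor;  // The only division.
  // |quotient * divisor| never exceeds |dividend|, so this cannot overflow.
  const std::int32_t remainder = dividend - quotient * divisor;
  return DivRem{quotient, remainder};
}
```

For example, `DivRemSharedDivision(-17, 5)` yields quotient `-3` and remainder `-2`, matching truncating division in both C++ and Java.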
2020-06-08 | Run LSA as a part of the LSE pass. | Vladimir Marko
Make LSA a helper class, not an optimization pass. Move all its allocations to ScopedArenaAllocator to reduce the peak memory usage a little bit.

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I7fc634abe732d22c99005921ffecac5207bcf05f
2020-05-13 | Move HandleCache to HGraph. | Vladimir Marko
This avoids passing the `VariableSizedHandleScope*` argument around and eliminates HGraph::inexact_object_rti_ and its initialization. The latter shall allow running Optimizing gtests that do not require type information without creating a Runtime in future. (To be implemented in a separate CL.)

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: aosp_taimen-userdebug boots.
Change-Id: I36fe9bc556c6d610d644c8c14cc74c9985a14d64
2020-04-17 | ART: Refactor SIMD slots and regs size processing. | Artem Serov
ART vectorizer assumes that there is a single size of SIMD register used for the whole program. Make this assumption explicit and refactor the code.

Note: This is a base for the future introduction of SIMD slots of size other than 8 or 16 bytes.

Test: test-art-target, test-art-host.
Change-Id: Id699d5e3590ca8c655ecd9f9ed4e63f49e3c4f9c
2020-02-13 | Remove MIPS support from Optimizing. | Vladimir Marko
Test: aosp_taimen-userdebug boots.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 147346243
Change-Id: I97fdc15e568ae3fe390efb1da690343025f84944
2019-10-14 | Revert "Make compiler/optimizing/ symbols hidden." | Vladimir Marko
This reverts commit e2727154f25e0db9a5bb92af494d8e47b181dfcf.

Reason for revert: Breaks ASAN tests (ODR violation).

Bug: 142365358
Change-Id: I38103d74a1297256c81d90872b6902ff1e9ef7a4
2019-10-14 | Make compiler/optimizing/ symbols hidden. | Vladimir Marko
Make symbols in compiler/optimizing hidden by a namespace attribute. The unit intrinsic_objects.{h,cc} is excluded as it is needed by dex2oat. As the symbols are no longer exported, gtests are now linked with the static version of the libartd-compiler library.

libart-compiler.so size:
  - before:
    arm: 2396152
    arm64: 3345280
  - after:
    arm: 2016176 (-371KiB, -15.9%)
    arm64: 2874480 (-460KiB, -14.1%)

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Bug: 142365358
Change-Id: I1fb04a33351f53f00b389a1642e81a68e40912a8
2018-12-27 | ART: Refactor for bugprone-argument-comment | Andreas Gampe
Handles compiler.

Bug: 116054210
Test: WITH_TIDY=1 mmma art
Change-Id: I5cdfe73c31ac39144838a2736146b71de037425e
2018-12-06 | Refactor CompilerDriver::CompileAll(). | Vladimir Marko
Treat verification results and image classes as mutable only in CompilerDriver::PreCompile(), and treat them as immutable during compilation, accessed through the CompilerOptions. This severs the dependency of the inliner on the CompilerDriver.

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I594a0213ca6a5003c19b4bd488af98db4358d51d
2018-11-08 | Emit bit manipulation instructions for x86 and x86_64 | Shalini Salomi Bodapati
This patch performs instruction simplification to generate instructions andn, blsmsk and blsr on cpus that have avx2.

Test: test.py --host --64, test-art-host-gtest
Change-Id: Ie41a1b99ac2980f1e9f6a831a7d639bc3e248f0f
Signed-off-by: Shalini Salomi Bodapati <shalini.salomi.bodapati@intel.com>
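For readers unfamiliar with these instructions, the bit-level operations that andn, blsr and blsmsk compute can be written as plain C++ expressions. This is a minimal sketch with hypothetical function names; the commit does not list the exact IR patterns the simplifier matches, so the shapes below are only the documented instruction semantics.

```cpp
#include <cstdint>

// andn: bitwise AND of the inverted first operand with the second operand.
std::uint32_t AndNot(std::uint32_t x, std::uint32_t y) {
  return ~x & y;
}

// blsr: clear the lowest set bit of x.
std::uint32_t ResetLowestSetBit(std::uint32_t x) {
  return x & (x - 1);
}

// blsmsk: mask of all bits up to and including the lowest set bit of x.
std::uint32_t MaskUpToLowestSetBit(std::uint32_t x) {
  return x ^ (x - 1);
}
```

For example, with `x = 0x28` (binary 101000), `ResetLowestSetBit(x)` is `0x20` and `MaskUpToLowestSetBit(x)` is `0xF`.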
2018-09-28 | Remove need for intrinsic recognizer to be a pass. | Nicolas Geoffray
Instead just recognize the intrinsic when creating an invoke instruction. Also remove some old code related to compiler driver sharpening.

Test: test.py
Change-Id: Iecb668f30e95034970fcf57160ca12092c9c610d
2018-09-19 | Remove sharpening as an optimization pass. | Nicolas Geoffray
Make the last sharpening helper (methods) like the other helpers: being invoked by the instruction builder.

Test: test.py
Change-Id: Ic80a454f9b59b0b4ef7825590b24402500ba851c
2018-07-13 | Merge "Revert "Emit vector mulitply and accumulate instructions for x86."" | Hans Boehm
2018-07-13 | Revert "Emit vector mulitply and accumulate instructions for x86." | Hans Boehm
This reverts commit 61908880e6565acfadbafe93fa64de000014f1a6.

Reason for revert: By failing to round multiply results, it does not follow Java rounding rules.

Change-Id: Ic0ef08691bef266c9f8d91973e596e09ff3307c6
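The revert reason refers to the difference between rounding the product before the addition and not rounding it. A minimal C++ sketch of that difference, using `std::fma` as the unrounded variant, is shown below. Whether the reverted patch emitted exactly a fused multiply-add is an assumption here; the function names are hypothetical.

```cpp
#include <cmath>

// Multiply, round the product to double, then add: two roundings.
double MultiplyThenAdd(double a, double b, double c) {
  // volatile forces the rounded product to be materialized, so the compiler
  // cannot contract the expression into a fused multiply-add.
  volatile double product = a * b;
  return product + c;
}

// Fused multiply-add: the exact product is added to c with a single rounding.
double FusedMultiplyAdd(double a, double b, double c) {
  return std::fma(a, b, c);
}
```

With `a = 1 + 2^-27`, `b = 1 - 2^-27` and `c = -1`, the exact product is `1 - 2^-54`, which rounds to `1.0` as a double. The rounded variant therefore returns `0.0`, while the fused variant returns `-2^-54`.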
2018-07-02 | Merge "Emit vector mulitply and accumulate instructions for x86." | Treehugger Robot
2018-07-02 | Emit vector mulitply and accumulate instructions for x86. | Gupta Kumar, Sanjiv
This patch adds a new cpu variant named kabylake and performs instruction simplification to generate VectorMulitplyAccumulate.

Test: ./test.py --host --64
Change-Id: Ie6cc882dadf1322dd4d3ae49bfdb600b0c447765
Signed-off-by: Gupta Kumar, Sanjiv <sanjiv.kumar.gupta@intel.com>
2018-06-28 | Remove CompilerDriver::support_boot_image_fixup_. | Vladimir Marko
Check for non-PIC boot image as a testing config instead. Honor the config for HInvokeStaticOrDirect sharpening.

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I3645f4fefe322f1fd64ea88a2b41a35ceccea688
2018-06-25 | Move instruction_set_ to CompilerOptions. | Vladimir Marko
Removes CompilerDriver dependency from ImageWriter and several other classes.

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: Pixel 2 XL boots.
Test: m test-art-target-gtest
Test: testrunner.py --target --optimizing
Change-Id: I3c5b8ff73732128b9c4fad9405231a216ea72465
2018-04-30 | Step 2 of 2: conditional passes. | Aart Bik
Rationale: The change introduces actual conditional passes (dependence on inliner). This ensures more cases are optimized downstream without needlessly introducing compile-time.

NOTE: Some checker tests needed to be rewritten due to subtle changes in the phase ordering. No optimizations were harmed in the process, though.

Bug: b/78171933, b/74026074
Test: test-art-host,target
Change-Id: I335260df780e14ba1f22499ad74d79060c7be44d
2018-01-08 | Clean up CodeItemAccessors and Compact/StandardDexFile | Mathieu Chartier
Change constructor to use a reference to a dex file. Remove duplicated logic for GetCodeItemSize.

Bug: 63756964
Test: test-art-host
Change-Id: I69af8b93abdf6bdfa4454e16db8f4e75883bca46
2018-01-05 | Create dex subdirectory | David Sehr
Move all the DexFile related source to a common subdirectory dex/ of runtime.

Bug: 71361973
Test: make -j 50 test-art-host
Change-Id: I59e984ed660b93e0776556308be3d653722f5223
2017-12-22 | Make CodeItem fields private | Mathieu Chartier
Make code item fields private and use accessors. Added a handful of friend classes to reduce the size of the change.

Changed default to be nullable and removed CreateNullable. CreateNullable was a bad API since it defaulted to the unsafe, may add a CreateNonNullable if it's important for performance.

Motivation: Have a different layout for code items in cdex.

Bug: 63756964
Test: test-art-host-gtest
Test: test/testrunner/testrunner.py --host
Test: art/tools/run-jdwp-tests.sh '--mode=host' '--variant=X32' --debug
Change-Id: I42bc7435e20358682075cb6de52713b595f95bf9
2017-12-08 | Determine HLoadClass/String load kind early. | Vladimir Marko
This helps save memory by avoiding the allocation of HEnvironment and related objects for AOT references to boot image strings and classes (kBootImage* load kinds) and also for JIT references (kJitTableAddress).

Compiling aosp_taimen-userdebug boot image, the most memory hungry method BatteryStats.dumpLocked() needs
  - before:
    Used 55105384 bytes of arena memory...
    ...
    UseListNode 10009704
    Environment 423248
    EnvVRegs 20676560
    ...
  - after:
    Used 50559176 bytes of arena memory...
    ...
    UseListNode 8568936
    Environment 365680
    EnvVRegs 17628704
    ...

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Bug: 34053922
Change-Id: I68e73a438e6ac8e8908e6fccf53bbeea8a64a077
2017-11-20 | Refactored optimization passes setup. | Aart Bik
Rationale: Refactors the way we set up optimization passes in the compiler into a more centralized approach. The refactoring also found some "holes" in the existing mechanism (missing string lookup in the debugging mechanism, or inability to set alternative name for optimizations that may repeat).

Bug: 64538565
Test: test-art-host test-art-target
Change-Id: Ie5e0b70f67ac5acc706db91f64612dff0e561f83
2017-08-11 | optimizing: Refactor statistics to use OptimizingCompilerStats helper | Igor Murashkin
Remove all copies of 'MaybeRecordStat', replacing them with a single OptimizingCompilerStats::MaybeRecordStat helper.

Change-Id: I83b96b41439dccece3eee2e159b18c95336ea933
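The shape of such a shared helper might look like the sketch below. Everything in it is hypothetical: the class `CompilerStats`, the enum `Stat` and its values, and in particular the assumption that "Maybe" means the helper tolerates a null stats pointer. It is not the actual `OptimizingCompilerStats` interface.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical statistic identifiers.
enum class Stat : std::size_t { kInlinedInvoke, kRemovedDeadInstruction, kCount };

// Hypothetical stats collector with one static helper shared by all passes.
class CompilerStats {
 public:
  void RecordStat(Stat stat, std::uint32_t count = 1) {
    counters_[static_cast<std::size_t>(stat)] += count;
  }

  std::uint32_t GetStat(Stat stat) const {
    return counters_[static_cast<std::size_t>(stat)];
  }

  // Records the statistic only if a collector is present (assumed behavior),
  // so callers do not each need their own null-checking copy.
  static void MaybeRecordStat(CompilerStats* stats, Stat stat, std::uint32_t count = 1) {
    if (stats != nullptr) {
      stats->RecordStat(stat, count);
    }
  }

 private:
  std::array<std::uint32_t, static_cast<std::size_t>(Stat::kCount)> counters_{};
};
```

A pass would then call `CompilerStats::MaybeRecordStat(stats, Stat::kInlinedInvoke)` regardless of whether statistics collection is enabled.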
2015-06-24 | ART: Run GraphChecker after Builder and SsaBuilder | David Brazdil
This patch refactors the way GraphChecker is invoked, utilizing the same scoping mechanism as pass timing and graph visualizer. Therefore, GraphChecker will now run not just after instances of HOptimization but after the builders and reg alloc, too.

Change-Id: I8173b98b79afa95e1fcbf3ac9630a873d7f6c1d4
2015-04-24 | ART: Dead block removal | David Brazdil
Adds a new pass which finds all unreachable blocks, typically due to simplifying an if-condition to a constant, and removes them from the graph. The patch also slightly generalizes the graph-transforming operations.

Change-Id: Iff7c97f1d10b52886f3cd7401689ebe1bfdbf456
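Finding unreachable blocks amounts to a reachability walk from the entry block. The sketch below shows one way to do that on a toy control-flow graph; the `Cfg` representation and the function names are hypothetical, successor indices are assumed to be valid, and the commit's actual pass works on ART's own graph classes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CFG: successors[i] lists the blocks that block i can jump to.
using Cfg = std::vector<std::vector<std::size_t>>;

// Marks every block reachable from `entry` using an explicit worklist.
std::vector<bool> FindReachableBlocks(const Cfg& successors, std::size_t entry) {
  std::vector<bool> reachable(successors.size(), false);
  if (entry >= successors.size()) {
    return reachable;
  }
  std::vector<std::size_t> worklist{entry};
  reachable[entry] = true;
  while (!worklist.empty()) {
    const std::size_t block = worklist.back();
    worklist.pop_back();
    for (std::size_t successor : successors[block]) {
      if (!reachable[successor]) {
        reachable[successor] = true;
        worklist.push_back(successor);
      }
    }
  }
  return reachable;
}

// Returns the indices of blocks that are never reached and could be removed.
std::vector<std::size_t> FindDeadBlocks(const Cfg& successors, std::size_t entry) {
  const std::vector<bool> reachable = FindReachableBlocks(successors, entry);
  std::vector<std::size_t> dead;
  for (std::size_t i = 0; i < reachable.size(); ++i) {
    if (!reachable[i]) {
      dead.push_back(i);
    }
  }
  return dead;
}
```

For a graph where block 0 originally branched to blocks 1 and 2 but the condition was folded so that only the edge to block 1 remains, `FindDeadBlocks` reports block 2.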
2015-02-19 | Reference type propagation | Calin Juravle
- propagate reference types between instructions
- remove checked casts when possible
- add StackHandleScopeCollection to manage an arbitrary number of stack handles (see comments)

Change-Id: I31200067c5e7375a5ea8e2f873c4374ebdb5ee60
2014-11-25 | Fix a bug in the type analysis phase of optimizing. | Nicolas Geoffray
Dex code can lead to the creation of a phi with one float input and one integer input. Since the SSA builder trusts the verifier, it assumes that the integer input must be converted to float. However, when the register is not used afterwards, the verifier hasn't ensured that. Therefore, the compiler must remove the phi prior to doing type propagation.

Change-Id: Idcd51c4dccce827c59d1f2b253bc1c919bc07df5
2014-11-19 | Use HOptimization abstraction for running optimizations. | Nicolas Geoffray
Move existing optimizations to it.

Change-Id: I3b43f9997faf4ed8875162e3a3abdf99375478dd
2014-10-22 | Tidy up logging. | Ian Rogers
Move gVerboseMethods to CompilerOptions. Now "--verbose-methods=" option to dex2oat rather than runtime argument "-verbose-methods:". Move ToStr and Dumpable out of logging.h, move LogMessageData into logging.cc except for a forward declaration. Remove ConstDumpable as Dump methods are all const (and make this so if not currently true). Make LogSeverity an enum and improve compile time assertions and type checking. Remove log_severity.h that's only used in logging.h. With system headers gone from logging.h, go add to .cc files missing system header includes. Also, make operator new in ValueObject private for compile time instantiation checking.

Change-Id: I3228f614500ccc9b14b49c72b9821c8b0db3d641
2014-10-17 | Revert "Revert "Introduce a class to implement optimization passes."" | Roland Levillain
This reverts commit 1ddbf6d4b37979a9f11a203c12befd5ae8b65df4.

Change-Id: I110a14668d1564ee0604dc958b91394b40da89fc
2014-10-01 | Revert "Introduce a class to implement optimization passes." | Nicolas Geoffray
This reverts commit bf9cd7ba2118a75f5aa9b56241c4d5fa00dedeb8.

Change-Id: I0a483446666c9c24c45925a5fc199debdefd8b3e
2014-10-01 | Introduce a class to implement optimization passes. | Roland Levillain
- Add art::HOptimization.
- Rename art::ConstantPropagation to art::HConstantFolding in compiler/optimizing/constant_folding.h to avoid name clashes with a class of the same name in compiler/dex/post_opt_passes.h.
- Rename art::DeadCodeElimination to art::HDeadCodeElimination for consistency reasons.
- Have art::HDeadCodeElimination and art::HConstantFolding derive from art::HOptimization.
- Start to use these optimizations in art::OptimizingCompiler::TryCompile.

Change-Id: Iaab350c122d87b2333b3760312b15c0592d7e010
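The class structure described in the list above, a common base class from which individual passes derive, can be sketched as follows. This is a hypothetical simplification and not the real `art::HOptimization` interface: `Graph`, `Optimization`, the pass names, and `RunPasses` are all illustrative, and the `Run()` bodies are placeholders that only count invocations.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Placeholder for the compiler's graph; only counts how many passes ran.
struct Graph {
  int passes_run = 0;
};

// Hypothetical base class for optimization passes.
class Optimization {
 public:
  Optimization(Graph* graph, std::string pass_name)
      : graph_(graph), pass_name_(std::move(pass_name)) {}
  virtual ~Optimization() = default;

  const std::string& GetPassName() const { return pass_name_; }

  // Each pass implements its transformation here.
  virtual void Run() = 0;

 protected:
  Graph* const graph_;

 private:
  const std::string pass_name_;
};

class ConstantFolding : public Optimization {
 public:
  explicit ConstantFolding(Graph* graph) : Optimization(graph, "constant_folding") {}
  void Run() override { ++graph_->passes_run; }  // Placeholder body.
};

class DeadCodeElimination : public Optimization {
 public:
  explicit DeadCodeElimination(Graph* graph) : Optimization(graph, "dead_code_elimination") {}
  void Run() override { ++graph_->passes_run; }  // Placeholder body.
};

// Runs the passes in order and returns the names of the passes that ran.
std::vector<std::string> RunPasses(const std::vector<std::unique_ptr<Optimization>>& passes) {
  std::vector<std::string> names;
  for (const std::unique_ptr<Optimization>& pass : passes) {
    pass->Run();
    names.push_back(pass->GetPassName());
  }
  return names;
}
```

A driver would build a vector of passes once and hand it to `RunPasses`, so adding a pass means adding one derived class and one vector entry rather than another hand-written call sequence.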