Improved LSE: Replacing loads with Phis.

Create "Phi placeholders" for tracking heap values that can merge from different values and try to match existing Phis or create new Phis to replace loads. For Phi placeholders from loop headers we do not know whether they are fed by unknown values through back-edges when processing the loop header, so we delay processing loads that depend on them until we walked the entire graph. We then try to match them with existing instructions (when the location is unchanged in the loop) or Phis or create new Phis if needed. If we find a loop Phi placeholder fed with unknown value from a back-edge, we mark the Phi placeholder unreplaceable and reprocess loads and stores to propagate the unknown value. This can sometimes allow other loads to be replaced. At the end we re-calculate the heap values to find stores that can be eliminated because they write over the same value.

Golem results:
  art-opt-cc        arm      arm64    x86      x86-64
  CaffeineFloat     +6.7%    +3.0%    +5.9%    +3.8%
  KotlinMicroWhen   +33.7%   +4.8%    +1.8%    +0.6%
  art-opt (more noisy than art-opt-cc)
  CaffeineFloat     +4.1%    +4.4%    +7.8%    +10.5%
  KotlinMicroWhen   +33.6%   +2.0%    +1.8%    +1.8%

The MoveLiteralColumn benchmark seems to gain significantly (up to 22% on art-opt-cc but under 10% on art-opt) but it is very noisy and the results are therefore unreliable.

Insignificant code size changes for aosp_blueline-userdebug:
  - before:
    arm boot*.oat: 15303468
    arm64 boot*.oat: 18184736
    services.odex: 25195944
    grep -c pAllocObject boot.arm64.oatdump.txt: 27213
    grep -c pAllocArray boot.arm64.oatdump.txt: 3620
  - after:
    arm boot*.oat: 15299524 (-4KiB, -0.03%)
    arm64 boot*.oat: 18176528 (-8KiB, -0.05%)
    services.odex: 25191832 (-4KiB, -0.02%)
    grep -c pAllocObject boot.arm64.oatdump.txt: 27206 (-7)
    grep -c pAllocArray boot.arm64.oatdump.txt: 3615 (-5)

Test: New tests in 530-checker-lse.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: blueline-userdebug boots.
Bug: 77906240
Change-Id: Ia9fe0cd3530f9d3941650dfefc00a7f7fd821994
author: Vladimir Marko <vmarko@google.com> 2020-06-23 14:19:53 +0100
committer: Vladimir Marko <vmarko@google.com> 2020-08-21 09:12:16 +0000
commit: 3224f38567100e62f9cdf8258f4b308f6bc671e1 (patch)
tree: a493903b987d8cc5be7cfa4b48a732bf2f9295be /compiler/optimizing/optimization.cc
parent: 3e8caebc5fe05c02d05b5e315d6d8945fd509a26 (diff)
1 files changed, 3 insertions, 5 deletions
diff --git a/compiler/optimizing/optimization.cc b/compiler/optimizing/optimization.cc
index b28bda6e65..2cac38b715 100644
--- a/compiler/optimizing/optimization.cc
+++ b/compiler/optimizing/optimization.cc
@@ -217,11 +217,6 @@ ArenaVector<HOptimization*> ConstructOptimizations(
         opt = new (allocator) BoundsCheckElimination(
             graph, *most_recent_side_effects, most_recent_induction, pass_name);
         break;
-      case OptimizationPass::kLoadStoreElimination:
-        CHECK(most_recent_side_effects != nullptr && most_recent_induction != nullptr);
-        opt = new (allocator) LoadStoreElimination(
-            graph, *most_recent_side_effects, stats, pass_name);
-        break;
       //
       // Regular passes.
       //
@@ -269,6 +264,9 @@ ArenaVector<HOptimization*> ConstructOptimizations(
       case OptimizationPass::kConstructorFenceRedundancyElimination:
         opt = new (allocator) ConstructorFenceRedundancyElimination(graph, stats, pass_name);
         break;
+      case OptimizationPass::kLoadStoreElimination:
+        opt = new (allocator) LoadStoreElimination(graph, stats, pass_name);
+        break;
       case OptimizationPass::kScheduling:
         opt = new (allocator) HInstructionScheduling(
             graph, codegen->GetCompilerOptions().GetInstructionSet(), codegen, pass_name);