Age | Commit message | Author |
|
|
Simplification of HFakeString assumes that it cannot be used until
String.<init> is called, which is not true and causes different
behaviour between the compiler and the interpreter. This patch
removes the optimization together with the HFakeString instruction.
Instead, HNewInstance is generated and an empty String is allocated;
it remains in place until it is replaced with the result of the
StringFactory call. This is consistent with the behaviour of the
interpreter but is too conservative. A follow-up CL will attempt to
optimize out the initial allocation when possible.
Bug: 26457745
Bug: 26486014
Change-Id: I7139e37ed00a880715bfc234896a930fde670c44
|
|
So we don't fall back to the interpreter in the presence of
irreducible loops.
Implications:
- A loop pre-header does not necessarily dominate a loop header.
- Non-constant redundant phis will be kept in loop headers, to
satisfy our linear scan register allocation algorithm.
- Whole-graph optimizations, such as gvn, licm, lse, and dce,
need to know when they are dealing with irreducible loops.
Change-Id: I2cea8934ce0b40162d215353497c7f77d6c9137e
|
|
Rationale:
Sharing identical slow path code reduces code size.
Background:
Currently, slow paths with the same dex-pc, the same physical
register spilling code, and identical stack maps are shared (making
this only useful for deopt slow paths). The newly introduced
mechanism is sufficiently general to allow future improvements, e.g.
allowing different dex-pcs (by passing the dex-pc to the runtime) or
even different kinds of slow paths (by passing runtime addresses to
the slow path).
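As a rough illustration of the sharing criterion (not the actual ART
data structures; the key type and map below are invented), identical
slow paths can be coalesced by keying them on the properties listed
above:
    #include <cstdint>
    #include <map>
    #include <tuple>

    struct Label {};  // stand-in for an assembler label

    // Invented key: dex-pc, spill mask, stack map -- the three properties
    // that must match for two slow paths to share code.
    using SlowPathKey = std::tuple<uint32_t, uint32_t, uint32_t>;

    std::map<SlowPathKey, Label*> shared_slow_paths;

    Label* GetOrCreateSlowPathLabel(const SlowPathKey& key, Label* fresh) {
      auto it = shared_slow_paths.find(key);
      if (it != shared_slow_paths.end()) {
        return it->second;  // reuse the previously emitted, identical slow path
      }
      shared_slow_paths.emplace(key, fresh);
      return fresh;
    }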
Change-Id: I819615c47b4fd98440a241f681f93e4fc22d12e0
|
|
Change-Id: I5740ec958a20d236634b66df0e675382ed5c16fc
|
|
bug:25494265
Change-Id: I560a3a589b92440020285f9adfdf7c9efb06217c
|
|
Change-Id: I7dccb02cf7ac2f7d8fd1676b03e0b394701fbe3f
|
|
The code generated for object ArraySet on ARM used to
miss an implicit null check for the array when the assigned
value is `null`. This has not been an actual issue so far,
as ArraySet instructions have never used implicit null
checks.
Note: This CL comes without a regression test, as the code
path in question is not used (yet).
Change-Id: If3bc85e32802595e635513dfb83ccfcfd8f00d3d
|
|
Introduce an ARM fast path implementation in Optimizing for
Baker's read barriers (for both heap reference loads and GC
root loads). The marking phase of the read barrier is
performed by a slow path, invoking the runtime entry point
artReadBarrierMark.
Other read barrier algorithms continue to use the original
slow path based implementation, which has been renamed as
GenerateReadBarrierSlow/GenerateReadBarrierForRootSlow.
Bug: 12687968
Change-Id: Ie7ee85b1b4c0564148270cebdd3cbd4c3da51b3a
|
|
Generate an extra stack map at the start of each Java statement.
The stack maps are later translated to DWARF, which allows
LLDB to set breakpoints and view local variables.
Change-Id: If00ab875513308e4a1399d1e12e0fe8934a6f0c3
|
|
Make Is<type>() and As<type>() non-virtual for concrete
instruction types, relying on GetKind(), and mark GetKind()
as PURE to improve optimization opportunities. This reduces
the number of relocations in libart-compiler.so's .rel.dyn
section by ~4K, or ~44%, and in .data.rel.ro by ~18K, or
~65%. The file is 96KiB smaller for Nexus 5, including 8KiB
reduction of the .text section.
Unfortunately, the g++/clang++ __attribute__((pure)) is not
strong enough to avoid duplicated virtual calls and we would
need the C++ [[pure]] attribute proposed in n3744 instead.
To work around this deficiency, we introduce an extra
non-virtual indirection for GetKind(), so that the compiler
can optimize common expressions such as
instruction->IsAdd() || instruction->IsSub()
or
instruction->IsAdd() && instruction->AsAdd()->...
which contain two virtual calls to GetKind() after inlining.
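A minimal self-contained sketch of that indirection (the class and
enum names here are invented, not ART's): the public GetKind() is
non-virtual and marked pure, wrapping a single private virtual call.
    #include <cstdio>

    enum class Kind { kAdd, kSub };

    class Insn {
     public:
      virtual ~Insn() {}
      // Non-virtual, pure wrapper: after inlining IsAdd()/IsSub(), the
      // compiler may fold repeated GetKind() calls into one virtual dispatch.
      __attribute__((pure)) Kind GetKind() const { return GetKindInternal(); }
      bool IsAdd() const { return GetKind() == Kind::kAdd; }
      bool IsSub() const { return GetKind() == Kind::kSub; }
     private:
      virtual Kind GetKindInternal() const = 0;
    };

    class AddInsn : public Insn {
     private:
      Kind GetKindInternal() const override { return Kind::kAdd; }
    };

    int main() {
      AddInsn add;
      const Insn* insn = &add;
      // The two checks below need not dispatch GetKindInternal() twice.
      std::printf("%d\n", insn->IsAdd() || insn->IsSub());
      return 0;
    }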
Change-Id: I83787de0671a5cb9f5b0a5f4a536cef239d5b401
|
|
CMN updates flags based on the addition of its operands.
Do not confuse the "N" suffix with the bitwise inversion
performed by MVN.
Also add more special cases analogous to AddConstant()
and use CmpConstant() more in the code generator.
Change-Id: I0d4571770a3f0fdf162e97d4bde56814098e7246
|
|
switch."""
|
|
This reverts commit b4c137630fd2226ad07dfd178ab15725374220f1.
The underlying issue was fixed by https://android-review.googlesource.com/188271.
Bug: 26121945
Change-Id: I58b08eb1a9f0a5c861f8cda93522af64bcf63920
|
|
This reverts commit 59f054d98f519a3efa992b1c688eb97bdd8bbf55.
bug:26121945
Change-Id: I8a5ad7ef1f1de8d44787c27528fa3f7f5c2e9cd3
|
|
Change-Id: I96bd7fa2e8bdccb87a3380d063dad0dd57fed9d7
|
|
Replace bitfield rotate patterns (both the constant and the register
versions), and rotateRight/rotateLeft intrinsic invokes, with the
new HRor IR.
Where k is a constant and r is a register, with the UShr and Shl on
either side of a |, +, or ^, the following patterns are replaced:
x >>> #k OP x << #(reg_size - k)
x >>> #k OP x << #-k
x >>> r OP x << (#reg_size - r)
x >>> (#reg_size - r) OP x << r
x >>> r OP x << -r
x >>> -r OP x << r
Implemented for ARM/ARM64 & X86/X86_64.
Tests changed to not be inlined, to prevent optimization from folding
them out. Additional tests added for constant rotate amounts.
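For illustration, here is one instance of the first pattern above as
a standalone C++ sketch (not compiler code): for 32-bit values,
reg_size is 32 and the UShr/Shl pair below is exactly a rotate right
by k.
    #include <cstdint>
    #include <cstdio>

    uint32_t RotateRight(uint32_t x, uint32_t k) {
      k &= 31;  // keep the shift amount in range (mirrors the #-k pattern)
      return (x >> k) | (x << ((32 - k) & 31));  // x >>> #k | x << #(32 - k)
    }

    int main() {
      std::printf("%08x\n", RotateRight(0x12345678u, 8));  // prints 78123456
      return 0;
    }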
Change-Id: I5847d104c0a0348e5792be6c5072ce5090ca2c34
|
|
Change-Id: I1d258f1a89bf0ec7c7ddd134be9215d480f0b09a
|
|
Implement Vladimir Marko's suggestion. The new compare/jump series
reduces the number of instructions from (2*n+1) to (1.5*n+3); for a
10-entry switch, that is 18 instructions instead of 21.
Generate a normal compare/jump series when numEntries <= 3.
Generate the optimal compare/jump series when numEntries <= threshold.
Generate jump tables otherwise.
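A sketch of the resulting strategy selection (the threshold constant
below is invented for illustration; the real cutoff lives in the
backend):
    #include <cstdint>

    enum class SwitchStrategy { kCompareJump, kOptimalCompareJump, kJumpTable };

    SwitchStrategy PickStrategy(int32_t num_entries) {
      constexpr int32_t kThreshold = 16;  // assumed value, not the real one
      if (num_entries <= 3) {
        return SwitchStrategy::kCompareJump;         // 2*n+1 instructions
      }
      if (num_entries <= kThreshold) {
        return SwitchStrategy::kOptimalCompareJump;  // ~1.5*n+3 instructions
      }
      return SwitchStrategy::kJumpTable;
    }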
Change-Id: I425547b6787057c7fa84e71f17c145b63b208633
|
|
This reverts commit c88ef3a10c474045a3476a02ae75d07ddd3230b7.
Change-Id: I0ed88a48b313a8d28bc39fae40631123aadb13ef
|
|
Fails 425 in debuggable mode.
This reverts commit 4db0bf9c4db6a09716c3388b7d2f88d534470339.
Change-Id: I346df8f75674564fc4fb241c60f23e250fc7f0a7
|
|
The compiler driver makes assumptions that don't hold for
the optimizing compiler; for example, it will always go to the
slow path for an invoke-super when there is no verified method.
Also fix GenerateInvokeVirtual in the presence of intrinsics.
The next change will address some of the TODOs in sharpening.cc.
Change-Id: I2b0e543ee9b9bebcadb2d26de29e850c59ad58b9
|
|
Change-Id: I0fe2da50a30a3f62bec8ea01688dd1fec84b1831
|
|
Change-Id: Iaa74591eed0f2eabc9ba9f9988681d9582faa320
|
|
Change-Id: I6e13e594539e766ed94524ac3282cec292ba91da
|
|
Bug: 12687968
Change-Id: Idf2e371e01e10d9d32c95b150735e2c96244232e
|
|
bug:25735083
bug:25173758
Change-Id: Ie81cfa4fa9c47cc025edb291cdedd7af209a03db
|
|
Implement long
Shl(x,1) as LSLS+ADC,
Shr(x,1) as ASR+RRX and
UShr(x,1) as LSR+RRX.
Remove the simplification substituting Shl(x,1) with
ADD(x,x), as it interferes with some other optimizations
instead of helping them. Since it didn't help 64-bit
architectures anyway, codegen is the correct place for it.
This is now implemented for ARM and x86, so only mips32
remains to be improved.
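To see why long Shl(x,1) maps to LSLS+ADC, here is the computation
on 32-bit halves as a standalone C++ illustration (not codegen code):
LSLS shifts the low word and leaves its old top bit in the carry
flag, and ADC hi,hi,hi then computes hi+hi+carry.
    #include <cstdint>
    #include <cstdio>

    uint64_t ShlLongBy1(uint32_t lo, uint32_t hi) {
      uint32_t carry = lo >> 31;            // bit that LSLS moves into the carry flag
      uint32_t new_lo = lo << 1;            // LSLS lo, lo, #1
      uint32_t new_hi = (hi << 1) | carry;  // ADC hi, hi, hi  (hi + hi + carry)
      return (static_cast<uint64_t>(new_hi) << 32) | new_lo;
    }

    int main() {
      // 0x8000000100000001 << 1 == 0x0000000200000002 (top bit shifted out).
      std::printf("%016llx\n",
                  static_cast<unsigned long long>(ShlLongBy1(0x00000001u, 0x80000001u)));
      return 0;
    }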
Change-Id: Idd14f23292198b2260189e1497ca5411b21743b3
|
|
Change-Id: I4042aefbdac1a8c236d00e2e7145349a64f6486b
|
|
This first implementation uses slow paths to instrument heap
reference loads and GC root loads for the concurrent copying
collector, respectively calling the artReadBarrierSlow and
artReadBarrierForRootSlow runtime entry points.
Notes:
- This implementation does not instrument HInvokeVirtual
nor HInvokeInterface instructions (for class reference
loads), as the corresponding read barriers are not strictly
required with the current concurrent copying collector.
- Intrinsics which may eventually call into the runtime (on
the slow path) are disabled when read barriers are enabled,
as the current slow path infrastructure does not support
this case.
- When read barriers are enabled, the code generated for an
HArraySet instruction always goes into the array set slow
path for object arrays (delegating the operation to the
runtime), as we lack a mechanism to keep a temporary
register live across a runtime call (needed for the
instrumentation of type checking code, which requires
two successive read barriers).
Bug: 12687968
Change-Id: I92e8db414d029f952c07f3d3a98069e46dfdbc2a
|
|
Each code generator implements a method for generating condition
evaluation and branching to arbitrary labels. This patch refactors
it for better clarity but also to generate fewer jumps when the true
branch is the fallthrough successor.
This is preliminary work for implementing HSelect.
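A compilable sketch of the fallthrough logic (all names invented;
this shows the idea, not the actual generated-branch code):
    struct Block {};
    enum class Cond { kEq, kNe };

    Cond Negate(Cond c) { return c == Cond::kEq ? Cond::kNe : Cond::kEq; }
    void EmitCondBranch(Cond, Block*) { /* emit a conditional branch */ }
    void EmitJump(Block*) { /* emit an unconditional jump */ }

    void EmitIf(Cond cond, Block* true_block, Block* false_block, Block* next) {
      if (next == true_block) {
        // True successor is the fallthrough: invert the condition and
        // branch only to the false block.
        EmitCondBranch(Negate(cond), false_block);
      } else {
        EmitCondBranch(cond, true_block);
        if (next != false_block) {
          EmitJump(false_block);  // jump needed only when neither falls through
        }
      }
    }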
Change-Id: Iaa545a5ecbacb761c5aa241fa69140cf6eb5952f
|
|
The locations builder should use ConstantLocation() when the
code generator relies on a location being constant. The code
generator should interrogate locations, not inputs, about
being constant.
Change-Id: Ic35bb84aa9f83e0977b151a0430aca6c88f19cf0
|
|
Change-Id: Id66ef8cdb9e64306f2be547370b90cc100a3e086
|
|
Optimize the code generation for 'if' statements to jump to the
'false' block if the next block to be generated is the 'true' block.
Add an X86-64 test for this case.
Note that ARM64 & MIPS64 have not been updated.
Change-Id: Iebb1352feb9d3bd0142d8b0621a2e3069a708ea7
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
|
Add jump table support to the thumb2 assembler. Jump tables are
a collection of labels for the case targets, and an anchor label
denoting the position of the jump.
Use the jump table support to implement packed-switch support for
arm32.
Add tests for BindTrackedLabel and JumpTable to the thumb2 assembler
test.
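The shape of that data structure, roughly (a hedged sketch; the
field and type names are invented, not the assembler's API):
    #include <vector>

    struct Label { int position = -1; };  // stand-in for an assembler label

    struct JumpTable {
      std::vector<Label> case_targets;  // one label per case target
      Label anchor;                     // marks the position of the jump itself,
                                        // so table entries can be anchor-relative
    };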
Bug: 24092914
Change-Id: I5c84f193dfebf9e07f48678efc8bd151bb1410dd
|
|
Determine the dispatch type of invoke-static/-direct in a
special pass right after the type inference. This allows the
inliner to pass the "needs dex cache" check and inline more.
It also allows the code generator to avoid requesting a
register location for the ArtMethod* for kDexCachePcRelative
and direct methods.
The supported-dispatch check also handles situations that
the CompilerDriver currently doesn't allow. The cleanup of
the CompilerDriver and the required changes to Quick will come
in a separate change.
Change-Id: I3f8e903a119949e95871d8ab0a995f4731a13a07
|
|
Rationale: the deopt instruction is very similar to an if,
so the existing assumption that it always has a
conditional "under the hood" is unsafe: optimizations
may have replaced the conditional with an actual value.
This CL generalizes the handling of deopt.
Change-Id: I1c6cb71fdad2af869fa4714b38417dceed676459
|
|
Use it in lieu of UNUSED(), which had some incorrect uses.
Change-Id: If247dce58b72056f6eea84968e7196f0b5bef4da
|
|