Change-Id: Ie6aba02f4223b1de02530e1515c63505f37e184c
|
|
Implement floor/ceil/round/RoundFloat on x86 and x86_64.
Implement RoundDouble on x86_64.
Add support for roundss and roundsd on both architectures. Support them
in the disassembler as well.
Add the instruction set features for x86, as the 'round' instruction is
only available when SSE4.1 is supported (see the sketch below).
Fix the tests to handle passing the instruction set features to the
x86 and x86_64 code generators.
Add assembler tests for roundsd and roundss to x86_64 assembler tests.
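For illustration, a minimal sketch of how floor/ceil map onto roundsd,
written in plain C++ with SSE4.1 intrinsics (not ART code); the
immediate operand of roundsd selects the rounding mode:

  // Compile with -msse4.1; illustrative only.
  #include <smmintrin.h>

  double FloorViaRoundsd(double x) {
    __m128d v = _mm_set_sd(x);
    // _MM_FROUND_FLOOR = round toward negative infinity.
    return _mm_cvtsd_f64(_mm_round_sd(v, v, _MM_FROUND_FLOOR));
  }

  double CeilViaRoundsd(double x) {
    __m128d v = _mm_set_sd(x);
    // _MM_FROUND_CEIL = round toward positive infinity.
    return _mm_cvtsd_f64(_mm_round_sd(v, v, _MM_FROUND_CEIL));
  }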
Change-Id: I9742d5930befb0bbc23f3d6c83ce0183ed9fe04f
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
|
The optimization recognizes the negation pattern generated by 'javac'
and replaces it with a single condition. To this end, boolean values
are now consistently assumed to be represented by an integer.
This is the first optimization that deletes blocks from the HGraph. It
does so by replacing the corresponding entries with null, so existing
code can continue indexing the list of blocks by block ID but must
check for null when iterating over the list.
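As a hedged illustration (C++ rather than dex bytecode), the shape of
the pattern before and after the pass:

  // javac lowers `!b` into a compare-and-branch diamond that
  // materializes the constants 0 and 1:
  bool NegateBranchy(bool b) {
    if (b) {
      return false;
    } else {
      return true;
    }
  }

  // After the optimization, the diamond collapses into a single
  // negated condition:
  bool NegateOptimized(bool b) {
    return !b;
  }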
Change-Id: I7779da69cfa925c6521938ad0bcc11bc52335583
|
|
ART had several implementations of art::bit_cast:
1. one in runtime/base/casts.h, declared as:
template <class Dest, class Source>
inline Dest bit_cast(const Source& source);
2. another one in runtime/utils.h, declared as:
template<typename U, typename V>
static inline V bit_cast(U in);
3. and a third local version, in runtime/memory_region.h,
similar to the previous one:
template<typename Source, typename Destination>
static Destination MemoryRegion::local_bit_cast(Source in);
This CL removes versions 2. and 3. and changes their callers
to use 1. instead. That version was chosen over the others
because:
- it was the oldest one in the code base; and
- its syntax was closer to that of the standard C++ cast
operators, as it supports the following use:
bit_cast<Destination>(source)
since `Source' can be deduced from `source'.
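A minimal sketch of the retained form, assuming the usual memcpy-based
implementation (not necessarily ART's exact code):

  #include <cstring>

  template <class Dest, class Source>
  inline Dest bit_cast(const Source& source) {
    static_assert(sizeof(Dest) == sizeof(Source),
                  "bit_cast requires types of the same size");
    Dest dest;
    // memcpy keeps the byte reinterpretation well-defined.
    std::memcpy(&dest, &source, sizeof(dest));
    return dest;
  }

  // Usage: uint64_t bits = bit_cast<uint64_t>(3.14);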
Change-Id: I7334fd5d55bf0b8a0c52cb33cfbae6894ff83633
|
|
Change-Id: Id9aafcc13c1a085c17ce65d704c67b73f9de695d
|
|
- moved environment recording from code generator to stack map stream
- added creation/loading factory methods for the DexRegisterMap (hides
internal details)
- added new tests
Change-Id: Ic8b6d044f0d8255c6759c19a41df332ef37876fe
|
|
Move the logic for saving/restoring live registers in a slow path
into the SlowPathCode method. Also add a RecordPcInfo helper to
SlowPathCode that will act as the placeholder for saving correct
stack maps.
Change-Id: I25c2bc7a642ef854bbc8a3eb570e5c8c8d2d030c
|
|
Change-Id: I86959eca5d8f5458ff75c78776b0af9db9c26800
|
|
Instructions remain live when debuggable, but only instructions
with object types remain live when non-debuggable.
Enable StackVisitor::GetThisObject for the optimizing compiler.
Change-Id: Id87b2cbf33a02450059acc9993995782e5f28987
|
|
- Replace the current list-based (fixed-size) Dex register
encoding in stack maps emitted by the optimizing compiler
with another list-based, variable-size Dex register
encoding that compresses short locations into a single byte
(3 bits for the location kind, 5 bits for the value); other
(large) values remain encoded in 5 bytes. See the packing
sketch after this list.
- In addition, use slot offsets instead of byte offsets to
encode the location of Dex registers placed in stack
slots at small offsets, as it enables more values to use
the short (1-byte wide) encoding instead of the large
(5-byte wide) one.
- Rename art::DexRegisterMap::LocationKind as
art::DexRegisterLocation::Kind, turn it into a
strongly-typed enum based on a uint8_t, and extend it to
support new kinds (kInStackLargeOffset and
kConstantLargeValue).
- Move art::DexRegisterEntry from
compiler/optimizing/stack_map_stream.h to
runtime/stack_map.h and rename it as
art::DexRegisterLocation.
- Adjust art::StackMapStream,
art::CodeGenerator::RecordPcInfo,
art::CheckReferenceMapVisitor::CheckOptimizedMethod,
art::StackVisitor::GetVRegFromOptimizedCode, and
art::StackVisitor::SetVRegFromOptimizedCode.
- Implement unaligned memory accesses in art::MemoryRegion.
- Use them to manipulate data in Dex register maps.
- Adjust oatdump to support the new Dex register encoding.
- Update compiler/optimizing/stack_map_test.cc.
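A hedged sketch of the short-form packing from the first bullet
(illustrative helpers, not ART's actual encoding code):

  #include <cstdint>

  // 3 bits of location kind + 5 bits of value in a single byte; values
  // that do not fit fall back to the 5-byte large encoding.
  constexpr unsigned kValueBits = 5;
  constexpr uint8_t kMaxShortValue = (1u << kValueBits) - 1;  // 31

  uint8_t PackShortLocation(uint8_t kind, uint8_t value) {
    return static_cast<uint8_t>((kind << kValueBits) |
                                (value & kMaxShortValue));
  }

  uint8_t UnpackKind(uint8_t packed) { return packed >> kValueBits; }
  uint8_t UnpackValue(uint8_t packed) { return packed & kMaxShortValue; }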
Change-Id: Icefaa2e2b36b3c80bb1b882fe7ea2f77ba85c505
|
|
Add ESI, EDI, and EBP to the available registers for non-baseline mode.
Ensure that they aren't used when byte-addressable registers are needed.
Change-Id: Ie7130d4084c2ae9cfcd1e47c26eb3e5dcac1ebd6
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
|
Implement remaining explicit memory barrier code paths and temporarily
enable the use of explicit memory barriers for testing.
This CL also enables the use of instruction set features in the ARM64
backend. kUseAcquireRelease has been replaced with PreferAcquireRelease(),
which for now is statically set to false (prefer explicit memory barriers).
Please note that we still prefer acquire-release for the ARM64 Optimizing
Compiler, but we would like to exercise the explicit memory barrier code
path too.
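As a rough C++ analogue of the two code paths (standard atomics, not
the backend's actual output):

  #include <atomic>

  int LoadAcquire(const std::atomic<int>& a) {
    // Maps to an acquire load, i.e. ldar on ARM64.
    return a.load(std::memory_order_acquire);
  }

  int LoadWithExplicitBarrier(const std::atomic<int>& a) {
    // Plain load followed by an explicit barrier, i.e. ldr + dmb.
    int value = a.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acquire);
    return value;
  }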
Change-Id: I84e047ecd43b6fbefc5b82cf532e3f5c59076458
Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
|
|
We used to be forgiving because of HIntConstant(0) also being
used for null. We now create a special HNullConstant for such uses.
Also, we now need to run dead phi elimination twice during SSA
building to ensure correctness.
Change-Id: If479efa3680d3358800aebb1cca692fa2d94f6e5
|
|
A few elements (stack mask, inline info) are not displayed yet, though.
Change-Id: I5e51a801c580169abc5d1ef43ad581aadc110754
|
|
When a block branches to a non-following block, but the blocks
in between do branch to it, we can avoid emitting the branch.
Change-Id: I9b343f662a4efc718cd4b58168f93162a24e1219
|
|
Avoid suspend checks and stack changes when not needed.
Change-Id: I0fdb31e8c631e99091b818874a558c9aa04b1628
|
|
The [i, i + 1) interval scheme we chose for representing
lifetime positions is not optimal for doing this optimization.
It does not, however, prevent recognizing a non-split interval
during the TryAllocateFreeReg phase and trying to re-use
its inputs' registers.
Change-Id: I80a2823b0048d3310becfc5f5fb7b1230dfd8201
|
|
Use the linearized order instead, as it does a better job of
placing related blocks next to each other. Also, it does not
contain dead blocks.
Change-Id: Ie65b56041a093c8155e6c1e06351cb36a4053505
|
|
Change-Id: I7c519b7a828c9891b1141a8e51e12d6a8bc84118
|
|
- Share the computation of core_spill_mask and fpu_spill_mask
between backends.
- Remove explicit stack overflow check support: the checks would need
to be adjusted, and since they are not tested, they would easily bitrot.
Change-Id: I0b619b8de4e1bdb169ea1ae7c6ede8df0d65837a
|
|
Will work on other architectures and FP support in other CLs.
Change-Id: I8cef0343eedc7202d206f5217fdf0349035f0e4d
|
|
Replace the NULL macro with the C++ nullptr keyword.
Change-Id: Ib6e48dd4bb3c254343383011b67372622578ca76
Signed-off-by: Jean Christophe Beyler <jean.christophe.beyler@intel.com>
|
|
This reverts commit c399fdc442db82dfda66e6c25518872ab0f1d24f.
Change-Id: I19f8215c4b98f2f0827e04bf7806c3ca439794e5
|
|
ImplicitNullChecks are recorded only for instructions directly (see NB
below) preceded by NullChecks in the graph. This way we avoid recording
redundant safepoints and minimize the code size increase.
NB: ParallelMoves might be inserted by the register allocator between
the NullChecks and their uses. These modify the environment and the
correct action would be to reverse their modification. This will be
addressed in a follow-up CL.
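For context, a hedged, self-contained sketch of the mechanism implicit
null checks rely on (not ART's implementation): the faulting load
itself acts as the null test, with a signal handler standing in for
the throw:

  #include <csignal>
  #include <cstdlib>
  #include <unistd.h>

  extern "C" void SegvHandler(int) {
    // A runtime would map the faulting PC to a stack map and throw
    // NullPointerException; here we just report and exit.
    const char msg[] = "fault: would throw NullPointerException\n";
    write(STDERR_FILENO, msg, sizeof(msg) - 1);  // async-signal-safe
    std::_Exit(0);
  }

  int main() {
    std::signal(SIGSEGV, SegvHandler);
    volatile int* ptr = nullptr;
    return *ptr;  // No explicit null test; the fault is the check.
  }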
Change-Id: Ie50006e5a4bd22932dcf11348f5a655d253cd898
|
|
Libcore tests fail.
This reverts commit 41aedbb684ccef76ff8373f39aba606ce4cb3194.
Change-Id: I2572f120d4bbaeb7a4d4cbfd47ab00c9ea39ac6c
|
|
Enabled on ARM for longs and doubles.
Change-Id: Id8792d08bd7ca9fb049c5db8a40ae694bafc2d8b
|
|
- for backends: arm, arm64, x86, x86_64
- fixed parameter passing for CodeGenerator
- 003-omnibus-opcodes test verifies that NullPointerExceptions work as
expected
Change-Id: I1b302acd353342504716c9169a80706cf3aba2c8
|
|
The ParallelMoveResolver does not work with pairs. Instead,
decompose the pair into two individual moves.
Change-Id: Ie9d3f0b078cef8dc20640c98b20bb20cc4971a7f
|
|
The current stack frame calculation assumes that each live register to
be saved/restored has the word size of the machine. This fails for X86,
where a double in an XMM register takes up 8 bytes. Change the
calculation to keep track of the number of core registers and number of
fp registers to handle this distinction.
This is slightly pessimal, as the registers may not be active at the
same time, but the only way to handle this would be to allocate both
classes of registers simultaneously, or to remember all the active
intervals, match them up, and compute the size of each safepoint
interval.
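A hedged sketch of the corrected accounting (illustrative names, not
ART's code): core and FP registers are counted separately because an
XMM spill slot is wider than a core-register slot on x86:

  #include <cstddef>

  size_t SpillAreaSize(size_t num_core_registers,
                       size_t num_fp_registers) {
    const size_t kCoreSlotBytes = 4;  // x86 word size
    const size_t kFpSlotBytes = 8;    // a double in an XMM register
    return num_core_registers * kCoreSlotBytes +
           num_fp_registers * kFpSlotBytes;
  }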
Change-Id: If7860aa319b625c214775347728cdf49a56946eb
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
|
Also fix an (intentional) memory leak by allocating the CodeGenerator
on the heap instead of the arena: it constructs an Assembler object
that requires destruction.
BUG:18787334
Change-Id: I8cf0667cb70ce5b14d4ac334bd4487a562635f1b
|
|
The basic approach is:
- An instruction that needs two registers gets two intervals.
- When allocating the low part, we also allocate the high part.
- When splitting a low (or high) interval, we also split the high
(or low) equivalent.
- Allocation follows the (S/D register) requirement that low
registers are always even and the high equivalent is low + 1.
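A hedged sketch of the pairing invariant from the last bullet
(illustrative, mirroring the S/D register aliasing rule):

  // A valid pair occupies an even low register and the next register,
  // the way a D register overlaps two consecutive S registers.
  bool IsValidRegisterPair(int low, int high) {
    return (low % 2) == 0 && high == low + 1;
  }
  // E.g. (0, 1) and (2, 3) are valid pairs; (1, 2) is not.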
Change-Id: I06a5148e05a2ffc7e7555d08e871ed007b4c2797
|
|
Change-Id: Ia882405719fdd60b63e4102af7e085f7cbe0bb2a
|
|
Introduce a swap-space and corresponding allocator to transparently
switch native allocations to memory backed by a file.
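The core idea, sketched under the assumption of a simple mmap-based
design (not ART's actual SwapSpace code): back allocations with a file
mapping so the pages can be swapped to disk:

  #include <cstddef>
  #include <sys/mman.h>
  #include <unistd.h>

  void* AllocFileBacked(int fd, size_t size) {
    // Grow the backing file, then map it shared so the pages are
    // file-backed rather than anonymous memory.
    if (ftruncate(fd, static_cast<off_t>(size)) != 0) return nullptr;
    void* ptr = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    return ptr == MAP_FAILED ? nullptr : ptr;
  }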
Bug: 18596910
(cherry picked from commit 62746d8d9c4400e4764f162b22bfb1a32be287a9)
Change-Id: I131448f3907115054a592af73db86d2b9257ea33
|
|
X64 has one libcore test failing, and codegen_test on
arm is failing.
This reverts commit 6004796d6c630696127df2494dcd4f30d1367a34.
Change-Id: I20e00431fa18e11ce4c0cb6fffa91977fa8e9b4f
|
|
This change builds on:
https://android-review.googlesource.com/#/c/118983/
- Also fix an x86_64 assembler bug triggered by this change.
- Fix (and improve) x86's backend byte register usage.
- Fix a bug in the baseline register allocator: a fixed
out register must prevent inputs from allocating it.
Change-Id: I4883862e29b4e4b6470f1823cf7eab7e7863d8ad
|
|
Currently only inlines simple things that don't require an
environment, such as:
- Returning a constant.
- Returning a parameter.
- Returning an arithmetic operation.
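Hedged C++ analogues of the callee shapes listed above (the inliner
works on the dex equivalents; these are just illustrations):

  int ReturnConstant() { return 42; }            // a constant
  int ReturnParameter(int x) { return x; }       // a parameter
  int ReturnSum(int a, int b) { return a + b; }  // an arithmetic operation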
Change-Id: Ie844950cb44f69e104774a3cf7a8dea66bc85661
|
|
- for arm, x86, x86_64 backends
- reinstated fmod quick entry points for x86. This is a partial revert
of bd3682eada753de52975ae2b4a712bd87dc139a6, which added inline assembly
for floating point rem on x86. Note that Quick still uses the inline
version.
- fix rem tests for longs
Change-Id: I73be19a9f2f2bcf3f718d9ca636e67bdd72b5440
|
|
- Add support for the float-to-long Dex instruction in the
optimizing compiler.
- Add a Dex PC field to art::HTypeConversion to allow the
x86 and ARM code generators to produce runtime calls.
- Instruct art::CodeGenerator::RecordPcInfo not to record
PC information for HTypeConversion instructions.
- Add S0 to the list of ARM FPU parameter registers.
- Have art::x86_64::X86_64Assembler::cvttss2si work with
64-bit operands (see the sketch after this list).
- Generate x86, x86-64 and ARM (but not ARM64) code for
float to long HTypeConversion nodes.
- Add related tests to test/422-type-conversion.
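A hedged sketch of the 64-bit form using a standard intrinsic (not
ART's assembler): cvttss2si with a 64-bit destination truncates a
float directly to int64_t:

  #include <immintrin.h>
  #include <cstdint>

  int64_t FloatToLongTruncated(float f) {
    // Emits cvttss2si with a 64-bit destination (x86-64 only). Java
    // NaN/overflow semantics still need extra handling around this.
    return _mm_cvttss_si64(_mm_set_ss(f));
  }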
Change-Id: I954214f0d537187883f83f7a83a1bb2dd8a21fd4
|
|
This patch updates the interface to VIXL 1.7 and enables the debug version of
VIXL when ART is built in debug mode.
Change-Id: I443fb941bec3cffefba7038f93bb972e6b7d8db5
Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
|
|
- Add support for the long-to-double Dex instruction in the
optimizing compiler.
- Enable requests of temporary FPU (double) registers during
code generation.
- Fix art::x86::X86Assembler::LoadLongConstant and extend
it to int64_t values.
- Have art::x86_64::X86_64Assembler::cvtsi2sd work with
64-bit operands.
- Generate x86, x86-64 and ARM (but not ARM64) code for
long to double HTypeConversion nodes.
- Add related tests to test/422-type-conversion.
Change-Id: Ie73d9e5e25bd2e15f585c371e8fc2dcb83438ccd
|
|
Change-Id: I8f28dd287c0e04223c49dea6a323058c1b210913