- 06 May, 2016 1 commit
Serguei Katkov authored
Change-Id: I43f41ef2fdf6475238f0987842aefb1c2eb6a36d
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>

- 13 Apr, 2016 1 commit
Vladimir Marko authored
And clean up some APIs to return std::unique_ptr<> instead of raw pointers that don't communicate ownership.
Change-Id: I3017302307a0253d661240750298802fb0d9585e

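The ownership point in miniature, as a hedged sketch (the `Arena` factory below is a stand-in, not the API this CL actually changed):

```cpp
#include <memory>

struct Arena {};  // hypothetical payload type, for illustration only

// Before: Arena* CreateArena();  -- the caller must guess who calls delete.
// After: the signature itself states that ownership transfers to the caller,
// and the object is released automatically when the unique_ptr dies.
std::unique_ptr<Arena> CreateArena() {
  return std::make_unique<Arena>();
}
```
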
- 12 Feb, 2016 1 commit
Mark Mendell authored
Add support for the memory form of CMOV. Add tests.
Change-Id: Ib9f5dbd3031c7e235ee3f2afdb7db75eed46277a
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>

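For context, CMOV computes a branchless select; the memory form lets the selected source be read straight from memory instead of being loaded into a register first. A sketch of the pattern (illustrative only, not the generated code):

```cpp
#include <cstdint>

// A branchless select like this is what CMOV computes; with the memory
// form, a compiler can emit something like "cmovl reg, [mem]" and read
// *b directly, saving the separate load.
int32_t Max(int32_t a, const int32_t* b) {
  int32_t result = a;
  if (result < *b) {
    result = *b;
  }
  return result;
}
```
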
- 21 Jan, 2016 1 commit
Aart Bik authored
Rationale: recognizing this important operation as an intrinsic has various advantages:
(1) the no-side-effects/no-throw properties allow for much more GVN/LICM/BCE;
(2) some architectures, like x86_64, provide direct support for this operation.
Performance improvements on x86_64:
CheckersEvalBench (32-bit bitboard): 27,210 KNS -> 36,798 KNS = +35%
ReversiEvalBench (64-bit bitboard): 52,562 KNS -> 89,086 KNS = +69%
Change-Id: I65d549b0469b7909b12c6611cdc34a8640a5751f

- 14 Oct, 2015 1 commit
Mark Mendell authored
Implement PackedSwitch using a jump table of offsets to blocks.
Bug: 24092914
Bug: 21119474
Change-Id: I83430086c03ef728d30d79b4022607e9245ef98f
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>

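A sketch of the jump-table idea behind this lowering (names are illustrative, not ART's): bias the switch value by the lowest case, bounds-check with one unsigned compare, then index a table of block offsets:

```cpp
#include <cstdint>

// Returns the offset of the block to jump to; 'offsets' stands in for the
// jump table emitted next to the code. Doing the subtraction in unsigned
// arithmetic folds the "< low" and ">= high" checks into one comparison.
int32_t PackedSwitchTarget(int32_t value, int32_t low,
                           const int32_t* offsets, uint32_t num_entries,
                           int32_t default_offset) {
  uint32_t index = static_cast<uint32_t>(value) - static_cast<uint32_t>(low);
  if (index >= num_entries) {
    return default_offset;  // out of range: fall through to the default block
  }
  return offsets[index];
}
```
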
- 17 Sep, 2015 1 commit
Andreas Gampe authored
Refactor slow paths so that there is a default implementation for the common cases (only arm64 with vixl is special). Write a generic intrinsic slow path that can be reused by the specific architectures. Move helper functions into CodeGenerator so that they are accessible.
Change-Id: Ibd788dce432601c6a9f7e6f13eab31f28dcb8550

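A minimal sketch of the shape this refactoring gives slow paths; the class and method names below are approximations, not ART's exact API:

```cpp
// Base slow path: a default EmitNativeCode() covers the common pattern
// (save live registers, call the runtime, restore, jump back), so only
// a genuinely special backend -- arm64 with vixl -- overrides the flow.
class SlowPathSketch {
 public:
  virtual ~SlowPathSketch() {}
  virtual void EmitNativeCode() {
    SaveLiveRegisters();
    CallRuntimeEntrypoint();
    RestoreLiveRegisters();
    JumpToExit();
  }

 protected:
  // Helpers like these live on the code generator so slow paths can reach them.
  virtual void SaveLiveRegisters() {}
  virtual void CallRuntimeEntrypoint() {}
  virtual void RestoreLiveRegisters() {}
  virtual void JumpToExit() {}
};

// One generic intrinsic slow path, shared across architectures.
class IntrinsicSlowPathSketch : public SlowPathSketch {};
```
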
- 16 Sep, 2015 1 commit
Mark Mendell authored
These are for use in new intrinsics. Bsf (Bit Scan Forward) is used in {Long,Integer}NumberOfTrailingZeros, and the rotates are used in {Long,Integer}Rotate{Left,Right}.
Change-Id: Icb599d7e1eec4e4ea9e5b4f0b1654c7b8d4de678
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>

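What these map to at the Java level, in portable form (a sketch of the semantics; the intrinsics emit the instructions directly rather than loops like these):

```cpp
#include <cstdint>

// numberOfTrailingZeros: bsf finds the lowest set bit in one instruction;
// it leaves its result undefined for 0, hence the explicit guard.
uint32_t NumberOfTrailingZeros(uint32_t x) {
  if (x == 0u) return 32u;
  uint32_t n = 0u;
  while ((x & 1u) == 0u) {
    x >>= 1;
    ++n;
  }
  return n;
}

// rotateLeft: masking the distance mirrors how x86 rotate instructions
// treat the count modulo the operand width, and avoids a shift by 32.
uint32_t RotateLeft(uint32_t x, uint32_t distance) {
  distance &= 31u;
  return (x << distance) | (x >> ((32u - distance) & 31u));
}
```
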
- 26 Aug, 2015 1 commit
Mark Mendell authored
The optimizing compiler uses 32-bit relative jumps for all forward jumps, just in case the offset is too large to fit in one byte. Some of the generated code knows that the jumps will in fact fit.
Add a 'NearLabel' class to the x86 and x86_64 assemblers. This will be used to generate known-short forward branches.
Add the jecxz/jrcxz instructions, which only handle a short offset. They will be used for intrinsics.
Add tests for the new instructions and NearLabel.
Change-Id: I11177f36394d35d63b32364b0e6289ee6d97de46
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>

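The payoff is encoding size: a short conditional jump takes 2 bytes versus 6 for the conservative rel32 form, and jecxz/jrcxz have no long form at all. A sketch of the bound check a NearLabel relies on:

```cpp
#include <cstdint>

// A jcc rel8 encodes in 2 bytes; the rel32 form needs 6. Binding a
// NearLabel asserts that the forward branch distance satisfies this.
bool FitsInRel8(int64_t displacement) {
  return displacement >= INT8_MIN && displacement <= INT8_MAX;
}
```
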
- 14 Aug, 2015 2 commits
Mark Mendell authored
Add support for the 'bsr' instruction. Add tests.
Change-Id: I1cd8b30d7f3f5ee7fbeef8124cc6a31bf8ce59d5
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>

Mark Mendell authored
Add 'REP MOVSW' as a supported instruction for 32- and 64-bit x86. Add tests.
Change-Id: I1c615ac1e7fa46c48983c90f791b92be0375c8b8
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>

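What 'rep movsw' does, expressed in C++ (semantics only; sized-word copies like this suit Java char data):

```cpp
#include <cstddef>
#include <cstdint>

// 'rep movsw' copies a count of 16-bit words from the source pointer to
// the destination pointer, advancing both as it goes.
void RepMovsw(uint16_t* dst, const uint16_t* src, size_t count) {
  while (count-- != 0u) {
    *dst++ = *src++;  // one movsw step; 'rep' repeats it count times
  }
}
```
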
- 11 Aug, 2015 1 commit
Mark Mendell authored
Use the constant area some more, use 32-bit immediates in movq instructions when possible, and make other small tweaks. Remove the commented-out code for Math.Abs(float/double), as it would fail for the baseline compiler due to the output being the same as the input.
Change-Id: Ifa39f1865b94cec2e1c0a99af3066a645e9d3618
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>

- 04 Aug, 2015 1 commit
agicsaki authored
Change-Id: I9085694fd3313581b2775a8267ccda58fec19a1a

- 31 Jul, 2015 1 commit
- 30 Jul, 2015 1 commit
Mark Mendell authored
Add moves that don't pollute the data cache. These can be used for assigning large data structures.
Change-Id: I14d91ba6264f5ce2f128033d65d59b2536426643
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>

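The idea behind these non-temporal ("streaming") moves, sketched with SSE2 intrinsics; ART emits the instructions from its code generator, so this shows only the semantics:

```cpp
#include <emmintrin.h>  // _mm_stream_si32 (movnti), _mm_sfence

// Streaming stores write around the cache, so filling a large structure
// that will not be read back soon does not evict useful cache lines.
void StreamFill(int* dst, int value, int count) {
  for (int i = 0; i < count; ++i) {
    _mm_stream_si32(dst + i, value);
  }
  _mm_sfence();  // order the streaming stores before later normal accesses
}
```
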
- 28 Jul, 2015 1 commit
Mark Mendell authored
The X86_64 code generator generated 3-operand multiplies for long multiplication only. Add support for 3-operand multiplication for int as well, for both X86 and X86_64.
Note that the RHS operand must be a 32-bit constant, and that it is possible for the constant to end up in a register (!) due to a previous use by another instruction. Handle this case by checking the operand; otherwise the first input might not be the same as the output, due to the use of Any().
Also allow stack operands for multiplication.
Change-Id: I8f3d14cc01e9a91210f418258aa18065ee87979d
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>

- 27 Jul, 2015 1 commit
agicsaki authored
Change-Id: I7634959eebb64d607f47497db320d5c2afdef16b

- 01 Jul, 2015 1 commit
Roland Levillain authored
- Instrument the ARM, ARM64, x86 and x86-64 code generators.
- Note: to turn heap poisoning on in Optimizing, set the environment variable `ART_HEAP_POISONING` to "true" before compiling ART.
Bug: 12687968
Change-Id: Ib3120b38cf805a8a50207a314b9ccc90c8d93740

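The gist of heap poisoning, as a sketch; that the transform is a simple negation is my assumption here, not something this log states:

```cpp
#include <cstdint>

// References are stored in the heap in a transformed ("poisoned") form.
// Instrumented code generators wrap every heap reference load/store in
// Unpoison/Poison; any path that forgets fails loudly, because the raw
// stored value is not a usable pointer. Negation is assumed here.
inline uint32_t PoisonReference(uint32_t ref) { return 0u - ref; }
inline uint32_t UnpoisonReference(uint32_t poisoned) { return 0u - poisoned; }
```
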
- 02 Jun, 2015 1 commit
Mathieu Chartier authored
Optimizing + quick tests are passing, devices boot. TODO: test and fix bugs in mips64. Saves 16 bytes for most ArtMethods, 7.5MB reduction in system PSS. Some of the savings are from the removal of the virtual-methods and direct-methods object arrays.
Bug: 19264997
(cherry picked from commit e401d146)
Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d

Fix some ArtMethod related bugs
Added root visiting for runtime methods, not currently required since the GcRoots in these methods are null. Added missing GetInterfaceMethodIfProxy in GetMethodLine; fixes --trace run-tests 005 and 044. Fixed an optimizing-compiler bug where we used a normal stack location instead of a double on ARM64; this fixes the debuggable tests. TODO: fix JDWP tests.
Bug: 19264997
Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3

ART: Fix casts for 64-bit pointers on 32-bit compiler.
Bug: 19264997
Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457

Fix JDWP tests after ArtMethod change
Fixes Throwable::GetStackDepth for exception event detection after the internal stack trace representation change. Adds a missing ArtMethod::GetInterfaceMethodIfProxy call in the case of a proxy method.
Bug: 19264997
Change-Id: I363e293796848c3ec491c963813f62d868da44d2

Fix accidental IMT and root marking regression
Was always using the conflict trampoline. Also included a fix for a regression in GC time caused by extra roots; most of the regression was IMT. Fixed a bug in DumpGcPerformanceInfo where we would get SIGABRT due to a detached thread. EvaluateAndApplyChanges: from ~2500 -> ~1980. GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots.
Bug: 19264997
Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0

Fix bogus image test assert
Previously we were comparing the size of the non-moving space to the size of the image file. Now we properly compare the size of the image space against the size of the image file.
Bug: 19264997
Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a

[MIPS64] Fix art_quick_invoke_stub argument offsets.
ArtMethod references got bigger, so we need to move the other args and leave enough space for ArtMethod* and the 'this' pointer. This fixes mips64 boot.
Bug: 19264997
Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab

- 30 May, 2015 1 commit
Mathieu Chartier authored
Optimizing + quick tests are passing, devices boot. TODO: test and fix bugs in mips64. Saves 16 bytes for most ArtMethods, 7.5MB reduction in system PSS. Some of the savings are from the removal of the virtual-methods and direct-methods object arrays.
Bug: 19264997
Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d

- 26 May, 2015 2 commits
Vladimir Marko authored
Avoid undefined behavior for arm64 stemming from 1u << 32 in loops with upper bound kNumberOfXRegisters. Create iterators for enumerating the bits in an integer either from high to low or from low to high, and use them for <arch>Context::FillCalleeSaves() on all architectures. Refactor runtime/utils.{h,cc} by moving all bit-fiddling functions to runtime/base/bit_utils.{h,cc} (together with the new bit iterators) and all time-related functions to runtime/base/time_utils.{h,cc}. Improve test coverage and fix some corner cases for the bit-fiddling functions.
Bug: 13925192
(cherry picked from commit 80afd020)
Change-Id: I905257a21de90b5860ebe1e39563758f721eab82

Vladimir Marko authored
Avoid undefined behavior for arm64 stemming from 1u << 32 in loops with upper bound kNumberOfXRegisters. Create iterators for enumerating the bits in an integer either from high to low or from low to high, and use them for <arch>Context::FillCalleeSaves() on all architectures. Refactor runtime/utils.{h,cc} by moving all bit-fiddling functions to runtime/base/bit_utils.{h,cc} (together with the new bit iterators) and all time-related functions to runtime/base/time_utils.{h,cc}. Improve test coverage and fix some corner cases for the bit-fiddling functions.
Bug: 13925192
Change-Id: I704884dab15b41ecf7a1c47d397ab1c3fc7ee0f7

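The undefined behavior and the iterator-style fix, sketched below; this is not ART's actual bit_utils API, and `__builtin_ctz` assumes GCC/Clang:

```cpp
#include <cstdint>

// Buggy pattern: a loop whose guard computes (mask & (1u << i)) with i
// running up to kNumberOfXRegisters eventually evaluates 1u << 32, a
// shift by the full operand width, which is undefined behavior.
// Iterating over the set bits themselves never shifts by the full width.
template <typename Visitor>
void ForEachBitLowToHigh(uint32_t mask, Visitor visit) {
  while (mask != 0u) {
    visit(static_cast<uint32_t>(__builtin_ctz(mask)));  // lowest set bit
    mask &= mask - 1u;                                  // clear that bit
  }
}
```
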
- 11 May, 2015 1 commit
Andreas Gampe authored
Add intrinsics implementations for indexOf in the optimizing compiler. These are mostly ported from Quick. Add instruction support to assemblers where necessary.
Change-Id: Ife90ed0245532a5c436a26fe84715dc357f353c8

- 22 Apr, 2015 1 commit
Mathieu Chartier authored
Also fixed some lines that were too long, and a few other minor details.
Change-Id: I6efba5fb6e03eb5d0a300fddb2a75bf8e2f175cb

- 21 Apr, 2015 1 commit
Mark Mendell authored
Allow constant operands and memory addresses for more X86_64 instructions. Add memory formats to X86_64 instructions to match. Fix a bug in cmpq(CpuRegister, const Address&). Allow mov <addr>, immediate (instruction 0xC7) to be a valid faulting instruction.
Change-Id: I5b8a409444426633920cd08e09f687a7afc88a39
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>

- 13 Apr, 2015 3 commits
Roland Levillain authored
- Ensure the double- and quadword x87 (FPU) instructions for integer loading (resp. fildl and fildll) are properly generated by the x86 and x86-64 generators (resp. X86Assembler::filds/X86_64Assembler::filds and X86Assembler::fildl/X86_64Assembler::fildl).
- Ensure the double- and quadword x87 (FPU) instructions for integer storing & popping (resp. fistpl and fistpll) are properly generated by the x86 and x86-64 generators (resp. X86Assembler::fistps/X86_64Assembler::fistps and X86Assembler::fistpl/X86_64Assembler::fistpl).
These instructions can be used in the implementation of the long-to-float and long-to-double Dex type conversions.
Change-Id: Iade52a9aee326d189d77d3dbd352a2b5dab52e46

Nicolas Geoffray authored
Test fails on arm. This reverts commit 2d45b4df.
Change-Id: Id2864917b52f7ffba459680303a2d15b34f16a4e

Serguei Katkov authored
The long-to-fp conversion implemented using SSE loses precision. The test is included. This CL uses the FPU to produce the correct result.
Change-Id: I8eaf3c46819a8cb52642a7e7d7c4e3e0edbc88db
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>

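A worked example of the failure mode (the exact buggy instruction sequence isn't stated here; this shows how a long-to-float conversion that goes through double rounds twice and can disagree with the single, correctly-rounded conversion that the FPU's 64-bit fild path provides):

```cpp
#include <cstdint>
#include <cstdio>

int main() {
  // 2^53 + 2^29 + 1: exactly representable neither as double nor as float.
  int64_t x = (INT64_C(1) << 53) + (INT64_C(1) << 29) + 1;
  float once  = static_cast<float>(x);                       // single rounding
  float twice = static_cast<float>(static_cast<double>(x));  // double rounding
  // Prints two different values: 9007200328482816 vs 9007199254740992.
  std::printf("%.0f\n%.0f\n", once, twice);
  return 0;
}
```
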
- 10 Apr, 2015 1 commit
Mark Mendell authored
Nicolas had some comments after the patch https://android-review.googlesource.com/#/c/144100 had merged. Fix the problems that he found.
Change-Id: I40e8a4273997860db7511dc8f1986281b72bead2
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>

- 09 Apr, 2015 2 commits
Mark Mendell authored
Support a constant area addressed using RIP on x86_64. Use it for FP operations to avoid loading constants into a CPU register and then moving them to an XMM register.
Change-Id: I58421759ef2a8475538876c20e696ec787015a72
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>

Guillaume Sanchez authored
This is done using the algorithms in Hacker's Delight, chapter 10.
Change-Id: I7bacefe10067569769ed31a1f7834f796fb41119

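The flavor of the technique, in its simplest unsigned form: division by a constant becomes a multiply by a precomputed "magic" reciprocal plus a shift. A sketch for n / 3 (the book derives magics and shift amounts for arbitrary divisors, plus the signed cases):

```cpp
#include <cstdint>

// 0xAAAAAAAB is ceil(2^33 / 3); the 64-bit product's high bits, shifted
// down by 33, equal n / 3 exactly for every uint32_t n.
uint32_t DivideBy3(uint32_t n) {
  return static_cast<uint32_t>((static_cast<uint64_t>(n) * 0xAAAAAAABu) >> 33);
}
```

One multiply and one shift generally cost far less than a hardware div, which is the point of strength-reducing these divisions.
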
- 07 Apr, 2015 2 commits
David Srbecky authored
Change-Id: I12a17a8a1c39ffccaa499c328ebac36e4d74dc4e

Mark Mendell authored
Implement the CAS, bit-reverse, and byte-reverse intrinsics that were missing from the x86 and x86_64 implementations. Add assembler tests and a compareAndSwapLong test.
Change-Id: Iabb2ff46036645df0a91f640288ef06090a64ee3
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>

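The contract the compareAndSwap intrinsic implements, in standard C++; on x86 the whole operation is a single lock cmpxchg (lock cmpxchg8b covers the 64-bit case on 32-bit x86):

```cpp
#include <atomic>
#include <cstdint>

// Atomically: if *addr == expected, store new_value and return true;
// otherwise leave *addr unchanged and return false.
bool CompareAndSwapLong(std::atomic<int64_t>* addr,
                        int64_t expected, int64_t new_value) {
  return addr->compare_exchange_strong(expected, new_value);
}
```
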
- 01 Apr, 2015 1 commit
Mark Mendell authored
Implement floor/ceil/round/RoundFloat on x86 and x86_64, and RoundDouble on x86_64. Add support for roundss and roundsd on both architectures, and support them in the disassembler as well. Add the instruction set features for x86, as the 'round' instruction is only supported if SSE4.1 is supported. Fix the tests to handle the addition of passing the instruction set features to x86 and x86_64. Add assembler tests for roundsd and roundss to the x86_64 assembler tests.
Change-Id: I9742d5930befb0bbc23f3d6c83ce0183ed9fe04f
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>

- 13 Mar, 2015 1 commit
Mark Mendell authored
Tweak the generated code to allow more use of constants and other small changes:
- Use test vs. compare to 0.
- EmitMove of 0.0 should use xorps.
- VisitCompare kPrimLong can use constants.
- cmp/add/sub/mul on x86_64 can use constants if in int32_t range.
- Long bit operations on x86 examine the long constant's high/low words to optimize.
- Use 3-operand imulq if the constant is in int32_t range.
Change-Id: I2dd4010fdffa129fe00905b0020590fe95f3f926
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>

- 19 Feb, 2015 1 commit
Andreas Gampe authored
Ensure that things are used correctly.
Change-Id: I76f082b32dcee28bbfb4c519daa401ac595873b3

- 06 Feb, 2015 1 commit
Nicolas Geoffray authored
- Use test instead of cmp when comparing against 0.
- Make it possible to use lea for add.
- Use xor instead of mov when loading 0.
Change-Id: Ide95c4e2d9b773e952412892f2df6869600c324e

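The three rewrites spelled out, as a before/after sketch in Intel syntax:

```cpp
// Peephole rewrites, shown as before -> after:
//
//   cmp eax, 0   ->   test eax, eax        // same flags, shorter encoding
//   mov eax, 0   ->   xor eax, eax         // 2 bytes vs 5, breaks dependencies
//   add eax, 16  ->   lea eax, [eax + 16]  // sum computed in the address unit
//
// lea leaves EFLAGS untouched, so it can also be scheduled between a
// compare and the branch that consumes the flags.
```
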
- 21 Jan, 2015 2 commits
Nicolas Geoffray authored
Will work on other architectures and FP support in other CLs.
Change-Id: I8cef0343eedc7202d206f5217fdf0349035f0e4d

Mark Mendell authored
Replace the calls to fmod/fmodf with inline code, as is done in the Quick compiler. Remove the quick fmod/fmodf runtime entries, as they are no longer in use. The 64-bit code generator's Move() routine needed to be enhanced to handle constants, as Location::Any() allows them to be generated.
Change-Id: I6b6a42f6faeed4b0b3c940453e487daf5b25d184
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>

- 16 Jan, 2015 1 commit
Calin Juravle authored
- For backends: arm, arm64, x86, x86_64.
- Fixed parameter passing for CodeGenerator.
- The 003-omnibus-opcodes test verifies that NullPointerExceptions work as expected.
Change-Id: I1b302acd353342504716c9169a80706cf3aba2c8

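The mechanism in miniature, as a toy sketch (ART's real handler inspects the faulting PC and transfers control to the runtime's exception machinery rather than exiting):

```cpp
#include <csignal>
#include <cstdlib>

// With implicit null checks, no "if (ref == null) throw" is emitted before
// each access: the load itself faults on null, and the SIGSEGV handler
// converts the fault into a NullPointerException.
void Handler(int) {
  // A real runtime re-enters managed code to throw; exiting stands in here.
  std::_Exit(42);
}

int main() {
  std::signal(SIGSEGV, Handler);
  volatile int* null_ref = nullptr;
  return *null_ref;  // the faulting load *is* the null check
}
```
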
- 15 Jan, 2015 1 commit
Andreas Gampe authored
Add intrinsics infrastructure to the optimizing compiler. Add almost all intrinsics supported by Quick to the x86-64 backend. Further intrinsics require more assembler support.
Change-Id: I48de9b44c82886bb298d16e74e12a9506b8e8807
