1. 06 May, 2016 1 commit
  2. 13 Apr, 2016 1 commit
    • Move Assemblers to the Arena. · 93205e39
      Vladimir Marko authored
      And clean up some APIs to return std::unique_ptr<> instead
      of raw pointers that don't communicate ownership.
      
      Change-Id: I3017302307a0253d661240750298802fb0d9585e
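
      A minimal sketch of the ownership change (illustrative, not ART's
      actual API): the factory's return type now documents that the caller
      owns the assembler.

        #include <memory>

        class Assembler {
         public:
          virtual ~Assembler() = default;
        };

        class X86Assembler : public Assembler {};

        // Before: Assembler* CreateAssembler();  // who deletes this?
        // After: ownership is spelled out by the type itself.
        std::unique_ptr<Assembler> CreateAssembler() {
          return std::make_unique<X86Assembler>();
        }

        int main() {
          auto assembler = CreateAssembler();  // released automatically at scope exit
        }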
  3. 12 Feb, 2016 1 commit
  4. 21 Jan, 2016 1 commit
    • Implemented BitCount as an intrinsic. With unit test. · 3f67e692
      Aart Bik authored
      Rationale:
      Recognizing this important operation as an intrinsic has
      various advantages:
      (1) having the no-side-effects/no-throw properties allows for
          much more GVN/LICM/BCE.
      (2) Some architectures, like x86_64, provide direct
          support for this operation.
      
      Performance improvements on X86_64:
      CheckersEvalBench (32-bit bitboard): 27,210KNS -> 36,798KNS = +35%
      ReversiEvalBench  (64-bit bitboard): 52,562KNS -> 89,086KNS = +69%
      
      Change-Id: I65d549b0469b7909b12c6611cdc34a8640a5751f
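
      A hedged sketch of what the intrinsic buys (using the GCC/Clang
      builtin as a stand-in for ART's generated code): on x86_64 the whole
      function below compiles to a single popcnt, and because it has no
      side effects and cannot throw, GVN/LICM/BCE may hoist or merge calls
      to it freely.

        #include <cstdint>
        #include <cstdio>

        // Stand-in for the recognized Long.bitCount intrinsic.
        int BitCount(uint64_t bitboard) {
          return __builtin_popcountll(bitboard);  // popcnt with -mpopcnt
        }

        int main() {
          std::printf("%d\n", BitCount(0xF0F0F0F0F0F0F0F0ull));  // prints 32
        }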
  5. 14 Oct, 2015 1 commit
  6. 17 Sep, 2015 1 commit
    • ART: Refactor intrinsics slow-paths · 85b62f23
      Andreas Gampe authored
      Refactor slow paths so that there is a default implementation for
      common cases (only arm64 with vixl is special). Write a generic
      intrinsic slow-path that can be reused for the specific architectures.
      Move helper functions into CodeGenerator so that they are accessible.
      
      Change-Id: Ibd788dce432601c6a9f7e6f13eab31f28dcb8550
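
      A hypothetical sketch of the refactoring's shape (names are
      illustrative, not ART's): one generic slow-path class replaces a
      per-architecture, per-intrinsic zoo.

        // Base class: out-of-line code the fast path branches to rarely.
        class SlowPathCode {
         public:
          virtual ~SlowPathCode() = default;
          virtual void EmitNativeCode() = 0;
        };

        // A single generic subclass covers every intrinsic whose slow case
        // just falls back to the runtime call.
        class IntrinsicSlowPath : public SlowPathCode {
         public:
          void EmitNativeCode() override {
            // save live registers, call the runtime entrypoint,
            // restore registers, jump back to the fast path
          }
        };

        int main() {
          IntrinsicSlowPath path;
          path.EmitNativeCode();
        }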
  7. 16 Sep, 2015 1 commit
  8. 26 Aug, 2015 1 commit
    • X86: Assembler support for near labels · 73f455ec
      Mark Mendell authored
      
      The optimizing compiler uses 32 bit relative jumps for all forward
      jumps, just in case the offset is too large to fit in one byte.  Some of
      the generated code knows that the jumps will in fact fit.
      
      Add a 'NearLabel' class to the x86 and x86_64 assemblers.  This will be
      used to generate known short forward branches.
      
      Add jecxz/jrcxz instructions, which only handle a short offset.  They
      will be used for intrinsics.
      
      Add tests for the new instructions and NearLabel.
      
      Change-Id: I11177f36394d35d63b32364b0e6289ee6d97de46
      Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
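
      Why near labels pay off, as a small sketch (hypothetical helper, not
      the assembler's API): a rel8 jump encodes in 2 bytes versus 5 for
      jmp rel32 (6 for jcc rel32), and jecxz/jrcxz exist only in the rel8
      form.

        #include <cstdint>
        #include <cstdio>

        // A branch bound to a NearLabel is legal only if its displacement
        // fits in a signed byte.
        bool FitsInRel8(int64_t displacement) {
          return displacement >= INT8_MIN && displacement <= INT8_MAX;
        }

        int main() {
          std::printf("%d %d\n", FitsInRel8(100), FitsInRel8(200));  // 1 0
        }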
  9. 14 Aug, 2015 2 commits
  10. 11 Aug, 2015 1 commit
    • [optimizing] More x86_64 code improvements · cfa410b0
      Mark Mendell authored
      
      Use the constant area some more, use 32-bit immediates in movq
      instructions when possible, and other small tweaks.
      
      Remove the commented-out code for Math.Abs(float/double), as it
      would fail under the baseline compiler due to the output being the
      same as the input.
      
      Change-Id: Ifa39f1865b94cec2e1c0a99af3066a645e9d3618
      Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
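
      A sketch of the immediate-width check behind "32-bit immediates in
      movq instructions" (hypothetical helper): movq reg, imm32
      sign-extends and takes 7 bytes, while a full movabs imm64 takes 10.

        #include <cstdint>
        #include <cstdio>

        // True if the value survives a sign-extending round trip through
        // 32 bits, i.e. the short encoding is usable.
        bool FitsInImm32(int64_t value) {
          return value == static_cast<int64_t>(static_cast<int32_t>(value));
        }

        int main() {
          std::printf("%d %d\n", FitsInImm32(-1), FitsInImm32(INT64_C(1) << 40));  // 1 0
        }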
  11. 04 Aug, 2015 1 commit
  12. 31 Jul, 2015 1 commit
  13. 30 Jul, 2015 1 commit
  14. 28 Jul, 2015 1 commit
    • Optimizing: Use more X86 3 operand multiplies · 4a2aa4af
      Mark Mendell authored
      
      The X86_64 code generator generated 3 operand multiplies for long
      multiplication only.  Add support for 3 operand multiplication for
      int as well for both X86 and X86_64.
      
      Note that the RHS operand must be a 32 bit constant, and that it is
      possible for the constant to end up in a register (!) due to a previous
      use by another instruction.  Handle this case by checking the operand,
      otherwise the first input might not be the same as the output, due to
      the use of Any().
      
      Also allow stack operands for multiplication.
      
      Change-Id: I8f3d14cc01e9a91210f418258aa18065ee87979d
      Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
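
      A sketch of the selection logic described above (hypothetical, not
      ART's code): the three-operand form imul dst, src, imm32 writes
      dst = src * imm in one instruction, so dst need not equal src the
      way the two-operand imul dst, src requires.

        #include <cstdint>

        enum class MulForm { kThreeOperandImm, kTwoOperandReg };

        MulForm ChooseMulForm(bool constant_lives_in_register, int64_t rhs) {
          // A "constant" RHS may already sit in a register because an
          // earlier instruction materialized it; then the immediate form
          // is unavailable and the first input must match the output.
          if (!constant_lives_in_register && rhs == static_cast<int32_t>(rhs)) {
            return MulForm::kThreeOperandImm;
          }
          return MulForm::kTwoOperandReg;
        }

        int main() {
          return ChooseMulForm(false, 42) == MulForm::kThreeOperandImm ? 0 : 1;
        }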
  15. 27 Jul, 2015 1 commit
  16. 01 Jul, 2015 1 commit
    • Implement heap poisoning in ART's Optimizing compiler. · 4d02711e
      Roland Levillain authored
      - Instrument ARM, ARM64, x86 and x86-64 code generators.
      - Note: To turn heap poisoning on in Optimizing, set the
        environment variable `ART_HEAP_POISONING' to "true"
        before compiling ART.
      
      Bug: 12687968
      Change-Id: Ib3120b38cf805a8a50207a314b9ccc90c8d93740
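
      A toy model of the idea, under the assumption that poisoning negates
      the reference bits (one simple scheme; the real transform lives in
      each code generator): stores poison, loads unpoison, so a reference
      read from the heap without the unpoison step is immediately bogus
      and faults loudly.

        #include <cassert>
        #include <cstdint>

        uint32_t PoisonReference(uint32_t ref) { return 0u - ref; }
        uint32_t UnpoisonReference(uint32_t poisoned) { return 0u - poisoned; }

        int main() {
          uint32_t ref = 0x12345678u;
          assert(UnpoisonReference(PoisonReference(ref)) == ref);
        }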
  17. 02 Jun, 2015 1 commit
    • Move mirror::ArtMethod to native · 3d21bdf8
      Mathieu Chartier authored
      Optimizing + quick tests are passing, devices boot.
      
      TODO: Test and fix bugs in mips64.
      
      Saves 16 bytes per ArtMethod in most cases, a 7.5MB reduction in
      system PSS.
      Some of the savings are from removal of virtual methods and direct
      methods object arrays.
      
      Bug: 19264997
      
      (cherry picked from commit e401d146)
      
      Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d
      
      Fix some ArtMethod related bugs
      
      Added root visiting for runtime methods, not currently required
      since the GcRoots in these methods are null.
      
      Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes
      --trace run-tests 005, 044.
      
      Fixed optimizing compiler bug where we used a normal stack location
      instead of double on ARM64, this fixes the debuggable tests.
      
      TODO: Fix JDWP tests.
      
      Bug: 19264997
      
      Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3
      
      ART: Fix casts for 64-bit pointers on 32-bit compiler.
      
      Bug: 19264997
      Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457
      
      Fix JDWP tests after ArtMethod change
      
      Fixes Throwable::GetStackDepth for exception event detection after
      internal stack trace representation change.
      
      Adds missing ArtMethod::GetInterfaceMethodIfProxy call in case of
      proxy method.
      
      Bug: 19264997
      Change-Id: I363e293796848c3ec491c963813f62d868da44d2
      
      Fix accidental IMT and root marking regression
      
      Was always using the conflict trampoline. Also included fix for
      regression in GC time caused by extra roots. Most of the regression
      was IMT.
      
      Fixed bug in DumpGcPerformanceInfo where we would get SIGABRT due to
      detached thread.
      
      EvaluateAndApplyChanges:
      From ~2500 -> ~1980
      GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots
      
      Bug: 19264997
      Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0
      
      Fix bogus image test assert
      
      Previously we were comparing the size of the non moving space to
      size of the image file.
      
      Now we properly compare the size of the image space against the size
      of the image file.
      
      Bug: 19264997
      Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a
      
      [MIPS64] Fix art_quick_invoke_stub argument offsets.
      
      ArtMethod reference's size got bigger, so we need to move other args
      and leave enough space for ArtMethod* and 'this' pointer.
      
      This fixes mips64 boot.
      
      Bug: 19264997
      Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab
  18. 30 May, 2015 1 commit
    • Move mirror::ArtMethod to native · e401d146
      Mathieu Chartier authored
      Optimizing + quick tests are passing, devices boot.
      
      TODO: Test and fix bugs in mips64.
      
      Saves 16 bytes per ArtMethod in most cases, a 7.5MB reduction in
      system PSS.
      Some of the savings are from removal of virtual methods and direct
      methods object arrays.
      
      Bug: 19264997
      Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d
  19. 26 May, 2015 2 commits
    • ART: Clean up arm64 kNumberOfXRegisters usage. · 41b175ab
      Vladimir Marko authored
      Avoid undefined behavior for arm64 stemming from 1u << 32 in
      loops with upper bound kNumberOfXRegisters.
      
      Create iterators for enumerating bits in an integer either
      from high to low or from low to high and use them for
      <arch>Context::FillCalleeSaves() on all architectures.
      
      Refactor runtime/utils.{h,cc} by moving all bit-fiddling
      functions to runtime/base/bit_utils.{h,cc} (together with
      the new bit iterators) and all time-related functions to
      runtime/base/time_utils.{h,cc}. Improve test coverage and
      fix some corner cases for the bit-fiddling functions.
      
      Bug: 13925192
      
      (cherry picked from commit 80afd020)
      
      Change-Id: I905257a21de90b5860ebe1e39563758f721eab82
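
      A sketch of a high-to-low bit iterator and of the trap it removes
      (hypothetical code, not ART's): with 32-bit registers, a loop bound
      of kNumberOfXRegisters == 32 can produce 1u << 32, which is
      undefined behavior.

        #include <cstdint>
        #include <cstdio>

        template <typename Visitor>
        void ForEachBitHighToLow(uint32_t bits, Visitor visit) {
          while (bits != 0u) {
            int bit = 31 - __builtin_clz(bits);  // index of highest set bit
            visit(bit);
            bits &= ~(1u << bit);  // bit is in [0, 31], so the shift is defined
          }
        }

        int main() {
          ForEachBitHighToLow(0x92u, [](int b) { std::printf("%d ", b); });  // 7 4 1
        }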
    • ART: Clean up arm64 kNumberOfXRegisters usage. · 80afd020
      Vladimir Marko authored
      Avoid undefined behavior for arm64 stemming from 1u << 32 in
      loops with upper bound kNumberOfXRegisters.
      
      Create iterators for enumerating bits in an integer either
      from high to low or from low to high and use them for
      <arch>Context::FillCalleeSaves() on all architectures.
      
      Refactor runtime/utils.{h,cc} by moving all bit-fiddling
      functions to runtime/base/bit_utils.{h,cc} (together with
      the new bit iterators) and all time-related functions to
      runtime/base/time_utils.{h,cc}. Improve test coverage and
      fix some corner cases for the bit-fiddling functions.
      
      Bug: 13925192
      Change-Id: I704884dab15b41ecf7a1c47d397ab1c3fc7ee0f7
  20. 11 May, 2015 1 commit
  21. 22 Apr, 2015 1 commit
    • Replace NULL with nullptr · 2cebb24b
      Mathieu Chartier authored
      Also fixed some lines that were too long, and a few other minor
      details.
      
      Change-Id: I6efba5fb6e03eb5d0a300fddb2a75bf8e2f175cb
  22. 21 Apr, 2015 1 commit
  23. 13 Apr, 2015 3 commits
    • Exercise the x86 and x86-64 FILD and FISTP instructions. · 0a18601f
      Roland Levillain authored
      - Ensure the double- and quadword x87 (FPU) instructions for
        integer loading (resp. fildl and fildll) are properly
        generated by the x86 and x86-64 generators (resp.
        X86Assembler::filds/X86_64Assembler::filds and
        X86Assembler::fildl/X86_64Assembler::fildl).
      - Ensure the double- and quadword x87 (FPU) instructions for
        integer storing & popping (resp. fistpl and fistpll) are
        properly generated by the x86 and x86-64 generators (resp.
        X86Assembler::fistps/X86_64Assembler::fistps and
        X86Assembler::fistpl/X86_64Assembler::fistpl).
      
      These instructions can be used in the implementation of the
      long-to-float and long-to-double Dex type conversions.
      
      Change-Id: Iade52a9aee326d189d77d3dbd352a2b5dab52e46
    • Revert "Optimizing: Fix long-to-fp conversion on x86." · 386ce406
      Nicolas Geoffray authored
      Test fails on arm.
      
      This reverts commit 2d45b4df.
      
      Change-Id: Id2864917b52f7ffba459680303a2d15b34f16a4e
    • Optimizing: Fix long-to-fp conversion on x86. · 2d45b4df
      Serguei Katkov authored
      
      The long-to-fp conversion implemented using SSE loses precision.
      A test is included. The CL uses the FPU to produce the correct
      result.
      
      Change-Id: I8eaf3c46819a8cb52642a7e7d7c4e3e0edbc88db
      Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
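
      The commit does not spell out the failing instruction sequence, but
      the classic pitfall it points at can be shown directly: rounding a
      long twice (to double, then to float) can disagree with a single
      correctly rounded conversion, which the x87 path avoids because fild
      loads the full 64-bit integer into 80-bit extended precision.

        #include <cstdint>
        #include <cstdio>

        int main() {
          // 2^60 + 2^36 + 1 needs rounding to both the 53-bit double and
          // the 24-bit float mantissa.
          int64_t v = (INT64_C(1) << 60) + (INT64_C(1) << 36) + 1;
          float direct = static_cast<float>(v);                           // one step
          float via_double = static_cast<float>(static_cast<double>(v));  // two steps
          std::printf("%.1f\n%.1f\n", direct, via_double);  // the results differ
        }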
  24. 10 Apr, 2015 1 commit
  25. 09 Apr, 2015 2 commits
  26. 07 Apr, 2015 2 commits
  27. 01 Apr, 2015 1 commit
    • [optimizing] Implement x86/x86_64 math intrinsics · fb8d279b
      Mark Mendell authored
      
      Implement floor/ceil/round/RoundFloat on x86 and x86_64.
      Implement RoundDouble on x86_64.
      
      Add support for roundss and roundsd on both architectures.  Support them
      in the disassembler as well.
      
      Add the instruction set features for x86, as the 'round' instruction is
      only supported if SSE4.1 is supported.
      
      Fix the tests to handle the addition of passing the instruction set
      features to x86 and x86_64.
      
      Add assembler tests for roundsd and roundss to x86_64 assembler tests.
      
      Change-Id: I9742d5930befb0bbc23f3d6c83ce0183ed9fe04f
      Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
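
      A hedged sketch of the same SSE4.1 instructions through the C
      intrinsics (compile with -msse4.1; this mirrors, not reproduces, the
      generated code): floor and ceil are roundsd with a fixed rounding
      mode in the instruction's immediate.

        #include <smmintrin.h>  // SSE4.1
        #include <cstdio>

        double Floor(double x) {
          __m128d v = _mm_set_sd(x);
          return _mm_cvtsd_f64(_mm_floor_sd(v, v));  // roundsd, toward -inf
        }

        double Ceil(double x) {
          __m128d v = _mm_set_sd(x);
          return _mm_cvtsd_f64(_mm_ceil_sd(v, v));   // roundsd, toward +inf
        }

        int main() {
          std::printf("%f %f\n", Floor(-1.5), Ceil(-1.5));  // -2.000000 -1.000000
        }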
  28. 13 Mar, 2015 1 commit
    • [optimizing] Improve x86, x86_64 code · 3f6c7f61
      Mark Mendell authored
      
      Tweak the generated code to allow more use of constants, along with
      other small changes:
      - Use test vs. compare to 0
      - EmitMove of 0.0 should use xorps
      - VisitCompare kPrimLong can use constants
      - cmp/add/sub/mul on x86_64 can use constants if in int32_t range
      - long bit operations on x86 examine long constant high/low to optimize
      - Use 3 operand imulq if constant is in int32_t range
      
      Change-Id: I2dd4010fdffa129fe00905b0020590fe95f3f926
      Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
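
      One of the listed tweaks shown via a C intrinsic (a sketch, not the
      generated code): zeroing an XMM register with xorps needs no
      constant-pool load and no memory access, which is why an EmitMove of
      0.0 should prefer it.

        #include <xmmintrin.h>
        #include <cstdio>

        int main() {
          __m128 zero = _mm_setzero_ps();  // compiles to xorps xmm, xmm
          float out[4];
          _mm_storeu_ps(out, zero);
          std::printf("%f\n", out[0]);  // 0.000000
        }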
  29. 19 Feb, 2015 1 commit
  30. 06 Feb, 2015 1 commit
    • x64 goodness. · 748f140d
      Nicolas Geoffray authored
      - Use test instead of cmp when comparing against 0.
      - Make it possible to use lea for add.
      - Use xor instead of mov when loading 0.
      
      Change-Id: Ide95c4e2d9b773e952412892f2df6869600c324e
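
      A toy peephole chooser illustrating the first and third points
      (hypothetical helpers; byte counts are for 32-bit register
      operands): test eax, eax is 2 bytes versus 3 for cmp eax, 0 and sets
      the same flags; xor eax, eax is 2 bytes versus 5 for mov eax, 0.

        #include <cstdio>

        const char* CompareAgainst(long imm) {
          return imm == 0 ? "test reg, reg" : "cmp reg, imm";
        }

        const char* LoadConstant(long imm) {
          return imm == 0 ? "xor reg, reg" : "mov reg, imm";
        }

        int main() {
          std::printf("%s\n%s\n", CompareAgainst(0), LoadConstant(0));
        }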
  31. 21 Jan, 2015 2 commits
  32. 16 Jan, 2015 1 commit
    • Add implicit null checks for the optimizing compiler · cd6dffed
      Calin Juravle authored
      - for backends: arm, arm64, x86, x86_64
      - fixed parameter passing for CodeGenerator
      - 003-omnibus-opcodes test verifies that NullPointerExceptions work as
      expected
      
      Change-Id: I1b302acd353342504716c9169a80706cf3aba2c8
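
      A toy demonstration of the mechanism (POSIX-only sketch; the real
      implementation turns the fault into a NullPointerException from the
      runtime's fault handler): instead of emitting an explicit null test,
      just perform the load and catch the SIGSEGV.

        #include <setjmp.h>
        #include <signal.h>
        #include <stdio.h>

        static sigjmp_buf g_env;

        static void SegvHandler(int) {
          // A real runtime checks that the faulting PC is a known
          // implicit-check site before raising the exception.
          siglongjmp(g_env, 1);
        }

        int main() {
          struct sigaction sa{};
          sa.sa_handler = SegvHandler;
          sigemptyset(&sa.sa_mask);
          sigaction(SIGSEGV, &sa, nullptr);

          int* volatile obj = nullptr;
          if (sigsetjmp(g_env, 1) == 0) {
            printf("%d\n", *obj);  // the "check" is the load itself
          } else {
            printf("null dereference caught -> would throw NPE\n");
          }
        }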
  33. 15 Jan, 2015 1 commit
    • ART: Optimizing compiler intrinsics · 71fb52fe
      Andreas Gampe authored
      Add intrinsics infrastructure to the optimizing compiler.
      
      Add almost all intrinsics supported by Quick to the x86-64 backend.
      Further intrinsics require more assembler support.
      
      Change-Id: I48de9b44c82886bb298d16e74e12a9506b8e8807
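
      A hypothetical sketch of the recognition step at the heart of such an
      infrastructure (names illustrative, not ART's): map a method
      reference to an intrinsic ID so the backend can emit a specialized
      sequence instead of a call.

        #include <cstdio>
        #include <cstring>

        enum class Intrinsic { kNone, kMathAbsDouble, kStringCharAt };

        Intrinsic Recognize(const char* klass, const char* method) {
          if (std::strcmp(klass, "Ljava/lang/Math;") == 0 &&
              std::strcmp(method, "abs") == 0) {
            return Intrinsic::kMathAbsDouble;
          }
          if (std::strcmp(klass, "Ljava/lang/String;") == 0 &&
              std::strcmp(method, "charAt") == 0) {
            return Intrinsic::kStringCharAt;
          }
          return Intrinsic::kNone;  // fall back to a regular invoke
        }

        int main() {
          std::printf("%d\n", Recognize("Ljava/lang/Math;", "abs") == Intrinsic::kMathAbsDouble);
        }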