1. 29 Apr, 2015 3 commits
  2. 28 Apr, 2015 1 commit
  3. 07 Apr, 2015 1 commit
  4. 27 Mar, 2015 2 commits
  5. 18 Mar, 2015 2 commits
  6. 17 Mar, 2015 1 commit
  7. 10 Feb, 2015 4 commits
  8. 09 Feb, 2015 4 commits
    • Support --num-threads with --multi-dex (take 2) · dd140a22
      Peter Jensen authored
      
        With fix for regression introduced in original commit.
      
      The current dx implementation supports the options --multi-dex, for applications
      that do not fit within the dex format limitations, and --num-threads=N, which
      triggers concurrent processing of multiple input files. However, the implementation
      has the following limitations:
      
      * The --num-threads option is disabled when used together with --multi-dex.
      * The --num-threads option implements concurrency at the level of classpath
        entries, and does nothing when the classes to be translated are specified
        with a single classpath element (e.g. single jar output from Proguard).
      * The existing --num-threads implementation may produce nondeterministic output.
      * The heuristic used by the --multi-dex option to determine when to rotate the
        dex output file is overly conservative.
      
      The primary objectives of this change are:
      * Concurrent translation of classes, independent of the input specification format.
      * Support for --num-threads=N in both mono- and multi-dex mode.
      * Deterministic class output order.
      * Near-optimal use of dex file format capacity.
      
      This is accomplished by reorganizing the dx workflow into a pipeline of
      concurrent phases:
      
      read-class  | parse-class | translate-class | add-to-dex | convert-dex-to-byte[];
      output-dex-files-or-jar
      
      To manage dex file rotation (i.e. --multi-dex support), the parse-class and
      add-to-dex phases are synchronized to prevent forwarding classes to the
      translate-class phase if it could potentially result in breaking the dex
      format limitations. The heuristic currently used to estimate the number of
      indices needed for a class is improved, to minimize the amount of serialization
      imposed by this feedback mechanism, and to improve the use of dex file capacity.
      
      The translate-class and convert-dex-to-byte[] phases are further parallelized
      with configurable (--num-threads=N option) thread pools. This allows translating
      classes concurrently, while also performing output conversion in parallel.
      Separate collector threads are used to collect results from the thread pools
      in deterministic order.
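
      The idea of collecting thread-pool results in deterministic order can be
      sketched as follows. This is a minimal illustration, not the dx code: the
      class OrderedTranslateSketch and the translate() stand-in are hypothetical.
      Results are read back in submission order, so the output order does not
      depend on which worker finishes first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class OrderedTranslateSketch {
    // Hypothetical stand-in for translating one class; not dx's real API.
    static String translate(String className) {
        return "dex:" + className;
    }

    // Translates all classes on a fixed-size pool and returns the results
    // in the same order as the input, regardless of completion order.
    static List<String> translateAll(List<String> classNames, int numThreads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String name : classNames) {
                pending.add(pool.submit(() -> translate(name)));
            }
            // Collector step: wait on the futures in submission order.
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> input = List.of("A", "B", "C");
        System.out.println(translateAll(input, 2)); // prints [dex:A, dex:B, dex:C]
    }
}
```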
      
      Testing was performed on an Ubuntu system, with 6 cores and 12 hardware threads.
      The taskset command was used to experimentally establish that running with more
      than 8 hardware threads does not provide any additional benefit.
      
      Experiments show that the argument to --num-threads should not exceed the
      lesser of the number of available hardware threads and 5. Setting it to a
      higher value results in no additional benefit.
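
      A build script could apply this guidance before invoking dx. The sketch
      below is an assumption about how a caller might do that, not part of dx:
      the class ThreadCountAdvice and the constant OBSERVED_USEFUL_MAX are
      hypothetical names, and the value 5 is taken from the measurements above.

```java
final class ThreadCountAdvice {
    // Upper bound taken from the experiments described in this commit message.
    static final int OBSERVED_USEFUL_MAX = 5;

    // Clamps a requested thread count to min(hardwareThreads, 5), and to at least 1.
    static int clamp(int requested, int hardwareThreads) {
        if (requested < 1) {
            return 1;
        }
        return Math.min(requested, Math.min(hardwareThreads, OBSERVED_USEFUL_MAX));
    }

    public static void main(String[] args) {
        int hardwareThreads = Runtime.getRuntime().availableProcessors();
        System.out.println("--num-threads=" + clamp(8, hardwareThreads));
    }
}
```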
      
      The gain is generally larger for larger applications, and not significant for
      small applications with fewer than a few thousand classes. Experiments with
      generated classes show that for large applications gains as high as 50% may
      be possible.
      
      For an existing real-life application with more than 11k classes, and requiring
      2 dex files, a speed-up of 37% was achieved (--num-threads=5, 8 hardware
      threads, 4g Java heap). A speedup of 31% was observed for another application
      with ~7k classes.
      
      For small applications, use of --num-threads=N>1 doesn’t provide significant
      benefit. Running with --num-threads=1, the modified dx is slightly faster,
      but no significant gain is observed unless the application requires multiple
      dex files.
      
      The one case where a significant regression may be observed is when using
      --num-threads=N>1, with a single hardware thread. This is an inappropriate
      configuration, even with the current implementation. However, because of
      the limitations of the current implementation, such configurations may exist.
      For instance, a configuration using both --multi-dex and --num-threads=5 will
      currently generate a warning about using the two options together. With the
      new implementation, the options can legitimately be used together, and could
      result in a ~20% regression when running on a single hardware thread.
      Note: the current dx implementation, without the --num-threads option, is already
      approximately 50% slower with 1 hardware thread, compared to running with 2
      or more. With 2 hardware threads the implementations are practically on par
      (a little better, or a little worse, depending on the application).
      
      Testing:
      * Tested with 6 existing applications ranging in size from 1K to 12K classes.
      * Updated and tested with relevant existing unit tests (one test changed to
        account for the better dex rotation heuristic).
      * Added a unit test for deterministic output.
      * Added a unit performance test. By default, the run script merely validates that
        --multi-dex and --num-threads can be used together (fast). However, the test
        can be configured to run a performance test over sets of generated classes.
      Signed-off-by: Peter Jensen <jensenp@google.com>
      
      (cherry picked from commit 845d9d0e)
      
      Change-Id: I721effa31c3b1a8b427d3a18ec554a19c5e9765b
    • Benoit Lamarche's avatar
    • Revert "Support --num-threads with --multi-dex" · c8b036e3
      Benoit Lamarche authored
      This reverts commit 845d9d0e.
      
      Bug: 19313927
      Change-Id: Ia6582a3914cc33762aef74da1f5a6a153c8c0ab2
    • Benoit Lamarche's avatar
      6e28e432
  9. 01 Feb, 2015 2 commits
  10. 29 Jan, 2015 1 commit
  11. 26 Jan, 2015 2 commits
  12. 23 Jan, 2015 1 commit
    • Don't discard directory entries in jar files. · 7736e8ff
      Narayan Kamath authored
      Discarding them is a structural change, and breaks code that looks up
      directory resource names (icu4j, for example).
      
      This change also includes a minor cosmetic change to use a
      while() loop instead of for(;;).
      
      bug: 19108324
      Change-Id: Ib12c3c1d55f14a089702e5e668d7a704f298e1f4
  13. 13 Jan, 2015 3 commits
    • am 64d5b033: Merge "Fix printf format specifiers." · 2e5cf008
      Elliott Hughes authored
      * commit '64d5b033':
        Fix printf format specifiers.
    • Merge "Fix printf format specifiers." · 64d5b033
      Elliott Hughes authored
    • Fix printf format specifiers. · ff762466
      Elliott Hughes authored
      Fixes:
      
      dalvik/tools/hprof-conv/HprofConv.c:233:61: warning: format specifies type 'int' but the argument has type 'size_t' (aka 'unsigned long') [-Wformat]
                  fprintf(stderr, "ERROR: read %d of %d bytes\n", actual, count);
      
      dalvik/tools/hprof-conv/HprofConv.c:256:58: warning: format specifies type 'int' but the argument has type 'size_t' (aka 'unsigned long') [-Wformat]
              fprintf(stderr, "ERROR: write %d of %d bytes\n", actual, pBuf->curLen);
      
      dalvik/tools/hprof-conv/HprofConv.c:537:26: warning: format specifies type 'int' but the argument has type 'long' [-Wformat]
                      subType, buf - origBuf);
      
      Change-Id: If9926900417d57971fa25a4c7f465a9a0c50405e
  14. 08 Jan, 2015 3 commits
  15. 22 Dec, 2014 1 commit
    • Support --num-threads with --multi-dex · 845d9d0e
      Peter Jensen authored
      
      The current dx implementation supports the options --multi-dex, for applications
      that do not fit within the dex format limitations, and --num-threads=N, which
      triggers concurrent processing of multiple input files. However, the implementation
      has the following limitations:
      
      * The --num-threads option is disabled when used together with --multi-dex.
      * The --num-threads option implements concurrency at the level of classpath
        entries, and does nothing when the classes to be translated are specified
        with a single classpath element (e.g. single jar output from Proguard).
      * The existing --num-threads implementation may produce nondeterministic output.
      * The heuristic used by the --multi-dex option to determine when to rotate the
        dex output file is overly conservative.
      
      The primary objectives of this change are:
      * Concurrent translation of classes, independent of the input specification format.
      * Support for --num-threads=N in both mono- and multi-dex mode.
      * Deterministic class output order.
      * Near-optimal use of dex file format capacity.
      
      This is accomplished by reorganizing the dx workflow into a pipeline of
      concurrent phases:
      
      read-class  | parse-class | translate-class | add-to-dex | convert-dex-to-byte[];
      output-dex-files-or-jar
      
      To manage dex file rotation (i.e. --multi-dex support), the parse-class and
      add-to-dex phases are synchronized to prevent forwarding classes to the
      translate-class phase if it could potentially result in breaking the dex
      format limitations. The heuristic currently used to estimate the number of
      indices needed for a class is improved, to minimize the amount of serialization
      imposed by this feedback mechanism, and to improve the use of dex file capacity.
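
      One way to picture this feedback between the parse-class and add-to-dex
      phases is a shared index budget. The sketch below is an illustration under
      assumptions, not the dx implementation: the class DexIndexBudgetSketch and
      its methods are hypothetical, and it tracks a single index kind against the
      65,536-entry limit imposed by the 16-bit method indices of the dex format.
      A tighter estimate leaves more room in the budget, so fewer classes are
      held back and less of each dex file is left unused.

```java
final class DexIndexBudgetSketch {
    // A dex file addresses methods through 16-bit indices: at most 65,536 per file.
    static final int MAX_INDICES = 65536;

    private int committed; // indices used by classes already added to the current dex
    private int reserved;  // estimated indices of classes still being translated

    /** Called before forwarding a class; false means "hold the class back for now". */
    synchronized boolean tryReserve(int estimatedIndices) {
        if (committed + reserved + estimatedIndices > MAX_INDICES) {
            return false;
        }
        reserved += estimatedIndices;
        return true;
    }

    /** Called once a translated class was added and its real index count is known. */
    synchronized void commit(int estimatedIndices, int actualIndices) {
        reserved -= estimatedIndices;
        committed += actualIndices;
    }

    /** Called when output rotates to a fresh dex file; nothing may be in flight. */
    synchronized void rotate() {
        if (reserved != 0) {
            throw new IllegalStateException("classes still in flight");
        }
        committed = 0;
    }
}
```

      In this sketch, a caller that gets false from tryReserve() would wait for
      in-flight classes to be committed and then either retry or rotate to a new
      dex file.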
      
      The translate-class and convert-dex-to-byte[] phases are further parallelized
      with configurable (--num-threads=N option) thread pools. This allows translating
      classes concurrently, while also performing output conversion in parallel.
      Separate collector threads are used to collect results from the thread pools
      in deterministic order.
      
      Testing was performed on an Ubuntu system, with 6 cores and 12 hardware threads.
      The taskset command was used to experimentally establish that running with more
      than 8 hardware threads does not provide any additional benefit.
      
      Experiments show that the argument to --num-threads should not exceed the
      lesser of the number of available hardware threads and 5. Setting it to a
      higher value results in no additional benefit.
      
      The gain is generally larger for larger applications, and not significant for
      small applications with fewer than a few thousand classes. Experiments with
      generated classes show that for large applications gains as high as 50% may
      be possible.
      
      For an existing real-life application with more than 11k classes, and requiring
      2 dex files, a speed-up of 37% was achieved (--num-threads=5, 8 hardware
      threads, 4g Java heap). A speedup of 31% was observed for another application
      with ~7k classes.
      
      For small applications, use of --num-threads=N>1 doesn’t provide significant
      benefit. Running with --num-threads=1, the modified dx is slightly faster,
      but no significant gain is observed unless the application requires multiple
      dex files.
      
      The one case where a significant regression may be observed is when using
      --num-threads=N>1, with a single hardware thread. This is an inappropriate
      configuration, even with the current implementation. However, because of
      the limitations of the current implementation, such configurations may exist.
      For instance, a configuration using both --multi-dex and --num-threads=5 will
      currently generate a warning about using the two options together. With the
      new implementation, the options can legitimately be used together, and could
      result in a ~20% regression when running on a single hardware thread.
      Note: the current dx implementation, without the --num-threads option, is already
      approximately 50% slower with 1 hardware thread, compared to running with 2
      or more. With 2 hardware threads the implementations are practically on par
      (a little better, or a little worse, depending on the application).
      
      Testing:
      * Tested with 6 existing applications ranging in size from 1K to 12K classes.
      * Updated and tested with relevant existing unit tests (one test changed to
        account for the better dex rotation heuristic).
      * Added a unit test for deterministic output.
      * Added a unit performance test. By default, the run script merely validates that
        --multi-dex and --num-threads can be used together (fast). However, the test
        can be configured to run a performance test over sets of generated classes.
      
      Change-Id: Ic2d11c422396e97171c2e6ceae9477113e261b8e
      Signed-off-by: Peter Jensen <jensenp@google.com>
  16. 19 Dec, 2014 2 commits
  17. 17 Dec, 2014 1 commit
  18. 02 Dec, 2014 2 commits
  19. 01 Dec, 2014 1 commit
  20. 28 Nov, 2014 2 commits
  21. 26 Nov, 2014 1 commit