summaryrefslogtreecommitdiff
path: root/docs/html/guide/practices/design/performance.jd
diff options
context:
space:
mode:
Diffstat (limited to 'docs/html/guide/practices/design/performance.jd')
-rw-r--r--docs/html/guide/practices/design/performance.jd549
1 files changed, 549 insertions, 0 deletions
diff --git a/docs/html/guide/practices/design/performance.jd b/docs/html/guide/practices/design/performance.jd
new file mode 100644
index 000000000000..1eef34234e6b
--- /dev/null
+++ b/docs/html/guide/practices/design/performance.jd
@@ -0,0 +1,549 @@
+page.title=Designing for Performance
+@jd:body
+
+<p>An Android application should be fast. Well, it's probably more accurate to
+say that it should be <em>efficient</em>. That is, it should execute as
+efficiently as possible in the mobile device environment, with its limited
+computing power and data storage, smaller screen, and constrained battery life.
+
+<p>As you develop your application, keep in mind that, while the application may
+perform well enough in your emulator, running on your dual-core development
+computer, it will not perform that well when run a mobile device &mdash; even
+the most powerful mobile device can't match the capabilities of a typical
+desktop system. For that reason, you should strive to write efficient code, to
+ensure the best possible performance on a variety of mobile devices.</p>
+
+<p>Generally speaking, writing fast or efficient code means keeping memory
+allocations to a minimum, writing tight code, and avoiding certain language and
+programming idioms that can subtly cripple performance. In object-oriented
+terms, most of this work takes place at the <em>method</em> level, on the order of
+actual lines of code, loops, and so on.</p>
+
+<p>This document covers these topics: </p>
+<ul>
+ <li><a href="#intro">Introduction</a></li>
+ <li><a href="#object_creation">Avoid Creating Objects</a></li>
+ <li><a href="#native_methods">Use Native Methods</a></li>
+ <li><a href="#prefer_virtual">Prefer Virtual Over Interface</a></li>
+ <li><a href="#prefer_static">Prefer Static Over Virtual</a></li>
+ <li><a href="#internal_get_set">Avoid Internal Getters/Setters</a></li>
+ <li><a href="#cache_fields">Cache Field Lookups</a></li>
+ <li><a href="#use_final">Declare Constants Final</a></li>
+ <li><a href="#foreach">Use Enhanced For Loop Syntax With Caution</a></li>
+ <li><a href="#avoid_enums">Avoid Enums</a></li>
+ <li><a href="#package_inner">Use Package Scope with Inner Classes</a></li>
+ <li><a href="#avoidfloat">Avoid Float</a> </li>
+ <li><a href="#samples">Some Sample Performance Numbers</a> </li>
+ <li><a href="#closing_notes">Closing Notes</a></li>
+</ul>
+
+<a name="intro" id="intro"></a>
+<h2>Introduction</h2>
+
+<p>There are two basic rules for resource-constrained systems:</p>
+
+<ul>
+ <li>Don't do work that you don't need to do.</li>
+ <li>Don't allocate memory if you can avoid it.</li>
+</ul>
+
+<p>All the tips below follow from these two basic tenets.</p>
+
+<p>Some would argue that much of the advice on this page amounts to "premature
+optimization." While it's true that micro-optimizations sometimes make it
+harder to develop efficient data structures and algorithms, on embedded
+devices like handsets you often simply have no choice. For instance, if you
+bring your assumptions about VM performance on desktop machines to Android,
+you're quite likely to write code that exhausts system memory. This will bring
+your application to a crawl &mdash; let alone what it will do to other programs
+running on the system!</p>
+
+<p>That's why these guidelines are important. Android's success depends on
+the user experience that your applications provide, and that user experience
+depends in part on whether your code is responsive and snappy, or slow and
+aggravating. Since all our applications will run on the same devices, we're
+all in this together, in a way. Think of this document as like the rules of
+the road you had to learn when you got your driver's license: things run
+smoothly when everybody follows them, but when you don't, you get your car
+smashed up.</p>
+
+<p>Before we get down to brass tacks, a brief observation: nearly all issues
+described below are valid whether or not the VM features a JIT compiler. If I
+have two methods that accomplish the same thing, and the interpreted execution
+of foo() is faster than bar(), then the compiled version of foo() will
+probably be as fast or faster than compiled bar(). It is unwise to rely on a
+compiler to "save" you and make your code fast enough.</p>
+
+<a name="object_creation"></a>
+<h2>Avoid Creating Objects</h2>
+
+<p>Object creation is never free. A generational GC with per-thread allocation
+pools for temporary objects can make allocation cheaper, but allocating memory
+is always more expensive than not allocating memory.</p>
+
+<p>If you allocate objects in a user interface loop, you will force a periodic
+garbage collection, creating little "hiccups" in the user experience.</p>
+
+<p>Thus, you should avoid creating object instances you don't need to. Some
+examples of things that can help:</p>
+
+<ul>
+ <li>When extracting strings from a set of input data, try
+ to return a substring of the original data, instead of creating a copy.
+ You will create a new String object, but it will share the char[]
+ with the data.</li>
+ <li>If you have a method returning a string, and you know that its result
+ will always be appended to a StringBuffer anyway, change your signature
+ and implementation so that the function does the append directly,
+ instead of creating a short-lived temporary object.</li>
+</ul>
+
+<p>A somewhat more radical idea is to slice up multidimensional arrays into parallel
+single one-dimension arrays:</p>
+
+<ul>
+ <li>An array of ints is a much better than an array of Integers,
+ but this also generalizes to the fact that two parallel arrays of ints
+ are also a <strong>lot</strong> more efficient than an array of (int,int)
+ objects. The same goes for any combination of primitive types.</li>
+ <li>If you need to implement a container that stores tuples of (Foo,Bar)
+ objects, try to remember that two parallel Foo[] and Bar[] arrays are
+ generally much better than a single array of custom (Foo,Bar) objects.
+ (The exception to this, of course, is when you're designing an API for
+ other code to access; in those cases, it's usually better to trade
+ correct API design for a small hit in speed. But in your own internal
+ code, you should try and be as efficient as possible.)</li>
+</ul>
+
+<p>Generally speaking, avoid creating short-term temporary objects if you
+can. Fewer objects created mean less-frequent garbage collection, which has
+a direct impact on user experience.</p>
+
+<a name="native_methods" id="native_methods"></a>
+<h2>Use Native Methods</h2>
+
+<p>When processing strings, don't hesitate to use specialty methods like
+String.indexOf(), String.lastIndexOf(), and their cousins. These are typically
+implemented in C/C++ code that easily runs 10-100x faster than doing the same
+thing in a Java loop.</p>
+
+<p>The flip side of that advice is that punching through to a native
+method is more expensive than calling an interpreted method. Don't use native
+methods for trivial computation, if you can avoid it.</p>
+
+<a name="prefer_virtual" id="prefer_virtual"></a>
+<h2>Prefer Virtual Over Interface</h2>
+
+<p>Suppose you have a HashMap object. You can declare it as a HashMap or as
+a generic Map:</p>
+
+<pre>Map myMap1 = new HashMap();
+HashMap myMap2 = new HashMap();</pre>
+
+<p>Which is better?</p>
+
+<p>Conventional wisdom says that you should prefer Map, because it
+allows you to change the underlying implementation to anything that
+implements the Map interface. Conventional wisdom is correct for
+conventional programming, but isn't so great for embedded systems. Calling
+through an interface reference can take 2x longer than a virtual
+method call through a concrete reference.</p>
+
+<p>If you have chosen a HashMap because it fits what you're doing, there
+is little value in calling it a Map. Given the availability of
+IDEs that refactor your code for you, there's not much value in calling
+it a Map even if you're not sure where the code is headed. (Again, though,
+public APIs are an exception: a good API usually trumps small performance
+concerns.)</p>
+
+<a name="prefer_static" id="prefer_static"></a>
+<h2>Prefer Static Over Virtual</h2>
+
+<p>If you don't need to access an object's fields, make your method static. It can
+be called faster, because it doesn't require a virtual method table
+indirection. It's also good practice, because you can tell from the method
+signature that calling the method can't alter the object's state.</p>
+
+<a name="internal_get_set" id="internal_get_set"></a>
+<h2>Avoid Internal Getters/Setters</h2>
+
+<p>In native languages like C++ it's common practice to use getters (e.g.
+<code>i = getCount()</code>) instead of accessing the field directly (<code>i
+= mCount</code>). This is an excellent habit for C++, because the compiler can
+usually inline the access, and if you need to restrict or debug field access
+you can add the code at any time.</p>
+
+<p>On Android, this is a bad idea. Virtual method calls are expensive,
+much more so than instance field lookups. It's reasonable to follow
+common object-oriented programming practices and have getters and setters
+in the public interface, but within a class you should always access
+fields directly.</p>
+
+<a name="cache_fields" id="cache_fields"></a>
+<h2>Cache Field Lookups</h2>
+
+<p>Accessing object fields is much slower than accessing local variables.
+Instead of writing:</p>
+<pre>for (int i = 0; i &lt; this.mCount; i++)
+ dumpItem(this.mItems[i]);</pre>
+
+<p>You should write:</p>
+<pre> int count = this.mCount;
+ Item[] items = this.mItems;
+
+ for (int i = 0; i &lt; count; i++)
+ dumpItems(items[i]);
+</pre>
+
+<p>(We're using an explicit "this" to make it clear that these are
+member variables.)</p>
+
+<p>A similar guideline is never call a method in the second clause of a "for"
+statement. For example, the following code will execute the getCount() method
+once per iteration, which is a huge waste when you could have simply cached
+the value as an int:</p>
+
+<pre>for (int i = 0; i &lt; this.getCount(); i++)
+ dumpItems(this.getItem(i));
+</pre>
+
+<p>It's also usually a good idea to create a local variable if you're going to be
+accessing an instance field more than once. For example:</p>
+
+<pre>
+ protected void drawHorizontalScrollBar(Canvas canvas, int width, int height) {
+ if (isHorizontalScrollBarEnabled()) {
+ int size = <strong>mScrollBar</strong>.getSize(<em>false</em>);
+ if (size &lt;= 0) {
+ size = mScrollBarSize;
+ }
+ <strong>mScrollBar</strong>.setBounds(0, <em>height</em> - size, width, height);
+ <strong>mScrollBar</strong>.setParams(
+ computeHorizontalScrollRange(),
+ computeHorizontalScrollOffset(),
+ computeHorizontalScrollExtent(), <em>false</em>);
+ <strong>mScrollBar</strong>.draw(canvas);
+ }
+ }</pre>
+
+<p>That's four separate lookups of the member field <code>mScrollBar</code>.
+By caching mScrollBar in a local stack variable, the four member field lookups
+become four stack variable references, which are much more efficient.</p>
+
+<p>Incidentally, method arguments have the same performance characteristics
+as local variables.</p>
+
+<a name="use_final" id="use_final"></a>
+<h2>Declare Constants Final</h2>
+
+<p>Consider the following declaration at the top of a class:</p>
+
+<pre>static int intVal = 42;
+static String strVal = "Hello, world!";</pre>
+
+<p>The compiler generates a class initializer method, called
+<code>&lt;clinit&gt;</code>, that is executed when the class is first used.
+The method stores the value 42 into <code>intVal</code>, and extracts a
+reference from the classfile string constant table for <code>strVal</code>.
+When these values are referenced later on, they are accessed with field
+lookups.</p>
+
+<p>We can improve matters with the "final" keyword:</p>
+
+<pre>static final int intVal = 42;
+static final String strVal = "Hello, world!";</pre>
+
+<p>The class no longer requires a <code>&lt;clinit&gt;</code> method,
+because the constants go into classfile static field initializers, which are
+handled directly by the VM. Code accessing <code>intVal</code> will use
+the integer value 42 directly, and accesses to <code>strVal</code> will
+use a relatively inexpensive "string constant" instruction instead of a
+field lookup.</p>
+
+<p>Declaring a method or class "final" does not confer any immediate
+performance benefits, but it does allow certain optimizations. For example, if
+the compiler knows that a "getter" method can't be overridden by a sub-class,
+it can inline the method call.</p>
+
+<p>You can also declare local variables final. However, this has no definitive
+performance benefits. For local variables, only use "final" if it makes the
+code clearer (or you have to, e.g. for use in an anonymous inner class).</p>
+
+<a name="foreach" id="foreach"></a>
+<h2>Use Enhanced For Loop Syntax With Caution</h2>
+
+<p>The enhanced for loop (also sometimes known as "for-each" loop) can be used for collections that implement the Iterable interface.
+With these objects, an iterator is allocated to make interface calls
+to hasNext() and next(). With an ArrayList, you're better off walking through
+it directly, but for other collections the enhanced for loop syntax will be equivalent
+to explicit iterator usage.</p>
+
+<p>Nevertheless, the following code shows an acceptable use of the enhanced for loop:</p>
+
+<pre>public class Foo {
+ int mSplat;
+ static Foo mArray[] = new Foo[27];
+
+ public static void zero() {
+ int sum = 0;
+ for (int i = 0; i &lt; mArray.length; i++) {
+ sum += mArray[i].mSplat;
+ }
+ }
+
+ public static void one() {
+ int sum = 0;
+ Foo[] localArray = mArray;
+ int len = localArray.length;
+
+ for (int i = 0; i &lt; len; i++) {
+ sum += localArray[i].mSplat;
+ }
+ }
+
+ public static void two() {
+ int sum = 0;
+ for (Foo a: mArray) {
+ sum += a.mSplat;
+ }
+ }
+}</pre>
+
+<p><strong>zero()</strong> retrieves the static field twice and gets the array
+length once for every iteration through the loop.</p>
+
+<p><strong>one()</strong> pulls everything out into local variables, avoiding
+the lookups.</p>
+
+<p><strong>two()</strong> uses the enhanced for loop syntax introduced in version 1.5 of
+the Java programming language. The code generated by the compiler takes care
+of copying the array reference and the array length to local variables, making
+it a good choice for walking through all elements of an array. It does
+generate an extra local load/store in the main loop (apparently preserving
+"a"), making it a teensy bit slower and 4 bytes longer than one().</p>
+
+<p>To summarize all that a bit more clearly: enhanced for loop syntax performs well
+with arrays, but be cautious when using it with Iterable objects since there is
+additional object creation.</p>
+
+<a name="avoid_enums" id="avoid_enums"></a>
+<h2>Avoid Enums</h2>
+
+<p>Enums are very convenient, but unfortunately can be painful when size
+and speed matter. For example, this:</p>
+
+<pre>public class Foo {
+ public enum Shrubbery { GROUND, CRAWLING, HANGING }
+}</pre>
+
+<p>turns into a 900 byte .class file (Foo$Shrubbery.class). On first use, the
+class initializer invokes the &lt;init&gt; method on objects representing each
+of the enumerated values. Each object gets its own static field, and the full
+set is stored in an array (a static field called "$VALUES"). That's a lot of
+code and data, just for three integers.</p>
+
+<p>This:</p>
+
+<pre>Shrubbery shrub = Shrubbery.GROUND;</pre>
+
+<p>causes a static field lookup. If "GROUND" were a static final int,
+the compiler would treat it as a known constant and inline it.</p>
+
+<p>The flip side, of course, is that with enums you get nicer APIs and
+some compile-time value checking. So, the usual trade-off applies: you should
+by all means use enums for public APIs, but try to avoid them when performance
+matters.</p>
+
+<p>In some circumstances it can be helpful to get enum integer values
+through the <code>ordinal()</code> method. For example, replace:</p>
+
+<pre>for (int n = 0; n &lt; list.size(); n++) {
+ if (list.items[n].e == MyEnum.VAL_X)
+ // do stuff 1
+ else if (list.items[n].e == MyEnum.VAL_Y)
+ // do stuff 2
+}</pre>
+
+<p>with:</p>
+
+<pre> int valX = MyEnum.VAL_X.ordinal();
+ int valY = MyEnum.VAL_Y.ordinal();
+ int count = list.size();
+ MyItem items = list.items();
+
+ for (int n = 0; n &lt; count; n++)
+ {
+ int valItem = items[n].e.ordinal();
+
+ if (valItem == valX)
+ // do stuff 1
+ else if (valItem == valY)
+ // do stuff 2
+ }</pre>
+
+<p>In some cases, this will be faster, though this is not guaranteed.</p>
+
+<a name="package_inner" id="package_inner"></a>
+<h2>Use Package Scope with Inner Classes</h2>
+
+<p>Consider the following class definition:</p>
+
+<pre>public class Foo {
+ private int mValue;
+
+ public void run() {
+ Inner in = new Inner();
+ mValue = 27;
+ in.stuff();
+ }
+
+ private void doStuff(int value) {
+ System.out.println("Value is " + value);
+ }
+
+ private class Inner {
+ void stuff() {
+ Foo.this.doStuff(Foo.this.mValue);
+ }
+ }
+}</pre>
+
+<p>The key things to note here are that we define an inner class (Foo$Inner)
+that directly accesses a private method and a private instance field
+in the outer class. This is legal, and the code prints "Value is 27" as
+expected.</p>
+
+<p>The problem is that Foo$Inner is technically (behind the scenes) a totally
+separate class, which makes direct access to Foo's private
+members illegal. To bridge that gap, the compiler generates a
+couple of synthetic methods:</p>
+
+<pre>/*package*/ static int Foo.access$100(Foo foo) {
+ return foo.mValue;
+}
+/*package*/ static void Foo.access$200(Foo foo, int value) {
+ foo.doStuff(value);
+}</pre>
+
+<p>The inner-class code calls these static methods whenever it needs to
+access the "mValue" field or invoke the "doStuff" method in the outer
+class. What this means is that the code above really boils down to a case
+where you're accessing member fields through accessor methods instead of
+directly. Earlier we talked about how accessors are slower than direct field
+accesses, so this is an example of a certain language idiom resulting in an
+"invisible" performance hit.</p>
+
+<p>We can avoid this problem by declaring fields and methods accessed
+by inner classes to have package scope, rather than private scope.
+This runs faster and removes the overhead of the generated methods.
+(Unfortunately it also means the fields could be accessed directly by other
+classes in the same package, which runs counter to the standard OO
+practice of making all fields private. Once again, if you're
+designing a public API you might want to carefully consider using this
+optimization.)</p>
+
+<a name="avoidfloat" id="avoidfloat"></a>
+<h2>Avoid Float</h2>
+
+<p>Before the release of the Pentium CPU, it was common for game authors to do
+as much as possible with integer math. With the Pentium, the floating point
+math co-processor became a built-in feature, and by interleaving integer and
+floating-point operations your game would actually go faster than it would
+with purely integer math. The common practice on desktop systems is to use
+floating point freely.</p>
+
+<p>Unfortunately, embedded processors frequently do not have hardware floating
+point support, so all operations on "float" and "double" are performed in
+software. Some basic floating point operations can take on the order of a
+millisecond to complete.</p>
+
+<p>Also, even for integers, some chips have hardware multiply but lack
+hardware divide. In such cases, integer division and modulus operations are
+performed in software &mdash; something to think about if you're designing a
+hash table or doing lots of math.</p>
+
+<a name="samples" id="samples"></a>
+<h2>Some Sample Performance Numbers</h2>
+
+<p>To illustrate some of our ideas, here is a table listing the approximate
+run times for a few basic actions. Note that these values should NOT be taken
+as absolute numbers: they are a combination of CPU and wall clock time, and
+will change as improvements are made to the system. However, it is worth
+noting how these values apply relative to each other &mdash; for example,
+adding a member variable currently takes about four times as long as adding a
+local variable.</p>
+
+<table>
+ <tr>
+ <th>Action</th>
+ <th>Time</th>
+ </tr>
+ <tr>
+ <td>Add a local variable </td>
+ <td>1</td>
+ </tr>
+ <tr class="alt">
+ <td>Add a member variable </td>
+ <td>4</td>
+ </tr>
+ <tr>
+ <td>Call String.length()</td>
+ <td>5</td>
+ </tr>
+ <tr class="alt">
+ <td>Call empty static native method</td>
+ <td>5</td>
+ </tr>
+ <tr>
+ <td>Call empty static method </td>
+ <td>12</td>
+ </tr>
+ <tr class="alt">
+ <td>Call empty virtual method </td>
+ <td>12.5</td>
+ </tr>
+ <tr>
+ <td>Call empty interface method </td>
+ <td>15</td>
+ </tr>
+ <tr class="alt">
+ <td>Call Iterator:next() on a HashMap </td>
+ <td>165</td>
+ </tr>
+ <tr>
+ <td>Call put() on a HashMap</td>
+ <td>600</td>
+ </tr>
+ <tr class="alt">
+ <td>Inflate 1 View from XML </td>
+ <td>22,000</td>
+ </tr>
+ <tr>
+ <td>Inflate 1 LinearLayout containing 1 TextView </td>
+ <td>25,000</td>
+ </tr>
+ <tr class="alt">
+ <td>Inflate 1 LinearLayout containing 6 View objects </td>
+ <td>100,000</td>
+ </tr>
+ <tr>
+ <td>Inflate 1 LinearLayout containing 6 TextView objects </td>
+ <td>135,000</td>
+ </tr>
+ <tr class="alt">
+ <td>Launch an empty activity </td>
+ <td>3,000,000</td>
+ </tr>
+</table>
+
+<a name="closing_notes" id="closing_notes"></a>
+<h2>Closing Notes</h2>
+
+<p>The best way to write good, efficient code for embedded systems is to
+understand what the code you write really does. If you really want to allocate
+an iterator, by all means use enhanced for loop syntax on a List; just make it a
+deliberate choice, not an inadvertent side effect.</p>
+
+<p>Forewarned is forearmed! Know what you're getting into! Insert your
+favorite maxim here, but always think carefully about what your code is doing,
+and be on the lookout for ways to speed it up.</p>