The patch + the commit message: http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/e8443c7be117
Sun bug:
http://bugs.sun.com/view_bug.do?bug_id=6921969
Mailing list posts:http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2010-January/002639.html
http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2010-February/002660.html
It applies well to the BigInteger.mulAdd loop:
private final static long LONG_MASK = 0xffffffffL; static int mulAdd(int[] out, int[] in, int offset, int len, int k) { long kLong = k & LONG_MASK; long carry = 0; offset = out.length-offset - 1; for (int j=len-1; j >= 0; j--) { long product = (in[j] & LONG_MASK) * kLong + (out[offset] & LONG_MASK) + carry; out[offset--] = (int)product; carry = product >>> 32; } return (int)carry; }
It improved an internal microbenchmark that uses BigInteger.mulAdd by 12%, and crypto.rsa and crypto.signverify in SPECjvm2008 by 7% and 2.3%, respectively in my measurements.
Here's low level details. On x86 (32 bit), a 64 bit long multiplication is implemented as a sequence of multiple 32 bit multiplications as in:
MOV EBP,ECX.lo IMUL EBP,EDX MOV EDX,ECX.hi IMUL EDX,EAX ADD EBP,EDX MUL EDX:EAX,ECX.lo ADD EDX,EBP
This is based on the following idea:
Result = x * y
is implemented as
lo(result) = lo(x_lo * y_lo)
hi(result) = hi(x_lo * y_lo) + lo(x_hi * y_lo) + lo(x_lo * y_hi)
where lo(x) indicates the lower 32 bits of x and hi(x) the higher 32 bits of x.
In the above mulAdd code, the higher 32 bits of both of operands are known to be zero because of the '&' operations. In such cases, some 32 bit multiplications are redundant. In the above formula, if x_hi=0 and y_hi=0,
lo(result) = lo(x_lo * y_lo)
hi(result) = hi(x_lo * y_lo)
because lo(x_hi * y_lo) = 0 and lo(x_lo * y_hi) = 0.
In this case, we need to emit only
MUL EDX:EAX,ECX.lo
which is one instruction instead of 7 and is faster to execute. There are variations of this where only one of x_hi and y_hi is zero.
I thank Chuck Rasbold, Tom Rodriguez, Christian Thalinger, and Vladimir Kozlov who reviewed this change.