author: Sebastian Pop <s.pop@samsung.com> 2018-11-07 15:11:27 -0600
committer: Hans Kristian Rosbach <hk-github@circlestorm.org> 2018-12-18 13:40:06 +0100
commit: dd70d6c467f26d727ea730ba577809fa0668940d
tree: a010cca6ae71460f80f4de306ea20d835059fa2d /inflate.c
parent: 4ec1ae6af0bd649860e76cef340d7fcbd5f5e552
bug #117: speed up inflate_fast
Based on a patch by Nigel Tao:
https://github.com/madler/zlib/pull/292/commits/e0ff1f330cc03ee04843f857869b4036593ab39d

This patch makes unzipping of files up to 1.2x faster on x86_64.
The other part (1.3x speedup) of the patch by Nigel Tao is unsafe,
as discussed in the review of that pull request. zlib-ng already has
a different way to optimize the memcpy for that missing part.

The original patch was enabled only on little-endian machines.
This patch adapts the loading of 64 bits at a time to big-endian machines.

Benchmarking notes from Hans Kristian Rosbach:
https://github.com/zlib-ng/zlib-ng/pull/224#issuecomment-444837182

Benchmark runs: 7, tested levels: 0-7, testfile 100M

develop at 796ad10 with -O3:

 Level   Comp      Comptime min/avg/max   Decomptime min/avg/max
 0       100.02%   0.01/0.01/0.02         0.08/0.09/0.11
 1        47.08%   0.49/0.50/0.51         0.37/0.39/0.40
 2        36.02%   1.10/1.12/1.13         0.39/0.39/0.40
 3        34.77%   1.32/1.34/1.37         0.38/0.38/0.38
 4        33.41%   1.50/1.53/1.56         0.37/0.37/0.38
 5        33.07%   1.85/1.87/1.90         0.36/0.37/0.38
 6        32.83%   2.54/2.57/2.61         0.36/0.37/0.38

 avg      45.31%   1.28                   0.34
 tot               62.60                  16.58

PR224 with -O3:

 Level   Comp      Comptime min/avg/max   Decomptime min/avg/max
 0       100.02%   0.01/0.01/0.02         0.09/0.09/0.10
 1        47.08%   0.49/0.50/0.51         0.37/0.37/0.38
 2        36.02%   1.09/1.11/1.13         0.38/0.38/0.39
 3        34.77%   1.32/1.34/1.38         0.35/0.36/0.38
 4        33.41%   1.49/1.52/1.54         0.36/0.36/0.37
 5        33.07%   1.85/1.88/1.93         0.35/0.36/0.37
 6        32.83%   2.55/2.58/2.65         0.35/0.35/0.36

 avg      45.31%   1.28                   0.33
 tot               62.48                  16.02

So I see about a 5.4% speedup on my x86_64 machine, not quite the 1.2x
speedup but a nice speedup nevertheless. This benchmark measures the total
execution time of minigzip, so that might have caused some inefficiencies.

At -O2, I only see a 2.7% speedup.
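
The big-endian adaptation mentioned above can be sketched as follows. This is
only an illustration, not the code from this commit: the names
host_is_big_endian, swap_bytes_64 and load_64_bits_le are hypothetical, and the
sketch assumes the bit accumulator wants the first input byte in the lowest 8
bits, which matches how deflate packs bits starting from the least significant
bit of each byte. The load goes through memcpy so it is safe for unaligned
input, and the bytes are swapped only when the host is big-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: detect host byte order at run time. */
static int host_is_big_endian(void) {
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0;          /* low byte stored last => big-endian */
}

/* Hypothetical helper: portable 64-bit byte swap using shifts and masks. */
static uint64_t swap_bytes_64(uint64_t v) {
    v = ((v & 0x00FF00FF00FF00FFULL) << 8)  | ((v >> 8)  & 0x00FF00FF00FF00FFULL);
    v = ((v & 0x0000FFFF0000FFFFULL) << 16) | ((v >> 16) & 0x0000FFFF0000FFFFULL);
    return (v << 32) | (v >> 32);
}

/* Read 8 input bytes so that src[0] ends up in the lowest 8 bits,
 * regardless of host byte order. The caller must guarantee that at
 * least 8 bytes are readable at src. */
static uint64_t load_64_bits_le(const unsigned char *src) {
    uint64_t v;
    assert(src != NULL);
    memcpy(&v, src, sizeof(v)); /* unaligned-safe native load */
    if (host_is_big_endian())
        v = swap_bytes_64(v);   /* convert to little-endian bit order */
    return v;
}
```

A production version would more likely pick the byte order at compile time and
use a compiler byte-swap intrinsic; the run-time probe here just keeps the
sketch portable.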
Diffstat (limited to 'inflate.c')
-rw-r--r--  inflate.c  3
1 files changed, 2 insertions, 1 deletions
diff --git a/inflate.c b/inflate.c
index d2e621e..ee0e530 100644
--- a/inflate.c
+++ b/inflate.c
@@ -1016,7 +1016,8 @@ int ZEXPORT PREFIX(inflate)(PREFIX3(stream) *strm, int flush) {
     case LEN_:
         state->mode = LEN;
     case LEN:
-        if (have >= 6 && left >= 258) {
+        if (have >= INFLATE_FAST_MIN_HAVE &&
+            left >= INFLATE_FAST_MIN_LEFT) {
             RESTORE();
             inflate_fast(strm, out);
             LOAD();
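
The hunk above replaces the literal thresholds with named constants whose
definitions are not part of this diff. A minimal sketch of the same gate,
assuming placeholder values: EXAMPLE_FAST_MIN_HAVE, EXAMPLE_FAST_MIN_LEFT and
can_use_fast_path are hypothetical names, and 8 and 258 are assumptions chosen
only because a 64-bit load reads 8 input bytes and 258 is the longest match
deflate can emit. The real INFLATE_FAST_MIN_HAVE and INFLATE_FAST_MIN_LEFT may
differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed placeholder values, NOT taken from the patch. */
#define EXAMPLE_FAST_MIN_HAVE 8u    /* input bytes needed for one 64-bit load */
#define EXAMPLE_FAST_MIN_LEFT 258u  /* output room for one maximum-length match */

/* The fast loop may read a whole 64-bit word without a bounds check,
 * so the input threshold must cover at least that many bytes. */
static_assert(EXAMPLE_FAST_MIN_HAVE >= sizeof(uint64_t),
              "input threshold must cover a 64-bit load");

/* Return nonzero when both buffers are large enough for the fast loop
 * to run without per-byte bounds checks. */
static int can_use_fast_path(size_t have, size_t left) {
    return have >= EXAMPLE_FAST_MIN_HAVE && left >= EXAMPLE_FAST_MIN_LEFT;
}
```

Under these assumed values, an input of 6 bytes, which passed the old
`have >= 6` test, would no longer take the fast path and would be handled by
the regular decoding loop instead.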