|
Based on a patch by Nigel Tao:
https://github.com/madler/zlib/pull/292/commits/e0ff1f330cc03ee04843f857869b4036593ab39d
This patch makes unzipping of files up to 1.2x faster on x86_64. The other part
of Nigel Tao's patch (a 1.3x speedup) is unsafe, as discussed in the review of
that pull request, and zlib-ng already optimizes the memcpy for that part in a
different way.

The original patch was enabled only on little-endian machines. This patch adapts
the 64-bits-at-a-time loading to big-endian machines as well.
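
As a rough illustration of the endianness point above, the sketch below reads
8 input bytes with a single load and byte-swaps the result on big-endian hosts,
so the bit buffer sees the same little-endian value everywhere. The names
load_64_bits_le, swap64 and host_is_big_endian are hypothetical and are not
taken from the patch; the actual zlib-ng code may detect endianness at compile
time and use a compiler builtin for the swap.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reverse the byte order of a 64-bit value. */
static uint64_t swap64(uint64_t v) {
    /* Swap adjacent bytes, then adjacent 16-bit pairs, then the 32-bit halves. */
    v = ((v & 0x00ff00ff00ff00ffULL) << 8) | ((v >> 8) & 0x00ff00ff00ff00ffULL);
    v = ((v & 0x0000ffff0000ffffULL) << 16) | ((v >> 16) & 0x0000ffff0000ffffULL);
    return (v << 32) | (v >> 32);
}

/* Hypothetical helper: runtime probe of the host byte order. */
static int host_is_big_endian(void) {
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Hypothetical helper: read 8 bytes from `in` as a little-endian 64-bit value.
 * memcpy avoids unaligned-access and aliasing problems; compilers usually
 * turn it into one 64-bit load. The caller must guarantee that 8 bytes are
 * readable at `in`. */
static uint64_t load_64_bits_le(const unsigned char *in) {
    uint64_t chunk;
    memcpy(&chunk, in, sizeof(chunk));
    if (host_is_big_endian()) {
        chunk = swap64(chunk);  /* big-endian host: restore little-endian bit order */
    }
    return chunk;
}
```

With this sketch, the first input byte always lands in the low 8 bits of the
result, which is the order in which inflate consumes bits from its bit buffer.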
Benchmarking notes from Hans Kristian Rosbach:
https://github.com/zlib-ng/zlib-ng/pull/224#issuecomment-444837182
Benchmark runs: 7, tested levels: 0-7, testfile 100M
develop at 796ad10 with -O3:
 Level   Comp      Comptime min/avg/max   Decomptime min/avg/max
 0       100.02%   0.01/0.01/0.02         0.08/0.09/0.11
 1        47.08%   0.49/0.50/0.51         0.37/0.39/0.40
 2        36.02%   1.10/1.12/1.13         0.39/0.39/0.40
 3        34.77%   1.32/1.34/1.37         0.38/0.38/0.38
 4        33.41%   1.50/1.53/1.56         0.37/0.37/0.38
 5        33.07%   1.85/1.87/1.90         0.36/0.37/0.38
 6        32.83%   2.54/2.57/2.61         0.36/0.37/0.38
 avg      45.31%        1.28                   0.34
 tot                   62.60                  16.58
PR224 with -O3:
 Level   Comp      Comptime min/avg/max   Decomptime min/avg/max
 0       100.02%   0.01/0.01/0.02         0.09/0.09/0.10
 1        47.08%   0.49/0.50/0.51         0.37/0.37/0.38
 2        36.02%   1.09/1.11/1.13         0.38/0.38/0.39
 3        34.77%   1.32/1.34/1.38         0.35/0.36/0.38
 4        33.41%   1.49/1.52/1.54         0.36/0.36/0.37
 5        33.07%   1.85/1.88/1.93         0.35/0.36/0.37
 6        32.83%   2.55/2.58/2.65         0.35/0.35/0.36
 avg      45.31%        1.28                   0.33
 tot                   62.48                  16.02
So I see about a 3.4% decompression speedup on my x86_64 machine (total
decompression time drops from 16.58 to 16.02), not quite the 1.2x speedup but a
nice speedup nevertheless. This benchmark measures the total execution time of
minigzip, so overhead outside of inflate itself may have diluted the measured
gain.
At -O2, I only see a 2.7% speedup.
|