author:    alk3pInjection <webmaster@raspii.tech>  2021-08-30 16:43:38 +0800
committer: alk3pInjection <webmaster@raspii.tech>  2021-08-30 16:43:38 +0800
commit:    cbe033a53bfe49d980774e59025e3b2af91778b7 (patch)
tree:      558535f91276162e0be70d07b34ed2e6577e38ad /doc
parent:    fdd43c66dd9e77283aa8f7e52a881be44d622441 (diff)
parent:    d44371841a2f1728a3f36839fd4b7e872d0927d3 (diff)

Merge tag 'v1.9.3' into lineage-18.1  (HEAD, lineage-18.1)

Change-Id: Iad56c1b17a32f9f356a4c1ff9557f0e79addf481
Diffstat (limited to 'doc')

 -rw-r--r--  doc/images/usingCDict_1_8_2.png   bin  81858 -> 0 bytes
 -rw-r--r--  doc/lz4_Block_format.md            84
 -rw-r--r--  doc/lz4_Frame_format.md            56
 -rw-r--r--  doc/lz4_manual.html               501
 -rw-r--r--  doc/lz4frame_manual.html          112

5 files changed, 479 insertions, 274 deletions
diff --git a/doc/images/usingCDict_1_8_2.png b/doc/images/usingCDict_1_8_2.png
deleted file mode 100644
index 9434198..0000000
--- a/doc/images/usingCDict_1_8_2.png
+++ /dev/null
Binary files differ
diff --git a/doc/lz4_Block_format.md b/doc/lz4_Block_format.md
index 5438730..4344e9b 100644
--- a/doc/lz4_Block_format.md
+++ b/doc/lz4_Block_format.md
@@ -1,6 +1,6 @@
LZ4 Block Format Description
============================
-Last revised: 2018-04-25.
+Last revised: 2019-03-30.
Author : Yann Collet
@@ -10,7 +10,8 @@ using any programming language.
LZ4 is an LZ77-type compressor with a fixed, byte-oriented encoding.
There is no entropy encoder back-end nor framing layer.
-The latter is assumed to be handled by other parts of the system (see [LZ4 Frame format]).
+The latter is assumed to be handled by other parts of the system
+(see [LZ4 Frame format]).
This design is assumed to favor simplicity and speed.
It helps later on for optimizations, compactness, and features.
@@ -104,45 +105,52 @@ A common case is an offset of 1,
meaning the last byte is repeated `matchlength` times.
-Parsing restrictions
+End of block restrictions
-----------------------
-There are specific parsing rules to respect in order to remain compatible
-with assumptions made by the decoder :
-
-1. The last 5 bytes are always literals. In other words, the last five bytes
- from the uncompressed input (or all bytes, if the input has less than five
- bytes) must be encoded as literals on behalf of the last sequence.
- The last sequence is incomplete, and stops right after the literals.
-2. The last match must start at least 12 bytes before end of block.
- The last match is part of the penultimate sequence,
- since the last sequence stops right after literals.
- Note that, as a consequence, blocks < 13 bytes cannot be compressed.
-
-These rules are in place to ensure that the decoder
-can speculatively execute copy instructions
-without ever reading nor writing beyond provided I/O buffers.
-
-1. To copy literals from a non-last sequence, an 8-byte copy instruction
- can always be safely issued (without reading past the input),
- because literals are followed by a 2-byte offset,
- and last sequence is at least 1+5 bytes long.
-2. Similarly, a match operation can speculatively copy up to 12 bytes
- while remaining within output buffer boundaries.
-
-Empty inputs can be represented with a zero byte,
-interpreted as a token without literals and without a match.
+There are specific rules required to terminate a block.
+
+1. The last sequence contains only literals.
+ The block ends right after them.
+2. The last 5 bytes of input are always literals.
+ Therefore, the last sequence contains at least 5 bytes.
+ - Special : if input is smaller than 5 bytes,
+ there is only one sequence, it contains the whole input as literals.
+ Empty input can be represented with a zero byte,
+ interpreted as a final token without literal and without a match.
+3. The last match must start at least 12 bytes before the end of block.
+ The last match is part of the penultimate sequence.
+ It is followed by the last sequence, which contains only literals.
+ - Note that, as a consequence,
+ an independent block < 13 bytes cannot be compressed,
+ because the match must copy "something",
+ so it needs at least one prior byte.
+ - When a block can reference data from another block,
+ it can start immediately with a match and no literal,
+ so a block of 12 bytes can be compressed.
+
+When a block does not respect these end conditions,
+a conformant decoder is allowed to reject the block as incorrect.
+
+These rules are in place to ensure that a conformant decoder
+can be designed for speed, issuing speculatively instructions,
+while never reading nor writing beyond provided I/O buffers.
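To make rule 1 concrete, here is a minimal, hedged C sketch (using the liblz4 public API) that hand-builds the simplest possible block, one token followed by its literals, and checks that `LZ4_decompress_safe()` regenerates them. The 5-byte payload and buffer sizes are arbitrary choices for illustration.

```c
#include <assert.h>
#include <string.h>
#include "lz4.h"

int main(void)
{
    /* A block whose only sequence is literal-only: token = (numLiterals << 4);
     * the low nibble (match length) is unused because the block ends after the literals. */
    const char literals[5] = { 'h', 'e', 'l', 'l', 'o' };
    char block[1 + sizeof(literals)];
    block[0] = (char)(sizeof(literals) << 4);   /* token: 5 literals, no match */
    memcpy(block + 1, literals, sizeof(literals));

    char out[16];
    int const r = LZ4_decompress_safe(block, out, (int)sizeof(block), (int)sizeof(out));
    assert(r == (int)sizeof(literals));
    assert(memcmp(out, literals, sizeof(literals)) == 0);
    return 0;
}
```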
Additional notes
-----------------------
-There is no assumption nor limits to the way the compressor
+If the decoder will decompress data from an external source,
+it is recommended to ensure that the decoder will not be vulnerable to
+buffer overflow manipulations.
+Always ensure that read and write operations
+remain within the limits of provided buffers.
+Test the decoder with fuzzers
+to ensure it's resilient to improbable combinations.
+
+The format makes no assumption nor limits to the way the compressor
searches and selects matches within the source data block.
-It could be a fast scan, a multi-probe, a full search using BST,
-standard hash chains or MMC, well whatever.
-
-Advanced parsing strategies can also be implemented, such as lazy match,
-or full optimal parsing.
-
-All these trade-off offer distinctive speed/memory/compression advantages.
-Whatever the method used by the compressor, its result will be decodable
-by any LZ4 decoder if it follows the format specification described above.
+Multiple techniques can be considered,
+featuring distinct time / performance trade-offs.
+As long as the format is respected,
+the result will be compatible and decodable by any compliant decoder.
+An upper compression limit can be reached,
+using a technique called "full optimal parsing", at high CPU cost.
diff --git a/doc/lz4_Frame_format.md b/doc/lz4_Frame_format.md
index 0c98df1..7e08841 100644
--- a/doc/lz4_Frame_format.md
+++ b/doc/lz4_Frame_format.md
@@ -16,7 +16,7 @@ Distribution of this document is unlimited.
### Version
-1.6.1 (30/01/2018)
+1.6.2 (12/08/2020)
Introduction
@@ -75,7 +75,7 @@ __Frame Descriptor__
3 to 15 Bytes, to be detailed in its own paragraph,
as it is the most important part of the spec.
-The combined __Magic Number__ and __Frame Descriptor__ fields are sometimes
+The combined _Magic_Number_ and _Frame_Descriptor_ fields are sometimes
called ___LZ4 Frame Header___. Its size varies between 7 and 19 bytes.
__Data Blocks__
@@ -85,14 +85,13 @@ That’s where compressed data is stored.
__EndMark__
-The flow of blocks ends when the last data block has a size of “0”.
-The size is expressed as a 32-bits value.
+The flow of blocks ends when the last data block is followed by
+the 32-bit value `0x00000000`.
__Content Checksum__
-Content Checksum verify that the full content has been decoded correctly.
-The content checksum is the result
-of [xxh32() hash function](https://github.com/Cyan4973/xxHash)
+_Content_Checksum_ verifies that the full content has been decoded correctly.
+The content checksum is the result of [xxHash-32 algorithm]
digesting the original (decoded) data as input, and a seed of zero.
Content checksum is only present when its associated flag
is set in the frame descriptor.
@@ -101,7 +100,7 @@ that all blocks were fully transmitted in the correct order and without error,
and also that the encoding/decoding process itself generated no distortion.
Its usage is recommended.
-The combined __EndMark__ and __Content Checksum__ fields might sometimes be
+The combined _EndMark_ and _Content_Checksum_ fields might sometimes be
referred to as ___LZ4 Frame Footer___. Its size varies between 4 and 8 bytes.
__Frame Concatenation__
@@ -213,7 +212,7 @@ __Content Size__
This is the original (uncompressed) size.
This information is optional, and only present if the associated flag is set.
-Content size is provided using unsigned 8 Bytes, for a maximum of 16 HexaBytes.
+Content size is provided using unsigned 8 Bytes, for a maximum of 16 Exabytes.
Format is Little endian.
This value is informational, typically for display or memory allocation.
It can be skipped by a decoder, or used to validate content correctness.
@@ -261,35 +260,48 @@ __Block Size__
This field uses 4-bytes, format is little-endian.
-The highest bit is “1” if data in the block is uncompressed.
+If the highest bit is set (`1`), the block is uncompressed.
-The highest bit is “0” if data in the block is compressed by LZ4.
+If the highest bit is not set (`0`), the block is LZ4-compressed,
+using the [LZ4 block format specification](https://github.com/lz4/lz4/blob/dev/doc/lz4_Block_format.md).
-All other bits give the size, in bytes, of the following data block
-(the size does not include the block checksum if present).
+All other bits give the size, in bytes, of the data section.
+The size does not include the block checksum if present.
-Block Size shall never be larger than Block Maximum Size.
-Such a thing could happen for incompressible source data.
-In such case, such a data block shall be passed in uncompressed format.
+_Block_Size_ shall never be larger than _Block_Maximum_Size_.
+Such an outcome could potentially happen for non-compressible sources.
+In such a case, such data block must be passed using uncompressed format.
+
+A value of `0x00000000` is invalid, and signifies an _EndMark_ instead.
+Note that this is different from a value of `0x80000000` (highest bit set),
+which is an uncompressed block of size 0 (empty),
+which is valid, and therefore doesn't end a frame.
+Note that, if _Block_checksum_ is enabled,
+even an empty block must be followed by a 32-bit block checksum.
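As an illustration of the field layout described above, the following hedged C sketch decodes the 4-byte Block Size field; the helper name and return convention are invented for the example and are not part of the lz4frame API.

```c
#include <stdint.h>

/* Illustrative helper (not an lz4frame API): decode the Block Size field.
 * Returns the data-section size in bytes, and flags the uncompressed and EndMark cases. */
static uint32_t read_block_size(const uint8_t* p, int* isUncompressed, int* isEndMark)
{
    uint32_t const raw = (uint32_t)p[0] | ((uint32_t)p[1] << 8)
                       | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24); /* little-endian */
    *isEndMark      = (raw == 0x00000000u);       /* EndMark: all bits zero */
    *isUncompressed = (raw & 0x80000000u) != 0;   /* highest bit set => stored uncompressed */
    return raw & 0x7FFFFFFFu;                     /* remaining bits: size of the data section */
}
```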
__Data__
Where the actual data to decode stands.
It might be compressed or not, depending on previous field indications.
-Uncompressed size of Data can be any size, up to “block maximum size”.
-Note that data block is not necessarily full :
-an arbitrary “flush” may happen anytime. Any block can be “partially filled”.
+
+When compressed, the data must respect the [LZ4 block format specification](https://github.com/lz4/lz4/blob/dev/doc/lz4_Block_format.md).
+
+Note that a block is not necessarily full.
+Uncompressed size of data can be any size __up to__ _Block_Maximum_Size_,
+so it may contain less data than the maximum block size.
__Block checksum__
Only present if the associated flag is set.
This is a 4-bytes checksum value, in little endian format,
-calculated by using the xxHash-32 algorithm on the raw (undecoded) data block,
+calculated by using the [xxHash-32 algorithm] on the __raw__ (undecoded) data block,
and a seed of zero.
The intention is to detect data corruption (storage or transmission errors)
before decoding.
-Block checksum is cumulative with Content checksum.
+_Block_checksum_ can be cumulative with _Content_checksum_.
+
+[xxHash-32 algorithm]: https://github.com/Cyan4973/xxHash/blob/release/doc/xxhash_spec.md
Skippable Frames
@@ -386,6 +398,8 @@ and trigger an error if it does not fit within acceptable range.
Version changes
---------------
+1.6.2 : clarifies specification of _EndMark_
+
1.6.1 : introduced terms "LZ4 Frame Header" and "LZ4 Frame Footer"
1.6.0 : restored Dictionary ID field in Frame header
diff --git a/doc/lz4_manual.html b/doc/lz4_manual.html
index 6ebf8d2..47fe18d 100644
--- a/doc/lz4_manual.html
+++ b/doc/lz4_manual.html
@@ -1,10 +1,10 @@
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
-<title>1.8.3 Manual</title>
+<title>1.9.3 Manual</title>
</head>
<body>
-<h1>1.8.3 Manual</h1>
+<h1>1.9.3 Manual</h1>
<hr>
<a name="Contents"></a><h2>Contents</h2>
<ol>
@@ -15,37 +15,44 @@
<li><a href="#Chapter5">Advanced Functions</a></li>
<li><a href="#Chapter6">Streaming Compression Functions</a></li>
<li><a href="#Chapter7">Streaming Decompression Functions</a></li>
-<li><a href="#Chapter8">Unstable declarations</a></li>
-<li><a href="#Chapter9">Private definitions</a></li>
+<li><a href="#Chapter8">Experimental section</a></li>
+<li><a href="#Chapter9">Private Definitions</a></li>
<li><a href="#Chapter10">Obsolete Functions</a></li>
</ol>
<hr>
<a name="Chapter1"></a><h2>Introduction</h2><pre>
- LZ4 is lossless compression algorithm, providing compression speed at 400 MB/s per core,
+ LZ4 is a lossless compression algorithm, providing compression speed >500 MB/s per core,
scalable with multi-cores CPU. It features an extremely fast decoder, with speed in
multiple GB/s per core, typically reaching RAM speed limits on multi-core systems.
The LZ4 compression library provides in-memory compression and decompression functions.
+ It gives full buffer control to user.
Compression can be done in:
- a single step (described as Simple Functions)
- a single step, reusing a context (described in Advanced Functions)
- unbounded multiple steps (described as Streaming compression)
- lz4.h provides block compression functions. It gives full buffer control to user.
- Decompressing an lz4-compressed block also requires metadata (such as compressed size).
- Each application is free to encode such metadata in whichever way it wants.
+ lz4.h generates and decodes LZ4-compressed blocks (doc/lz4_Block_format.md).
+ Decompressing such a compressed block requires additional metadata.
+ Exact metadata depends on exact decompression function.
+ For the typical case of LZ4_decompress_safe(),
+ metadata includes block's compressed size, and maximum bound of decompressed size.
+ Each application is free to encode and pass such metadata in whichever way it wants.
- An additional format, called LZ4 frame specification (doc/lz4_Frame_format.md),
- take care of encoding standard metadata alongside LZ4-compressed blocks.
- If your application requires interoperability, it's recommended to use it.
- A library is provided to take care of it, see lz4frame.h.
+ lz4.h only handles blocks; it cannot generate Frames.
+
+ Blocks are different from Frames (doc/lz4_Frame_format.md).
+ Frames bundle both blocks and metadata in a specified manner.
+ Embedding metadata is required for compressed data to be self-contained and portable.
+ Frame format is delivered through a companion API, declared in lz4frame.h.
+ The `lz4` CLI can only manage frames.
<BR></pre>
<a name="Chapter2"></a><h2>Version</h2><pre></pre>
<pre><b>int LZ4_versionNumber (void); </b>/**< library version number; useful to check dll version */<b>
</b></pre><BR>
-<pre><b>const char* LZ4_versionString (void); </b>/**< library version string; unseful to check dll version */<b>
+<pre><b>const char* LZ4_versionString (void); </b>/**< library version string; useful to check dll version */<b>
</b></pre><BR>
<a name="Chapter3"></a><h2>Tuning parameter</h2><pre></pre>
@@ -53,8 +60,8 @@
# define LZ4_MEMORY_USAGE 14
#endif
</b><p> Memory usage formula : N->2^N Bytes (examples : 10 -> 1KB; 12 -> 4KB ; 16 -> 64KB; 20 -> 1MB; etc.)
- Increasing memory usage improves compression ratio
- Reduced memory usage may improve speed, thanks to cache effect
+ Increasing memory usage improves compression ratio.
+ Reduced memory usage may improve speed, thanks to better cache locality.
Default value is 14, for 16KB, which nicely fits into Intel x86 L1 cache
</p></pre><BR>
@@ -62,27 +69,35 @@
<a name="Chapter4"></a><h2>Simple Functions</h2><pre></pre>
<pre><b>int LZ4_compress_default(const char* src, char* dst, int srcSize, int dstCapacity);
-</b><p> Compresses 'srcSize' bytes from buffer 'src'
- into already allocated 'dst' buffer of size 'dstCapacity'.
- Compression is guaranteed to succeed if 'dstCapacity' >= LZ4_compressBound(srcSize).
- It also runs faster, so it's a recommended setting.
- If the function cannot compress 'src' into a more limited 'dst' budget,
- compression stops *immediately*, and the function result is zero.
- Note : as a consequence, 'dst' content is not valid.
- Note 2 : This function is protected against buffer overflow scenarios (never writes outside 'dst' buffer, nor read outside 'source' buffer).
- srcSize : max supported value is LZ4_MAX_INPUT_SIZE.
- dstCapacity : size of buffer 'dst' (which must be already allocated)
- return : the number of bytes written into buffer 'dst' (necessarily <= dstCapacity)
- or 0 if compression fails
+</b><p> Compresses 'srcSize' bytes from buffer 'src'
+ into already allocated 'dst' buffer of size 'dstCapacity'.
+ Compression is guaranteed to succeed if 'dstCapacity' >= LZ4_compressBound(srcSize).
+ It also runs faster, so it's a recommended setting.
+ If the function cannot compress 'src' into a more limited 'dst' budget,
+ compression stops *immediately*, and the function result is zero.
+ In which case, 'dst' content is undefined (invalid).
+ srcSize : max supported value is LZ4_MAX_INPUT_SIZE.
+ dstCapacity : size of buffer 'dst' (which must be already allocated)
+ @return : the number of bytes written into buffer 'dst' (necessarily <= dstCapacity)
+ or 0 if compression fails
+ Note : This function is protected against buffer overflow scenarios (never writes outside 'dst' buffer, nor reads outside 'source' buffer).
+
</p></pre><BR>
<pre><b>int LZ4_decompress_safe (const char* src, char* dst, int compressedSize, int dstCapacity);
-</b><p> compressedSize : is the exact complete size of the compressed block.
- dstCapacity : is the size of destination buffer, which must be already allocated.
- return : the number of bytes decompressed into destination buffer (necessarily <= dstCapacity)
- If destination buffer is not large enough, decoding will stop and output an error code (negative value).
- If the source stream is detected malformed, the function will stop decoding and return a negative result.
- This function is protected against malicious data packets.
+</b><p> compressedSize : is the exact complete size of the compressed block.
+ dstCapacity : is the size of destination buffer (which must be already allocated), presumed an upper bound of decompressed size.
+ @return : the number of bytes decompressed into destination buffer (necessarily <= dstCapacity)
+ If destination buffer is not large enough, decoding will stop and output an error code (negative value).
+ If the source stream is detected malformed, the function will stop decoding and return a negative result.
+ Note 1 : This function is protected against malicious data packets :
+ it will never write outside 'dst' buffer, nor read outside 'source' buffer,
+ even if the compressed block is maliciously modified to order the decoder to do these actions.
+ In such case, the decoder stops immediately, and considers the compressed block malformed.
+ Note 2 : compressedSize and dstCapacity must be provided to the function, the compressed block does not contain them.
+ The implementation is free to send / store / derive this information in whichever way is most beneficial.
+ If there is a need for a different format which bundles together both compressed data and its metadata, consider looking at lz4frame.h instead.
+
</p></pre><BR>
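A minimal round trip through the two functions documented above, sketched under the assumption that the original size is known to the decoder (as the text notes, the block itself does not carry this metadata); error handling is reduced to asserts.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include "lz4.h"

int main(void)
{
    const char src[] = "LZ4 is a fast, lossless compression algorithm. "
                       "LZ4 is a fast, lossless compression algorithm.";
    int const srcSize = (int)sizeof(src);

    int const bound = LZ4_compressBound(srcSize);
    char* const compressed = malloc((size_t)bound);
    int const cSize = LZ4_compress_default(src, compressed, srcSize, bound);
    assert(cSize > 0);                         /* 0 means compression failed */

    char* const regenerated = malloc((size_t)srcSize);   /* dstCapacity = original size */
    int const dSize = LZ4_decompress_safe(compressed, regenerated, cSize, srcSize);
    assert(dSize == srcSize);
    assert(memcmp(src, regenerated, (size_t)srcSize) == 0);

    free(compressed);
    free(regenerated);
    return 0;
}
```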
<a name="Chapter5"></a><h2>Advanced Functions</h2><pre></pre>
@@ -102,15 +117,17 @@
The larger the acceleration value, the faster the algorithm, but also the lesser the compression.
It's a trade-off. It can be fine tuned, with each successive value providing roughly +~3% to speed.
An acceleration value of "1" is the same as regular LZ4_compress_default()
- Values <= 0 will be replaced by ACCELERATION_DEFAULT (currently == 1, see lz4.c).
+ Values <= 0 will be replaced by LZ4_ACCELERATION_DEFAULT (currently == 1, see lz4.c).
+ Values > LZ4_ACCELERATION_MAX will be replaced by LZ4_ACCELERATION_MAX (currently == 65537, see lz4.c).
</p></pre><BR>
<pre><b>int LZ4_sizeofState(void);
int LZ4_compress_fast_extState (void* state, const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);
-</b><p> Same compression function, just using an externally allocated memory space to store compression state.
- Use LZ4_sizeofState() to know how much memory must be allocated,
- and allocate it on 8-bytes boundaries (using malloc() typically).
- Then, provide this buffer as 'void* state' to compression function.
+</b><p> Same as LZ4_compress_fast(), using an externally allocated memory space for its state.
+ Use LZ4_sizeofState() to know how much memory must be allocated,
+ and allocate it on 8-bytes boundaries (using `malloc()` typically).
+ Then, provide this buffer as `void* state` to compression function.
+
</p></pre><BR>
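A short, hedged sketch of the externally-allocated-state pattern described above; the wrapper name and the acceleration value of 1 are arbitrary.

```c
#include <stdlib.h>
#include "lz4.h"

/* Compress with a caller-provided state buffer, as described above (sketch). */
static int compress_with_ext_state(const char* src, int srcSize, char* dst, int dstCapacity)
{
    void* const state = malloc((size_t)LZ4_sizeofState());  /* malloc() returns suitably aligned memory */
    if (state == NULL) return 0;
    int const cSize = LZ4_compress_fast_extState(state, src, dst, srcSize, dstCapacity,
                                                 1 /* acceleration */);
    free(state);
    return cSize;   /* 0 means failure, same convention as LZ4_compress_fast() */
}
```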
<pre><b>int LZ4_compress_destSize (const char* src, char* dst, int* srcSizePtr, int targetDstSize);
@@ -124,26 +141,17 @@ int LZ4_compress_fast_extState (void* state, const char* src, char* dst, int src
New value is necessarily <= input value.
@return : Nb bytes written into 'dst' (necessarily <= targetDestSize)
or 0 if compression fails.
-</p></pre><BR>
-
-<pre><b>int LZ4_decompress_fast (const char* src, char* dst, int originalSize);
-</b><p> This function used to be a bit faster than LZ4_decompress_safe(),
- though situation has changed in recent versions,
- and now `LZ4_decompress_safe()` can be as fast and sometimes faster than `LZ4_decompress_fast()`.
- Moreover, LZ4_decompress_fast() is not protected vs malformed input, as it doesn't perform full validation of compressed data.
- As a consequence, this function is no longer recommended, and may be deprecated in future versions.
- It's only remaining specificity is that it can decompress data without knowing its compressed size.
- originalSize : is the uncompressed size to regenerate.
- `dst` must be already allocated, its size must be >= 'originalSize' bytes.
- @return : number of bytes read from source buffer (== compressed size).
- If the source stream is detected malformed, the function stops decoding and returns a negative result.
- note : This function requires uncompressed originalSize to be known in advance.
- The function never writes past the output buffer.
- However, since it doesn't know its 'src' size, it may read past the intended input.
- Also, because match offsets are not validated during decoding,
- reads from 'src' may underflow.
- Use this function in trusted environment **only**.
+ Note : from v1.8.2 to v1.9.1, this function had a bug (fixed in v1.9.2+):
+ the produced compressed content could, in specific circumstances,
+ require to be decompressed into a destination buffer larger
+ by at least 1 byte than the content to decompress.
+ If an application uses `LZ4_compress_destSize()`,
+ it's highly recommended to update liblz4 to v1.9.2 or better.
+ If this can't be done or ensured,
+ the receiving decompression function should provide
+ a dstCapacity which is > decompressedSize, by at least 1 byte.
+ See https://github.com/lz4/lz4/issues/859 for details
</p></pre><BR>
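A hedged sketch of filling a fixed-size destination with LZ4_compress_destSize(); the 4 KB destination and the helper name are arbitrary, and the closing comment restates the v1.9.2 caveat from the note above.

```c
#include <stdio.h>
#include "lz4.h"

/* Fill a fixed 4 KB destination with as much of 'src' as fits (sketch). */
static void fill_fixed_block(const char* src, int srcSize)
{
    char dst[4096];
    int consumed = srcSize;                      /* in: available input; out: input actually used */
    int const written = LZ4_compress_destSize(src, dst, &consumed, (int)sizeof(dst));
    if (written == 0) return;                    /* compression failed */
    printf("consumed %d of %d input bytes into %d output bytes\n",
           consumed, srcSize, written);
    /* Per the note above: if the producer may run liblz4 v1.8.2-v1.9.1,
     * decompress this block into a buffer at least 1 byte larger than 'consumed'. */
}
```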
@@ -151,54 +159,80 @@ int LZ4_compress_fast_extState (void* state, const char* src, char* dst, int src
</b><p> Decompress an LZ4 compressed block, of size 'srcSize' at position 'src',
into destination buffer 'dst' of size 'dstCapacity'.
Up to 'targetOutputSize' bytes will be decoded.
- The function stops decoding on reaching this objective,
- which can boost performance when only the beginning of a block is required.
+ The function stops decoding on reaching this objective.
+ This can be useful to boost performance
+ whenever only the beginning of a block is required.
- @return : the number of bytes decoded in `dst` (necessarily <= dstCapacity)
+ @return : the number of bytes decoded in `dst` (necessarily <= targetOutputSize)
If source stream is detected malformed, function returns a negative result.
- Note : @return can be < targetOutputSize, if compressed block contains less data.
+ Note 1 : @return can be < targetOutputSize, if compressed block contains less data.
- Note 2 : this function features 2 parameters, targetOutputSize and dstCapacity,
- and expects targetOutputSize <= dstCapacity.
- It effectively stops decoding on reaching targetOutputSize,
+ Note 2 : targetOutputSize must be <= dstCapacity
+
+ Note 3 : this function effectively stops decoding on reaching targetOutputSize,
so dstCapacity is kind of redundant.
- This is because in a previous version of this function,
- decoding operation would not "break" a sequence in the middle.
- As a consequence, there was no guarantee that decoding would stop at exactly targetOutputSize,
+ This is because in older versions of this function,
+ decoding operation would still write complete sequences.
+ Therefore, there was no guarantee that it would stop writing at exactly targetOutputSize,
it could write more bytes, though only up to dstCapacity.
Some "margin" used to be required for this operation to work properly.
- This is no longer necessary.
- The function nonetheless keeps its signature, in an effort to not break API.
-
-</p></pre><BR>
+ Thankfully, this is no longer necessary.
+ The function nonetheless keeps the same signature, in an effort to preserve API compatibility.
-<a name="Chapter6"></a><h2>Streaming Compression Functions</h2><pre></pre>
+ Note 4 : If srcSize is the exact size of the block,
+ then targetOutputSize can be any value,
+ including larger than the block's decompressed size.
+ The function will, at most, generate block's decompressed size.
-<pre><b>LZ4_stream_t* LZ4_createStream(void);
-int LZ4_freeStream (LZ4_stream_t* streamPtr);
-</b><p> LZ4_createStream() will allocate and initialize an `LZ4_stream_t` structure.
- LZ4_freeStream() releases its memory.
+ Note 5 : If srcSize is _larger_ than block's compressed size,
+ then targetOutputSize **MUST** be <= block's decompressed size.
+ Otherwise, *silent corruption will occur*.
</p></pre><BR>
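A small, hedged sketch of partial decoding as described above; the helper name and parameters are illustrative only.

```c
#include <assert.h>
#include "lz4.h"

/* Decode only the first 'want' bytes of a block (sketch).
 * 'cSize' is the exact compressed size of the block (see Note 5 above). */
static int peek_block_prefix(const char* block, int cSize,
                             char* dst, int dstCapacity, int want)
{
    assert(want <= dstCapacity);   /* Note 2: targetOutputSize must be <= dstCapacity */
    return LZ4_decompress_safe_partial(block, dst, cSize, want, dstCapacity);
}
```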
-<pre><b>void LZ4_resetStream (LZ4_stream_t* streamPtr);
-</b><p> An LZ4_stream_t structure can be allocated once and re-used multiple times.
- Use this function to start compressing a new stream.
+<a name="Chapter6"></a><h2>Streaming Compression Functions</h2><pre></pre>
+
+<pre><b>void LZ4_resetStream_fast (LZ4_stream_t* streamPtr);
+</b><p> Use this to prepare an LZ4_stream_t for a new chain of dependent blocks
+ (e.g., LZ4_compress_fast_continue()).
+
+ An LZ4_stream_t must be initialized once before usage.
+ This is automatically done when created by LZ4_createStream().
+ However, should the LZ4_stream_t be simply declared on stack (for example),
+ it's necessary to initialize it first, using LZ4_initStream().
+
+ After init, start any new stream with LZ4_resetStream_fast().
+ A same LZ4_stream_t can be re-used multiple times consecutively
+ and compress multiple streams,
+ provided that it starts each new stream with LZ4_resetStream_fast().
+
+ LZ4_resetStream_fast() is much faster than LZ4_initStream(),
+ but is not compatible with memory regions containing garbage data.
+
+ Note: it's only useful to call LZ4_resetStream_fast()
+ in the context of streaming compression.
+ The *extState* functions perform their own resets.
+ Invoking LZ4_resetStream_fast() before is redundant, and even counterproductive.
</p></pre><BR>
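A hedged sketch of the initialization/reset discipline described above, applied to a chain of dependent blocks compressed with LZ4_compress_fast_continue(); the 64 KB chunking and the `emit` callback are hypothetical choices for the example.

```c
#include <stddef.h>
#include "lz4.h"

/* Sketch: compress one contiguous buffer as a chain of dependent 64 KB blocks.
 * 'emit' is a hypothetical callback that stores each block together with its compressed size. */
static void compress_chunks(const char* data, size_t size,
                            void (*emit)(const char* block, int blockSize))
{
    LZ4_stream_t stream;
    LZ4_initStream(&stream, sizeof(stream));   /* one-time init for an on-stack declaration */
    LZ4_resetStream_fast(&stream);             /* start a new chain of dependent blocks */

    char dst[LZ4_COMPRESSBOUND(64 * 1024)];
    size_t pos = 0;
    while (pos < size) {
        int const chunk = (size - pos > 64 * 1024) ? 64 * 1024 : (int)(size - pos);
        int const cSize = LZ4_compress_fast_continue(&stream, data + pos, dst,
                                                     chunk, (int)sizeof(dst), 1);
        if (cSize <= 0) return;                /* after an error, the stream can only be reset or freed */
        emit(dst, cSize);
        pos += (size_t)chunk;                  /* prior 64 KB stays at the same address (one source buffer) */
    }
}
```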
<pre><b>int LZ4_loadDict (LZ4_stream_t* streamPtr, const char* dictionary, int dictSize);
-</b><p> Use this function to load a static dictionary into LZ4_stream_t.
- Any previous data will be forgotten, only 'dictionary' will remain in memory.
+</b><p> Use this function to reference a static dictionary into LZ4_stream_t.
+ The dictionary must remain available during compression.
+ LZ4_loadDict() triggers a reset, so any previous data will be forgotten.
+ The same dictionary will have to be loaded on decompression side for successful decoding.
+ Dictionaries are useful for better compression of small data (KB range).
+ While LZ4 accepts any input as a dictionary,
+ results are generally better when using Zstandard's Dictionary Builder.
Loading a size of 0 is allowed, and is the same as reset.
- @return : dictionary size, in bytes (necessarily <= 64 KB)
+ @return : loaded dictionary size, in bytes (necessarily <= 64 KB)
</p></pre><BR>
<pre><b>int LZ4_compress_fast_continue (LZ4_stream_t* streamPtr, const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);
</b><p> Compress 'src' content using data from previously compressed blocks, for better compression ratio.
- 'dst' buffer must be already allocated.
+ 'dst' buffer must be already allocated.
If dstCapacity >= LZ4_compressBound(srcSize), compression is guaranteed to succeed, and runs faster.
@return : size of compressed block
@@ -206,10 +240,10 @@ int LZ4_freeStream (LZ4_stream_t* streamPtr);
Note 1 : Each invocation to LZ4_compress_fast_continue() generates a new block.
Each block has precise boundaries.
+ Each block must be decompressed separately, calling LZ4_decompress_*() with relevant metadata.
It's not possible to append blocks together and expect a single invocation of LZ4_decompress_*() to decompress them together.
- Each block must be decompressed separately, calling LZ4_decompress_*() with associated metadata.
- Note 2 : The previous 64KB of source data is __assumed__ to remain present, unmodified, at same address in memory!
+ Note 2 : The previous 64KB of source data is __assumed__ to remain present, unmodified, at same address in memory !
Note 3 : When input is structured as a double-buffer, each buffer can have any size, including < 64 KB.
Make sure that buffers are separated, by at least one byte.
@@ -217,7 +251,7 @@ int LZ4_freeStream (LZ4_stream_t* streamPtr);
Note 4 : If input buffer is a ring-buffer, it can have any size, including < 64 KB.
- Note 5 : After an error, the stream status is invalid, it can only be reset or freed.
+ Note 5 : After an error, the stream status is undefined (invalid), it can only be reset or freed.
</p></pre><BR>
@@ -250,7 +284,7 @@ int LZ4_freeStreamDecode (LZ4_streamDecode_t* LZ4_stream);
</p></pre><BR>
<pre><b>int LZ4_decoderRingBufferSize(int maxBlockSize);
-#define LZ4_DECODER_RING_BUFFER_SIZE(mbs) (65536 + 14 + (mbs)) </b>/* for static allocation; mbs presumed valid */<b>
+#define LZ4_DECODER_RING_BUFFER_SIZE(maxBlockSize) (65536 + 14 + (maxBlockSize)) </b>/* for static allocation; maxBlockSize presumed valid */<b>
</b><p> Note : in a ring buffer scenario (optional),
blocks are presumed decompressed next to each other
up to the moment there is not enough remaining space for next block (remainingSize < maxBlockSize),
@@ -264,7 +298,6 @@ int LZ4_freeStreamDecode (LZ4_streamDecode_t* LZ4_stream);
</p></pre><BR>
<pre><b>int LZ4_decompress_safe_continue (LZ4_streamDecode_t* LZ4_streamDecode, const char* src, char* dst, int srcSize, int dstCapacity);
-int LZ4_decompress_fast_continue (LZ4_streamDecode_t* LZ4_streamDecode, const char* src, char* dst, int originalSize);
</b><p> These decoding functions allow decompression of consecutive blocks in "streaming" mode.
A block is an unsplittable entity, it must be presented entirely to a decompression function.
Decompression functions only accepts one block at a time.
@@ -291,70 +324,48 @@ int LZ4_decompress_fast_continue (LZ4_streamDecode_t* LZ4_streamDecode, const ch
</p></pre><BR>
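A hedged sketch of streaming decompression with LZ4_decompress_safe_continue(), assuming block boundaries and compressed sizes were transmitted out of band, and that blocks are decoded next to each other in one destination buffer so the previous 64 KB remains addressable.

```c
#include "lz4.h"

/* Sketch: decode consecutive dependent blocks into one contiguous destination.
 * 'blocks', 'blockSizes' and 'nbBlocks' describe metadata transmitted out of band. */
static int decompress_chunks(const char* const* blocks, const int* blockSizes, int nbBlocks,
                             char* dst, int dstCapacity)
{
    LZ4_streamDecode_t lz4sd;
    LZ4_setStreamDecode(&lz4sd, NULL, 0);      /* start a new decoding stream, no dictionary */

    int written = 0;
    for (int i = 0; i < nbBlocks; i++) {
        int const r = LZ4_decompress_safe_continue(&lz4sd, blocks[i], dst + written,
                                                   blockSizes[i], dstCapacity - written);
        if (r < 0) return -1;                  /* malformed block */
        written += r;                          /* decoded data stays in place for back-references */
    }
    return written;
}
```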
<pre><b>int LZ4_decompress_safe_usingDict (const char* src, char* dst, int srcSize, int dstCapcity, const char* dictStart, int dictSize);
-int LZ4_decompress_fast_usingDict (const char* src, char* dst, int originalSize, const char* dictStart, int dictSize);
</b><p> These decoding functions work the same as
a combination of LZ4_setStreamDecode() followed by LZ4_decompress_*_continue()
They are stand-alone, and don't need an LZ4_streamDecode_t structure.
- Dictionary is presumed stable : it must remain accessible and unmodified during next decompression.
+ Dictionary is presumed stable : it must remain accessible and unmodified during decompression.
+ Performance tip : Decompression speed can be substantially increased
+ when dst == dictStart + dictSize.
</p></pre><BR>
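A hedged sketch combining LZ4_loadDict() on the compression side with LZ4_decompress_safe_usingDict() on the decoding side; the helper name and parameter layout are illustrative.

```c
#include "lz4.h"

/* Sketch: compress and decompress one small message with a shared static dictionary.
 * 'dict'/'dictSize' are assumed identical on both sides, and to remain accessible throughout. */
static int roundtrip_with_dict(const char* dict, int dictSize,
                               const char* src, int srcSize,
                               char* compressed, int compressedCapacity,
                               char* regenerated, int regeneratedCapacity)
{
    LZ4_stream_t stream;
    LZ4_initStream(&stream, sizeof(stream));
    LZ4_loadDict(&stream, dict, dictSize);                     /* reference the dictionary */
    int const cSize = LZ4_compress_fast_continue(&stream, src, compressed,
                                                 srcSize, compressedCapacity, 1);
    if (cSize <= 0) return -1;

    /* The decoder must be given the same dictionary. */
    return LZ4_decompress_safe_usingDict(compressed, regenerated, cSize,
                                         regeneratedCapacity, dict, dictSize);
}
```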
-<a name="Chapter8"></a><h2>Unstable declarations</h2><pre>
- Declarations in this section should be considered unstable.
- Use at your own peril, etc., etc.
- They may be removed in the future.
- Their signatures may change.
-<BR></pre>
+<a name="Chapter8"></a><h2>Experimental section</h2><pre>
+ Symbols declared in this section must be considered unstable. Their
+ signatures or semantics may change, or they may be removed altogether in the
+ future. They are therefore only safe to depend on when the caller is
+ statically linked against the library.
-<pre><b>void LZ4_resetStream_fast (LZ4_stream_t* streamPtr);
-</b><p> Use this, like LZ4_resetStream(), to prepare a context for a new chain of
- calls to a streaming API (e.g., LZ4_compress_fast_continue()).
-
- Note:
- Using this in advance of a non- streaming-compression function is redundant,
- and potentially bad for performance, since they all perform their own custom
- reset internally.
-
- Differences from LZ4_resetStream():
- When an LZ4_stream_t is known to be in a internally coherent state,
- it can often be prepared for a new compression with almost no work, only
- sometimes falling back to the full, expensive reset that is always required
- when the stream is in an indeterminate state (i.e., the reset performed by
- LZ4_resetStream()).
-
- LZ4_streams are guaranteed to be in a valid state when:
- - returned from LZ4_createStream()
- - reset by LZ4_resetStream()
- - memset(stream, 0, sizeof(LZ4_stream_t)), though this is discouraged
- - the stream was in a valid state and was reset by LZ4_resetStream_fast()
- - the stream was in a valid state and was then used in any compression call
- that returned success
- - the stream was in an indeterminate state and was used in a compression
- call that fully reset the state (e.g., LZ4_compress_fast_extState()) and
- that returned success
-
- When a stream isn't known to be in a valid state, it is not safe to pass to
- any fastReset or streaming function. It must first be cleansed by the full
- LZ4_resetStream().
-
-</p></pre><BR>
+ To protect against unsafe usage, not only are the declarations guarded,
+ the definitions are hidden by default
+ when building LZ4 as a shared/dynamic library.
+
+ In order to access these declarations,
+ define LZ4_STATIC_LINKING_ONLY in your application
+ before including LZ4's headers.
+
+ In order to make their implementations accessible dynamically, you must
+ define LZ4_PUBLISH_STATIC_FUNCTIONS when building the LZ4 library.
+<BR></pre>
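In practice, opting in looks like the following sketch (only meaningful when linking liblz4 statically):

```c
/* Opt in to the experimental declarations before including the header. */
#define LZ4_STATIC_LINKING_ONLY
#include "lz4.h"
```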
-<pre><b>int LZ4_compress_fast_extState_fastReset (void* state, const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);
+<pre><b>LZ4LIB_STATIC_API int LZ4_compress_fast_extState_fastReset (void* state, const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);
</b><p> A variant of LZ4_compress_fast_extState().
- Using this variant avoids an expensive initialization step. It is only safe
- to call if the state buffer is known to be correctly initialized already
- (see above comment on LZ4_resetStream_fast() for a definition of "correctly
- initialized"). From a high level, the difference is that this function
- initializes the provided state with a call to something like
- LZ4_resetStream_fast() while LZ4_compress_fast_extState() starts with a
- call to LZ4_resetStream().
+ Using this variant avoids an expensive initialization step.
+ It is only safe to call if the state buffer is known to be correctly initialized already
+ (see above comment on LZ4_resetStream_fast() for a definition of "correctly initialized").
+ From a high level, the difference is that
+ this function initializes the provided state with a call to something like LZ4_resetStream_fast()
+ while LZ4_compress_fast_extState() starts with a call to LZ4_resetStream().
</p></pre><BR>
-<pre><b>void LZ4_attach_dictionary(LZ4_stream_t *working_stream, const LZ4_stream_t *dictionary_stream);
-</b><p> This is an experimental API that allows for the efficient use of a
- static dictionary many times.
+<pre><b>LZ4LIB_STATIC_API void LZ4_attach_dictionary(LZ4_stream_t* workingStream, const LZ4_stream_t* dictionaryStream);
+</b><p> This is an experimental API that allows
+ efficient use of a static dictionary many times.
Rather than re-loading the dictionary buffer into a working context before
each compression, or copying a pre-loaded dictionary's LZ4_stream_t into a
@@ -365,8 +376,8 @@ int LZ4_decompress_fast_usingDict (const char* src, char* dst, int originalSize,
Currently, only streams which have been prepared by LZ4_loadDict() should
be expected to work.
- Alternatively, the provided dictionary stream pointer may be NULL, in which
- case any existing dictionary stream is unset.
+ Alternatively, the provided dictionaryStream may be NULL,
+ in which case any existing dictionary stream is unset.
If a dictionary is provided, it replaces any pre-existing stream history.
The dictionary contents are the only history that can be referenced and
@@ -380,51 +391,118 @@ int LZ4_decompress_fast_usingDict (const char* src, char* dst, int originalSize,
</p></pre><BR>
-<a name="Chapter9"></a><h2>Private definitions</h2><pre>
- Do not use these definitions.
- They are exposed to allow static allocation of `LZ4_stream_t` and `LZ4_streamDecode_t`.
- Using these definitions will expose code to API and/or ABI break in future versions of the library.
-<BR></pre>
+<pre><b></b><p>
+ It's possible to have input and output sharing the same buffer,
+  for highly constrained memory environments.
+  In both cases, it requires input to lie at the end of the buffer,
+  and decompression to start at the beginning of the buffer.
+ Buffer size must feature some margin, hence be larger than final size.
+
+ |<------------------------buffer--------------------------------->|
+ |<-----------compressed data--------->|
+ |<-----------decompressed size------------------>|
+ |<----margin---->|
+
+ This technique is more useful for decompression,
+ since decompressed size is typically larger,
+ and margin is short.
+
+ In-place decompression will work inside any buffer
+  whose size is >= LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE(decompressedSize).
+ This presumes that decompressedSize > compressedSize.
+ Otherwise, it means compression actually expanded data,
+ and it would be more efficient to store such data with a flag indicating it's not compressed.
+ This can happen when data is not compressible (already compressed, or encrypted).
+
+ For in-place compression, margin is larger, as it must be able to cope with both
+ history preservation, requiring input data to remain unmodified up to LZ4_DISTANCE_MAX,
+ and data expansion, which can happen when input is not compressible.
+ As a consequence, buffer size requirements are much higher,
+ and memory savings offered by in-place compression are more limited.
+
+ There are ways to limit this cost for compression :
+ - Reduce history size, by modifying LZ4_DISTANCE_MAX.
+ Note that it is a compile-time constant, so all compressions will apply this limit.
+ Lower values will reduce compression ratio, except when input_size < LZ4_DISTANCE_MAX,
+ so it's a reasonable trick when inputs are known to be small.
+ - Require the compressor to deliver a "maximum compressed size".
+ This is the `dstCapacity` parameter in `LZ4_compress*()`.
+ When this size is < LZ4_COMPRESSBOUND(inputSize), then compression can fail,
+ in which case, the return code will be 0 (zero).
+ The caller must be ready for these cases to happen,
+ and typically design a backup scheme to send data uncompressed.
+ The combination of both techniques can significantly reduce
+ the amount of margin required for in-place compression.
+
+ In-place compression can work in any buffer
+  whose size is >= (maxCompressedSize)
+ with maxCompressedSize == LZ4_COMPRESSBOUND(srcSize) for guaranteed compression success.
+ LZ4_COMPRESS_INPLACE_BUFFER_SIZE() depends on both maxCompressedSize and LZ4_DISTANCE_MAX,
+ so it's possible to reduce memory requirements by playing with them.
+
+</p></pre><BR>
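A hedged sketch of in-place decompression as described above: the compressed block is copied to the end of a single buffer sized with LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE(), then decoded from the start of that same buffer. It assumes decompressedSize > compressedSize and static linking (the macro lives in the experimental section).

```c
#include <stdlib.h>
#include <string.h>
#define LZ4_STATIC_LINKING_ONLY   /* the in-place macros live in the experimental section */
#include "lz4.h"

/* Sketch: decompress in place; returns a malloc'd buffer whose first
 * 'decompressedSize' bytes hold the regenerated data, or NULL on failure. */
static char* decompress_in_place(const char* compressed, int compressedSize, int decompressedSize)
{
    size_t const bufSize = LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE(decompressedSize);
    char* const buffer = malloc(bufSize);
    if (buffer == NULL) return NULL;

    memcpy(buffer + bufSize - (size_t)compressedSize, compressed, (size_t)compressedSize); /* input at the end */
    int const r = LZ4_decompress_safe(buffer + bufSize - (size_t)compressedSize, buffer,
                                      compressedSize, decompressedSize);   /* output at the start */
    if (r != decompressedSize) { free(buffer); return NULL; }
    return buffer;
}
```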
-<pre><b>typedef struct {
- const uint8_t* externalDict;
- size_t extDictSize;
- const uint8_t* prefixEnd;
- size_t prefixSize;
-} LZ4_streamDecode_t_internal;
+<pre><b>#define LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE(decompressedSize) ((decompressedSize) + LZ4_DECOMPRESS_INPLACE_MARGIN(decompressedSize)) </b>/**< note: presumes that compressedSize < decompressedSize. note2: margin is overestimated a bit, since it could use compressedSize instead */<b>
</b></pre><BR>
+<pre><b>#define LZ4_COMPRESS_INPLACE_BUFFER_SIZE(maxCompressedSize) ((maxCompressedSize) + LZ4_COMPRESS_INPLACE_MARGIN) </b>/**< maxCompressedSize is generally LZ4_COMPRESSBOUND(inputSize), but can be set to any lower value, with the risk that compression can fail (return code 0(zero)) */<b>
+</b></pre><BR>
+<a name="Chapter9"></a><h2>Private Definitions</h2><pre>
+ Do not use these definitions directly.
+ They are only exposed to allow static allocation of `LZ4_stream_t` and `LZ4_streamDecode_t`.
+ Accessing members will expose user code to API and/or ABI break in future versions of the library.
+<BR></pre>
+
<pre><b>typedef struct {
- const unsigned char* externalDict;
+ const LZ4_byte* externalDict;
size_t extDictSize;
- const unsigned char* prefixEnd;
+ const LZ4_byte* prefixEnd;
size_t prefixSize;
} LZ4_streamDecode_t_internal;
</b></pre><BR>
-<pre><b>#define LZ4_STREAMSIZE_U64 ((1 << (LZ4_MEMORY_USAGE-3)) + 4)
-#define LZ4_STREAMSIZE (LZ4_STREAMSIZE_U64 * sizeof(unsigned long long))
+<pre><b>#define LZ4_STREAMSIZE 16416 </b>/* static size, for inter-version compatibility */<b>
+#define LZ4_STREAMSIZE_VOIDP (LZ4_STREAMSIZE / sizeof(void*))
union LZ4_stream_u {
- unsigned long long table[LZ4_STREAMSIZE_U64];
+ void* table[LZ4_STREAMSIZE_VOIDP];
LZ4_stream_t_internal internal_donotuse;
-} ; </b>/* previously typedef'd to LZ4_stream_t */<b>
-</b><p> information structure to track an LZ4 stream.
- init this structure before first use.
- note : only use in association with static linking !
- this definition is not API/ABI safe,
- it may change in a future version !
+}; </b>/* previously typedef'd to LZ4_stream_t */<b>
+</b><p> Do not use below internal definitions directly !
+ Declare or allocate an LZ4_stream_t instead.
+ LZ4_stream_t can also be created using LZ4_createStream(), which is recommended.
+ The structure definition can be convenient for static allocation
+ (on stack, or as part of larger structure).
+ Init this structure with LZ4_initStream() before first use.
+ note : only use this definition in association with static linking !
+ this definition is not API/ABI safe, and may change in future versions.
</p></pre><BR>
-<pre><b>#define LZ4_STREAMDECODESIZE_U64 4
+<pre><b>LZ4_stream_t* LZ4_initStream (void* buffer, size_t size);
+</b><p> An LZ4_stream_t structure must be initialized at least once.
+ This is automatically done when invoking LZ4_createStream(),
+ but it's not when the structure is simply declared on stack (for example).
+
+ Use LZ4_initStream() to properly initialize a newly declared LZ4_stream_t.
+ It can also initialize any arbitrary buffer of sufficient size,
+ and will @return a pointer of proper type upon initialization.
+
+ Note : initialization fails if size and alignment conditions are not respected.
+ In which case, the function will @return NULL.
+ Note2: An LZ4_stream_t structure guarantees correct alignment and size.
+ Note3: Before v1.9.0, use LZ4_resetStream() instead
+
+</p></pre><BR>
+
+<pre><b>#define LZ4_STREAMDECODESIZE_U64 (4 + ((sizeof(void*)==16) ? 2 : 0) </b>/*AS-400*/ )<b>
#define LZ4_STREAMDECODESIZE (LZ4_STREAMDECODESIZE_U64 * sizeof(unsigned long long))
union LZ4_streamDecode_u {
unsigned long long table[LZ4_STREAMDECODESIZE_U64];
LZ4_streamDecode_t_internal internal_donotuse;
} ; </b>/* previously typedef'd to LZ4_streamDecode_t */<b>
-</b><p> information structure to track an LZ4 stream during decompression.
- init this structure using LZ4_setStreamDecode (or memset()) before first use
- note : only use in association with static linking !
- this definition is not API/ABI safe,
- and may change in a future version !
+</b><p> information structure to track an LZ4 stream during decompression.
+ init this structure using LZ4_setStreamDecode() before first use.
+ note : only use in association with static linking !
+ this definition is not API/ABI safe,
+ and may change in a future version !
</p></pre><BR>
@@ -433,25 +511,86 @@ union LZ4_streamDecode_u {
<pre><b>#ifdef LZ4_DISABLE_DEPRECATE_WARNINGS
# define LZ4_DEPRECATED(message) </b>/* disable deprecation warnings */<b>
#else
-# define LZ4_GCC_VERSION (__GNUC__ * 100 + __GNUC_MINOR__)
# if defined (__cplusplus) && (__cplusplus >= 201402) </b>/* C++14 or greater */<b>
# define LZ4_DEPRECATED(message) [[deprecated(message)]]
-# elif (LZ4_GCC_VERSION >= 405) || defined(__clang__)
-# define LZ4_DEPRECATED(message) __attribute__((deprecated(message)))
-# elif (LZ4_GCC_VERSION >= 301)
-# define LZ4_DEPRECATED(message) __attribute__((deprecated))
# elif defined(_MSC_VER)
# define LZ4_DEPRECATED(message) __declspec(deprecated(message))
+# elif defined(__clang__) || (defined(__GNUC__) && (__GNUC__ * 10 + __GNUC_MINOR__ >= 45))
+# define LZ4_DEPRECATED(message) __attribute__((deprecated(message)))
+# elif defined(__GNUC__) && (__GNUC__ * 10 + __GNUC_MINOR__ >= 31)
+# define LZ4_DEPRECATED(message) __attribute__((deprecated))
# else
-# pragma message("WARNING: You need to implement LZ4_DEPRECATED for this compiler")
-# define LZ4_DEPRECATED(message)
+# pragma message("WARNING: LZ4_DEPRECATED needs custom implementation for this compiler")
+# define LZ4_DEPRECATED(message) </b>/* disabled */<b>
# endif
#endif </b>/* LZ4_DISABLE_DEPRECATE_WARNINGS */<b>
-</b><p> Should deprecation warnings be a problem,
- it is generally possible to disable them,
- typically with -Wno-deprecated-declarations for gcc
- or _CRT_SECURE_NO_WARNINGS in Visual.
- Otherwise, it's also possible to define LZ4_DISABLE_DEPRECATE_WARNINGS
+</b><p>
+ Deprecated functions make the compiler generate a warning when invoked.
+ This is meant to invite users to update their source code.
+ Should deprecation warnings be a problem, it is generally possible to disable them,
+ typically with -Wno-deprecated-declarations for gcc
+ or _CRT_SECURE_NO_WARNINGS in Visual.
+
+ Another method is to define LZ4_DISABLE_DEPRECATE_WARNINGS
+ before including the header file.
+
+</p></pre><BR>
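For example (a sketch), per-translation-unit suppression looks like this:

```c
/* Silence deprecation warnings for this translation unit only,
 * as described above (alternative: -Wno-deprecated-declarations). */
#define LZ4_DISABLE_DEPRECATE_WARNINGS
#include "lz4.h"
```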
+
+<pre><b>LZ4_DEPRECATED("use LZ4_compress_default() instead") LZ4LIB_API int LZ4_compress (const char* src, char* dest, int srcSize);
+LZ4_DEPRECATED("use LZ4_compress_default() instead") LZ4LIB_API int LZ4_compress_limitedOutput (const char* src, char* dest, int srcSize, int maxOutputSize);
+LZ4_DEPRECATED("use LZ4_compress_fast_extState() instead") LZ4LIB_API int LZ4_compress_withState (void* state, const char* source, char* dest, int inputSize);
+LZ4_DEPRECATED("use LZ4_compress_fast_extState() instead") LZ4LIB_API int LZ4_compress_limitedOutput_withState (void* state, const char* source, char* dest, int inputSize, int maxOutputSize);
+LZ4_DEPRECATED("use LZ4_compress_fast_continue() instead") LZ4LIB_API int LZ4_compress_continue (LZ4_stream_t* LZ4_streamPtr, const char* source, char* dest, int inputSize);
+LZ4_DEPRECATED("use LZ4_compress_fast_continue() instead") LZ4LIB_API int LZ4_compress_limitedOutput_continue (LZ4_stream_t* LZ4_streamPtr, const char* source, char* dest, int inputSize, int maxOutputSize);
+</b><p></p></pre><BR>
+
+<pre><b>LZ4_DEPRECATED("use LZ4_decompress_fast() instead") LZ4LIB_API int LZ4_uncompress (const char* source, char* dest, int outputSize);
+LZ4_DEPRECATED("use LZ4_decompress_safe() instead") LZ4LIB_API int LZ4_uncompress_unknownOutputSize (const char* source, char* dest, int isize, int maxOutputSize);
+</b><p></p></pre><BR>
+
+<pre><b>LZ4_DEPRECATED("use LZ4_decompress_safe_usingDict() instead") LZ4LIB_API int LZ4_decompress_safe_withPrefix64k (const char* src, char* dst, int compressedSize, int maxDstSize);
+LZ4_DEPRECATED("use LZ4_decompress_fast_usingDict() instead") LZ4LIB_API int LZ4_decompress_fast_withPrefix64k (const char* src, char* dst, int originalSize);
+</b><p></p></pre><BR>
+
+<pre><b>LZ4_DEPRECATED("This function is deprecated and unsafe. Consider using LZ4_decompress_safe() instead")
+int LZ4_decompress_fast (const char* src, char* dst, int originalSize);
+LZ4_DEPRECATED("This function is deprecated and unsafe. Consider using LZ4_decompress_safe_continue() instead")
+int LZ4_decompress_fast_continue (LZ4_streamDecode_t* LZ4_streamDecode, const char* src, char* dst, int originalSize);
+LZ4_DEPRECATED("This function is deprecated and unsafe. Consider using LZ4_decompress_safe_usingDict() instead")
+int LZ4_decompress_fast_usingDict (const char* src, char* dst, int originalSize, const char* dictStart, int dictSize);
+</b><p> These functions used to be faster than LZ4_decompress_safe(),
+ but this is no longer the case. They are now slower.
+ This is because LZ4_decompress_fast() doesn't know the input size,
+ and therefore must progress more cautiously into the input buffer to not read beyond the end of block.
+ On top of that `LZ4_decompress_fast()` is not protected vs malformed or malicious inputs, making it a security liability.
+ As a consequence, LZ4_decompress_fast() is strongly discouraged, and deprecated.
+
+ The last remaining LZ4_decompress_fast() specificity is that
+ it can decompress a block without knowing its compressed size.
+ Such functionality can be achieved in a more secure manner
+ by employing LZ4_decompress_safe_partial().
+
+ Parameters:
+ originalSize : is the uncompressed size to regenerate.
+ `dst` must be already allocated, its size must be >= 'originalSize' bytes.
+ @return : number of bytes read from source buffer (== compressed size).
+ The function expects to finish at block's end exactly.
+ If the source stream is detected malformed, the function stops decoding and returns a negative result.
+ note : LZ4_decompress_fast*() requires originalSize. Thanks to this information, it never writes past the output buffer.
+ However, since it doesn't know its 'src' size, it may read an unknown amount of input, past input buffer bounds.
+ Also, since match offsets are not validated, match reads from 'src' may underflow too.
+ These issues never happen if input (compressed) data is correct.
+ But they may happen if input data is invalid (error or intentional tampering).
+ As a consequence, use these functions in trusted environments with trusted data **only**.
+
+</p></pre><BR>
+
+<pre><b>void LZ4_resetStream (LZ4_stream_t* streamPtr);
+</b><p> An LZ4_stream_t structure must be initialized at least once.
+ This is done with LZ4_initStream(), or LZ4_resetStream().
+ Consider switching to LZ4_initStream(),
+ invoking LZ4_resetStream() will trigger deprecation warnings in the future.
+
</p></pre><BR>
</html>
diff --git a/doc/lz4frame_manual.html b/doc/lz4frame_manual.html
index fb8e0ce..2758306 100644
--- a/doc/lz4frame_manual.html
+++ b/doc/lz4frame_manual.html
@@ -1,10 +1,10 @@
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
-<title>1.8.3 Manual</title>
+<title>1.9.3 Manual</title>
</head>
<body>
-<h1>1.8.3 Manual</h1>
+<h1>1.9.3 Manual</h1>
<hr>
<a name="Contents"></a><h2>Contents</h2>
<ol>
@@ -84,19 +84,21 @@
LZ4F_blockChecksum_t blockChecksumFlag; </b>/* 1: each block followed by a checksum of block's compressed data; 0: disabled (default) */<b>
} LZ4F_frameInfo_t;
</b><p> makes it possible to set or read frame parameters.
- It's not required to set all fields, as long as the structure was initially memset() to zero.
- For all fields, 0 sets it to default value
+ The structure must first be initialized to 0, using memset() or LZ4F_INIT_FRAMEINFO,
+ setting all parameters to default.
+ It's then possible to selectively update some parameters.
</p></pre><BR>
<pre><b>typedef struct {
LZ4F_frameInfo_t frameInfo;
int compressionLevel; </b>/* 0: default (fast mode); values > LZ4HC_CLEVEL_MAX count as LZ4HC_CLEVEL_MAX; values < 0 trigger "fast acceleration" */<b>
- unsigned autoFlush; </b>/* 1: always flush, to reduce usage of internal buffers */<b>
- unsigned favorDecSpeed; </b>/* 1: parser favors decompression speed vs compression ratio. Only works for high compression modes (>= LZ4LZ4HC_CLEVEL_OPT_MIN) */ /* >= v1.8.2 */<b>
+ unsigned autoFlush; </b>/* 1: always flush; reduces usage of internal buffers */<b>
+ unsigned favorDecSpeed; </b>/* 1: parser favors decompression speed vs compression ratio. Only works for high compression modes (>= LZ4HC_CLEVEL_OPT_MIN) */ /* v1.8.2+ */<b>
unsigned reserved[3]; </b>/* must be zero for forward compatibility */<b>
} LZ4F_preferences_t;
-</b><p> makes it possible to supply detailed compression parameters to the stream interface.
- Structure is presumed initially memset() to zero, representing default settings.
+</b><p> makes it possible to supply advanced compression instructions to the streaming interface.
+ The structure must first be initialized to 0, using memset() or LZ4F_INIT_PREFERENCES,
+ setting all parameters to default.
All reserved fields must be set to zero.
</p></pre><BR>
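A hedged sketch of the zero-init-then-override pattern described above, applied to one-shot frame compression; the chosen fields and the failure convention are arbitrary, and for guaranteed success dstCapacity should be >= LZ4F_compressFrameBound(srcSize, &prefs).

```c
#include <string.h>
#include "lz4frame.h"

/* Sketch: one-shot frame compression with explicit preferences. */
static size_t compress_frame(const void* src, size_t srcSize, void* dst, size_t dstCapacity)
{
    LZ4F_preferences_t prefs;
    memset(&prefs, 0, sizeof(prefs));                 /* all parameters at default */
    prefs.frameInfo.contentChecksumFlag = LZ4F_contentChecksumEnabled;
    prefs.compressionLevel = 0;                       /* default fast mode */

    size_t const r = LZ4F_compressFrame(dst, dstCapacity, src, srcSize, &prefs);
    return LZ4F_isError(r) ? 0 : r;                   /* 0 signals failure in this sketch */
}
```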
@@ -155,15 +157,19 @@ LZ4F_errorCode_t LZ4F_freeCompressionContext(LZ4F_cctx* cctx);
</p></pre><BR>
<pre><b>size_t LZ4F_compressBound(size_t srcSize, const LZ4F_preferences_t* prefsPtr);
-</b><p> Provides minimum dstCapacity required to guarantee compression success
- given a srcSize and preferences, covering worst case scenario.
+</b><p> Provides minimum dstCapacity required to guarantee success of
+ LZ4F_compressUpdate(), given a srcSize and preferences, for a worst case scenario.
+ When srcSize==0, LZ4F_compressBound() provides an upper bound for LZ4F_flush() and LZ4F_compressEnd() instead.
+ Note that the result is only valid for a single invocation of LZ4F_compressUpdate().
+ When invoking LZ4F_compressUpdate() multiple times,
+ if the output buffer is gradually filled up instead of emptied and re-used from its start,
+ one must check if there is enough remaining capacity before each invocation, using LZ4F_compressBound().
+ @return is always the same for a srcSize and prefsPtr.
prefsPtr is optional : when NULL is provided, preferences will be set to cover worst case scenario.
- Estimation is valid for either LZ4F_compressUpdate(), LZ4F_flush() or LZ4F_compressEnd(),
- Estimation includes the possibility that internal buffer might already be filled by up to (blockSize-1) bytes.
- It also includes frame footer (ending + checksum), which would have to be generated by LZ4F_compressEnd().
- Estimation doesn't include frame header, as it was already generated by LZ4F_compressBegin().
- Result is always the same for a srcSize and prefsPtr, so it can be trusted to size reusable buffers.
- When srcSize==0, LZ4F_compressBound() provides an upper bound for LZ4F_flush() and LZ4F_compressEnd() operations.
+ tech details :
+ @return if automatic flushing is not enabled, includes the possibility that internal buffer might already be filled by up to (blockSize-1) bytes.
+ It also includes frame footer (ending + checksum), since it might be generated by LZ4F_compressEnd().
+ @return doesn't include frame header, as it was already generated by LZ4F_compressBegin().
</p></pre><BR>
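A hedged sketch of sizing a re-used output buffer with LZ4F_compressBound() in a streaming compressor, as described above; FILE*-based I/O and the 64 KB chunk size are arbitrary choices, and the output buffer is written out (emptied) after every call, matching the single-invocation validity of the bound.

```c
#include <stdio.h>
#include <stdlib.h>
#include "lz4frame.h"

#define CHUNK (64 * 1024)   /* arbitrary read size for this sketch */

static int compress_stream(FILE* in, FILE* out)
{
    LZ4F_cctx* cctx;
    if (LZ4F_isError(LZ4F_createCompressionContext(&cctx, LZ4F_VERSION))) return -1;

    size_t const dstCapacity = LZ4F_compressBound(CHUNK, NULL);   /* worst case for one update */
    char* const dst = malloc(dstCapacity);
    char src[CHUNK];
    int result = -1;

    size_t n = LZ4F_compressBegin(cctx, dst, dstCapacity, NULL);  /* writes the frame header */
    if (LZ4F_isError(n)) goto cleanup;
    fwrite(dst, 1, n, out);

    size_t readSize;
    while ((readSize = fread(src, 1, CHUNK, in)) > 0) {
        n = LZ4F_compressUpdate(cctx, dst, dstCapacity, src, readSize, NULL);
        if (LZ4F_isError(n)) goto cleanup;
        fwrite(dst, 1, n, out);                                   /* buffer re-used from its start */
    }
    n = LZ4F_compressEnd(cctx, dst, dstCapacity, NULL);           /* EndMark + optional checksum */
    if (LZ4F_isError(n)) goto cleanup;
    fwrite(dst, 1, n, out);
    result = 0;

cleanup:
    free(dst);
    LZ4F_freeCompressionContext(cctx);
    return result;
}
```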
@@ -192,6 +198,7 @@ LZ4F_errorCode_t LZ4F_freeCompressionContext(LZ4F_cctx* cctx);
`cOptPtr` is optional : it's possible to provide NULL, all options will be set to default.
@return : nb of bytes written into dstBuffer (can be zero, when there is no data stored within cctx)
or an error code if it fails (which can be tested using LZ4F_isError())
+ Note : LZ4F_flush() is guaranteed to be successful when dstCapacity >= LZ4F_compressBound(0, prefsPtr).
</p></pre><BR>
@@ -204,6 +211,7 @@ LZ4F_errorCode_t LZ4F_freeCompressionContext(LZ4F_cctx* cctx);
`cOptPtr` is optional : NULL can be provided, in which case all options will be set to default.
@return : nb of bytes written into dstBuffer, necessarily >= 4 (endMark),
or an error code if it fails (which can be tested using LZ4F_isError())
+ Note : LZ4F_compressEnd() is guaranteed to be successful when dstCapacity >= LZ4F_compressBound(0, prefsPtr).
A successful call to LZ4F_compressEnd() makes `cctx` available again for another compression task.
</p></pre><BR>
@@ -229,25 +237,58 @@ LZ4F_errorCode_t LZ4F_freeDecompressionContext(LZ4F_dctx* dctx);
<a name="Chapter10"></a><h2>Streaming decompression functions</h2><pre></pre>
+<pre><b>size_t LZ4F_headerSize(const void* src, size_t srcSize);
+</b><p> Provide the header size of a frame starting at `src`.
+ `srcSize` must be >= LZ4F_MIN_SIZE_TO_KNOW_HEADER_LENGTH,
+ which is enough to decode the header length.
+ @return : size of frame header
+ or an error code, which can be tested using LZ4F_isError()
+ note : Frame header size is variable, but is guaranteed to be
+ >= LZ4F_HEADER_SIZE_MIN bytes, and <= LZ4F_HEADER_SIZE_MAX bytes.
+
+</p></pre><BR>
+
<pre><b>size_t LZ4F_getFrameInfo(LZ4F_dctx* dctx,
LZ4F_frameInfo_t* frameInfoPtr,
const void* srcBuffer, size_t* srcSizePtr);
</b><p> This function extracts frame parameters (max blockSize, dictID, etc.).
- Its usage is optional.
- Extracted information is typically useful for allocation and dictionary.
- This function works in 2 situations :
- - At the beginning of a new frame, in which case
- it will decode information from `srcBuffer`, starting the decoding process.
- Input size must be large enough to successfully decode the entire frame header.
- Frame header size is variable, but is guaranteed to be <= LZ4F_HEADER_SIZE_MAX bytes.
- It's allowed to provide more input data than this minimum.
- - After decoding has been started.
- In which case, no input is read, frame parameters are extracted from dctx.
- - If decoding has barely started, but not yet extracted information from header,
+ Its usage is optional: user can call LZ4F_decompress() directly.
+
+ Extracted information will fill an existing LZ4F_frameInfo_t structure.
+ This can be useful for allocation and dictionary identification purposes.
+
+ LZ4F_getFrameInfo() can work in the following situations :
+
+ 1) At the beginning of a new frame, before any invocation of LZ4F_decompress().
+ It will decode header from `srcBuffer`,
+ consuming the header and starting the decoding process.
+
+ Input size must be large enough to contain the full frame header.
+ Frame header size can be known beforehand by LZ4F_headerSize().
+ Frame header size is variable, but is guaranteed to be >= LZ4F_HEADER_SIZE_MIN bytes,
+  and <= LZ4F_HEADER_SIZE_MAX bytes.
+ Hence, blindly providing LZ4F_HEADER_SIZE_MAX bytes or more will always work.
+ It's allowed to provide more input data than the header size,
+ LZ4F_getFrameInfo() will only consume the header.
+
+ If input size is not large enough,
+  i.e. if it's smaller than the header size,
+  the function will fail and return an error code.
+
+ 2) After decoding has been started,
+ it's possible to invoke LZ4F_getFrameInfo() anytime
+ to extract already decoded frame parameters stored within dctx.
+
+ Note that, if decoding has barely started,
+ and not yet read enough information to decode the header,
LZ4F_getFrameInfo() will fail.
- The number of bytes consumed from srcBuffer will be updated within *srcSizePtr (necessarily <= original value).
- Decompression must resume from (srcBuffer + *srcSizePtr).
- @return : an hint about how many srcSize bytes LZ4F_decompress() expects for next call,
+
+ The number of bytes consumed from srcBuffer will be updated in *srcSizePtr (necessarily <= original value).
+ LZ4F_getFrameInfo() only consumes bytes when decoding has not yet started,
+ and when decoding the header has been successful.
+ Decompression must then resume from (srcBuffer + *srcSizePtr).
+
+ @return : a hint about how many srcSize bytes LZ4F_decompress() expects for next call,
or an error code which can be tested using LZ4F_isError().
note 1 : in case of error, dctx is not modified. Decoding operation can resume from beginning safely.
note 2 : frame parameters are *copied into* an already allocated LZ4F_frameInfo_t structure.
@@ -258,8 +299,10 @@ LZ4F_errorCode_t LZ4F_freeDecompressionContext(LZ4F_dctx* dctx);
void* dstBuffer, size_t* dstSizePtr,
const void* srcBuffer, size_t* srcSizePtr,
const LZ4F_decompressOptions_t* dOptPtr);
-</b><p> Call this function repetitively to regenerate compressed data from `srcBuffer`.
- The function will read up to *srcSizePtr bytes from srcBuffer,
+</b><p> Call this function repetitively to regenerate data compressed in `srcBuffer`.
+
+ The function requires a valid dctx state.
+ It will read up to *srcSizePtr bytes from srcBuffer,
and decompress data into dstBuffer, of capacity *dstSizePtr.
The nb of bytes consumed from srcBuffer will be written into *srcSizePtr (necessarily <= original value).
@@ -295,13 +338,14 @@ LZ4F_errorCode_t LZ4F_freeDecompressionContext(LZ4F_dctx* dctx);
and start a new one using same context resources.
</p></pre><BR>
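A hedged sketch of the repetitive-call pattern described above; buffer sizes are arbitrary, and a real decoder could size its output buffer from LZ4F_getFrameInfo() instead.

```c
#include <stdio.h>
#include "lz4frame.h"

/* Sketch: decode a frame from 'in' into 'out', re-using fixed buffers. */
static int decompress_stream(FILE* in, FILE* out)
{
    LZ4F_dctx* dctx;
    if (LZ4F_isError(LZ4F_createDecompressionContext(&dctx, LZ4F_VERSION))) return -1;

    char srcBuf[16 * 1024];
    char dstBuf[64 * 1024];
    size_t ret = 1;            /* hint from LZ4F_decompress(); 0 means the frame is complete */
    int result = 0;

    size_t readSize;
    while (ret != 0 && (readSize = fread(srcBuf, 1, sizeof(srcBuf), in)) > 0) {
        const char* srcPtr = srcBuf;
        const char* const srcEnd = srcBuf + readSize;
        while (srcPtr < srcEnd && ret != 0) {
            size_t dstSize = sizeof(dstBuf);
            size_t srcSize = (size_t)(srcEnd - srcPtr);
            ret = LZ4F_decompress(dctx, dstBuf, &dstSize, srcPtr, &srcSize, NULL);
            if (LZ4F_isError(ret)) { result = -1; goto cleanup; }
            fwrite(dstBuf, 1, dstSize, out);   /* dstSize = bytes regenerated by this call */
            srcPtr += srcSize;                 /* srcSize = bytes consumed by this call */
        }
    }
cleanup:
    LZ4F_freeDecompressionContext(dctx);
    return result;
}
```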
-<pre><b>typedef enum { LZ4F_LIST_ERRORS(LZ4F_GENERATE_ENUM) } LZ4F_errorCodes;
+<pre><b>typedef enum { LZ4F_LIST_ERRORS(LZ4F_GENERATE_ENUM)
+ _LZ4F_dummy_error_enum_for_c89_never_used } LZ4F_errorCodes;
</b></pre><BR>
<a name="Chapter11"></a><h2>Bulk processing dictionary API</h2><pre></pre>
<pre><b>LZ4FLIB_STATIC_API LZ4F_CDict* LZ4F_createCDict(const void* dictBuffer, size_t dictSize);
LZ4FLIB_STATIC_API void LZ4F_freeCDict(LZ4F_CDict* CDict);
-</b><p> When compressing multiple messages / blocks with the same dictionary, it's recommended to load it just once.
+</b><p> When compressing multiple messages / blocks using the same dictionary, it's recommended to load it just once.
LZ4_createCDict() will create a digested dictionary, ready to start future compression operations without startup delay.
LZ4_CDict can be created once and shared by multiple threads concurrently, since its usage is read-only.
`dictBuffer` can be released after LZ4_CDict creation, since its content is copied within CDict