..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2017-2018 Cavium Networks.

Compression Device Library
===========================

The compression framework provides a generic set of APIs to perform compression services,
as well as to query and configure compression devices, both physical (hardware) and
virtual (software), to perform those services. The framework currently supports only
lossless compression schemes: Deflate and LZS.

Device Management
-----------------

Device Creation
~~~~~~~~~~~~~~~

Physical compression devices are discovered during the bus probe performed by the EAL
at DPDK initialization, based on their unique device identifier.
For example, PCI devices can be identified using their PCI BDF (bus/bridge, device, function).
Specific physical compression devices, like other physical devices in DPDK, can be
white-listed or black-listed using the EAL command line options.

Virtual devices can be created by two mechanisms: either using the EAL command
line options, or from within the application using an EAL API directly.

From the command line, using the ``--vdev`` EAL option:

.. code-block:: console

   --vdev '<pmd name>,socket_id=0'

.. Note::

   * If a DPDK application requires multiple software compression PMD devices, then the
     required number of ``--vdev`` options, with the appropriate libraries, must be added.

   * An application with multiple compression device instances exposed by the same PMD must
     specify a unique name for each device.

   Example: ``--vdev 'pmd0' --vdev 'pmd1'``

Or, by using the ``rte_vdev_init`` API within the application code:

.. code-block:: c

   rte_vdev_init("<pmd_name>", "socket_id=0");

All virtual compression devices support the following initialization parameters:

* ``socket_id`` - the socket on which to allocate the device resources.

Device Identification
~~~~~~~~~~~~~~~~~~~~~

Each device, whether virtual or physical, is uniquely designated by two
identifiers:

- A unique device index used to designate the compression device in all functions
  exported by the compressdev API.

- A device name used to designate the compression device in console messages, for
  administration or debugging purposes.

Device Configuration
~~~~~~~~~~~~~~~~~~~~

The configuration of each compression device includes the following operations:

- Allocation of resources, including hardware resources if it is a physical device.
- Resetting the device into a well-known default state.
- Initialization of statistics counters.

The ``rte_compressdev_configure`` API is used to configure a compression device.

The ``rte_compressdev_config`` structure is used to pass the configuration
parameters.

See *DPDK API Reference* for details.

Configuration of Queue Pairs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Each compression device queue pair is individually configured through the
``rte_compressdev_queue_pair_setup`` API.

The ``max_inflight_ops`` parameter specifies the maximum number of
``rte_comp_op`` operations that can be present in a queue at a time.
The PMD can then allocate resources accordingly on the specified socket.

See *DPDK API Reference* for details.

Logical Cores, Memory and Queues Pair Relationships
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The library supports NUMA in the same way as described in the Cryptodev library section.

A queue pair cannot be shared, and should be used exclusively by a single processing
context for enqueuing operations or for dequeuing operations on the same compression device,
since sharing would require global locks and hinder performance. It is, however, possible
to use a different logical core to dequeue an operation on a queue pair from the logical
core on which it was enqueued. This means that the compression burst enqueue/dequeue
APIs are a logical place to transition from one logical core to another in a
data processing pipeline.

Device Features and Capabilities
---------------------------------

Compression devices define their functionality through two mechanisms: global device
features and algorithm features. Global device features identify device-wide
features which are applicable to the whole device, such as supported hardware
acceleration and CPU features. The list of compression device features can be seen in the
RTE_COMPRESSDEV_FF_XXX macros.

The algorithm features list the individual features which a device supports per algorithm,
such as stateful compression/decompression, checksum operations, etc. The list of algorithm
features can be seen in the RTE_COMP_FF_XXX macros.

Capabilities
~~~~~~~~~~~~

Each PMD has a list of capabilities, including the algorithms listed in
enum ``rte_comp_algorithm``, the feature flags associated with each, and the
sliding window range expressed as a log base 2 value. The sliding window range gives
the minimum and maximum size of the lookup window that the algorithm uses
to find duplicates.

See *DPDK API Reference* for details.

Each compression poll mode driver defines its array of capabilities
for each algorithm it supports. See the PMD implementation for capability
initialization.

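As a small illustration of the log base 2 convention, the sketch below converts a
window value to a size in bytes and checks a requested value against a capability range.
The ``window_range`` structure and both helper names are hypothetical stand-ins written
for this example; they are not the DPDK capability structures.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical stand-in for a capability's window range (log2 values). */
    struct window_range {
        uint8_t min; /* smallest supported window, as log2(bytes) */
        uint8_t max; /* largest supported window, as log2(bytes) */
    };

    /* Convert a log2 window value into a size in bytes, e.g. 15 -> 32768. */
    static size_t window_bytes(uint8_t window_log2)
    {
        return (size_t)1 << window_log2;
    }

    /* Return true if the requested log2 window lies inside the range. */
    static bool window_supported(const struct window_range *r, uint8_t window_log2)
    {
        return window_log2 >= r->min && window_log2 <= r->max;
    }

With a range of 8 to 15, a requested value of 15 corresponds to a 32 KB window and is
accepted, while 16 is rejected.
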
Capabilities Discovery
~~~~~~~~~~~~~~~~~~~~~~

PMD capabilities and features are discovered via the ``rte_compressdev_info_get`` function.

The ``rte_compressdev_info`` structure contains all the relevant information for the device.

See *DPDK API Reference* for details.

Compression Operation
----------------------

DPDK compression supports two types of compression methodologies:

- Stateless: the data associated with a compression operation is compressed without any
  reference to another compression operation.

- Stateful: the data in each compression operation is compressed with reference to previous
  compression operations in the same data stream, i.e. a history of data is maintained
  between the operations.

For more explanation, please refer to RFC 1951: https://www.ietf.org/rfc/rfc1951.txt

Operation Representation
~~~~~~~~~~~~~~~~~~~~~~~~

A compression operation is described via ``struct rte_comp_op``, which contains both input and
output data. The operation structure includes the operation type (stateless or stateful),
the operation status, the priv_xform/stream handle, and the source, destination and checksum
buffer pointers. It also contains the source mempool from which the operation is allocated.
The PMD updates the ``consumed`` field with the amount of data read from the source buffer and
the ``produced`` field with the amount of data written into the destination buffer, along with
the status of the operation. See section *Produced, Consumed And Operation Status* for more details.

The compression operations mempool can also allocate private memory with each
operation for the application's purposes. Application software is responsible for specifying
all the operation-specific fields in the ``rte_comp_op`` structure, which are then used
by the compression PMD to process the requested operation.

Operation Management and Allocation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The compressdev library provides an API set for managing compression operations, which
utilizes the Mempool Library to allocate operation buffers. This ensures
that compression operations are interleaved optimally across the channels and
ranks for optimal processing.

A ``rte_comp_op`` contains a field indicating the pool it originated from.

``rte_comp_op_alloc()`` and ``rte_comp_op_bulk_alloc()`` are used to allocate
compression operations from a given compression operation mempool.
Each operation is reset before being returned to the user, so that it
is always in a known good state before use by the application.

``rte_comp_op_free()`` is called by the application to return an operation to
its allocating pool.

See *DPDK API Reference* for details.

Passing source data as mbuf-chain
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If the input data is scattered across several different buffers, the
application can either parse through all such buffers, build one
mbuf-chain and enqueue it for processing or, alternatively, it can
make multiple sequential enqueue_burst() calls, one for each buffer,
processing them statefully. See *Compression API Stateful Operation*
for stateful processing of ops.

Operation Status
~~~~~~~~~~~~~~~~

Each operation carries status information that is updated by the PMD after it is processed.
The following statuses are currently supported:

- ``RTE_COMP_OP_STATUS_SUCCESS``:
  the operation completed successfully.

- ``RTE_COMP_OP_STATUS_NOT_PROCESSED``:
  the operation has not yet been processed by the device.

- ``RTE_COMP_OP_STATUS_INVALID_ARGS``:
  the operation failed due to invalid arguments in the request.

- ``RTE_COMP_OP_STATUS_ERROR``:
  the operation failed because of an internal error.

- ``RTE_COMP_OP_STATUS_INVALID_STATE``:
  the operation was invoked in an invalid state.

- ``RTE_COMP_OP_STATUS_OUT_OF_SPACE_TERMINATED``:
  the output buffer ran out of space during processing. This is an error case;
  the PMD cannot continue from here.

- ``RTE_COMP_OP_STATUS_OUT_OF_SPACE_RECOVERABLE``:
  the output buffer ran out of space before the operation completed, but this
  is not an error case. Output data up to op.produced can be used, and the
  next op in the stream should continue on from op.consumed+1.

Produced, Consumed And Operation Status
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- If status is ``RTE_COMP_OP_STATUS_SUCCESS``:

  - consumed = amount of data read from the input buffer, and
  - produced = amount of data written into the destination buffer.

- If status is a failure status such as ``RTE_COMP_OP_STATUS_ERROR``:

  - consumed = produced = 0 or undefined.

- If status is ``RTE_COMP_OP_STATUS_OUT_OF_SPACE_TERMINATED``:

  - consumed = 0, and
  - produced = usually 0, but in decompression cases a PMD may return > 0,
    i.e. the amount of data successfully produced until the out of space condition
    was hit. The application can consume the output data in this case, if required.

- If status is ``RTE_COMP_OP_STATUS_OUT_OF_SPACE_RECOVERABLE``:

  - consumed = amount of data read, and
  - produced = amount of data successfully produced until the
    out of space condition was hit. The PMD has the ability to recover
    from here, so the application can submit the next op from
    consumed+1 with a destination buffer that has available space.

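The rules above can be summarised as a small decision function. The sketch below is
self-contained and does not include the DPDK headers: the ``ex_status`` values are local
stand-ins that mirror the status names listed in this guide, and the ``ex_action`` values
and ``next_action()`` helper are hypothetical names introduced only for illustration.

.. code-block:: c

    /* Local stand-ins mirroring the RTE_COMP_OP_STATUS_* names in this guide. */
    enum ex_status {
        EX_SUCCESS,
        EX_NOT_PROCESSED,
        EX_INVALID_ARGS,
        EX_ERROR,
        EX_INVALID_STATE,
        EX_OUT_OF_SPACE_TERMINATED,
        EX_OUT_OF_SPACE_RECOVERABLE
    };

    /* What an application might do next with an op in a given state. */
    enum ex_action {
        EX_ACT_DONE,         /* consumed and produced are both valid */
        EX_ACT_WAIT,         /* the op has not been processed yet */
        EX_ACT_RESUBMIT_ALL, /* retry the whole input with a larger output buffer */
        EX_ACT_CONTINUE,     /* submit the remaining input with a fresh output buffer */
        EX_ACT_FAIL          /* consumed and produced must not be relied upon */
    };

    static enum ex_action next_action(enum ex_status status)
    {
        switch (status) {
        case EX_SUCCESS:
            return EX_ACT_DONE;
        case EX_NOT_PROCESSED:
            return EX_ACT_WAIT;
        case EX_OUT_OF_SPACE_TERMINATED:
            return EX_ACT_RESUBMIT_ALL;
        case EX_OUT_OF_SPACE_RECOVERABLE:
            return EX_ACT_CONTINUE;
        default:
            return EX_ACT_FAIL;
        }
    }

Whether resubmitting after a terminated out-of-space condition is worthwhile is an
application decision; the mapping here only reflects what the status values permit.
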
Transforms
----------

Compression transforms (``rte_comp_xform``) are the mechanism used
to specify the details of the compression operation, such as the algorithm,
window size and checksum.

Compression API Hash support
----------------------------

The compression API allows an application to enable digest calculation
alongside compression and decompression of data. A PMD reflects its
support for hash algorithms via the capability algorithm feature flags.
If supported, the PMD always calculates the digest on plaintext, i.e.
before compression and after decompression.

The currently supported hash algorithms are SHA-1 and, from the SHA2 family,
SHA256.

See *DPDK API Reference* for details.

If required, the application should set a valid hash algorithm in the compress
or decompress xforms during ``rte_compressdev_stream_create()``
or ``rte_compressdev_private_xform_create()``, and pass a valid
output buffer in the ``rte_comp_op`` hash field to store the
resulting digest. The buffer passed should be contiguous and large
enough to store the digest, which is 20 bytes for SHA-1 and
32 bytes for SHA2-256.

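The digest sizes quoted above can be captured in a helper that an application might use
to validate its digest buffer before enqueuing. This is a minimal sketch: the
``ex_hash_algo`` enum is a local stand-in for the hash algorithm selection, and
``digest_len()`` and ``digest_buf_ok()`` are hypothetical helpers, not DPDK functions.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Local stand-in for the hash algorithm chosen in the xform. */
    enum ex_hash_algo { EX_HASH_NONE, EX_HASH_SHA1, EX_HASH_SHA2_256 };

    /* Digest length in bytes: 20 for SHA-1, 32 for SHA2-256, 0 if disabled. */
    static size_t digest_len(enum ex_hash_algo algo)
    {
        switch (algo) {
        case EX_HASH_SHA1:
            return 20;
        case EX_HASH_SHA2_256:
            return 32;
        default:
            return 0;
        }
    }

    /* Return true if a contiguous buffer of buf_len bytes can hold the digest. */
    static bool digest_buf_ok(enum ex_hash_algo algo, size_t buf_len)
    {
        return buf_len >= digest_len(algo);
    }
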
Compression API Stateless operation
------------------------------------

An op is processed statelessly if it has:

- op_type set to RTE_COMP_OP_STATELESS
- flush value set to RTE_COMP_FLUSH_FULL or RTE_COMP_FLUSH_FINAL
  (required only on the compression side),
- all required input in the source buffer

When all of the above conditions are met, the PMD initiates stateless processing
and releases acquired resources after processing of the current operation is
complete. The application can enqueue multiple stateless ops in a single burst
and must attach a priv_xform handle to each such op.

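An application could check the first two conditions, together with the priv_xform
requirement, before enqueuing. The sketch below does so on a simplified, hypothetical
``ex_op`` structure; the types and the ``stateless_op_ok()`` helper are illustrative
stand-ins and do not use the real ``rte_comp_op`` layout.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    enum ex_op_type { EX_OP_STATELESS, EX_OP_STATEFUL };
    enum ex_flush { EX_FLUSH_NONE, EX_FLUSH_SYNC, EX_FLUSH_FULL, EX_FLUSH_FINAL };

    /* Simplified, hypothetical view of an op for this example only. */
    struct ex_op {
        enum ex_op_type op_type;
        enum ex_flush flush;
        void *priv_xform; /* handle that stateless ops must carry */
        bool is_compress; /* true for compression, false for decompression */
    };

    /* Return true if the op satisfies the stateless conditions listed above. */
    static bool stateless_op_ok(const struct ex_op *op)
    {
        if (op->op_type != EX_OP_STATELESS || op->priv_xform == NULL)
            return false;
        /* The flush requirement applies only on the compression side. */
        if (op->is_compress)
            return op->flush == EX_FLUSH_FULL || op->flush == EX_FLUSH_FINAL;
        return true;
    }

The third condition, that all required input is present in the source buffer, depends
on the data itself and is not modelled here.
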
priv_xform in Stateless operation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

priv_xform is private data, managed internally by the PMD, that it maintains to do stateless processing.
A priv_xform is initialized from a generic xform structure provided by the application in a call
to ``rte_compressdev_private_xform_create``; as output, the PMD returns an opaque priv_xform reference.
If the PMD supports SHAREABLE priv_xforms, as indicated via the algorithm feature flag, then the application
can attach the same priv_xform to many stateless ops at a time. If not, then the application needs to
create as many priv_xforms as it expects to have stateless operations in flight.

.. figure:: img/stateless-op.*

   Stateless Ops using Non-Shareable priv_xform


.. figure:: img/stateless-op-shared.*

   Stateless Ops using Shareable priv_xform


The application should call ``rte_compressdev_private_xform_create()`` and attach the result to each
stateless op before enqueuing it for processing, and free it via ``rte_compressdev_private_xform_free()``
during termination.

An example pseudocode to set up and process NUM_OPS stateless ops, each of length OP_LEN,
using priv_xform would look like:

.. code-block:: c

    /*
     * pseudocode for stateless compression
     */

    uint8_t cdev_id = rte_compressdev_get_dev_id(<pmd name>);

    /* configure the device. */
    if (rte_compressdev_configure(cdev_id, &conf) < 0)
        rte_exit(EXIT_FAILURE, "Failed to configure compressdev %u\n", cdev_id);

    if (rte_compressdev_queue_pair_setup(cdev_id, 0, NUM_MAX_INFLIGHT_OPS,
                            rte_socket_id()) < 0)
        rte_exit(EXIT_FAILURE, "Failed to setup queue pair\n");

    if (rte_compressdev_start(cdev_id) < 0)
        rte_exit(EXIT_FAILURE, "Failed to start device\n");

    /* setup compress transform */
    struct rte_comp_xform compress_xform = {
        .type = RTE_COMP_COMPRESS,
        .compress = {
            .algo = RTE_COMP_ALGO_DEFLATE,
            .deflate = {
                .huffman = RTE_COMP_HUFFMAN_DEFAULT
            },
            .level = RTE_COMP_LEVEL_PMD_DEFAULT,
            .chksum = RTE_COMP_CHECKSUM_NONE,
            .window_size = DEFAULT_WINDOW_SIZE,
            .hash_algo = RTE_COMP_HASH_ALGO_NONE
        }
    };

    /* create priv_xform and initialize it for the compression device. */
    struct rte_compressdev_info dev_info;
    void *priv_xform = NULL;
    int shareable = 1;
    rte_compressdev_info_get(cdev_id, &dev_info);
    if (dev_info.capabilities->comp_feature_flags & RTE_COMP_FF_SHAREABLE_PRIV_XFORM) {
        rte_compressdev_private_xform_create(cdev_id, &compress_xform, &priv_xform);
    } else {
        shareable = 0;
    }

    /* create operation pool via call to rte_comp_op_pool_create and alloc ops */
    rte_comp_op_bulk_alloc(op_pool, comp_ops, NUM_OPS);

    /* prepare ops for compression operations */
    for (i = 0; i < NUM_OPS; i++) {
        struct rte_comp_op *op = comp_ops[i];
        if (!shareable)
            /* non-shareable: each in-flight op needs its own priv_xform */
            rte_compressdev_private_xform_create(cdev_id, &compress_xform,
                            &op->private_xform);
        else
            op->private_xform = priv_xform;
        op->op_type = RTE_COMP_OP_STATELESS;
        op->flush_flag = RTE_COMP_FLUSH_FINAL;

        op->src.offset = 0;
        op->dst.offset = 0;
        op->src.length = OP_LEN;
        op->input_chksum = 0;
        setup op->m_src and op->m_dst;
    }
    num_enqd = rte_compressdev_enqueue_burst(cdev_id, 0, comp_ops, NUM_OPS);
    /* wait for this to complete before enqueuing next */
    num_deqd = 0;
    do {
        /* accumulate, since a single dequeue call may return fewer ops */
        num_deqd += rte_compressdev_dequeue_burst(cdev_id, 0,
                        &processed_ops[num_deqd], NUM_OPS - num_deqd);
    } while (num_deqd < num_enqd);

Stateless and OUT_OF_SPACE
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

OUT_OF_SPACE is a condition in which the output buffer runs out of space while the PMD
still has more data to produce. If the PMD runs into such a condition, it returns the
RTE_COMP_OP_STATUS_OUT_OF_SPACE_TERMINATED error. In this case, the PMD resets itself and can set
consumed=0 and produced=amount of output it could produce before hitting out_of_space.
The application would need to resubmit the whole input with a larger output buffer if it
wants the operation to be completed.

Hash in Stateless
~~~~~~~~~~~~~~~~~

If hash is enabled, the digest buffer will contain valid data after the op is successfully
processed, i.e. dequeued with status = RTE_COMP_OP_STATUS_SUCCESS.

Checksum in Stateless
~~~~~~~~~~~~~~~~~~~~~

If checksum is enabled, the checksum will only be available after the op is successfully
processed, i.e. dequeued with status = RTE_COMP_OP_STATUS_SUCCESS.

Compression API Stateful operation
-----------------------------------

The compression API provides the RTE_COMP_FF_STATEFUL_COMPRESSION and
RTE_COMP_FF_STATEFUL_DECOMPRESSION feature flags for a PMD to reflect
its support for stateful operations.

A stateful operation in DPDK compression means that the application invokes
enqueue_burst() multiple times to process related chunks of data, because the
application broke the data into several ops.

In such a case:

- ops are set up with op_type RTE_COMP_OP_STATEFUL,
- all ops except the last have flush value RTE_COMP_FLUSH_NONE or RTE_COMP_FLUSH_SYNC,
  and the last has flush value RTE_COMP_FLUSH_FULL or RTE_COMP_FLUSH_FINAL.

In case of either one or all of the above conditions, the PMD initiates
stateful processing and releases acquired resources after processing of the
operation with flush value RTE_COMP_FLUSH_FULL or RTE_COMP_FLUSH_FINAL is complete.
Unlike stateless, the application can enqueue only one stateful op from
a particular stream at a time, and must attach the stream handle
to each op.

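The flush rule for a stream split into chunks can be written as a one-line helper.
This is a sketch under stated assumptions: the ``chunk_flush`` enum is a local stand-in
covering only the two values used here, and ``flush_for_chunk()`` is a hypothetical name
introduced for the example.

.. code-block:: c

    /* Local stand-in covering only the two flush values used in this sketch. */
    enum chunk_flush { CHUNK_FLUSH_NONE, CHUNK_FLUSH_FINAL };

    /*
     * Pick the flush value for chunk idx (zero-based) out of num_chunks:
     * only the last chunk of the stream is flushed as final.
     */
    static enum chunk_flush flush_for_chunk(unsigned int idx, unsigned int num_chunks)
    {
        return (idx + 1 == num_chunks) ? CHUNK_FLUSH_FINAL : CHUNK_FLUSH_NONE;
    }

An application that needs intermediate output to be decodable before the stream ends
might choose a sync flush for intermediate chunks instead; that choice is not modelled here.
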
Stream in Stateful operation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A `stream` in DPDK compression is a logical entity which identifies a related set of ops. For example,
if one large file is broken into multiple chunks, then the file is represented by a stream and each chunk
of that file is represented by a compression op, ``rte_comp_op``. Whenever the application wants stateful
processing of such data, it must get a stream handle by calling ``rte_compressdev_stream_create()``
with an xform; as output, the target PMD returns an opaque stream handle, which the application
must attach to all of the ops carrying data of that stream. In stateful processing, every op
requires the previous op's data for compression/decompression. A PMD allocates and sets up resources such
as history, states, etc. within a stream, which are maintained during the processing of the related ops.

Unlike priv_xforms, a stream is always a NON_SHAREABLE entity. One stream handle must be attached to only
one set of related ops and cannot be reused until all of them are processed with status success or failure.

.. figure:: img/stateful-op.*

   Stateful Ops


The application should call ``rte_compressdev_stream_create()`` and attach the stream to each op before
enqueuing it for processing, and free it via ``rte_compressdev_stream_free()`` during
termination. All ops that are to be processed statefully should carry the *same* stream.

See *DPDK API Reference* document for details.

An example pseudocode to set up and process a stream having NUM_CHUNKS, with each chunk of size CHUNK_LEN,
would look like:

.. code-block:: c

    /*
     * pseudocode for stateful compression
     */

    uint8_t cdev_id = rte_compressdev_get_dev_id(<pmd name>);

    /* configure the device. */
    if (rte_compressdev_configure(cdev_id, &conf) < 0)
        rte_exit(EXIT_FAILURE, "Failed to configure compressdev %u\n", cdev_id);

    if (rte_compressdev_queue_pair_setup(cdev_id, 0, NUM_MAX_INFLIGHT_OPS,
                            rte_socket_id()) < 0)
        rte_exit(EXIT_FAILURE, "Failed to setup queue pair\n");

    if (rte_compressdev_start(cdev_id) < 0)
        rte_exit(EXIT_FAILURE, "Failed to start device\n");

    /* setup compress transform. */
    struct rte_comp_xform compress_xform = {
        .type = RTE_COMP_COMPRESS,
        .compress = {
            .algo = RTE_COMP_ALGO_DEFLATE,
            .deflate = {
                .huffman = RTE_COMP_HUFFMAN_DEFAULT
            },
            .level = RTE_COMP_LEVEL_PMD_DEFAULT,
            .chksum = RTE_COMP_CHECKSUM_NONE,
            .window_size = DEFAULT_WINDOW_SIZE,
            .hash_algo = RTE_COMP_HASH_ALGO_NONE
        }
    };

    /* create stream */
    void *stream;
    rte_compressdev_stream_create(cdev_id, &compress_xform, &stream);

    /* create an op pool and allocate ops */
    rte_comp_op_bulk_alloc(op_pool, comp_ops, NUM_CHUNKS);

    /* Prepare source and destination mbufs for compression operations */
    unsigned int i;
    for (i = 0; i < NUM_CHUNKS; i++) {
        if (rte_pktmbuf_append(mbufs[i], CHUNK_LEN) == NULL)
            rte_exit(EXIT_FAILURE, "Not enough room in the mbuf\n");
        comp_ops[i]->m_src = mbufs[i];
        if (rte_pktmbuf_append(dst_mbufs[i], CHUNK_LEN) == NULL)
            rte_exit(EXIT_FAILURE, "Not enough room in the mbuf\n");
        comp_ops[i]->m_dst = dst_mbufs[i];
    }

    /* Set up the compress operations. */
    for (i = 0; i < NUM_CHUNKS; i++) {
        struct rte_comp_op *op = comp_ops[i];
        /* m_src and m_dst were already attached in the loop above */
        op->stream = stream;
        op->op_type = RTE_COMP_OP_STATEFUL;
        if (i == NUM_CHUNKS - 1) {
            /* set to final, if last chunk */
            op->flush_flag = RTE_COMP_FLUSH_FINAL;
        } else {
            /* set to NONE, for all intermediary ops */
            op->flush_flag = RTE_COMP_FLUSH_NONE;
        }
        op->src.offset = 0;
        op->dst.offset = 0;
        op->src.length = CHUNK_LEN;
        op->input_chksum = 0;
        num_enqd = rte_compressdev_enqueue_burst(cdev_id, 0, &op, 1);
        /* wait for this to complete before enqueuing next */
        do {
            num_deqd = rte_compressdev_dequeue_burst(cdev_id, 0, processed_ops, 1);
        } while (num_deqd < num_enqd);
        /* push next op */
    }

Stateful and OUT_OF_SPACE
~~~~~~~~~~~~~~~~~~~~~~~~~~~

If the PMD supports stateful operation, then the OUT_OF_SPACE status is not an actual
error for the PMD. In such a case, the PMD returns with status
RTE_COMP_OP_STATUS_OUT_OF_SPACE_RECOVERABLE, with consumed = number of input bytes
read and produced = length of the complete output buffer.
The application should enqueue the next op with the source starting at consumed+1,
i.e. at the first input byte not yet consumed, and an
output buffer with available space.

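The resubmission step can be sketched as simple offset arithmetic. The code below is
illustrative only: ``ex_src`` is a hypothetical stand-in for an op's source offset and
length, and ``resume_after()`` is not a DPDK function. It reads "consumed+1" as the
1-based position of the first unread byte, which corresponds to a zero-based offset of
``offset + consumed``; this interpretation is an assumption of the example.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical stand-in for the source window of an op. */
    struct ex_src {
        uint32_t offset; /* zero-based start of the data within the source buffer */
        uint32_t length; /* number of bytes to process from offset */
    };

    /* Advance the source window past the bytes reported as consumed. */
    static struct ex_src resume_after(struct ex_src src, uint32_t consumed)
    {
        struct ex_src next = src;

        if (consumed > src.length)
            consumed = src.length; /* defensive clamp, not expected in practice */
        next.offset = src.offset + consumed;
        next.length = src.length - consumed;
        return next;
    }

For a 1000-byte source starting at offset 0 with 600 bytes consumed, the next op would
cover 400 bytes starting at offset 600.
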
Hash in Stateful
~~~~~~~~~~~~~~~~

If enabled, the digest buffer will contain a valid digest after the last op in the stream
(having flush = RTE_COMP_FLUSH_FINAL) is successfully processed, i.e. dequeued
with status = RTE_COMP_OP_STATUS_SUCCESS.

Checksum in Stateful
~~~~~~~~~~~~~~~~~~~~

If enabled, the checksum will only be available after the last op in the stream
(having flush = RTE_COMP_FLUSH_FINAL) is successfully processed, i.e. dequeued
with status = RTE_COMP_OP_STATUS_SUCCESS.

Burst in compression API
-------------------------

Scheduling of compression operations on DPDK's application data path is
performed using a burst-oriented asynchronous API set. A queue pair on a compression
device accepts a burst of compression operations using the enqueue burst API. On physical
devices, the enqueue burst API places the operations to be processed
on the device's hardware input queue; for virtual devices, the processing of the
operations is usually completed during the enqueue call to the compression
device. The dequeue burst API retrieves any processed operations available
from the queue pair on the compression device: from physical devices this is usually
directly from the device's processed queue, and for virtual devices from a
``rte_ring`` where processed operations are placed after being processed on the
enqueue call.

A burst in DPDK compression can be a combination of stateless and stateful operations, with the condition
that for stateful ops only one op at a time should be enqueued from a particular stream, i.e. no two ops
should belong to the same stream in a single burst. However, a burst may contain multiple stateful ops as long
as each op is attached to a different stream, i.e. a burst can look like:

+---------------+--------------+--------------+-----------------+--------------+--------------+
| enqueue_burst | op1.no_flush | op2.no_flush | op3.flush_final | op4.no_flush | op5.no_flush |
+---------------+--------------+--------------+-----------------+--------------+--------------+

where op1 .. op5 all belong to different, independent data units. op1, op2, op4 and op5 must be stateful,
as stateless ops can only use flush full or final, and op3 can be of type stateless or stateful.
Every op with type set to RTE_COMP_OP_STATELESS must be attached to a priv_xform, and
every op with type set to RTE_COMP_OP_STATEFUL *must* be attached to a stream.

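The one-op-per-stream rule can be checked before a burst is enqueued. The following is a
minimal sketch using a simplified, hypothetical ``burst_op`` structure rather than the real
``rte_comp_op``; ``burst_streams_unique()`` is an illustrative helper, not part of the
compressdev API. It uses a quadratic scan, which is reasonable only for small bursts.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    enum burst_op_type { BURST_OP_STATELESS, BURST_OP_STATEFUL };

    /* Simplified, hypothetical view of an op for this example only. */
    struct burst_op {
        enum burst_op_type op_type;
        void *handle; /* priv_xform for stateless ops, stream for stateful ops */
    };

    /* Return true if no stream is attached to more than one stateful op. */
    static bool burst_streams_unique(const struct burst_op *ops, size_t n)
    {
        for (size_t i = 0; i < n; i++) {
            if (ops[i].op_type != BURST_OP_STATEFUL)
                continue;
            for (size_t j = i + 1; j < n; j++) {
                if (ops[j].op_type == BURST_OP_STATEFUL &&
                    ops[j].handle == ops[i].handle)
                    return false;
            }
        }
        return true;
    }

Stateless ops are skipped by the check, so several of them may share one handle in the
same burst, as is the case with a shareable priv_xform.
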
Since each operation in a burst is independent, and thus can be completed
out of order, applications which need ordering should set up a per-op user data
area with reordering information, so that they can determine the enqueue order at
dequeue.

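One possible form of such reordering information is a sequence number assigned at enqueue
time. The sketch below assumes that approach; ``seq_op`` is a hypothetical stand-in for an
op together with its user data, and ``restore_order()`` is an illustrative helper only.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical stand-in for an op plus its per-op user data. */
    struct seq_op {
        uint32_t seq; /* enqueue order, stored by the application at enqueue time */
        int payload;  /* placeholder for the real op contents */
    };

    /*
     * Copy each dequeued op into the slot given by its sequence number.
     * Returns 0 on success, or -1 if a sequence number falls outside the burst.
     */
    static int restore_order(const struct seq_op *dequeued, size_t n,
                             struct seq_op *ordered)
    {
        for (size_t i = 0; i < n; i++) {
            if (dequeued[i].seq >= n)
                return -1;
            ordered[dequeued[i].seq] = dequeued[i];
        }
        return 0;
    }

This assumes the sequence numbers of a burst run from 0 to n-1 without gaps; an
application with longer-lived numbering would need a sliding window instead.
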
Also, if multiple threads call enqueue_burst() on the same queue pair, then it is
the application's responsibility to use a proper locking mechanism to ensure exclusive
enqueuing of operations.

Enqueue / Dequeue Burst APIs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The burst enqueue API uses a compression device identifier and a queue pair
identifier to specify the compression device queue pair to schedule the processing on.
The ``nb_ops`` parameter is the number of operations to process, which are
supplied in the ``ops`` array of ``rte_comp_op`` structures.
The enqueue function returns the number of operations it actually enqueued for
processing; a return value equal to ``nb_ops`` means that all operations have been
enqueued.

The dequeue API uses the same format as the enqueue API, but
the ``nb_ops`` and ``ops`` parameters are now used to specify the maximum number of processed
operations the user wishes to retrieve and the location in which to store them.
The API call returns the actual number of processed operations returned; this
can never be larger than ``nb_ops``.

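Because the enqueue call may accept fewer operations than requested, callers typically
offer the unaccepted tail again. The sketch below shows that pattern against a function
pointer so that it stays self-contained; ``enqueue_fn``, ``enqueue_all()`` and the
``accept_two()`` stand-in device are hypothetical names for this example, not the
compressdev API.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical enqueue callback: returns how many of nb_ops were accepted. */
    typedef uint16_t (*enqueue_fn)(void **ops, uint16_t nb_ops);

    /* Keep offering the unaccepted tail until all ops are in or tries run out. */
    static uint16_t enqueue_all(enqueue_fn enq, void **ops, uint16_t nb_ops,
                                unsigned int max_tries)
    {
        uint16_t done = 0;

        while (done < nb_ops && max_tries-- > 0)
            done += enq(ops + done, (uint16_t)(nb_ops - done));
        return done;
    }

    /* Stand-in device for demonstration: accepts at most two ops per call. */
    static uint16_t accept_two(void **ops, uint16_t nb_ops)
    {
        (void)ops;
        return nb_ops < 2 ? nb_ops : 2;
    }

With ``accept_two`` as the callback, five ops need three calls to be fully enqueued. A
real application would normally dequeue completed operations between attempts so that
the queue pair can drain.
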
Sample code
-----------

There are unit test applications that show how to use the compressdev library inside
test/test/test_compressdev.c.

Compression Device API
~~~~~~~~~~~~~~~~~~~~~~

The compressdev Library API is described in the *DPDK API Reference* document.