ip: reassembly: add documentation
Type: docs Signed-off-by: Klement Sekera <ksekera@cisco.com> Change-Id: I23008cde47d8b7a531346eab02902e2ced18742a
This commit is contained in:

committed by
Ole Tr�an

parent
e63a2d44d1
commit
bb912f2e25
@ -12,6 +12,7 @@ Core Features
|
|||||||
punt
|
punt
|
||||||
ipsec
|
ipsec
|
||||||
bfd_doc
|
bfd_doc
|
||||||
|
reassembly
|
||||||
ipfix_doc
|
ipfix_doc
|
||||||
span_doc
|
span_doc
|
||||||
mtu
|
mtu
|
||||||
|
1
docs/developer/corefeatures/reassembly.rst
Symbolic link
1
docs/developer/corefeatures/reassembly.rst
Symbolic link
@ -0,0 +1 @@
|
|||||||
|
../../../src/vnet/ip/reass/reassembly.rst
|
221
src/vnet/ip/reass/reassembly.rst
Normal file
221
src/vnet/ip/reass/reassembly.rst
Normal file
@ -0,0 +1,221 @@
|
|||||||
|
.. _reassembly:
|
||||||
|
|
||||||
|
IP Reassembly
|
||||||
|
=============
|
||||||
|
|
||||||
|
Some VPP functions need access to whole packet and/or stream
|
||||||
|
classification based on L4 headers. Reassembly functionality allows
|
||||||
|
both former and latter.
|
||||||
|
|
||||||
|
Full reassembly vs shallow (virtual) reassembly
|
||||||
|
-----------------------------------------------
|
||||||
|
|
||||||
|
There are two kinds of reassembly available in VPP:
|
||||||
|
|
||||||
|
1. Full reassembly changes a stream of packet fragments into one
|
||||||
|
packet containing all data reassembled with fragment bits cleared
|
||||||
|
and fragment header stripped (in case of ip6). Note that resulting
|
||||||
|
packet may come out of reassembly as a buffer chain. Because it's
|
||||||
|
impractical to parse headers which are split over multiple vnet
|
||||||
|
buffers, vnet_buffer_chain_linearize() is called after reassembly so
|
||||||
|
that L2/L3/L4 headers can be found in first buffer. Full reassembly
|
||||||
|
is costly and shouldn't be used unless necessary. Full reassembly is by
|
||||||
|
default enabled for both ipv4 and ipv6 traffic for "forus" traffic
|
||||||
|
- that is packets aimed at VPP addresses. This can be disabled via API
|
||||||
|
if desired, in which case "forus" fragments are dropped.
|
||||||
|
|
||||||
|
2. Shallow (virtual) reassembly allows various classifying and/or
|
||||||
|
translating features to work with fragments without having to
|
||||||
|
understand fragmentation. It works by extracting L4 data and adding
|
||||||
|
them to vnet_buffer for each packet/fragment passing throught SVR
|
||||||
|
nodes. This operation is performed for both fragments and regular
|
||||||
|
packets, allowing consuming code to treat all packets in same way. SVR
|
||||||
|
caches incoming packet fragments (buffers) until first fragment is
|
||||||
|
seen. Then it extracts L4 data from that first fragment, fills it for
|
||||||
|
any cached fragments and transmits them in the same order as they were
|
||||||
|
received. From that point on, any other passing fragments get L4 data
|
||||||
|
populated in vnet_buffer based on reassembly context.
|
||||||
|
|
||||||
|
Multi-worker behaviour
|
||||||
|
^^^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
Both reassembly types deal with fragments arriving on different workers
|
||||||
|
via handoff mechanism. All reassembly contexts are stored in pools.
|
||||||
|
Bihash mapping 5-tuple key to a value containing pool index and thread
|
||||||
|
index is used for lookups. When a lookup finds an existing reasembly on
|
||||||
|
a different thread, it hands off the fragment to that thread. If lookup
|
||||||
|
fails, a new reassembly context is created and current worker becomes
|
||||||
|
owner of that context. Further fragments received on other worker
|
||||||
|
threads are then handed off owner worker thread.
|
||||||
|
|
||||||
|
Full reassembly also remembers thread index where first fragment (as in
|
||||||
|
fragment with fragment offset 0) was seen and uses handoff mechanism to
|
||||||
|
send the reassembled packet out on that thread even if pool owner is
|
||||||
|
a different thread. This then requires an additional handoff to free
|
||||||
|
reassembly context as only pool owner can do that in a thread-safe way.
|
||||||
|
|
||||||
|
Limits
|
||||||
|
^^^^^^
|
||||||
|
|
||||||
|
Because reassembly could be an attack vector, there is a configurable
|
||||||
|
limit on the number of concurrent reassemblies and also maximum
|
||||||
|
fragments per packet.
|
||||||
|
|
||||||
|
Custom applications
|
||||||
|
^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
Both reassembly features allow to be used by custom applicatind which
|
||||||
|
are not part of VPP source tree. Be it patches or 3rd party plugins,
|
||||||
|
they can build their own graph paths by using "-custom*" versions of
|
||||||
|
nodes. Reassembly then reads next_index and error_next_index for each
|
||||||
|
buffer from vnet_buffer, allowing custom application to steer
|
||||||
|
both reassembled packets and any packets which are considered an error
|
||||||
|
in a way the custom application requires.
|
||||||
|
|
||||||
|
Full reassembly
|
||||||
|
---------------
|
||||||
|
|
||||||
|
Configuration
|
||||||
|
^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
Configuration is via API (``ip_reassembly_enable_disable``) or CLI:
|
||||||
|
|
||||||
|
``set interface reassembly <interface-name> [on|off|ip4|ip6]``
|
||||||
|
|
||||||
|
here ``on`` means both ip4 and ip6.
|
||||||
|
|
||||||
|
A show command is provided to see reassembly contexts:
|
||||||
|
|
||||||
|
For ip4:
|
||||||
|
|
||||||
|
``show ip4-full-reassembly [details]``
|
||||||
|
|
||||||
|
For ip6:
|
||||||
|
|
||||||
|
``show ip6-full-reassembly [details]``
|
||||||
|
|
||||||
|
Global full reassembly parameters can be modified using API
|
||||||
|
``ip_reassembly_set`` and retrieved using ``ip_reassembly_get``.
|
||||||
|
|
||||||
|
Defaults
|
||||||
|
""""""""
|
||||||
|
|
||||||
|
For defaults values, see #defines in
|
||||||
|
|
||||||
|
`ip4_full_reass.c <__REPOSITORY_URL__/src/vnet/ip/reass/ip4_full_reass.c>`_
|
||||||
|
|
||||||
|
========================================= ==========================================
|
||||||
|
#define description
|
||||||
|
----------------------------------------- ------------------------------------------
|
||||||
|
IP4_REASS_TIMEOUT_DEFAULT_MS timeout in milliseconds
|
||||||
|
IP4_REASS_EXPIRE_WALK_INTERVAL_DEFAULT_MS interval between reaping expired sessions
|
||||||
|
IP4_REASS_MAX_REASSEMBLIES_DEFAULT maximum number of concurrent reassemblies
|
||||||
|
IP4_REASS_MAX_REASSEMBLY_LENGTH_DEFAULT maximum number of fragments per reassembly
|
||||||
|
========================================= ==========================================
|
||||||
|
|
||||||
|
and
|
||||||
|
|
||||||
|
`ip6_full_reass.c <__REPOSITORY_URL__/src/vnet/ip/reass/ip6_full_reass.c>`_
|
||||||
|
|
||||||
|
========================================= ==========================================
|
||||||
|
#define description
|
||||||
|
----------------------------------------- ------------------------------------------
|
||||||
|
IP6_REASS_TIMEOUT_DEFAULT_MS timeout in milliseconds
|
||||||
|
IP6_REASS_EXPIRE_WALK_INTERVAL_DEFAULT_MS interval between reaping expired sessions
|
||||||
|
IP6_REASS_MAX_REASSEMBLIES_DEFAULT maximum number of concurrent reassemblies
|
||||||
|
IP6_REASS_MAX_REASSEMBLY_LENGTH_DEFAULT maximum number of fragments per reassembly
|
||||||
|
========================================= ==========================================
|
||||||
|
|
||||||
|
Finished/expired contexts
|
||||||
|
^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
Reassembly contexts are freed either when reassembly is finished - when
|
||||||
|
all data has been received or in case of timeout. There is a process
|
||||||
|
walking all reassemblies, freeing any expired ones.
|
||||||
|
|
||||||
|
Shallow (virtual) reassembly
|
||||||
|
----------------------------
|
||||||
|
|
||||||
|
Configuration
|
||||||
|
^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
Configuration is via API (``ip_reassembly_enable_disable``) only as
|
||||||
|
there is no value in turning SVR on by hand without a feature consuming
|
||||||
|
buffer metadata. SVR is designed to be turned on by a feature requiring
|
||||||
|
it in a programmatic way.
|
||||||
|
|
||||||
|
A show command is provided to see reassembly contexts:
|
||||||
|
|
||||||
|
For ip4:
|
||||||
|
|
||||||
|
``show ip4-sv-reassembly [details]``
|
||||||
|
|
||||||
|
For ip6:
|
||||||
|
|
||||||
|
``show ip6-sv-reassembly [details]``
|
||||||
|
|
||||||
|
Global shallow reassembly parameters can be modified using API
|
||||||
|
``ip_reassembly_set`` and retrieved using ``ip_reassembly_get``.
|
||||||
|
|
||||||
|
Defaults
|
||||||
|
""""""""
|
||||||
|
|
||||||
|
For defaults values, see #defines in
|
||||||
|
|
||||||
|
`ip4_sv_reass.c <__REPOSITORY_URL__/src/vnet/ip/reass/ip4_sv_reass.c>`_
|
||||||
|
|
||||||
|
============================================ ==========================================
|
||||||
|
#define description
|
||||||
|
-------------------------------------------- ------------------------------------------
|
||||||
|
IP4_SV_REASS_TIMEOUT_DEFAULT_MS timeout in milliseconds
|
||||||
|
IP4_SV_REASS_EXPIRE_WALK_INTERVAL_DEFAULT_MS interval between reaping expired sessions
|
||||||
|
IP4_SV_REASS_MAX_REASSEMBLIES_DEFAULT maximum number of concurrent reassemblies
|
||||||
|
IP4_SV_REASS_MAX_REASSEMBLY_LENGTH_DEFAULT maximum number of fragments per reassembly
|
||||||
|
============================================ ==========================================
|
||||||
|
|
||||||
|
and
|
||||||
|
|
||||||
|
`ip6_sv_reass.c <__REPOSITORY_URL__/src/vnet/ip/reass/ip6_sv_reass.c>`_
|
||||||
|
|
||||||
|
============================================ ==========================================
|
||||||
|
#define description
|
||||||
|
-------------------------------------------- ------------------------------------------
|
||||||
|
IP6_SV_REASS_TIMEOUT_DEFAULT_MS timeout in milliseconds
|
||||||
|
IP6_SV_REASS_EXPIRE_WALK_INTERVAL_DEFAULT_MS interval between reaping expired sessions
|
||||||
|
IP6_SV_REASS_MAX_REASSEMBLIES_DEFAULT maximum number of concurrent reassemblies
|
||||||
|
IP6_SV_REASS_MAX_REASSEMBLY_LENGTH_DEFAULT maximum number of fragments per reassembly
|
||||||
|
============================================ ==========================================
|
||||||
|
|
||||||
|
Expiring contexts
|
||||||
|
^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
There is no way of knowing when a reassembly is finished without
|
||||||
|
performing (an almost) full reassembly, so contexts in SVR cannot be
|
||||||
|
freed in the same way as in full reassembly. Instead a different
|
||||||
|
approach is taken. Least recently used (LRU) list is maintained where
|
||||||
|
reassembly contexts are ordered based on last update. The oldest
|
||||||
|
context is then freed whenever SVR hits limit on number of concurrent
|
||||||
|
reassembly contexts. There is also a process reaping expired sessions
|
||||||
|
similar as in full reassembly.
|
||||||
|
|
||||||
|
Truncated packets
|
||||||
|
^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
When SVR detects that a packet has been truncated in a way where L4
|
||||||
|
headers are not available, it will mark it as such in vnet_buffer,
|
||||||
|
allowing downstream features to handle such packets as they deem fit.
|
||||||
|
|
||||||
|
Fast path/slow path
|
||||||
|
^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
SVR runs is implemented fast path/slow path way. By default, it assumes
|
||||||
|
that any passing traffic doesn't contain fragments, processing buffers
|
||||||
|
in a dual-loop. If it sees a fragment, it then jumps to single-loop
|
||||||
|
processing.
|
||||||
|
|
||||||
|
Feature enabled by other features/reference counting
|
||||||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
SVR feature is enabled by some other features, like NAT, when those
|
||||||
|
features are enabled. For this to work, it implements a reference
|
||||||
|
counted API for enabling/disabling SVR.
|
Reference in New Issue
Block a user