:doctitle: OpenDataPlane (ODP) Users-Guide
:description: This document is intended to guide a new OpenDataPlane +
application developer.
:imagesdir: ../images
:toc:

:numbered!:
[abstract]
Abstract
--------
This document is intended to guide a new ODP application developer.
Further details about ODP may be found at the http://opendataplane.org[ODP]
home page.

.Overview of a system running ODP applications
image::overview.svg[align="center"]

ODP is an API specification that allows many implementations to provide
platform independence, automatic hardware acceleration and CPU scaling to
high-performance networking applications. This document describes how to
write an application that can successfully take advantage of the API.

:numbered:
== Introduction
.OpenDataPlane Components
image::odp_components.svg[align="center"]

.The ODP API Specification
ODP consists of three separate but related component parts. First, ODP is an
abstract API specification that describes a functional model for
data plane applications. This specification covers many common data plane
application programming needs, such as the ability to receive, manipulate, and
transmit packet data, without specifying how these functions are performed. This
is quite intentional. It is precisely because ODP APIs do not have a preferred
embodiment that they permit innovation in how these functions can
be realized on various platforms that offer implementations of ODP. To achieve
this goal, ODP APIs are described using abstract data types whose definition
is left up to the ODP implementer. For example, in ODP packets are referenced
by abstract handles of type `odp_packet_t`, and packet-related APIs take
arguments of this type. What an `odp_packet_t` actually is, is not part of the
ODP API specification--that is the responsibility of each ODP implementation.
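
For example, an application inspects a packet only through accessor APIs that
take the abstract handle. A minimal sketch (here `pkt` is assumed to have been
obtained elsewhere, _e.g.,_ from packet I/O):

.using an abstract packet handle
[source,c]
----
odp_packet_t pkt = ...; /* an opaque handle obtained, e.g., from packet I/O */
uint32_t len;

/* Packet properties are read via accessor APIs; the handle's
 * representation itself is implementation-defined */
len = odp_packet_len(pkt);
----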

.Summary: ODP API attributes:
* Open Source, open contribution, BSD-3 licensed.
* Vendor and platform neutral.
* Application-centric.  Covers functional needs of data plane applications.
* Ensures portability by specifying the functional behavior of ODP.
* Defined jointly and openly by application writers and platform implementers.
* Architected to be implementable on a wide range of platforms efficiently.
* Sponsored, governed, and maintained by the Linaro Networking Group (LNG).

.ODP Implementations
Second, ODP consists of multiple implementations of this API specification,
each tailored to a specific target platform. ODP implementations determine
how each ODP abstract type is represented on that platform and how each ODP
API is realized. On some platforms, ODP APIs will
be realized using specialized instructions that accelerate the functional
behavior specified by the API. On others, hardware co-processing engines may
completely offload the API so that again it can be performed with little or no
involvement by a CPU. In all cases, the application sees the same
functional behavior independent of how a given platform has chosen to realize
it. By allowing each platform the freedom to determine how best to realize each
API's specified functional behavior in an optimal manner, ODP permits
applications written to its APIs to take full advantage of the unique
capabilities of each platform without the application programmer needing to
have specialist knowledge of that platform or to be concerned with how best
to tune the application to a particular platform. This latter consideration is
particularly important in Network Function Virtualization (NFV) environments
where the application will run on a target platform chosen by someone else.

.Summary: ODP Implementation Characteristics
* One size does not fit all--supporting multiple implementations allows ODP
to adapt to widely differing internals among platforms.
* Anyone can create an ODP implementation tailored to their platform.
* Distribution and maintenance of each implementation is as its owner wishes.
  - Open source or closed source as business needs determine.
  - Independent release cycles and service streams.
* Allows HW and SW innovation in how ODP APIs are implemented on each platform.

.Reference Implementations
To make it easy to get started with implementing ODP on a new platform, ODP
supplies a number of _reference implementations_ that can serve as a
starting point. The two primary reference implementations supplied by ODP are
*odp-linux* and *odp-dpdk*.

.odp-linux
The *odp-linux* reference implementation is a pure SW implementation of the
ODP API that relies only on the Linux programming API. As a functional model
for ODP, it enables ODP to be bootstrapped easily to any platform that
supports a Linux kernel.

.odp-dpdk
The *odp-dpdk* reference implementation is a pure SW implementation of the
ODP API that uses http://dpdk.org[DPDK] as a SW accelerator. In particular,
*odp-dpdk* offers superior I/O performance for systems that use NICs, allowing
ODP applications to take immediate full advantage of the various NIC device
drivers supported by DPDK.

.Summary: ODP Reference Implementations
* Open source, open contribution, BSD-3 licensed.
* Provide easy bootstrapping of ODP onto new platforms.
* Implementers are free to borrow or tailor code as needed for their platform.
* Implementers retain full control over their implementations whether or not
they are derived from a reference implementation.

.ODP Validation Test Suite
Third, to ensure consistency between different ODP implementations, ODP
consists of a validation suite that verifies that any given implementation of
ODP faithfully provides the specified functional behavior of each ODP API.
As a separate open source component, the validation suite may be used by
application writers, system integrators, and platform providers alike to
confirm that any purported implementation of ODP does indeed conform to the
ODP API specification.

.Summary: ODP Validation Test Suite
* Synchronized with the ODP API specification.
* Maintained and distributed by LNG.
* Open source, open contribution, BSD-3 licensed.
* Key to ensuring application portability across all ODP implementations
* Tests that ODP implementations conform to the specified functional behavior
of ODP APIs.
* Can be run at any time by users and vendors to validate implementations
of ODP.

=== ODP API Specification Versioning
As an evolving standard, the ODP API specification is released under an
incrementing version number, and corresponding implementations of ODP, as well
as the validation suite that verifies API conformance, are linked to this
version number. ODP versions are specified using a standard three-level
number (major.minor.fixlevel) whose levels are incremented according to the
degree of change each represents. Increments to the fix level represent
clarifications of the specification or other minor changes that do not affect
either the syntax or semantics of the specification. Such changes in the API
specification are expected to be rare. Increments to the minor level
represent the introduction of new APIs or functional capabilities, or changes
to the specified syntax or functional behavior of APIs and thus may require
application source code changes. Such changes are well documented in the
release notes for each revision of the specification. Finally, increments to
the major level represent significant structural changes that most likely
require some level of application source code change, again as documented in
the release notes for that version.

=== ODP Implementation Versioning
ODP implementations are free to use whatever release naming/numbering
conventions they wish, as long as it is clear what level of the ODP API a given
release implements. A recommended convention is to use the same three level
numbering scheme where the major and minor numbers correspond to the ODP API
level and the fix level represents an implementation-defined service level
associated with that API level implementation. The LNG-supplied ODP reference
implementations follow this convention.

=== ODP Validation Test Suite Versioning
The ODP validation test suite follows these same naming conventions. The major
and minor release numbers correspond to the ODP API level that the suite
validates and the fix level represents the service level of the validation
suite itself for that API level.

=== ODP Design Goals
ODP has three primary goals that follow from its component structure. The first
is application portability across a wide range of platforms. These platforms
differ in terms of processor instruction set architecture, number and types of
application processing cores, memory organization, as well as the number and
type of platform specific hardware acceleration and offload features that
are available. ODP applications can move from one conforming implementation
to another with at most a recompile.

Second, ODP is designed to permit data plane applications to avail themselves
of platform-specific features, including specialized hardware accelerators,
without specialized programming. This is achieved by separating the API
specification from their implementation on individual platforms. Since each
platform implements each ODP API in a manner optimal to that platform,
applications automatically gain the benefit of such optimizations without the
need for explicit programming.

Third, ODP is designed to allow applications to scale out automatically to
support many-core architectures. This is done using an event-based programming
model that permits applications to be written to be independent of the number
of processing cores that are available to realize application function. The
result is that an application written to this model does not require redesign
as it scales from 4, to 40, to 400 cores.

== Organization of this Document
This document is organized into several sections. The first presents a high
level overview of ODP applications, the ODP API component areas,
and their associated abstract
data types. This section introduces ODP APIs at a conceptual level.
The second provides a tutorial on the programming model(s)
supported by ODP, paying particular attention to the event model as this
represents the preferred structure for most ODP applications. This section
builds on the concepts introduced in the first section and shows how ODP
applications are structured to best realize the three ODP design goals
mentioned earlier. The third section provides a more detailed overview of
the major ODP API components and is designed to serve as a companion to the
full reference specification for each API. The latter is intended to be used
by ODP application programmers, as well as implementers, to understand the
precise syntax and semantics of each API.

== ODP Applications and Packet Flow
Data plane applications are fundamentally concerned with receiving, examining,
manipulating, and transmitting packets. The distinguishing feature of the
data plane is that these applications are mostly concerned with the lowest
layers of the ISO stack (Layers 2 and 3) and they have very high to extreme
performance requirements. ODP is designed to provide a portable framework for
such applications.

At the highest level, an *ODP Application* is a program that uses one or more
ODP APIs. Because ODP is a framework rather than a programming environment,
applications are free to also use other APIs that may or may not provide the
same portability characteristics as ODP APIs.

ODP applications vary in terms of what they do and how they operate, but in
general all share the following characteristics:

. They are organized into one or more _threads_ that execute in parallel.
. These threads communicate and coordinate their activities using various
_synchronization_ mechanisms.
. They receive packets from one or more _packet I/O interfaces_.
. They examine, transform, or otherwise process packets.
. They transmit packets to one or more _packet I/O interfaces_.

At the highest level, an ODP application looks as follows:

.ODP Application Packet Flow Overview
image::packet_flow.svg[align="center"]

Packets arrive and are received (RX) from a network interface represented by
a _PktIO_ abstraction. From here they go either directly to _Queues_ that are
polled by ODP _Threads_, or can pass through the _Classifier_ and be sorted into
Queues that represent individual flows. These queues can then be dispatched
to application threads via the _Scheduler_.

Threads, in turn, can invoke various ODP APIs to manipulate packet contents
prior to disposing of them. For output processing, packets may be directly
queued to a PktIO output queue or else handed to the _Traffic
Manager_ for programmatic _Quality of Service (QoS)_ processing before winding
up being transmitted (TX). Note that output interfaces may operate in
_loopback_ mode, in which case packets sent to them are re-routed back to the
input lines for "second pass" processing. For example, an incoming IPSec packet
cannot be properly classified (beyond being IPSec traffic) until it is
decrypted. Once decrypted and its actual contents made visible, it can then
be classified into its real flow.

What is important to note is that the only parts of the above diagram that
need to be written are the yellow boxes that contain the application
logic. Everything else shown here is provided by the ODP framework and
available for use by any ODP application. This represents the "machinery" of a
data plane application and is structured to allow applications written to the
ODP APIs to be both portable and optimized for each platform that offers an
ODP implementation without additional programming effort.

== ODP API Concepts
ODP programs are built around several conceptual structures that every
application programmer needs to be familiar with to use ODP effectively. The
main ODP concepts are:
Thread, Event, Queue, Pool, Shared Memory, Buffer, Packet, PktIO, Time, Timer,
and Synchronizer.

=== Thread
The thread is the fundamental programming unit in ODP.  ODP applications are
organized into a collection of threads that perform the work that the
application is designed to do. ODP threads may or may not share memory with
other threads--that is up to the implementation. Threads come in two
"flavors", control and worker, which are represented by the abstract type
`odp_thread_type_t`.

A control thread is a supervisory thread that organizes
the operation of worker threads. Worker threads, by contrast, exist to
perform the main processing logic of the application and employ a
run-to-completion model. Worker threads are intended to operate on dedicated
processing cores, especially in many-core processing environments; however, a
given implementation may multitask multiple threads on a single core if
desired (typically on smaller and lower-performance target environments).

In addition to thread types, threads have associated _attributes_ such as
_thread mask_ and _scheduler group_ that determine where they can run and
the type of work that they can handle. These will be discussed in greater
detail later.

=== Event
Events are what threads process to perform their work. Events can represent
new work, such as the arrival of a packet that needs to be processed, or they
can represent the completion of requests that have executed asynchronously.
Events can also represent notifications of the passage of time, or of status
changes in various components of interest to the application. Events have an
event type that describes what the event represents. Threads can create new
events, consume events, or perform some processing on an event and then pass
it along to another component for further processing.
References to events are via handles of abstract type `odp_event_t`. Cast
functions are provided to convert these into specific handles of the
appropriate type represented by the event.
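
For example, a minimal sketch of inspecting an event's type and converting the
event to its specific handle (assuming `q` is a queue carrying packet events):

.inspecting and converting an event
[source,c]
----
odp_event_t ev;
odp_packet_t pkt;

ev = odp_queue_deq(q);

if (odp_event_type(ev) == ODP_EVENT_PACKET) {
	/* convert the generic event handle to its specific type */
	pkt = odp_packet_from_event(ev);
	...process the packet
}
----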

=== Queue
A queue is a message passing channel that holds events.  Events can be
added to a queue via enqueue operations or removed from a queue via dequeue
operations. The endpoints of a queue will vary depending on how it is used.
Queues come in two major types: polled and scheduled, which will be
discussed in more detail when the event model is introduced. Queues may also
have an associated context, which represents a persistent state for all
events that make use of it. These states are what permit threads to perform
stateful processing on events as well as stateless processing.

Queues are represented by handles of abstract type `odp_queue_t`.

=== Pool
A pool is a shared memory area from which elements may be drawn. Pools
represent the backing store for events, among other things. Pools are
typically created and destroyed by the application during initialization and
termination, respectively, and then used during processing. Pools may be
used by ODP components exclusively, by applications exclusively, or their
use may be shared between the two. Pools have an associated type that
characterizes the elements that they contain. The two most important pool types
are Buffer and Packet.

Pools are represented by handles of abstract type `odp_pool_t`.

=== Shared Memory
Shared memory represents raw blocks of storage that are sharable between
threads. They are the building blocks of pools but can be used directly by
ODP applications if desired.

Shared memory is represented by handles of abstract type `odp_shm_t`.

=== Buffer
A buffer is a fixed sized block of shared storage that is used by ODP
components and/or applications to realize their function. Buffers contain
zero or more bytes of application data as well as system maintained
metadata that provide information about the buffer, such as its size or the
pool it was allocated from. Metadata is an important ODP concept because it
allows for arbitrary amounts of side information to be associated with an
ODP object. Most ODP objects have associated metadata and this metadata is
manipulated via accessor functions that act as getters and setters for
this information. Getter access functions permit an application to read
a metadata item, while setter access functions permit an application to write
a metadata item. Note that some metadata is inherently read only and thus
no setter is provided to manipulate it. When objects have multiple metadata
items, each has its own associated getter and/or setter access function to
inspect or manipulate it.

Buffers are represented by handles of abstract type `odp_buffer_t`.

=== Packet
Packets are received and transmitted via I/O interfaces and represent
the basic data that data plane applications manipulate.
Packets are drawn from pools of type `ODP_POOL_PACKET`.
Unlike buffers, which are simple objects,
ODP packets have a rich set of semantics that permit their inspection
and manipulation in complex ways to be described later. Packets also support
a rich set of metadata as well as user metadata. User metadata permits
applications to associate an application-determined amount of side information
with each packet for its own use.

Packets are represented by handles of abstract type `odp_packet_t`.

=== Packet I/O (PktIO)
PktIO is how ODP represents I/O interfaces. A pktio object is a logical port
capable of receiving (RX) and/or transmitting (TX) packets. This may be
directly supported by the underlying platform as an integrated feature, or may
represent a device attached via PCIe or another bus.

PktIOs are represented by handles of abstract type `odp_pktio_t`.

=== Time
The time API is used to measure time intervals and track the passage of time
within an application, and presents a convenient way to access a time source.
The time API consists of two main parts: the local time API and the global
time API.

==== Local time
The local time API is designed to be used within one thread and can be faster
than the global time API. Local time is not guaranteed to be consistent
between threads, which is acceptable in many cases. Local time stamps are
therefore local to the calling thread and must not be shared with other
threads. The current local time can be read with `odp_time_local()`.

==== Global time
The global time API is designed for tracking time across threads, so global
time stamps can be shared between threads. The current global time can be
read with `odp_time_global()`.

Neither local nor global time wraps during the application's life cycle.
The time API includes functions to operate on time values, such as
`odp_time_diff()`, `odp_time_sum()`, and `odp_time_cmp()`, as well as
conversion functions like `odp_time_to_ns()`, `odp_time_local_from_ns()`, and
`odp_time_global_from_ns()`. To get the resolution of a time source,
`odp_time_local_res()` and `odp_time_global_res()` are used. To wait,
`odp_time_wait_ns()` and `odp_time_wait_until()` are used, during which a
thread may busy loop for the entire wait time.

The `odp_time_t` opaque type represents local or global timestamps.
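
For example, a minimal sketch of measuring an elapsed interval with local time:

.measuring a time interval
[source,c]
----
odp_time_t t1, t2, diff;
uint64_t ns;

t1 = odp_time_local();
...work being measured
t2 = odp_time_local();

diff = odp_time_diff(t2, t1); /* elapsed time as an odp_time_t */
ns = odp_time_to_ns(diff);    /* elapsed time in nanoseconds */
----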

=== Timer
Timers are how ODP applications measure and respond to the passage of time.
Timers are drawn from specialized pools called timer pools that have their
own abstract type (`odp_timer_pool_t`). Applications may have many timers
active at the same time and can set them to use either relative or absolute
time. When timers expire they create events of type `odp_timeout_t`, which
serve as notifications of timer expiration.

=== Synchronizer
Multiple threads operating in parallel typically require various
synchronization services to permit them to operate in a reliable and
coordinated manner. ODP provides a rich set of locks, barriers, and similar
synchronization primitives, as well as abstract types for representing various
types of atomic variables. The ODP event model also makes use of queues to
avoid the need for explicit locking in many cases. This will be discussed
in the next section.

== ODP Components
Building on ODP concepts, ODP offers several components that relate to the
flow of work through an ODP application. These include the Classifier,
Scheduler, and Traffic Manager.  These components relate to the three
main stages of packet processing: Receive, Process, and Transmit.

=== Classifier
The *Classifier* provides a suite of APIs that control packet receive (RX)
processing.

.ODP Receive Processing with Classifier
image::odp_rx_processing.svg[align="center"]

The classifier provides two logically related services:
[horizontal]
Packet parsing:: Verifying and extracting structural information from a
received packet.

Packet classification:: Applying *Pattern Matching Rules (PMRs)* to the
parsed results to assign an incoming packet to a *Class of Service (CoS)*.

Combined, these permit incoming packets to be sorted into *flows*, which are
logically related sequences of packets that share common processing
requirements. While many data plane applications perform stateless packet
processing (_e.g.,_ for simple forwarding) others perform stateful packet
processing.  Flows anchor state information relating to these groups of
packets.

A CoS determines two variables for packets belonging to a flow:
* The pool that they will be stored in on receipt
* The queue that they will be added to for processing

The PMRs supported by ODP permit flow determination based on combinations of
packet field values (tuples). The main advantage of classification is that on
many platforms these functions are performed in hardware, meaning that
classification occurs at line rate as packets are being received without
any explicit processing by the ODP application.

Note that the use of the classifier is optional.  Applications may directly
receive packets from a corresponding PktIO input queue via direct polling
if they choose.

=== Scheduler
The *Scheduler* provides a suite of APIs that control scalable event
processing.

.ODP Scheduler and Event Processing
image::odp_scheduling.svg[align="center"]

The Scheduler is responsible for selecting and dispatching one or more events
to a requesting thread. Event selection is based on several factors involving
both the queues containing schedulable events and the thread making an
`odp_schedule()` or `odp_schedule_multi()` call.

ODP queues have a _scheduling priority_ that determines how urgently events
on them should be processed relative to events contained in other queues.
Queues also have a _scheduler group id_ associated with them that must match
the associated scheduler group _thread mask_ of the thread calling the
scheduler. This permits events to be grouped for processing into classes and
have threads that are dedicated to processing events from specified classes.
Threads can join and leave scheduler groups dynamically, permitting easy
application response to increases in demand.

When a thread receives an event from the scheduler, it in turn can invoke
other processing engines via ODP APIs (_e.g.,_ crypto processing) that
can operate asynchronously. When such processing is complete, the result is
that a *completion event* is added to a schedulable queue where it can be
scheduled back to a thread to continue processing with the results of the
requested asynchronous operation.

Threads themselves can enqueue events to queues for downstream processing
by other threads, permitting flexibility in how applications structure
themselves to maximize concurrency.

=== Traffic Manager
The *Traffic Manager* provides a suite of APIs that control traffic shaping and
Quality of Service (QoS) processing for packet output.

.ODP Transmit processing with Traffic Manager
image::odp_traffic_manager.svg[align="center"]

The final stage of packet processing is transmission. Here, applications have
several choices.  As with RX processing, applications may send packets
directly to PktIO TX queues for direct transmission.  Often, however,
applications need to perform traffic shaping and related
*Quality of Service (QoS)* processing on the packets comprising a flow as part
of transmit processing. To handle this need, ODP provides a suite of
*Traffic Manager* APIs that permit programmatic establishment of arbiters,
shapers, etc. that control output packet processing to achieve desired QoS
goals. Again, the advantage here is that on many platforms traffic management
functions are implemented in hardware, permitting transparent offload of
this work.

== ODP Application Programming Structure

=== The include structure
Applications only include the 'include/odp_api.h' file, which includes the
'platform/<implementation name>/include/odp/api' files to provide a complete
definition of the API on that platform. The doxygen documentation defining
the behavior of the ODP API is all contained in the public API files, and the
actual definitions for an implementation will be found in the per platform
directories. Per-platform data that might normally be a `#define` can be
recovered via the appropriate access function if the `#define` is not directly
visible to the application.

.Users include structure
----
./
├── include/
│   ├── odp/
│   │   └── api/
│   │       └── spec/
│   │           └── The Public API and the documentation.
│   │
│   │
│   ├── odp_api.h   This file should be the only file included by the
│   │               application.
----

=== Initialization
IMPORTANT: ODP depends on the application to perform a graceful shutdown;
the terminate functions should be called only when the application is
sure it has closed the ingress and subsequently drained all queues, etc.

=== Startup
The first API that must be called by an ODP application is `odp_init_global()`.
This takes two pointers. The first, `odp_init_t`, contains ODP initialization
data that is platform independent and portable, while the second,
`odp_platform_init_t`, is passed unparsed to the implementation
to be used for platform specific data that is not yet, or may never be
suitable for the ODP API.

Calling `odp_init_global()` establishes the ODP API framework and MUST be
called before any other ODP API may be called. Note that it is only called
once per application. Following global initialization, each thread in turn
calls `odp_init_local()`. This establishes the local ODP thread
context for that thread and MUST be called before other ODP APIs may be
called by that thread. The sole argument to this call is the _thread type_,
which is either `ODP_THREAD_WORKER` or `ODP_THREAD_CONTROL`.

=== Shutdown
Shutdown is the logical reverse of the initialization procedure, with
`odp_term_local()` called for each thread before `odp_term_global()` is
called to terminate ODP.
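
Putting startup and shutdown together, a minimal sketch of a single-threaded
application skeleton (passing `NULL` pointers to request implementation
defaults for both initialization parameter structures):

.ODP startup and shutdown skeleton
[source,c]
----
int main(void)
{
	/* called once per application, before any other ODP API */
	if (odp_init_global(NULL, NULL))
		return -1;

	/* establishes this thread's local ODP context */
	if (odp_init_local(ODP_THREAD_CONTROL))
		return -1;

	...application processing

	/* shutdown is the reverse of initialization */
	odp_term_local();
	odp_term_global();
	return 0;
}
----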

=== Application Initialization/Termination Structure
ODP Applications follow the general structure flow shown below:

.ODP Application Structure Flow Diagram
image::resource_management.svg[align="center"]

== Common Conventions
Many ODP APIs share common conventions regarding their arguments and return
types. This section highlights some of the more common and frequently used
conventions.

=== Handles and Special Designators
ODP resources are represented via _handles_ that have abstract type
_odp_resource_t_.  So pools are represented by handles of type `odp_pool_t`,
queues by handles of type `odp_queue_t`, etc. Each such type
has a distinguished value _ODP_RESOURCE_INVALID_ that is used to indicate a
handle that does not refer to a valid resource of that type. Resources are
typically created via an API named _odp_resource_create()_ that returns a
handle of type _odp_resource_t_ that represents the created object. This
returned handle is set to _ODP_RESOURCE_INVALID_ if, for example, the
resource could not be created due to resource exhaustion. Invalid resources
do not necessarily represent error conditions. For example, `ODP_EVENT_INVALID`
in response to an `odp_queue_deq()` call to get an event from a queue simply
indicates that the queue is empty.
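
For example, a minimal sketch of testing for the distinguished invalid value
(assuming `q` is a valid queue handle):

.testing for an invalid handle
[source,c]
----
odp_event_t ev = odp_queue_deq(q);

if (ev == ODP_EVENT_INVALID) {
	/* not an error: the queue is simply empty */
}
----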

=== Addressing Scope
Unless specifically noted in the API, all ODP resources are global to the ODP
application, whether it runs as a single process or multiple processes. ODP
handles therefore have common meaning within an ODP application but have no
meaning outside the scope of the application.

=== Resources and Names
Many ODP resource objects, such as pools and queues, support an
application-specified character string _name_ that is associated with an ODP
object at create time. This name serves two purposes: documentation and
lookup. The lookup function is particularly useful to allow an ODP application
that is divided into multiple processes to obtain the handle for the common
resource.
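
A minimal sketch (the name "app_input" is illustrative): one process creates a
named queue and another process of the same application obtains a handle to it
by lookup:

.locating a resource by name
[source,c]
----
/* in the creating process */
odp_queue_t q = odp_queue_create("app_input", ODP_QUEUE_TYPE_POLL, NULL);

/* in another process of the same ODP application */
odp_queue_t q2 = odp_queue_lookup("app_input");
if (q2 == ODP_QUEUE_INVALID) {
	...no queue of that name exists
}
----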

== Shared memory
=== Allocating shared memory
Blocks of shared memory can be created using the `odp_shm_reserve()` API
call. The call expects a shared memory block name, a block size, an alignment
requirement, and optional flags as parameters. It returns a `odp_shm_t`
handle. The size and alignment requirement are given in bytes.

.creating a block of shared memory
[source,c]
----
#define ALIGNMENT 128
#define BLKNAME "shared_items"

odp_shm_t shm;
uint32_t shm_flags = 0;

typedef struct {
...
} shared_data_t;

shm = odp_shm_reserve(BLKNAME, sizeof(shared_data_t), ALIGNMENT, shm_flags);
----

=== Getting the shared memory block address
The returned `odp_shm_t` handle can then be used to retrieve the actual
address (in the caller's ODP thread virtual address space) of the created
shared memory block.

.getting the address of a shared memory block
[source,c]
----
shared_data_t *shared_data;
shared_data = odp_shm_addr(shm);
----

The address returned by `odp_shm_addr()` is valid only in the calling ODP
thread's address space: `odp_shm_t` handles can be shared between ODP threads
and remain valid in any thread, whereas the address returned by
`odp_shm_addr(shm)` may differ from ODP thread to ODP thread (for the same
'shm' block), and should therefore not be shared between ODP threads.
For instance, it would be correct to send a shm handle using IPC between two
ODP threads and let each of these threads do its own `odp_shm_addr()` call to
get the block address. Directly sending the address returned by
`odp_shm_addr()` from one ODP thread to another would, however, possibly fail
(the address may be meaningless in the receiver's address space).

The address returned by `odp_shm_addr()` is nevertheless guaranteed to be
aligned according to the alignment requirements provided at block creation
time, even if the call to `odp_shm_addr()` is performed by a different ODP
thread than the one which originally called `odp_shm_reserve()`.

All shared memory blocks are contiguous in any ODP thread's address space:
the range 'address' to 'address'\+'size' (where 'size' is the shared memory
block size, as provided in the `odp_shm_reserve()` call) is readable and
writeable and maps the whole shared memory block. There is no fragmentation.

=== Memory behaviour
By default, ODP threads are assumed to behave as cache coherent systems: any
change performed on a shared memory block is guaranteed to eventually become
visible to other ODP threads sharing this memory block. Nevertheless, there is
no implicit memory barrier associated with any action on shared memory:
*when* a change performed by one ODP thread becomes visible to another ODP
thread is not known. An application using shared memory blocks has to use the
memory barriers provided by ODP to guarantee shared data validity between
ODP threads.

The virtual address at which a given memory block is mapped in different ODP
threads may differ from ODP thread to ODP thread, if ODP threads have separate
virtual spaces (for instance if ODP threads are implemented as processes).
However, the ODP_SHM_SINGLE_VA flag can be used at `odp_shm_reserve()` time
to guarantee address uniqueness in all ODP threads, regardless of their
implementation or creation time.

=== Lookup by name
As mentioned, shared memory handles can be sent from ODP threads to ODP
threads using any IPC mechanism, and then the block address retrieved.
A simpler approach to get the shared memory block handle of an already created
block is to use the `odp_shm_lookup()` API function call.
This nevertheless requires the calling ODP thread to provide the name of the
shared memory block:
`odp_shm_lookup()` will return `ODP_SHM_INVALID` if no shared memory block
with the provided name is known by ODP.

.retrieving a block handle and address from another ODP thread
[source,c]
----
#define BLKNAME "shared_items"

odp_shm_t shm;
shared_data_t *shared_data;

shm = odp_shm_lookup(BLKNAME);
if (shm != ODP_SHM_INVALID) {
	shared_data = odp_shm_addr(shm);
	...
}
----

=== Freeing memory
Freeing shared memory is performed using the `odp_shm_free()` API call.
`odp_shm_free()` takes a single argument: the shared memory block handle.
Any ODP thread is allowed to perform a `odp_shm_free()` on a shared memory
block (i.e. the thread performing the `odp_shm_free()` may be different
from the thread which did the `odp_shm_reserve()`). Shared memory blocks should
be freed only once, and once freed, a shared memory block should no longer
be referenced by any ODP threads.

.freeing a shared memory block
[source,c]
----
if (odp_shm_free(shm) != 0) {
	...//handle error
}
----

=== Sharing memory with the external world
ODP provides ways of sharing memory with entities located outside
ODP instances:

Sharing a block of memory with an external (non-ODP) thread is achieved
by setting the ODP_SHM_PROC flag at `odp_shm_reserve()` time.
How the memory block is retrieved on the Operating System side is
implementation and Operating System dependent.

Sharing a block of memory with an external ODP instance (running
on the same Operating System) is achieved
by setting the ODP_SHM_EXPORT flag at `odp_shm_reserve()` time.
A block of memory created with this flag in an ODP instance A can be "mapped"
into a remote ODP instance B (on the same OS) by calling
`odp_shm_import()` on ODP instance B:

.sharing memory between ODP instances: instance A
[source,c]
----
odp_shm_t shmA;
shmA = odp_shm_reserve("memoryA", size, 0, ODP_SHM_EXPORT);
----

.sharing memory between ODP instances: instance B
[source,c]
----
odp_shm_t shmB;
odp_instance_t odpA;

/* get ODP A instance handle by some OS method */
odpA = ...

/* get the shared memory exported by A: */
shmB = odp_shm_import("memoryA", odpA, "memoryB", 0, 0);
----

Note that the handles shmA and shmB are scoped by each ODP instance
(you cannot use them outside the ODP instance they belong to).
Also note that both ODP instances have to call `odp_shm_free()` when done.

=== Memory creation flags
The last argument to `odp_shm_reserve()` is a set of ORed flags.
The following flags are supported:

==== ODP_SHM_PROC
When this flag is given, the allocated shared memory will become visible
outside ODP. Non-ODP threads (e.g. ordinary Linux processes or threads)
will be able to access the memory using native (non-ODP) OS calls such as
`shm_open()` and `mmap()` (on Linux).
Each ODP implementation should provide a description on exactly how
this mapping should be done on that specific platform.

==== ODP_SHM_EXPORT
When this flag is given, the allocated shared memory will become visible
to other ODP instances running on the same OS.
Other ODP instances willing to see this exported memory should use the
`odp_shm_import()` ODP function.

==== ODP_SHM_SW_ONLY
This flag tells ODP that the shared memory will be used by the ODP application
software only: no HW (such as DMA, or other accelerator) will ever
try to access the memory. No other ODP call will be involved on this memory
(as ODP calls could implicitly involve HW, depending on the ODP
implementation), except for `odp_shm_lookup()` and `odp_shm_free()`.
ODP implementations may use this flag as a hint for performance optimization,
or may simply ignore it.

==== ODP_SHM_SINGLE_VA
This flag is used to guarantee the uniqueness of the address at which
the shared memory is mapped: without this flag, a given memory block may be
mapped at different virtual addresses (assuming the target has virtual
addresses) by different ODP threads. This means that the value returned by
`odp_shm_addr()` would differ between threads in this case.
Setting this flag guarantees that all ODP threads sharing this memory
block will see it at the same address (`odp_shm_addr()` would return the
same value in all ODP threads for a given memory block).
Note that ODP implementations may have restrictions on the amount of memory
which can be allocated with this flag.

== Queues
Queues are the fundamental event sequencing mechanism provided by ODP and all
ODP applications make use of them either explicitly or implicitly. Queues are
created via the `odp_queue_create()` API, which returns a handle of type
`odp_queue_t` that is used to refer to this queue in all subsequent APIs that
reference it. Queues have one of two ODP-defined _types_, POLL and SCHED, that
determine how they are used. POLL queues are directly managed by the ODP
application, while SCHED queues make use of the *ODP scheduler* to provide
automatic scalable dispatching and synchronization services.

.Operations on POLL queues
[source,c]
----
odp_queue_t poll_q1 = odp_queue_create("poll queue 1", ODP_QUEUE_TYPE_POLL, NULL);
odp_queue_t poll_q2 = odp_queue_create("poll queue 2", ODP_QUEUE_TYPE_POLL, NULL);
...
odp_event_t ev = odp_queue_deq(poll_q1);
...do something
int rc = odp_queue_enq(poll_q2, ev);
----

The key distinction is that dequeueing events from POLL queues is an
application responsibility while dequeueing events from SCHED queues is the
responsibility of the ODP scheduler.

.Operations on SCHED queues
[source,c]
----
odp_queue_param_t qp;
odp_queue_param_init(&qp);
odp_schedule_prio_t prio = ...;
odp_schedule_group_t sched_group = ...;
qp.sched.prio = prio;
qp.sched.sync = ODP_SCHED_SYNC_[NONE|ATOMIC|ORDERED];
qp.sched.group = sched_group;
qp.sched.lock_count = n; /* Only relevant for ordered queues */
odp_queue_t sched_q1 = odp_queue_create("sched queue 1", ODP_QUEUE_TYPE_SCHED, &qp);

...thread init processing

while (1) {
        odp_event_t ev;
        odp_queue_t which_q;
        ev = odp_schedule(&which_q, <wait option>);
        ...process the event
}
----

With scheduled queues, events are sent to a queue, and the sender chooses
a queue based on the service it needs. The sender does not need to know
which ODP thread (on which core) or hardware accelerator will process
the event, but all the events on a queue are eventually scheduled and processed.

As can be seen, SCHED queues have additional attributes that are specified at
queue create time to control how the scheduler is to process events contained
on them. These include group, priority, and synchronization class.

=== Scheduler Groups
The scheduler's dispatching job is to return the next event from the highest
priority SCHED queue that the caller is eligible to receive events from.
This latter consideration is determined by the queue's _scheduler group_,
which is set at queue create time, and by the caller's _scheduler group mask_
that indicates which scheduler group(s) it belongs to. Scheduler groups are
represented by handles of type `odp_schedule_group_t` and are created by
the *odp_schedule_group_create()* API. A number of scheduler groups are
_predefined_ by ODP. These include `ODP_SCHED_GROUP_ALL` (all threads),
`ODP_SCHED_GROUP_WORKER` (all worker threads), and `ODP_SCHED_GROUP_CONTROL`
(all control threads). The application is free to create additional scheduler
groups for its own purposes, and threads can join or leave scheduler groups
using the *odp_schedule_group_join()* and *odp_schedule_group_leave()* APIs.
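
A minimal sketch of creating a scheduler group and having the calling thread
join it (the group name is illustrative):

.creating and joining a scheduler group
[source,c]
----
odp_thrmask_t mask;
odp_schedule_group_t grp;

/* create a group with an initially empty thread mask */
odp_thrmask_zero(&mask);
grp = odp_schedule_group_create("rx_workers", &mask);

/* the calling thread adds itself to the group */
odp_thrmask_zero(&mask);
odp_thrmask_set(&mask, odp_thread_id());
odp_schedule_group_join(grp, &mask);
----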

=== Scheduler Priority
The `prio` field of the `odp_queue_param_t` specifies the queue's scheduling
priority, which is how queues within eligible scheduler groups are selected
for dispatch. Queues have a default scheduling priority of NORMAL but can be
set to HIGHEST or LOWEST according to application needs.

=== Scheduler Synchronization
In addition to its dispatching function, which provides automatic scalability
to ODP applications in many-core environments, the other main function of the
scheduler is to provide event synchronization services that greatly simplify
application programming in a parallel processing environment. A queue's
SYNC mode determines how the scheduler handles the synchronization processing
of multiple events originating from the same queue.

Three types of queue scheduler synchronization are supported: Parallel,
Atomic, and Ordered.

==== Parallel Queues
SCHED queues that specify a sync mode of ODP_SCHED_SYNC_NONE are unrestricted
in how events are processed.

.Parallel Queue Scheduling
image::parallel_queue.svg[align="center"]

All events held on parallel queues are eligible to be scheduled simultaneously
and any required synchronization between them is the responsibility of the
application. Events originating from parallel queues thus have the highest
throughput rate, however they also potentially involve the most work on the
part of the application. In the Figure above, four threads are calling
*odp_schedule()* to obtain events to process. The scheduler has assigned
three events from the first queue to three threads in parallel. The fourth
thread is processing a single event from the third queue. The second queue
might either be empty, of lower priority, or not in a scheduler group matching
any of the threads being serviced by the scheduler.

==== Atomic Queues
Atomic queues simplify event synchronization because only a single thread may
process event(s) from a given atomic queue at a time. Events scheduled from
atomic queues thus can be processed lock free because the locking is done
implicitly by the scheduler. Note that the caller may receive one or
more events from the same atomic queue if *odp_schedule_multi()* is used. In
this case these multiple events all share the same atomic scheduling context.

.Atomic Queue Scheduling
image::atomic_queue.svg[align="center"]

In this example, no matter how many events may be held in an atomic queue,
only one calling thread can receive scheduled events from it at a time. Here
two threads process events from two different atomic queues. Note that there
is no synchronization between different atomic queues, only between events
originating from the same atomic queue. The queue context associated with the
atomic queue is held until the next call to the scheduler or until the
application explicitly releases it via a call to
*odp_schedule_release_atomic()*.

Note that while atomic queues simplify programming, the serial nature of
atomic queues may impair scaling.
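
When only part of the event processing needs atomic protection, the context
can be released early as a performance hint. A minimal sketch (`which_q` and
the processing steps are placeholders):

.releasing an atomic context early
[source,c]
----
while (1) {
	odp_event_t ev = odp_schedule(&which_q, ODP_SCHED_WAIT);

	...processing that must be serialized per atomic queue

	/* advisory: the atomic context is no longer needed, so the
	 * scheduler may dispatch further events from this queue */
	odp_schedule_release_atomic();

	...remaining processing may proceed in parallel
}
----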

==== Ordered Queues
Ordered queues provide the best of both worlds by combining the inherent
scalability of parallel queues with the easy synchronization of atomic
queues.

.Ordered Queue Scheduling
image::ordered_queue.svg[align="center"]

When scheduling events from an ordered queue, the scheduler dispatches multiple
events from the queue in parallel to different threads, however the scheduler
also ensures that the relative sequence of these events on output queues
is identical to their sequence from their originating ordered queue.

As with atomic queues, the ordering guarantees associated with ordered queues
refer to events originating from the same queue, not to those originating on
different queues. Thus in this figure three threads are processing events 5, 3,
and 4, respectively from the first ordered queue. Regardless of how these
threads complete processing, these events will appear in their original
relative order on their output queue.

==== Order Preservation
Relative order is preserved independent of whether events are being sent to
different output queues.  For example, if some events are sent to output queue
A while others are sent to output queue B then the events on these output
queues will still be in the same relative order as they were on their
originating queue.  Similarly, if the processing consumes events so that no
output is issued for some of them (_e.g.,_ as part of IP fragment reassembly
processing) then other events will still be correctly ordered with respect to
these sequence gaps. Finally, if multiple events are enqueued for a given
order (_e.g.,_ as part of packet segmentation processing for MTU
considerations), then each of these events will occupy the originator's
sequence in the target output queue(s). In this case the relative order of these
events will be in the order that the thread issued *odp_queue_enq()* calls for
them.

The ordered context associated with the dispatch of an event from an ordered
queue lasts until the next scheduler call or until explicitly released by
the thread calling *odp_schedule_release_ordered()*. This call may be used
as a performance advisory that the thread no longer requires ordering
guarantees for the current context. As a result, any subsequent enqueues
within the current scheduler context will be treated as if the thread was
operating in a parallel queue context.

==== Ordered Locking
Another powerful feature of the scheduler's handling of ordered queues is
*ordered locks*. Each ordered queue has associated with it a number of ordered
locks as specified by the _lock_count_ parameter at queue create time.

Ordered locks provide an efficient means to perform in-order sequential
processing within an ordered context. For example, suppose events with
relative order 5, 6, and 7 are being processed in parallel by three different
threads. An
ordered lock will enable these threads to synchronize such that they can
perform some critical section in their originating queue order. The number of
ordered locks supported for each ordered queue is implementation dependent (and
queryable via the *odp_config_max_ordered_locks_per_queue()* API). If the
implementation supports multiple ordered locks then these may be used to
protect different ordered critical sections within a given ordered context.

==== Summary: Ordered Queues
To see how these considerations fit together, consider the following code:

.Processing with Ordered Queues
[source,c]
----
void worker_thread()
{
        odp_event_t ev;
        odp_queue_t which_q;

        odp_init_local(ODP_THREAD_WORKER);
        ...other initialization processing

        while (1) {
                ev = odp_schedule(&which_q, ODP_SCHED_WAIT);
                ...process events in parallel
                odp_schedule_order_lock(0);
                ...critical section processed in order
                odp_schedule_order_unlock(0);
                ...continue processing in parallel
                odp_queue_enq(dest_q, ev);
        }
}
----

This represents a simplified structure for a typical worker thread operating
on ordered queues. Multiple events are processed in parallel and the use of
ordered queues ensures that they will be placed on `dest_q` in the same order
as they originated.  While processing in parallel, the use of ordered locks
enables critical sections to be processed in order within the overall parallel
flow. When a thread arrives at the *odp_schedule_order_lock()* call, it waits
until the locking order for this lock for all prior events has been resolved
and then enters the critical section. The *odp_schedule_order_unlock()* call
releases the critical section and allows the next order to enter it.

=== Queue Scheduling Summary

NOTE: Both ordered and parallel queues improve throughput over atomic queues
due to parallel event processing, but require that the application take
steps to ensure context data synchronization if needed.

include::users-guide-packet.adoc[]

include::users-guide-pktio.adoc[]

include::users-guide-timer.adoc[]

== Cryptographic services

ODP provides APIs to perform cryptographic operations required by various
communication protocols (e.g. IPSec). ODP cryptographic APIs are session based.

ODP provides APIs for the following cryptographic services:

* Ciphering
* Authentication/data integrity via Keyed-Hashing (HMAC)
* Random number generation
* Crypto capability inquiries

=== Crypto Sessions

To apply a cryptographic operation to a packet, a session must be created.
All packets processed by a session share the parameters that define the
session.

ODP supports synchronous and asynchronous crypto sessions. For asynchronous
sessions, the output of a crypto operation is posted to a queue defined as
the completion queue in its session parameters.

ODP crypto APIs support chained operation sessions in which hashing and ciphering
both can be achieved using a single session and operation call. The order of
ciphering and hashing can be controlled by the `auth_cipher_text` session
parameter.

Other session parameters include algorithms, keys, an optional initialization
vector, encode or decode direction, the output queue for async mode, and the
output packet pool for allocation of an output packet if required.
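
As a hedged sketch (field and constant names below follow the session
parameters described above, but the exact structure layout depends on the ODP
API level in use; the keys, completion queue, and output pool are assumed to
have been prepared earlier), creating an encode session that ciphers and then
authenticates might look like:

.creating a crypto session (sketch)
[source,c]
----
odp_crypto_session_params_t params;
odp_crypto_session_t session;
odp_crypto_ses_create_err_t status;

memset(&params, 0, sizeof(params));
params.op = ODP_CRYPTO_OP_ENCODE;       /* encode (encrypt) direction */
params.auth_cipher_text = 1;            /* hash computed over ciphered text */
params.pref_mode = ODP_CRYPTO_ASYNC;    /* prefer async completion */
params.cipher_alg = ODP_CIPHER_ALG_AES128_CBC;
params.cipher_key = cipher_key;         /* prepared odp_crypto_key_t */
params.auth_alg = ODP_AUTH_ALG_MD5_96;
params.auth_key = auth_key;
params.compl_queue = compl_queue;       /* completion queue for async results */
params.output_pool = out_pool;          /* pool for output packet allocation */

if (odp_crypto_session_create(&params, &session, &status)) {
	...handle session creation error
}
----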

=== Crypto operations

After session creation, a cryptographic operation can be applied to a packet
using the `odp_crypto_operation()` API. Applications may indicate a preference
for synchronous or asynchronous processing in the session's `pref_mode` parameter.
However, crypto operations may complete synchronously even if an asynchronous
preference is indicated, and applications must examine the `posted` output
parameter from `odp_crypto_operation()` to determine whether the operation has
completed or if an `ODP_EVENT_CRYPTO_COMPL` notification is expected. In the case
of an async operation, the `posted` output parameter will be set to true.


The operation arguments specify for each packet the areas that are to be
encrypted or decrypted and authenticated. Also, there is an option of overriding
the initialization vector specified in session parameters.

An operation can be executed in in-place, out-of-place, or new buffer mode.
In in-place mode the output packet is the same as the input packet. In
out-of-place mode the output packet is different from the input packet, as
specified by the application, while in new buffer mode the implementation
allocates a new output buffer from the session’s output pool.
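
A hedged sketch of applying a previously created `session` to a packet in
in-place mode (`payload_offset` and `payload_len` are illustrative; exact
structure fields depend on the ODP API level in use):

.applying a crypto operation (sketch)
[source,c]
----
odp_crypto_op_params_t op_params;
odp_bool_t posted;
odp_crypto_op_result_t result;

memset(&op_params, 0, sizeof(op_params));
op_params.session = session;
op_params.pkt = pkt;      /* input packet */
op_params.out_pkt = pkt;  /* same packet: in-place mode */
op_params.cipher_range.offset = payload_offset;
op_params.cipher_range.length = payload_len;

if (odp_crypto_operation(&op_params, &posted, &result)) {
	...handle error
}

if (posted) {
	...async: an ODP_EVENT_CRYPTO_COMPL event will arrive on the
	...session's completion queue
} else {
	...sync: the operation is complete and result is valid here
}
----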

The application can also specify a context associated with a given operation that
will be retained during async operation and can be retrieved via the completion
event.

Results of an asynchronous session will be posted as completion events to the
session’s completion queue, which can be accessed directly or via the ODP
scheduler. The completion event contains the status of the operation and the
result. The application has the responsibility to free the completion event.

=== Random number Generation

ODP provides an API, `odp_random_data()`, to generate random data bytes. It
has an argument that specifies whether or not to use the system entropy
source for random number generation.
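
For example, a minimal sketch of filling a buffer with random bytes,
requesting use of the system entropy source:

.generating random data (sketch)
[source,c]
----
uint8_t buf[16];

if (odp_random_data(buf, sizeof(buf), 1) != sizeof(buf)) {
	...handle error
}
----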

=== Capability inquiries

ODP provides an API, `odp_crypto_capability()`, to inquire about an
implementation’s crypto capabilities. This interface returns bitmasks for
supported algorithms and hardware backed algorithms.

include::users-guide-tm.adoc[]

include::users-guide-cls.adoc[]

include::../glossary.adoc[]