Commit Graph

5 Commits

Author SHA1 Message Date
Dave Barach
6931f59e54 Add per-thread, per-node runtime stats serialization
Change-Id: Ic4009cdbac67b7cd53c88079439496b9d9dfaa35
Signed-off-by: Dave Barach <dave@barachs.net>
2016-05-21 10:04:06 -04:00
Damjan Marion
1c80e831b7 Add support for multiple microarchitectures in single binary
* compiler -march= parameter is changed from native to corei7
   so code is always genereted with instructions which are available
   on the Nehalem microarchitecture (up to SSE4.2)

 * compiler -mtune= parameter is added so code is optimized for
   corei7-avx which equals to Sandy Bridge microarchitecture

 * set of macros is added which allows run-time detection of available
   cpu instructions (e.g. clib_cpu_supports_avx())

 * set of macros is added which allows us to clone graph node funcitons
   where cloned function is optmized for different microarchitecture
   Those macros are using following attributes:
     __attribute__((flatten))
     __attribute__((target("arch=core-avx2)))

   I.e. If applied to foo_node_fn() macro will generate cloned
   functions foo_node_fn_avx2() and foo_node_fn_avx512() (future)
   It will also generate function void * foo_node_fn_multiarch_select()
   which detects available instruction set and returns pointer to the
   best matching function clone.

Change-Id: I2dce0ac92a5ede95fcb56f47f3d1f3c4c040bac0
Signed-off-by: Damjan Marion <damarion@cisco.com>
2016-05-19 18:14:38 +02:00
Andrew Yourtchenko
716d959349 Avoid clobbering output_function by concurrent CLI sessions doing vlib_process_wait_for_event*.
A problem is easily reproducible by taking the test harness code from the commit,
and launching it in two terminals with some time overlap - the outputs will
be sent to the wrong session. This commit moves the output_function and argument
from a global structure into the process structure, thus the output_function
is not clobbered anymore and each session gets only its own output.

To ensure the callers can redirect the outputs to different destinations
(e.g. the API calls via shared memory, etc.) the existing logic
for vlib_cli_input() was retained.

To avoid the magic numbers usage in the logic that does the page-alignment
of the process stack, there are changes around the stack[] member
of vlib_process_t. Also added a compile-time assert to ensure that
the stack does indeed start on the page size multiple boundary.

Change-Id: I128680ac480735e5f214f81a884e414268e5d652
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
2016-05-10 17:13:00 +00:00
Damjan Marion
3f46baf1bd Increase VLIB_MAX_CPUS to 256
Change-Id: Iac68b38dda1a0f9e2242f9eab5b03e44bbcac269
Signed-off-by: Damjan Marion <damarion@cisco.com>
2016-02-14 00:02:31 +00:00
Ed Warnicke
cb9cadad57 Initial commit of vpp code.
Change-Id: Ib246f1fbfce93274020ee93ce461e3d8bd8b9f17
Signed-off-by: Ed Warnicke <eaw@cisco.com>
2015-12-08 15:47:27 -07:00