This page contains information about NETCONF (Scale, Performance etc.) testing.

Scale tests

Scale tests for NETCONF in ODL.

NETCONF southbound scale test

The goal of this test is to measure how many NETCONF devices ODL can mount with a given amount of RAM.

Scenario

  1. Start netconf-testtool, which starts the desired number of NETCONF servers
  2. Testtool generates the initial configuration for ODL
  3. ODL tries to connect to all of the simulated devices
  4. Measure the number of devices connected (if not all of them connected)
  5. Measure the time until ODL has connected to all the devices with a given amount of RAM (2, 4, 8, 16 GB)

How to

  1. Make sure the open file limit is set reasonably high for the number of devices started: https://wiki.opendaylight.org/view/OpenDaylight_Controller:Netconf:Testtool#Too_many_files_open
  2. Unpack a clean ODL distribution; the scale utility takes care of feature installation and config generation
  3. Download the NETCONF scale-util: https://nexus.opendaylight.org/content/repositories/opendaylight.snapshot/org/opendaylight/netconf/netconf-testtool/1.1.0-SNAPSHOT/netconf-testtool-1.1.0-20160308.161039-64-scale-util.jar
  4. Run the scale tool:
 java -Xmx8G -jar scale-util-1.1.0-SNAPSHOT-scale-util.jar --distribution-folder ./distribution-karaf-0.4.0-Beryllium --device-count 8000 --ssh false --exi false --generate-configs-batch-size 1000
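
Step 1 can be checked before launching the tool. The following is a minimal sketch that compares the open file limit with the planned device count. The function name has_enough_fds and the factor of two file descriptors per device are illustrative assumptions, not values taken from the testtool.

```shell
#!/usr/bin/env bash
# Succeeds when the open-file limit leaves headroom for the simulated devices.
# The factor of 2 descriptors per device is an assumed safety margin.
has_enough_fds() {
    local device_count="$1" limit="$2"
    # "unlimited" is what ulimit prints when no limit is set.
    [ "$limit" = "unlimited" ] && return 0
    [ "$limit" -ge $(( device_count * 2 )) ]
}

if has_enough_fds 8000 "$(ulimit -n)"; then
    echo "open file limit looks sufficient for 8000 devices"
else
    echo "open file limit is likely too low; raise it (for example with 'ulimit -n')"
fi
```

The limit has to be raised in the shell that starts both the testtool and ODL, since each side holds one end of every connection.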

The scale util needs to be pointed to an unpacked distribution (--distribution-folder argument); it handles the Karaf start and the feature installation. While the test is running, the utility also checks RESTCONF periodically to see the current status of all NETCONF devices. After the test completes successfully, Karaf is stopped, features are cleaned and the whole test is restarted with more devices (the increment is currently hardcoded to 1000; a parameter for this needs to be added). If you are starting with more than 2 GB of RAM, you should start with more than 8k devices right away. The test results are logged periodically into scale-results.log, located in the same directory as the util jar. If you are running with even more devices, the RAM for the testtool should be increased as well.

To run the test over TCP, add the --ssh false argument when starting the scale-util.

To rerun the test with more RAM available to ODL, edit the following line in the /${distribution-location}/bin/setenv script:

 export JAVA_MAX_MEM="4G"
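
When the test is rerun with several RAM sizes, the edit can be scripted. This is only a sketch: the helper name set_java_max_mem is hypothetical, and it assumes the setenv script contains a line of the form shown above (possibly indented). The demo runs against a temporary stand-in file rather than a real distribution.

```shell
#!/usr/bin/env bash
# Rewrites the value on the "export JAVA_MAX_MEM=..." line of a setenv script.
# Writes back through cat so the file keeps its permissions (executable bit).
set_java_max_mem() {
    local setenv_file="$1" new_value="$2" tmp status
    tmp="$(mktemp)"
    sed "s/^\([[:space:]]*export JAVA_MAX_MEM=\).*/\1\"${new_value}\"/" "$setenv_file" > "$tmp" \
        && cat "$tmp" > "$setenv_file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Demo on a stand-in file; a real run would pass <distribution>/bin/setenv.
demo="$(mktemp)"
printf 'export JAVA_MIN_MEM="128M"\nexport JAVA_MAX_MEM="2048m"\n' > "$demo"
set_java_max_mem "$demo" 4G
grep JAVA_MAX_MEM "$demo"
rm -f "$demo"
```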

NOTE: The fastest way to find out how many devices ODL can handle with a given amount of RAM is to start the test with more devices than it can handle and set the config batch size to 1k. You can then analyze the result log to see where the number of connected devices drops off or where the test timed out.

Results

Beryllium

Environment:


Other configuration:


In this test the simulated devices started by the testtool only had the base capabilities that the testtool includes by default (ietf-netconf-monitoring, ietf-inet-types, ietf-yang-types).


Netconf scale test

RAM for ODL: 2 GB
Transport  Devices  Time needed  Notes
ssh        8000     3m 40s       4k batches; starts having low-memory issues (most of the time spent in GC) around 8k devices
ssh        9000     12m 16s      1k batches; times out after 20 minutes
tcp        20000    6m 03s       4k batches; reached 20 min timeout, maybe can handle a bit more
tcp        21000    18m 54s      1k batches; reached 20 min timeout, maybe can handle a bit more

RAM for ODL: 4 GB
Transport  Devices  Time needed  Notes
ssh        14000    9m 28s
ssh        15000    17m 20s      timeout after 20 minutes; hits the RAM limit
tcp        24000    18m 31s      1k batches
tcp        28000    17m 27s      2k batches; timeout after 20 min, should be able to get higher but needs more time


With TCP we also noticed that beyond 15k devices there is a considerable slowdown in pushing/handling the configs, as the gaps between the individual batches keep increasing.
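
To compare runs with different device counts, the times in the results above can be turned into an average mount rate. The sketch below is plain arithmetic on the reported figures; the function name mount_rate is hypothetical, and the average hides the slowdown between batches described above.

```shell
#!/usr/bin/env bash
# Average number of devices mounted per second for a run that took
# MIN minutes and SEC seconds: mount_rate DEVICES MIN SEC
mount_rate() {
    local devices="$1" min="$2" sec="$3"
    awk -v d="$devices" -v m="$min" -v s="$sec" \
        'BEGIN { printf "%.1f\n", d / (m * 60 + s) }'
}

mount_rate 8000 3 40    # ssh, 2 GB:  8000 devices in 3m 40s
mount_rate 20000 6 3    # tcp, 2 GB: 20000 devices in 6m 03s
```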

Beryllium SR3

Environment:

Other configurations

How to

 git clone https://github.com/opendaylight/netconf.git
 cd netconf
 mvn clean install
 cp ~/netconf/netconf/tools/netconf-testtool/target/scale-util-1.2.0-SNAPSHOT.jar ~
 java -Xmx8G -jar scale-util-1.2.0-SNAPSHOT-scale-util.jar --distribution-folder ./distribution-karaf-0.4.0-Beryllium --device-count 8000 --ssh false --exi false --generate-configs-batch-size 1000

Results:
The NETCONF southbound test repeatedly failed with a "java.lang.OutOfMemoryError: unable to create a new native thread" error message. The tester used the command set defined above, with the -Xmx8G flag decreased to 4G and increased to 16G and 32G, all with the same result. The tester also increased the value of JAVA_MAX_MEM to 16G and 32G and still encountered the java.lang.OutOfMemoryError message. This particular error usually points to an operating system limit on threads or native memory rather than to Java heap exhaustion, which would be consistent with the heap size having no effect. Log files of the failure and the hs_err_pid log files were captured and saved for examination. These are available on request.

Performance tests

Performance tests for NETCONF in ODL.

NETCONF northbound performance test

The goal of this test is to measure the performance of an external NETCONF client uploading information into ODL (global datastore) using just the NETCONF northbound server.

Scenario

  1. ODL controller starts with simple l2fib models and NETCONF northbound for MD-SAL enabled
  2. A fast external NETCONF client writes lots of l2fib entries into MD-SAL's global datastore using the NETCONF northbound interface
  3. The client measures the time from sending out the first request until the last response is received
  4. After all the l2fibs are in ODL, performance is calculated in terms of requests and l2fibs written per second

How to

This how to will be split into multiple ordered sections:

ODL and tooling setup
 git clone https://git.opendaylight.org/gerrit/coretutorials
 cd coretutorials
 git checkout stable/beryllium
 git fetch https://git.opendaylight.org/gerrit/coretutorials refs/changes/16/35916/1 && git checkout FETCH_HEAD
 cd ncmount
 mvn clean install -DskipTests -Dcheckstyle.skip
 cd karaf/target/assembly/
 ./bin/karaf
  NETCONF Node: controller-config is fully connected
 curl -u "admin:admin" -H "Accept: application/xml" -H "Content-type: application/xml" --request POST 'http://localhost:8181/restconf/config/network-topology:network-topology/topology/topology-netconf/node/controller-config/yang-ext:mount/config:modules' --data '<module xmlns="urn:opendaylight:params:xml:ns:yang:controller:config"> \
             <type xmlns:prefix="urn:opendaylight:params:xml:ns:yang:controller:netconf:northbound:tcp">prefix:netconf-northbound-tcp</type> \
             <name>netconf-mdsal-tcp-server</name> \
             <dispatcher xmlns="urn:opendaylight:params:xml:ns:yang:controller:netconf:northbound:tcp"> \
                 <type xmlns:prefix="urn:opendaylight:params:xml:ns:yang:controller:config:netconf:northbound">prefix:netconf-server-dispatcher</type> \
                 <name>netconf-mdsal-server-dispatcher</name> \
             </dispatcher> \
         </module>'
 Netconf TCP endpoint started successfully at /0.0.0.0:2831
Testing over TCP
 edit-l2fib-1000.txt  edit-l2fib-1.txt           netconf-north-perf-test-files.zip                      stress-client-1.0.0-Beryllium-package
 edit-l2fib-100.txt   edit-l2fib-delete-all.txt  netconf-testtool-1.0.0-Beryllium-stress-client.tar.gz
 java -Xmx2G -XX:MaxPermSize=256M -jar stress-client-1.0.0-Beryllium-package/stress-client-1.0.0-Beryllium-stress-client.jar --ip 127.0.0.1 --port 2831 --edits 10000 --exi false --ssh false --username admin --password admin --thread-amount 1 --async false --edit-batch-size 10000 --edit-content edit-l2fib-1.txt
 FINISHED. Execution time: 5.585 s
 Requests per second: 1790.8309455587394
 curl -u "admin:admin" http://localhost:8181/restconf/config/ncmount-l2fib:bridge-domains | grep -o forward | wc -l
 java -jar stress-client-1.0.0-Beryllium-package/stress-client-1.0.0-Beryllium-stress-client.jar --ip 127.0.0.1 --port 2831 --edits 1 --exi false --ssh false --username admin --password admin --thread-amount 1 --async false --edit-content edit-l2fib-delete-all.txt


Note: The client has many configuration options. Use -h to see all of them.

Note: There are 3 edit-content files in the resources. Each contains a different number of l2fibs per edit-config: 1, 100 and 1000. Files with different amounts can be produced and used.
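
The verification command above counts l2fibs with grep -o forward | wc -l. The sketch below shows why -o matters, using a hypothetical and heavily simplified stand-in for the RESTCONF response (the real payload has more fields per entry). It assumes that the string "forward" appears exactly once per l2fib entry.

```shell
#!/usr/bin/env bash
# Hypothetical, simplified sample of the response body (one line, 3 entries).
sample='{"l2-fib":[{"action":"forward"},{"action":"forward"},{"action":"forward"}]}'

# grep -o prints every match on its own line, so wc -l counts occurrences
# even when the whole response arrives as a single line.
count_l2fibs() {
    grep -o forward | wc -l | tr -d ' '
}

printf '%s\n' "$sample" | count_l2fibs
```

Without -o, grep would print the single matching line once and wc -l would report 1 regardless of how many entries the response holds.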

Testing over SSH
 java -Xmx2G -XX:MaxPermSize=256M -jar stress-client-1.0.0-Beryllium-package/stress-client-1.0.0-Beryllium-stress-client.jar --ip 127.0.0.1 --port 2830 --edits 10000 --exi false --ssh true --username admin --password admin --thread-amount 1 --async false --edit-batch-size 10000 --edit-content edit-l2fib-1.txt
 FINISHED. Execution time: 11.64 s
 Requests per second: 859.106529209622
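
The reported rate is the number of edits divided by the execution time. The following sketch reproduces the SSH figure above from its inputs (10000 edits in 11.64 s) and extends it to l2fibs per second; the function names are hypothetical and the output is rounded to two decimals.

```shell
#!/usr/bin/env bash
# requests_per_second EDITS SECONDS
requests_per_second() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.2f\n", n / t }'
}

# l2fibs_per_second EDITS SECONDS L2FIBS_PER_EDIT
l2fibs_per_second() {
    awk -v n="$1" -v t="$2" -v k="$3" 'BEGIN { printf "%.2f\n", n * k / t }'
}

requests_per_second 10000 11.64    # SSH run above, edit-l2fib-1.txt
l2fibs_per_second 10000 11.64 1    # same run, 1 l2fib per edit
```

With edit-l2fib-100.txt or edit-l2fib-1000.txt the third argument becomes 100 or 1000, which is why the result tables list edits/s and l2fibs/s separately.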

Results

Beryllium

Environment:


Base configuration:


Measured numbers with a single client:

Netconf northbound single client performance
Client type  l2fib per request  TCP performance                SSH performance                Total l2fibs
Sync         1                  1 730 edits/s, 1 730 l2fibs/s  1 474 edits/s, 1 474 l2fibs/s  100k
Async        1                  7 063 edits/s, 7 063 l2fibs/s  6 600 edits/s, 6 600 l2fibs/s  100k
Sync         100                233 edits/s, 23 372 l2fibs/s   148 edits/s, 14 850 l2fibs/s   500k
Async        100                421 edits/s, 42 179 l2fibs/s   386 edits/s, 38 600 l2fibs/s   500k
Sync         500                61 edits/s, 30 935 l2fibs/s    13 edits/s, 6 590 l2fibs/s     1M
Async        500                81 edits/s, 40 894 l2fibs/s    69 edits/s, 34 500 l2fibs/s    1M
Sync         1000               35 edits/s, 35 365 l2fibs/s    13 edits/s, 13 248 l2fibs/s    1M
Async        1000               38 edits/s, 38 099 l2fibs/s    19 edits/s, 19 898 l2fibs/s    1M


Multiple clients:

Netconf northbound multiple client performance
Clients  Client type  l2fib per request  TCP performance                  SSH performance                  Total l2fibs
8        Sync         1                  23 010 edits/s, 23 010 l2fibs/s  13 847 edits/s, 13 847 l2fibs/s  400k
8        Async        1                  41 114 edits/s, 41 114 l2fibs/s  12 527 edits/s, 12 527 l2fibs/s  400k
16       Sync         1                  31 743 edits/s, 31 743 l2fibs/s  15 879 edits/s, 15 879 l2fibs/s  400k
16       Async        1                  43 252 edits/s, 43 252 l2fibs/s  12 496 edits/s, 12 496 l2fibs/s  400k
8        Sync         100                852 edits/s, 85 215 l2fibs/s     769 edits/s, 76 989 l2fibs/s     1.6M
8        Async        100                984 edits/s, 98 419 l2fibs/s     869 edits/s, 86 923 l2fibs/s     1.6M
16       Sync         100                808 edits/s, 80 885 l2fibs/s     723 edits/s, 72 345 l2fibs/s     1.6M
16       Async        100                852 edits/s, 85 224 l2fibs/s     749 edits/s, 74 962 l2fibs/s     1.6M
8        Sync         500                -                                -                                -
8        Async        500                -                                -                                -
16       Sync         500                -                                -                                -
16       Async        500                -                                -                                -
8        Sync         1000               -                                -                                -
8        Async        1000               -                                -                                -
16       Sync         1000               -                                -                                -
16       Async        1000               -                                -                                -


Beryllium SR3

Environment

Other configurations

N/A

Steps to recreate

 git clone https://git.opendaylight.org/gerrit/coretutorials
 cd coretutorials
 git checkout master
 cd ncmount
 mvn clean install -DskipTests -Dcheckstyle.skip
 cd karaf/target/assembly

(When the other set of instructions was followed and the tester executed "mvn clean install -DskipTests -Dcheckstyle.skip", Maven failed with "Non-resolvable parent POM for org.opendaylight.coretutorials:ncmount-aggregator:1.1.0-SNAPSHOT". Hand-editing all of the relevant pom.xml files proved impractical, as there were too many files to investigate and fix, so the tester used this workaround. With it, Maven at least completed the command.)

 mvn clean install

(Downloading the Beryllium version of the NETCONF stress tool from the URL in the wiki failed. The link returned the error "404 Not found: repository with ID: "autorelease-1074" not found".)

 ./bin/karaf

At this point, the expected message "NETCONF Node: controller-config is fully connected" was not found in the logs. The tester was not confident that he understood these instructions and stopped here, waiting for further instructions.

NETCONF southbound performance test

The goal of this test is to measure the performance of a NETCONF device uploading information into ODL using just NETCONF southbound notifications.

Scenario

  1. ODL controller mounts a simulated NETCONF device (simulates Cisco IOS XR thanks to some of its routing models)
  2. A small ODL application triggers a NETCONF notification stream for the new mountpoint
  3. The simulated device immediately starts sending routes with a certain number of prefixes into ODL
  4. The application waits until all notifications have been processed and measures the execution/receiving time
  5. The application outputs the performance numbers (notifications/second and prefixes/second) to the log


Notes:

How to

 java -jar netconf-testtool-1.0.0-Beryllium-executable.jar --schemas-dir ./xrSchemas/ --ssh false --exi false --notification-file i2rs-notifs-perf100k.xml
 ./ncmount-karaf-1.1.0-SNAPSHOT/bin/karaf

and wait until the following message appears in the log:

 NETCONF Node: controller-config is fully connected
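
Waiting for the message can be automated with a small polling loop. This is a sketch only: the function name wait_for_log_message is hypothetical, and the demo uses a temporary stand-in file. On a real setup the first argument would be the Karaf log of the distribution, which is assumed here to be data/log/karaf.log under the distribution folder.

```shell
#!/usr/bin/env bash
# Polls LOG_FILE once per second until MESSAGE appears or TIMEOUT_S elapses.
# wait_for_log_message LOG_FILE MESSAGE [TIMEOUT_S]
wait_for_log_message() {
    local log_file="$1" message="$2" timeout_s="${3:-120}"
    local waited=0
    until grep -qF -- "$message" "$log_file" 2>/dev/null; do
        [ "$waited" -ge "$timeout_s" ] && return 1
        sleep 1
        waited=$(( waited + 1 ))
    done
    return 0
}

# Demo on a stand-in log file that already contains the message.
demo_log="$(mktemp)"
echo "INFO NETCONF Node: controller-config is fully connected" > "$demo_log"
wait_for_log_message "$demo_log" "NETCONF Node: controller-config is fully connected" 5 \
    && echo "message found"
rm -f "$demo_log"
```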
 curl -u "admin:admin" -H "Accept: application/xml" -H "Content-type: application/xml" --request POST 'http://localhost:8181/restconf/config/network-topology:network-topology/topology/topology-netconf/node/controller-config/yang-ext:mount/config:modules' --data '<module xmlns="urn:opendaylight:params:xml:ns:yang:controller:config"> \
     <type xmlns:prefix="urn:opendaylight:params:xml:ns:yang:controller:md:sal:connector:netconf">prefix:sal-netconf-connector</type> \
     <name>controller-notif-100000</name> \
     <address xmlns="urn:opendaylight:params:xml:ns:yang:controller:md:sal:connector:netconf">127.0.0.1</address> \
     <port xmlns="urn:opendaylight:params:xml:ns:yang:controller:md:sal:connector:netconf">17830</port> \
     <username xmlns="urn:opendaylight:params:xml:ns:yang:controller:md:sal:connector:netconf">admin</username> \
     <password xmlns="urn:opendaylight:params:xml:ns:yang:controller:md:sal:connector:netconf">admin</password> \
     <tcp-only xmlns="urn:opendaylight:params:xml:ns:yang:controller:md:sal:connector:netconf">true</tcp-only> \
     <keepalive-delay xmlns="urn:opendaylight:params:xml:ns:yang:controller:md:sal:connector:netconf">0</keepalive-delay> \
     <event-executor xmlns="urn:opendaylight:params:xml:ns:yang:controller:md:sal:connector:netconf"> \
        <type xmlns:prefix="urn:opendaylight:params:xml:ns:yang:controller:netty">prefix:netty-event-executor</type> \
       <name>global-event-executor</name> \
     </event-executor> \
     <binding-registry xmlns="urn:opendaylight:params:xml:ns:yang:controller:md:sal:connector:netconf"> \
       <type xmlns:prefix="urn:opendaylight:params:xml:ns:yang:controller:md:sal:binding">prefix:binding-broker-osgi-registry</type> \
       <name>binding-osgi-broker</name> \
     </binding-registry> \
     <dom-registry xmlns="urn:opendaylight:params:xml:ns:yang:controller:md:sal:connector:netconf"> \
       <type xmlns:prefix="urn:opendaylight:params:xml:ns:yang:controller:md:sal:dom">prefix:dom-broker-osgi-registry</type> \
       <name>dom-broker</name> \
     </dom-registry> \
     <client-dispatcher xmlns="urn:opendaylight:params:xml:ns:yang:controller:md:sal:connector:netconf"> \
       <type xmlns:prefix="urn:opendaylight:params:xml:ns:yang:controller:config:netconf">prefix:netconf-client-dispatcher</type> \
       <name>global-netconf-dispatcher</name> \
     </client-dispatcher> \
     <processing-executor xmlns="urn:opendaylight:params:xml:ns:yang:controller:md:sal:connector:netconf"> \
       <type xmlns:prefix="urn:opendaylight:params:xml:ns:yang:controller:threadpool">prefix:threadpool</type> \
       <name>global-netconf-processing-executor</name> \
     </processing-executor> \
   </module>'
 NETCONF Node: controller-notif-100000 is fully connected
 Elapsed ms for 100000 notifications: 9092
 Performance (notifications/second): 10998.680158380994
 Performance (prefixes/second): 10998.680158380994
   curl -u "admin:admin" -H "Accept: application/xml" -H "Content-type: application/xml" --request DELETE  http://localhost:8181/restconf/config/network-topology:network-topology/topology/topology-netconf/node/controller-config/yang-ext:mount/config:modules/module/odl-sal-netconf-connector-cfg:sal-netconf-connector/controller-notif-100000
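
The logged performance figures follow directly from the elapsed time. The sketch below reproduces them from the log lines above (100000 notifications in 9092 ms); the function name rate_per_second is hypothetical, and the optional third argument stands for the number of prefixes per notification.

```shell
#!/usr/bin/env bash
# rate_per_second COUNT ELAPSED_MS [ITEMS_PER_NOTIFICATION]
rate_per_second() {
    awk -v n="$1" -v ms="$2" -v k="${3:-1}" \
        'BEGIN { printf "%.2f\n", n * k * 1000 / ms }'
}

rate_per_second 100000 9092      # notifications/second
rate_per_second 100000 9092 1    # prefixes/second with 1 prefix per notification
```

With one prefix per notification both numbers are identical, which matches the two "Performance" log lines above.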


Notes:

Results

Beryllium

Environment:


Configuration:


Measured performance send/receive of 1M Notifications:

Netconf southbound notification performance
Total notifications  Prefixes per notification  TCP performance                          SSH performance                          TCP+EXI performance
100k                 1                          10716 notifications/s, 10716 prefixes/s  9828 notifications/s, 9828 prefixes/s    -
100k                 2                          7112 notifications/s, 14224 prefixes/s   5496 notifications/s, 10992 prefixes/s   -
100k                 10                         1996 notifications/s, 19965 prefixes/s   1635 notifications/s, 16356 prefixes/s   -


* SSH test performance is worse over time (with test reruns on the same ODL) and more memory is used but not freed. Looks like a memory leak. https://bugs.opendaylight.org/show_bug.cgi?id=5488

Beryllium SR3

Test failed to run.

Environment

Steps performed

NETCONF end-to-end performance test

The goal of this test is to measure the end-to-end (external REST client -> REST north -> MD-SAL -> NETCONF south -> NETCONF device) performance of ODL using both NETCONF and RESTCONF.

Scenario

  1. ODL controller mounts a simulated NETCONF device (simulates Cisco IOS XR thanks to some of its routing models)
  2. External REST client starts sending prefixes via RESTCONF
  3. ODL application handles the calls, transforms the request into device specific models and writes to the device
  4. The client waits until all of its requests were handled in RESTCONF and calculates the rate

How to

 git clone https://git.opendaylight.org/gerrit/coretutorials
 cd coretutorials
 git fetch https://git.opendaylight.org/gerrit/coretutorials refs/changes/54/36054/1 && git checkout FETCH_HEAD
 cd ncmount
 mvn clean install -DskipTests -Dcheckstyle.skip
 cd karaf/target/assembly/
 java -jar netconf-testtool-1.0.0-Beryllium-executable.jar --schemas-dir ./xrSchemas/ --ssh false --exi false --distribution-folder /ncmount-karaf-1.1.0-SNAPSHOT
 ./ncmount-karaf-1.1.0-SNAPSHOT/bin/karaf
 java -jar rest-stress-client.jar --ip localhost --port 8181 --destination /restconf/operations/ncmount:write-routes --edits 100 --edit-content json_routes_10.json --async-requests true --throttle 1000 --auth admin admin

Multiple devices

To run this test with multiple devices/clients these changes apply:

 --device-count 16
 --same-device false  --thread-amount 16

--thread-amount is the number of clients. Each client will be mapped to a single device.
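
Put together, a multi-device run might be assembled as sketched below. Which tool takes which flag is an assumption based on the single-device commands above: --device-count is appended to the testtool command, while --same-device and --thread-amount are appended to the REST client command. The helper only prints the commands so they can be reviewed before being run.

```shell
#!/usr/bin/env bash
# Prints the testtool and REST client invocations for N devices/clients.
# The flag-to-tool mapping is an assumption (see the text above).
print_multi_device_commands() {
    local n="$1"
    echo "java -jar netconf-testtool-1.0.0-Beryllium-executable.jar --schemas-dir ./xrSchemas/ --ssh false --exi false --distribution-folder /ncmount-karaf-1.1.0-SNAPSHOT --device-count ${n}"
    echo "java -jar rest-stress-client.jar --ip localhost --port 8181 --destination /restconf/operations/ncmount:write-routes --edits 100 --edit-content json_routes_10.json --async-requests true --throttle 1000 --auth admin admin --same-device false --thread-amount ${n}"
}

print_multi_device_commands 16
```

Keeping the device count and the thread amount equal preserves the one-client-per-device mapping described above.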