Setting up container based Openstack with OVN networking

OVN is a relatively new networking technology which provides a powerful and flexible software implementation of standard networking functionalities such as switches, routers, firewalls, etc. Importantly, OVN is distributed in the sense that the aforementioned network entities can be realized over a distributed set of compute/networking resources. OVN is tightly coupled with OVS, essentially being a layer of abstraction which sits above a set of OVS switches and realizes the above networking components across these switches in a distributed manner.

A number of cloud computing platforms and more general compute resource management frameworks are working on OVN support, including oVirt, Openstack, Kubernetes and Openshift – progress on this front is quite advanced. Interestingly and importantly, one dimension of the OVN vision is that it can act as a common networking substrate which could facilitate integration of more than one of the above systems, although the realization of that vision remains future work.

In the context of our work on developing an edge computing testbed, we set up a modest Openstack cluster to emulate functionality deployed within an Enterprise Data Centre, with OVN providing network capabilities to the cluster. This blog post provides a brief overview of the system architecture and notes some issues we had getting it up and running.

As our system is not a production system, providing High Availability (HA) support was not one of the requirements; consequently, it was not necessary to consider HA OVN mode. As such, it was natural to host the OVN control services, including the Northbound and Southbound DBs and the Northbound daemon (ovn-northd), on the Openstack controller node. As this is the node through which external traffic goes, we also needed to run an external-facing OVS on this node, which required its own OVN controller and local OVS database. Further, as this OVS chassis is intended for external traffic, it needed to be configured with ‘enable-chassis-as-gw’.

We configured our system to use DHCP provided by OVN; consequently, the neutron DHCP agent was no longer necessary and we removed this process from our controller node. Similarly, L3 routing was done within OVN, meaning that the neutron L3 agent was no longer necessary. Openstack metadata support is implemented differently when OVN is used: instead of having a single metadata process running on a controller serving all metadata requests, the metadata service is deployed on each node and the OVS switch on each node routes requests to 169.254.169.254 to the local metadata agent; this then queries the nova metadata service to obtain the metadata for the specific VM.

The services deployed on the controller and compute nodes are shown in Figure 1 below.

Figure 1: Neutron containers with and without OVN

We used Kolla to deploy the system. Kolla does not currently have full support for OVN; however specific Kolla containers for OVN have been created (e.g. kolla/ubuntu-binary-ovn-controller:queens, kolla/ubuntu-binary-neutron-server-ovn:queens). Hence, we used an approach which augments the standard Kolla-ansible deployment with manual configuration of the extra containers necessary to get the system running on OVN.

As always, many smaller issues were encountered while getting the system working – we will not detail all these issues here, but rather focus on the more substantive issues. We divide these into three specific categories: OVN parameters which need to be configured, configuration specifics for the Kolla OVN containers and finally a point which arose due to assumptions made within Kolla that do not necessarily hold for OVN.

To enable OVN, it was necessary to modify the configuration of the OVS switches operating on all the nodes. The existing OVS containers and OVSDB could be used for this (the OVS version shipped with Kolla/Queens is v2.9.0), but some settings had to be changed. First, it was necessary to configure system-ids for all of the OVS chassis. We chose to select fixed UUIDs a priori and use these for each deployment so that we had a more systematic process for setting up the system, but it’s possible to use a randomly generated UUID.

docker exec -ti openvswitch_vswitchd ovs-vsctl set open_vswitch . external-ids:system-id="$SYSTEM_ID"
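
A minimal sketch of how fixed system-ids could be kept across deployments, assuming a hypothetical lookup file with one "hostname uuid" pair per line. The file layout and the helper names are illustrative assumptions, not something Kolla or OVS provides; the last helper only prints the ovs-vsctl command shown above so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Sketch only: the lookup file format and function names are assumptions.

# lookup_system_id HOSTNAME FILE -> prints the UUID assigned to HOSTNAME,
# returns non-zero if the host is not listed.
lookup_system_id() {
    awk -v host="$1" '
        $1 == host { print $2; found = 1; exit }
        END { exit !found }
    ' "$2"
}

# is_uuid STRING -> succeeds if STRING has the usual 8-4-4-4-12 hex layout.
is_uuid() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# set_system_id_cmd UUID -> prints the command instead of executing it.
set_system_id_cmd() {
    printf 'docker exec openvswitch_vswitchd ovs-vsctl set open_vswitch . external-ids:system-id="%s"\n' "$1"
}
```

On a node this might be used as: SYSTEM_ID=$(lookup_system_id "$(hostname)" ./system-ids.txt) && is_uuid "$SYSTEM_ID" && set_system_id_cmd "$SYSTEM_ID", where system-ids.txt is the hypothetical lookup file.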

On the controller node, it was also necessary to set the following parameters:

docker exec -ti openvswitch_vswitchd ovs-vsctl set Open_vSwitch . \
    external_ids:ovn-remote="tcp:$HOST_IP:6642" \
    external_ids:ovn-nb="tcp:$HOST_IP:6641" \
    external_ids:ovn-encap-ip=$HOST_IP \
    external_ids:ovn-encap-type="geneve" \
    external-ids:ovn-cms-options="enable-chassis-as-gw"

docker exec openvswitch_vswitchd ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-ex

On the compute nodes this was necessary:

docker exec -ti openvswitch_vswitchd ovs-vsctl set Open_vSwitch . \
    external_ids:ovn-remote="tcp:$OVN_SB_HOST_IP:6642" \
    external_ids:ovn-nb="tcp:$OVN_NB_HOST_IP:6641" \
    external_ids:ovn-encap-ip=$HOST_IP \
    external_ids:ovn-encap-type="geneve"
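
Since the controller and compute settings differ only in the gateway option, the two variants can be generated from one helper. The following is a sketch under our own assumptions: the role labels "controller" and "compute" and the function name are hypothetical, and the helper prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only: role names and the function name are illustrative assumptions.

# ovn_chassis_cmd ROLE SB_IP NB_IP ENCAP_IP
# Prints the ovs-vsctl invocation for the given node role.
ovn_chassis_cmd() {
    role=$1 sb_ip=$2 nb_ip=$3 encap_ip=$4
    cmd="docker exec openvswitch_vswitchd ovs-vsctl set Open_vSwitch ."
    cmd="$cmd external_ids:ovn-remote=tcp:$sb_ip:6642"
    cmd="$cmd external_ids:ovn-nb=tcp:$nb_ip:6641"
    cmd="$cmd external_ids:ovn-encap-ip=$encap_ip"
    cmd="$cmd external_ids:ovn-encap-type=geneve"
    case "$role" in
        # Only the node carrying external traffic is marked as a gateway.
        controller) cmd="$cmd external_ids:ovn-cms-options=enable-chassis-as-gw" ;;
        compute) ;;
        *) echo "unknown role: $role" >&2; return 1 ;;
    esac
    printf '%s\n' "$cmd"
}
```

For example, ovn_chassis_cmd compute 172.30.0.101 172.30.0.101 172.30.0.102 prints the compute-node command, where 172.30.0.102 stands in for a compute node's own address.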

Having changed the OVS configuration on all the nodes, it was then necessary to get the services operational on the nodes. There are two specific aspects to this: modifying the service configuration files as necessary and starting the new services in the correct way.

Not many changes to the service configurations were required. The primary changes related to ensuring that the OVN mechanism driver was used and letting neutron know how to communicate with OVN. We also used the geneve tunnelling protocol in our deployment and this required the following configuration settings:

  • For the neutron server OVN container
    • ml2_conf.ini
        mechanism_drivers = ovn
        type_drivers = local,flat,vlan,geneve
        tenant_network_types = geneve

        [ml2_type_geneve]
        vni_ranges = 1:65536
        max_header_size = 38

        [ovn]
        ovn_nb_connection = tcp:172.30.0.101:6641
        ovn_sb_connection = tcp:172.30.0.101:6642
        ovn_l3_scheduler = leastloaded
        ovn_metadata_enabled = true

    • neutron.conf
        core_plugin = neutron.plugins.ml2.plugin.Ml2Plugin
        service_plugins = networking_ovn.l3.l3_ovn.OVNL3RouterPlugin

  • For the metadata agent container (running on the compute nodes), it was necessary to point it at the nova metadata service with the appropriate shared secret, and to tell it how to communicate with the OVS instance running on each compute node:
        nova_metadata_host = 172.30.0.101
        metadata_proxy_shared_secret = <SECRET>
        bridge_mappings = physnet1:br-ex
        datapath_type = system
        ovsdb_connection = tcp:127.0.0.1:6640
        local_ip = 172.30.0.101
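
Because a mistyped key in these files tends to go unnoticed until networking misbehaves, it can help to read the values back before starting the containers. The helper below is a sketch of one way to do that with awk; the function name is our own and the file path in the usage note is an assumption about where the configuration ends up.

```shell
#!/bin/sh
# Sketch only: a small INI reader for sanity-checking files like ml2_conf.ini.

# ini_get FILE SECTION KEY -> prints the value of KEY in [SECTION],
# returns non-zero if the key is not present in that section.
ini_get() {
    awk -v section="$2" -v key="$3" '
        /^[ \t]*\[/ {
            # Section header such as [ovn]: remember the bare name.
            gsub(/^[ \t]*\[|\][ \t]*$/, "")
            current = $0
            next
        }
        current == section {
            eq = index($0, "=")
            if (eq == 0) next
            k = substr($0, 1, eq - 1)
            v = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) { print v; found = 1; exit }
        }
        END { exit !found }
    ' "$1"
}
```

For instance, ini_get ./ml2_conf.ini ovn ovn_sb_connection should print the southbound connection string, and a non-zero exit status indicates the key is missing from the [ovn] section.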
    

For the OVN-specific containers (ovn-northd and the ovn-sb and ovn-nb databases), it was necessary to ensure that they had the correct configuration at startup; specifically, that they knew how to communicate with the relevant DBs. Hence, start commands such as

/usr/sbin/ovsdb-server /var/lib/openvswitch/ovnnb.db \
    -vconsole:emer -vsyslog:err -vfile:info \
    --remote=punix:/run/openvswitch/ovnnb_db.sock \
    --remote=ptcp:$ovnnb_port:$ovsdb_ip \
    --unixctl=/run/openvswitch/ovnnb_db.ctl \
    --log-file=/var/log/kolla/openvswitch/ovsdb-server-nb.log

were necessary (for the ovn northbound database) and we had to modify the container start process accordingly.
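
The southbound database needs an analogous start command. As a sketch, the helper below builds the command for either database from the northbound example above; the southbound file, socket and log names are assumed to mirror the northbound ones (ovnsb in place of ovnnb) and should be checked against the container image in use. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: southbound names are assumed to mirror the northbound ones.

# ovn_db_start_cmd DB IP PORT -> prints the ovsdb-server command line.
# DB is "nb" (northbound) or "sb" (southbound).
ovn_db_start_cmd() {
    db=$1 ip=$2 port=$3
    case "$db" in
        nb|sb) ;;
        *) echo "db must be nb or sb" >&2; return 1 ;;
    esac
    printf '%s' "/usr/sbin/ovsdb-server /var/lib/openvswitch/ovn${db}.db"
    printf '%s' " -vconsole:emer -vsyslog:err -vfile:info"
    printf '%s' " --remote=punix:/run/openvswitch/ovn${db}_db.sock"
    printf '%s' " --remote=ptcp:${port}:${ip}"
    printf '%s' " --unixctl=/run/openvswitch/ovn${db}_db.ctl"
    printf '%s\n' " --log-file=/var/log/kolla/openvswitch/ovsdb-server-${db}.log"
}
```

With the ports used elsewhere in this post, ovn_db_start_cmd nb 172.30.0.101 6641 and ovn_db_start_cmd sb 172.30.0.101 6642 give the two command lines.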

It was also necessary to update the neutron database to support OVN-specific versioning information, which was straightforward using the following command:

docker exec -ti neutron-server-ovn_neutron_server_ovn_1 neutron-db-manage upgrade heads

The last issue which we had to overcome was that Kolla and neutron OVN had slightly different views regarding the naming of the external bridges. Kolla-ansible configured a connection between the br-ex and br-int OVS bridges on the controller node with port names phy-br-ex and int-br-ex respectively. OVN also created ports with the same purpose but with different names, patch-provnet-<UUID>-to-br-int and patch-br-int-to-provnet-<UUID>; as these ports had the same purpose, our somewhat hacky solution was to manually remove the ports created in the first instance by Kolla-ansible.
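
The manual removal can be sketched as follows, assuming the ports sit on the bridges as described above (phy-br-ex on br-ex, int-br-ex on br-int) and that the OVS container is named openvswitch_vswitchd as in the earlier commands. The wrapper and function names are our own; by default the commands are only printed, and setting DRY_RUN=0 executes them.

```shell
#!/bin/sh
# Sketch only: prints the del-port commands unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

# run CMD... -> print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Remove the patch ports created by Kolla-ansible, leaving the OVN ones.
# --if-exists keeps the command from failing when a port is already gone.
remove_kolla_patch_ports() {
    run docker exec openvswitch_vswitchd ovs-vsctl --if-exists del-port br-ex phy-br-ex
    run docker exec openvswitch_vswitchd ovs-vsctl --if-exists del-port br-int int-br-ex
}
```

Running ovs-vsctl show inside the container before and after is a simple way to confirm that only the patch-provnet ports remain.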

Having worked through all these steps, it was possible to launch a VM which had external network connectivity and to which a floating IP address could be assigned.

Clearly, this approach is not realistic for supporting a production environment, but it’s an appropriate level of hackery for a testbed.

Other noteworthy issues which arose during this work include the following:

  • The standard Docker AppArmor configuration in Ubuntu is such that mount cannot be run inside containers, even if they have the appropriate privileges. This has to be disabled, or else it is necessary to ensure that the containers do not use the default Docker AppArmor profile.
  • A specific issue with mounts inside a container resulted in the mount table filling up with 65536 mounts and rendering the host quite unusable (thanks to Stefan for providing a bit more detail on this) – the workaround was to ensure that /run/netns was bind mounted into the container.
  • As we used geneve encapsulation, the geneve kernel modules had to be loaded.
  • Full datapath NAT support is only available for Linux kernel 4.6 and up. We had to upgrade the 4.4 kernel which came with our standard Ubuntu 16.04 environment.
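
The kernel-related points can be checked up front on each host. The sketch below compares the running kernel against the 4.6 threshold mentioned above; the function names are our own, and the module name in the commented-out modprobe line is an assumption that may differ between distributions.

```shell
#!/bin/sh
# Sketch only: pre-flight checks for the kernel requirements listed above.

# version_ge HAVE WANT -> succeeds if HAVE is at least WANT, comparing
# major.minor only. Suffixes such as "-210-generic" are ignored.
version_ge() {
    have=${1%%[!0-9.]*}
    want=${2%%[!0-9.]*}
    case "$have" in *.*) ;; *) have="$have.0" ;; esac
    case "$want" in *.*) ;; *) want="$want.0" ;; esac
    have_major=${have%%.*}; have_rest=${have#*.}; have_minor=${have_rest%%.*}
    want_major=${want%%.*}; want_rest=${want#*.}; want_minor=${want_rest%%.*}
    [ "$have_major" -gt "$want_major" ] && return 0
    [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]
}

# Report whether the running kernel meets the 4.6 requirement.
check_ovn_host() {
    kernel=$(uname -r)
    if version_ge "$kernel" 4.6; then
        echo "kernel $kernel: meets the 4.6 requirement for datapath NAT"
    else
        echo "kernel $kernel: older than 4.6, upgrade needed for datapath NAT"
    fi
    # Loading the geneve module needs root, so it is left commented out here:
    # lsmod | grep -q '^geneve' || modprobe geneve
}
```

On the stock Ubuntu 16.04 kernel (4.4) check_ovn_host reports that an upgrade is needed, which matches what we ran into.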

This is certainly not a complete guide to how to get Openstack up and running with OVN, but may be useful to some folks who are toying with this. In future, we’re going to experiment with extending OVN to an edge networking context and will provide more details as this work evolves.

 

Things we learned at the 9th SDN workshop

What better way to kick-start the 2018 blog post series than with SDN topics? 🙂 To keep up the tradition of regular blog posts dedicated to the SDN workshop, we have prepared a thorough reflection on the talks and demonstrations featured at the 9th workshop, held on the 4th of December 2017 on the premises of IBM Research, Zurich, and organized by the ICCLab and SWITCH.


SDN model definition in Netfloc and data representation in Netflogi

After the latest work on Heat support for Netfloc service chaining, the SDN team used the summer’s quiet month of August to update some of the features in Netfloc and Netflogi.

A new extension to the API set is the functionality to retrieve the current service chains created in Netfloc. This has been made possible by the improved Netfloc service YANG definition and by persisting data records from Netfloc in the MD-SAL repository. Using these APIs, the providers of a Netfloc service can now issue RPC requests to the datastore inventory and retrieve the current data and state of the service. This improves resilience by keeping the network state and the chain aligned and consistent in case of a Netfloc restart.


Orchestrate your network service with Netfloc plugin for OpenStack Heat

Stitching virtual network functions (VNFs) together in a so-called Network (Service) Function Chain is no longer a novelty. As described in our previous post, the SDN team had already worked on creating SFC library support for OpenStack in our SDK for SDN. In this blog post we describe the advances made towards integrating Netfloc services with both Heat Orchestration Template (HoT)-based orchestrators and Network Function Virtualization (NFV)-based orchestrators.

To do so, and also to take a step towards automating SFC management with Netfloc, we created a Heat plugin for Netfloc. It is based on the Netfloc API library for managing network service chains in OpenStack clouds. The parameters required to create the service include: OpenDaylight credentials, the IP and the port of the Netfloc node, along with the Neutron port IDs of the VNF instances.

For a network service operator, applying the plugin makes it very simple to deploy multiple chains in OpenStack cloud infrastructure. An example includes a packet inspection VNF that determines if the traffic is video and the type of the video service, and sends it further to a virtual transcoding unit VNF for quality adjustment. If data traffic is detected, packets are steered to a virtual security appliance acting as a virtual firewall, which sends them further to a virtual proxy VNF and a deep packet inspection VNF.


SESAME, Hurtle & NetFloc

Recently, at EUCNC’16, the ZHAW SESAME team demonstrated the work of combining SESAME concepts through the use of Hurtle, our orchestration framework, and Netfloc, our SDK for datacenter network programming. The demonstration was also a joint demonstration between SESAME and the 5GPPP COHERENT project.

It was a demonstration that bridges the gap between the telco and cloud world by creating a network service based on the services and technologies coming from both projects.


Highlights – Open Cloud Day 2016

Last week on 16th June, we held our 5th Open Cloud Day. It took place this year at our campus (ZHAW Winterthur). We co-located another event of ours with the Open Cloud Day, the SDN Workshop.

Open Cloud Day is a joint activity, organised annually by ch/open/ and our research group, Service Engineering. The SDN workshop, a biannual event, focussing on the Software Defined Networking topic in cloud computing, is a joint activity between SWITCH and us.

The one-day event hosted about 130 people, ran in 3 separate tracks (main, SDN and workshops) and covered a vast area of cloud topics. PaaS was undeniably the most discussed topic, with some speakers taking a provocative stance and others a positive, “game-changer” attitude.

Release of Netflogi – a Graphical Interface for Netfloc

In the past year I was working on a graphical interface for Netfloc – the SDK for SDN developed in the SDN initiative. Just to remind you, the aim of Netflogi is to: (1) make the SDK itself easy to use and (2) expose the Netfloc APIs and functional features to the network application developers and datacenter service providers.

With the initial version of Netflogi, the user was able to create a Service Function Chain only by using Neutron port IDs. This was done by choosing the number and the order of the Neutron ports used in the Chain. The current version of Netflogi allows the user to create Virtual Network Functions (VNFs) and a Service Chain using those VNFs. This blog post describes the updated functionalities of Netflogi, which I developed as part of my final exam, called IPA in Switzerland.

The novel functionality added in Netflogi is VNF management (create & delete). Creating a VNF includes choosing a name, a description of the functionality and assigning ingress and egress ports. Once the VNF is successfully created, it appears automatically in the list of the existing VNFs. The VNFs can be used in different services offered by Netfloc. Currently Service Function Chaining is fully implemented and can be done by combining one or more VNFs. The VNFs can be deleted, as well as the Service Chains. A help page and a messaging system for the performed actions are also included in this version.


Wanted: Senior Researcher / Researcher in Software Defined Networking for Clouds

Job description

The Service Engineering (SE, blog.zhaw.ch/icclab) group at the Zurich University of Applied Sciences (ZHAW) / Institute of Applied Information Technology (InIT) in Switzerland is seeking applications for a full-time position at its Winterthur facility.

The successful candidate will work in the InIT Cloud Computing Lab (ICCLab) and will contribute to the research initiative on software defined networking for clouds.


Observations from 11th NGSDP Experts Talk

On 22nd April 2016 the 11th Experts Talk on Next Generation Service Delivery Platforms (NGSDP) was held at the Telekom Innovation Laboratories in Berlin. The purpose of the event is to bring together thought-leaders in the area of Next Generation Services to discuss the state of the art in the field.

Service Function Chaining using the SDK4SDN

Last week in Athens we integrated the SDK4SDN aka Netfloc in the T-Nova Pilot testbed in order to showcase service function chaining using two endpoints and two VNFs (Virtual Network Functions).

NETwork FLOws for Clouds (Netfloc) is an open source SDK for datacenter network programming developed in the ICCLab SDN initiative. It comprises a set of tools and libraries that interoperate with the OpenDaylight controller. Netfloc exposes REST API abstractions and Java interfaces for network programmers to enable optimal integration in cloud datacenters and fully SDN-enabled end-to-end management of OpenFlow-enabled switches.
