Tuesday, March 25, 2014

Open vSwitch in Hardware

I wanted to replace the Broadcom switch with OVS, so I started investigating and found some interesting analysis.




"In networking, most cases open source locks you in similarly to non-open source software"

References

http://www.sdncentral.com/technology/vswitch-the-new-battleground-what-every-datacenter-operator-must-know/2012/07/

http://www.xflowresearch.com/

http://www.lightreading.com/comms-chips/startup-finds-a-business-in-openflow/d/d-id/690939

http://benpfaff.org/writings/openvswitch/orr.html

http://www.sdncentral.com/market/sdn-myth-busters-we-test-5-common-sdn-myths-propagated-by-vendors/2012/10/

A response to Michael Orr

Michael Orr wrote the following comment on an article at sdncentral.com:
We wanted to use OVS on our Silicon Switch, and initially wanted to use it as both a HW-speed OpenFlow implementation and a general L2/L3 HW-based switch. We found out that OVS assumed you may want to use the HW as a wild-card match-element for OpenFlow, but is not really suited to using HW for other functionality (Bridging, routing, VLANs, LAGs, etc. etc.). To use HW for these you have to write your own OFPROTO, a major change, which will cause you to generate your own private fork, and split off from OVS main branch irrevocably.
Believing a HW-capable OVS is generally A Good Thing, We contacted Nicira, suggesting we do the work, and asking they review/approve and (most importantly) adopt the final result into the main branch. They refused. I am not even arguing here who was right, and the reasons for the decision. That's not the point. The point is that (at that time, not too long ago) we were placed in the exact dilemma described in this post: use OVS as Nicira sees fit, or abandon the main branch of OVS and fork an incompatible version.
These paragraphs spread a lot of misinformation. I'll address them in two sections below.

Infeasibility of a worthwhile hybrid provider

First, let's address what Orr says is his point:
I am not even arguing here who was right, and the reasons for the decision—That's not the point.
The reason for the decision was essentially that we did not see how Orr's request could be solved in a worthwhile way. Marvell, in turn, did not (to my knowledge) ever attempt to constructively respond to our arguments.
It helps to have some background on how Open vSwitch is ported to a new platform. You can find the whole story in the PORTING file at the top of the Open vSwitch distribution. To summarize, there were (and are) two viable options for a port, either to write an “ofproto provider” or a “dpif provider.” PORTING summarizes these choices as:
  • Only an ofproto provider can take full advantage of hardware with built-in support for wildcards (e.g. an ACL table or a TCAM).
  • A dpif provider can take advantage of the Open vSwitch built-in implementations of bonding, LACP, 802.1ag, 802.1Q VLANs, and other features. An ofproto provider has to provide its own implementations, if the hardware can support them at all.
  • A dpif provider is usually easier to implement, but most appropriate for software switching. It “explodes” wildcard rules into exact-match entries. This allows fast hash lookups in software, but makes inefficient use of TCAMs in hardware that support wildcarding.
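To make the "explodes" point concrete, here is a toy Python model of the two lookup styles. It is only a sketch under stated assumptions: the three-field header, the class names, and the rule set are all invented for illustration and are not taken from the Open vSwitch source.

```python
from dataclasses import dataclass

# Hypothetical, simplified header: (eth_type, ip_proto, tp_dst).


@dataclass(frozen=True)
class WildcardRule:
    """A rule in which None means "match any value" for that field."""
    match: tuple
    action: str

    def matches(self, pkt: tuple) -> bool:
        return all(m is None or m == p for m, p in zip(self.match, pkt))


class WildcardTable:
    """TCAM-like table: one entry per rule, first matching rule wins."""

    def __init__(self, rules):
        self.rules = list(rules)

    def lookup(self, pkt):
        for rule in self.rules:
            if rule.matches(pkt):
                return rule.action
        return None


class ExactMatchCache:
    """Software-style cache: each distinct header gets its own hash entry."""

    def __init__(self, slow_path: WildcardTable):
        self.slow_path = slow_path
        self.entries = {}  # exact header tuple -> action

    def lookup(self, pkt):
        if pkt not in self.entries:
            # Cache miss: consult the wildcard rules once, then remember
            # the answer for this exact header.
            self.entries[pkt] = self.slow_path.lookup(pkt)
        return self.entries[pkt]


rules = [
    WildcardRule((0x0800, 6, None), "to_port_1"),  # any IPv4 TCP packet
    WildcardRule((None, None, None), "drop"),      # everything else
]
table = WildcardTable(rules)
cache = ExactMatchCache(table)

# 1000 TCP flows that differ only in destination port.
for port in range(1000):
    cache.lookup((0x0800, 6, port))

wildcard_entries = len(table.rules)
exact_entries = len(cache.entries)
```

In this model the wildcard table needs 2 entries while the exact-match cache grows to 1000. A hash lookup per packet is cheap in software, but one TCAM slot per exact flow would be a poor use of hardware that could have held the single wildcard rule.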
Orr and Marvell were looking for a third “hybrid” option that combines the advantages of both approaches. We also regarded this as a desirable goal, so last summer we spent hours brainstorming ways to achieve this goal. Anyone who has worked with me knows that, if you want me to spend hours in a meeting, then it has to be for something important.
We did regard this goal as important, so we spent some serious time talking through and critiquing a number of ideas. We didn't rule anything out, including drastic changes to the Open vSwitch architecture. Again, anyone who has worked with me knows that I never rule out changes simply because they are large, as long as the benefit is equally large. But we didn't come up with an effective solution, and we did come up with a few issues that seemed insurmountable, so we reported that back to Marvell.
You don't have to just take my word for any of the above, though, because I documented our reasoning and our conclusions in a section of the PORTING file titled “Why OVS Does Not Support Hybrid Providers” that I committed to the Open vSwitch Git repository on July 15, 2011. It reads as follows:
The “Porting Strategies” section above describes the “ofproto provider” and “dpif provider” porting strategies. Only an ofproto provider can take advantage of hardware TCAM support, and only a dpif provider can take advantage of the OVS built-in implementations of various features. It is therefore tempting to suggest a hybrid approach that shares the advantages of both strategies.
However, Open vSwitch does not support a hybrid approach. Doing so may be possible, with a significant amount of extra development work, but it does not yet seem worthwhile, for the reasons explained below.
First, user surprise is likely when a switch supports a feature only with a high performance penalty. For example, one user questioned why adding a particular OpenFlow action to a flow caused a 1,058x slowdown on a hardware OpenFlow implementation [1]. The action required the flow to be implemented in software.
Given that implementing a flow in software on the slow management CPU of a hardware switch causes a major slowdown, software-implemented flows would only make sense for very low-volume traffic. But many of the features built into the OVS software switch implementation would need to apply to every flow to be useful. There is no value, for example, in applying bonding or 802.1Q VLAN support only to low-volume traffic.
Besides supporting features of OpenFlow actions, a hybrid approach could also support forms of matching not supported by particular switching hardware, by sending all packets that might match a rule to software. But again this can cause an unacceptable slowdown by forcing bulk traffic through software in the hardware switch's slow management CPU. Consider, for example, a hardware switch that can match on the IPv6 Ethernet type but not on fields in IPv6 headers. An OpenFlow table that matched on the IPv6 Ethernet type would perform well, but adding a rule that matched only UDPv6 would force every IPv6 packet to software, slowing down not just UDPv6 but all IPv6 processing.
[1] Aaron Rosen, “Modify packet fields extremely slow”, openflow-discuss mailing list, June 26, 2011, archived at https://mailman.stanford.edu/pipermail/openflow-discuss/2011-June/002386.html.
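The IPv6 example in the PORTING text quoted above can be modeled in a few lines of Python. This is a rough sketch, not a description of any real switch: the capability set, the function names, and the traffic mix are hypothetical, and the 1,058x figure from the quoted text is borrowed purely as an illustrative cost ratio for a packet handled in software.

```python
ETH_IPV4, ETH_IPV6 = 0x0800, 0x86DD
PROTO_TCP, PROTO_UDP = 6, 17

# Assumption: this imaginary hardware can match on Ethernet type only.
HW_MATCHABLE = {"eth_type"}

# Assumption: a software-handled packet costs 1058 units versus 1 in hardware.
SOFTWARE_COST = 1058


def hw_can_install(match: dict) -> bool:
    """True if every field the rule matches on is supported in hardware."""
    return set(match) <= HW_MATCHABLE


def path_for(pkt: dict, rules: list) -> str:
    """Return "hardware" or "software" for one packet under a rule set."""
    for match in rules:
        if hw_can_install(match):
            continue
        # The hardware can only test the fields it supports, so any packet
        # that agrees with the rule on those fields *might* match and has
        # to be sent to software for the full comparison.
        if all(pkt[f] == v for f, v in match.items() if f in HW_MATCHABLE):
            return "software"
    return "hardware"


def total_cost(traffic: list, rules: list) -> int:
    return sum(
        SOFTWARE_COST if path_for(p, rules) == "software" else 1
        for p in traffic
    )


def pkt(eth_type: int, ip_proto: int) -> dict:
    return {"eth_type": eth_type, "ip_proto": ip_proto}


# Made-up traffic mix: 6 TCPv6, 2 UDPv6, and 2 TCPv4 packets.
traffic = (
    [pkt(ETH_IPV6, PROTO_TCP)] * 6
    + [pkt(ETH_IPV6, PROTO_UDP)] * 2
    + [pkt(ETH_IPV4, PROTO_TCP)] * 2
)

rules_before = [{"eth_type": ETH_IPV6}]
rules_after = rules_before + [{"eth_type": ETH_IPV6, "ip_proto": PROTO_UDP}]

cost_before = total_cost(traffic, rules_before)
cost_after = total_cost(traffic, rules_after)
```

With only the Ethernet-type rule, all 10 packets stay in hardware (cost 10). Adding the UDPv6 rule sends all 8 IPv6 packets to software, not just the 2 UDP ones, so the cost becomes 2 + 8 × 1058 = 8466 under these assumptions.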
I see that an internal email I wrote about the above text says:
Please notice that this patch is on ovs-dev… That's why it doesn't, for example, name Marvell or Michael Orr or anyone else.
but since Orr is bringing up the issue in public I don't see why I shouldn't.
If Orr or Marvell ever responded to the above, for example to explain why they disagree or to propose another approach, then it never made it to me. I don't see any follow-up to it in my email archive.

Implementing an ofproto provider

Michael Orr writes:
To use HW for these you have to write your own OFPROTO, a major change, which will cause you to generate your own private fork, and split off from OVS main branch irrevocably.
The first part of this is true. Short of a feasible approach to a hybrid provider, one must write an “ofproto provider” to obtain the best performance with hardware.
The remainder does not make sense.
First, yes, an ofproto provider could be a significant amount of code, but that does not make it a major change. It's simply adding one or more source files that implement hardware-specific functionality in a hardware-specific way. The Open vSwitch source code is intentionally designed to make plugging in such a provider straightforward. The header file that describes the interface has almost 3 times as many comment lines as other lines, to make the interface as clear as possible.
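The general shape of such a plug-in design can be sketched in Python. Everything here is hypothetical: the interface, the registry functions, and the toy provider are invented names meant only to show the pattern, and they do not correspond to the actual C interface in Open vSwitch.

```python
from abc import ABC, abstractmethod


class FlowProvider(ABC):
    """Hypothetical provider interface: the core calls these methods and
    each hardware port supplies its own implementation."""

    @abstractmethod
    def add_rule(self, match: dict, actions: list) -> None:
        """Install a rule in whatever form the platform supports."""

    @abstractmethod
    def delete_rule(self, match: dict) -> None:
        """Remove a previously installed rule, if present."""

    @abstractmethod
    def rule_count(self) -> int:
        """Number of rules currently installed."""


_PROVIDERS = {}  # provider name -> provider class


def register_provider(name: str, cls: type) -> None:
    """Make a provider available to the core under a name."""
    if name in _PROVIDERS:
        raise ValueError(f"provider {name!r} already registered")
    _PROVIDERS[name] = cls


def open_switch(name: str) -> FlowProvider:
    """Instantiate the provider registered under the given name."""
    return _PROVIDERS[name]()


class ToyTcamProvider(FlowProvider):
    """Stand-in for a hardware-specific source file. A real provider
    would call into the vendor SDK instead of using a dict."""

    def __init__(self):
        self._tcam = {}

    def add_rule(self, match, actions):
        self._tcam[frozenset(match.items())] = list(actions)

    def delete_rule(self, match):
        self._tcam.pop(frozenset(match.items()), None)

    def rule_count(self):
        return len(self._tcam)


# The core only needs the registered name; it never imports the
# hardware-specific class directly.
register_provider("toy-tcam", ToyTcamProvider)
switch = open_switch("toy-tcam")
switch.add_rule({"eth_type": 0x0800}, ["output:1"])
```

The point of the pattern is that the hardware-specific code lives behind a fixed set of entry points, so adding a port means adding a file and one registration call rather than editing the core.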
It also does not make any sense in this context to talk about writing an ofproto provider as forcing a private fork of Open vSwitch. Regardless of the means that Marvell chooses to port Open vSwitch to its hardware, it would be creating a fork of Open vSwitch, because Marvell regards as proprietary the specifications and the APIs for the high-end switching chips for which Open vSwitch is relevant. No one who has not signed a non-disclosure agreement with Marvell would ever see the code.
Finally, claiming that to write an ofproto provider, even one that due to Marvell's business practices would necessarily be private, is to “split off from OVS main branch irrevocably” does not make sense. The interface between an ofproto provider and the rest of Open vSwitch, though it is not frozen, has evolved rather than seen drastic changes over Open vSwitch releases. An author of a private ofproto provider should be able to track upstream Open vSwitch changes, not with negligible effort but with a reasonable amount.

http://www.xflowresearch.com/expertise.html

Technical Expertise



xFlow specializes in full SDN stack development:
  • Hypervisor technology like Xen and KVM
  • Network virtualization using emerging tunneling protocols, such as VXLAN, STT, NVGRE, GTP, etc.
  • Highly optimized fast path implementations in soft switches
SDN data planes on proprietary platforms:
  • Marvell switch chips: xCAT (24/48x1Gbps) and Lion (24/48x10Gbps) platforms
  • Broadcom switch chips: Ported OVS to Broadcom 563xx and Trident switch chips
  • Octeon-based switching platforms: Ported OVS with significant enhancements to Octeon NIC platforms
  • Optical/photonic switches: Designed and implemented an OpenFlow API extension for Calient Networks’ optical switches
SDN Controller Architectures:
  • Contributors to NOX/POX development
  • Ported NOX to Cavium Octeon processors
SDN Controller Application Development:
  • Expertise in NOX, Beacon, and Floodlight application development
Benchmarking Optimizations such as:
  • Optimized Queuing
  • Advanced Traffic Policing
  • Support for Multiple Lookups
  • Advanced QoS Support
  • Linux kernel optimization using large packet buffers
  • Benchmarking/Profiling

