Carrier routing and
switching worldwide saw a 14 percent sequential increase and a 5 percent
year-over-year increase, and is now a $3 billion market, according to
ACG Research. Cisco (Nasdaq: CSCO), as usual, topped the list of vendors with $1.52 billion in revenues, followed by Alcatel-Lucent (NYSE: ALU) and Juniper Networks (NYSE: JNPR).
Core routing remains soft as carriers moving toward 100G stay cautious
about spending, with revenues at $590 million, down 6.6 percent compared
to the same period last year. But sequentially, core routing is up 7.4
percent. "Core has been in a soft cycle but we anticipate growth as the
delays in upgrades are now starting to be addressed," said Ray Mota,
ACG Research founder.
Archive library of my periodic learnings and thoughts. *For educational purposes and reference ONLY*
Tuesday, January 14, 2014
CIO update: Post-mortem on the Skype outage
Courtesy: http://blogs.skype.com/2010/12/29/cio-update/
What was the cause for the failure?
On Wednesday, December 22, a cluster of support servers responsible for offline instant messaging became overloaded. As a result of this overload, some Skype clients received delayed responses from the overloaded servers. In a version of the Skype for Windows client (version 5.0.0.152), the delayed responses from the overloaded servers were not properly processed, causing Windows clients running the affected version to crash.
Users running the latest Skype for Windows (version 5.0.0.156), older versions of Skype for Windows (4.0 versions), Skype for Mac, Skype for iPhone, Skype on your TV, or Skype Connect or Skype Manager for enterprises were not affected by this initial problem.
However, around 50% of all Skype users globally were running the 5.0.0.152 version of Skype for Windows, and the crashes caused approximately 40% of those clients to fail. These clients included 25–30% of the publicly available supernodes, which also failed as a result of this problem.
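The two percentages above combine into the roughly 20% of all Skype clients that the next question refers to. A minimal sketch of that arithmetic, using only the approximate figures quoted in the post (the variable names are just for illustration):

```python
# Approximate figures quoted in the post-mortem.
share_on_affected_version = 0.50  # users running Skype for Windows 5.0.0.152
crash_rate_among_affected = 0.40  # fraction of those clients that failed

# Fraction of ALL Skype clients that crashed.
share_of_all_clients_failed = share_on_affected_version * crash_rate_among_affected

print(f"{share_of_all_clients_failed:.0%}")  # prints 20%
```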
If approximately 20% of total Skype clients failed, why was there a much bigger disruption to Skype functionality?
Although Skype staff responded quickly to disable the overloaded servers and to eliminate client requests to them, a significant number of supernodes had already failed. A supernode is important to the P2P network because it takes on additional responsibilities compared to regular nodes, acting like a directory, supporting other Skype clients, helping to establish connections between them and creating local clusters, typically of several hundred peer nodes per supernode.
Once a supernode has failed, even when restarted, it takes some time to become available as a resource to the P2P network again. As a result, the P2P network was left with 25–30% fewer supernodes than normal. This caused a disproportionate load on the remaining available supernodes.
Why weren’t the other supernodes available to help?
The failure of 25–30% of supernodes in the P2P network resulted in an increased load on the remaining supernodes. While we expect this kind of increase in the instance of a failure, a significant proportion of users were also restarting crashed Windows clients at this time. This massively increased the load as they reconnected to the peer-to-peer cloud. The initial crashes happened just before our usual daily peak hour (1000 PST/1800 GMT), and the restarts followed very shortly after the initial crash, which resulted in traffic to the supernodes that was about 100 times what would normally be expected at that time of day.
Supernodes have a built-in mechanism to protect themselves and to avoid adverse impact on the systems hosting them when operational parameters do not fall into expected ranges. We believe that increased load in supernode traffic led to some of these parameters exceeding normal limits, and as a result, more supernodes started to shut down. This further increased the load on remaining supernodes and caused a positive feedback loop, which led to the near-complete failures that occurred a few hours after the triggering event.
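The feedback loop described above can be illustrated with a toy model. This is only a sketch under invented assumptions, not Skype's actual protection logic: load is spread evenly across supernodes, each supernode has a hypothetical protective limit, and a fixed share of the remaining supernodes shuts down in every round in which that limit is exceeded. Only the 25–30% initial loss comes from the post; every other number is made up for illustration.

```python
def simulate_cascade(supernodes, total_load, limit_per_node, shed_percent=20):
    """Toy cascade: while the evenly shared load per supernode exceeds the
    (hypothetical) protective limit, shed_percent of the remaining supernodes
    shut down, which raises the load on the survivors in the next round."""
    history = [supernodes]
    while supernodes > 0 and total_load / supernodes > limit_per_node:
        lost = max(1, supernodes * shed_percent // 100)  # at least one node drops
        supernodes -= lost
        history.append(supernodes)
    return history


# Assumed healthy network: 1000 supernodes carrying 1000 units of load,
# each tolerating up to 1.3 units before its protection kicks in.
mild = simulate_cascade(supernodes=800, total_load=1000, limit_per_node=1.3)
severe = simulate_cascade(supernodes=700, total_load=1000, limit_per_node=1.3)

print("after losing 20%:", mild)          # stays at 800, no cascade
print("after losing 30%:", severe[-1])    # cascade runs down to 0
```

In this model a 20% loss leaves the per-node load (1.25) under the limit, so nothing further happens, while a 30% loss pushes it to about 1.43 and every subsequent shutdown makes the overload worse. The surge from restarting clients described in the post would raise `total_load` as well, which this sketch leaves out.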
Regrettably, as a result of the confluence of events – server overload, a bug in Skype for Windows clients (version 5.0.0.152), and the decline in available supernodes – Skype’s functionality became unavailable to many of our users for approximately 24 hours.
How did Skype help support supernode recovery?
In order to restore Skype functionality, the Skype engineering and operations team introduced hundreds of instances of the Skype software into the P2P network to act as dedicated supernodes, which we nicknamed “mega-supernodes,” to provide enough temporary supernode capacity to accelerate the recovery of the peer-to-peer cloud.
By late Wednesday night (PST) it was evident that only a proportion (about 15–20%) of Skype users' connections were ‘healing’ and the volume of load on the supernodes continued to be unusually high. In response, our team introduced several thousand more mega-supernodes through the night. During Wednesday night, full recovery of the P2P network was underway and the majority of users were able to connect to the P2P network normally by early morning (California, PST) on December 23rd.
As we reported during the incident, in order to recover the core Skype functionality as quickly as possible, we utilized resources normally used to support Group Video Calling, to deploy supernodes, and over the course of Thursday night and Friday morning we returned these to their normal use and restored Group Video Calling functionality in time for Christmas.
The supernodes stabilized overnight on Thursday and by Friday, several tens of thousands of supernodes were supporting the P2P network. During Friday, we withdrew a significant proportion of the mega-supernodes from service, leaving some in operation to ensure stability of the P2P network over Christmas and New Year.
What is Skype doing to prevent this from happening again?
We understand how important the reliability, security and quality of our software are to Skype users around the world, and we work hard to maintain high standards, as well as develop new features and products.
First, we will continue to examine our software for potential issues, and provide ‘hotfixes’ where appropriate, for download or automatic delivery to our users. Since a bug was identified in Skype for Windows (version 5.0.0.152), we had provided a fix to v5.0 of our Windows software prior to the incident, and we will provide further updates for download this week. We will also be reviewing our processes for providing ‘automatic’ updates to our users so that we can help keep everyone on the latest Skype software. We believe these measures will reduce the possibility of this type of failure occurring again.
Second, we are learning the lessons we can from this incident and reviewing our processes and procedures, looking in particular for ways in which we can detect problems more quickly to potentially avoid such outages altogether, and ways to recover the system more rapidly after a failure.
Third, while our Windows v5 software release was subject to extensive internal testing and months of Beta testing with hundreds of thousands of users, we will be reviewing our testing processes to determine better ways of detecting and avoiding bugs which could affect the system.
Finally, as we continue to grow, we will keep under constant review the capacity of our core systems that support the Skype user base, and continue to invest in both capacity and resilience of these systems. An investment program we initiated a year ago has significantly increased our capacity already and more investment is planned for 2011 both to support the ongoing roll out of our paid and enterprise products, and to continue to support the growth of our core Skype software that we know millions of users rely on every day.
We are truly grateful to all of our users and humbled by your continued support. We know how much you rely on Skype, and we know that we fell short in both fulfilling your expectations and communicating with you during this incident. Lessons will be learned and we will use this as an opportunity to identify and introduce areas of improvement to our software, further assess and invest in capacity and stability, and develop better processes for outage recovery and communications to our user base. Thank you to everyone.
Friday, January 3, 2014
SDN Uncovered
Take 2 on my experience in SDN and how things stand as of the start of 2014.
http://www.cisco.com/en/US/prod/collateral/iosswrel/content/at_a_glance_c45-708540.pdf
http://www.networkcomputing.com/software-defined-networking-comparisons/
Company | Categories | Product |
---|---|---|
Cisco | C A S | Cisco entered offerings in our controllers, applications and switches categories under the umbrella of its Cisco ONE Portfolio. The Cisco Open Networking Environment SDN controller supports version 1.0 of the OpenFlow standard alongside the Cisco onePK spec. VM support includes ESX, HyperV and KVM via Nexus 1000V. Applications are under the Monitor Manager NSM system. Switches/vswitches are based on onePK and support C, Java and Python APIs. |
HP | C A S | HP offers a complete portfolio of OpenFlow hardware and software, the heart of which is the HP Virtual Application Networks SDN controller. Available as either a software virtual appliance or 2U, quad-core hardware box, it supports OpenFlow versions 1.0 and 1.3 and includes published northbound APIs to enable third-party SDN applications. Four applications are already available: HP Sentinel Security, Virtual Cloud Network portal, Microsoft Lync UC&C and the IMC VAN SDN Manager. HP also has more than 40 OpenFlow-compliant switches, ranging from entry-level products to its just-announced modular 12900 data center core switch, which sports a 36 Tbps switching backplane and up to 16 slots for a maximum of 768 10GbE or 64 100GbE ports. |
Juniper | S | Juniper's EX9200 Programmable Switch supports OpenFlow 1.3 and comes in four models. The 6U EX9204 delivers four slots, the EX9208 offers eight slots in an 8U chassis and the 16U EX9214 sports 14 slots. Southbound API support includes Puppet and NetConf; support for the OpenStack Quantum plug-in is planned. The EX9200 can simultaneously handle IP, MPLS, CAPWAP termination, GRE, VPLS, policy/forwarding decisions and new SDN protocols. |
Netsocket | C A S | Netsocket Virtual Network is compatible with OpenFlow 1.0 and Microsoft's vSwitch and Hyper-V, users of which can get started now at no cost. Hardware requirements are straightforward: 64-bit X86, 1 core, 1 GB RAM. The system supports OpenStack and RADIUS for user authentication. |
Nuage Networks | C | Nuage Networks' Virtualized Services Platform (VSP) is based on OpenFlow and is agnostic in terms of physical switches supported. Hypervisor support is likewise broad and includes KVM, Xen, ESXi and Hyper-V. Nuage partners for services support include F5 Networks, Palo Alto |
Enterasys | C S O | Enterasys has offerings in three of our four categories. Its OneFabric line lacks OpenFlow support, but the controller sports an integrated GUI for end-to-end network management and an open northbound API. Its entire switch line is SDN-compatible, and support extends to switching, routing and wireless networking gear. |
Extreme Networks | S | Extreme Summit X440 Series: The Summit devices run ExtremeXOS implementations of OpenFlow APIs and come in 10 variants with aggregated bandwidth ranging from 64 Gbps to 134 Gbps. The devices are OpenStack compatible through the Extreme Networks Quantum plug-in. TRILL support is due in Q4. |
NEC | C A S | NEC's ProgrammableFlow Controller line is built on OpenFlow 1.0, with OpenFlow 1.3 on the roadmap, and sports a list of compatible software ranging from DoS attack mitigation to QoS to VRRP. NEC has committed code to the open source Open Daylight program and demonstrated interoperability at Interop, PlugFest and ONS with Arista, Brocade, Centec, Dell, Extreme and IBM gear. The PF5240 hybrid switch was named Best of Interop in Infrastructure in 2011 and was the first OpenFlow hybrid switch offered on a commercial basis. |
Tail-f | C A | Tail-f embraces both top-down and bottom-up approaches to SDN, with support for OpenFlow (1.0 now, 1.3 later this year) and transaction-safe northbound APIs across OpenFlow and legacy devices. |
Big Switch | C A S | Big Switch has offerings in three of our four categories. Its well-known OpenFlow-based controller is available as a physical or virtual appliance that works with a long list of switches from most major hardware vendors. The Big Virtual Switch (network virtualization) and Big Tap (unified network visibility) applications integrate with the controller, and the Switch Light is a software platform for merchant-silicon-based physical and virtual switches within hypervisors. |
Embrane | A | Embrane heleos is a platform for software-defined network services for load balancing, firewall, VPN termination and SSL offload. Virtual appliances run on any X86 server and can quickly scale up and down to meet demand. Management and provisioning are via the Embrane Elastic Services Manager. Embrane uses REST for communication up and down the stack and between the Elastic Services Manager and virtual appliances. |
Pica8 | A S | Pica8's open switches are white-box devices that run the company's open PicOS NOS. PicOS supports OpenFlow 1.2 and Open vSwitch 1.9. The switches are available in four configurations, with a maximum switching fabric capacity of 176 Gbps for 1 GbE and 1.28 Tbps for 10 GbE platforms. |
Plexxi | C A S | Plexxi has entries in three areas with its Switch, Control and API offerings. The Plexxi 1U optical switch comes in two models, the 1 and 1x, with 64 and 72 10 Gbps ports, respectively. The controller runs on Linux and uses the Plexxi Affinities standard REST interface northbound and JMS for southbound control. The Plexxi Affinity Topologies application works as part of Plexxi Control and supports vSphere, vCloudDirector, OpenStack and KVM. |