Emulating a data center network on a single physical host with support for virtual machine mobility

Methods and arrangements for emulating a data center network. A first end host and a second end host are provided. A base hypervisor is associated with each of the first and second end hosts, and the first and second end hosts are interconnected. A virtual hypervisor is associated with at least one virtual machine running on at least one of the base hypervisors, and virtual hypervisors are interconnected within one of the first and second end hosts. A virtual machine is nested within the virtual hypervisor, and the virtual machine is migrated from one virtual hypervisor to a destination virtual hypervisor, where it is nested within the destination virtual hypervisor.
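
To make the nesting concrete, here is a toy Python sketch of the arrangement the abstract describes: base hypervisors on two end hosts, virtual hypervisors running as virtual machines on them, and a nested virtual machine migrated between virtual hypervisors. All class and method names are illustrative inventions for this example, not anything defined in the patent.

```python
# Toy object model of nested hypervisors and VM migration (illustrative only).

class NestedVM:
    def __init__(self, name):
        self.name = name

class VirtualHypervisor:
    """A hypervisor that itself runs as a VM on a base hypervisor."""
    def __init__(self, name):
        self.name = name
        self.nested_vms = []

    def host(self, vm):
        self.nested_vms.append(vm)

    def evict(self, vm):
        self.nested_vms.remove(vm)

class BaseHypervisor:
    """The real hypervisor on a physical end host."""
    def __init__(self, host_name):
        self.host_name = host_name
        self.virtual_hypervisors = []

    def add_virtual_hypervisor(self, vhv):
        self.virtual_hypervisors.append(vhv)

def migrate(vm, source, destination):
    """Move a nested VM between virtual hypervisors, emulating live migration."""
    source.evict(vm)
    destination.host(vm)

# Two end hosts, each with a base hypervisor running one virtual hypervisor.
host1, host2 = BaseHypervisor("host-1"), BaseHypervisor("host-2")
vhv_a, vhv_b = VirtualHypervisor("vhv-a"), VirtualHypervisor("vhv-b")
host1.add_virtual_hypervisor(vhv_a)
host2.add_virtual_hypervisor(vhv_b)

vm = NestedVM("vm-1")
vhv_a.host(vm)
migrate(vm, vhv_a, vhv_b)   # vm-1 is now nested within vhv-b
```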

HAVEN: Holistic load balancing and auto scaling in the cloud

Load balancing and auto scaling are important services in the cloud. Traditionally, load balancing is achieved through either hardware or software appliances. Hardware appliances perform well but have several drawbacks. They are fairly expensive and are typically provisioned for peak load even when average volumes are only 10% of peak. Further, they lack the flexibility to add custom load balancing algorithms, and they lack multi-tenancy support. To address these concerns, most public clouds have adopted software load balancers that typically also include an auto scaling service. However, software load balancers do not match the performance of hardware load balancers, and avoiding a single point of failure requires complex clustering solutions, which drives their cost higher still.
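
As a rough illustration of the kind of combined service the abstract refers to, a software load balancer whose backend pool is also grown and shrunk automatically, here is a small Python sketch. The round-robin policy, utilization thresholds, and backend names are assumptions made for the example; they are not HAVEN's actual design.

```python
import itertools

class Pool:
    """Round-robin backend pool with a trivial utilization-based scaling rule."""

    def __init__(self, backends):
        self.backends = list(backends)
        self._rr = itertools.cycle(self.backends)

    def pick(self):
        # A custom balancing algorithm could be plugged in here; fixed-function
        # hardware appliances typically do not allow that.
        return next(self._rr)

    def autoscale(self, avg_utilization, scale_out_at=0.7, scale_in_at=0.3):
        # Grow on sustained high utilization, shrink on low utilization.
        if avg_utilization > scale_out_at:
            self.backends.append(f"backend-{len(self.backends) + 1}")
        elif avg_utilization < scale_in_at and len(self.backends) > 1:
            self.backends.pop()
        self._rr = itertools.cycle(self.backends)

pool = Pool(["backend-1", "backend-2"])
pool.autoscale(avg_utilization=0.85)      # load spike -> backend-3 added
print(pool.pick(), pool.backends)
```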

Providing replication and fail-over as a network service in data centers

Techniques for providing session-level replication and fail-over as a network service include generating a replication rule that replicates network traffic destined for a primary server from an originating server to a network controller and installing said rule in a switch component, identifying flows from the originating server to the primary server, replicating each incoming data packet intended for the primary server to the network controller for replication and forwarding to replica servers, determining said primary server to be in a failed state based on a number of retransmissions of a packet, selecting one of the replica servers as a fail-over target, and performing a connection-level fail-over by installing a redirection flow in the switch component that redirects all packets destined to the primary server to the network controller, which forwards the packets to the replica server and forwards each response from the replica server to said originating server.
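
The steps above can be read as a small piece of controller logic. The Python sketch below models them with hypothetical switch and controller interfaces; install_rule, forward, and the retransmission threshold are stand-ins for this example, not a real OpenFlow API or the exact mechanism of the patent.

```python
RETRANSMIT_THRESHOLD = 3              # assumed failure-detection threshold

class ReplicationController:
    """Toy controller: mirrors a primary server's traffic to replicas and
    redirects the connection to a replica once the primary looks failed."""

    def __init__(self, switch, primary, replicas):
        self.switch = switch
        self.primary = primary
        self.replicas = replicas
        self.retransmits = {}          # (src, seq) -> number of copies seen

    def start(self):
        # Replication rule: copy every packet destined for the primary to the controller.
        self.switch.install_rule(match={"dst": self.primary},
                                 action="copy_to_controller")

    def on_packet(self, pkt):
        # Mirror each incoming packet to every replica so session state stays in sync.
        for replica in self.replicas:
            self.switch.forward(pkt, replica)

        # Repeated retransmissions of the same segment signal a failed primary.
        key = (pkt["src"], pkt["seq"])
        self.retransmits[key] = self.retransmits.get(key, 0) + 1
        if self.retransmits[key] >= RETRANSMIT_THRESHOLD:
            self.fail_over(pkt["src"])

    def fail_over(self, origin):
        # Connection-level fail-over: redirect the primary's traffic through the
        # controller, which relays it to the chosen replica and relays each
        # response back to the originating server.
        target = self.replicas[0]
        self.switch.install_rule(match={"dst": self.primary},
                                 action=("redirect_via_controller", target, origin))
```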

Enabling co-existence of hosts or virtual machines with identical addresses

A method for enabling co-existence of multiple machines with identical addresses within a single data center network. The method includes assigning a unique pseudo identifier to each machine in the network that can be used for routing a packet to a destination machine, replacing the sender media access control address in an address resolution protocol request with a pseudo identifier of the sender at an edge network switch, retrieving a private network identifier from a mapping table based on the sender pseudo identifier and returning a pseudo identifier for the destination address based on the private network identifier, and replacing the pseudo identifier of the destination address with an actual identifier at a destination edge network switch for routing the packet to the destination machine.
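
The following Python sketch walks through the two lookups described above: the sender's edge switch resolves the ARP request within the sender's private network, and the destination edge switch swaps the pseudo identifier back to the actual address. The table layouts, pseudo identifier format, tenant names, and MAC values are assumptions for illustration only, not the patented encoding.

```python
# pseudo identifier -> (private network id, actual MAC of the machine)
pseudo_table = {
    "P-01": ("tenant-A", "02:00:00:00:00:01"),
    "P-02": ("tenant-A", "02:00:00:00:00:02"),
    "P-11": ("tenant-B", "02:00:00:00:00:02"),   # same MAC as P-02, different tenant
}

# (private network id, IP address) -> pseudo identifier of the owning machine
arp_table = {
    ("tenant-A", "10.0.0.2"): "P-02",
    ("tenant-B", "10.0.0.2"): "P-11",
}

def handle_arp_request(sender_pseudo_id, target_ip):
    """At the sender's edge switch: resolve the target within the sender's
    private network and answer with a pseudo identifier usable for routing."""
    private_net, _ = pseudo_table[sender_pseudo_id]
    return arp_table[(private_net, target_ip)]

def deliver(dst_pseudo_id):
    """At the destination edge switch: replace the pseudo identifier with the
    actual identifier before handing the packet to the destination machine."""
    _, actual_mac = pseudo_table[dst_pseudo_id]
    return actual_mac

dst = handle_arp_request("P-01", "10.0.0.2")   # -> "P-02", tenant-A's machine
print(deliver(dst))                            # -> 02:00:00:00:00:02
```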

Effective switch memory management in OpenFlow networks

OpenFlow networks require installation of flow rules in a limited-capacity switch memory (Ternary Content Addressable Memory, or TCAM, in particular) from a logically centralized controller. A controller can manage the switch memory in an OpenFlow network through events that are generated by the switch at discrete time intervals. Recent studies have shown that data centers can have up to 10,000 network flows per second per server rack today. Increasing the TCAM size to accommodate this large number of flow rules is not a viable solution since TCAM is costly and power hungry. Current OpenFlow controllers handle this issue by installing flow rules with a default idle timeout, after which the switch automatically evicts the rule from its TCAM. This results in inefficient usage of switch memory for short-lived flows when the timeout is too high and in increased controller workload for frequent flows when the timeout is too low.
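
To see why a single default idle timeout is a poor fit, consider a per-flow heuristic that sizes the timeout from a flow's observed inter-arrival gaps. This Python sketch only illustrates the tradeoff described above; the constants and the median-based rule are assumptions, not the mechanism proposed in the work.

```python
DEFAULT_IDLE_TIMEOUT = 10          # seconds; a typical static controller default
MIN_TIMEOUT, MAX_TIMEOUT = 1, 60   # assumed bounds for the example

def pick_idle_timeout(inter_arrival_gaps):
    """Short-lived flows get a small timeout (TCAM entry freed quickly);
    recurring flows get a larger one (fewer controller round trips)."""
    if not inter_arrival_gaps:
        return DEFAULT_IDLE_TIMEOUT
    typical_gap = sorted(inter_arrival_gaps)[len(inter_arrival_gaps) // 2]  # median
    return max(MIN_TIMEOUT, min(MAX_TIMEOUT, int(2 * typical_gap)))

print(pick_idle_timeout([0.2, 0.3, 0.25]))   # bursty, short-lived flow -> 1 s
print(pick_idle_timeout([20, 25, 30]))       # slow but recurring flow  -> 50 s
```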
