

Anhui Jianzhu University
Graduation Project: Foreign-Literature Translation
Major: Network Engineering   Class:   Student name: xx   Student ID: xx   Supervisor:

Performance of Hashing-Based Schemes for Internet Load Balancing

Zhiruo Cao, Zheng Wang, Ellen Zegura

College of Computing, Georgia Institute of Technology, Atlanta, GA 30332-0280
Bell Labs, Lucent Technologies, Holmdel, NJ 07733

Abstract

Load balancing is a key technique for improving Internet performance. Effective use of load balancing requires good traffic distribution schemes. We study the performance of several hashing schemes for distributing traffic over multiple links while preserving the order of packets within a flow. Although hashing-based load balancing schemes have been proposed in the past, this is the first comprehensive study of their performance using real traffic traces. We evaluate five direct hashing methods and one table-based hashing method. We find that hashing using a 16-bit CRC over the five-tuple gives excellent load balancing performance. Further, load-adaptive table-based hashing using the exclusive OR of the source and destination IP addresses achieves performance comparable to the 16-bit CRC. Table-based hashing can also distribute traffic load according to unequal weights. We also report on four other schemes with

poor to moderate performance.

Keywords: Load sharing, hashing.

I. INTRODUCTION

Load balancing (also known as load sharing) is a key technique for improving the performance and scalability of the Internet. For example, many large enterprise networks are connected to multiple Internet Service Providers (ISPs) to achieve redundant connectivity and to distribute traffic load. Inside the Internet, the backbones are often engineered to have multiple parallel trunks between major Points of Presence to ensure high availability. Typically, these parallel trunks are configured as equal-cost paths and allow load balancing over them.

The parallel trunks may become even more ubiquitous when the promising Dense Wavelength Division Multiplexing (DWDM) technology is deployed in the future Internet backbone. DWDM expands the capacity of communication trunks by allowing a greater number of channels to be carried on a single optical fiber. With potentially tens or even hundreds of DWDM channels between major points, load balancing is essential to make the best use of the multiple parallel channels.

Parallel architectures have also been used in packet processing to cope with the exponential growth in Internet traffic. Instead of one processing engine, packets are dispatched to multiple parallel engines inside a router to increase the overall processing throughput. The same technique is also used in scaling web servers. Popular web servers often operate a farm of machines and the routers connected to them split

the HTTP requests to different machines.

For all of these examples, effective use of load balancing requires good schemes for splitting traffic over multiple links. In addition, since the majority of the traffic on the Internet is TCP-based [1], traffic splitting schemes need to avoid packet misordering within a TCP flow, which can falsely trigger congestion control mechanisms and cause unnecessary throughput degradation [2, 3].

In this paper, we propose and evaluate a class of hashing-based traffic splitting algorithms which preserve per-flow packet ordering. We consider five hash functions that are "direct," meaning that the hash function produces a value in the range 0 to N-1, where N is the number of outgoing links. We also consider a table-based generalization that involves hashing to M bins, then assigning the M bins to the N outgoing links. Table-based hashing requires more state than direct hashing, but has the flexibility to support unequal load distribution and dynamic adaptation.

Our results are obtained by simulating the performance of a traffic splitter, using packet traces taken from two trunks of a major Internet backbone provider. We find that direct hashing with the destination IP

address causes significant imbalance across two links. Using the Internet checksum or the exclusive OR of both the source IP address and destination IP address improves the performance considerably, though moderate imbalance persists. The more computationally complex 16-bit CRC of the five-tuple (source address, destination address, source port, destination port and protocol id) gives excellent load balancing performance, keeping the load and queue lengths very similar on two links. Equally good load balancing can be achieved using table-based hashing with adaptation, which requires less computation than the CRC but necessitates monitoring the link loads and storing (and adjusting) the mapping from table bins to links.

Table-based hashing has the additional advantage that it can distribute the load according to unequal weights. Further, an index-based version of this scheme can alter the weight distribution with minimal disruption to existing flows. Our results confirm that the index-based hashing can accurately achieve a weighted distribution when adaptation is also used.

The rest of this paper is organized as follows. In Section II we discuss related work in traffic splitting and load balancing. Section III describes the behavior of an ideal traffic splitter, explains the requirements for a practical system, and defines the performance metrics that will be used to assess various hashing-based schemes. The set of schemes that we consider is described in Section IV. The results of our study are described in Section V, and include analysis of the randomness inherent in the trace data (Section V-A). We conclude and mention areas for future work in Section VI.

II. RELATED WORK

Load balancing has been used in telecommunication networks in the form of inverse multiplexing [4]. Inverse multiplexing allows service providers to offer wideband channels by combining multiple narrowband 56 kbps and 64 kbps trunks [5]. The load balancing in inverse multiplexing is typically based on round robin distribution of packets or bytes [6, 7].

Our work differs from inverse multiplexing in two important dimensions. First, inverse multiplexing is designed for use over point-to-point links; its techniques are not typically applicable to network-layer load balancing. Internet load balancing, however, makes use of the natural redundancy in the network topology. The paths for load balancing, for example equal-cost multi-paths, are discovered dynamically by routing protocols such as OSPF [8], rather than through configuration. Second, in order to maintain synchronization and per-flow FIFO packet ordering in inverse multiplexing, it is necessary to add extra packet headers with sequence numbers or to keep state at both ends of the channel. Implementing these additional mechanisms for network load balancing, however, would require a new network protocol. In comparison, the hashing-based schemes can maintain per-flow packet ordering and can be implemented without requiring any additional protocol support.

Hashing has been widely used in indexing and searching [9]. In the networking context, hashing-based algorithms for address lookup [10], flow identification [11] and packet demultiplexing [12] have been proposed in the past. The use of hashing for network load balancing is not new. Some commercial router products have implemented simple hashing over the IP destination address to distribute traffic [13]. In the OSPF Optimized Multipath protocol (OSPF-OMP) [14], a number of possible approaches for load balancing over multiple paths are mentioned, including per-packet round robin, dividing destination prefixes among available next hops in the forwarding table, and dividing traffic according to a hash function applied to the source and destination pair. However, the proposed schemes are not evaluated with simulation or real network measurement. In the study of load balancing with OSPF-OMP, perfect hashing is assumed

[15]. A traffic splitting scheme using random numbers is proposed in [16]. It applies the name-based mappings approach to load balancing [17]. In this scheme, each next-hop is assigned a weight based on a simple pseudo-random number function seeded with the flow identifier and the next-hop identifier. When a packet arrives, the weights are generated, and the next-hop receiving the highest weight is used for forwarding. The scheme is approximately N times as expensive as a hashing-based scheme, where N is the number of outgoing links. Again, no performance study on the proposed scheme is presented.

It is clear that although hashing-based schemes for traffic splitting have been proposed in the past, and some simple schemes have even been implemented in commercial products, the performance of such schemes has not been adequately evaluated. This paper presents the first comprehensive performance study on a wide range of hashing-based schemes, using real packet traces from backbone networks.

III. FRAMEWORK

In this section, we describe the behavior of an ideal traffic splitter, explain the requirements for a practical system, and define the performance metrics for assessing various schemes.

A. Reference Model

A load balancing system typically comprises a traffic splitter and multiple outgoing links as shown in Figure 1. In such a system, the traffic splitter receives an incoming packet from a higher-speed link and forwards it to one of the lower-speed outgoing links. A good load balancing system should be able

to split the traffic to the multiple outgoing links evenly or by some pre-defined proportion. In [7], it has been observed that there is a close relationship between fair queuing and load balancing. We now extend their observation to a mathematical model to obtain the constraints for ideal traffic splitting.

Let us first look at an ideal fluid model where the traffic is infinitely divisible. Suppose that there are N outgoing links in the load balancing system, and the capacity of link i is $u_i$. Let $S_i(T,t)$ be the amount of traffic forwarded to link i during the period $[T,t]$. The ideal load balancing system should perform as well as the corresponding system with a single outgoing link of capacity $\sum_{i=1}^{N} u_i$. Therefore, the ideal system should satisfy the following for any period $[T,t]$:

$$\frac{S_i(T,t)}{u_i} = \frac{S_j(T,t)}{u_j}, \qquad i, j = 1, \ldots, N \tag{1}$$

The traffic load is essentially split in proportion to the rates of the outgoing links. At any time instant, the traffic load is perfectly balanced; all outgoing links are busy or idle at the same time. Such a system is work-conserving; there is no bandwidth lost because of load balancing. By work-conserving, we mean that no outgoing link is idle while there is data waiting to be forwarded.

Ideal load balancing is obviously impractical in

a real network system. As the basic unit of forwarding is at least a single packet, a packetized load balancing system is no longer work-conserving. For example, suppose that a load balancing system has two outgoing links of the same capacity. Assume that the system is initially idle, then a single packet arrives. The packet is forwarded to one of the two outgoing links. Note that the packet is serviced with half of the total bandwidth available, thus it will take twice the

amount of time to transmit compared with an ideal system. During this period, one of the two outgoing links is busy servicing the packet while the other link remains idle. In a practical system, the traffic splitter may send several packets in a row to the same outgoing link, and thus increase the loss of bandwidth.

In a packetized system, consider the worst case in which all outgoing links have been idle since time T, when a packet of maximum size $P_{max}$ arrives, and no more packets arrive until the packet is served. Assume the packet is forwarded onto link i. During the service period, Equation 1 no longer holds because

$$\frac{S_i(T,t)}{u_i} - \frac{S_j(T,t)}{u_j} = \frac{C \, P_{max}}{u_i},$$

where C is the fraction of the packet that has been serviced during the period. Therefore, in a packetized system, the ideal load balancing should satisfy the following:

$$\left| \frac{S_i(T,t)}{u_i} - \frac{S_j(T,t)}{u_j} \right| \le \max\left( \frac{P_{max}}{u_i}, \frac{P_{max}}{u_j} \right) \tag{2}$$

over any interval $[T,t]$, where $P_{max}$ is the maximum packet size. That is, the difference between the time link

i is busy and the time link j is busy should be no more than the time to send a largest packet over the slower link.

B. Requirements

There are a number of basic requirements that traffic splitting schemes should meet for Internet load balancing:

Low Overhead. Traffic splitting is executed for every packet in the packet forwarding path, thus the per-packet overhead it introduces is a major concern. Traffic splitting algorithms should be very simple and preferably keep little or no state.

High Efficiency. Poor traffic distribution will result in uneven link utilization and loss of bandwidth. A traffic splitter should try to distribute traffic as close as possible to the reference model.

Per-Flow Ordering. Packet misordering within a TCP flow can produce false congestion signals and cause unnecessary throughput degradation [2, 3]. It is therefore an essential requirement that the traffic splitting algorithms maintain per-flow packet ordering. This has to be achieved without requiring a new protocol layer.

Let us now

apply the above requirements to some of the possible traffic splitting approaches. Take packet-by-packet round robin or some form of fair queuing for example. The overheads are low and the performance is typically close to optimal. However, per-flow ordering cannot be guaranteed unless additional mechanisms, such as sequence numbers or state keeping, are added. Such additional mechanisms would increase the overhead drastically, and in many cases only work over point-to-point links.

Hashing-based traffic splitting algorithms are stateless and fairly easy to compute, particularly with hardware assistance. What is more, if the hash functions use any combination of the five-tuple as input, per-flow ordering can be preserved (see footnote 1). As we will show later in this paper, many of the hashing-based schemes perform well. Overall, hashing-based schemes meet the above requirements and offer the best tradeoff.
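The two families of schemes discussed above, direct hashing to N links and table-based hashing through M bins, can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the paper. The function names, the IPv4-only packing of the five-tuple, and the use of the CRC-CCITT polynomial (via the standard library's binascii.crc_hqx) as the 16-bit CRC are all assumptions made for illustration, and reusing the same CRC to pick the table bin is a simplification.

```python
import binascii
import ipaddress
import struct


def five_tuple_key(src_ip, dst_ip, src_port, dst_port, proto):
    """Pack an IPv4 five-tuple into 13 bytes (4 + 4 + 2 + 2 + 1), network byte order."""
    return struct.pack(
        "!IIHHB",
        int(ipaddress.IPv4Address(src_ip)),
        int(ipaddress.IPv4Address(dst_ip)),
        src_port,
        dst_port,
        proto,
    )


def crc16(data):
    """16-bit CRC; the CRC-CCITT polynomial and 0xFFFF seed are assumed choices."""
    return binascii.crc_hqx(data, 0xFFFF)


def direct_hash_crc16(flow, n_links):
    """Direct hashing: map the flow straight to a link index in 0..N-1."""
    return crc16(five_tuple_key(*flow)) % n_links


def table_hash(flow, bin_to_link):
    """Table-based hashing: hash to one of M bins, then look up that bin's link.

    bin_to_link is a list of length M whose entries are link indices.
    Assigning more bins to a link gives it a larger share of the flows.
    """
    bin_index = crc16(five_tuple_key(*flow)) % len(bin_to_link)
    return bin_to_link[bin_index]


if __name__ == "__main__":
    # Hypothetical flow: (source IP, destination IP, source port, destination port, protocol id)
    flow = ("10.0.0.1", "192.0.2.7", 40000, 80, 6)

    # Every packet of the flow carries the same five-tuple, so it always
    # lands on the same link and per-flow ordering is preserved.
    links_used = {direct_hash_crc16(flow, 2) for _ in range(5)}
    print("links used by the flow:", links_used)

    # Four bins mapped to two links with a 3:1 weighting (illustrative values).
    print("table-based link:", table_hash(flow, [0, 0, 0, 1]))
```

Because both functions are pure functions of the five-tuple, neither keeps per-flow state; the table-based variant only stores the M-entry mapping, which is the state an adaptive scheme would adjust.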

Footnote 1: This is true because all packets within the same TCP flow have the same five-tuple, thus the output of the hash function with the five-tuple as input should always be the same.

C. Performance Metrics

We now discuss the basic performance metrics for evaluating traffic splitting algorithms for Internet load balancing.

Load Distribution. From the perspective of load balancing, the most important performance metric is the distribution of bytes over time among the multiple outgoing links. As
