应用层网络与P2P.ppt

上传人:laozhun 文档编号:2263778 上传时间:2023-02-07 格式:PPT 页数:110 大小:3.03MB
返回 下载 相关 举报
应用层网络与P2P.ppt_第1页
第1页 / 共110页
应用层网络与P2P.ppt_第2页
第2页 / 共110页
应用层网络与P2P.ppt_第3页
第3页 / 共110页
应用层网络与P2P.ppt_第4页
第4页 / 共110页
应用层网络与P2P.ppt_第5页
第5页 / 共110页
点击查看更多>>
资源描述

《应用层网络与P2P.ppt》由会员分享,可在线阅读,更多相关《应用层网络与P2P.ppt(110页珍藏版)》请在三一办公上搜索。

1、1,应用层网络与P2P,清华大学计算机系,2,应用层网络与P2P,应用层网络对等网络系统应用层组播,3,应用层网络,Application Layer NetworkOverlay Network网络:定义主机之间通信的寻址方式、路由方式和服务模型在现有的Internet传输网络之上构建一个完全位于应用层的网络系统拓扑发现,路由等功能完全由应用层完成,不依赖网络层基于Internet网络的大规模的分布式应用,4,Internet,Internet本身也是一个Overlay Network目标:互联大量的局域网络方法:通过接入链路和主干网络互联策略:为分组增加IP报文头,进行寻址和路由,5,应用

2、层网络的优缺点,优点:易于部署,不依赖于网络设备的升级可扩展性好缺点:增加了复杂性和处理开销无法利用最佳路由,增加了延迟破坏了网络的分层结构模型路由具有“自私”特性,6,弹性覆盖网络RON1,弹性覆盖网络是一种分布式覆盖网络体系结构,它可以使分布式应用检测到路径的失效和周期性的性能降低现象并能够迅速恢复恢复时间少于20秒对比:在目前的Internet中,BGP恢复时间为几分钟RON节点监控它们之间的路径的质量并根据这些信息决定直接利用Internet转发分组通过其他RON节点转发分组,7,Internet的故障恢复,A,B,C,D,Packet switching and route arou

3、nd failures,8,Internet路由的鲁棒性,9,Internet路由的冗余特性,在12台主机上进行TraceRoute,得到的AS连接关系图,10,11,Internet路由的鲁棒性特点,优点可扩展性好缺点故障检测和恢复比较慢难以检测性能差的路径不能利用Internet的多路径特点不能为客户提供多穴(multihoming)连接难以表示复杂的策略,12,RONs Goal,To improve communication availability for small groups by at least a factor or 10,Many applicationsCollab

4、oration and conferencingVirtual Private Networks(VPNs)across public InternetOverlay Internet Service,13,RON:Routing Using Overlays,Cooperating end-systems in different routing domains can conspire to do better than scalable wide-area protocols,Types of failuresOutages:Configuration/operational error

5、s,backhoes,etc.Performance failures:Severe congestion,denial-of-service attacks,etc.,Scalable BGP-based IP routing substrate,14,RON Design,PerformanceDatabase,Application-specific routing tablesPolicy routing module,RON library,Nodes in differentrouting domains(ASes),15,可扩展性,测量和路由会带来额外开销50个节点是性能上限对大

6、多数应用已经足够了,16,测试,作者对有12个节点和132条路径的RON进行了64个小时的实际运行测试测试过程中经历了32次严重的路径失效,平均每次的持续时间超过30分钟RON用平均不到20秒的时间就可以检测出这些失效情况并从中恢复RON还改善了应用的丢失率,延迟和吞吐率等性能指标,17,进一步的研究,Internet可扩展性和鲁棒性的矛盾规模可扩展多个RON是否会有矛盾可以为哪些应用提供支持,18,对等网络系统,19,对等网络系统,20,Sarnoff law:效益规模是O(n):网络是广播媒介,任1发送者(设备)和多个(n-1)接收者(设备),Metcalfe law:效益规模是O(n2)

7、网络是全互连媒介,任何1个设备可与其它n-1个交互,同时存在n(n-1)=n2-n个并发执行的事务,Reed law:效益规模是O(2n):网络是群组媒介。网络可建立Cn2+Cn3+Cnn-1+Cnn=2n-n-1 个小组,网络服务规模三法则,21,P2P 与 C/S,二者在结构和构成上有很大区别管理能力、构态能力、功能(查找或发现)、组织、元素(DNS)和协议但又无明显边界都能运行在不同的(Internet/Intranet)平台上都能服务传统或新的应用:eBusiness eServices,22,有管理自组织,预构-Ad-hoc,查找发现,分层Mesh,静态移动,依赖服务器独立生存,以I

8、P为中心不以IP为中心,基于DNS客户命名,RPC异步,.NET,JXTA,C/S模式,P2P模式,CORBA,CORBA,Gnutella,Napster,eBusiness,Web apps,eServices,Distr.apps,Ad-hoc NW,Clusters,Internet Intranet,WANs,Grids,P2P与C/S,23,P2P流量持续增长,24,P2P Market Share by Country,25,What is being shared over P2P?,ParametersMix of file formats by volume of traf

9、fic generated over the 4 main P2P networksBitTorrent eDonkeyFastTrackGnutellaWeighted by volume of traffic generated on each network,26,How Did it Start?,A killer application:Napster 1999-2002Free music over the InternetKey idea:share the content,storage and bandwidth of individual(home)users,Intern

10、et,27,Model,Each user stores a subset of filesEach user has access(can download)files from all users in the system,28,Main Challenge,Find where a particular file is stored,A,B,C,D,E,F,E?,29,Other Challenges,Scale:up to hundred of thousands or millions of machines Dynamicity:machines can come and go

11、any time,30,Napster,Assume a centralized index system that maps files(songs)to machines that are aliveHow to find a file(song)Query the index system return a machine that stores the required fileIdeally this is the closest/least-loaded machineftp the file,31,Napster,Advantages:Simplicityeasy to impl

12、ement sophisticated search engines on top of the index systemDisadvantages:RobustnessScalability(?),32,Napster:Example,A,B,C,D,E,F,m1,m2,m3,m4,m5,m6,m1 Am2 Bm3 Cm4 Dm5 Em6 F,E?,33,Napster vs.RIAA,Recording Industry Association of America(RIAA)The RIAA sued Napster in December 1999It demanded$10,000

13、for every copyrighted song traded over the systemAfter long legal battles and a failed attempt to become a pay-based service,Napster stopped operation in July 2001,34,35,Gnutella,Distribute file locationIdea:Flood the requestHow to find a file:Send request to all neighborsNeighbors recursively multi

14、cast the requestEventually a machine that has the file receives the request,and it sends back the answer,36,Gnutella,Advantages:Totally decentralized,highly robustDisadvantages:Not scalable;the entire network can be swamped with request(to alleviate this problem,each request has a TTL),37,Gnutella:E

15、xample,Assume:m1s neighbors are m2 and m3;m3s neighbors are m4 and m5;,A,B,C,D,E,F,m1,m2,m3,m4,m5,m6,38,Freenet,Addition goals to file location:Provide publisher anonymity,security Resistant to attacks a third party shouldnt be able to deny the access to a particular file(data item,object),even if i

16、t compromises a large fraction of machines,39,Freenet,Architecture:Each file is identified by a unique identifierEach machine stores a set of files,and maintains a“routing table”to route the individual requests,40,Data Structure,Each node maintains a common stackid file identifiernext_hop another no

17、de that store the file idfile file identified by id being stored on the local node ForwardingEach message contains the file id it is referring toIf file id stored locally,then stopIf not,search for the“closest”id in the stack,and forward the message to the corresponding next_hop,id next_hop file,41,

18、Query,API:file=query(id)Upon receiving a query for document idCheck whether the queried file is stored locallyIf yes,return itIf not,forward the query message,42,Query,NotesEach query is associated a TTL that is decremented each time the query message is forwardedEach node maintains the state for al

19、l outstanding queries that have traversed it help to avoid cyclesWhen file is returned,the file is cached along the reverse path,43,Query Example,Note:doesnt show file caching on the reverse path,4 n1 f412 n2 f12 5 n3,9 n3 f9,3 n1 f314 n4 f14 5 n3,14 n5 f1413 n2 f13 3 n6,n1,n2,n3,n4,4 n1 f410 n5 f10

20、 8 n6,n5,query(10),44,Insert,API:insert(id,file)Two steps Search for the file to be insertedIf not found,insert the file,45,Insert,Searching:like query,but nodes maintain state after a collision is detected and the reply is sent back to the originatorInsertionFollow the forward path;insert the file

21、at all nodes along the pathA node probabilistically replace the originator with itself;obscure the true originator,46,Insert Example,Assume query returned failure along“gray”path;insert f10,4 n1 f412 n2 f12 5 n3,9 n3 f9,3 n1 f314 n4 f14 5 n3,14 n5 f1413 n2 f13 3 n6,n1,n2,n3,n4,4 n1 f411 n5 f11 8 n6,

22、n5,insert(10,f10),47,Insert Example,10 n1 f10 4 n1 f412 n2,3 n1 f314 n4 f14 5 n3,14 n5 f1413 n2 f13 3 n6,n1,n3,n4,4 n1 f411 n5 f11 8 n6,n5,insert(10,f10),9 n3 f9,n2,orig=n1,48,Insert Example,n2 replaces the originator(n1)with itself,10 n1 f10 4 n1 f412 n2,10 n2 f10 9 n3 f9,10 n2 10 3 n1 f314 n4,14 n

23、5 f1413 n2 f13 3 n6,n1,n2,n3,n4,4 n1 f411 n5 f11 8 n6,n5,insert(10,f10),orig=n2,49,Insert Example,n2 replaces the originator(n1)with itself,10 n1 f10 4 n1 f412 n2,10 n1 f10 9 n3 f9,10 n2 10 3 n1 f314 n4,10 n4 f1014 n5 f1413 n2,n1,n2,n3,n4,10 n4 f10 4 n1 f411 n5,n5,Insert(10,f10),50,Freenet Propertie

24、s,Newly queried/inserted files are stored on nodes storing similar idsNew nodes can announce themselves by inserting files Attempts to supplant or discover existing files will just spread the files,51,Freenet Summary,AdvantagesProvides publisher anonymityTotally decentralize architecture robust and

25、scalableResistant against malicious file deletionDisadvantagesDoes not always guarantee that a file is found,even if the file is in the network,52,分布式哈希表查找,DHT-Distributed Hash Table问题给定一个关键字,映射 到一台主机关键点正确性可扩展性支持网络状态和主机的动态 变化网络和主机的异构 性安全,53,Hash Tables 实例,null,null,null,null,Obj5key=1,Obj1key=15,Obj

26、4key=2,Obj2key=28,76543210,table,Obj3key=4,buckets,hash value,54,DHT Step 1:The Hash,Introduce a hash function to map the object being searched for to a unique identifier:e.g.,h(“Network Notes”)8045Distribute the range of the hash function among all nodes in the networkEach node must“know about”at l

27、east one copy of each object that hashes within its range(when one exists),55,“Knowing about objects”,Two alternativesNode can cache each(existing)object that hashes within its rangePointer-based:level of indirection-node caches pointer to location(s)of object,56,DHT Step 2:Routing,For each object,n

28、ode(s)whose range(s)cover that object must be reachable via a“short”pathThe different approaches(CAN,Chord,Pastry,Tapestry)differ fundamentally only in the routing approachany“good”random hash function will suffice,57,DHT Routing:Other Challenges,Neighbors for each node should scale with growth in o

29、verlay participation(e.g.,should not be O(N)DHT mechanism should be fully distributed(no centralized point that bottlenecks throughput or can act as single point of failure)DHT mechanism should gracefully handle nodes joining/leaving the overlayneed to repartition the range space over existing nodes

30、need to reorganize neighbor setneed bootstrap mechanism to connect new nodes into the existing DHT infrastructure,58,分布式哈希查找,Chord2CAN3Tapestry4 Pastry5,59,分布式哈希查找Chord,Chord实现了下面的功能给定一个关键字(key),将key映射到某个节点给对等网络应用的每个数据都分配一个key,那么对等网络中的数据查找问题就可以用Chord解决Chord采用了相容哈希(Consistent Hashing)6的一种变体为节点分配关键字哈希

31、函数可以做到负载平衡,也就是说所有的节点可以接收到基本相同数量的关键字当第N个节点加入或者离开网络时,只有1/N的关键字需要移动到另外的位置,60,分布式哈希查找Chord,Chord进一步改善了相容哈希的可扩展性在由N个节点组成的网络中,每个节点只需要维护O(logN)个节点的信息每次查找只需要O(logN)条消息当节点加入或者离开网络时,Chord需要更新路由信息,每次加入或者离开需要传递O(log2N)条消息,61,Chord ID,m bit identifier space for both keys and nodesKey identifier=SHA-1(key)Key=“Le

32、tItBe”ID=60Node identifier=SHA-1(IP address)IP=“198.10.10.1”ID=123Both are uniformly distributedHow to map key IDs to node IDs?,62,1,6,2,identifiercircle,successor(1)=1,successor(2)=3,successor(6)=0,Successor Nodes,63,6,1,2,successor(6)=7,6,1,successor(1)=3,Node Joins and Departures,64,Scalable Key

33、Location,A very small amount of routing information suffices to implement consistent hashing in a distributed environmentEach node need only be aware of its successor node on the circleQueries for a given identifier can be passed around the circle via these successor pointersResolution scheme correc

34、t,BUT inefficient:it may require traversing all N nodes!,65,Acceleration of Lookups,Lookups are accelerated by maintaining additional routing informationEach node maintains a routing table with(at most)m entries(where N=2m)called the finger tableith entry in the table at node n contains the identity

35、 of the first node,s,that succeeds n by at least 2i-1 on the identifier circles=successor(n+2i-1)s is called the ith finger of node n,denoted by n.finger(i).node,66,Finger Tables,124,1,2)2,4)4,0),130,67,Lookup Procedure,124,1,2)2,4)4,0),130,query(6),When node receive lookup requestCheck if it has it

36、em in local memoryIf cant find,forward the request to the successor node that has largest ID which is smaller than request ID,68,Node Joins with Finger Tables,124,1,2)2,4)4,0),130,finger table,start,int.,succ.,keys,2,457,4,5)5,7)7,3),000,finger table,start,int.,succ.,keys,6,6,6,6,6,69,Node Departure

37、s with Finger Tables,124,1,2)2,4)4,0),130,finger table,start,int.,succ.,keys,1,235,2,3)3,5)5,1),330,finger table,start,int.,succ.,keys,2,457,4,5)5,7)7,3),660,finger table,start,int.,succ.,keys,finger table,start,int.,succ.,keys,702,7,0)0,2)2,6),003,6,6,6,0,3,70,分布式哈希查找CAN,CAN基于虚拟的d维笛卡儿坐标空间实现数据组织和查找功

38、能整个坐标空间动态地分配给系统中的所有节点,每个节点都拥有独立的互不相交的一块区域,71,分布式哈希查找CAN,新节点加入CAN把某个现有节点的区域分裂成同样大小的两块,把其中一块分给新加入的节点,整个过程分为以下三步:(i)新节点首先找到一个已经在CAN中的节点(ii)新节点使用CAN的路由机制找到一个区域将要被分隔的节点(iii)执行分裂操作,原有区域的邻接区域被告知发生了分裂,这样新节点才能被别的节点路由到,72,CAN实例:两维空间,空间在节点之间划分所有的节点将覆盖全部空间每个节点覆盖一个正方形或者1:2的长方形实例节点n1:(1,2)是第一个加入节点,它将覆盖整个空间,73,CAN

39、实例:两维空间,节点 n2:(4,2)加入,空间在两个节点间划分,74,CAN实例:两维空间,节点n3:(3,5)加入,从n1的空间中划分一半,75,CAN实例:两维空间,节点 n4:(5,5)和 n5:(6,6)加入,76,CAN实例:两维空间,节点:n1:(1,2);n2:(4,2);n3:(3,5);n4:(5,5);n5:(6,6)数据项:f1:(2,3);f2:(5,0);f3:(2,1);f4:(7,5),77,CAN实例:两维空间,每个数据项保存在拥有这块空间的节点上,78,CAN:查询实例,每个节点都知道其在d维空间中邻居节点的位置将查询转发到离id最接近的邻居节点实例:假定n

40、1要查找f4有多条路径,可以绕开失效路径如果一个d维空间划分成n个相等的区域,那么平均路由长度是(d/4)(n1/d),每个节点只需要维护2d的邻接节点信息,1,2,3,4,5,6,7,0,1,2,3,4,5,6,7,0,n1,n2,n3,n4,n5,f1,f2,f3,f4,79,CAN:节点离开,当节点离开CAN时,必须保证它的区域被CAN系统收回,分配给其他仍然在系统中的节点一般过程是由某个节点来接管这个区域和所有的(关键字,值)数据库如果这个区域可以和相邻区域合并形成一个大的区域,那么CAN将执行合并操作如果合并不能进行,那么该区域将交给其邻接节点中区域最小的节点,也就是说,这个节点将临

41、时负责两个区域,80,CAN/Chord Optimizations,Weight neighbor nodes by RTTWhen routing,choose neighbor who is closer to destination with lowest RTT from meReduces path latencyMultiple physical nodes per virtual nodeReduces path length(fewer virtual nodes)Reduces path latency(can choose physical node from virtu

42、al node with lowest RTT)Improved fault tolerance(only one node per zone needs to survive to allow routing through the zone)Several others,81,Distributed Hash Tables,Distributed Hash Tables are a key component of scalable and robust overlay networksCAN:O(d)state,O(d*n1/d)distanceChord:O(log n)state,O

43、(log n)distanceSimplicity is keyServices built on top of distributed hash tablesp2p file storage,i3(chord)multicast(CAN,Tapestry)persistent storage(OceanStore using Tapestry),82,问题,只能进行精确匹配,实用价值有局限适应节点动态变化的能力低经过哈希之后,节点的位置信息被破坏了,来自同一个子网的站点很可能节点号相距甚远,这不利于查询性能的优化基于哈希表的系统不能利用应用本身的信息,许多应用(比如文件系统)的数据本身是按照

44、层次结构组织的,而使用哈希函数后,这些层次信息就丢失了TerraDir8中提出了一种面向层次结构的查找机制,83,DHTs vs Unstructured P2P,DHTs good at:exact match for“rare”itemsDHTs bad at:keyword search,etc.cant construct DHT-based Googletolerating extreme churnGnutella etc.good at:general searchfinding common objectsvery dynamic environmentsGnutella et

45、c.bad at:finding“rare”items,84,基于分布式哈希查找系统的大规模对等网络应用,协作文件系统CFS(Cooperative File System)9PAST10OceanStore11,85,基于分布式哈希查找系统的大规模对等网络应用CFS,Highest layer provides a file-like interface to user including user-friendly naming and authenticationThis file systems maps operations to lower-level block operatio

46、nsBlock storage uses Chord to identify responsible node for storing a block and then talk to the block storage server on that node,86,其他P2P应用,应用层组播流媒体直播与点播CoolStreaming,PPLive,PPStream计算能力的共享http:/setiathome.ssl.berkeley.edu/计算机支持的协同工作,87,Application Layer Multicast(ALM),IP Multicast is not globally

47、 deployedApplication Layer/Level Multicast(or Overlay Multicast)is hence proposedMulticasting implemented at end hosts instead of network routersNodes form unicast channels or tunnels between them,88,ALM Methodologies,Tree BasedContent flows from server to nodes in a tree like fashion,every node for

48、wards the content to its children,which in turn forward to their childrenOne point of failure for a complete subtreeHigh recovery timeNotes Tree Base Approaches:NICE,SpreadIT,ZigzagMesh BasedOvercomes tree based flawsNodes maintain state information of many nodesHigh control overheadNotes Mesh Based

49、 approaches include Narada and ESM from CMU,89,The delay between the source and receivers is smallAt the same time,the number of redundant packets on any physical link should be low,Gatech,“Efficient”overlay,CMU,Berk2,Stan1,Stan2,Berk1,Berk1,High degree(unicast),Berk2,Gatech,Stan2,CMU,Stan1,Stan2,Hi

50、gh latency,CMU,Berk2,Gatech,Stan1,Berk1,Network efficiency,90,Physical Link Stress(PLS),The number of identical copies of a packet that traverse a physical linkIndicates the bandwidth inefficiency,S,R1,R2,E1,E2,E3,Example:PLS for link S-R1 is 2Average PLS is 7/5,91,Relative Delay Penalty(RDP),The ra

展开阅读全文
相关资源
猜你喜欢
相关搜索

当前位置:首页 > 建筑/施工/环境 > 项目建议


备案号:宁ICP备20000045号-2

经营许可证:宁B2-20210002

宁公网安备 64010402000987号