
An Introduction to Parallel Programming, by Peter Pacheco
Chapter 1: Why Parallel Computing?
Copyright 2010, Elsevier Inc. All rights Reserved

Roadmap
- Why we need ever-increasing performance.
- Why we're building parallel systems.
- Why we need to write parallel programs.
- How do we write parallel programs?
- What we'll be doing.
- Concurrent, parallel, distributed!

Changing times
- From 1986 to 2002, microprocessors were speeding like a rocket, increasing in performance by an average of 50% per year.
- Since then, it's dropped to about a 20% increase per year.

An intelligent solution
- Instead of designing and building faster microprocessors, put multiple processors on a single integrated circuit.

Now it's up to the programmers
- Adding more processors doesn't help much if programmers aren't aware of them or don't know how to use them.
- Serial programs don't benefit from this approach (in most cases).

Why we need ever-increasing performance
- Computational power is increasing, but so are our computation problems and needs.
- Problems we never dreamed of have been solved because of past increases, such as decoding the human genome.
- More complex problems are still waiting to be solved, for example:
  - Climate modeling
  - Protein folding
  - Drug discovery
  - Energy research
  - Data analysis

Why we're building parallel systems
- Up to now, performance increases have been attributable to the increasing density of transistors.
- But there are inherent problems.

A little physics lesson
- Smaller transistors = faster processors.
- Faster processors = increased power consumption.
- Increased power consumption = increased heat.
- Increased heat = unreliable processors.

Solution
- Move away from single-core systems to multicore processors.
- "core" = central processing unit (CPU)
- Introducing parallelism!

Why we need to write parallel programs
- Running multiple instances of a serial program often isn't very useful.
- Think of running multiple instances of your favorite game.
- What you really want is for it to run faster.

Approaches to the serial problem
- Rewrite serial programs so that they're parallel.
- Write translation programs that automatically convert serial programs into parallel programs.
  - This is very difficult to do.
  - Success has been limited.

More problems
- Some coding constructs can be recognized by an automatic program generator and converted to a parallel construct.
- However, it's likely that the result will be a very inefficient program.
- Sometimes the best parallel solution is to step back and devise an entirely new algorithm.

Example
- Compute n values and add them together.
- Serial solution: see the sketch below.
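(The slide's code block is an image in the original deck and did not survive extraction. What follows is a minimal sketch of the serial loop the slide describes; Compute_next_value is the name the later slides use, but its body here is a hypothetical stand-in.)

    #include <stdio.h>

    /* Hypothetical stand-in for the slides' unspecified per-element
       computation; any function returning the i-th value would do. */
    int Compute_next_value(int i) {
        return (i * i) % 10;
    }

    int main(void) {
        int n = 24;    /* number of values, as in the slides' example */
        int sum = 0;

        /* Serial solution: compute the n values and add them one at a time. */
        for (int i = 0; i < n; i++)
            sum += Compute_next_value(i);

        printf("sum = %d\n", sum);
        return 0;
    }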

Example (cont.)
- We have p cores, p much smaller than n.
- Each core performs a partial sum of approximately n/p values.
- Each core uses its own private variables and executes this block of code independently of the other cores (sketched below).
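(The per-core block is likewise an image in the original deck. This sketch assumes a block distribution of the n values; the wrapper name Partial_sum and the bound variables my_first_i/my_last_i are illustrative.)

    /* Sketch of the block each core executes independently of the others.
       All variables are private to the core; my_rank (this core's id) and
       p (the number of cores) would come from the parallel runtime. */
    int Partial_sum(int my_rank, int p, int n) {
        int my_n       = n / p;               /* roughly n/p values per core */
        int my_first_i = my_rank * my_n;
        int my_last_i  = (my_rank == p - 1)
                             ? n              /* last core takes any remainder */
                             : my_first_i + my_n;

        int my_sum = 0;
        for (int my_i = my_first_i; my_i < my_last_i; my_i++)
            my_sum += Compute_next_value(my_i);
        return my_sum;
    }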

Example (cont.)
- After each core completes execution of the code, a private variable my_sum contains the sum of the values computed by its calls to Compute_next_value.
- Ex.: with 8 cores and n = 24, suppose the calls to Compute_next_value return
  1, 4, 3,  9, 2, 8,  5, 1, 1,  6, 2, 7,  2, 5, 0,  4, 1, 8,  6, 5, 1,  2, 3, 9
  so the eight private my_sum values are 8, 19, 7, 15, 7, 13, 12, 14.

Example (cont.)
- Once all the cores are done computing their private my_sum, they form a global sum by sending their results to a designated "master" core, which adds them to produce the final result (a message-passing sketch follows).
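(The deck introduces MPI only later, but a minimal MPI rendering of this master-core scheme may make it concrete. The partial-sum value below is a stand-in; in a real program it would come from the partial-sum loop above.)

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int my_rank, p;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
        MPI_Comm_size(MPI_COMM_WORLD, &p);

        int my_sum = my_rank + 1;   /* stand-in for this core's partial sum */

        if (my_rank == 0) {         /* the designated "master" core */
            int sum = my_sum, value;
            for (int core = 1; core < p; core++) {
                /* p - 1 receives and p - 1 additions, all on core 0 */
                MPI_Recv(&value, 1, MPI_INT, core, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                sum += value;
            }
            printf("Global sum = %d\n", sum);
        } else {                    /* every other core sends once */
            MPI_Send(&my_sum, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }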

Example (cont.)
- Global sum: 8 + 19 + 7 + 15 + 7 + 13 + 12 + 14 = 95.
- But wait! There's a much better way to compute the global sum.

Better parallel algorithm
- Don't make the master core do all the work. Share it among the other cores.
- Pair the cores so that core 0 adds its result with core 1's result, core 2 adds its result with core 3's result, etc.
- Work with odd- and even-numbered pairs of cores.

Better parallel algorithm (cont.)
- Repeat the process, now with only the evenly ranked cores: core 0 adds the result from core 2, core 4 adds the result from core 6, etc.
- Now the cores divisible by 4 repeat the process, and so forth, until core 0 has the final result.

Multiple cores forming a global sum
- (Figure: the tree-structured pairing described above; see the sketch below.)
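(A sketch of the tree-structured sum, continuing the MPI example above: at each step the cores pair up, half of them send their sums and drop out, and after about log2(p) steps core 0 holds the total. The pairing logic is one reasonable reconstruction of what the figure shows.)

    /* Tree-structured global sum, replacing the master-core loop above.
       Step 1 (divisor = 2): core 0 adds core 1's sum, core 2 adds core 3's, ...
       Step 2 (divisor = 4): core 0 adds core 2's sum, core 4 adds core 6's, ...
       and so on, until core 0 has the final result. */
    int sum = my_sum, value;
    for (int divisor = 2; divisor / 2 < p; divisor *= 2) {
        if (my_rank % divisor == 0) {
            int partner = my_rank + divisor / 2;
            if (partner < p) {   /* partner may not exist if p isn't a power of 2 */
                MPI_Recv(&value, 1, MPI_INT, partner, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                sum += value;
            }
        } else if (my_rank % divisor == divisor / 2) {
            MPI_Send(&sum, 1, MPI_INT, my_rank - divisor / 2, 0, MPI_COMM_WORLD);
            break;               /* this core's part is done */
        }
    }
    /* core 0 now holds the global sum */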

Analysis
- In the first example, the master core performs 7 receives and 7 additions.
- In the second example, the master core performs 3 receives and 3 additions.
- The improvement is more than a factor of 2!

Analysis (cont.)
- The difference is more dramatic with a larger number of cores. If we have 1000 cores:
  - The first example would require the master to perform 999 receives and 999 additions.
  - The second example would only require 10 receives and 10 additions, since the tree halves the number of active cores at each step and ceil(log2(1000)) = 10.
- That's an improvement of almost a factor of 100!

How do we write parallel programs?
- Task parallelism: partition the various tasks carried out in solving the problem among the cores.
- Data parallelism: partition the data used in solving the problem among the cores; each core carries out similar operations on its part of the data.
- (A schematic contrast of the two follows.)
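(For the global-sum example, the two strategies partition different things. A schematic contrast, reusing names from the earlier sketches:)

    /* Data parallelism: every core runs the SAME code on a DIFFERENT
       slice of the data, namely the partial-sum loop from before. */
    int my_sum = Partial_sum(my_rank, p, n);

    /* Task parallelism: cores carry out DIFFERENT tasks.  In the global
       sum, the master core's receiving and its adding are two distinct
       tasks; one could, for instance, dedicate one thread to receiving
       values into a buffer while another drains the buffer and adds. */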

Professor P
- 15 questions, 300 exams.

Professor P's grading assistants
- TA#1, TA#2, TA#3.

Division of work: data parallelism
- TA#1: 100 exams. TA#2: 100 exams. TA#3: 100 exams.

Division of work: task parallelism
- TA#1: questions 1-5. TA#2: questions 6-10. TA#3: questions 11-15.

Division of work: data parallelism
- (Figure: the 300 exams divided into three stacks of 100.)

Division of work: task parallelism
- (Figure: the global-sum example's tasks: receiving and addition.)

Coordination
- Cores usually need to coordinate their work.
- Communication: one or more cores send their current partial sums to another core.
- Load balancing: share the work evenly among the cores so that no one core is heavily loaded.
- Synchronization: because each core works at its own pace, make sure cores do not get too far ahead of the rest (a shared-memory sketch follows).
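(As a shared-memory illustration of communication and synchronization; Pthreads is covered later in the course, so this is only a sketch. Each thread "communicates" its partial sum by updating a shared total, and a mutex provides the synchronization that keeps concurrent updates safe.)

    #include <pthread.h>

    /* Shared state: the running total and a mutex that serializes updates. */
    static long global_sum = 0;
    static pthread_mutex_t sum_mutex = PTHREAD_MUTEX_INITIALIZER;

    /* Called by each thread with its private partial sum. */
    void Add_partial_sum(long my_sum) {
        pthread_mutex_lock(&sum_mutex);    /* wait for exclusive access   */
        global_sum += my_sum;              /* the shared-memory "message" */
        pthread_mutex_unlock(&sum_mutex);  /* let the next thread in      */
    }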

What we'll be doing
- Learning to write programs that are explicitly parallel.
- Using the C language.
- Using three different extensions to C:
  - Message-Passing Interface (MPI)
  - POSIX Threads (Pthreads)
  - OpenMP (see the sketch below)
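(Of the three, OpenMP makes the running example especially compact. A sketch, with Compute_next_value as in the earlier stand-in:)

    #include <omp.h>

    /* The whole global-sum example in OpenMP: the reduction clause gives
       each thread a private partial sum and combines them at the end. */
    long Global_sum(int n) {
        long sum = 0;
        #pragma omp parallel for reduction(+: sum)
        for (int i = 0; i < n; i++)
            sum += Compute_next_value(i);
        return sum;
    }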

Types of parallel systems
- Shared-memory: the cores can share access to the computer's memory; coordinate the cores by having them examine and update shared memory locations.
- Distributed-memory: each core has its own, private memory; the cores must communicate explicitly by sending messages across a network.

Types of parallel systems
- (Figure: a shared-memory system and a distributed-memory system, juxtaposed; compare the sketches below.)
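(The two models map directly onto the earlier sketches; juxtaposed schematically, with names from those sketches:)

    /* Shared-memory coordination: examine and update a shared location
       (the Pthreads sketch above). */
    pthread_mutex_lock(&sum_mutex);
    global_sum += my_sum;
    pthread_mutex_unlock(&sum_mutex);

    /* Distributed-memory coordination: an explicit message across the
       network (the MPI sketch above). */
    MPI_Send(&my_sum, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);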

Terminology
- Concurrent computing: a program is one in which multiple tasks can be in progress at any instant.
- Parallel computing: a program is one in which multiple tasks cooperate closely to solve a problem.
- Distributed computing: a program may need to cooperate with other programs to solve a problem.

Concluding remarks (1)
- The laws of physics have brought us to the doorstep of multicore technology.
- Serial programs typically don't benefit from multiple cores.
- Automatic parallel program generation from serial program code isn't the most efficient approach to getting high performance from multicore computers.

Concluding remarks (2)
- Learning to write parallel programs involves learning how to coordinate the cores.
- Parallel programs are usually very complex and therefore require sound program techniques and development.
