Parallel architecture

Parallel computing is the simultaneous execution of the same task (split up and specially adapted) on multiple processors in order to obtain results faster. The idea is based on the fact that the process of solving a problem usually can be divided into smaller tasks, which may be carried out simultaneously with some coordination.



Parallel computing systems

A parallel computing system is a computer with more than one processor for parallel processing. Recent multicore processors, chips with more than one processing unit (core), are also well suited to building parallel computing systems.


There are many different kinds of parallel computers. They are distinguished by the kind of interconnection between processors (known as "processing elements" or PEs) and between processors and memory.


Traditionally, Flynn's taxonomy, a classification of computer architectures proposed by Michael J. Flynn in 1972, classifies parallel (and serial) computers according to:

  • whether all processors execute the same instruction at the same time (single instruction/multiple data -- SIMD), or
  • whether each processor executes different instructions (multiple instruction/multiple data -- MIMD), as sketched below.
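The following minimal sketch is not part of the original article; it contrasts the two styles under the assumption of a shared-memory machine, using OpenMP directives purely as a convenient notation.

```c
/* Sketch: SIMD- versus MIMD-style parallelism expressed with OpenMP.
 * Compile with, e.g.: cc -fopenmp simd_mimd.c */
#include <stdio.h>
#include <omp.h>

#define N 1024

int main(void) {
    static float a[N], b[N], c[N];

    /* SIMD style: every lane applies the same instruction (a multiply-add)
     * to different elements of the arrays. */
    #pragma omp simd
    for (int i = 0; i < N; i++)
        c[i] = a[i] * 2.0f + b[i];

    /* MIMD style: independent threads execute different instruction
     * streams at the same time. */
    #pragma omp parallel sections
    {
        #pragma omp section
        { printf("thread %d: scanning input\n", omp_get_thread_num()); }
        #pragma omp section
        { printf("thread %d: updating a model\n", omp_get_thread_num()); }
    }
    return 0;
}
```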

One major way to classify parallel computers is by their memory architecture. Shared memory parallel computers have multiple processors that access all available memory as a global address space. They can be further divided into two main classes based on memory access times: uniform memory access (UMA) and non-uniform memory access (NUMA). Distributed memory parallel computers, by contrast, also have multiple processors, but each processor can only access its own local memory; no global memory address space exists across them.


Parallel computing systems can also be categorized by the number of processors they contain. Systems with thousands of processors are known as massively parallel. A further distinction is drawn between "large scale" and "small scale" parallel processors, referring to the size of the system; a PC-based parallel system, for example, would be considered small scale.


Parallel processor machines are also divided into symmetric and asymmetric multiprocessors, depending on whether all processors are capable of running all of the operating system code and, say, accessing I/O devices, or whether some processors are more privileged than others.


A variety of architectures have been developed to accomplish parallel processing. One example is a ring architecture, in which processors are linked in a ring so that all of them can read and write data simultaneously, giving a true parallel processing environment.


Theory and practice

Parallel computers are theoretically modeled as Parallel Random Access Machines (PRAMs). The PRAM model ignores the cost of interconnection between the constituent computing units, but is nevertheless very useful in providing upper bounds on the parallel solvability of many problems. In reality the interconnection plays a significant role.


The processors may either communicate in order to cooperate in solving a problem, or they may run completely independently, possibly under the control of another processor which distributes work to the others and collects results from them (a "processor farm").
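As an illustration only (not from the original article), the sketch below expresses a processor-farm-like pattern on a shared-memory machine using OpenMP tasks; the function name, job count, and "work" performed are all made up.

```c
/* A hypothetical "farm": one thread hands out work items, any free thread
 * picks them up, and results are collected in a shared array.
 * Compile with, e.g.: cc -fopenmp farm.c */
#include <stdio.h>

#define JOBS 16

static int process(int job) { return job * job; }   /* stand-in for real work */

int main(void) {
    int result[JOBS];

    #pragma omp parallel
    #pragma omp single              /* the "farmer": one thread creates all tasks */
    for (int j = 0; j < JOBS; j++) {
        #pragma omp task firstprivate(j) shared(result)
        result[j] = process(j);     /* the "workers": any idle thread runs a task */
    }
    /* All tasks are complete once the parallel region ends. */

    for (int j = 0; j < JOBS; j++)
        printf("job %d -> %d\n", j, result[j]);
    return 0;
}
```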


Processors in a parallel computer may communicate with each other in a number of ways, including shared (either multiported or multiplexed) memory, a crossbar, a shared bus, or an interconnect network with any of a myriad of topologies, including star, ring, tree, hypercube, fat hypercube (a hypercube with more than one processor at a node), or an n-dimensional mesh. Parallel computers based on interconnect networks need to employ some kind of routing to enable the passing of messages between nodes that are not directly connected. The communication medium is likely to be hierarchical in large multiprocessor machines. Similarly, memory may be either private to a processor, shared between a number of processors, or globally shared. A systolic array is an example of a multiprocessor with fixed-function nodes, local-only memory and no message routing.
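To make the topology idea concrete, here is a small sketch (added for illustration, not part of the original article) that lists a node's neighbours in two of the topologies mentioned above, a ring and a hypercube, assuming the common convention of numbering nodes 0..p-1.

```c
/* Sketch: a node's neighbours in a ring and in a hypercube interconnect. */
#include <stdio.h>

int main(void) {
    int p = 8;        /* number of nodes (a power of two for the hypercube) */
    int node = 5;     /* the node whose neighbours we list */

    /* Ring: each node is linked to its predecessor and successor. */
    printf("ring neighbours of %d: %d and %d\n",
           node, (node + p - 1) % p, (node + 1) % p);

    /* Hypercube of dimension d = log2(p): node i is linked to every node
     * whose binary label differs from i in exactly one bit. */
    int d = 0;
    while ((1 << d) < p) d++;
    printf("hypercube neighbours of %d:", node);
    for (int k = 0; k < d; k++)
        printf(" %d", node ^ (1 << k));
    printf("\n");
    return 0;
}
```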


Approaches to parallel computers include:

  • Multiprocessing
  • Computer clusters
  • Parallel supercomputers
  • Distributed computing
  • Non-uniform memory access (NUMA) multiprocessors
  • Symmetric multiprocessing (SMP)
  • Grid computing

Performance vs. cost

While a system of n parallel processors is less efficient than one n-times-faster processor, the parallel system is often cheaper to build. Parallel computation is an excellent solution for tasks which require very large amounts of computation, have time constraints on completion, and especially for those which can be divided into n execution threads. In fact, in recent years, most high performance computing systems, also known as supercomputers, have a parallel architecture.


Terminology in parallel computing

Some frequently used terms in parallel computing are:

Task
a logically high level, discrete, independent section of computational work. A task is typically executed by a processor as a program.
Synchronization
the coordination of simultaneous tasks to ensure correctness and avoid unexpected race conditions.
Speedup
also called parallel speedup, defined as the wall-clock time of the best serial execution divided by the wall-clock time of the parallel execution.
Parallel overhead
the extra work associated with a parallel version compared to its sequential code, mostly the extra CPU time and memory space required for synchronization, data communication, and creating and tearing down the parallel environment.
Scalability
a parallel system's ability to gain proportionate increase in parallel speedup with the addition of more processors.
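As a small worked example, the definitions above can be turned into arithmetic; the timings below are made-up numbers, not measurements from the article, and "efficiency" is an extra derived quantity not defined above.

```c
/* Sketch: speedup, efficiency and overhead from assumed wall-clock times. */
#include <stdio.h>

int main(void) {
    double t_serial   = 120.0;  /* best serial wall-clock time, seconds (assumed) */
    double t_parallel = 18.0;   /* parallel wall-clock time on p processors (assumed) */
    int    p          = 8;

    double speedup    = t_serial / t_parallel;    /* parallel speedup */
    double efficiency = speedup / p;              /* fraction of the ideal p-fold speedup */
    /* One measure of parallel overhead: total CPU time beyond what the serial run needed. */
    double overhead   = p * t_parallel - t_serial;

    printf("speedup    = %.2f\n", speedup);
    printf("efficiency = %.0f%%\n", efficiency * 100.0);
    printf("overhead   = %.1f CPU-seconds\n", overhead);
    return 0;
}
```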


Algorithms

It should not be imagined that successful parallel computing is a matter of obtaining the required hardware and connecting it suitably. The difficulty of cooperative problem solving is aptly demonstrated by the following dubious reasoning:

If it takes one man one minute to dig a post-hole then sixty men can dig it in one second.

In practice, linear speedup (i.e., speedup proportional to the number of processors) is very difficult to achieve. This is because many algorithms are essentially sequential in nature (Amdahl's law states this more formally).
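The following sketch, added here for illustration (the 5% serial fraction is an assumed figure), evaluates the bound that Amdahl's law places on speedup.

```c
/* Sketch: Amdahl's law.  If a fraction f of the work is inherently serial,
 * the speedup on n processors is bounded by 1 / (f + (1 - f)/n). */
#include <stdio.h>

int main(void) {
    double f = 0.05;   /* assumed serial fraction: 5% of the work */
    for (int n = 1; n <= 1024; n *= 4) {
        double s = 1.0 / (f + (1.0 - f) / n);
        printf("n = %4d processors -> speedup <= %.2f\n", n, s);
    }
    /* As n grows without bound the speedup approaches 1/f = 20, however many
     * processors are added: this is why linear speedup is rare. */
    return 0;
}
```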


Up to a certain point, certain workloads can benefit from pipeline parallelism when extra processors are added. This uses a factory assembly line approach to divide the work. If the work can be divided into n stages where a discrete deliverable is passed from stage to stage, then up to n processors can be used. However, the slowest stage will hold up the other stages, so it is rare to be able to fully use n processors.
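A back-of-the-envelope model of that effect follows; it is a sketch with made-up stage times, not data from the article.

```c
/* Sketch: the assembly-line model.  With one processor per stage, total time
 * is governed by the slowest stage once the pipeline is full. */
#include <stdio.h>

int main(void) {
    double stage[] = { 1.0, 2.5, 1.5, 1.0 };   /* seconds per item, per stage (assumed) */
    int n_stages = 4;
    int items    = 100;

    double per_item = 0.0, slowest = 0.0;
    for (int i = 0; i < n_stages; i++) {
        per_item += stage[i];
        if (stage[i] > slowest) slowest = stage[i];
    }

    /* Fill the pipeline once, then one item completes every 'slowest' seconds. */
    double pipelined = per_item + (items - 1) * slowest;
    printf("serial:    %.1f s\n", items * per_item);
    printf("pipelined: %.1f s (speedup %.2f with %d processors)\n",
           pipelined, items * per_item / pipelined, n_stages);
    return 0;
}
```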


Most algorithms must be redesigned in order to make effective use of parallel hardware; see parallel algorithms. Programs which work correctly in a single-CPU system may not do so in a parallel environment. This is because multiple copies of the same program may interfere with each other, for instance by accessing the same memory location at the same time. Therefore, careful programming is required in a parallel system.
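Below is a concrete (deliberately broken, then repaired) sketch of such interference, assuming a shared-memory system and using OpenMP only as one convenient notation; it is not from the original article.

```c
/* Sketch: a data race on a shared sum, and its repair with a reduction.
 * Compile with, e.g.: cc -fopenmp race.c */
#include <stdio.h>

#define N 1000000

int main(void) {
    static int a[N];
    for (int i = 0; i < N; i++) a[i] = 1;

    long sum = 0;
    /* Racy: every thread reads and writes the same memory location, so
     * updates can be lost and the printed result may be less than N. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++) sum += a[i];
    printf("racy sum:    %ld\n", sum);

    sum = 0;
    /* Correct: each thread accumulates privately and the partial sums are
     * combined at the end. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) sum += a[i];
    printf("reduced sum: %ld\n", sum);
    return 0;
}
```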


Superlinear speedup, the effect of an N-processor machine completing a task more than N times faster than a machine with a single processor similar to those in the multiprocessor, has at times been a controversial issue (and has led to much benchmarking). It can, however, be brought about by effects such as the multiprocessor machine having not just N times the processing power but also N times the cache and memory, thus flattening the cache-memory-disk hierarchy, and by more efficient use of memory by the individual processors due to partitioning of the problem, among other effects. Similar claims of boosted efficiency are sometimes made for the use of a cluster of cheap computers as a replacement for a large multiprocessor, but again the actual results depend greatly on the problem at hand and on the ability to partition the problem in a way that is conducive to clustering.


Parallel problems

Well-known parallel software problem sets are:

  • Embarrassingly parallel problems, which can be split into a very large number of parallel tasks with no essential dependency or communication between them
  • Grand Challenge problems, a general category of unsolved problems

Parallel programming

Parallel programming involves designing, implementing, and tuning parallel computer programs which take advantage of underlying parallel computing systems. It also refers to applying parallel programming methods to existing serial programs (parallelization) in order to solve larger problems, to reduce execution time, or both.


Parallel programming focuses on partitioning the overall problem into separate tasks, allocating the tasks to processors, and synchronizing the tasks to get meaningful results. Before parallel programming can be applied, one must first determine whether the target problem is inherently parallelizable, ideally with little data dependence. The problem can then be partitioned by domain decomposition, functional decomposition, or a hybrid of the two.
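As an illustrative sketch of domain decomposition (the problem size and processor count are arbitrary assumptions), the block partitioning of a one-dimensional domain might look like this:

```c
/* Sketch: block-wise domain decomposition of n cells across p processors. */
#include <stdio.h>

int main(void) {
    int n = 103;   /* problem size (assumed) */
    int p = 4;     /* number of processors (assumed) */

    for (int rank = 0; rank < p; rank++) {
        int base  = n / p, rem = n % p;
        int count = base + (rank < rem ? 1 : 0);             /* spread the remainder */
        int start = rank * base + (rank < rem ? rank : rem);
        printf("processor %d owns cells [%d, %d)\n", rank, start, start + count);
    }
    return 0;
}
```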


There are two major approaches to parallel programming: implicit parallelism, in which a compiler or run-time system automatically exploits the parallelism inherent in the computation, and explicit parallelism, in which the programmer specifies the parallel structure directly.

Many factors and techniques affect the final performance of a parallel program, for example:

  • Load balancing attempts to keep all processors busy by moving tasks from heavily loaded processors to less loaded ones (a brief sketch follows below).
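For instance, on a shared-memory system one common load-balancing lever is the loop schedule. The sketch below is an illustration with an artificial, uneven workload (not from the article): a dynamic schedule lets idle threads pull extra chunks of the loop.

```c
/* Sketch: load balancing an uneven loop with a dynamic OpenMP schedule.
 * Compile with, e.g.: cc -fopenmp balance.c -lm */
#include <math.h>
#include <stdio.h>

#define N 10000

/* Deliberately uneven amount of work per iteration. */
static double work(int i) {
    double x = 0.0;
    for (int k = 0; k < i; k++) x += sin((double)k);
    return x;
}

int main(void) {
    double total = 0.0;

    /* Chunks of 64 iterations are handed to whichever thread is free;
     * compare wall-clock time against schedule(static). */
    #pragma omp parallel for schedule(dynamic, 64) reduction(+:total)
    for (int i = 0; i < N; i++)
        total += work(i);

    printf("total = %f\n", total);
    return 0;
}
```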

Some people consider parallel programming to be synonymous with concurrent programming. Others draw a distinction between parallel programming, which uses well-defined and structured patterns of communication between processes and focuses on parallel execution of processes to enhance throughput, and concurrent programming, which typically involves defining new patterns of communication between processes that may have been made concurrent for reasons other than performance. In either case, communication between processes is performed either via shared memory or with message passing, either of which may be implemented in terms of the other.
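As a sketch of the message-passing style (using MPI, one of the mainstream models listed below; the payload value and process count are arbitrary), data moves only by explicit messages rather than through a shared address space.

```c
/* Sketch: point-to-point message passing with MPI.
 * Build and run with, e.g.: mpicc msg.c && mpirun -np 2 ./a.out */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        int payload = 42;
        /* No shared address space: rank 0 must send the value explicitly. */
        MPI_Send(&payload, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int payload;
        MPI_Recv(&payload, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d from rank 0\n", payload);
    }

    MPI_Finalize();
    return 0;
}
```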


Parallel programming models

Main article: Parallel programming model


A parallel programming model is a set of software technologies used to express parallel algorithms and to match applications with the underlying parallel systems. It encompasses applications, languages, compilers, libraries, communication systems, and parallel I/O. Developers must choose a suitable parallel programming model, or a mixture of models, to develop their parallel applications for a particular platform.


Parallel models are implemented in several ways: as libraries invoked from traditional sequential languages, as language extensions, or as entirely new execution models. They are also roughly categorized for two kinds of systems: shared memory systems and distributed memory systems, though the lines between them are largely blurred nowadays.
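To illustrate the first two implementation styles, the sketch below (added here, not from the original article) writes the same trivially parallel work against a threading library (POSIX threads) and as a compiler directive (OpenMP); the thread count and messages are arbitrary.

```c
/* Sketch: a library call versus a language extension for the same parallel work.
 * Compile with, e.g.: cc -fopenmp -pthread models.c */
#include <pthread.h>
#include <stdio.h>

#define N 8

static void *hello(void *arg) {
    printf("pthread worker %ld\n", (long)arg);
    return NULL;
}

int main(void) {
    /* Library style: parallelism is explicit calls into a threading library. */
    pthread_t t[N];
    for (long i = 0; i < N; i++) pthread_create(&t[i], NULL, hello, (void *)i);
    for (int i = 0; i < N; i++)  pthread_join(t[i], NULL);

    /* Language-extension style: a directive asks the compiler to parallelize. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        printf("OpenMP iteration %d\n", i);

    return 0;
}
```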


Currently, mainstream parallel programming models include:

  • Parallel Virtual Machine (PVM)
  • Message Passing Interface (MPI)
  • OpenMP
  • Unified Parallel C (UPC)
  • High Performance Fortran (HPF)

Topics in parallel computing

Computer science topics:

  • Automatic parallelization
  • Parallel algorithms
  • Cellular automata
  • Lazy evaluation versus eager evaluation
  • The complexity class NC
  • Communicating sequential processes (CSP)
  • Dataflow architecture

Practical problems:

  • Parallel computer interconnects
  • Parallel computer I/O
  • Reliability problems in large systems

Programming languages/models:

  • OpenMP
  • MPI
  • Occam
  • Linda
  • Cilk


Specific machines:

  • Atari Transputer Workstation
  • BBN Butterfly
  • Beowulf clusters
  • Blue Gene
  • Deep Blue
  • Fifth Generation Computer Systems project
  • ILLIAC III
  • ILLIAC IV
  • Meiko Computing Surface
  • nCUBE
  • INMOS Transputer

Parallel computing to increase fault tolerance:

  • Master-checker

Companies (largely historical):

  • Thinking Machines Corporation
  • Convex Computer
  • Meiko Scientific
  • Control Data Corporation

See also

  • Computer cluster
  • Concurrent computing
  • DNA computing
  • Grid computing
  • List of important publications in computer science
  • Parallel rendering

References

This article was originally based on material from the Free On-line Dictionary of Computing, which is licensed under the GFDL.
  • Introduction to Parallel Computing: http://www.llnl.gov/computing/tutorials/parallel_comp/

