Very long instruction word

A Very Long Instruction Word (VLIW) CPU architecture implements a form of instruction-level parallelism. Like superscalar architectures, it uses several execution units (e.g. two multipliers), which enables the CPU to execute several instructions at the same time (e.g. two multiplications).


Design

In superscalar designs, the number of execution units is invisible to the instruction set. Each instruction encodes only one operation. For most superscalar designs, the instruction width is 32 bits or fewer.


In contrast, one VLIW instruction encodes multiple operations, specifically at least one operation for each execution unit of the device. For example, if a VLIW device has 5 execution units, then a VLIW instruction for that device would have 5 operation fields, each field specifying the operation to be performed on the corresponding execution unit. To accommodate these operation fields, VLIW instructions are usually at least 64 bits wide (and in some architectures are much wider than that).
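
The operation-field layout described above can be sketched in a few lines of Python. The field widths, opcode numbers and function names below are assumptions made up for this illustration and do not correspond to any real VLIW instruction set: each of the 5 slots gets a 16-bit field (a 4-bit opcode and three 4-bit register numbers), giving an 80-bit instruction word.

```python
# Hypothetical encoding, for illustration only: 5 slots of 16 bits each.
SLOTS = 5
FIELD_BITS = 16
OPCODES = {"nop": 0, "add": 1, "mul": 2, "load": 3, "store": 4}


def encode_op(op, dst=0, src1=0, src2=0):
    """Pack one operation into a 16-bit field: opcode, dst, src1, src2 (4 bits each)."""
    for reg in (dst, src1, src2):
        if not 0 <= reg < 16:
            raise ValueError("register index out of range")
    return (OPCODES[op] << 12) | (dst << 8) | (src1 << 4) | src2


def encode_bundle(ops):
    """Pack exactly SLOTS operations into one wide word, slot 0 in the top field."""
    if len(ops) != SLOTS:
        raise ValueError("a bundle needs one operation per execution unit")
    word = 0
    for op in ops:
        word = (word << FIELD_BITS) | encode_op(*op)
    return word


def decode_bundle(word):
    """Split a wide word back into its per-slot 16-bit fields, slot 0 first."""
    mask = (1 << FIELD_BITS) - 1
    return [(word >> (FIELD_BITS * (SLOTS - 1 - i))) & mask for i in range(SLOTS)]


bundle = encode_bundle([
    ("mul", 1, 2, 3),   # slot 0: r1 = r2 * r3
    ("mul", 4, 5, 6),   # slot 1: r4 = r5 * r6
    ("add", 7, 8, 9),   # slot 2: r7 = r8 + r9
    ("nop",),           # slot 3: no useful work this cycle
    ("nop",),           # slot 4: no useful work this cycle
])
```

Here every execution unit always has a field in the instruction, even when that field holds a NOP; that is the property that distinguishes this layout from a one-operation-per-instruction encoding.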


Since the very earliest days of computer architecture, some CPUs have added several additional arithmetic logic units (ALUs) to run in parallel. Superscalar CPUs use hardware to decide which operations can run in parallel. VLIW CPUs use software (the compiler) to decide which operations can run in parallel.


For instance, the CPU might be able to perform two multiplications at the same time. However, the input of the second may depend on the result of the first. If so, the second of the two units "stalls" while it waits for the first one to finish. In a conventional CPU, such stalling is implemented in hardware. In a VLIW, the compiler predetermines the schedule of operations: while one multiplier is working on the first result, the compiler has scheduled a NOP for the other multiplier, until the result from the first multiply is ready. This substantially reduces the hardware's complexity.
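
As a rough illustration of compile-time scheduling with NOPs, the following Python sketch places operations into cycles in program order. It is a deliberately simple greedy placement, not the algorithm of any particular compiler, and it assumes fully pipelined, interchangeable units with a fixed latency; the function name and the data layout are hypothetical.

```python
def schedule(ops, num_units, latency):
    """Greedy static schedule.

    ops: list of (name, deps) in program order, where deps names earlier ops.
    Returns a list of cycles; each cycle has num_units entries, either an
    operation name or "nop". Assumes no operation is itself named "nop".
    """
    ready_at = {}   # op name -> first cycle in which its result can be used
    bundles = []
    for name, deps in ops:
        # An operation may not issue before all of its inputs are ready.
        cycle = max((ready_at[d] for d in deps), default=0)
        while True:
            while len(bundles) <= cycle:
                bundles.append(["nop"] * num_units)
            if "nop" in bundles[cycle]:
                bundles[cycle][bundles[cycle].index("nop")] = name
                break
            cycle += 1          # this cycle is full, try the next one
        ready_at[name] = cycle + latency
    return bundles


# Two multiplies, the second depending on the first, on two 2-cycle multipliers.
dependent = schedule([("m1", []), ("m2", ["m1"])], num_units=2, latency=2)
# Two independent multiplies fit into a single wide instruction.
independent = schedule([("m1", []), ("m2", [])], num_units=2, latency=2)
```

In the dependent case the schedule is three cycles long and most slots hold NOPs; in the independent case both multiplies share one cycle. The waiting is decided entirely before the program runs, which is why the hardware needs no stall logic for it.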


A similar problem occurs when the result of such an instruction is used as input for a branch. Most modern CPUs "guess" which branch will be taken even before the calculation is complete, so that they can load up the instructions for the branch, or (in some architectures) even start to compute them speculatively. If the CPU guesses wrong, all of these instructions and their context need to be "flushed" and the correct ones loaded, which is time-consuming.


This has led to increasingly complex instruction dispatch logic that attempts to guess right, and the simplicity of the original RISC designs has been eroded. VLIW lacks this logic, and therefore lacks its power consumption, possible design defects and other negative features.


In a VLIW, the compiler uses heuristics or profile information to guess the direction of a branch. This allows it to move and preschedule operations speculatively before the branch is taken, favoring the most likely path it expects through the branch. If the branch goes the unexpected way, the compiler has already generated compensation code to discard speculative results in order to preserve program semantics.


History

The term VLIW, and the VLIW architecture concept, were invented by Professor Josh Fisher in his research group at Yale University in the early 1980s. He had originally developed Trace Scheduling as a compilation technique for VLIW while he was a graduate student at New York University. Prior to VLIW, the notion of prescheduling functional units and instruction-level parallelism in software was well established in the practice of developing horizontal microcode. Fisher's innovations centered on developing a compiler that could target horizontal microcode from programs written in an ordinary programming language. He realized that in order to get good performance, and to target a wide-issue machine, it would be necessary to find parallelism beyond what one generally finds within basic blocks. He developed region scheduling techniques to identify parallelism beyond basic blocks. Trace Scheduling is such a technique, and involves scheduling the most likely path of basic blocks first, inserting compensation code to deal with speculative motions, scheduling the second most likely trace, and so on, until the schedule is complete.
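
The "most likely path first" ordering can be illustrated with a small Python sketch. It covers only the selection of traces from a control-flow graph, grown forward from the most frequently executed block; the scheduling of each trace and the insertion of compensation code are not shown. The graph representation, the frequency estimates and the function name are assumptions for this example and are not taken from Fisher's work.

```python
def select_traces(cfg, counts):
    """Pick traces greedily, hottest first.

    cfg:    block name -> list of (successor, branch probability)
    counts: block name -> estimated execution frequency
    Returns a list of traces, each a list of block names.
    """
    visited = set()
    traces = []
    # Seed each trace with the hottest block not yet placed in a trace.
    for seed in sorted(counts, key=lambda b: -counts[b]):
        if seed in visited:
            continue
        trace = []
        block = seed
        while block is not None and block not in visited:
            visited.add(block)
            trace.append(block)
            # Follow the most probable successor that is still unplaced.
            candidates = [(p, s) for s, p in cfg.get(block, []) if s not in visited]
            block = max(candidates)[1] if candidates else None
        traces.append(trace)
    return traces


# A diamond: A branches to B (90%) or C (10%), and both rejoin at D.
cfg = {
    "A": [("B", 0.9), ("C", 0.1)],
    "B": [("D", 1.0)],
    "C": [("D", 1.0)],
    "D": [],
}
counts = {"A": 100, "B": 90, "C": 10, "D": 100}
traces = select_traces(cfg, counts)
```

For this graph the first trace is the likely path A, B, D, and the rarely executed block C is left for a second trace of its own.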


Fisher's second innovation was the notion that the target CPU architecture should be designed to be a reasonable target for a compiler -- the compiler and the architecture for VLIW must be co-designed. This was partly inspired by the difficulty Fisher observed at Yale of compiling for architectures like the Floating Point Systems FPS164, which had a complex instruction set architecture that separated instruction initiation from the instructions that saved the result -- leading to the need for very complicated scheduling algorithms. Fisher developed a set of principles characterizing a proper VLIW design, such as self-draining pipelines, wide multi-port register files, and memory architectures. These principles made it easier for compilers to generate fast code.


The first VLIW compiler was described in a Ph.D. thesis by John Ellis, supervised by Fisher. John Ruttenberg also developed certain important algorithms for scheduling.


Fisher left Yale in 1984 to found a startup company, Multiflow, along with co-founders John O'Donnell and John Ruttenberg. Multiflow produced the TRACE series of VLIW minisupercomputers, shipping their first machines around 1988. Multiflow's VLIW could issue 28 operations in parallel per instruction. Multiflow failed as a business in 1990. The reasons for any business failure are complex; part of the challenge Multiflow faced was timing with respect to hardware implementation technology. Multiflow implemented its VLIW in an MSI/LSI/VLSI mix packaged in cabinets, a technology that fell out of favor when it became more cost-effective to integrate all of the components of a processor (excluding memory) on a single chip. Multiflow was too early to catch the following wave, when chip technology allowed multiple-issue CPUs. The major semiconductor companies recognized the value of Multiflow technology in this context, so the compiler and architecture were subsequently licensed to most of these companies.


There are instances of the Multiflow Trace machines in the computer museum.


Implementations

Cydrome was a company producing VLIW numeric processors using ECL technology in the same timeframe (the late 1980s). This company also failed after a few years.


One of the licensees of the Multiflow technology is Hewlett-Packard, which Fisher joined after Multiflow's failure. Bob Rau, founder of Cydrome, also joined HP after Cydrome failed. These two would lead computer architecture research within Hewlett-Packard during the 1990s.


In the 1990s, Hewlett-Packard researched this problem as a side effect of ongoing work on their PA-RISC processor family. They found that the CPU could be greatly simplified by removing the complex dispatch logic from the CPU and placing it into the compiler. Today's compilers are much more complex than those from the 1980s, so the added complexity in the compiler was considered to be a small cost.


VLIW CPUs are typically RISC-based and usually have four to eight main execution units. After compiling the program normally, the VLIW compiler re-orders the code into sequences of operations that have no dependencies on one another. These are then sliced into four or more parts (one for each unit of the CPU) and re-packaged together into one larger instruction, with additional information about which operation should run on which unit. The result is a single, much longer instruction word (thus the term "very long").


Philips' TriMedia processor as well as Intel's Itanium IA-64 EPIC processor are examples of VLIW CPUs. The Texas Instruments C6000 is an example of a VLIW DSP.


Many people felt that an early problem with VLIW processors was that they did not allow for backward compatibility. When silicon technology allowed wider implementations (with more execution units) to be built, programs compiled for the earlier generation of VLIW machine would not run on the wider implementation, because the binary instructions encoded the number of execution units of the narrower machine.


Transmeta addresses this issue by including a binary-to-binary software compiler layer (termed Code Morphing) in their Crusoe implementation of the x86 architecture. Basically, this mechanism is advertised to recompile, optimize, and translate x86 opcodes at runtime into the CPU's internal machine code. Thus, the Transmeta chip is internally a VLIW processor, effectively decoupled from the x86 CISC instruction set that it executes.


Intel's Itanium architecture (among others) solved this backward-compatibility problem with a more general mechanism. Within each of the multiple-opcode instructions, a bit field is allocated to denote dependency on the previous VLIW instruction within the program instruction stream. These bits are set at compile time, thus relieving the hardware of calculating this dependency information. Having this dependency information encoded into the instruction stream allows wider implementations to issue multiple non-dependent VLIW instructions in parallel per cycle, while cheaper, narrower implementations issue a smaller number of these VLIW instructions per cycle.
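
The effect of such a dependency bit can be sketched with a simplified model in Python. The model is an assumption made for illustration and is not the actual IA-64 encoding: each VLIW instruction carries one flag, a set flag is taken to mean "start a new group, because this instruction depends on what came before", every instruction completes in one cycle, and a machine of a given width can issue up to that many instructions from the same group per cycle.

```python
def issue_cycles(dep_bits, width):
    """Group an instruction stream into issue cycles.

    dep_bits[i] is True when instruction i depends on the preceding
    instruction and therefore must not issue in the same cycle as it.
    width is the number of VLIW instructions the machine issues per cycle.
    Returns a list of cycles, each a list of instruction indices.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    cycles = []
    for i, depends in enumerate(dep_bits):
        if not cycles or depends or len(cycles[-1]) == width:
            cycles.append([i])        # start a new issue cycle
        else:
            cycles[-1].append(i)      # issue alongside the previous instruction
    return cycles


# The same compiled program: instruction 2 depends on its predecessor.
program = [False, False, True, False, False, False]
narrow = issue_cycles(program, width=1)
medium = issue_cycles(program, width=2)
wide = issue_cycles(program, width=4)
```

The same stream of six instructions takes six cycles on the width-1 machine, three on the width-2 machine and two on the width-4 machine, without being recompiled; the dependency after instruction 1 is respected in every case.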


Another perceived deficiency of VLIW architectures is the code bloat that occurs when not all of the execution units have useful work to do and thus have to execute NOPs. This occurs when there are dependencies in the code and the functional pipelines must be allowed to drain before subsequent operations can proceed.
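
One rough way to quantify this bloat is the fraction of operation slots that hold NOPs. The Python sketch below computes it for a made-up schedule; the representation of a schedule as a list of cycles and the function name are assumptions for this example.

```python
def nop_fraction(bundles):
    """Return the share of operation slots occupied by NOPs (0.0 for no slots)."""
    total = sum(len(bundle) for bundle in bundles)
    if total == 0:
        return 0.0
    nops = sum(op == "nop" for bundle in bundles for op in bundle)
    return nops / total


# Hypothetical two-unit schedule: a multiply, a cycle spent waiting for its
# result, then a second multiply that uses that result.
waiting_schedule = [["m1", "nop"], ["nop", "nop"], ["m2", "nop"]]
fraction = nop_fraction(waiting_schedule)
```

In this schedule four of the six slots are NOPs, so two thirds of the encoded program does no useful work.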


As the number of transistors on a chip has grown, the perceived disadvantages of the VLIW have diminished in importance. The VLIW architecture is growing in popularity, particularly in the embedded market, where it is possible to customize a processor for an application in an embedded system-on-a-chip. Embedded VLIW products are available from several vendors, including Fujitsu, the ST231 from STMicroelectronics, the Jazz DSP from Improv Systems, and Silicon Hive. The Texas Instruments TMS320 DSP line has evolved, in its C6xxx family, to look more like a VLIW, in contrast to the earlier C5xxx family.


The Wikipedia article included on this page is licensed under the GFDL.