The Intel iAPX 432 was Intel's first 32-bit microprocessor design, introduced in 1981 as a set of three integrated circuits. The iAPX 432 was intended to be Intel's major design for the 1980s, implementing many advanced multitasking and memory management features in hardware, which led them to refer to the design as the Micromainframe.
The processor's data structure support allowed modern operating systems to be implemented on it using far less program code than on ordinary CPUs, because the 432 would do much of the work internally in hardware. However, the design was extremely complex compared to the mainstream microprocessors of the era, so much so that Intel's engineers were unable to translate it into an efficient implementation using the semiconductor technology of the day. The resulting CPU was very slow and expensive, and Intel's plans to replace the x86 architecture with the iAPX 432 ended in failure.
The abbreviation iAPX prefixing the model name reportedly stands for intel Advanced Processor architecture, the X coming from the Greek letter chi.
The 432 project started in 1975 as the 8800, so named as a follow-on to the existing 8008 and 8080 CPUs. The design was intended to be purely 32-bit from the outset and to be the backbone of Intel's processor offerings in the 1980s. As such, it was to be considerably more powerful and complex than Intel's existing "simple" offerings. However, the design was well beyond the capabilities of the process technology of the era and had to be split across several individual chips.
The core of the design was the two-chip General Data Processor (GDP), which was the main processor. The GDP was split in two: one chip (the 43201) handled the fetching and decoding of instructions, and the other (the 43202) executed them. Most systems would also include the 43203 Interface Processor (IP), which operated as a channel controller for I/O. Together the three-chip system used about 250,000 logic gates, making it one of the largest designs of its era; for comparison, the contemporary Motorola 68000 contained about 68,000, roughly a third of them for its microcode.
In 1983 Intel released two additional integrated circuits for the iAPX 432 Interconnect Architecture, the 43204 Bus Interface Unit (BIU) and 43205 Memory Control Unit (MCU). These chips allowed for nearly glueless multiprocessor systems with up to 63 nodes.
The project's failures
Several design features of the iAPX 432 conspired to make it much slower than it could have been. The two-chip implementation of the GDP limited it to the speed of the motherboard's electrical wiring, although this was a minor issue. The lack of reasonable caches and registers was considerably more serious. The instruction set also hindered performance by using bit-aligned variable-length instructions, as opposed to the word-aligned fixed-length instructions used in the majority of designs, making instruction decoding complex and slow. In addition, the BIU was designed to support fault-tolerant systems, and in doing so added considerable overhead to the bus, with up to 40% of the bus time spent in wait states.
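The decoding cost of bit-aligned, variable-length instructions can be illustrated with a minimal sketch. The instruction format below (a 2-bit class field that selects the opcode width) is purely hypothetical and is not the 432's real encoding; the names BitReader, pack and decode are likewise invented for illustration. The point is only that every field must be shifted and masked out of the stream, and that the start of the next instruction is unknown until the current one has been decoded.

```python
class BitReader:
    """Reads fields of arbitrary width from a byte string, least significant bit first."""

    def __init__(self, data: bytes):
        self.value = int.from_bytes(data, "little")
        self.pos = 0  # current bit offset; need not be a multiple of 8

    def read(self, width: int) -> int:
        # Shift the stream down to the current bit and mask off one field.
        field = (self.value >> self.pos) & ((1 << width) - 1)
        self.pos += width
        return field


def pack(fields):
    """Pack (value, width) pairs into bytes, least significant bit first."""
    acc, pos = 0, 0
    for value, width in fields:
        acc |= (value & ((1 << width) - 1)) << pos
        pos += width
    return acc.to_bytes((pos + 7) // 8, "little")


# Hypothetical format: the 2-bit class field determines the opcode width.
OPCODE_WIDTH = {0: 4, 1: 6, 2: 9, 3: 13}


def decode(reader: BitReader):
    """Decode one instruction; the next one starts wherever this one ends."""
    instruction_class = reader.read(2)
    opcode = reader.read(OPCODE_WIDTH[instruction_class])
    return instruction_class, opcode


stream = pack([(1, 2), (42, 6), (0, 2), (15, 4), (2, 2), (300, 9)])
reader = BitReader(stream)
first = decode(reader)   # occupies bits 0-7
second = decode(reader)  # occupies bits 8-13, so the third starts mid-byte
third = decode(reader)
```

With word-aligned fixed-length instructions, by contrast, the position of every instruction and field is known in advance, so no such sequential dependency exists.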
Post-project research suggested that the biggest problem was in the compiler, which used high-cost "general" instructions in every case, instead of high-performance simpler ones where they would have made sense. For instance, the iAPX 432 included a very expensive inter-module procedure call instruction, which the compiler used for all calls, ignoring the much faster branch and link instructions. Another very slow instruction was enter_environment, which set up the memory protection. The compiler ran this for every single variable in the system, even though the vast majority were running inside an existing environment and did not have to be checked. To make matters worse, it always passed data to and from procedures by value rather than by reference, requiring huge memory copies in many cases.
Impact and similar designs
The market appears to have drawn the wrong lesson from the iAPX 432's failure. The general conclusion was that object support in the chip leads to a complex design that will invariably run slowly. However, it appears that the object-oriented support was not the problem at all; the problems were much more general and would have made any chip design slow. Since the iAPX 432, no one has attempted a similar design, although the INMOS Transputer's process support was similar, and very fast.
Intel had spent considerable time, money and mindshare on the 432, had a skilled team devoted to it, and was loath to abandon it entirely after its failure in the marketplace. A new architect, Glenford Myers, was brought in to produce a new design for the core processor, which would be built in a joint Intel/Siemens project (later BiiN), resulting in the i960-series processors. The i960 RISC subset became popular for a time in the embedded processor market, but the high-end 960MC and the tagged-memory 960MX saw even less use than the 432.
Object-oriented memory and capabilities
The iAPX 432 has hardware and microcode support for object-oriented programming. The system uses segmented memory, with up to 2^24 segments of up to 64 Kilobytes each, providing a total virtual address space of 2^40 bytes. The physical address space is 2^24 bytes (16 Megabytes).
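The address-space figures follow directly from the segment count and segment size: 2^24 segments of 2^16 bytes (64 Kilobytes) each give 2^40 bytes of virtual space, against 2^24 bytes of physical space. A short check of the arithmetic, with constant names chosen here purely for illustration:

```python
SEGMENT_INDEX_BITS = 24     # up to 2^24 segments
SEGMENT_OFFSET_BITS = 16    # up to 64 Kilobytes per segment
PHYSICAL_ADDRESS_BITS = 24  # 16 Megabytes of physical memory

segment_size = 1 << SEGMENT_OFFSET_BITS
virtual_bytes = (1 << SEGMENT_INDEX_BITS) * segment_size
physical_bytes = 1 << PHYSICAL_ADDRESS_BITS

print(f"segment size:   {segment_size // 1024} Kilobytes")
print(f"virtual space:  2^{virtual_bytes.bit_length() - 1} bytes")
print(f"physical space: {physical_bytes // (1024 * 1024)} Megabytes")
```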
Programs are not able to reference data or instructions by address; instead they must specify a segment and an offset within the segment. Segments are referenced by Access Descriptors (ADs), which provide an index into the system object table and a set of rights (capabilities) governing accesses to that segment. Segments may be access segments, which can only contain Access Descriptors, or data segments which cannot contain ADs. The hardware and microcode rigidly enforce the distinction between data and access segments, and will not allow software to treat data as access descriptors, or vice versa.
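The addressing rules above can be modelled in a few lines. This is a simplified sketch, not the 432's actual data layout: the class and method names (Machine, read_byte, store_ad and so on) are hypothetical, and rights are reduced to read and write. It shows the two checks the text describes: every access goes through an Access Descriptor whose rights are verified, and data segments and access segments are never interchangeable.

```python
from dataclasses import dataclass
from enum import Flag, auto


class Rights(Flag):
    NONE = 0
    READ = auto()
    WRITE = auto()


class ProtectionFault(Exception):
    """Raised where the real hardware would signal a protection violation."""


@dataclass(frozen=True)
class AccessDescriptor:
    index: int      # index into the system object table
    rights: Rights  # capabilities granted to the holder


@dataclass
class DataSegment:
    data: bytearray  # raw bytes only; never holds an AccessDescriptor


@dataclass
class AccessSegment:
    slots: list  # each slot holds an AccessDescriptor or None


class Machine:
    def __init__(self):
        self.object_table = []

    def create(self, segment, rights):
        """Register a segment and return the descriptor that refers to it."""
        self.object_table.append(segment)
        return AccessDescriptor(len(self.object_table) - 1, rights)

    def _resolve(self, ad, needed, kind):
        if not isinstance(ad, AccessDescriptor):
            raise ProtectionFault("reference is not an access descriptor")
        if needed not in ad.rights:
            raise ProtectionFault(f"missing right: {needed.name}")
        segment = self.object_table[ad.index]
        if not isinstance(segment, kind):
            raise ProtectionFault("wrong segment kind for this operation")
        return segment

    def read_byte(self, ad, offset):
        return self._resolve(ad, Rights.READ, DataSegment).data[offset]

    def write_byte(self, ad, offset, value):
        self._resolve(ad, Rights.WRITE, DataSegment).data[offset] = value

    def load_ad(self, ad, slot):
        return self._resolve(ad, Rights.READ, AccessSegment).slots[slot]

    def store_ad(self, ad, slot, value):
        # Data can never be written into an access segment as if it were an AD.
        if not isinstance(value, AccessDescriptor):
            raise ProtectionFault("only access descriptors may be stored here")
        self._resolve(ad, Rights.WRITE, AccessSegment).slots[slot] = value
```

In this model, a descriptor created with only the READ right cannot be used with write_byte, and read_byte on an access segment faults regardless of rights, mirroring the rigid separation the hardware and microcode enforce.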
System-defined objects consist of either a single access segment, or an access segment and a data segment. System-defined segments contain data or access descriptors for system-defined data at designated offsets, though the operating system or user software may extend these with additional data. Each system object has a type field which is checked by microcode, such that a Port Object cannot be used where a Carrier Object is needed. User programs can define new object types which will get the full benefit of the hardware type checking, through the use of Type Control Objects (TCOs).
In Release 1 of the iAPX 432 architecture, a system-defined object typically consisted of an access segment, and optionally (depending on the object type) a data segment specified by an access descriptor at a fixed offset within the access segment.
By Release 3 of the architecture, in order to improve performance, access segments and data segments were combined into single segments of up to 128 Kilobytes, split into an access part and a data part of 0–64K each. This reduced the number of object table lookups dramatically, and doubled the maximum virtual address space.
Multitasking and interprocess communication
The iAPX 432 microcode implements multitasking, using objects in memory to represent the processor, processes, communication ports, and dispatching ports. Each processor is associated with a dispatching port, and when it is idle will attempt to dispatch a process from that dispatching port. When the process blocks or its time quantum expires, the processor re-enqueues that process at its dispatching port, then dispatches a new process from the dispatching port.
Interprocess communication is supported through the use of communication ports. A communication port is essentially a FIFO that can enqueue either messages waiting to be received by a process, or processes waiting to receive a message (but never both). A program can use the Send, Receive, Conditional Send, Conditional Receive, Surrogate Send, or Surrogate Receive instructions to communicate with other processes by sending messages to or receiving messages from communication ports. If there is no message enqueued at a communication port, a normal Receive instruction on that port will block the current process until a message is available. Similarly, a normal Send instruction will block the current process if the port is full. The Conditional Send and Conditional Receive instructions do not block, instead returning a boolean result indicating whether the operation succeeded. The Surrogate Send and Surrogate Receive instructions provide a Carrier object that can block in place of the process.
One of the elegant aspects of the iAPX 432 architecture is that a dispatching port is actually just a communication port whose messages are process objects, thus unifying the operation of process dispatching and interprocess communication and simplifying the underlying implementation.
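The port behaviour described above can be sketched as a single FIFO that holds either messages or waiting receivers, never both. This is a single-threaded model under stated assumptions: the Port class and its method names are hypothetical, a "blocked" receiver is represented simply as an inbox list parked in the queue, and blocking Send and the Surrogate instructions are not modelled.

```python
from collections import deque


class Port:
    """One FIFO holding either messages or waiting receivers, never both."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()
        self.holds_receivers = False  # meaningful only while the queue is non-empty

    def conditional_send(self, message):
        """Non-blocking send: True on success, False if the port is full."""
        if self.queue and self.holds_receivers:
            # A receiver is already waiting: hand the message over directly.
            self.queue.popleft().append(message)
            return True
        if len(self.queue) >= self.capacity:
            return False
        self.queue.append(message)
        self.holds_receivers = False
        return True

    def conditional_receive(self):
        """Non-blocking receive: (True, message) or (False, None)."""
        if self.queue and not self.holds_receivers:
            return True, self.queue.popleft()
        return False, None

    def receive(self, inbox):
        """Blocking receive: return a message, or park the inbox and return None."""
        ok, message = self.conditional_receive()
        if ok:
            return message
        self.queue.append(inbox)
        self.holds_receivers = True
        return None


# A dispatching port is the same structure with process objects as messages.
dispatching_port = Port(capacity=8)
dispatching_port.conditional_send("process A")
dispatching_port.conditional_send("process B")
ok, running = dispatching_port.conditional_receive()  # dispatch the oldest process
dispatching_port.conditional_send(running)            # quantum expired: re-enqueue it
```

Because dispatching reuses the same enqueue and dequeue operations, no separate scheduler queue is needed in the model, which is the simplification the text attributes to the real design.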
The iAPX 432 has hardware support for multiprocessing, using up to 64 processors (any combination of GDPs and IPs). Usually all GDPs share a common workload by using a single system-wide dispatching port, though it is possible to partition the workload by assigning some processors to different dispatching ports. With suitably designed hardware, processors can be added to or removed from the system on the fly.
From the outset, the iAPX 432 included support for fault tolerance. All of the 432's chips could be configured in pairs for Functional Redundancy Checking (FRC), in which one component, the master, operated normally, and a second, the checker, carried out the same internal operations in parallel and verified its results against those of the master.
FRC provides for failure detection, but full fault tolerance requires a recovery mechanism. Systems based on the Interconnect Architecture supported automatic failure recovery by combining pairs of FRC modules for Quad Modular Redundancy (QMR). In a QMR configuration, at any given time one FRC module is the primary and the other is the shadow. The two modules operate in lockstep, but the roles alternate in order to detect latent faults. The shadow module does not drive the bus. If a fault is detected in either FRC module, that module is disabled while the nonfaulted module continues operation. The software is notified, and can choose to let the system continue operating (without fault tolerance for that module), pair the module with a spare, or take the module offline (shifting its workload to other processors in the system for graceful performance degradation).
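The relationship between FRC and QMR can be sketched in software terms. This is a behavioural model only, with hypothetical names (FRCModule, QMRPair, on_fault) and ordinary functions standing in for hardware components; real lockstep checking happens at the bus level, not per function call. Each FRC module compares a master against a checker, and the QMR pair runs two such modules, alternates the primary role, and drops a module when its internal check fails.

```python
class FaultDetected(Exception):
    """Raised when a checker disagrees with its master."""


class FRCModule:
    """A master/checker pair; any disagreement counts as a detected fault."""

    def __init__(self, name, master, checker):
        self.name = name
        self.master = master
        self.checker = checker
        self.enabled = True

    def run(self, value):
        result = self.master(value)
        if self.checker(value) != result:
            raise FaultDetected(self.name)
        return result


class QMRPair:
    """Two FRC modules in lockstep; only the primary's result is used."""

    def __init__(self, first, second, on_fault):
        self.modules = [first, second]
        self.primary = 0
        self.on_fault = on_fault  # stands in for notifying the software

    def run(self, value):
        results = {}
        for i, module in enumerate(self.modules):
            if not module.enabled:
                continue
            try:
                results[i] = module.run(value)
            except FaultDetected:
                module.enabled = False  # the faulted module is disabled
                self.on_fault(module.name)
        if not results:
            raise RuntimeError("no working module left")
        if self.primary not in results:
            self.primary = next(iter(results))  # the survivor takes over
        output = results[self.primary]
        if len(results) == 2:
            self.primary = 1 - self.primary  # alternate roles while both are healthy
        return output


def double(x):
    return x * 2


def broken_double(x):
    return x * 2 + 1  # simulates a faulty component


faults = []
pair = QMRPair(
    FRCModule("A", double, double),
    FRCModule("B", double, broken_double),
    on_fault=faults.append,
)
first_result = pair.run(3)   # module B faults and is disabled
second_result = pair.run(4)  # module A continues alone
```

After the first call the pair keeps producing correct results, but without redundancy for that module, which corresponds to the "continue operating" option left to the software.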
The 43203 Interface Processor (IP) allows a more conventional microprocessor to be interfaced as an Attached Processor (AP) to an iAPX 432 system. The AP acts as an intelligent I/O controller. The IP allows the AP to access objects in the iAPX 432 memory through the use of memory-mapped windows, but enforces the access rights applicable to the objects.
The IP provides five memory windows. Four are used to map objects for I/O operations; the fifth is the control window and is used by the AP to perform control operations such as requesting changes to the mapping of the other windows.
The IP also offers a special "physical" mode in which the AP has unrestricted access to the entire iAPX 432 address space. Physical mode is intended to be used only for system startup and debugger support.
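The window mechanism can be pictured with a small model. Everything here is an assumption made for illustration: the class and method names are hypothetical, objects are plain byte arrays, access rights are reduced to a single writable flag, and physical mode is not modelled. The sketch shows four data windows that the AP can only use after a mapping request made through the control window, with the rights checked on every access.

```python
class WindowFault(Exception):
    """Raised when the attached processor oversteps what a window allows."""


class InterfaceProcessor:
    DATA_WINDOWS = 4  # the fifth window is the control window

    def __init__(self):
        self.windows = [None] * self.DATA_WINDOWS

    def control_request(self, window, segment, writable):
        """Stands in for a control-window request to remap a data window."""
        if not 0 <= window < self.DATA_WINDOWS:
            raise WindowFault("no such data window")
        self.windows[window] = (segment, writable)

    def _mapped(self, window):
        if not 0 <= window < self.DATA_WINDOWS or self.windows[window] is None:
            raise WindowFault("window is not mapped")
        return self.windows[window]

    def ap_read(self, window, offset):
        segment, _ = self._mapped(window)
        if not 0 <= offset < len(segment):
            raise WindowFault("offset outside the mapped object")
        return segment[offset]

    def ap_write(self, window, offset, value):
        segment, writable = self._mapped(window)
        if not writable:
            raise WindowFault("object is mapped read-only")
        if not 0 <= offset < len(segment):
            raise WindowFault("offset outside the mapped object")
        segment[offset] = value
```

For example, after control_request(0, buffer, writable=False), calls to ap_read on window 0 succeed while ap_write raises WindowFault, so the AP never sees more of the 432's memory than the mapped object permits.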
- Intel iAPX 432 (Computer Science project paper) (http://www.brouhaha.com/~eric/retrocomputing/intel/iapx432/cs460/) – By David King, Liang Zhou, Jon Bryson, David Dickson