
Intel® XScale® Micro-Architecture


Definition: The Intel XScale micro-architecture is an implementation of the ARMv5TE architecture.

The XScale core supports dynamic frequency and voltage scaling, with a maximum frequency in handheld devices of 624 MHz at the time of writing and higher frequencies expected. The design is a scalar, in-order, single-issue architecture with concurrent execution in three pipes that support out-of-order return. To meet its frequency targets, it uses a 7-stage integer pipeline, with dynamic branch prediction to mitigate the cost of the deeper pipeline.
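The idea behind dynamic branch prediction with a target buffer can be sketched in C. This is a toy model, not the XScale hardware design: the direct-mapped organisation, the 2-bit saturating counters, the allocate-on-taken policy, and all names (`Btb`, `btb_predict`, `btb_update`) are assumptions made for illustration. Only the table size of 32 entries is taken from the article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BTB_ENTRIES 32u  /* table size quoted in the article */

typedef struct {
    uint32_t tag;      /* address of the branch that owns this entry */
    uint32_t target;   /* last observed branch target */
    uint8_t  counter;  /* assumed 2-bit counter: 0-1 not taken, 2-3 taken */
    uint8_t  valid;
} BtbEntry;

typedef struct { BtbEntry e[BTB_ENTRIES]; } Btb;

/* Assumed direct-mapped index: word-aligned PC modulo the table size. */
static unsigned btb_index(uint32_t pc) { return (pc >> 2) % BTB_ENTRIES; }

void btb_init(Btb *b)
{
    assert(b != NULL);
    for (unsigned i = 0; i < BTB_ENTRIES; i++) {
        b->e[i].tag = 0;
        b->e[i].target = 0;
        b->e[i].counter = 0;
        b->e[i].valid = 0;
    }
}

/* Returns 1 and writes *target when the branch at pc is predicted taken. */
int btb_predict(const Btb *b, uint32_t pc, uint32_t *target)
{
    const BtbEntry *x = &b->e[btb_index(pc)];
    if (x->valid && x->tag == pc && x->counter >= 2) {
        *target = x->target;
        return 1;
    }
    return 0;
}

/* Train the table with the resolved outcome of the branch at pc. */
void btb_update(Btb *b, uint32_t pc, int taken, uint32_t target)
{
    BtbEntry *x = &b->e[btb_index(pc)];
    if (!x->valid || x->tag != pc) {
        if (!taken)
            return;             /* allocate only for taken branches */
        x->valid = 1;
        x->tag = pc;
        x->target = target;
        x->counter = 2;         /* start at weakly taken */
        return;
    }
    if (taken) {
        if (x->counter < 3)
            x->counter++;
        x->target = target;
    } else if (x->counter > 0) {
        x->counter--;
    }
}
```

A correct prediction lets the fetch stage redirect early instead of waiting for the branch to resolve several stages later, which is why a deeper pipeline benefits more from it.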

To improve memory access efficiency, the Intel XScale micro-architecture contains separate instruction and data caches of 32 KB each. To hide memory latency, it supports software-issued prefetch coupled with advanced load and store buffering. Load buffering allows multiple data or cache-line requests to be outstanding to memory concurrently, reducing data loading overhead. Similarly, store buffers combine multiple neighbouring store operations to improve memory and bus utilization. For virtual memory address translation, the micro-architecture provides instruction and data translation look-aside buffers (TLBs) with 32 entries each. Dynamic branch prediction with a 32-entry target buffer significantly reduces the branch penalty of the deeper pipeline.
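Software-issued prefetch can be illustrated with a short C loop. This is a minimal sketch that assumes a GCC or Clang toolchain, whose `__builtin_prefetch` intrinsic is typically lowered to the ARM `PLD` instruction on cores that provide it. The function name and the `PREFETCH_DISTANCE` value are hypothetical; a suitable distance depends on the actual memory latency and on the cost of the loop body.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tuning constant: how many elements ahead to prefetch. */
#define PREFETCH_DISTANCE 16

/* Sum an array while hinting upcoming elements into the data cache. */
int64_t sum_with_prefetch(const int32_t *data, size_t n)
{
    assert(data != NULL || n == 0);
    int64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Arguments: address, 0 = prefetch for read, 3 = high
               temporal locality. This is a hint only and never changes
               the program's result. */
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 3);
        }
        total += data[i];
    }
    return total;
}
```

Because the prefetch is only a hint, the loop computes the same sum with or without it. Any benefit comes from the requested cache line arriving while earlier elements are still being processed.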

The Intel XScale micro-architecture supports the standard ARM coprocessor framework. Intel Wireless MMX™ technology is incorporated as a coprocessor on the Intel XScale® micro-architecture. The ARM architecture specifies that the main core is responsible for fetching instructions and data from memory and delivering them to the coprocessor. An instruction can be issued to the main core pipeline or to the coprocessor pipeline. For example, an instruction can be issued to the load pipeline while a MAC (multiply-accumulate) operation completes in the multiply pipeline.

The architecture allows instructions to be retired out of order. Load buffering combined with out-of-order completion allows independent instructions, including further loads, to execute while an earlier load is still outstanding, reducing the impact of memory latency in system-on-a-chip applications. As Figure 2 shows, the XScale core supports a debug interface, JTAG, and a PMU (Performance Monitoring Unit), in addition to a high-speed interface to the Wireless MMX unit.
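The benefit of in-order issue with out-of-order completion can be shown with a small timing model in C. This is a deliberately simplified sketch and not a description of the real XScale pipeline: it assumes one instruction issues per cycle, each instruction has at most one producer, and latencies are supplied by the caller. The type and function names are hypothetical.

```c
#include <assert.h>

#define MAX_INSTRS 64

typedef struct {
    int latency;  /* cycles from issue until the result is available */
    int dep;      /* index of the earlier instruction it needs, or -1 */
} Instr;

/* In-order, single-issue model in which results may return out of order:
   an instruction waits only for its issue slot and for its own producer. */
int cycles_ooo_completion(const Instr *prog, int n)
{
    assert(n >= 0 && n <= MAX_INSTRS);
    int ready[MAX_INSTRS];
    int next_issue = 0, finish = 0;
    for (int i = 0; i < n; i++) {
        assert(prog[i].dep < i);        /* producers must come earlier */
        int start = next_issue;
        if (prog[i].dep >= 0 && ready[prog[i].dep] > start)
            start = ready[prog[i].dep]; /* stall until the operand arrives */
        ready[i] = start + prog[i].latency;
        next_issue = start + 1;         /* one issue per cycle, in order */
        if (ready[i] > finish)
            finish = ready[i];
    }
    return finish;
}

/* Baseline: every instruction waits for the previous one to complete. */
int cycles_blocking(const Instr *prog, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += prog[i].latency;
    return total;
}
```

For a hypothetical 10-cycle load followed by two independent single-cycle operations and one single-cycle use of the loaded value, the first model finishes at cycle 11 and the blocking baseline at cycle 13. The independent work is hidden under the load latency.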
