Figure 2. AGP Pipelining vs. PCI transfers
The AGP bus also supports sideband addressing to allow overlapping of data with new address requests for further performance improvements. We won't need to know any more about the underlying protocol and addressing scheme of the AGP specification, but the description just provided should help us to better understand what's going on under the hood of the machine. What will be essential is to realize that the graphics card uses the AGP interface to access data in the system's main memory. It's this data in main memory that we're going to look at in detail.
Prerequisites for AGP memory
For applications to take advantage of the AGP bus, the operating system and the hardware must perform several housekeeping tasks. First, because system memory is used primarily by ordinary applications, it can become considerably fragmented, depending on which applications are running and which memory allocations have occurred. A request to allocate 256 KB of memory for the graphics card to access via the AGP bus could result in several small, non-contiguous chunks of main memory being allocated. To make these appear to the graphics card as one contiguous 256 KB block, the chipset, which controls communication between the processor and main memory as well as the PCI bus, provides a Graphics Address Re-Mapping Table (GART). The GART maps a linear range of virtual memory addresses to multiple 4 KB physical pages in main memory. Figure 3 shows an example of a region of linear memory visible to the graphics card being mapped to a non-contiguous set of pages in physical memory. The amount of memory available for remapping by the GART is often determined by a BIOS setting known as the AGP aperture. Values vary from system to system, but there is usually a maximum limit of half the available system memory.

Figure 3. Remapping of Memory Addresses through the GART
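The remapping in Figure 3 can be pictured as a simple table lookup. The sketch below is only an illustrative model, not the chipset's actual register layout: the `gart_t` structure, the `gart_translate` function, and the idea of storing one physical base address per table entry are all assumptions made for the example. It splits a linear address into a page index and an offset within a 4 KB page, then looks up the physical page that backs that index.

```c
#include <stddef.h>
#include <stdint.h>

#define GART_PAGE_SIZE 4096u /* 4 KB pages, as described in the text */

/* Hypothetical model of a GART (not a real chipset layout): entry i holds
   the physical base address of the page that backs the i-th 4 KB page of
   the linear range visible to the graphics card. */
typedef struct {
    uint64_t aperture_base;    /* first linear address seen by the card */
    size_t num_pages;          /* number of remapped 4 KB pages */
    const uint64_t *page_base; /* physical base address of each page */
} gart_t;

/* Translate a linear address into a physical address.
   Returns 0 on success, or -1 if the address is outside the mapped range. */
int gart_translate(const gart_t *gart, uint64_t linear, uint64_t *physical)
{
    if (linear < gart->aperture_base)
        return -1;

    uint64_t offset = linear - gart->aperture_base;
    uint64_t index = offset / GART_PAGE_SIZE;
    if (index >= gart->num_pages)
        return -1;

    /* The offset within the page is preserved; only the page changes. */
    *physical = gart->page_base[index] + (offset % GART_PAGE_SIZE);
    return 0;
}
```

Under this model, two linear addresses that sit side by side across a page boundary can land in physical pages that are far apart, which is exactly the situation the GART hides from the graphics card.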
We've addressed the tasks the chipset must perform. Next, because the graphics card will access this memory regularly and performance is critical, the operating system must make sure that memory accessed over the AGP bus is not swapped out to disk. The operating system does this by locking the pages of memory. Finally, because current processors have one or more levels of cache to improve the performance of typical applications, something must be done to make sure the graphics card sees the most up-to-date data. To achieve this without requiring the graphics card to "snoop" the processor's caches, the memory used for AGP transfers is marked as uncacheable. This uncacheable memory is the heart of what we're going to look at to achieve performance improvements in 3D applications. To understand it, we need to examine a feature that Intel processors have carried for several generations.
Memory types
Starting with the Intel Pentium Pro processor, Intel processors have provided a mechanism for changing the access characteristics of regions of memory using a set of registers known as the Memory Type Range Registers (MTRRs). These registers are not accessible to a normal application and can only be changed by the operating system or a device driver. However, the drivers that support AGP memory use the MTRRs to give the memory accessed by the graphics card the right characteristics. Let's look at the various options that can be selected using the MTRRs.