The hierarchical arrangement of storage in current computer architectures is called the memory hierarchy. Each level of the hierarchy offers higher speed and lower latency, but smaller capacity, than the levels below it.

Most modern CPUs are so fast that, for most program workloads, the practical limit on processing speed is not raw computation but the locality of reference of memory accesses and the efficiency of caching and data transfer between the levels of the hierarchy. As a result, the CPU spends much of its time idle, waiting for memory I/O to complete.
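
The effect of locality is easy to observe. The following C program is a minimal sketch (not taken from the original text; the 2048 x 2048 array size and the use of clock() are arbitrary choices) that sums the same matrix twice: once row by row, walking memory sequentially and reusing each cache line, and once column by column, striding so far between accesses that nearly every one misses the cache. On typical hardware the second pass is several times slower, though exact timings vary by machine and compiler flags.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N 2048   /* 2048 x 2048 doubles = 32 MB, larger than typical caches */

    int main(void) {
        double *a = malloc((size_t)N * N * sizeof *a);
        if (a == NULL)
            return 1;
        for (size_t i = 0; i < (size_t)N * N; i++)
            a[i] = 1.0;

        /* Row-major traversal: consecutive accesses fall in the same cache
           line, so most loads are served by the upper cache levels. */
        clock_t t0 = clock();
        double sum_rows = 0.0;
        for (size_t i = 0; i < N; i++)
            for (size_t j = 0; j < N; j++)
                sum_rows += a[i * N + j];
        clock_t t1 = clock();

        /* Column-major traversal: each access jumps N * 8 bytes ahead, so
           nearly every load misses the cache and goes out to DRAM. */
        double sum_cols = 0.0;
        for (size_t j = 0; j < N; j++)
            for (size_t i = 0; i < N; i++)
                sum_cols += a[i * N + j];
        clock_t t2 = clock();

        printf("row-major:    %.3f s (sum = %.0f)\n",
               (double)(t1 - t0) / CLOCKS_PER_SEC, sum_rows);
        printf("column-major: %.3f s (sum = %.0f)\n",
               (double)(t2 - t1) / CLOCKS_PER_SEC, sum_cols);
        free(a);
        return 0;
    }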

The memory hierarchy in most computers is as follows:

  • CPU registers (fastest possible access, only hundreds of bytes in total)
  • Level 1 cache (often accessed in just a few cycles, usually tens of kilobytes)
  • Level 2 cache (latency 2 to 10 times that of L1, often 512 KB or more)
  • Main memory (DRAM) (an access may take hundreds of cycles, but capacity can reach many gigabytes)
  • Disk storage (latency of hundreds of thousands of cycles, but very large capacity)
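
To make these cycle counts concrete, a rough back-of-the-envelope conversion helps; the 3 GHz clock rate below is an assumed figure for illustration only:

    1 cycle at 3 GHz         = 1 / (3 x 10^9) s  ≈ 0.33 ns
    ~300 cycles (DRAM)       ≈ 100 ns
    ~300,000 cycles (disk)   ≈ 100 µs

At that rate, a single access that falls through to disk costs as much time as hundreds of thousands of register operations, which is why keeping a program's working set in the upper levels of the hierarchy matters so much.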
