The term Harvard architecture originally referred to computer architectures that used physically separate storage devices for their instructions and data (in contrast to the von Neumann architecture). The term originated from the Harvard Mark I relay-based computer, which stored instructions on punched tape and data in relay latches.
All computers consist primarily of two parts: the CPU, which processes data, and the memory, which holds the data. The memory in turn has two aspects: the data itself, and the location where it is found, known as the address. Both are important to the CPU, since many common instructions boil down to something like "take the data in this address and add it to the data in that address", without the CPU needing to know what the data itself is.
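The idea can be sketched with a toy model in which memory is a flat array and an address is simply an index into it. The names here (memory, add_at) are illustrative, not real hardware:

```python
# Toy sketch (not real hardware): memory is a flat array of cells,
# and an address is just an index into it. An instruction such as
# "add the data at address a to the data at address b" names the
# locations, not the values stored in them.
memory = [0, 7, 35, 0]

def add_at(a, b):
    """Emulate 'memory[b] += memory[a]': the CPU only needs to know
    where the operands live, not what the numbers mean."""
    memory[b] = memory[b] + memory[a]

add_at(1, 2)
print(memory[2])  # the cell at address 2 now holds 7 + 35 = 42
```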
In recent years the speed of the CPU has grown many times relative to the speed of the memory it talks to, so care must be taken to reduce the number of memory accesses in order to maintain performance. If, for instance, every instruction run in the CPU requires an access to memory, the computer gains nothing from increased CPU speed, a problem referred to as being memory bound.
Memory can be made much faster, but only at high cost. The solution is to provide a small amount of very fast memory known as a cache. As long as the data the CPU needs is in the cache, the performance penalty is much smaller than when the cache has to turn around and fetch the data from main memory. Tuning the cache is an important aspect of computer design.
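One simple cache organization is a direct-mapped cache, in which each address can occupy exactly one slot. The following minimal sketch counts hits and misses for such a cache; all names, sizes, and the access pattern are illustrative assumptions:

```python
# Minimal sketch of a direct-mapped cache: each address maps to one
# slot (address mod number_of_slots). A real cache stores the data
# too; here we track only which address occupies each slot, so we
# can count hits and misses. Sizes and names are illustrative.
NUM_SLOTS = 4
slots = [None] * NUM_SLOTS
hits = misses = 0

def access(address):
    global hits, misses
    slot = address % NUM_SLOTS        # the one slot this address may use
    if slots[slot] == address:        # already cached: fast path
        hits += 1
    else:                             # miss: fetch from slow main memory
        misses += 1
        slots[slot] = address

for addr in [0, 1, 0, 1, 4, 0]:      # 4 evicts 0 (both map to slot 0)
    access(addr)
print(hits, misses)  # 2 hits (the repeated 0 and 1), 4 misses
```

Note how address 4 evicts address 0 even though other slots are empty; this kind of conflict is one of the things cache tuning tries to minimize.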
The Harvard architecture refers to one particular solution to this problem: instructions and data are stored in separate caches to improve performance. However, this has the disadvantage of halving the amount of cache available to each, so it works best only when the CPU reads instructions and data at roughly the same frequency. This arrangement is common in specialized digital signal processors (DSPs), widely used in audio and video processing products.
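The benefit of the split can be seen in a toy comparison between a unified cache and a Harvard-style split of the same total size, assuming direct-mapped caches. The function names, sizes, and the access pattern are illustrative, not drawn from any real processor:

```python
# Sketch of why splitting helps: in a unified cache, instruction
# fetches and data accesses can evict each other; with a split
# (Harvard-style) cache of the same total size, they cannot.
# Direct-mapped caches; sizes and addresses are illustrative.

def simulate(caches, accesses):
    """accesses: (which_cache, address) pairs; returns hit count."""
    hits = 0
    for which, address in accesses:
        cache = caches[which]
        slot = address % len(cache)
        if cache[slot] == address:
            hits += 1
        else:
            cache[slot] = address            # miss: load the line
    return hits

# A loop that fetches an instruction at address 0, then touches
# data at address 4, repeatedly.
pattern = [("inst", 0), ("data", 4)] * 8

# Unified: one 4-slot cache shared by both kinds of access.
# Addresses 0 and 4 collide in slot 0 and keep evicting each other.
unified = {"inst": [None] * 4}
unified["data"] = unified["inst"]            # both keys, same cache
print(simulate(unified, pattern))            # 0 hits: constant thrashing

# Split: a 2-slot instruction cache plus a 2-slot data cache
# (same total size). Each access kind keeps its own line resident.
split = {"inst": [None] * 2, "data": [None] * 2}
print(simulate(split, pattern))              # 14 hits: only the two cold misses
```

The reverse also holds: if the workload were almost entirely data accesses, the unused instruction half of the split cache would be wasted, which is why the split pays off mainly when instruction and data traffic are balanced.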