Object-relational mapping

Object-relational mapping is a programming technique that links relational databases to object-oriented language concepts, creating (in effect) a "virtual object database."

The Problem

In object-oriented (OO) programming, the objects in a program represent real-world objects to some degree. To illustrate, consider the example of an address book, which contains listings of people along with zero or more phone numbers and zero or more addresses. In the OO world this would be represented by a "person object" with "slots" (fields, members, instance variables, etc.) to hold the data that makes up this listing: the person's name, a list (or array) of phone numbers, and a list of addresses.
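As a minimal sketch of the address book described above, the person object might look like this in Python. The class names, field names, and sample values are assumptions made for illustration and do not come from any particular system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Address:
    street: str
    city: str


@dataclass
class Person:
    name: str
    # "Zero or more" of each, so both lists start out empty.
    phone_numbers: List[str] = field(default_factory=list)
    addresses: List[Address] = field(default_factory=list)


# Hypothetical sample listing.
alice = Person("Alice Example")
alice.phone_numbers.append("555-0100")
alice.addresses.append(Address("1 Sample Street", "Springfield"))
```

The person object holds its phone numbers and addresses directly, which is the sense of "ownership" that the relational model lacks.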

The trick comes when it's time to save that data out to permanent storage. From a programmer's perspective, the best solution is a persistent object store, into which any object can be placed and later found. Most object-oriented APIs include some sort of solution for this task. For instance, Java provides the Serializable interface, which marks objects that can be "serialized" into a form suitable for saving to disk. This is a perfectly usable solution in most situations, and most programs use some similar form of object store.
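The idea of a simple serialized object store can be sketched as follows. This is not Java's Serializable mechanism; Python's standard json module stands in for it here, and the Person class and the save_people and load_people helpers are hypothetical names chosen for the example.

```python
import io
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Person:
    name: str
    phone_numbers: List[str] = field(default_factory=list)


def save_people(people, stream):
    """Flatten each object to a dict and write the whole list as JSON."""
    json.dump([asdict(p) for p in people], stream)


def load_people(stream):
    """Read the whole list back and rebuild Person objects from it."""
    return [Person(**record) for record in json.load(stream)]


people = [Person("Alice Example", ["555-0100"]), Person("Bob Example")]

# An in-memory buffer stands in for a file on disk.
buffer = io.StringIO()
save_people(people, buffer)
buffer.seek(0)
restored = load_people(buffer)
```

In this sketch the store is read and written as a single unit, so finding one person means loading every person.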

However, for larger applications the store has to be considerably more powerful. Even minor damage to a "simple" file can render the entire file unusable for real-world applications. In addition, it becomes progressively more expensive to find information in such a store as the total number of objects—or more generally, the total amount of data—grows.

The solution to these sorts of data storage problems already exists: databases. However, almost all database systems predate the object revolution that occurred in the 1990s, and they tend to "map" poorly into the OO world because they are based on a completely different set of concepts. This problem is known as the object-relational impedance mismatch.

The best solution would be to use an object database, which, as the name implies, is a database designed specifically to store and work with program objects. However, these databases have a serious credibility problem in the "big iron" world of databases. Programmers are forced, directly or indirectly, to use relational databases instead.

Traditional relational databases use a series of tables representing simple data. Optional or related information is stored in other tables. A single record in the database often spans several of these tables, and requires a join to collect all of the related information back into a single piece of data for processing. This would be the case for the address book, which would likely include at least a person table and an address table, and perhaps a phone number table as well.
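A sketch of how the address book might be laid out relationally, using Python's built-in sqlite3 module with an in-memory database. The table names, column names, and sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE phone_number (
        id        INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL REFERENCES person(id),
        number    TEXT NOT NULL
    );
    CREATE TABLE address (
        id        INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL REFERENCES person(id),
        street    TEXT NOT NULL,
        city      TEXT NOT NULL
    );
    INSERT INTO person VALUES (1, 'Alice Example');
    INSERT INTO phone_number VALUES (1, 1, '555-0100');
    INSERT INTO phone_number VALUES (2, 1, '555-0101');
    INSERT INTO address VALUES (1, 1, '1 Sample Street', 'Springfield');
""")

# One listing spans several tables; a join brings the pieces back together.
rows = conn.execute(
    """
    SELECT p.name, ph.number
    FROM person AS p
    LEFT JOIN phone_number AS ph ON ph.person_id = p.id
    WHERE p.id = ?
    ORDER BY ph.id
    """,
    (1,),
).fetchall()
```

Each returned row repeats the person's name next to one phone number, so the result is a flat table rather than a person with a list of numbers.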

In the object world there is a clear sense of "ownership," where a particular person object owns a particular phone number. This is not the case in a relational database, where the tables have no idea how they relate to other tables at a fundamental level. Instead, the user must construct a "query" to gather the information back together.

Doing this is not a simple task. Each query carries its own overhead, since the database has to parse and plan it and the request usually involves a round trip to the server, so it can be very expensive to submit several queries in a row. One can't, for instance, expect good performance if one does a series of operations like "find this person, OK, now find this person's addresses, OK...". Instead, one must construct a single large query that says "find this person and all their addresses and phone numbers and return them in this format."
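The difference between the two approaches can be sketched by counting the queries each one issues. The CountingDb wrapper, the schema, and the sample data are hypothetical, and an in-memory SQLite database has no network round trip, so the count only stands in for the cost a real database server would impose.

```python
import sqlite3


class CountingDb:
    """Hypothetical wrapper that counts how many queries are sent."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def run(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def load_one_by_one(db):
    """One query for the people, then one more query per person."""
    book = {}
    for person_id, name in db.run("SELECT id, name FROM person ORDER BY id"):
        numbers = db.run(
            "SELECT number FROM phone_number WHERE person_id = ? ORDER BY id",
            (person_id,),
        )
        book[name] = [number for (number,) in numbers]
    return book


def load_with_join(db):
    """A single query that returns every person with their numbers."""
    book = {}
    rows = db.run(
        "SELECT p.name, ph.number FROM person AS p "
        "LEFT JOIN phone_number AS ph ON ph.person_id = p.id "
        "ORDER BY p.id, ph.id"
    )
    for name, number in rows:
        numbers = book.setdefault(name, [])
        if number is not None:  # people without numbers yield one NULL row
            numbers.append(number)
    return book


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE phone_number (
        id INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL REFERENCES person(id),
        number TEXT NOT NULL
    );
    INSERT INTO person VALUES (1, 'Alice Example');
    INSERT INTO person VALUES (2, 'Bob Example');
    INSERT INTO person VALUES (3, 'Carol Example');
    INSERT INTO phone_number VALUES (1, 1, '555-0100');
    INSERT INTO phone_number VALUES (2, 1, '555-0101');
    INSERT INTO phone_number VALUES (3, 2, '555-0102');
""")

slow_db = CountingDb(conn)
slow_book = load_one_by_one(slow_db)

fast_db = CountingDb(conn)
fast_book = load_with_join(fast_db)
```

Both functions build the same address book, but the first issues one query plus one per person, while the second always issues exactly one.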

After construction of the query, the data returned has to be copied into the fields in the objects in question. Once there, the object has to watch to see if these values change, and then carefully reverse the process to write the data back out to the database.
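A hand-written version of this copy, watch, and write-back cycle might look like the sketch below. PersonRecord, load_person, and save_person are hypothetical names, and tracking changes with a single "dirty" flag is only one of several possible approaches.

```python
import sqlite3


class PersonRecord:
    """Holds one row of the person table and notes when it changes."""

    def __init__(self, person_id, name):
        self.id = person_id
        self._name = name
        self.dirty = False  # True once a field differs from the database

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        if value != self._name:
            self._name = value
            self.dirty = True


def load_person(conn, person_id):
    """Copy the columns of one row into the fields of a new object."""
    row = conn.execute(
        "SELECT id, name FROM person WHERE id = ?", (person_id,)
    ).fetchone()
    return PersonRecord(*row) if row else None


def save_person(conn, person):
    """Write the object back only if it changed; report whether it did."""
    if not person.dirty:
        return False
    conn.execute(
        "UPDATE person SET name = ? WHERE id = ?", (person.name, person.id)
    )
    conn.commit()
    person.dirty = False
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO person VALUES (1, 'Alice Example')")
conn.commit()

person = load_person(conn, 1)
wrote_unchanged = save_person(conn, person)  # nothing changed, no UPDATE sent
person.name = "Alice Q. Example"
wrote_changed = save_person(conn, person)    # UPDATE sent, flag cleared
stored_name = conn.execute("SELECT name FROM person WHERE id = 1").fetchone()[0]
```

Even for a single column this takes a fair amount of bookkeeping, and every additional field or related table multiplies it.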

Given these two very different worlds, object code for working with databases tends to be very complex and bug-ridden.

The Solution

Object-relational systems attempt to solve this problem by providing software to do this mapping automatically. Given a list of tables in the database, and objects in the program, they will automatically map requests from one to the other. Asking a person object for its phone numbers will result in the proper query being created and sent, and the results being "magically" translated directly into phone number objects inside the program.
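The behavior described above can be imitated by hand to show what the programmer sees. In this sketch the query is written out explicitly inside the Person class, whereas a real object-relational system would generate it from a description of the mapping; all class, table, and column names are assumptions made for the example.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class PhoneNumber:
    number: str


class Person:
    def __init__(self, conn, person_id, name):
        self._conn = conn
        self.id = person_id
        self.name = name
        self._phone_numbers = None  # not fetched until first asked for

    @classmethod
    def find(cls, conn, person_id):
        row = conn.execute(
            "SELECT id, name FROM person WHERE id = ?", (person_id,)
        ).fetchone()
        return cls(conn, *row) if row else None

    @property
    def phone_numbers(self):
        """Send the query on first access and turn rows into objects."""
        if self._phone_numbers is None:
            rows = self._conn.execute(
                "SELECT number FROM phone_number "
                "WHERE person_id = ? ORDER BY id",
                (self.id,),
            ).fetchall()
            self._phone_numbers = [PhoneNumber(number) for (number,) in rows]
        return self._phone_numbers


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE phone_number (
        id INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL REFERENCES person(id),
        number TEXT NOT NULL
    );
    INSERT INTO person VALUES (1, 'Alice Example');
    INSERT INTO phone_number VALUES (1, 1, '555-0100');
    INSERT INTO phone_number VALUES (2, 1, '555-0101');
""")

alice = Person.find(conn, 1)
numbers = alice.phone_numbers  # the query runs here, not in Person.find
```

The calling code never mentions SQL; it simply asks the person object for its phone numbers and receives objects in return.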

From a programmer's perspective, the system looks like a persistent object store. One can create objects and work with them as one would normally, and they automatically end up in the relational database.

Things are never that simple, though. All O-R systems tend to make themselves visible in various ways, reducing to some degree one's ability to ignore the database. Worse, the translation layer can be slow and inefficient (notably in terms of the SQL it writes), resulting in programs that are slower and use more memory than code written "by hand."

A number of O-R systems have been created over the years, but their effect on the market seems mixed. NeXT's Enterprise Objects Framework (EOF) was considered one of the best, but it failed to have a lasting impact on the market, chiefly because it was tightly tied to NeXT's entire toolkit, OpenStep. It was later integrated into NeXT's WebObjects, the first object-oriented Web Application Server. Since Apple Computer bought NeXT in 1997, EOF has provided the technology behind the Apple e-commerce Web site and the .Mac services. Apple provides EOF in two implementations: the Objective-C implementation that comes with the Apple Developer Tools and the Pure Java implementation that comes in WebObjects 5.2.

More recently, a similar system has started to evolve in the Java world, known as Java Data Objects (JDO). Unlike EOF, it is a standard only, and it is expected that several implementations will be available from different vendors.