I’ve been quiet about my IRM 2 activities recently, and I apologize. This article is not really a status update, but a reflection on mapping programs such as IRM 2 to a traditional RDBMS.
I’ve determined that I just built a database… inside another database. The IRM 2 core currently revolves around a single table that is quite literally an ID, Column, Value mapping. The data simply looks like this:
A100, 1, Name, Computer
A100, 1, Color, Blue
A100, 1, Owner, Bob
Every property simply gets dumped into this table and requires extensive remapping in the application logic. The RDBMS has little insight into the data’s structure, since this layout subverts its notion of a schema. Searching on specific property values requires joining huge amounts of data together. Data storage is inefficient, with records scattered throughout the datastore (and in many places on disk), especially if records are updated after creation.
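To make the join pain concrete, here’s a minimal sketch of the pattern in SQLite via Python. The table and column names are illustrative, not IRM 2’s actual schema; the extra B200 rows are made-up data to give the query something to filter:

```python
import sqlite3

# A minimal "ID, Column, Value" table, mirroring the sample rows above.
# Column names are illustrative -- not IRM 2's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE properties (
        record_id TEXT,
        revision  INTEGER,
        name      TEXT,
        value     TEXT
    )
""")
conn.executemany(
    "INSERT INTO properties VALUES (?, ?, ?, ?)",
    [
        ("A100", 1, "Name",  "Computer"),
        ("A100", 1, "Color", "Blue"),
        ("A100", 1, "Owner", "Bob"),
        ("B200", 1, "Color", "Blue"),   # hypothetical second record
        ("B200", 1, "Owner", "Alice"),
    ],
)

# "Find records that are Blue AND owned by Bob" needs one self-join
# per property tested -- the cost grows with every added condition.
rows = conn.execute("""
    SELECT p1.record_id
      FROM properties p1
      JOIN properties p2
        ON p1.record_id = p2.record_id
       AND p1.revision  = p2.revision
     WHERE p1.name = 'Color' AND p1.value = 'Blue'
       AND p2.name = 'Owner' AND p2.value = 'Bob'
""").fetchall()
print(rows)  # [('A100',)]
```

A query on one property is cheap, but every additional property in the WHERE clause means another self-join against the same giant table, which is exactly the remapping work the RDBMS would normally do for free with real columns.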
More and more I’m starting to feel that systems such as BigTable, SimpleDB, and CouchDB are taking the right approach to the large-scale data storage problem. Data is stored as a series of documents, each with its own arbitrary, per-document schema. Storage is not centralized but distributed, with each node holding a replicated copy of the data.
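For contrast with the EAV table, here’s a toy sketch of the document model in plain Python: no real document store, just dicts standing in for documents. The B200 record is a made-up example showing that documents need not share a schema:

```python
# Each record is one self-contained document carrying its own schema --
# no central property table, no self-joins.
documents = [
    {"id": "A100", "Name": "Computer", "Color": "Blue", "Owner": "Bob"},
    {"id": "B200", "Color": "Blue", "Owner": "Alice"},  # different schema is fine
]

# The same "Blue AND owned by Bob" query is a direct per-document test.
matches = [d["id"] for d in documents
           if d.get("Color") == "Blue" and d.get("Owner") == "Bob"]
print(matches)  # ['A100']
```

All of a record’s properties travel together as one unit, so reads and writes touch one place instead of rows scattered across a table and a disk.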
The advent of Google’s AppEngine supporting this model of data store reaffirms that this is the correct direction for large-scale online applications.
Now, what does all of this have to do with IRM 2 and gIRM? Stay tuned – I’m trying to find out myself.