17 March, 2009

Be driven by data but don't let data drive your code

Data
I don't like data driven systems too much, for reasons I've already explained. They tend to become bulky and inflexible rather easily. They are just a way of constraining change, in the hope that if we parametrize and generalize enough, everything can be designed from the beginning, and change (or creativity, really) won't happen. That's plain stupid.

I don't like big frameworks and huge tools. I'm a fan of fast iterations and tight interaction between coders and artists. Scripting is nice.

Scripting is very nice as it's a glorified, and in many cases faster and smaller, data provider. Hey, plus you get a scripting system too! Or, if you have a scripting system, hey, you've got your data system as well (that's what the Maya engineers must have thought when they designed the Maya ASCII file format as just a snapshot of the scripting commands that generated the scene in the first place).

Don't get me wrong, I don't have anything against raw data in general, intended as information outside the code.
Of course we need data everywhere: textures, geometries, etc. We also need parameters, and if you don't have a system to store and tweak parameter variables, either by reflecting the ones declared in code or by generating code from the ones created in a tool, you're crazy, go and code (or steal) one now.

I like the approach that uses reflection more, as reflection is one of the extensions you'll need to provide to C++ anyway. Plus, a well made reflection system won't only let you tweak variables, it will also let you bind a scripting system (by reflecting functions) and serialize your classes, which is handy for a lot of tasks (think savegames, networking, dumps for debugging, asset loading...)
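To make the "tweak and serialize through reflection" idea a bit more concrete, here is a minimal sketch. It is nowhere near a real reflection system: it only handles floats, registration is manual, and every name in it (ParamRegistry, reflect, Light...) is made up for illustration. The one design choice worth noting is that values are stored by name, so data that mentions a variable that no longer exists is simply skipped on load.

```cpp
#include <cassert>
#include <exception>
#include <map>
#include <sstream>
#include <string>

// Hypothetical registry: maps a name to the address of a float declared in code.
class ParamRegistry {
public:
    // Reflect a code-declared variable under a name. The variable must outlive the registry.
    void reflect(const std::string& name, float* value) {
        assert(value != nullptr);
        params_[name] = value;
    }

    // Tweak a variable by name. Returns false if the name was never reflected.
    bool set(const std::string& name, float v) {
        auto it = params_.find(name);
        if (it == params_.end()) return false;
        *it->second = v;
        return true;
    }

    // Serialize every reflected variable as one "name=value" line, sorted by name.
    std::string serialize() const {
        std::ostringstream out;
        for (const auto& kv : params_) out << kv.first << '=' << *kv.second << '\n';
        return out.str();
    }

    // Load "name=value" lines. Unknown names and malformed lines are skipped.
    // Returns how many values were actually applied.
    int deserialize(const std::string& text) {
        std::istringstream in(text);
        std::string line;
        int applied = 0;
        while (std::getline(in, line)) {
            const auto eq = line.find('=');
            if (eq == std::string::npos) continue;
            try {
                if (set(line.substr(0, eq), std::stof(line.substr(eq + 1)))) ++applied;
            } catch (const std::exception&) {
                // Value was not a number: ignore the line.
            }
        }
        return applied;
    }

private:
    std::map<std::string, float*> params_;
};

// An ordinary engine class; it knows nothing about the registry.
struct Light {
    float intensity = 1.0f;
    float radius = 5.0f;
};
```

Usage would be along the lines of reg.reflect("light.intensity", &light.intensity) at startup, then reg.set(...) from a tweaking UI and reg.serialize() / reg.deserialize(...) for saving and loading. Reflecting functions for script binding follows the same pattern, with callables stored instead of float pointers.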

Structure
But what if you have to make your system structure "data driven"? That means the creation and lifetime of objects, and their relationships. Those are not parameters, and can hardly be expressed as parameters. You might be tempted to use a structured data format (XML...) for the purpose, and it can be a very good solution in many cases, for sure. Just remember to not let your data drive your code.
The wrong approach is to have the code depend on the data, querying it, being built around it. It creates a lot of coupling that you want to avoid. Really, you do.

Make the data push objects into existence (data->translator->objects), not vice versa: do not have objects query the data and pull information out of it! Creating an object(data) makes the two things hard to untangle. Of course some data will be required to create an object, but that should be the data it needs, not a generic structure designed around your data-driving/management/asset loading/etc... system.
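A rough sketch of what data->translator->objects could look like. The Record struct stands in for whatever your loader produces (a parsed XML node, say), and all the types and field names here are invented for the example. The point is only where the knowledge lives: the objects take the plain values they need, and the translator is the single place that knows about both the data format and the constructors.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Generic record as it might come out of a loader (hypothetical format).
struct Record {
    std::string type;
    std::map<std::string, std::string> fields;
};

// Engine objects: constructed from plain values, unaware that Record exists.
struct Entity {
    virtual ~Entity() = default;
    virtual std::string describe() const = 0;
};

struct Light : Entity {
    explicit Light(float light_intensity) : intensity(light_intensity) {}
    std::string describe() const override { return "light"; }
    float intensity;
};

struct Spawner : Entity {
    Spawner(std::string enemy_name, int enemy_count)
        : enemy(std::move(enemy_name)), count(enemy_count) {
        assert(count >= 0);
    }
    std::string describe() const override { return "spawner"; }
    std::string enemy;
    int count;
};

// The translator: the only code that depends on both the data layout and the objects.
// A missing field throws std::out_of_range from at(); unknown record types are skipped.
std::vector<std::unique_ptr<Entity>> translate(const std::vector<Record>& records) {
    std::vector<std::unique_ptr<Entity>> scene;
    for (const Record& r : records) {
        if (r.type == "light") {
            scene.push_back(std::make_unique<Light>(std::stof(r.fields.at("intensity"))));
        } else if (r.type == "spawner") {
            scene.push_back(std::make_unique<Spawner>(r.fields.at("enemy"),
                                                      std::stoi(r.fields.at("count"))));
        }
    }
    return scene;
}
```

If the data format changes, or gets replaced by a script, only translate() has to change. The coupled version would be Light(const Record&), where every object ends up knowing how the loader lays out its data.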

You want to think about scripting, having the data call the code via its interpreter... Even the word "interpreter" lets you think about scripting. In fact, you know that there's no difference between code and data, but here we're talking about engineering, not computer science.
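To show how small "the data calls the code via its interpreter" can be, here is a toy sketch: the data is just a list of commands, one per line, and the interpreter looks each one up and calls whatever the engine bound to that name. The command syntax and all the names (Interpreter, bind, createNode, Scene) are assumptions made up for this example, not any real tool's format.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using Args = std::vector<std::string>;
using Command = std::function<void(const Args&)>;

// Toy line-based interpreter: "commandName arg1 arg2 ..." per line.
class Interpreter {
public:
    // The engine exposes a function to the data under a command name.
    void bind(const std::string& name, Command fn) {
        assert(fn && "command must be callable");
        commands_[name] = std::move(fn);
    }

    // Runs a script and returns how many commands were executed.
    // Blank lines and unknown commands are skipped.
    int run(const std::string& script) {
        std::istringstream lines(script);
        std::string line;
        int executed = 0;
        while (std::getline(lines, line)) {
            std::istringstream words(line);
            std::string name;
            if (!(words >> name)) continue;  // blank line
            Args args;
            for (std::string a; words >> a;) args.push_back(a);
            auto it = commands_.find(name);
            if (it == commands_.end()) continue;  // unknown command
            it->second(args);
            ++executed;
        }
        return executed;
    }

private:
    std::map<std::string, Command> commands_;
};

// Engine side: has no idea that an interpreter or a file format exists.
struct Scene {
    std::vector<std::string> nodes;
    void createNode(const std::string& type, const std::string& name) {
        nodes.push_back(type + ":" + name);
    }
};
```

Binding is a one-liner per command, for example a lambda that checks the argument count and forwards to scene.createNode(args[0], args[1]). A "file" such as "createNode light key" followed by "createNode camera main" then builds the scene by calling code, and the code never has to query the file.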

1 comment:

Nick said...

Thanks for the plug :)

I think one of the biggest mistakes people make when they discover XML is to confuse the data they are recording, with its representation.

"Even the word "interpreter" lets you think about scripting". Awesome insight!


Applies to good data design too. Got to be careful with C++ data reflection; too easy to make a system where adding or removing an irrelevant struct member can corrupt or invalidate data. Reflect what you need, and be ready to interpret for forward compatibility!