RenderRambling
Ramble
Welcome to a page to keep track of notes about the rendering in xoreos. For now it is focusing on Neverwinter Nights (or NWN for short). Occasionally other games are actually tested as well, but NWN serves as the primary testing ground with the hopes that it will at least be semi-relevant for other games.
This rambling start is coming about from recent attempts to incorporate lighting into NWN, which has exposed some weaknesses in the render pipeline structuring. For a bit of history, the rendering has changed much and was done with knowledge of how things "should be" at the time - this knowledge has changed, and with it the realisation that some of the nice ways of handling rendering will need workarounds and become not so nice. By rambling a little here of how things are, it's hoped that a way forward will become more clear.
Confused yet? Yes? Good. Me too.
Rendering currently combines a mesh, a surface (vertex data and shader), and a material (fragment data and shader) into a complete rendering information object. Shaders are constructed with a builder, and used as the basis for a surface or material; a surface or material is essentially a list of data to bind into shader locations. Need the same tiled texture across every room? No problem, the same material handles it. There's always some per-object (in xoreos terms, per-modelnode) information which can be bound as necessary: object transform and alpha (transparency), for example, and therein lies the problem being run into with the lighting.
Sharing surfaces and materials allows for a render queue to be constructed which minimises swapping shaders (because that's relatively expensive). Note the mistake here: designing for optimised rendering before everything is fully understood. NWN has a lot of geometry transparency going on too, which means there's sub-optimal shader swapping just to deal with depth sorted rendering anyway. Per-object alpha values start to break the assumption of being able to share materials. Then lighting can add ambient, diffuse, specular, and other properties - all of which could be different for every single object (modelnode).
This was a key misunderstanding: shared materials aren't really a concept for NWN. Shared textures perhaps, but not properties. Suddenly every single modelnode starts to need a very large block of custom data for material (but not surfaces as it turns out - they're just mesh data for the most part and easily shared). Perhaps most of this approach comes from the rendering APIs available when the game was made, mainly OpenGL 1.2 - a very state-based API. It was quite feasible to simply change state for every render operation, so it's only natural games would incorporate that into the design.
So what to do now? The renderqueue class is almost completely useless if taken to that extreme, and surfaces and materials are far too heavyweight. Strip the latter down to bare essentials: convenient binding of data pointers to shader locations. Material and surface managers can be removed, and objects brute forced (for now). If common blocks of per-object information are identified and handled in the shaders correctly, then this actually helps at some point in the future: just bind that information block to some buffer object and suddenly batched rendering is possible again (just not so easy with OpenGL 2.1). More importantly, this dramatically simplifies the code and makes it easier for others to work with. Side note: shader construction is complex, but looks to be standing up alright. The shaderbuilder is very convenient for not having to keep track of multiple shader files, so that can stay.
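As a rough illustration of the "bare essentials" idea, the sketch below keeps nothing but a list of (shader location, data pointer) pairs. The names UniformBinding, BindingList and applyTo are hypothetical and not existing xoreos classes; the upload callback stands in for the real glUniform* calls so the example stays independent of OpenGL.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical: one piece of data to bind into one shader location.
struct UniformBinding {
    int location;       // shader uniform location
    const float *data;  // per-object data, owned by the object itself
    std::size_t count;  // number of floats readable at data
};

// Hypothetical replacement for a heavyweight surface/material: just a list.
class BindingList {
public:
    void add(int location, const float *data, std::size_t count) {
        assert(data != nullptr);
        _bindings.push_back({location, data, count});
    }

    // Hand every binding to an upload function (e.g. a glUniform wrapper).
    void applyTo(const std::function<void(const UniformBinding &)> &upload) const {
        for (const UniformBinding &binding : _bindings)
            upload(binding);
    }

    std::size_t size() const { return _bindings.size(); }

private:
    std::vector<UniformBinding> _bindings;
};
```

Because only pointers are stored, an object can change its alpha or colour between frames without touching the list; the new values are picked up the next time applyTo runs.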
Back to lighting. Right now everything is pushed to a queue, sorted, rendered. Lights, however, are applied to transparent surfaces, opaque surfaces, everything - and there are a lot of lights in a NWN level. The transparency is key here, and that's not good for any sort of deferred lighting setup. It needs to be a forward renderer. There are far too many lights to simply bind them all in one large shader uniform (arguably they could be put into a buffer object, but uniform data still needs to index into that anyway), and it's not feasible to iterate over lights that have no influence over the current fragment. In other words, objects need to know which lights are active in the immediate surroundings. This is again bad for batched queues, but relatively trivial with immediate objects rendering if the list of lights can be quickly calculated.
Knowing which lights are close to an object (not just a modelnode, but the parent object - it's assumed lighting will impact all modelnodes of a model, and all models of a single object) means some kind of volume based hierarchy to properly manage it all. Current rendering just takes modelnodes, but knows nothing of the parent object. What does this mean? NWN could very likely do with a scenegraph of some kind (quadtree, or possibly just a grid seeing as areas are tile based). This could help visibility culling, traversing would relatively quickly build a lighting list without trouble, and knowledge of objects could help with rendering order, for example making sure that the tops of tile data (often transparent) are rendered last. It is definitely worth keeping in mind that this is very NWN-specific. It therefore makes sense to create a class that is dedicated to rendering NWN in a specific fashion. Other games can copy and modify as necessary. This class would need to be hooked into the graphics loop somewhere - the graphics loop doesn't, and shouldn't, know if NWN or Jade Empire is being rendered.
Removing some of the rendering management is not trivial work, but should be straightforward enough. Creating some kind of volume hierarchy and making sure objects are correctly updated is trickier. NWN already has the Area class which looks to handle the majority of map data, so something hooked into there would be beneficial. Area also contains information on tiles, which in turn have main and source lights. Initial testing just needs lists of everything - hierarchies are intended to simplify calculations, but it can be very inefficiently brute forced to have it all function, then optimised with a proper hierarchy later.
On the topic of tiles and main lights, it looks like main lights should have a radius of 7. This is close to the diagonal "radius" of a tile (each tile is 5x5). It isn't currently known where to find the light radius and the other properties besides colour; they might be hard coded. Main light radius should cover the tile it's part of, but not extend too far beyond: enough to blend into other tiles, but not enough to cover entire large rooms or bleed through walls.
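The brute force starting point could look something like the sketch below: test every light in the area against an object's bounding sphere and keep the ones that overlap. Vec3, Light and gatherLights are hypothetical names for illustration only, and treating both the light and the object as spheres is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical light record; position is assumed to be in world coordinates.
struct Light {
    Vec3 position;
    float radius;
};

// True if the light's sphere of influence overlaps the given bounding sphere.
inline bool lightTouchesSphere(const Light &light, const Vec3 &centre, float radius) {
    const float dx = light.position.x - centre.x;
    const float dy = light.position.y - centre.y;
    const float dz = light.position.z - centre.z;
    const float reach = light.radius + radius;
    return dx * dx + dy * dy + dz * dz <= reach * reach;
}

// Brute force: check every light, return the indices of those that matter.
std::vector<std::size_t> gatherLights(const std::vector<Light> &lights,
                                      const Vec3 &centre, float radius) {
    assert(radius >= 0.0f);
    std::vector<std::size_t> active;
    for (std::size_t i = 0; i < lights.size(); ++i)
        if (lightTouchesSphere(lights[i], centre, radius))
            active.push_back(i);
    return active;
}
```

A grid or quadtree would later only change which lights get passed in; the overlap test itself could stay the same.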
Playing Around
Continuing on from above (perhaps this should be put into a different page), there are a few advancements in local builds. Light information, it turns out, is stored in modelnode controllers. This makes sense really: controllers can act as animation keyframes, and the radius, intensity, colour, etc, of a light could all change. Think of a flickering candle, or the light given off by fireworks: they aren't static lights. Inspecting controllers from model files, it appears that tile main lights will generally have at least a radius and multiplier value. The radius of main light 1 looks to generally be 14, with main light 2 being 5. The multiplier value generally looks to be 1. For candles, the multiplier has been zero. I'm not quite sure how the radius and multiplier values map to OpenGL lighting equations: the radius very likely impacts the linear coefficient, probably based on the light attenuation falling to some value (0.5, 0.25, 0.1, could be anything really) but the multiplier? No idea currently.
All this is very nice and allows some playing around with lighting to see what fits. Mesh ambient values are just source colour component-wise multipliers, easily added to the shaders. On the topic of mesh, let's take a step back a moment and look at modelnodes in general. Currently all mesh data is added to a mesh manager to prevent duplication. This is now wrong. Model caches mean that model files are only loaded once, and so the mesh data is only going to exist once within the model's node hierarchy. There's simply no need to use the dedicated mesh manager in this area anymore, and removing the usage means cleaner code and fewer hacks (always nice). (--edit: actually, seems that model files are _not_ cached currently, they're loaded multiple times - this should probably change!) (--edit2: fixing some of the unique mesh naming gives all the mesh data sharing benefits, but allows the same model to be loaded multiple times - which effectively gives model instancing anyway. To avoid reloading from disk accesses, just load the model file into memory and parse directly from that if necessary. Mesh manager lives on.)
Not to say that the mesh manager is entirely useless. It's still good to keep some engine default meshes around: wireframe boxes for debug and development purposes (showing bounding volumes for example), squares, circles, that sort of thing. So it still serves an important purpose, just not for the mesh data from nodes.
Models are generally referenced by a placeable, door, character, or some such class. Models can be immediately rendered, kept in a cache and removed when nothing uses them any more, provide animation details, attachment points, etc - but a model is still only referenced. This means the model data and its entire hierarchy shouldn't be modified, or else multiple objects might be impacted. It all gets a bit more murky from here. Objects will need to track which state their reference to the model is in (door open, closed, damaged, etc), which keyframe might be applied to animations, whether something is attached (a character holding a weapon), the location, orientation and scaling that's applied, and so on. All of that isn't so difficult, but models can also contain lights (normal lights and shadow casting lights) and emitters (particle effects). Lights and emitters aren't simply rendered as the node hierarchy is traversed, and the list of lights impacting each object must be calculated at runtime.
Lights are effectively going to be generated for each object and their duration tied to the object, but referencing model data, possibly individually animated, and globally available to see which objects each light influences. Also, tile lights are treated slightly differently just because.
Lights, Camera, Action
After quite a deal of poking and prodding, lighting now appears to be somewhat functional. Only static lights, and only those from tiles, are currently being included, but they appear to be applied more or less correctly. There are a few notes:
- Node absolute position is absolute within the node hierarchy, but relative to the parent model. It is not world co-ordinates.
- Be careful using absolute positions to determine world coordinates. They don't take into account orientation or scaling. Prefer to use the absolute (4x4 matrix) transform instead and extract the position (matrix[3]) once the entire world transform is calculated (i.e. model_transform x node_transform). Lights must be in world coordinates to give a single reference system for comparisons.
- There are a bunch of hints about what geometry is affected by lighting, and currently all of those hints are being ignored. Something for later.
- Building a list of active lights for rendering can't really be done on a model level. The radius of some model meshes is very large and will intersect with a lot of lights, far too many for use in the shaders. This list building does not currently take into account light priority or distance, both of which could be used to trim the list accordingly. Per-node list building is being done instead to keep most geometry correctly lit; per-node calculations become much more expensive, but also far more generic.
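To make the world coordinate note above concrete, here is a small sketch of extracting a position from the combined transform. It assumes column-major 4x4 matrices in the OpenGL convention, where column 3 holds the translation; Mat4, multiply and worldPosition are hypothetical helpers written for this example rather than the matrix classes actually used.

```cpp
#include <array>
#include <cassert>

// Hypothetical minimal column-major matrix: m[column][row], as in OpenGL.
using Vec4 = std::array<float, 4>;
using Mat4 = std::array<Vec4, 4>;

// Standard matrix product a x b for column-major storage.
Mat4 multiply(const Mat4 &a, const Mat4 &b) {
    Mat4 r{};  // zero-initialised
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
            for (int k = 0; k < 4; ++k)
                r[col][row] += a[k][row] * b[col][k];
    return r;
}

// Identity rotation/scale with the given translation.
Mat4 translation(float x, float y, float z) {
    Mat4 m{};
    for (int i = 0; i < 4; ++i)
        m[i][i] = 1.0f;
    m[3] = {x, y, z, 1.0f};
    return m;
}

// World position of a node: column 3 of model_transform x node_transform.
Vec4 worldPosition(const Mat4 &modelTransform, const Mat4 &nodeTransform) {
    return multiply(modelTransform, nodeTransform)[3];
}
```

For example, a node at (1, 0, 0) inside a model placed at (10, 0, 0) and rotated 90 degrees about Z ends up at (10, 1, 0) in the world, whereas simply adding the two positions would wrongly give (11, 0, 0).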
NWN, and presumably other games, has some features to help with some of the calculation overhead. Not every tile is visible at once for example: tiles fade in and fade out as the player character moves around. Lights can be activated as tiles move into view, and deactivated as they move out of view. Instead of performing comparisons against every single light in the entire area, only those within the current view need be checked against.
Static lights can be placed into a quadtree (or some kind of tree setup) to quickly perform volume intersection comparisons. This should be a moderate performance increase given that every single rendered node will use it every single frame, but it's something that will need investigation. Dynamic lights are likely best kept in just a simple list, but fortunately they're expected to be far fewer than static lights. It's highly recommended to stick with fade in and fade out to trim the list of active lights at first, as this is far simpler to code and maintain, and it unifies how lights are handled. It just depends on performance if something further should be done.
Light radius and multiplier values appear to be directly related to OpenGL lighting model values. The linear coefficient looks to be the multiplier value directly (normally 1.0f), while the radius doesn't directly map to the quadratic coefficient. Under the assumption that the half-radius point equates to roughly 0.5f attenuation, the quadratic component is being calculated as: 1.0f / ((radius*0.5f)*(radius*0.5f)). Now that lighting is in place, any adjustments here should be relatively trivial.
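The mapping described above can be sketched as follows. The quadratic and linear terms come straight from the text; the constant term of 1.0f is an assumption (it is the OpenGL fixed-function default), and Attenuation, attenuationFromLight and attenuate are hypothetical names.

```cpp
#include <cassert>

// Coefficients of the classic OpenGL attenuation 1 / (kc + kl*d + kq*d*d).
struct Attenuation {
    float constant;
    float linear;
    float quadratic;
};

// Current guess: linear = multiplier, quadratic = 1 / (radius/2)^2.
Attenuation attenuationFromLight(float radius, float multiplier) {
    assert(radius > 0.0f);
    const float halfRadius = radius * 0.5f;
    return {1.0f /* assumed constant term */, multiplier,
            1.0f / (halfRadius * halfRadius)};
}

// Attenuation factor at the given distance from the light.
float attenuate(const Attenuation &a, float distance) {
    return 1.0f / (a.constant + a.linear * distance +
                   a.quadratic * distance * distance);
}
```

Worth noting when tuning: with these coefficients the factor at the half-radius point is exactly 0.5 only when the linear term is zero. For a radius of 14 and a multiplier of 1 it works out to 1 / (1 + 7 + 1), roughly 0.11, so the "0.5f at half radius" assumption and the "linear equals multiplier" assumption pull in different directions.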
Transparency
Transparency is quite naturally reliant on rendering order here. Opaque objects should be rendered first, then transparent objects in depth-sorted order. For NWN (and again presumably other games) a lot of geometry can be transparent as it fades in and out of view depending on where the player character is, the distance from the camera, any effects being applied, etc. Pretty much anything can be transparent - and so all nodes can have an alpha value.
The alpha value is very likely going to be the key deciding factor behind rendering order. Anything less than about 90% (0.9f) can be considered transparent, while some general Internet reading suggests the original NWN treats anything less than 20% (0.2f) to be essentially invisible and not rendered at all. Alpha values are going to be considered on a mesh node basis, not a model basis, to again be far more generic and applicable to all games.
There does appear to be a transparency hint included with the geometry format. This isn't so much a transparency indicator (because just about everything can be transparent anyway), but instead a rendering order hint. It basically says to render the mesh later than others, regardless of other factors. Presumably this is related to how the original game might have drawn nodes, perhaps something like:
- If mesh geometry is static and opaque, draw immediately.
- Traverse child nodes and draw those.
- After child nodes are drawn, now draw dynamic and transparent geometry.
Very recursive-friendly, but also requires knowledge of game data. NWN for example might have rendered tiles last to be sure that the fade in and out geometry doesn't obscure characters, placeables, etc. Each game might do something slightly different in this regard, so it would mean each game would then need a customised bit of world rendering code. Possibly a good idea anyway, but attempts at a more generic approach will be used to start with. Note: looking at Knights of the Old Republic, it has been observed that "transparency hints" of different values to NWN exist. While NWN primarily has a uint32_t value of 0 or 1 (at least so far), KotOR has 0, 1, 3, 5, and 8 at least. Current theory is to treat this as a priority system: lower numbers have higher rendering priority.
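The guessed three-step traversal above could be sketched like this. It is only an illustration of that guess, not known behaviour of the original engine; Node, its deferred flag and drawRecursive are hypothetical, and "drawing" just records the node name so the resulting order can be inspected.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical node: deferred marks dynamic or transparent geometry.
struct Node {
    std::string name;
    bool deferred;
    std::vector<Node> children;
};

void drawRecursive(const Node &node, std::vector<std::string> &drawn) {
    // Static, opaque geometry is drawn immediately.
    if (!node.deferred)
        drawn.push_back(node.name);

    // Then all child nodes are traversed and drawn.
    for (const Node &child : node.children)
        drawRecursive(child, drawn);

    // Dynamic and transparent geometry is drawn after its children.
    if (node.deferred)
        drawn.push_back(node.name);
}
```

With a deferred tile node containing an opaque floor and a deferred window that itself has an opaque frame, the recorded order is floor, frame, window, tile: the fading tile geometry ends up last.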
Queues
The rendering queue system so far appears to handle rendering order fairly well for most games. The original intent behind the queues was to perform batched rendering where possible and so improve overall rendering performance. That's not being done now (far too difficult to maintain just yet, but can be revisited much further down the line), but the idea of multiple queues for opaque and transparent (translucent more accurately) objects stood up quite well. Instead of drawing geometry as model nodes are traversed, place them into appropriate queues which can be sorted if necessary before rendering. This approach separates transparency from hierarchies, allowing overlapping model volumes to be more correctly drawn. It has helped a lot with KotOR, which has a lot of windows, so there's no reason not to continue using this setup. The only real change is that previously a lot of extra data was included to directly access material data, surface data, alpha values, position, etc. The render queues could instead just keep a pointer to the node and render it directly as needed. There are some challenges with that approach, mostly that immediate node rendering must rely on updated _renderTransform values (basically world coordinate transforms), and some of the rendering function call interfaces might need a bit of tweaking, but on the whole there are no insurmountable issues; tweaks rather than overhauls.
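A pointer-only queue along those lines might look like the sketch below. RenderNode and RenderQueues are hypothetical stand-ins (the real node class carries far more state), and viewDepth is assumed to have been derived from an up-to-date world transform before the node is queued.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a model node; only what the queues need.
struct RenderNode {
    int id;
    bool translucent;  // decided elsewhere, e.g. from the node's alpha value
    float viewDepth;   // distance from the camera
};

class RenderQueues {
public:
    // Queues store pointers only; the node is rendered directly later.
    void add(const RenderNode *node) {
        assert(node != nullptr);
        (node->translucent ? _translucent : _opaque).push_back(node);
    }

    // Translucent nodes are drawn back to front; opaque order is left alone.
    void sort() {
        std::sort(_translucent.begin(), _translucent.end(),
                  [](const RenderNode *a, const RenderNode *b) {
                      return a->viewDepth > b->viewDepth;
                  });
    }

    // Final draw order: all opaque nodes, then the sorted translucent nodes.
    std::vector<const RenderNode *> drawOrder() const {
        std::vector<const RenderNode *> order(_opaque);
        order.insert(order.end(), _translucent.begin(), _translucent.end());
        return order;
    }

private:
    std::vector<const RenderNode *> _opaque;
    std::vector<const RenderNode *> _translucent;
};
```

Since the queues hold no copies of material or transform data, they would need to be rebuilt (or at least re-sorted) whenever node depths change, which for a moving camera means every frame.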
Fade In/Out
Particularly with NWN, possibly other games, there's object fade in and out as the camera moves around. This is intended to stop geometry from obscuring the player's view of a scene. This is not actually determined by transparency hints; many meshes with fade in/out can have transparency hints of zero. Instead, the tilefade attribute marks the meshes this applies to. So far values of 0, 1, 2, and 4 have been seen. These are theorised to indicate the distance at which fade in/out is applied, with 0 meaning fade is disabled for that mesh. Fade in/out would work by adjusting the alpha value of the mesh. This impacts the rendering queue - once the alpha value is less than e.g. 90% the mesh should move to the transparency queues. Below 20%, just don't render that object.
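The alpha thresholds mentioned here are simple enough to capture in one small function. This is a sketch using the approximate 90% and 20% figures from the text; RenderBucket and classifyAlpha are hypothetical names, and the exact cut-off values may well need tuning.

```cpp
#include <cassert>

// Where a mesh goes this frame, based purely on its current alpha value.
enum class RenderBucket { Skip, Transparent, Opaque };

// Approximate thresholds taken from the notes; not confirmed engine values.
const float kOpaqueThreshold = 0.9f;
const float kInvisibleThreshold = 0.2f;

RenderBucket classifyAlpha(float alpha) {
    if (alpha < kInvisibleThreshold)
        return RenderBucket::Skip;         // effectively invisible, don't render
    if (alpha < kOpaqueThreshold)
        return RenderBucket::Transparent;  // goes to the depth-sorted queues
    return RenderBucket::Opaque;
}
```

As a tile fades out its alpha drops from 1.0, so the same mesh moves from Opaque to Transparent and finally to Skip without any change to the mesh itself.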
Render Order
Looks like KotOR at least, and possibly NWN, bakes the rendering order of modelnodes into the mdl file format itself. In KotOR this has helped render blast scorch marks correctly over everything else. Unfortunately the models overlap each other, so model rendering order (in addition to modelnode rendering order) is possibly also baked into the loading order somehow. This is considered somewhat gross: whether the tools managed to do that or it was left up to the artists, it seems to be a less than optimal way of handling things.
Regardless of reasons, the pre-baked rendering order of static geometry is critical in getting everything to show up properly. Everything has blending enabled by default, even objects that are actually opaque, which is just the way the Aurora engine seems to work. This is fairly inefficient for a GPU, but for whatever reason was a design decision that someone took which has wide ranging and often subtle impacts elsewhere. With everything being able to offer transparency, and with the mdl format not giving enough secondary hints as to what is really transparent and what is not, then everything is pretty much stuck on using the pre-baked rendering order. This has side effects, for example geometry rendering order cannot be sorted by shader for batched rendering.
Static geometry will require this rendering order, but dynamic geometry, e.g. player characters, cannot be pre-baked into the overall area rendering order. Within a single model it can, but not within an entire scene. So rendering will very likely end up as: render static geometry, render dynamic geometry, render special effects in there somewhere. This means the rendering pipeline will need to know the difference between static and dynamic geometry, which is not something currently implemented. This information can be specific to games; at the time of writing it's not certain if each game should offer a customised top level "renderWorld" callback, or if each game should provide a list of static and dynamic objects to the top level graphics handling layer.