Ainur Engine

Sunday, January 4, 2009

Grass lighting problem

I'm implementing dynamic lighting by adapting shaders I found on the ETM forum, but I noticed that grass made with PagedGeometry doesn't support lighting. Fortunately, someone wrote a patch, which allowed me to concentrate on getting proper dynamic lighting with normal mapping on ETM.
Here's the topic on the PG forum.
You can also download the diff file here.

A note about terrains

Ainur Engine uses Editable Terrain Manager, which also supports terrain editing, a feature that will be useful for the future Ainur Editor.
ETM can splat up to 12 textures per pass using 4 coverage maps with 3 channels (RGB) each, which seems to be the maximum number of textures per pass allowed by shader model 2.0. Currently Ainur uses 9 textures with 3 coverage maps, but I am already removing shader model 1.0 support where it makes sense.
Since OpenGL 3.0 has not been well received so far and Ogre officially supports only OpenGL and Direct3D, targeting shader model 2.0 makes sense. Every graphics card should support it.

Friday, January 2, 2009

Roadmap

Currently the Ainur Engine has the following features:

Modular and object oriented design
Game state management
IntroState does a nice slideshow of your images
TextFile and XmlFile resource managers
CEGUI support
Console built with CEGUI (not yet enabled)
Support for mouse, keyboard, joystick/gamepad (even the Microsoft Xbox 360 gamepad)
Terrains with Editable Terrain Manager, with support for splatting, lightmap and colourmap
Grass support with PagedGeometry
Sky, clouds, stars, sun, moon and day/night cycle with Caelum
Bloom effect (which was taken from Ogre examples but it's not that great with Ainur because the lights seem too bright)
Resource locations with a simple XML file
CEGUI sets and cursors specified with a simple XML file
Configuration file for stages (AKA levels), implemented via INI-style file instead of XML (may use XML for this in the future)

Current planned features for Ainur Engine v0.1.0:

Implement HDR and Motion Blur
Fix Bloom weight
Disable Bloom and use HDR
Initial PostEffects infrastructure to handle compositors
Water support with Hydrax
Initial code for weather management
Physics code to make the player walk (with three camera settings: first person, third person and free)
MainMenu state implementation with Play Singleplayer, Options and Quit menu items
While implementing this game state I will also add a way to load/save graphics options
Implement EditState to be used in the Stage editor instead of the PlayState which will be used for in-game preview
Stage editor for terrains, grass, sky and water
Loading IntroState images from XML

At the time of this writing, I have already implemented the first four points and I'm working on the fifth.

Thursday, January 1, 2009

Current screenshots

Now I would like to add Hydrax water support but first it's better to post some screenshots.

World refactor and grass

The World class has been refactored: I separated sky and terrains (vegetation was already separated, but its class still needs to be renamed from PagedGeometrySystem to VegetationSystem). There is now a SkySystem, which uses Caelum, and a TerrainSystem, which uses ETM (with the TerrainSystemHeightFunction namespace for the height function used by PagedGeometry). The world used to be loaded from an XML file taken from the resources, but I found it too slow to parse, so I switched to an INI-style file that I can read with the Ogre::ConfigFile class. This file contains all the information about the stage: name, author(s), scene (camera position only at the moment), and whether to load grass and terrain.
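
To give an idea of what such a stage file involves, here is a minimal standalone sketch. It is not the engine's code: the engine reads the file through Ogre::ConfigFile, while this stand-in uses only the standard library so it can be compiled on its own. The function name parseStageSettings and every key name are assumptions made up for illustration, and section headers are simply skipped rather than tracked.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Strip leading and trailing whitespace from a string.
static std::string trim(const std::string& s)
{
    const char* ws = " \t\r\n";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos)
        return "";
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Hypothetical stand-in for an INI-style reader: collects "key = value"
// lines and skips blank lines, '#' comments and [section] headers.
std::map<std::string, std::string> parseStageSettings(std::istream& in)
{
    std::map<std::string, std::string> settings;
    std::string line;
    while (std::getline(in, line))
    {
        line = trim(line);
        if (line.empty() || line[0] == '#' || line[0] == '[')
            continue;
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos)
            continue; // not a key=value line, ignore it
        settings[trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
    }
    return settings;
}
```

Feeding it a stream containing lines such as "Name = Demo" and "LoadGrass = true" (hypothetical keys) yields a map from which the stage loader could decide whether to create the grass and the terrain.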

Monday, December 29, 2008

Thread safe singletons

With this rewrite of Ainur, I will lose several things that Ogre3D used to provide.
Before thinking about shaders, post-process effects and all the graphics stuff, I need to implement "core" stuff like singletons.

In the previous version I wrote all the managers as singletons because the game engine only needs one instance of each, accessible from everywhere.

Since the new Ainur engine will be multi-threaded, I felt the need for a thread-safe implementation.

Single-threaded implementation

If you are sure that the singleton will be first accessed in a single-threaded context, for example during the initialisation phase where we do not have threads yet, a simple test for singleton existence is sufficient.

In this case, the Singleton has a private static pointer to the singleton instance, and public static functions such as getSingletonPtr() and getSingleton() to get the pointer or reference to the instance.

In getSingletonPtr() we create the instance if it doesn't exist yet and return a pointer to it.
The constructor and destructor are private, so the class cannot be instantiated directly.

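#include <cstddef> // NULL

class Singleton
{
public:
    // Creates the instance on first use. Not thread-safe.
    static Singleton* getSingletonPtr()
    {
        if (m_singleton == NULL)
            m_singleton = new Singleton();
        return m_singleton;
    }

    // Returns a reference (not a copy), creating the instance if needed.
    static Singleton& getSingleton()
    {
        return *getSingletonPtr();
    }

private:
    Singleton() {}
    ~Singleton() {}

    static Singleton* m_singleton;
};

// A static data member has to be defined outside the class body.
Singleton* Singleton::m_singleton = NULL;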

Multi-threaded implementation

What happens in a multi-threaded environment?
What if multiple threads try to access the singleton at the same time?

We need to find a way to avoid creation of more than one instance.
In order to do this, I think we should use mutex locks to protect the singleton creation code from being executed by more than one thread at a time.

(Code listing not preserved. Surviving fragments: an #include, m_singleton = new ..., and m_mutex = CreateMutex(NULL, ...)
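
Here is a minimal sketch of the locking approach described above. It assumes C++11's std::mutex and std::lock_guard instead of the Win32 mutex API, and the class name LockedSingleton is made up for illustration, so treat it as one possible shape rather than the engine's actual code.

```cpp
#include <mutex>

// Sketch only: std::mutex stands in for a platform mutex handle.
class LockedSingleton
{
public:
    static LockedSingleton* getSingletonPtr()
    {
        // Every caller takes the lock, so the null test and the
        // creation can never interleave between two threads.
        std::lock_guard<std::mutex> lock(s_mutex);
        if (s_singleton == nullptr)
            s_singleton = new LockedSingleton();
        return s_singleton;
    }

    static LockedSingleton& getSingleton()
    {
        return *getSingletonPtr();
    }

private:
    LockedSingleton() {}
    ~LockedSingleton() {}
    LockedSingleton(const LockedSingleton&) = delete;
    LockedSingleton& operator=(const LockedSingleton&) = delete;

    static LockedSingleton* s_singleton;
    static std::mutex s_mutex;
};

LockedSingleton* LockedSingleton::s_singleton = nullptr;
std::mutex LockedSingleton::s_mutex;
```

The lock is released automatically when lock goes out of scope, including on the early-return path, which is the main reason to prefer a scoped guard over manual lock/unlock calls.
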
Attempting to Avoid Locking Overhead

That's pretty simple, and it works. However, you might notice that we've incurred the overhead of locking every time someone accesses the instance of the singleton.

There's a well-known pattern to eliminate this overhead, called the double-checked lock. It takes advantage of the fact that we can (apparently) check for null as a quick test before we even bother with locking. Here's how this would look:

T* T::instance()
{
    if (smInstance == NULL)
    {
        VMutexLocker lock(&smMutex);
        if (smInstance == NULL) // double-check
            smInstance = new T();
    }
    return smInstance;
}

The idea is that if we can see that the instance pointer is not null, then there's no need to even enter the synchronized lock and consider creating it.

Once we determine that the instance may need to be created, we enter the locked section. However, we need to re-check for null once we acquire the lock, because "we" may be the thread that lost a race to the lock. That is, another thread may have also checked for null at the same time (we both saw a null pointer) and then it acquired the lock first, created the instance, and released the lock before we were given the lock.

Unfortunately, while this will work reliably on many if not almost all platforms, there is no requirement that the combination of the compiler and the processor's memory model will work as this code intends. It is simply not guaranteed to work.

The Potential Problem

The problem comes from the following statement and the lack of guarantee of the order in which it performs its multiple sub-operations. (The same problem can exist in other languages such as Java and .NET; this is not a C++-specific issue.)

smInstance = new T();

Remember that in our attempt to optimize away unnecessary locking, another thread may be executing the first null check statement (prior to lock acquisition) at the exact same time as our thread is creating and assigning the instance:

if (smInstance == NULL)

Here is the core problem: We have no guarantee that T will be fully constructed before the pointer smInstance is assigned. For example, the compiler is within its rights to generate processor instructions that implement the statement in the following order:

1. Allocate a memory block sized for T.
2. Assign the address of the memory block to smInstance.
3. Call T's constructor.

(Multi-processor memory architectures can provide a similarly "hostile" environment to our code.)

Consider then what happens if the other thread performs its unsynchronized null pointer test when we are between steps 2 and 3 of the instantiation statement: It will see a non-null value, and will proceed to return the pointer to the raw memory of the not-yet-constructed object.
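
For comparison, C++11 (which postdates this post) lets the required ordering be stated explicitly. The sketch below is an assumption-laden illustration, not part of the original pattern: Widget is a hypothetical stand-in for T, and std::atomic with acquire/release ordering is used so that the pointer cannot become visible to another thread before the constructor's writes do.

```cpp
#include <atomic>
#include <mutex>

class Widget // hypothetical stand-in for T
{
public:
    static Widget* instance()
    {
        // Acquire load: a thread that sees a non-null pointer also sees
        // everything the creating thread wrote before its release store.
        Widget* p = s_instance.load(std::memory_order_acquire);
        if (p == nullptr)
        {
            std::lock_guard<std::mutex> lock(s_mutex);
            // Double-check under the lock; the mutex already orders this read.
            p = s_instance.load(std::memory_order_relaxed);
            if (p == nullptr)
            {
                p = new Widget(); // construct fully first ...
                s_instance.store(p, std::memory_order_release); // ... then publish
            }
        }
        return p;
    }

    int value() const { return m_value; }

private:
    Widget() : m_value(42) {}
    int m_value;

    static std::atomic<Widget*> s_instance;
    static std::mutex s_mutex;
};

std::atomic<Widget*> Widget::s_instance(nullptr);
std::mutex Widget::s_mutex;
```

The release store plays the role of a barrier between steps 3 and 2 of the list above: the constructor has to complete before the address is published. On a pre-C++11 compiler none of this is available, which is the situation the rest of this post deals with.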

The Outcome

Fortunately, we can fall back to the original "thread-safe" implementation if we want the instance() function to work with certainty from multiple threads on any platform and compiler.

This leaves you with some choices. If you are sure that the first call to instance() will be made in a single-threaded mode, such as during static initialization, then you don't need the lock. Similarly, if you know that (or can structure your code such that) the instance is created from the main thread before you create other threads, then you don't need the lock. Alternatively, the performance overhead of the lock may well be a non-issue unless you are calling instance() frequently--a classic case of "don't optimize prematurely", where you shouldn't work to eliminate the lock if it isn't a problem to begin with. And of course, if you reference the singleton a lot in one place, you can avoid the cost of checking the lock repeatedly by calling instance() once and then using the object directly, rather than calling instance() repeatedly.
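
Two of these choices can be sketched together: create the instance without a lock while still single-threaded, and look it up once instead of on every use. Counter and sumInto are hypothetical names invented for this example, and the sketch assumes the first call to instance() really does happen before any other thread exists.

```cpp
#include <cstddef>

class Counter // hypothetical singleton
{
public:
    static Counter& instance()
    {
        // No lock: this assumes the first call is made on the main
        // thread before any worker thread has been started.
        if (s_instance == NULL)
            s_instance = new Counter();
        return *s_instance;
    }

    // Note: add() itself is not synchronized in this sketch.
    void add(int n) { m_total += n; }
    int total() const { return m_total; }

private:
    Counter() : m_total(0) {}
    int m_total;

    static Counter* s_instance;
};

Counter* Counter::s_instance = NULL;

// Look the singleton up once, then reuse the reference inside the loop
// instead of calling instance() on every iteration.
int sumInto(const int* values, std::size_t count)
{
    Counter& counter = Counter::instance();
    for (std::size_t i = 0; i < count; ++i)
        counter.add(values[i]);
    return counter.total();
}
```

With a locking instance() the same caching trick means the lock is taken once per call to sumInto rather than once per element.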

In my Code Vault library, I've taken this two steps further: first by implementing the singleton pattern as a template that is parameterized with the locking requirements, and second by allowing the singleton to be registered for deletion during program termination, with rules for whether a deleted singleton can be "resurrected" if it is accessed after being deleted.

Sunday, December 28, 2008

First cool screenshot!!!

Tonight I moved from PLSM2 back to Editable Terrain Manager, which seems a loooooooot better!!