Adventures in Deferred Rendering: Volumetric Lighting

I like god rays. They’re easy to implement and yet they add so much detail to a scene.

Volumetric lighting is typically driven by a single-phase function that determines how much of the total scattered light is scattered toward the camera. The angle between the incoming light direction and the vector from our camera to the world space position of a given fragment is what primarily drives the Mie-scattering phase function. Alexandre Pestana’s blog post, as well as Nathan Vos’s article in GPU Pro 5 on volumetric light effects in Killzone: Shadow Fall, is an excellent look into the details surrounding volumetric lights.

Given such resources, I mostly just wanted to share my own implementation to shed some light on how to set up the rendering and touch on concepts that allow us to enhance the performance of our applications at, perhaps, the expense of dealing with visual artifacts and other unpleasantries.

Implementing a Shadow Mapper for Directional Lighting

The first order of business is to set up shadow mapping in my deferred renderer, which is going to serve two purposes: to render shadows (obviously) and to cull the container of the “volume” that represents the lighting so that we achieve clean casts onto the illuminated parts of surfaces, as seen in the image above. Because the shadow map encodes information that reveals which parts of the scene can be viewed from the light’s perspective, we can use this information to cultivate nice-looking volumetric beams that extend directly from the lit regions.

To facilitate shadow mapping, we’re going to want to render all of the geometries in our scene from the light’s perspective, like so:

void setLightSpaceVP(const phoenix::Shader& shader)

    glm::vec3 lightViewForward = glm::normalize(directLight._direction);
    glm::vec3 lightViewRight = glm::normalize(glm::cross(phoenix::UP, lightViewForward));
    glm::vec3 lightViewUp = glm::normalize(glm::cross(lightViewForward, lightViewRight));
    glm::mat3 rotation;
    rotation[0] = lightViewForward;
    rotation[1] = lightViewRight;
    rotation[2] = lightViewUp;

    std::array<glm::vec3, phoenix::NUM_FRUSTUM_CORNERS> frustumCorners{ {
            glm::vec3(-phoenix::FRUSTUM_HALF_WIDTH, -phoenix::FRUSTUM_HALF_WIDTH, phoenix::FRUSTUM_HALF_WIDTH),
            glm::vec3(-phoenix::FRUSTUM_HALF_WIDTH, -phoenix::FRUSTUM_HALF_WIDTH, -phoenix::FRUSTUM_HALF_WIDTH),
            glm::vec3(-phoenix::FRUSTUM_HALF_WIDTH, phoenix::FRUSTUM_HALF_WIDTH, phoenix::FRUSTUM_HALF_WIDTH),
            glm::vec3(-phoenix::FRUSTUM_HALF_WIDTH, phoenix::FRUSTUM_HALF_WIDTH, -phoenix::FRUSTUM_HALF_WIDTH),
            glm::vec3(phoenix::FRUSTUM_HALF_WIDTH, -phoenix::FRUSTUM_HALF_WIDTH, phoenix::FRUSTUM_HALF_WIDTH),
            glm::vec3(phoenix::FRUSTUM_HALF_WIDTH, -phoenix::FRUSTUM_HALF_WIDTH, -phoenix::FRUSTUM_HALF_WIDTH),
            glm::vec3(phoenix::FRUSTUM_HALF_WIDTH, phoenix::FRUSTUM_HALF_WIDTH, phoenix::FRUSTUM_HALF_WIDTH),
            glm::vec3(phoenix::FRUSTUM_HALF_WIDTH, phoenix::FRUSTUM_HALF_WIDTH, -phoenix::FRUSTUM_HALF_WIDTH)
        } };

    for (size_t i = 0; i < phoenix::NUM_FRUSTUM_CORNERS; ++i)
        frustumCorners[i] = frustumCorners[i] * rotation;

    float minX = frustumCorners[0].x, minY = frustumCorners[0].y, minZ = frustumCorners[0].z, maxX = frustumCorners[0].x, maxY = frustumCorners[0].y, maxZ = frustumCorners[0].z;
    for (size_t i = 0; i < phoenix::NUM_FRUSTUM_CORNERS; ++i)
        if (frustumCorners[i].x < minX)
            minX = frustumCorners[i].x;
        if (frustumCorners[i].x > maxX)
            maxX = frustumCorners[i].x;
        if (frustumCorners[i].y < minY)
            minY = frustumCorners[i].y;
        if (frustumCorners[i].y > maxY)
            maxY = frustumCorners[i].y;
        if (frustumCorners[i].z < minZ)
            minZ = frustumCorners[i].z;
        if (frustumCorners[i].z > maxZ)
            maxZ = frustumCorners[i].z;

    glm::mat4 lightView(rotation);
    lightView[3][0] = -(minX + maxX) * 0.5f;
    lightView[3][1] = -(minY + maxY) * 0.5f;
    lightView[3][2] = -(minZ + maxZ) * 0.5f;
    lightView[0][3] = 0.0f;
    lightView[1][3] = 0.0f;
    lightView[2][3] = 0.0f;
    lightView[3][3] = 1.0f;

    glm::vec3 frustumExtents(maxX - minX, maxY - minY, maxZ - minZ);
    glm::vec3 halfExtents = frustumExtents * 0.5f;

    glm::mat4 lightSpaceVP = glm::orthoLH(-halfExtents.x, halfExtents.x, -halfExtents.y, halfExtents.y, -halfExtents.z, halfExtents.z) * lightView;
    shader.setMat4(phoenix::G_LIGHT_SPACE_VP, lightSpaceVP);

// ...

void execShadowMapPass(const phoenix::Shader& shader, phoenix::Model& object)
    glViewport(0, 0, phoenix::HIGH_RES_WIDTH, phoenix::HIGH_RES_HEIGHT);
    glBindFramebuffer(GL_FRAMEBUFFER, shadowMapRenderTarget->_FBO);


    renderObject(shader, object);

    glBindFramebuffer(GL_FRAMEBUFFER, 0);
    glViewport(0, 0, phoenix::SCREEN_WIDTH, phoenix::SCREEN_HEIGHT);

As a bit of a hack, I rotated a frustum with half-widths of size 25 so that it was aligned with the direction of the lighting; that’s enough to cover the entire Sponza scene. I also had the shadow map as the sole render target of a framebuffer object, which we’ll need to swap to every time we execute the shadow map pass.

As long as we keep track of the texture bound as the render target, we can use it in a subsequent pass to assess occlusion information:

Executing the Lighting Pass

As I mentioned before, much of the volumetric light scattering is going to be driven by a single-phase function, which I’ll explain after the jump:

#version 460 core
layout (location = 3) out vec4 FragColor;

const int SCREEN_WIDTH = 2460;
const int SCREEN_HEIGHT = 1440;
const int NUM_STEPS_INT = 15;
const float NUM_STEPS = float(NUM_STEPS_INT);
const float G = 0.7f; // Controls how much light will scatter in the forward direction
const float PI = 3.14159265359f;
const float SPECULAR_FACTOR = 16.0f;
const mat4 DITHER_PATTERN = mat4
    (vec4(0.0f, 0.5f, 0.125f, 0.625f),
     vec4(0.75f, 0.22f, 0.875f, 0.375f),
     vec4(0.1875f, 0.6875f, 0.0625f, 0.5625f),
     vec4(0.9375f, 0.4375f, 0.8125f, 0.3125f));

struct Light
    vec3 _color;
    float _intensity;

struct DirectLight
    struct Light _light;
    vec3 _direction;

in vec2 TexCoords;

uniform sampler2D gPositionMap;
uniform sampler2D gNormalMap;
uniform sampler2D gAlbedoSpecularMap;
uniform sampler2D gShadowMap;

uniform DirectLight gDirectLight;
uniform mat4 gLightSpaceVP;
uniform vec3 gViewPos;
uniform float gAmbientFactor;

const vec3 P = texture(gPositionMap, TexCoords).rgb;
const vec3 N = texture(gNormalMap, TexCoords).rgb;
const vec4 ALBEDO_SPECULAR = texture(gAlbedoSpecularMap, TexCoords);
const vec4 LIGHT_SPACE_POS = gLightSpaceVP * vec4(P, 1.0f);
const vec3 LIGHT_SPACE_POS_POST_W = / LIGHT_SPACE_POS.w * 0.5f + 0.5f;
const vec3 L = normalize(-gDirectLight._direction);
const float NoL = dot(N, L);

// Mie-scattering phase function approximated by the Henyey-Greenstein phase function
float calcScattering(float cosTheta)
    return (1.0f - G * G) / (4.0f * PI * pow(1.0f + G * G - 2.0f * G * cosTheta, 1.5f));

float calcShadow()
    if (LIGHT_SPACE_POS_POST_W.z > 1.0f)
        return 0.0f;

    float bias = max(0.05f * (1.0f - NoL), 0.005f);
    float depth = texture(gShadowMap, LIGHT_SPACE_POS_POST_W.xy).r;
    return LIGHT_SPACE_POS_POST_W.z - bias > depth ? 1.0f : 0.0f;

void main()
    vec3 V = P - gViewPos;

    float stepSize = length(V) / NUM_STEPS;
    V = normalize(V);
    vec3 step = V * stepSize;

    vec3 position = gViewPos;
    position += step * DITHER_PATTERN[int(TexCoords.x * SCREEN_WIDTH) % 4][int(TexCoords.y * SCREEN_HEIGHT) % 4];

    vec3 volumetric = vec3(0.0f);
    for (int i = 0; i < NUM_STEPS_INT; ++i)
        vec4 lightSpacePos = gLightSpaceVP * vec4(position, 1.0f);
        vec3 lightSpacePosPostW = / lightSpacePos.w * 0.5f + 0.5f;
        float depth = texture(gShadowMap, lightSpacePosPostW.xy).r;
        if (depth > lightSpacePosPostW.z)
            volumetric += calcScattering(dot(V, -L)) * gDirectLight._light._color;
        position += step;
    volumetric /= NUM_STEPS;

    V = normalize(gViewPos - P);
    vec3 diffuse = clamp(NoL, 0.0f, 1.0f) * gDirectLight._light._color * ALBEDO_SPECULAR.rgb;
    vec3 H = normalize(L + V);
    vec3 specular = pow(clamp(dot(N, H), 0.0f, 1.0f), SPECULAR_FACTOR) * gDirectLight._light._color * ALBEDO_SPECULAR.a;

    float shadow = calcShadow();
    if (shadow < 1.0f)
        FragColor = vec4((1.0f - shadow) * (diffuse + specular) + volumetric, 1.0f);
        FragColor = vec4(gAmbientFactor * ALBEDO_SPECULAR.rgb + volumetric, 1.0f);

The function is very simple, and what it tells us is that given a point on the raymarch, if it is visible from the light’s perspective, then we need to add a calibrated amount determined by the angle between V and L to the running total that serves as the amount of volumetric lighting on a given fragment. Given that V and L are relatively static with respect to the rest of the shader implementation, I chose to add a consistent amount for every point on the raymarch that was not in shadow; however, I’m sure that other options for increased rendering effect are possible.

What we’re adding for a given point that’s exposed to the light is driven by the Henyey-Greenstein phase function:

Now, I went with a slightly different numerator in my implementation, but it’s worth testing different values of g and simplifying or reintegrating terms into our numerator to see what leads to the desired rendering effect. Obviously, such options can be exposed to artists as toggles or calibration bars.

So that was easy, but there are a few things I want to illustrate. Ignoring the dithering offset, which is a common strategy of introducing noise for dealing with anti-aliasing artifacts that result from insufficient sampling, a raymarch consisting of 100 steps is going to result in something that looks like this:

Now, that’s pretty good, but it’s always worth asking ourselves if it’s possible to achieve a similar effect using fewer steps in our raymarching. Let’s try 15:

That’s not quite what we had in mind. What if we were to reincorporate the Bayer matrix (i.e., DITHER_PATTERN) while maintaining the smaller number of raymarching steps:

Adding Bilateral Filtering

That’s much better. As per the advice in GPU Pro 5, I also ran a subsequent Gaussian blur over two image-space passes. The first pass uses the result from our lighting pass as an input to output onto the sole render target of a framebuffer created just for the filtering (and because the framebuffer containing data from the geometry pass could not support additional render targets by default). The second pass—the vertical-filtering pass—then operates using the result from the first blur and renders onto the render target of our default framebuffer, which is precisely what gets displayed to us on screen. Here is the shader, the setup on the CPU side, and the result of the filtering:

#version 460 core
out vec4 FragColor;

const float H_BLUR = 3.0f / 2460.0f; // Sample radius in texels
const float V_BLUR = 3.0f / 1440.0f;

in vec2 TexCoords;

uniform sampler2D gPreviousFrameMap;
uniform vec2 gDir;

void main()
    vec4 sum = vec4(0.0f);

    // 9-tap Gaussian filter
    sum += texture(gPreviousFrameMap, vec2(TexCoords.x - 4.0f * H_BLUR * gDir.x, TexCoords.y - 4.0f * V_BLUR * gDir.y)) * 0.0162162162f;
    sum += texture(gPreviousFrameMap, vec2(TexCoords.x - 3.0f * H_BLUR * gDir.x, TexCoords.y - 3.0f * V_BLUR * gDir.y)) * 0.0540540541f;
    sum += texture(gPreviousFrameMap, vec2(TexCoords.x - 2.0f * H_BLUR * gDir.x, TexCoords.y - 2.0f * V_BLUR * gDir.y)) * 0.1216216216f;
    sum += texture(gPreviousFrameMap, vec2(TexCoords.x - 1.0f * H_BLUR * gDir.x, TexCoords.y - 1.0f * V_BLUR * gDir.y)) * 0.1945945946f;

    sum += texture(gPreviousFrameMap, TexCoords) * 0.2270270270f;

    sum += texture(gPreviousFrameMap, vec2(TexCoords.x + 1.0f * H_BLUR * gDir.x, TexCoords.y + 1.0f * V_BLUR * gDir.y)) * 0.1945945946f;
    sum += texture(gPreviousFrameMap, vec2(TexCoords.x + 2.0f * H_BLUR * gDir.x, TexCoords.y + 2.0f * V_BLUR * gDir.y)) * 0.1216216216f;
    sum += texture(gPreviousFrameMap, vec2(TexCoords.x + 3.0f * H_BLUR * gDir.x, TexCoords.y + 3.0f * V_BLUR * gDir.y)) * 0.0540540541f;
    sum += texture(gPreviousFrameMap, vec2(TexCoords.x + 4.0f * H_BLUR * gDir.x, TexCoords.y + 4.0f * V_BLUR * gDir.y)) * 0.0162162162f;

    FragColor = vec4(sum.rgb, 1.0f);
void execBlurPasses(const phoenix::Shader& blurShader)

    glBindFramebuffer(GL_FRAMEBUFFER, blurRenderTarget->_FBO);
    blurShader.setVec2(phoenix::G_DIR, glm::vec2(1.0f, 0.0f));
    utils->renderQuad(blurShader, previousFrameMap);

    glBindFramebuffer(GL_FRAMEBUFFER, 0);
    blurShader.setVec2(phoenix::G_DIR, glm::vec2(0.0f, 1.0f));
    utils->renderQuad(blurShader, blurRenderTarget->_textureID);

That’s definitely a nice softening of the dithering artifacts. Here’s another perspective to end off on using the initial 100 samples:

All in all, this is a great option to add to your scenes for situations that would benefit from the kinds of effects illustrated here. I highly recommend checking out sections 3.5 and onward in GPU Pro 5 for the remainder of Vos’s work, which explains how to control the amount of scattering so that we can yield a dustier and smokier feel to the rather uniform appearance we have thus far. The book also brings up how to deal with transparent objects and suggests potential approaches for additional improvements.


GPU Pro 5
Alexandre Pestana’s Post on Volumetric Lights

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s