Vulkan Tutorial - Shaders

Welcome. This is my second Vulkan tutorial and things are about to get nasty in here. We will pick up right where we left off with our basic triangle rendering. If you haven't read the 101 tutorial, I strongly recommend you go read it first before tackling this one.

We start by setting up our shader interface with proper projection, view, and model matrices. Then we make lighting happen, and finally we finish off with some texture mapping. Truth be told, I left out some important concepts and code from our first tutorial. As you will see, the things I left out would have overcomplicated our 101 tutorial. Regardless, I feel that a "Vulkan Tutorial" is not complete without the information I will share in this section.

But first, a few words on the previous tutorial. I would like to thank everyone who sent me their awesome feedback. Much appreciated. I was not really expecting all the interest generated around Vulkan. And... about Vulkan, my point on some of the misconceptions is that Vulkan is not really like OpenGL or DirectX... It is much lower level. I was not involved in creating this API, but from my use and experience I feel I could kinda write OpenGL on top of Vulkan (but not Vulkan on top of OpenGL)... that is my point of view anyway. So, yes, it's complicated (overcomplicated, even...). Is it for you? Probably not, because you are probably one of the 80% that just want to use Unreal/Unity/whatever engine... and that is fine. But if you are part of the 20% that will write the tools for the other 80%, Vulkan seems to have quite the potential. I mean, the amount of room for third-party/middleware solutions is huge. Was this by design?

Now, with that said, let's get back to the quirky details. I will follow the same principles from the 101 tutorial. This includes posting all the code and the commit id so you can checkout the code yourself from this repo (the same as previous tutorial):

git clone https://bitbucket.org/jose_henriques/vulkan_tutorial.git

I am not writing a framework. I am showing you all the code in place, because for a tutorial I find that ideal.

[Someone asked me why I am doing all this from scratch... well, from my own experience, you cannot simply say you understood the solution if you just grabbed a third-party library (that you understand) that fixes/solves your main problem... you still don't understand it even if you made it work. Probably now you think you do, and you will go on making assumptions and creating code that does not match reality. If I'm learning it, I should actually learn the right thing, no? Did you notice how code nowadays is just awful? Allow me to just shout out VIVA À PROGRAMAÇÃO ("long live programming" — and I am NOT shouting that in Spanish... you know... historical divergences and all :) (just kidding!)).]

Ok, we came clean. So, what about some Vulkan? Let's start by creating a test matrix and uploading it to our shaders.

Create Uniform Buffer

[Commit: 342dc89]

One of the most important features missing from the previous tutorial is that we are not passing any parameters to our shaders. We want to pass some matrices, and probably other shader parameters that do not change often. These are our uniforms, and we are going to set them up. We start by using familiar code to create a buffer to hold one 4x4 matrix.

To do so, we need a VkBuffer that we need to create and some VkDeviceMemory that we need to allocate and bind to the buffer. We store both in a new structure called shader_uniform and add one new variable of that type to our vulkan_context:

struct shader_uniform {
    VkBuffer buffer;
    VkDeviceMemory memory;
};

struct vulkan_context {
    // ...as before...
    shader_uniform uniforms;
};

Right after our shader module creation we will add the creation of our buffer. For now we define one identity matrix and make that the buffer's content. This code is very similar to the code we used to create our vertex buffer:

float identityMatrix[16] = { 1, 0, 0, 0,
                             0, 1, 0, 0,
                             0, 0, 1, 0,
                             0, 0, 0, 1 };

// create our uniforms buffers:
VkBufferCreateInfo bufferCreateInfo = {};
bufferCreateInfo.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
bufferCreateInfo.size = sizeof(float) * 16;                    // size in bytes
bufferCreateInfo.usage = VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT;   // <--
bufferCreateInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;

Notice how we are allocating just enough space for our 4x4 matrix and that the usage also changed to VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT. The remaining code should be familiar by now:

result = vkCreateBuffer( context.device, &bufferCreateInfo, NULL, &context.uniforms.buffer );  
checkVulkanResult( result, "Failed to create uniforms buffer." );

// allocate memory for buffer:
VkMemoryRequirements bufferMemoryRequirements = {};
vkGetBufferMemoryRequirements( context.device, context.uniforms.buffer, &bufferMemoryRequirements );

VkMemoryAllocateInfo matrixAllocateInfo = {};
matrixAllocateInfo.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
matrixAllocateInfo.allocationSize = bufferMemoryRequirements.size;

uint32_t uniformMemoryTypeBits = bufferMemoryRequirements.memoryTypeBits;
VkMemoryPropertyFlags uniformDesiredMemoryFlags = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT;
for( uint32_t i = 0; i < 32; ++i ) {
    VkMemoryType memoryType = context.memoryProperties.memoryTypes[i];
    if( uniformMemoryTypeBits & 1 ) {
        if( ( memoryType.propertyFlags & uniformDesiredMemoryFlags ) == uniformDesiredMemoryFlags ) {
            matrixAllocateInfo.memoryTypeIndex = i;
            break;
        }
    }
    uniformMemoryTypeBits = uniformMemoryTypeBits >> 1;
}

result = vkAllocateMemory( context.device, &matrixAllocateInfo, NULL, &context.uniforms.memory );
checkVulkanResult( result, "Failed to allocate uniforms buffer memory." );

result = vkBindBufferMemory( context.device, context.uniforms.buffer, context.uniforms.memory, 0 );
checkVulkanResult( result, "Failed to bind uniforms buffer memory." );

The only thing left to do is map the memory and copy our identity matrix into the buffer:

void *matrixMapped;
result = vkMapMemory( context.device, context.uniforms.memory, 0, VK_WHOLE_SIZE, 0, 
                      &matrixMapped );
checkVulkanResult( result, "Failed to map uniform buffer memory." );

memcpy( matrixMapped, &identityMatrix, sizeof(float) * 16 );

vkUnmapMemory( context.device, context.uniforms.memory );

Ok, so we have a buffer where we wrote our matrix into, but now how do we make it a shader parameter?


[Commit: e3a65c1]

We must now talk about descriptors, descriptor sets, and descriptor set layouts. They are the reason why I did not include this code in the previous tutorial.

A descriptor is an opaque data structure that represents a shader resource. Descriptors are grouped together into descriptor set objects, which are opaque objects containing storage for a set of descriptors. The type and number of descriptors in a descriptor set is defined by the descriptor set layout.

Vulkan supports a number of descriptor types. We will be using the VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER type for our uniform buffer and VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER for our texture/sampler (later in the tutorial). (Chapter 13 of the Vulkan specification has a description of all the descriptor types Vulkan supports. I recommend you read that chapter anyway.)

Our shaders access buffer and image resources through special variables that are indirectly bound to buffer and image views by Vulkan. These are organised into sets of bindings that match our Vulkan descriptor sets. Finally, the set layout objects are used to map the resources that need to be associated with the descriptor set, and to define the interface between shader stages and shader resources. Pragmatically, they are the high-level glue we use to associate each descriptor binding with memory or some other hardware resource. This might be easier to understand once I show you some code.

If you remember, or check the code where we create our pipeline layout, you will notice that we set no setLayout. We need to create our bindings and then create our set layout. Let us start by creating the one VkDescriptorSetLayoutBinding for our uniform buffer, and then create the VkDescriptorSetLayout:

VkDescriptorSetLayoutBinding bindings = {};
bindings.binding = 0;
bindings.descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
bindings.descriptorCount = 1;
bindings.stageFlags = VK_SHADER_STAGE_VERTEX_BIT;
bindings.pImmutableSamplers = NULL;

Take note of the binding id: 0. We will need it when coding our shaders. The rest of the binding creation should be self-explanatory. Notice that we will only be able to use this uniform from the vertex shader. You could change it to VK_SHADER_STAGE_ALL_GRAPHICS if, for example, you planned to access this set binding from all shader stages, or to any other combination of VkShaderStageFlagBits flags.

Then we create our set layout with the newly created single binding:

VkDescriptorSetLayoutCreateInfo setLayoutCreateInfo = {};
setLayoutCreateInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
setLayoutCreateInfo.bindingCount = 1;
setLayoutCreateInfo.pBindings = &bindings;

VkDescriptorSetLayout setLayout;
result = vkCreateDescriptorSetLayout( context.device, &setLayoutCreateInfo, NULL, &setLayout );
checkVulkanResult( result, "Failed to create DescriptorSetLayout." );

Remember I told you that the descriptor set contains storage for the descriptors? Yeap, we need to allocate some memory next. To do so we create a VkDescriptorPool from which we will be able to allocate our descriptor set:

VkDescriptorPoolSize uniformBufferPoolSize = {};
uniformBufferPoolSize.type = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
uniformBufferPoolSize.descriptorCount = 1;

VkDescriptorPoolCreateInfo poolCreateInfo = {};
poolCreateInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO;
poolCreateInfo.maxSets = 1;
poolCreateInfo.poolSizeCount = 1;
poolCreateInfo.pPoolSizes = &uniformBufferPoolSize;

VkDescriptorPool descriptorPool;
result = vkCreateDescriptorPool( context.device, &poolCreateInfo, NULL, &descriptorPool );
checkVulkanResult( result, "Failed to create descriptor pool." );

We are very conservative and ask it to be created with just enough space for one uniform buffer descriptor. Once we have the descriptor pool, we can allocate our descriptor set:

VkDescriptorSetAllocateInfo descriptorAllocateInfo = {};
descriptorAllocateInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
descriptorAllocateInfo.descriptorPool = descriptorPool;
descriptorAllocateInfo.descriptorSetCount = 1;
descriptorAllocateInfo.pSetLayouts = &setLayout;

result = vkAllocateDescriptorSets( context.device, &descriptorAllocateInfo, 
                                   &context.descriptorSet );
checkVulkanResult( result, "Failed to allocate descriptor sets." );

Notice that we pass the set layout as a parameter for the allocation. This is because the set layout actually has the information about how and what to allocate (the "glue"). We store this set in our context as we will need it later in our rendering routine.

Almost done with this stage, but there is one very important thing left to do. Our descriptor set has just been allocated and, as such, is "largely uninitialised" (whatever that means!). As a matter of fact, I managed to crash my PC several times at this point because, as per the spec, "all entries that are statically used by a pipeline in a drawing or dispatching command must have been populated before the descriptor set is bound for use by that command". Fine, crashing my system is cool, but ok, let's update the descriptor set by writing our identity matrix to it (via our uniform buffer):

// When a set is allocated all values are undefined and all 
// descriptors are uninitialised. must init all statically used bindings:
VkDescriptorBufferInfo descriptorBufferInfo = {};
descriptorBufferInfo.buffer = context.uniforms.buffer;
descriptorBufferInfo.offset = 0;
descriptorBufferInfo.range = VK_WHOLE_SIZE;

VkWriteDescriptorSet writeDescriptor = {};
writeDescriptor.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
writeDescriptor.dstSet = context.descriptorSet;
writeDescriptor.dstBinding = 0;
writeDescriptor.dstArrayElement = 0;
writeDescriptor.descriptorCount = 1;
writeDescriptor.descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
writeDescriptor.pImageInfo = NULL;
writeDescriptor.pBufferInfo = &descriptorBufferInfo;
writeDescriptor.pTexelBufferView = NULL;

vkUpdateDescriptorSets( context.device, 1, &writeDescriptor, 0, NULL );

And we can go back to our pipeline layout creation and update it to use our set layout:

VkPipelineLayoutCreateInfo layoutCreateInfo = {};
layoutCreateInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
layoutCreateInfo.setLayoutCount = 1;
layoutCreateInfo.pSetLayouts = &setLayout;

Shader Bindings

[Commit: b75f798]

We are still not making any use of our previous work. We only need a couple more steps to pass and use our uniform buffer. In our render function we need to add one command to bind our descriptor set to the graphics pipeline. We do it with a call to vkCmdBindDescriptorSets before our draw commands. Once bound to our graphics pipeline, subsequent rendering calls recorded to our command buffer will use the bindings in our descriptor set, until we bind another descriptor set.

vkCmdBindDescriptorSets( context.drawCmdBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS, 
                         context.pipelineLayout, 0, 1, &context.descriptorSet, 0, NULL );

We can now access our matrix from our vertex shader. In GLSL, the set and binding number are assigned via the layout qualifier. It is important to note that array elements are implicitly assigned consecutive positions, starting with index zero for the first element. We will see what this means later. For now, we update the shader code to:

layout( std140, binding = 0 ) uniform buffer {
    mat4 matrix;
} UBO;

layout( location = 0 ) in vec4 pos;

void main() {
    gl_Position = pos * UBO.matrix;
}
Ok, some explaining is due here about shader interfaces, in particular the vertex input interface. The vertex shader input variables (the ones we decorated with the in keyword) form an interface with the vertex input attributes. They are made to match (by using the layout decoration) the input attributes that we set in the pVertexInputState member of our VkGraphicsPipelineCreateInfo. If you check that code you will see that we defined binding 0 to hold our vertex position attributes.

Our uniform variables are bound by the descriptor set shader interface. Notice the use of the binding decoration to reference our VkDescriptorSetLayoutBinding binding 0. Also, our uniform buffer block must be laid out according to some strict rules, which the std140 layout rules in GLSL satisfy (check chapter 14.5.4 of the specification for more info).

After running it, you should in fact see the same as before: our beautiful blue-ish triangle. Just look at that. Sigh... We did not do all this work for nothing, I promise you! So, we need to talk matrices. This could be a huge topic for discussion, but not here. I like multiplying my matrices on the right. And I like to have my translations in the last column. And that's that. If you prefer it the other way around, don't forget to transpose your matrices and multiply on the left.

Now, let's move the triangle to the right a bit just to make sure all is well. Go back to our identity matrix declaration and change it to this:

float identityMatrix[16] = { 1, 0, 0, 0.5,   // <-- here
                             0, 1, 0, 0,
                             0, 0, 1, 0,
                             0, 0, 0, 1 };

Recompile and run, and you should see the triangle slide to the right. All is well. Wasn't that fun?
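
If you want to convince yourself of what that translation entry does, here is a tiny CPU-side sketch (plain C++, nothing Vulkan-specific; `transformPoint` is a made-up helper for this demo). It applies a row-major matrix with the translation in the last column to a column vector, which is mathematically the same thing the shader's `pos * UBO.matrix` computes, because our row-major float array is read transposed into GLSL's column-major mat4:

```cpp
#include <cassert>

// Multiply a row-major 4x4 matrix (translation in the last column) with a
// column vector: out = M * v.
inline void transformPoint( const float m[16], const float v[4], float out[4] ) {
    for( int row = 0; row < 4; ++row ) {
        out[row] = m[row * 4 + 0] * v[0] + m[row * 4 + 1] * v[1] +
                   m[row * 4 + 2] * v[2] + m[row * 4 + 3] * v[3];
    }
}
```

Feeding it the modified identity matrix and the origin gives a point shifted by 0.5 along x, which is exactly the slide we see on screen.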


[Commit: 5ea6c1b]

Ok, let's get a bit more serious. Our goal is to fill our uniform buffer with our projection, view, and model matrices. We update the code and start by creating our matrices:

const double PI = 3.14159265359;
const double TORAD = PI / 180.0;

// perspective projection parameters:
float fov = 45.0f;
float nearZ = 0.1f;
float farZ = 1000.0f;

float aspectRatio = context.width / (float)context.height;
float t = 1.0f / tan( fov * TORAD * 0.5 );
float nf = nearZ - farZ;

float projectionMatrix[16] = { t / aspectRatio, 0, 0, 0,
                               0, t, 0, 0,
                               0, 0, (-nearZ-farZ) / nf, (2*nearZ*farZ) / nf,
                               0, 0, 1, 0 };

float viewMatrix[16] = { 1, 0, 0, 0,
                         0, 1, 0, 0,
                         0, 0, 1, 0,
                         0, 0, 0, 1 };

float modelMatrix[16] = { 1, 0, 0, 0,
                          0, 1, 0, 0,
                          0, 0, 1, 0,
                          0, 0, 0, 1 };
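
Before moving on, you can sanity-check the projection matrix we just defined without deriving it. A small sketch (plain C++; `projectDepth` is a made-up helper) applies the third and fourth rows to a view-space depth and performs the perspective divide by hand. A point at nearZ should land at -1 and a point at farZ at +1 in normalized device coordinates:

```cpp
#include <cassert>
#include <cmath>

// Apply rows 2 and 3 of the projection matrix to a view-space depth z and do
// the perspective divide manually, returning the NDC depth.
inline float projectDepth( float z, float nearZ, float farZ ) {
    float nf = nearZ - farZ;
    float clipZ = z * ( -nearZ - farZ ) / nf + ( 2 * nearZ * farZ ) / nf; // row 2
    float clipW = z;                                                      // row 3 is (0, 0, 1, 0)
    return clipZ / clipW; // NDC depth, in [-1, 1] between the clip planes
}
```

In the real pipeline the divide by clipW is done for us by the fixed-function stage after the vertex shader.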

There we go. Our "Model-View-Projection" matrices. I'm not going to show how to derive the projection matrix, sorry. Also, no perspective divide: the fixed-function pipeline will do that for us. You know what, let's throw in something else:

context.cameraZ = 10.0f;
context.cameraZDir = -1.0f;
viewMatrix[11] = context.cameraZ;

// store matrices in our context uniforms
context.uniforms.projectionMatrix = projectionMatrix;
context.uniforms.viewMatrix = viewMatrix;
context.uniforms.modelMatrix = modelMatrix;

Notice we are storing our matrices and some extra data in our context. You can probably guess what this is for... it will allow us to animate our camera and test our MVP. Let's continue. We need to update the allocation and upload code for our uniforms; after all, we are now passing 3 matrices:

VkBufferCreateInfo bufferCreateInfo = {};

bufferCreateInfo.size = sizeof(float) * 16 * 3; // <-- updated size to hold 3 4x4 matrices

bufferCreateInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;

The question in your mind should be: now that we have three matrices, how do we interface with the shaders? The good news is there is nothing to do about our descriptors. We still have only one descriptor and one binding. What we do need to do is fill the buffer with our new matrices...

void *matrixMapped;
result = vkMapMemory( context.device, context.uniforms.memory, 0, VK_WHOLE_SIZE, 0, 
                      &matrixMapped );
checkVulkanResult( result, "Failed to map uniform buffer memory." );

memcpy( matrixMapped, &projectionMatrix, sizeof( projectionMatrix ) );       // <-- new
memcpy( ((float *)matrixMapped + 16), &viewMatrix, sizeof( viewMatrix ) );   // <-- new
memcpy( ((float *)matrixMapped + 32), &modelMatrix, sizeof( modelMatrix ) ); // <-- new

vkUnmapMemory( context.device, context.uniforms.memory );

...and then, in our shader code, update our uniform block to have two more mat4s. Remember the big fuss I made about array elements being implicitly assigned consecutively? Well, this is why it matters: the projection_matrix takes the first sixteen floats, the view_matrix the next sixteen, and the model_matrix the last ones.

layout (std140, binding = 0) uniform buffer {
    mat4 projection_matrix;
    mat4 view_matrix;
    mat4 model_matrix;
} UBO;

layout (location = 0) in vec4 pos;

void main() {
    gl_Position = pos * ( UBO.model_matrix * UBO.view_matrix * UBO.projection_matrix );
}
Now, how do we update, let's say, the view matrix? This is surprisingly straightforward, if not for the care we must take in synchronising our buffer updates. Let's start with the easy part, updating our camera. In our render function, right at the top:

void render( ) {

    // some shenanigans to animate camera:
    if( context.cameraZ <= 1 ) {
        context.cameraZ = 1;
        context.cameraZDir = 1;
    } else if( context.cameraZ >= 10 ) {
        context.cameraZ = 10;
        context.cameraZDir = -1;
    }
    context.cameraZ += context.cameraZDir * 0.01f;
    context.uniforms.viewMatrix[11] = context.cameraZ;

That should animate our camera to move it back and forth from our triangle.

Next, we must update our uniform buffer. This is a matter of mapping our uniforms buffer memory and writing the current values. The only problem is that we must make these writes affect the current frame. This means making sure that the device memory is coherent with the mapped memory we just wrote, before we submit our command buffers to our queues. This can be achieved by allocating memory from a heap with the VK_MEMORY_PROPERTY_HOST_COHERENT_BIT property, or by calling vkFlushMappedMemoryRanges() and vkInvalidateMappedMemoryRanges() to make sure the mapped memory is coherent between host and device. Our memory is not on such a heap, so we manually call vkFlushMappedMemoryRanges(). Also, we make sure execution only starts once we have uploaded our new uniforms buffer, by placing a memory barrier at the top of our command buffer:

    // still in our render():
    // update shader uniforms:
    void *matrixMapped;
    vkMapMemory( context.device, context.uniforms.memory, 0, VK_WHOLE_SIZE, 0, &matrixMapped );
    memcpy( matrixMapped, context.uniforms.projectionMatrix, sizeof(float) * 16 );
    memcpy( ((float *)matrixMapped + 16), context.uniforms.viewMatrix, sizeof(float) * 16 );
    memcpy( ((float *)matrixMapped + 32), context.uniforms.modelMatrix, sizeof(float) * 16 );

    // flush device memory:
    VkMappedMemoryRange memoryRange = {};
    memoryRange.sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE;
    memoryRange.memory = context.uniforms.memory;
    memoryRange.offset = 0;
    memoryRange.size = VK_WHOLE_SIZE;
    vkFlushMappedMemoryRanges( context.device, 1, &memoryRange );

    vkUnmapMemory( context.device, context.uniforms.memory );

    // right after beginning our command buffer recording:
    vkBeginCommandBuffer( context.drawCmdBuffer, &beginInfo );

    // barrier for reading from uniform buffer after all writing is done:
    VkMemoryBarrier uniformMemoryBarrier = {};
    uniformMemoryBarrier.sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER;
    uniformMemoryBarrier.srcAccessMask = VK_ACCESS_HOST_WRITE_BIT;
    uniformMemoryBarrier.dstAccessMask = VK_ACCESS_UNIFORM_READ_BIT;

    vkCmdPipelineBarrier( context.drawCmdBuffer,
                          VK_PIPELINE_STAGE_HOST_BIT, VK_PIPELINE_STAGE_VERTEX_SHADER_BIT, 0,
                          1, &uniformMemoryBarrier,
                          0, NULL,
                          0, NULL );

    // the remainder of our render function as before
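
The explicit flush above is only needed because our memory type is host-visible but not host-coherent. That decision can be captured in a tiny helper (a plain C++ sketch; the helper name is made up, and the bit values mirror VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT and VK_MEMORY_PROPERTY_HOST_COHERENT_BIT from vulkan_core.h):

```cpp
#include <cassert>
#include <cstdint>

// Bit values as defined in vulkan_core.h:
constexpr uint32_t kHostVisibleBit  = 0x00000001; // VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
constexpr uint32_t kHostCoherentBit = 0x00000002; // VK_MEMORY_PROPERTY_HOST_COHERENT_BIT

// A mapped write needs an explicit vkFlushMappedMemoryRanges() only when the
// memory type is host-visible but NOT host-coherent.
inline bool needsExplicitFlush( uint32_t memoryPropertyFlags ) {
    return ( memoryPropertyFlags & kHostVisibleBit ) != 0 &&
           ( memoryPropertyFlags & kHostCoherentBit ) == 0;
}
```

If you picked a host-coherent memory type in the allocation loop instead, you could skip the flush call entirely.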

Your triangle should now be having a party on your screen. At this point we have our model-view-projection going and we are effectively rendering in "3D".

Our blue-ish triangle is getting old and dull... let's do something about it.

Vertex Attributes

[Commit: 16f58e5]

To make our triangle prettier we need to add some extra vertex attributes, such as normals and UV mapping. We update our vertex structure and add these new attributes...

struct vertex {
    float x, y, z, w;
    float nx, ny, nz;
    float u, v;
};
...and update our three vertices to some useful values:

vertex *triangle = (vertex *) mapped;
vertex v1 = { -1.0f, -1.0f, 0.0f, 1.0f,    // position
               0.0f, -1.0f, 0.0f,          // normal
               0.0f, 0.0f };               // uvs
vertex v2 = {  1.0f, -1.0f, 0.0f, 1.0f,
               0.0f, -1.0f, 0.0f,
               1.0f, 0.0f };
vertex v3 = {  0.0f,  1.0f, 0.0f, 1.0f,
               0.0f, 0.0f, 1.0f,
               0.5f, 1.0f };
triangle[0] = v1;
triangle[1] = v2;
triangle[2] = v3;

We give the two top vertices a weird normal pointing along the Y axis, and make the third, bottom vertex's normal point towards the screen... This is on purpose, and will allow us to test our illumination code later. Now, that was the easy part. We need to go back to our vertex input configuration code and change it to create the right bindings:

VkVertexInputAttributeDescription vertexAttributeDescription[3];
// position:
vertexAttributeDescription[0].location = 0;     // <--
vertexAttributeDescription[0].binding = 0;
vertexAttributeDescription[0].format = VK_FORMAT_R32G32B32A32_SFLOAT;
vertexAttributeDescription[0].offset = 0;

// normals:
vertexAttributeDescription[1].location = 1;     // <--
vertexAttributeDescription[1].binding = 0;
vertexAttributeDescription[1].format = VK_FORMAT_R32G32B32_SFLOAT;
vertexAttributeDescription[1].offset = 4 * sizeof(float);

// texture coordinates:
vertexAttributeDescription[2].location = 2;     // <--
vertexAttributeDescription[2].binding = 0;
vertexAttributeDescription[2].format = VK_FORMAT_R32G32_SFLOAT;
vertexAttributeDescription[2].offset = (4 + 3) * sizeof(float);

Note how we are setting the location for each of the attributes. These are the values we need to match in our shader code later. The only thing missing is passing the correct values to the VkPipelineVertexInputStateCreateInfo:

VkPipelineVertexInputStateCreateInfo vertexInputStateCreateInfo = {};
vertexInputStateCreateInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO;
vertexInputStateCreateInfo.vertexBindingDescriptionCount = 1;
vertexInputStateCreateInfo.pVertexBindingDescriptions = &vertexBindingDescription;
vertexInputStateCreateInfo.vertexAttributeDescriptionCount = 3;                       // <-- new
vertexInputStateCreateInfo.pVertexAttributeDescriptions = vertexAttributeDescription; // <-- new

Ok, that is it for the pipeline setup. Now we update the shader code. We start with the vertex shader, where we first map our vertex input interfaces by declaring a couple more in variables, adding the layout decoration to match our normal (1) and UV mapping (2) locations in our vertex attributes. We also define our vertex out interface, which must match the input interface of the next stage (in our case the fragment shader), by creating a struct named vertex_out. The members of this struct make up the interface between our vertex stage and our fragment stage:

layout( location = 0 ) in vec4 pos;
layout( location = 1 ) in vec3 normal;
layout( location = 2 ) in vec2 uv;

layout( location = 0 ) out struct vertex_out {
    vec4 vColor;
    vec3 normal;
    vec2 uv;
    vec3 camera;
} OUT;

All that is left is to write the actual vertex shader:

void main() {
    mat4 modelView = UBO.model_matrix * UBO.view_matrix;

    gl_Position = pos * ( modelView * UBO.projection_matrix );

    OUT.vColor = vec4( 0, 0.5, 1.0, 1 );
    OUT.uv = uv;
    OUT.normal = (vec4( normal, 0.0 ) * inverse( modelView )).xyz;
    OUT.camera = vec3( UBO.view_matrix[3][1], UBO.view_matrix[3][2], UBO.view_matrix[3][3] );
}


(If you are wondering what on earth I am doing for the normal calculation... you have some reading to do. Search for "proper normal transformation adjoint matrix".)
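
For the impatient, here is the gist of why the inverse(-transpose) shows up, as a tiny plain C++ sketch (the helper names are made up). Under a non-uniform scale, transforming the normal with the model matrix itself breaks perpendicularity with the surface, while the inverse-transpose preserves it:

```cpp
#include <cassert>

// Apply a diagonal (pure scale) matrix to a 3D vector.
inline void scaleDiag( const float s[3], const float v[3], float out[3] ) {
    out[0] = s[0] * v[0];
    out[1] = s[1] * v[1];
    out[2] = s[2] * v[2];
}

inline float dot3( const float a[3], const float b[3] ) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Example scenario: the plane x + z = 0 has normal n = (1, 0, 1) and tangent
// t = (1, 0, -1). Under the scale S = diag(2, 1, 1), the tangent stays on the
// surface (S*t), but S*n is no longer perpendicular to it; the inverse-
// transpose of S, diag(0.5, 1, 1), applied to n still is.
```

A tangent lies in the surface, so a correctly transformed normal must stay perpendicular to the transformed tangent; that is exactly the property the inverse-transpose preserves.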

In the fragment shader we can now define our input as a mirror of the vertex_out:

layout ( location = 0 ) in struct fragment_in {
    vec4 vColor;
    vec3 normal;
    vec2 uv;
    vec3 camera;
} IN;

With this information we do some simple illumination where we assume there is a light at the camera position:

void main() {
    vec3 L = normalize( IN.camera );
    float NdotL = max( dot( IN.normal, L ), 0.0f );
    uFragColor = NdotL * IN.vColor;
}


That should produce a slightly less boring triangle rendering... but I think we can improve it a bit more.

Texture Mapping

[Commit: 3f380e6]

One way we can improve our rendering is to make use of the UV mapping to texture our triangle. I hope that by now you are already thinking about the things we need to do. First, we need a VkImage and a VkImageView. We also need to allocate memory for the image and somehow fill it up. Those are things you should be familiar with already.

But before we get into that code, we need to load an image... well, I tried, but I could not fit a bmp loader into a small enough code sample... So I decided we are creating some programmer art! A checkerboard texture will do, and we will generate it on the fly. Here it is:

struct loaded_image {
    int width;
    int height;
    void *data;
};

loaded_image testImage;
testImage.width = 800;
testImage.height = 600;
testImage.data = (void *) new float[ testImage.width * testImage.height * 3 ];

for( int x = 0; x < testImage.width; ++x ) {
    for( int y = 0; y < testImage.height; ++y ) {
        float g = 0.3;
        if( x % 40 < 20 && y % 40 < 20 ) {
            g = 1;
        }
        if( x % 40 >= 20 && y % 40 >= 20 ) {
            g = 1;
        }

        float *pixel = ((float *) testImage.data) + ( x * testImage.height * 3 ) + ( y * 3 );
        pixel[0] = g * 0.4;
        pixel[1] = g * 0.5;
        pixel[2] = g * 0.7;
    }
}

Feel free to use your own image loader here. For our purposes, this is just fine.

Next we create our image:

VkImageCreateInfo textureCreateInfo = {};
textureCreateInfo.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
textureCreateInfo.imageType = VK_IMAGE_TYPE_2D;
textureCreateInfo.format = VK_FORMAT_R32G32B32_SFLOAT;
textureCreateInfo.extent = { (uint32_t)testImage.width, (uint32_t)testImage.height, 1 };
textureCreateInfo.mipLevels = 1;
textureCreateInfo.arrayLayers = 1;
textureCreateInfo.samples = VK_SAMPLE_COUNT_1_BIT;
textureCreateInfo.tiling = VK_IMAGE_TILING_LINEAR;
textureCreateInfo.usage = VK_IMAGE_USAGE_SAMPLED_BIT;
textureCreateInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
textureCreateInfo.initialLayout = VK_IMAGE_LAYOUT_PREINITIALIZED;

VkImage textureImage;
result = vkCreateImage( context.device, &textureCreateInfo, NULL, &textureImage );
checkVulkanResult( result, "Failed to create texture image." );

Note that our initial layout is set to VK_IMAGE_LAYOUT_PREINITIALIZED. This tells Vulkan that we will be filling the image up ourselves and that, when changing layout, it should not discard the image contents. Also, by the spec, because we are using this initial layout we must set the tiling to VK_IMAGE_TILING_LINEAR.

Next is our allocation and binding of the image memory. Familiar code incoming:

VkMemoryRequirements textureMemoryRequirements = {};
vkGetImageMemoryRequirements( context.device, textureImage, &textureMemoryRequirements );

VkMemoryAllocateInfo textureImageAllocateInfo = {};
textureImageAllocateInfo.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
textureImageAllocateInfo.allocationSize = textureMemoryRequirements.size;

uint32_t textureMemoryTypeBits = textureMemoryRequirements.memoryTypeBits;
VkMemoryPropertyFlags tDesiredMemoryFlags = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT;
for( uint32_t i = 0; i < 32; ++i ) {
    VkMemoryType memoryType = context.memoryProperties.memoryTypes[i];
    if( textureMemoryTypeBits & 1 ) {
        if( ( memoryType.propertyFlags & tDesiredMemoryFlags ) == tDesiredMemoryFlags ) {
            textureImageAllocateInfo.memoryTypeIndex = i;
            break;
        }
    }
    textureMemoryTypeBits = textureMemoryTypeBits >> 1;
}

VkDeviceMemory textureImageMemory = {};
result = vkAllocateMemory( context.device, &textureImageAllocateInfo, NULL, &textureImageMemory );
checkVulkanResult( result, "Failed to allocate device memory." );

result = vkBindImageMemory( context.device, textureImage, textureImageMemory, 0 );
checkVulkanResult( result, "Failed to bind image memory." );

The only thing missing is uploading our awesome checkerboard texture. This code is very similar to the uniform buffer update, and we again make sure to flush the device memory:

void *imageMapped;
result = vkMapMemory( context.device, textureImageMemory, 0, VK_WHOLE_SIZE, 0, &imageMapped );
checkVulkanResult( result, "Failed to map image memory." );

memcpy( imageMapped, testImage.data, sizeof(float) * testImage.width * testImage.height * 3 );

VkMappedMemoryRange memoryRange = {};
memoryRange.sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE;
memoryRange.memory = textureImageMemory;
memoryRange.offset = 0;
memoryRange.size = VK_WHOLE_SIZE;
vkFlushMappedMemoryRanges( context.device, 1, &memoryRange );

vkUnmapMemory( context.device, textureImageMemory );

// we can clear the image data:
delete[] testImage.data;

Next we change the image layout from VK_IMAGE_LAYOUT_PREINITIALIZED to the layout the shader expects: VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL. (This is probably the most repeated code in the whole tutorial!)

    VkCommandBufferBeginInfo beginInfo = {};
    beginInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;

    vkBeginCommandBuffer( context.setupCmdBuffer, &beginInfo );

    VkImageMemoryBarrier layoutTransitionBarrier = {};
    layoutTransitionBarrier.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
    layoutTransitionBarrier.srcAccessMask = VK_ACCESS_HOST_WRITE_BIT;
    layoutTransitionBarrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
    layoutTransitionBarrier.oldLayout = VK_IMAGE_LAYOUT_PREINITIALIZED;
    layoutTransitionBarrier.newLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
    layoutTransitionBarrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    layoutTransitionBarrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    layoutTransitionBarrier.image = textureImage;
    VkImageSubresourceRange resourceRange = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 };
    layoutTransitionBarrier.subresourceRange = resourceRange;

    vkCmdPipelineBarrier(   context.setupCmdBuffer, 
                            VK_PIPELINE_STAGE_HOST_BIT, 
                            VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, 
                            0,
                            0, NULL,
                            0, NULL, 
                            1, &layoutTransitionBarrier );

    vkEndCommandBuffer( context.setupCmdBuffer );

    VkPipelineStageFlags waitStageMask[] = { VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT };
    VkSubmitInfo submitInfo = {};
    submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
    submitInfo.waitSemaphoreCount = 0;
    submitInfo.pWaitSemaphores = NULL;
    submitInfo.pWaitDstStageMask = waitStageMask;
    submitInfo.commandBufferCount = 1;
    submitInfo.pCommandBuffers = &context.setupCmdBuffer;
    submitInfo.signalSemaphoreCount = 0;
    submitInfo.pSignalSemaphores = NULL;
    result = vkQueueSubmit( context.presentQueue, 1, &submitInfo, submitFence );

    vkWaitForFences( context.device, 1, &submitFence, VK_TRUE, UINT64_MAX );
    vkResetFences( context.device, 1, &submitFence );
    vkResetCommandBuffer( context.setupCmdBuffer, 0 );

So, what is missing? ...right, the image view:

VkImageViewCreateInfo textureImageViewCreateInfo = {};
textureImageViewCreateInfo.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO;
textureImageViewCreateInfo.image = textureImage;
textureImageViewCreateInfo.viewType = VK_IMAGE_VIEW_TYPE_2D;
textureImageViewCreateInfo.format = VK_FORMAT_R32G32B32_SFLOAT;
textureImageViewCreateInfo.components = { VK_COMPONENT_SWIZZLE_R, 
                                          VK_COMPONENT_SWIZZLE_G, 
                                          VK_COMPONENT_SWIZZLE_B, 
                                          VK_COMPONENT_SWIZZLE_A };
textureImageViewCreateInfo.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
textureImageViewCreateInfo.subresourceRange.baseMipLevel = 0;
textureImageViewCreateInfo.subresourceRange.levelCount = 1;
textureImageViewCreateInfo.subresourceRange.baseArrayLayer = 0;
textureImageViewCreateInfo.subresourceRange.layerCount = 1;

VkImageView textureView;
result = vkCreateImageView( context.device, &textureImageViewCreateInfo, NULL, &textureView );
checkVulkanResult( result, "Failed to create image view." );

That was a lot of code. But we now have an image that we can sample from. How we sample the image is defined by a VkSampler. A sampler encapsulates the sampling state and is used by the implementation to read from the image, applying filtering and other transformations. Here is the code to create one:

VkSamplerCreateInfo samplerCreateInfo = {};
samplerCreateInfo.sType = VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO;
samplerCreateInfo.magFilter = VK_FILTER_LINEAR;
samplerCreateInfo.minFilter = VK_FILTER_LINEAR;
samplerCreateInfo.mipmapMode = VK_SAMPLER_MIPMAP_MODE_LINEAR;
samplerCreateInfo.addressModeU = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE;
samplerCreateInfo.addressModeV = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE;
samplerCreateInfo.addressModeW = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE;
samplerCreateInfo.mipLodBias = 0;
samplerCreateInfo.anisotropyEnable = VK_FALSE;
samplerCreateInfo.minLod = 0;
samplerCreateInfo.maxLod = 5;
samplerCreateInfo.unnormalizedCoordinates = VK_FALSE;

VkSampler sampler;
result = vkCreateSampler( context.device, &samplerCreateInfo, NULL, &sampler );
checkVulkanResult( result, "Failed to create sampler." );

Ok, we are done with the setup of our resources. Now we want to be able to sample from the texture in our fragment shader. To do so, we need to create a new binding in our descriptor set, of the VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER type. This type combines an image and a sampler and, per the spec, it may provide better performance on some platforms. It is also easier to use. So, let us update the descriptor set layout code:

VkDescriptorSetLayoutBinding bindings[2];

// uniform buffer for our matrices:
bindings[0].binding = 0;
bindings[0].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
bindings[0].descriptorCount = 1;
bindings[0].stageFlags = VK_SHADER_STAGE_VERTEX_BIT;
bindings[0].pImmutableSamplers = NULL;

// our example texture sampler:
bindings[1].binding = 1;
bindings[1].descriptorType = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
bindings[1].descriptorCount = 1;
bindings[1].stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT;
bindings[1].pImmutableSamplers = NULL;

VkDescriptorSetLayoutCreateInfo setLayoutCreateInfo = {};
setLayoutCreateInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
setLayoutCreateInfo.bindingCount = 2;
setLayoutCreateInfo.pBindings = bindings;

Notice that we also need to update our descriptor pool creation code, because the pool now needs to be able to provide the new descriptor type as well:

VkDescriptorPoolSize poolSizes[2];
poolSizes[0].type = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
poolSizes[0].descriptorCount = 1;
poolSizes[1].type = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
poolSizes[1].descriptorCount = 1;

VkDescriptorPoolCreateInfo poolCreateInfo = {}; 
poolCreateInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO;
poolCreateInfo.maxSets = 1;
poolCreateInfo.poolSizeCount = 2;
poolCreateInfo.pPoolSizes = poolSizes;

To finish our code we update our newly allocated combined sampler:

VkDescriptorImageInfo descriptorImageInfo = {};
descriptorImageInfo.sampler = sampler;
descriptorImageInfo.imageView = textureView;
descriptorImageInfo.imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;

writeDescriptor.dstSet = context.descriptorSet;
writeDescriptor.dstBinding = 1;
writeDescriptor.dstArrayElement = 0;
writeDescriptor.descriptorCount = 1;
writeDescriptor.descriptorType = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
writeDescriptor.pImageInfo = &descriptorImageInfo;
writeDescriptor.pBufferInfo = NULL;
writeDescriptor.pTexelBufferView = NULL;

vkUpdateDescriptorSets( context.device, 1, &writeDescriptor, 0, NULL );
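
One thing worth repeating from the 101 tutorial: updating the descriptor set does nothing by itself; the set still has to be bound while recording the draw commands. Assuming the context.pipelineLayout and context.drawCmdBuffer names from the previous tutorial (check the repo if yours differ), the bind looks like this:

vkCmdBindDescriptorSets( context.drawCmdBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS,
                         context.pipelineLayout, 0, 1, &context.descriptorSet, 0, NULL );

If you already had this call for the uniform buffer, nothing changes: both bindings live in the same set, so one bind covers them.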

Almost done. In the fragment shader we can declare our uniform sampler2D like this:

layout ( set = 0, binding = 1 ) uniform sampler2D mySampler;

Note how we map binding 1 from the same set as our uniform buffer in the vertex shader. This enables us to sample from the texture in the shader code:

void main() {
    vec3 L = normalize( IN.camera );
    float NdotL = max( dot( IN.normal, L ), 0.0f );
    uFragColor = NdotL * texture( mySampler, IN.uv );
}

And that completes our tutorial. I hope you can look back and understand why I decided to move this out of the 101 tutorial. Anyway, you should now have sufficient knowledge of Vulkan to go and figure out the rest of the API by yourself. Things I would investigate next are push constants for the shaders, how to actually pipeline a scene graph (where do you store the geometry, and how do you synchronise the model matrix updates?), and maybe how to set up another graphics pipeline to create some shadow maps.
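
If you want a head start on push constants, here is a rough sketch of the moving parts. Take it as a sketch only: layoutCreateInfo, modelMatrix and the context members are assumed names, not code from the repo.

VkPushConstantRange pushConstantRange = {};
pushConstantRange.stageFlags = VK_SHADER_STAGE_VERTEX_BIT;
pushConstantRange.offset = 0;
pushConstantRange.size = 16 * sizeof( float ); // one 4x4 model matrix

// added to the existing VkPipelineLayoutCreateInfo:
layoutCreateInfo.pushConstantRangeCount = 1;
layoutCreateInfo.pPushConstantRanges = &pushConstantRange;

// at draw-record time, no buffer or descriptor set needed:
vkCmdPushConstants( context.drawCmdBuffer, context.pipelineLayout,
                    VK_SHADER_STAGE_VERTEX_BIT, 0, 16 * sizeof( float ), modelMatrix );

On the shader side the matching declaration would be layout( push_constant ) uniform matrices { mat4 model; } pushConstants;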

I do not see myself writing another entry level tutorial... Also, I need to put some effort back into my own game engine! But, if you have an idea or suggestion please feel free to contact me at jhenriques@gmail.com.

Have a good one, JH.