Anton's Research Ramblings

Comparing glBufferSubData to glMapBufferRange

Yesterday I had a look at my demo code for OpenGL uniform buffer objects (UBOs). UBOs are pretty neat: they let you share blocks of uniform variable data between multiple shaders. The other feature is that they behave like any other GL buffer object, the same way as a vertex buffer object (VBO). In my demo I was using the older interface to put data into the buffer: glBufferData and glBufferSubData. The newer way to do this, which is recommended by the aficionados, is to use glMapBuffer and glMapBufferRange, respectively.

In my opinion, OpenGL's major shortcoming isn't some technical or programmatic thing, but its documentation, or rather, its style of documentation. You can read how to use OpenGL functions in the online reference pages (glMapBufferRange, for example), but there's no indication as to why or when to use the functions. You can sort-of get hints about this in the extension documentation, but really, no: we have to just try them all. My question here is "What are the technical and non-technical pros and cons of each alternative?".

Some Set Up

First, we create an empty buffer for the UBO. I'm going to store two 4x4 camera matrices in here. Now, I see the intention with UBOs and std140 is to set up an equivalent C struct that matches the block in the shaders, but honestly I find that style a bit gross (I'm not an OO person), so I'm just going to use simple pointers. I also suspect that it can easily come unstuck and become an incorrect analogy if you're not using length-4 vectors.
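A minimal sketch of the byte arithmetic involved, assuming 4-byte floats and a shader block declared roughly as `layout (std140) uniform cam_block { mat4 view; mat4 proj; };`. The struct and function names here are hypothetical and exist only for illustration. It also shows where the struct analogy can come unstuck: std140 gives a vec3 a 16-byte base alignment, so two consecutive vec3 members do not sit where a naive C struct would put them.

```c
#include <assert.h>
#include <stddef.h>

#define MAT4_BYTES (16 * sizeof(float)) /* 64 bytes per 4x4 float matrix */

/* Hypothetical C mirror of the two-matrix block. A mat4 in std140 is four
   16-byte-aligned vec4 columns, so it packs exactly like float[16]. */
typedef struct {
    float view[16];
    float proj[16];
} cam_block_t;

/* Naive mirror of a block holding two vec3 members. In the GL buffer the
   second vec3 starts at byte 16, but in this struct it starts at byte 12. */
typedef struct {
    float a[3];
    float b[3];
} naive_vec3_block_t;

/* Mirror with explicit padding so the member offsets match std140. */
typedef struct {
    float a[3];
    float pad0;
    float b[3];
    float pad1;
} padded_vec3_block_t;

/* Size to request when creating the empty buffer, e.g. as the size argument
   of glBufferData(GL_UNIFORM_BUFFER, size, NULL, GL_DYNAMIC_DRAW). */
size_t cam_block_size(void) {
    return 2 * MAT4_BYTES;
}

/* Byte offset of the second matrix inside the buffer. */
size_t second_matrix_offset(void) {
    return offsetof(cam_block_t, proj);
}
```

With plain pointers the same two numbers (128 bytes in total, 64 bytes to the second matrix) are all that is needed; the struct is just one way of naming them.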

I'm going to use this UBO in 2 shader programmes, so I'll give it a unique id, and bind it to the shaders properly here.

Right, enough about style. Onwards!

The Auld Way

Updating buffer data with glBufferSubData works like this. Assuming that I just want to update my second matrix in the UBO:
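A minimal sketch of the offset and size arguments, using a plain CPU array as a stand-in for the buffer so that it runs without a GL context. `fake_ubo` and `fake_buffer_sub_data` are hypothetical helpers that only imitate the argument order of glBufferSubData(target, offset, size, data); the real call is shown in a comment, where `ubo` is an assumed buffer handle.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAT4_FLOATS 16
#define MAT4_BYTES (MAT4_FLOATS * sizeof(float))

/* Stand-in for the buffer's storage: two 4x4 matrices back to back. */
static unsigned char fake_ubo[2 * MAT4_BYTES];

/* Hypothetical imitation of glBufferSubData: copy `size` bytes from `data`
   into the store, starting `offset` bytes from its beginning.
   Returns 0 without copying if the range would run past the end. */
static int fake_buffer_sub_data(size_t offset, size_t size, const void *data) {
    if (offset + size > sizeof(fake_ubo)) {
        return 0;
    }
    memcpy(fake_ubo + offset, data, size);
    return 1;
}

/* Update only the second matrix. With a real context this would be roughly:
     glBindBuffer(GL_UNIFORM_BUFFER, ubo);
     glBufferSubData(GL_UNIFORM_BUFFER, MAT4_BYTES, MAT4_BYTES, proj);
   i.e. skip one matrix, then overwrite one matrix worth of bytes. */
int update_second_matrix(const float proj[MAT4_FLOATS]) {
    return fake_buffer_sub_data(MAT4_BYTES, MAT4_BYTES, proj);
}
```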

It's simple C-like pointer offsets, in the style of memcpy(). Easy to screw up when you're learning, but easy to fix once you've done that a few times and know what to look for.

The Map Way

So, converting the documentation into human language, what the map function does is return a pointer to a block of memory representing your UBO. You can tell it to offset this pointer, so that we can have it point to where the second matrix starts. You then modify that memory directly, and call glUnmapBuffer when you're done, and it should stream your changes into the buffer memory on the graphics device. Really, this is the same type of function as glBufferSubData but with a slightly different way of accessing the data.

I'm going to use memcpy here so that it's obvious what I'm doing with the memory pointed to. I actually have some overloaded functions to do that more concisely but it might be confusing to read.
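A minimal sketch of the map-then-memcpy pattern, again using a plain CPU array in place of the buffer so that it runs without a GL context. `fake_ubo` and `fake_map_range` are hypothetical stand-ins that imitate only the offset and length arguments of glMapBufferRange(target, offset, length, access); the real calls are shown in comments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAT4_FLOATS 16
#define MAT4_BYTES (MAT4_FLOATS * sizeof(float))

/* Stand-in for the buffer's storage: two 4x4 matrices back to back. */
static unsigned char fake_ubo[2 * MAT4_BYTES];

/* Hypothetical imitation of glMapBufferRange: the returned pointer addresses
   byte `offset` of the buffer, not byte 0. Returns NULL if the range does not
   fit, much as the real call returns NULL when an error occurs. */
static void *fake_map_range(size_t offset, size_t length) {
    if (offset + length > sizeof(fake_ubo)) {
        return NULL;
    }
    return fake_ubo + offset;
}

int update_second_matrix_mapped(const float proj[MAT4_FLOATS]) {
    /* Real code, roughly:
         void *ptr = glMapBufferRange(GL_UNIFORM_BUFFER, MAT4_BYTES,
                                      MAT4_BYTES, GL_MAP_WRITE_BIT); */
    void *ptr = fake_map_range(MAT4_BYTES, MAT4_BYTES);
    if (ptr == NULL) {
        return 0;
    }
    /* ptr already points at the second matrix, so copy to ptr itself.
       Adding the offset a second time here is an easy slip and would
       write past the end of the mapped range. */
    memcpy(ptr, proj, MAT4_BYTES);
    /* Real code: glUnmapBuffer(GL_UNIFORM_BUFFER); */
    return 1;
}
```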

Okay, binding the block buffer is the same, and then we get this pointer to memory. The format and offset parameters are pretty much the same. Just a note about that - offsetting to isolate the second matrix didn't work. I'm not sure what the point is here then with these offsets.

The parameters at the end are of interest. We can actually set the map up to read rather than write memory. In this case we want to write, so GL_MAP_WRITE_BIT. The next flag, GL_MAP_INVALIDATE_BUFFER_BIT, seems to be required, but it doesn't seem to do anything in effect. The docs say that the previous contents of the entire buffer may be discarded, which I read as wiping all the data in the buffer, but this doesn't seem to be the case at all. If I only modify part of the buffer it still retains the rest. If I take out this flag entirely it still works the same way. As far as I can tell the wording only permits the driver to throw the old data away; it doesn't oblige it to. Not clear.

Finally, you need to call glUnmapBuffer or everything will explode.

Comparison

In my environment mapping demo there was no noticeable performance difference. I suspect that this might be different in much larger projects - perhaps the map will scale better with more data. I didn't do any proper CPU/GPU timer queries.

In terms of amount of code, the map way is slightly more verbose here, but I've deliberately not used any short-cuts for clarity - not a big difference then.

One thing I did notice - there are more possibilities for making a mistake with the map interface, and really only one way to screw up the buffer sub-data function (wrong memory offsets). The documentation for the map is also much more convoluted - I'm not entirely sure which flags I need, and I can't really tell what GL_MAP_INVALIDATE_BUFFER_BIT actually does in reality - perhaps it's intended more as a hint to the GL to make more efficient use of memory. I see that there are also explicit glInvalidateBufferSubData() functions to point out parts of buffers that are no longer required, so this follows.

It's hard to say when one is better, then, from this little test. glMapBufferRange has been core since OpenGL 3.0, so perhaps that is the most important point of difference.

Also To Try

OpenGL 4.4 has glBufferStorage, which is a newer alternative again to glBufferData. That might be the next comparison to make! This whole set of new functions seems to be from a new type of thinking about efficiency.