Profiling JS and WebGL is almost impossible. I went back to OpenGL and C to try some optimisations that I think will work inside the browser as well. Profiling showed me that I was wasting the majority of my CPU time on two things:
The first problem was a two-step fix. First, I was updating all of my uniform variables every time that I rendered an object. This is not necessary, because uniforms are particular to shader programmes, not to vertex buffers. Many of the objects to be rendered will re-use the same shader programme. I sprinkled a few boolean flags around so that glUniform... calls were only made when the shader programme in use had changed, or the uniform values themselves needed updating. My objects are loaded in clumps, so I can be fairly confident that those using the same shader will appear contiguously in my rendering loop.
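A minimal sketch of that state tracking, assuming hypothetical names (the real engine's flags and handles will differ); the GL calls are left as comments so only the decision logic is shown:

```c
#include <stdbool.h>

/* Hypothetical handle; in the real engine this would be a GLuint
   programme ID returned by glCreateProgram. 0 means "none bound". */
static unsigned int current_programme = 0;

/* Returns true when the caller must (re)send uniforms: either the
   shader programme changed, or the caller flags the values as dirty. */
bool bind_programme(unsigned int programme, bool uniforms_dirty) {
    bool changed = (programme != current_programme);
    if (changed) {
        /* glUseProgram(programme); */
        current_programme = programme;
    }
    return changed || uniforms_dirty;
}
```

In the render loop, each object's draw call would only reach its glUniform... block when `bind_programme` returns true, so a run of objects sharing one shader sends its uniforms once.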
The second problem was not made clear anywhere, and I only discovered it by profiling. Each call to glUniform requires the 'location' of the uniform variable in the shader programme. This is just a unique integer identifier. I was fetching these by name (by string), with glGetUniformLocation, immediately before each call to set the uniform. But, as I discovered, getting uniform locations inside the main loop is incredibly slow. I changed my engine to fetch all of them once, during the loading phase.
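One way to do that is a small name-to-location cache filled at load time; this is a sketch under assumed names, with the actual glGetUniformLocation call happening only once, outside the main loop:

```c
#include <string.h>

#define MAX_UNIFORMS 16

/* A tiny name -> location cache. Fill it once at load time so the
   main loop never has to call glGetUniformLocation again. */
typedef struct {
    const char *names[MAX_UNIFORMS];
    int locations[MAX_UNIFORMS];
    int count;
} uniform_cache;

/* Load phase: store a location fetched once, e.g. via
   glGetUniformLocation(programme, name). */
void cache_location(uniform_cache *c, const char *name, int location) {
    c->names[c->count] = name;
    c->locations[c->count] = location;
    c->count++;
}

/* Main loop: a cheap in-memory lookup instead of a driver round-trip.
   Returns -1 (OpenGL's "not found" value) for unknown names. */
int cached_location(const uniform_cache *c, const char *name) {
    for (int i = 0; i < c->count; i++)
        if (strcmp(c->names[i], name) == 0)
            return c->locations[i];
    return -1;
}
```

In practice the string lookup can be skipped entirely by storing the cached locations in per-shader struct fields and using those directly in the draw calls.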
And the result of these few, seemingly minor, improvements? My OpenGL
More event-driven engines are key, I think, and making clever use of callbacks might be nice. With larger engines I suspect that a lot of my time is wasted cycling through loops that check whether things should be on screen or not. Often they are not, which is a great saving for the GPU, but checking the entire list of renderables every frame is not ideal for the CPU. I think it's time to try managing separate lists of "visible" and "invisible" objects to shorten those loops.
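The two-list idea might look something like this sketch (names and the fixed-size arrays are assumptions, not the engine's real structures): objects move between lists only when a visibility event fires, and the per-frame render loop walks just the visible list.

```c
#include <stdbool.h>

#define MAX_OBJECTS 64

/* Instead of testing every renderable each frame, keep two lists.
   The renderer walks only the "visible" list. */
typedef struct {
    int visible[MAX_OBJECTS];
    int visible_count;
    int hidden[MAX_OBJECTS];
    int hidden_count;
} render_lists;

/* Move one object between lists when a visibility event fires
   (a frustum test, a trigger volume, a camera move, etc.). */
void set_visible(render_lists *rl, int object, bool visible) {
    int *from       = visible ? rl->hidden         : rl->visible;
    int *from_count = visible ? &rl->hidden_count  : &rl->visible_count;
    int *to         = visible ? rl->visible        : rl->hidden;
    int *to_count   = visible ? &rl->visible_count : &rl->hidden_count;

    for (int i = 0; i < *from_count; i++) {
        if (from[i] == object) {
            from[i] = from[--(*from_count)];  /* swap-remove, O(1) */
            to[(*to_count)++] = object;
            return;
        }
    }
}
```

The swap-remove keeps both lists dense, so the per-frame cost is proportional to what is actually drawn rather than to everything loaded.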