Anton's Research Ramblings

16 July 2012.

Post-Processing and Computer Vision Algorithms on the GPU

Well, it looks like post-processing image techniques are a lot easier than I had envisioned. It's more or less the same set-up as the colour-picker demo that I did last week, but the intermediate buffer's texture gets re-used as the input to a simple full-screen quad. The quad samples the texture and applies whatever image technique you want within the fragment shader. Things like changing the colour range are pretty straightforward. I was more interested in OpenCV-like computer vision techniques. The trick here is just to take several texture samples around the fragment's texture coordinates to build your sliding window/image kernel, e.g. for a 3x3 kernel:

GLSL Fragment Shader. Warning: Untested code fragment
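
As a rough CPU-side illustration of that sliding window (plain Python, not GLSL, and not the shader from this post), the sketch below takes nine samples around each pixel and weights them with a 3x3 kernel. It assumes a greyscale image stored as a list of rows and clamp-to-edge behaviour at the borders; the names `sample_clamped`, `convolve3x3` and `BOX_BLUR` are made up for the example.

```python
def sample_clamped(image, x, y):
    """Read one pixel, clamping the coordinates to the image edge
    (similar in spirit to a texture with clamp-to-edge wrapping)."""
    height = len(image)
    width = len(image[0])
    x = min(max(x, 0), width - 1)
    y = min(max(y, 0), height - 1)
    return image[y][x]


def convolve3x3(image, kernel):
    """Apply a 3x3 kernel to every pixel of a greyscale image."""
    height = len(image)
    width = len(image[0])
    result = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            # Nine samples around (x, y). A fragment shader would do the
            # same with nine texture lookups, each offset by one texel.
            for ky in range(-1, 2):
                for kx in range(-1, 2):
                    weight = kernel[ky + 1][kx + 1]
                    total += weight * sample_clamped(image, x + kx, y + ky)
            result[y][x] = total
    return result


# 3x3 box blur: every neighbour contributes equally.
BOX_BLUR = [[1.0 / 9.0] * 3 for _ in range(3)]
```

Swapping `BOX_BLUR` for a different set of nine weights (an edge-detection kernel, say) changes the effect without touching the sampling loop, which is the appeal of doing it this way in a shader too.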

So, probably not a lot of use for post-processing techniques outside of entertainment, but I wonder if it's worthwhile to look at doing some of our computer vision tasks (currently done in OpenCV) on the GPU. I am thinking that this might be handy for the special case of analyse-visualise. The question there is: what is the best way to grab camera frames and feed them to GL? Do I want to stream them in one at a time (slow), or do a bunch at a time? This raises some interesting questions (and possibilities) in a web set-up. Perhaps the camera can talk to a web server, and WebGL can fetch frames interactively from the server (and skip anything non-current). Must try.
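
One possible way to get the "skip anything non-current" behaviour is a single-slot holder that only ever keeps the newest frame, so a slow consumer never works through a backlog. The sketch below is a guess at how that could look, not something tried here; the class name `LatestFrame` and its methods are hypothetical, and a frame can be any object.

```python
import threading


class LatestFrame:
    """Single-slot buffer: the producer overwrites, the consumer only
    ever sees the most recent frame. (Hypothetical helper.)"""

    def __init__(self):
        self._lock = threading.Lock()
        self._frame = None
        self._sequence = 0    # number of frames published so far
        self._last_taken = 0  # sequence number the consumer last saw

    def publish(self, frame):
        """Called by the camera side; replaces any frame not yet taken."""
        with self._lock:
            self._frame = frame
            self._sequence += 1

    def take(self):
        """Return (frame, skipped_count) for the newest unseen frame,
        or None if nothing new has arrived since the last call."""
        with self._lock:
            if self._sequence == self._last_taken:
                return None
            skipped = self._sequence - self._last_taken - 1
            self._last_taken = self._sequence
            return self._frame, skipped
```

The render loop would call `take()` once per draw and upload the frame as a texture only when it gets something back; the skipped count is just there to show how many frames were dropped along the way.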

Depth of Field

Okay, so I've figured out how to blur the area of the screen around a certain object's on-screen position. But this isn't true depth of field, a technique which actually uses depth information to keep a particular object in focus and blur the elements in front of and behind it. Annoyingly, the WebGL book that I got has a picture of a depth of field technique on the cover, and that was a main reason for buying the book, but it's just a picture. I judged a book by its cover... Anyway, here are some enlightening sources of information: