Bilateral filter overview
A basic image processing task is image denoising. Averaging filters are a very common way to reduce the amount of noise in an image. For example, one can calculate the average over the spatial neighborhood of each pixel and replace the pixel's value with that average. This is a simple tool, but it often blurs the edges in the image, because the value calculated for a pixel near a boundary mixes values from both sides of the boundary.
In the next figure we see an image of the cameraman with noise added to it.
Upon applying a 15-by-15 averaging filter, the result appears blurred.
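The averaging described above can be sketched in a few lines of C++. This is a minimal illustration rather than the code behind the figures: the function name boxFilter, the row-major float image layout and the clamp-to-edge border handling are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Mean filter over a (2*radius+1) x (2*radius+1) window on a row-major
// grayscale image. Coordinates outside the image are clamped to the nearest
// valid pixel (an assumed border policy).
std::vector<float> boxFilter(const std::vector<float>& src,
                             int width, int height, int radius) {
    assert(width > 0 && height > 0 && radius >= 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);

    std::vector<float> dst(src.size());
    const int side = 2 * radius + 1;
    const float norm = 1.0f / static_cast<float>(side * side);

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f;
            for (int dy = -radius; dy <= radius; ++dy) {
                for (int dx = -radius; dx <= radius; ++dx) {
                    const int xx = std::min(std::max(x + dx, 0), width - 1);
                    const int yy = std::min(std::max(y + dy, 0), height - 1);
                    sum += src[static_cast<std::size_t>(yy) * width + xx];
                }
            }
            // Every neighbor gets the same weight, whichever side of an
            // edge it lies on, which is what blurs boundaries.
            dst[static_cast<std::size_t>(y) * width + x] = sum * norm;
        }
    }
    return dst;
}
```

With radius set to 7 this corresponds to a 15-by-15 window. On a step edge such as 0, 0, 0, 1, 1, 1 the pixels next to the step come out as intermediate values, which is the blurring effect described above.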
The bilateral filter applies Gaussian averaging in which the averaging weights account for both the spatial distance and the intensity distance between the center pixel and the other pixels. This way, an adaptive averaging filter is calculated at each pixel location, and the appropriate averaging neighborhood is defined.
The results preserve edges better than linear averaging methods, which weight pixels according to their spatial distance from the center pixel only.
Applying the bilateral filter to the noisy image preserves the sharpness of the image better.
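To make the weighting concrete, here is a minimal single-channel C++ sketch of a brute-force bilateral filter. It is not SagivTech's implementation: the function name bilateralFilter, the parameter names sigmaSpatial and sigmaRange, the row-major float layout and the clamp-to-edge border handling are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Brute-force bilateral filter on a row-major grayscale image.
// Each neighbor's weight is the product of a spatial Gaussian (distance
// from the center pixel) and a range Gaussian (intensity difference).
std::vector<float> bilateralFilter(const std::vector<float>& src,
                                   int width, int height, int radius,
                                   float sigmaSpatial, float sigmaRange) {
    assert(width > 0 && height > 0 && radius >= 0);
    assert(sigmaSpatial > 0.0f && sigmaRange > 0.0f);
    assert(src.size() == static_cast<std::size_t>(width) * height);

    std::vector<float> dst(src.size());
    const float invSpatial = 1.0f / (2.0f * sigmaSpatial * sigmaSpatial);
    const float invRange = 1.0f / (2.0f * sigmaRange * sigmaRange);

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const float center = src[static_cast<std::size_t>(y) * width + x];
            float sum = 0.0f;
            float weightSum = 0.0f;
            for (int dy = -radius; dy <= radius; ++dy) {
                for (int dx = -radius; dx <= radius; ++dx) {
                    // Clamp-to-edge border handling (an assumed policy).
                    const int xx = std::min(std::max(x + dx, 0), width - 1);
                    const int yy = std::min(std::max(y + dy, 0), height - 1);
                    const float value = src[static_cast<std::size_t>(yy) * width + xx];
                    const float dist2 = static_cast<float>(dx * dx + dy * dy);
                    const float diff = value - center;
                    const float w = std::exp(-dist2 * invSpatial - diff * diff * invRange);
                    sum += w * value;
                    weightSum += w;
                }
            }
            // The center pixel always contributes weight 1, so weightSum >= 1.
            dst[static_cast<std::size_t>(y) * width + x] = sum / weightSum;
        }
    }
    return dst;
}
```

With a small sigmaRange, pixels on the far side of an edge receive a weight close to zero, so the step in 0, 0, 0, 1, 1, 1 stays essentially intact. As sigmaRange grows, the range term approaches 1 and the filter approaches a plain Gaussian blur.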
One downside of the bilateral filter is that it is more compute- and I/O-intensive than linear filters. Optimizing it and running it in real time on a mobile device is hence quite challenging.
GPGPU Implementation on the mobile device
At its core, developing an OpenCL application for the mobile platform is not that different from developing one for the PC, once you have your environment set up correctly and you’ve made yourself familiar with the mobile ecosystem. I’ll briefly explain how to set up your PC development environment so that you can start writing mobile applications using OpenCL. OpenCL for the mobile platform is currently available only on Android, so I will focus only on that platform.
To build a working Android application, you will need to install the Android SDK and the NDK (Native Development Kit, which allows the development of C/C++ code). Eclipse is a simple and intuitive tool for developing the Android wrapper and the underlying native C++ code, which will be used to run the actual OpenCL code. You’ll also need auxiliary tools such as the JDK and Ant, among others.
Once all the tools are in place, you can create a simple Android application and deploy it to the OpenCL-enabled device. The next step is to have the Java code call the native C++ code via JNI. More in-depth tutorials explaining how to set up your environment for Android development are available, and you can also refer to SagivTech’s tutorial on the subject.
The steps so far have nothing to do with OpenCL for the mobile platform; any Android application needs them. The next step is to implement the C++ host side of the algorithm. The host side is responsible for getting the image data to be processed, be it a static image or a frame captured by the camera, from the Java code and copying it to GPU memory. It then configures the needed buffers, calls the bilateral kernel with all the needed arguments, and makes the results available to the CPU. The Android application can then use the result in any way: dump the output to a file, display the filtered image on the mobile device screen, send the filtered image to a remote server, etc.
Once everything is up and running and the GPU kernels produce the correct, desired output, the fun stuff begins: optimizing the host code and the kernel, finding bottlenecks and improving the performance of the GPU code.
During the development of the OpenCL bilateral filter, we gained some insights that can ease the learning and development curve for mobile OpenCL applications. The most important one is that it is usually best to start with a simple C++ implementation of the algorithm and to debug and run it on the local PC. This is still much easier than writing and debugging the code directly on the target device. Once the simple C++ implementation is bug-free, it should be migrated to a simple working OpenCL implementation, still running on the PC.
Here is the base reference bilateral filter OpenCL kernel code we used at SagivTech. This code is the starting point for further optimizations on the different mobile platforms we have evaluated at SagivTech.
This is a good time to move the code, both host and kernels, to the mobile environment and make sure it runs and yields algorithmically correct results. Hopefully, most of the bugs have already been resolved during the previous stages, and no special issues will come up while moving the code to the mobile environment.
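One way to check the results at each stage of this workflow is to compare the output of the OpenCL version against the output of the C++ reference, pixel by pixel. The sketch below assumes both outputs have been copied into float vectors of equal size. The helper names maxAbsDiff and matchesReference are hypothetical, and the tolerance is something you would choose for your own data, since floating-point results on a GPU generally do not match the CPU bit for bit.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Largest absolute per-pixel difference between two equally sized images.
float maxAbsDiff(const std::vector<float>& reference,
                 const std::vector<float>& candidate) {
    assert(reference.size() == candidate.size());
    float worst = 0.0f;
    for (std::size_t i = 0; i < reference.size(); ++i) {
        worst = std::max(worst, std::fabs(reference[i] - candidate[i]));
    }
    return worst;
}

// True when every pixel of the candidate is within `tolerance` of the reference.
bool matchesReference(const std::vector<float>& reference,
                      const std::vector<float>& candidate,
                      float tolerance) {
    return maxAbsDiff(reference, candidate) <= tolerance;
}
```

Running the same check on the PC and again on the device makes it easier to tell a porting bug from an ordinary precision difference.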
It is recommended to start optimizing only after the code runs correctly on the target mobile device. Many optimizations that are applicable to the PC implementation may not be applicable to the mobile environment, so it is beneficial to optimize directly on the target device. Such optimizations can include using images instead of global memory so that the cache can be used, loop unrolling, making sure optimization flags are set correctly for the current platform, etc.
This project is partially funded by the European Union under the 7th Research Framework, programme FET-Open SME, Grant agreement no. 309169.