GPU Convolutions: OpenCV GPU and LibJacket – Part 2

This is a response to my earlier post comparing OpenCV’s gpu::convolve() and LibJacket’s jkt::conv2() convolution functions, at various image and kernel sizes.

That post generated a lot of traffic, most notably from the OpenCV developer community. In response, the folks at Willow Garage appear to have revamped their GPU convolutions and posted their own set of benchmarks using the updated routines.

The benchmarks I ran highlighted some performance issues in OpenCV, which the maintainers have now fixed. Their benchmarks, in turn, exposed a weak spot in LibJacket’s convolutions, which AccelerEyes has now addressed.

Now, I bring yet another set of benchmarks (along with updated code) to show the current state of mutual improvements for 2D image convolutions in both libraries.


Test system (same as before):

Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
OpenCV svn: r6902
LibJacket 1.1 (build 767c147)
GPU0 GeForce GTX 295, 896 MB, Compute 1.3 (single,double)
GPU0 Tesla C2075, 5376 MB, Compute 2.0 (single,double)


The bottom part of each figure shows LibJacket speedup over OpenCV, where

speedup = (time_opencv / time_jacket)
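To make the ratio concrete, here is a minimal Python sketch of the speedup calculation. The helper name `speedup` and the timing values are placeholders I made up for illustration; they are not measurements from these benchmarks.

```python
def speedup(time_opencv, time_jacket):
    """LibJacket speedup over OpenCV: time_opencv / time_jacket."""
    if time_jacket <= 0:
        raise ValueError("time_jacket must be positive")
    return time_opencv / time_jacket


# Hypothetical (time_opencv, time_jacket) pairs in milliseconds.
# These are illustrative values only, not benchmark results.
example_times = [(12.0, 8.0), (5.0, 5.0), (3.0, 6.0)]

for t_cv, t_jkt in example_times:
    s = speedup(t_cv, t_jkt)
    if s > 1.0:
        verdict = "LibJacket faster"
    elif s < 1.0:
        verdict = "OpenCV faster"
    else:
        verdict = "tie"
    print(f"opencv={t_cv} ms, jacket={t_jkt} ms, speedup={s:.2f} ({verdict})")
```

A value above 1 means LibJacket finished sooner, a value below 1 means OpenCV did, and exactly 1 is a tie.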

Indeed, both libraries have improved since last time, and are sure to only get faster!
