This is a response to my earlier post comparing OpenCV’s gpu::convolve() and LibJacket’s jkt::conv2() convolution functions, at various image and kernel sizes.
That post generated a lot of traffic, most notably from the OpenCV developer community. In response, the folks at Willow Garage appear to have revamped their GPU convolutions and posted their own set of benchmarks using the updated routines.
The benchmarks I ran highlighted some performance issues in OpenCV, which the maintainers have now fixed; their benchmarks in turn exposed a weak spot in LibJacket’s convolutions, which AccelerEyes has now addressed.
Now, I bring yet another set of benchmarks (along with updated code) to show the current state of mutual improvements for 2D image convolutions in both libraries.
Test system (same as before):
Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
OpenCV svn: r6902
LibJacket 1.1 (build 767c147)
GPU0 GeForce GTX 295, 896 MB, Compute 1.3 (single,double)
GPU0 Tesla C2075, 5376 MB, Compute 2.0 (single,double)
The bottom part of each figure shows LibJacket speedup over OpenCV, where
speedup = (time_opencv / time_jacket)
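As a rough illustration of how that ratio reads, here is a minimal sketch. The helper name `speedup` and its argument check are my own assumptions for this example and are not taken from the benchmark code.

```cpp
#include <stdexcept>

// Hypothetical helper (not part of the benchmark source): ratio of the
// OpenCV time to the LibJacket time for the same image and kernel size.
// Both times must be in the same unit (e.g. milliseconds).
// A result above 1 means LibJacket was faster; below 1 means OpenCV was.
double speedup(double time_opencv, double time_jacket)
{
    if (time_jacket <= 0.0)
        throw std::invalid_argument("time_jacket must be positive");
    return time_opencv / time_jacket;
}
```

With made-up timings of 10 ms for OpenCV and 4 ms for LibJacket, `speedup(10.0, 4.0)` gives 2.5, i.e. LibJacket finished in 40% of the time.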
Indeed, both libraries have improved since last time, and are sure to only get faster!
Eh… Again using the non-buffered version of cv::gpu::convolve in “versus.cpp”. :((