I followed the same procedure and built OpenCV along with the extra modules. The cmake output states: OpenCL: Yes (1.2). However, to check the performance I compiled the following code with g++:
Code:
#include <opencv2/features2d.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/opencv.hpp>
#include <opencv2/core/ocl.hpp>
#include <vector>
#include <iostream>
#include <stdio.h>
using namespace std;
using namespace cv;
// List every OpenCL platform and device OpenCV can see, then the default device.
void print_ocl_device_name() {
    vector<ocl::PlatformInfo> platforms;
    ocl::getPlatfomsInfo(platforms);  // the spelling "Platfoms" is OpenCV's own
    for (size_t i = 0; i < platforms.size(); i++) {
        const cv::ocl::PlatformInfo* platform = &platforms[i];
        cout << "Platform Name: " << platform->name().c_str() << "\n";
        for (int j = 0; j < platform->deviceNumber(); j++) {
            cv::ocl::Device current_device;
            platform->getDevice(current_device, j);
            cout << "Device " << j << ": " << current_device.name() << "\n";
        }
    }
    ocl::Device default_device = ocl::Device::getDefault();
    cout << "Used device: " << default_device.name() << "\n";
}

int main(void) {
    print_ocl_device_name();

    // Two random descriptor sets, 64 floats per row.
    RNG r(239);
    Mat desc1(5100, 64, CV_32F);
    Mat desc2(5046, 64, CV_32F);
    r.fill(desc1, RNG::UNIFORM, Scalar::all(0.0), Scalar::all(1.0));
    r.fill(desc2, RNG::UNIFORM, Scalar::all(0.0), Scalar::all(1.0));

    for (int i = 0; i < 10; i++) {
        BFMatcher matcher(NORM_L2);
        // Copy into UMat so the matcher can take the OpenCL path.
        UMat udesc1, udesc2;
        desc1.copyTo(udesc1);
        desc2.copyTo(udesc2);
        vector< vector<DMatch> > nn_matches;
        matcher.knnMatch(udesc1, udesc2, nn_matches, 2);
        printf("%d\n", i);
    }
    return 0;
}
I then timed the code using a clock_t variable, initializing it (clock_t start = clock()) before the for loop and computing double result = (clock() - start) / (double)CLOCKS_PER_SEC after the loop terminated. The times were similar (within 5%) whether I used UMat or Mat variables.
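One thing worth ruling out with this kind of measurement: clock() reports processor time consumed by the process, not elapsed time, so for a run that keeps several cores busy it can differ noticeably from what a stopwatch would show. Below is a minimal sketch that records both around the same piece of work. Timing and time_once are made-up names for illustration, not anything from OpenCV.

```cpp
#include <chrono>
#include <ctime>

// Hypothetical helper type: one measurement of a single call.
struct Timing {
    double wall_seconds;  // elapsed time, as a stopwatch would show
    double cpu_seconds;   // processor time used by the process (clock())
};

// Runs `work` once and samples both clocks around it.
template <typename F>
Timing time_once(F&& work) {
    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();

    work();

    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    Timing t;
    t.wall_seconds =
        std::chrono::duration<double>(wall_end - wall_start).count();
    // Take the difference first, then divide. Without the parentheses
    // only cpu_start would be divided by CLOCKS_PER_SEC.
    t.cpu_seconds = static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    return t;
}
```

Wrapping the matching loop in a lambda and passing it to time_once for both the Mat and the UMat variant would give two numbers per run. If the GPU really does the work, I would expect the CPU figure to drop well below the wall-clock figure, but that is an expectation, not something I have measured on the board.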
I also monitored the CPU usage in both cases. In both, it stayed above 90% for the whole run, which makes me think the GPU was not doing any of the processing.
This is further backed up by the fact that I compiled the same code on my Ubuntu desktop. With CPU only (Mat), all cores sat at around 90% for the duration of the program. With UMat and my NVIDIA card, however, only one core was busy and the rest idled (presumably that one core handled the memory transfers to the OpenCL device).
I think a driver needs to be installed for the Mali GPU (as I did on my Ubuntu desktop for the NVIDIA card).
Which OS are you running on the Tinker Board? I was using the latest Armbian.
EDIT: After getting no results from clinfo, stumbling upon [url=https://brian.digitalmaddox.com/blog/?p=484]this post[/url], and seeing no OpenCL directory in /etc, I assumed the driver was not part of Armbian. So I moved to TinkerOS, and after fixing the incorrectly named directory as described in that post, clinfo returned all the information. However, when I try even a simple function call through OpenCV's T-API, [url=https://stackoverflow.com/questions/50289324/clenqueuendrangekernel-failed-with-error-out-of-resources-5]this error[/url] occurs. I attempted to fix it, but with no luck.