gpgpu - OpenCL clEnqueueCopyImageToBuffer with stride


I have an OpenCL buffer that contains a 2D image. The stride (row pitch) of this image is larger than its width. I need to create an OpenCL image from this buffer. The problem is that the function clEnqueueCopyBufferToImage does not take a stride as an input parameter. Is it possible to copy from an OpenCL buffer to an OpenCL image (with a stride larger than the width) in a single copy, or otherwise make it faster? One way to solve this problem is to write your own kernel, but maybe there is a neater solution?
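To make the layout in the question concrete, here is a minimal CPU-side sketch in C. It assumes 4-byte pixels and a row pitch expressed in pixels, neither of which is stated in the question, and the names `pixel_index` and `pack_rows` are purely illustrative. Each row starts `pitch` pixels after the previous one, so the last `pitch - width` pixels of every row are padding that has to be skipped when the image is built.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Index of pixel (x, y) in a buffer whose rows start `pitch` pixels apart. */
static size_t pixel_index(size_t x, size_t y, size_t pitch)
{
    return y * pitch + x;
}

/* Copy a width x height image out of a padded buffer into a tightly packed
 * one, dropping the (pitch - width) padding pixels at the end of each row.
 * Assumption: one pixel is one 32-bit value (e.g. RGBA8). */
static void pack_rows(const uint32_t *src, size_t pitch,
                      uint32_t *dst, size_t width, size_t height)
{
    for (size_t y = 0; y < height; ++y)
        memcpy(dst + y * width, src + pixel_index(0, y, pitch),
               width * sizeof(uint32_t));
}
```

Every approach discussed in the answer is a way of doing this same row-by-row skip on the device instead of on the host.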

Unfortunately, nothing in the OpenCL specification allows you to create an image directly from a buffer when the row pitch of the buffer data does not equal the width. The most efficient solution will probably be to write your own kernel to do this.
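A kernel for this could look roughly like the sketch below. It is not the kernel from the original experiment: the name `buffer_to_image`, the 4-byte RGBA pixel layout, the CL_RGBA / CL_UNSIGNED_INT8 image format and the pitch being passed in pixels are all assumptions made for illustration. The OpenCL C source is kept in a C string so that it can be handed to clCreateProgramWithSource.

```c
/* OpenCL C source for a buffer-to-image copy kernel.
 * Assumptions (not from the original post): the buffer holds 4-byte RGBA
 * pixels, the image was created as CL_RGBA / CL_UNSIGNED_INT8, and `pitch`
 * is the row pitch of the buffer in pixels. */
static const char *const copy_kernel_src =
    "__kernel void buffer_to_image(__global const uchar4 *src,\n"
    "                              const int pitch,\n"
    "                              __write_only image2d_t dst)\n"
    "{\n"
    "    const int x = get_global_id(0);\n"
    "    const int y = get_global_id(1);\n"
    "    /* Skip work-items outside the image (rounded-up global size). */\n"
    "    if (x >= get_image_width(dst) || y >= get_image_height(dst))\n"
    "        return;\n"
    "    /* Read from the padded row, write to the image coordinate. */\n"
    "    const uchar4 px = src[y * pitch + x];\n"
    "    write_imageui(dst, (int2)(x, y), convert_uint4(px));\n"
    "}\n";
```

It would be launched with a two-dimensional global size that covers at least width x height work-items, with the buffer, the pitch (as a cl_int) and the image set as arguments 0, 1 and 2.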

The easiest way to do this without writing your own kernel is to copy one row at a time with clEnqueueCopyBufferToImage. If your image is very large, the performance of this technique may match that of a hand-written kernel, but you would have to try it to find out.
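The row-at-a-time copy might be sketched as follows. The helper `make_row_copy_args`, the struct `row_copy_args` and the `HAVE_OPENCL` build flag are hypothetical names introduced here; the argument arithmetic is kept separate from the API call so that it can be checked without a device. Note that the source offset is in bytes, while the destination origin and region are in pixels.

```c
#include <stddef.h>

/* Arguments for one clEnqueueCopyBufferToImage call that copies row `y`. */
typedef struct {
    size_t src_offset;    /* byte offset of the row inside the buffer */
    size_t dst_origin[3]; /* (x, y, z) of the first destination pixel */
    size_t region[3];     /* (width, height, depth) in pixels */
} row_copy_args;

static row_copy_args make_row_copy_args(size_t y, size_t width,
                                        size_t pitch_bytes)
{
    row_copy_args a = { y * pitch_bytes, { 0, y, 0 }, { width, 1, 1 } };
    return a;
}

#ifdef HAVE_OPENCL /* hypothetical flag: define it where OpenCL headers exist */
#include <CL/cl.h>

/* Enqueue one copy per image row; returns the first error encountered. */
static cl_int copy_rows_to_image(cl_command_queue queue, cl_mem buffer,
                                 cl_mem image, size_t width, size_t height,
                                 size_t pitch_bytes)
{
    for (size_t y = 0; y < height; ++y) {
        row_copy_args a = make_row_copy_args(y, width, pitch_bytes);
        cl_int err = clEnqueueCopyBufferToImage(queue, buffer, image,
                                                a.src_offset, a.dst_origin,
                                                a.region, 0, NULL, NULL);
        if (err != CL_SUCCESS)
            return err;
    }
    return CL_SUCCESS;
}
#endif
```

This enqueues one command per row, so the per-command overhead grows with the image height.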


I did not include the clEnqueueCopyBufferRect approach in my original answer because my first instinct was that the additional copy would kill performance. However, the comments above got me thinking about it more, and I was interested enough to see how the approaches actually compare in practice.

[Figure: performance results]

As I suspected, the fastest way to do this was to implement the kernel directly, though copying the data line by line was significantly slower than I expected. Copying the buffer into an intermediate buffer with clEnqueueCopyBufferRect is a really good compromise between performance and simplicity, although it is still somewhat slower than the kernel implementation.

The source code for this small experiment is available. I was copying a 1020x1020 image with a stride of 1024, and the timings are averaged over 8 runs.
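The intermediate-buffer approach could be sketched like this. It is not the code from the experiment: `make_rect_copy_args`, `copy_via_packed_buffer`, the `HAVE_OPENCL` build flag and the 4-byte pixel size used in the usage note are assumptions for illustration. The detail that is easy to get wrong is that clEnqueueCopyBufferRect measures the x extent and the row pitches in bytes, not pixels.

```c
#include <stddef.h>

/* Arguments for clEnqueueCopyBufferRect: x extents in bytes, y in rows. */
typedef struct {
    size_t src_origin[3];
    size_t dst_origin[3];
    size_t region[3];
    size_t src_row_pitch; /* padded row size in bytes */
    size_t dst_row_pitch; /* tightly packed row size in bytes */
} rect_copy_args;

static rect_copy_args make_rect_copy_args(size_t width, size_t height,
                                          size_t pitch_pixels,
                                          size_t bytes_per_pixel)
{
    rect_copy_args a = {
        { 0, 0, 0 }, { 0, 0, 0 },
        { width * bytes_per_pixel, height, 1 },
        pitch_pixels * bytes_per_pixel,
        width * bytes_per_pixel
    };
    return a;
}

#ifdef HAVE_OPENCL /* hypothetical flag: define it where OpenCL headers exist */
#include <CL/cl.h>

/* Step 1: strip the padding into `packed`. Step 2: copy `packed` into the
 * image. Assumes an in-order queue, so step 2 runs after step 1. */
static cl_int copy_via_packed_buffer(cl_command_queue queue, cl_mem padded,
                                     cl_mem packed, cl_mem image,
                                     size_t width, size_t height,
                                     size_t pitch_pixels, size_t bytes_per_pixel)
{
    rect_copy_args a = make_rect_copy_args(width, height, pitch_pixels,
                                           bytes_per_pixel);
    /* Slice pitches of 0 let the runtime derive them for a 2D copy. */
    cl_int err = clEnqueueCopyBufferRect(queue, padded, packed,
                                         a.src_origin, a.dst_origin, a.region,
                                         a.src_row_pitch, 0, a.dst_row_pitch, 0,
                                         0, NULL, NULL);
    if (err != CL_SUCCESS)
        return err;
    const size_t origin[3] = { 0, 0, 0 };
    const size_t region[3] = { width, height, 1 };
    return clEnqueueCopyBufferToImage(queue, packed, image, 0, origin, region,
                                      0, NULL, NULL);
}
#endif
```

For the 1020x1020 image with a stride of 1024 used above, and assuming 4-byte pixels with the stride given in pixels, this yields a source row pitch of 4096 bytes and a destination row pitch of 4080 bytes.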
