2D arrays

When allocating a 2D float array of size width*height, there is this line of code:
cudaMallocPitch((void**)&devPtr, &pitch, width * sizeof(float), height);
Here "devPtr" is the base address of the array. What is "pitch" used for?

cudaError_t cudaMallocPitch ( void ** devPtr,
size_t * pitch,
size_t width,
size_t height
)
Allocates at least widthInBytes * height bytes of linear memory on the device and returns in *devPtr a pointer to the allocated memory. The function may pad the allocation to ensure that corresponding pointers in any given row will continue to meet the alignment requirements for coalescing as the address is updated from row to row. The pitch returned in *pitch by cudaMallocPitch() is the width in bytes of the allocation. The intended usage of pitch is as a separate parameter of the allocation, used to compute addresses within the 2D array. Given the row and column of an array element of type T, the address is computed as:

T* pElement = (T*)((char*)BaseAddress + Row * pitch) + Column;

For allocations of 2D arrays, it is recommended that programmers consider performing pitch allocations using cudaMallocPitch(). Due to pitch alignment restrictions in the hardware, this is especially true if the application will be performing 2D memory copies between different regions of device memory (whether linear memory or CUDA arrays).

Parameters:
devPtr - Pointer to allocated pitched device memory
pitch - Pitch for allocation
width - Requested pitched allocation width
height - Requested pitched allocation height

Returns:
cudaSuccess, cudaErrorMemoryAllocation
Note:
Note that this function may also return error codes from previous, asynchronous launches.
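
To make the address formula from the documentation above concrete, here is a small host-only sketch in plain C. It does not call CUDA at all: malloc stands in for the device allocation, and the pitch is just a value passed in by the caller. The names pitched_elem and pitched_roundtrip are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: pointer to element (row, col) of a pitched float
   array. `pitch` is the row stride in bytes, as cudaMallocPitch returns it. */
float *pitched_elem(void *base, size_t pitch, size_t row, size_t col)
{
    /* Step to the start of the row in bytes, then index by element. */
    return (float *)((char *)base + row * pitch) + col;
}

/* Writes row * 100 + col into every element of a width x height pitched
   array held in host memory, reads the values back, and returns the number
   of mismatches (0 when the addressing is consistent, -1 if malloc fails). */
int pitched_roundtrip(size_t width, size_t height, size_t pitch)
{
    /* A row must fit inside its pitch, and rows must stay float-aligned. */
    assert(pitch >= width * sizeof(float));
    assert(pitch % sizeof(float) == 0);

    void *base = malloc(pitch * height); /* host stand-in for device memory */
    if (base == NULL)
        return -1;

    for (size_t r = 0; r < height; ++r)
        for (size_t c = 0; c < width; ++c)
            *pitched_elem(base, pitch, r, c) = (float)(r * 100 + c);

    int mismatches = 0;
    for (size_t r = 0; r < height; ++r)
        for (size_t c = 0; c < width; ++c)
            if (*pitched_elem(base, pitch, r, c) != (float)(r * 100 + c))
                ++mismatches;

    free(base);
    return mismatches;
}
```

For example, pitched_roundtrip(5, 3, 32) treats each row of 5 floats as occupying 32 bytes, so the tail of every row is padding that is never touched. In a real kernel the same pointer expression would be used with the devPtr and pitch returned by cudaMallocPitch.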

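In memory allocated by cudaMallocPitch, the address of the first element of every row is guaranteed to be aligned. The number of elements per row is arbitrary, so widthofx * sizeof(element) is not necessarily a multiple of 32. (See section 5.1.2.1, Global Memory, in chapter 5, Performance Guidelines, of the programming guide: it gives the memory transaction sizes as 32 bytes (compute capability 1.2 and above), 64 bytes, and 128 bytes. A transaction is the width of a single access, so with misaligned rows part of that width is presumably wasted.) To keep the first element of each row aligned, cudaMallocPitch therefore allocates some extra bytes at the end of each row, so that widthofx * sizeof(element) + padding bytes is a multiple of 32 (aligned). The 32 here follows the transaction sizes above; the actual alignment is chosen by the runtime, so use the returned pitch rather than computing it yourself. As a result, computing the address of a[y][x] as
y * widthofx * sizeof(element) + x * sizeof(element)
is no longer correct. It should instead be
y * [widthofx * sizeof(element) + padding bytes] + x * sizeof(element)
and the pitch returned by the function is exactly widthofx * sizeof(element) + padding bytes.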
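
The arithmetic in this explanation can be checked with a few lines of plain C. This is only a sketch of the reasoning, not how the runtime actually picks the pitch: the round-up rule and the alignment of 32 are assumptions taken from the post, and padded_pitch, naive_offset and pitched_offset are illustrative names, not CUDA functions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed model: round the row width in bytes up to the next multiple of
   `alignment`. The real pitch is decided by the CUDA runtime. */
size_t padded_pitch(size_t width_bytes, size_t alignment)
{
    assert(alignment != 0);
    return (width_bytes + alignment - 1) / alignment * alignment;
}

/* Byte offset of a[y][x] if rows were packed with no padding. */
size_t naive_offset(size_t widthofx, size_t elem_size, size_t y, size_t x)
{
    return y * widthofx * elem_size + x * elem_size;
}

/* Byte offset of a[y][x] when each row occupies `pitch` bytes. */
size_t pitched_offset(size_t pitch, size_t elem_size, size_t y, size_t x)
{
    return y * pitch + x * elem_size;
}
```

With 5 elements of 4 bytes per row and an assumed alignment of 32, padded_pitch(20, 32) gives 32, i.e. 12 padding bytes per row. For a[2][3], naive_offset(5, 4, 2, 3) is 52 while pitched_offset(32, 4, 2, 3) is 76, so the unpadded formula points into the wrong place as soon as y > 0.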

[ This post was last edited by siheng303 on 2010-8-10 13:32 ]