Nsight Debug:
CUDA Memory Checker detected 5925 threads caused an access violation:
Launch Parameters
CUcontext = 060d2ef8
CUstream = 06b99990
CUmodule = 07cddf90
CUfunction = 07d26ea8
FunctionName = _Z6KernelPhS_iii
GridId = 1
gridDim = {28519,1,1}
blockDim = {1024,1,1}
sharedSize = 256
Parameters:
dpImage = 0x413e0000 0 ’
Memory Checker detected 5925 access violations.
error = access violation on store (global memory)
gridid = 1
blockIdx = {0,0,0}
threadIdx = {1,0,0}
address = 0x42fc2a30
accessSize = 1
CUDA grid launch failed: CUcontext: 101527288 CUmodule: 130932624 Function: _Z6KernelPhS_iii
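A quick sanity check on the numbers in the log, as a rough sketch only. It assumes the faulting store is aimed at the dpImage allocation, which the log does not confirm because the parameter list is cut off. The constant and function names are made up for illustration.

```cpp
#include <cstdint>

// Values copied from the Nsight log above.
constexpr std::uint64_t kImageBase  = 0x413e0000ULL; // dpImage
constexpr std::uint64_t kFaultAddr  = 0x42fc2a30ULL; // address of the failing store
constexpr std::uint64_t kGridBlocks = 28519;         // gridDim.x
constexpr std::uint64_t kBlockSize  = 1024;          // blockDim.x

// Byte offset of the failing store, measured from the start of dpImage.
// ASSUMPTION: the store targets the dpImage buffer.
constexpr std::uint64_t faultOffset() { return kFaultAddr - kImageBase; }

// Upper bound on dwEffWidth * biHeight implied by the grid size,
// since the grid is computed as (dwEffWidth * biHeight + 1023) / 1024.
constexpr std::uint64_t maxImageBytes() { return kGridBlocks * kBlockSize; }
```

Under that assumption, `faultOffset()` is 29,239,856 while `maxImageBytes()` is 29,203,456, so thread {1,0,0} of block {0,0,0} would be writing at least 36,400 bytes past the end of a buffer of `dwEffWidth * biHeight` bytes. If the store actually goes to a different buffer, this comparison does not apply.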
The device is a GT770.
The card has 2 GB of memory.
Thread configuration:
dim3 dimBlock(1024,1,1);
dim3 dimGrid((dwEffWidth * biHeight + 1023) / 1024, 1);
dwEffWidth = ((((ds.biBitCount * biWidth) + 31) / 32) * 4);
BYTE* pbBits = (BYTE*)hImage + *(DWORD*)hImage + ds.biClrUsed * sizeof(RGBQUAD);
pImage = new BYTE[dwEffWidth * biHeight];
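Because the grid size is rounded up to whole blocks, the last block usually contains threads whose global index is past the end of the buffer. Here is a host-side sketch of that arithmetic, assuming one thread per byte as the launch configuration suggests. The function names are hypothetical, and `threadInRange` only stands in for the guard a kernel would run as its first statement.

```cpp
#include <cassert>
#include <cstdint>

// Bytes per DIB scan line, padded to a multiple of 4
// (same formula as dwEffWidth in the post).
std::uint32_t strideBytes(std::uint32_t bitCount, std::uint32_t width) {
    return ((bitCount * width + 31) / 32) * 4;
}

// Number of blocks needed to cover totalBytes with one thread per byte.
std::uint32_t blocksFor(std::uint32_t totalBytes, std::uint32_t blockSize) {
    assert(blockSize > 0);
    return (totalBytes + blockSize - 1) / blockSize;
}

// Threads launched whose global index is >= totalBytes.
std::uint32_t surplusThreads(std::uint32_t totalBytes, std::uint32_t blockSize) {
    return blocksFor(totalBytes, blockSize) * blockSize - totalBytes;
}

// Host-side stand-in for the in-kernel guard:
//   idx = blockIdx.x * blockDim.x + threadIdx.x;  if (idx >= totalBytes) return;
bool threadInRange(std::uint32_t blockIdx, std::uint32_t blockSize,
                   std::uint32_t threadIdx, std::uint32_t totalBytes) {
    std::uint64_t idx = static_cast<std::uint64_t>(blockIdx) * blockSize + threadIdx;
    return idx < totalBytes;
}
```

For a made-up 24-bit image of 1001 x 7 pixels, `strideBytes(24, 1001)` is 3004, the buffer is 21028 bytes, `blocksFor(21028, 1024)` is 21, and `surplusThreads(21028, 1024)` is 476. Those 476 threads must not touch memory.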
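What the kernel does: each pixel of the image pImage is convolved with the corresponding template, and the matching pixel in the output image pbBits is updated with the result.
The statement where execution stops is: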
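For comparison, here is a minimal CPU reference of that step, which could be used to check the GPU output and to see where neighbour reads leave the image. It is only a sketch under assumptions not stated in the post: an 8-bit grayscale image, a 3x3 template, and neighbours outside the image clamped to the nearest edge pixel. `convolve3x3` and its parameters are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU reference: 3x3 convolution on an 8-bit grayscale image whose rows are
// stored with a padded stride. Out-of-image neighbours are clamped to the edge.
// ASSUMPTIONS: 8 bits per pixel, 3x3 template, clamped borders.
std::vector<std::uint8_t> convolve3x3(const std::vector<std::uint8_t>& src,
                                      int width, int height, int stride,
                                      const int (&tmpl)[3][3], int divisor) {
    assert(divisor != 0);
    std::vector<std::uint8_t> dst(src.size(), 0); // padding bytes stay 0
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int sum = 0;
            for (int ky = -1; ky <= 1; ++ky) {
                for (int kx = -1; kx <= 1; ++kx) {
                    // Clamp the neighbour coordinates so every read stays in bounds.
                    int sy = std::min(std::max(y + ky, 0), height - 1);
                    int sx = std::min(std::max(x + kx, 0), width - 1);
                    std::size_t si = static_cast<std::size_t>(sy) * stride + sx;
                    // Flipped template index: true convolution, not correlation.
                    sum += tmpl[1 - ky][1 - kx] * src[si];
                }
            }
            int v = std::min(std::max(sum / divisor, 0), 255);
            dst[static_cast<std::size_t>(y) * stride + x] = static_cast<std::uint8_t>(v);
        }
    }
    return dst;
}
```

The same two checks matter in a kernel: the output index `y * stride + x` must be inside the buffer, and every neighbour read must be clamped or skipped at the borders.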
effwdt = ((((d_biBitCount * d_bmpWidth ) + 31) / 32) * 4);
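I can't tell what is going on. I have also checked for out-of-bounds accesses and couldn't find a problem. Any pointers would be much appreciated.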