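A question for the experts here: I want to do a 3-D image convolution in Python. The image is a 3-D array of size 1024x1024x1024 and the convolution kernel is 17x17x177, but I have no idea how to approach it. Could anyone point me in the right direction?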
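1024/117 doesn't seem to come out to an integer, which I don't quite understand. Here is a CUDA kernel function I have, where img is 3-dimensional and kernel has at least three dimensions. Happy to compare notes.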
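// One thread per output element. Flat layouts, channel last:
//   img:    [height][width][ichannel]
//   kernel: [kernel_height][kernel_width][ichannel][ochannel]
//   result: [height][width][ochannel]
// Out-of-range taps count as zero, so the output keeps the input's height and width.
__global__ void conv(const float *img, const float *kernel, float *result, const float *bias,
                     const int height, const int width, const int kernel_height,
                     const int kernel_width, const int ochannel, const int ichannel)
{
    // Flatten the 2-D grid of 1-D blocks into one global thread index.
    const long long blockId = (long long)blockIdx.y * gridDim.x + blockIdx.x;
    const long long id = blockId * blockDim.x + threadIdx.x;
    // Bounds check in integer arithmetic; a float product loses precision for large sizes.
    if (id >= (long long)width * height * ochannel)
    {
        return;
    }
    const int row = (int)(id / ((long long)ochannel * width));
    const int col = (int)(id % ((long long)ochannel * width) / ochannel);
    const int cur_channel = (int)(id % ochannel);
    // Accumulate in a register instead of updating result[id] in global memory each step.
    float sum = 0.0f;
    for (int channel_rbg = 0; channel_rbg < ichannel; ++channel_rbg)
    {
        for (int i = 0; i < kernel_height; ++i)
        {
            for (int j = 0; j < kernel_width; ++j)
            {
                const int cur_row = row - kernel_height / 2 + i;
                const int cur_col = col - kernel_width / 2 + j;
                // Taps outside the image are zero padding and contribute nothing.
                if (cur_row < 0 || cur_col < 0 || cur_row >= height || cur_col >= width)
                {
                    continue;
                }
                const float img_value = img[((long long)cur_row * width + cur_col) * ichannel + channel_rbg];
                sum += img_value * kernel[((i * kernel_width + j) * ichannel + channel_rbg) * ochannel + cur_channel];
            }
        }
    }
    sum += bias[cur_channel];
    // ReLU: clamp negative outputs to zero.
    result[id] = sum < 0.0f ? 0.0f : sum;
}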
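For checking the kernel above on small inputs, a plain-Python reference may help. This is only a sketch: `conv_reference` is a name made up here, and it assumes the flat channel-last layouts that the kernel's index arithmetic implies (img as height x width x ichannel, kernel as kernel_height x kernel_width x ichannel x ochannel). Like the kernel, it handles two spatial dimensions plus channels, so a true 3-D convolution would still need a third spatial loop. It is also far too slow for a 1024x1024x1024 volume and is meant for validation only.

```python
def conv_reference(img, kernel, bias, height, width,
                   kernel_height, kernel_width, ochannel, ichannel):
    """CPU reference: zero-padded 2-D convolution with bias and ReLU.

    All arrays are flat Python lists using the same index formulas
    as the CUDA kernel (assumed layouts, channel last).
    """
    result = [0.0] * (height * width * ochannel)
    for row in range(height):
        for col in range(width):
            for oc in range(ochannel):
                total = 0.0
                for ic in range(ichannel):
                    for i in range(kernel_height):
                        for j in range(kernel_width):
                            cur_row = row - kernel_height // 2 + i
                            cur_col = col - kernel_width // 2 + j
                            # Taps outside the image act as zero padding.
                            if not (0 <= cur_row < height and 0 <= cur_col < width):
                                continue
                            img_value = img[(cur_row * width + cur_col) * ichannel + ic]
                            weight = kernel[((i * kernel_width + j) * ichannel + ic) * ochannel + oc]
                            total += img_value * weight
                total += bias[oc]
                # ReLU, as in the last step of the kernel.
                result[(row * width + col) * ochannel + oc] = max(total, 0.0)
    return result


# 3x3 single-channel image of ones with a 3x3 kernel of ones:
# each output counts how many taps fall inside the image.
ones = [1.0] * 9
print(conv_reference(ones, ones, [0.0], 3, 3, 3, 3, 1, 1))
# [4.0, 6.0, 4.0, 6.0, 9.0, 6.0, 4.0, 6.0, 4.0]
```

Copying the GPU result back to the host and comparing it element by element against this list, with a small tolerance for float32 rounding, is one way to catch indexing mistakes.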
Any progress on this? I've been stuck on the same problem as the original poster lately. Could we compare notes?