3D image convolution in Python

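Hi everyone, I want to do a 3D image convolution in Python. The image is a 3D array of size 1024x1024x1024 and the convolution kernel is 17x17x177, but I have no idea where to start. Could anyone point me in the right direction?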
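
One possible starting point, as a minimal sketch rather than a definitive answer: with a kernel this large, multiplying in the frequency domain is usually much cheaper than a direct sliding-window sum. The function name `conv3d_fft` is made up for illustration, the sizes are scaled down so it runs quickly, and only NumPy is assumed. It computes a true convolution (kernel flipped) with zero padding and "same" output size; `scipy.signal.fftconvolve(volume, kernel, mode="same")` performs the same operation if SciPy is available.

```python
import numpy as np

def conv3d_fft(volume, kernel):
    """'Same'-size 3D convolution via FFT, treating everything outside the volume as zero."""
    vol_shape = volume.shape
    ker_shape = kernel.shape
    # Size of the full linear convolution along each axis: N + K - 1.
    full_shape = tuple(n + k - 1 for n, k in zip(vol_shape, ker_shape))
    axes = (0, 1, 2)
    # Zero-padding both inputs to full_shape avoids circular wrap-around.
    spectrum = (np.fft.rfftn(volume, s=full_shape, axes=axes)
                * np.fft.rfftn(kernel, s=full_shape, axes=axes))
    full = np.fft.irfftn(spectrum, s=full_shape, axes=axes)
    # Crop the centre so the output has the same shape as the input volume.
    crop = tuple(slice((k - 1) // 2, (k - 1) // 2 + n)
                 for n, k in zip(vol_shape, ker_shape))
    return full[crop]

# Scaled-down stand-ins for the 1024x1024x1024 volume and the 17x17x177 kernel.
rng = np.random.default_rng(0)
volume = rng.standard_normal((32, 32, 32))
kernel = rng.standard_normal((5, 5, 9))
out = conv3d_fft(volume, kernel)
print(out.shape)  # (32, 32, 32)
```

Keep in mind that a 1024x1024x1024 volume in float32 is already 4 GiB, and the padded FFT buffers are larger still, so at full size this would probably have to run block by block or on a machine with a lot of memory.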

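1024/117 doesn't seem to come out as an integer, so I don't quite follow. I have a CUDA kernel here in which img is 3-dimensional and kernel has at least three dimensions. Happy to compare notes.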

// 2D multi-channel convolution with zero padding, followed by bias and ReLU.
// Layouts: img    [height][width][ichannel]
//          kernel [kernel_height][kernel_width][ichannel][ochannel]
//          result [height][width][ochannel]
// Each thread computes one element of result.
__global__ void conv(const float *img, const float *kernel, float *result, const float *bias,
                     const int height, const int width, const int kernel_height,
                     const int kernel_width, const int ochannel, const int ichannel)
{
    long long blockId = (long long)blockIdx.y * gridDim.x + blockIdx.x;
    long long id = blockId * blockDim.x + threadIdx.x;

    // Bounds check in integer arithmetic; a float product loses precision for large sizes.
    if (id >= (long long)width * height * ochannel)
    {
        return;
    }

    // Decode the flat index into (row, col, output channel).
    int row = id / (ochannel * width);
    int col = id % (ochannel * width) / ochannel;
    int cur_channel = id % ochannel;

    // Accumulate in a register and write to global memory once at the end.
    float sum = 0.0f;
    for (int channel_rbg = 0; channel_rbg < ichannel; ++channel_rbg)
    {
        for (int i = 0; i < kernel_height; ++i)
        {
            for (int j = 0; j < kernel_width; ++j)
            {
                int cur_row = row - kernel_height / 2 + i;
                int cur_col = col - kernel_width / 2 + j;
                // Pixels outside the image count as zero, so they contribute nothing.
                if (cur_row < 0 || cur_col < 0 || cur_row >= height || cur_col >= width)
                {
                    continue;
                }
                float img_value = img[cur_row * ichannel * width + cur_col * ichannel + channel_rbg];
                sum += img_value * kernel[i * ichannel * ochannel * kernel_width + j * ichannel * ochannel + channel_rbg * ochannel + cur_channel];
            }
        }
    }

    sum += bias[cur_channel];
    result[id] = sum < 0.0f ? 0.0f : sum;  // ReLU
}
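
If it helps with checking the CUDA kernel above from Python, here is a rough NumPy reference for the same computation. It assumes the layouts implied by the kernel's indexing (img as height x width x ichannel, kernel as kernel_height x kernel_width x ichannel x ochannel); the name `conv2d_hwc_reference` is made up for this sketch. Note that it is a 2D convolution over several channels with bias and ReLU, not a true 3D convolution, and that the filter is applied without flipping.

```python
import numpy as np

def conv2d_hwc_reference(img, kernel, bias):
    """CPU reference: zero-padded multi-channel 2D convolution + bias + ReLU."""
    img = np.asarray(img, dtype=np.float32)
    kernel = np.asarray(kernel, dtype=np.float32)
    bias = np.asarray(bias, dtype=np.float32)
    height, width, ichannel = img.shape
    kh, kw, _, ochannel = kernel.shape
    # Pad so that padded[row + i, col + j] equals
    # img[row - kh // 2 + i, col - kw // 2 + j], with zeros outside the image.
    padded = np.pad(img, ((kh // 2, kh - 1 - kh // 2),
                          (kw // 2, kw - 1 - kw // 2),
                          (0, 0)))
    out = np.zeros((height, width, ochannel), dtype=np.float32)
    for i in range(kh):
        for j in range(kw):
            patch = padded[i:i + height, j:j + width, :]   # (H, W, Cin)
            out += patch @ kernel[i, j]                    # (Cin, Cout) mixes channels
    out += bias                                            # one bias per output channel
    return np.maximum(out, 0.0)                            # ReLU

# Tiny example: 8x8 RGB image, 3x3 filter, 4 output channels.
rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8, 3))
kernel = rng.standard_normal((3, 3, 3, 4))
bias = rng.standard_normal(4)
print(conv2d_hwc_reference(img, kernel, bias).shape)  # (8, 8, 4)
```

Copying `result` back from the GPU, reshaping it to (height, width, ochannel) and comparing it with this output using `np.allclose` would be one way to validate the kernel on small inputs.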

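On the remark that 1024/117 is not an integer: for an ordinary stride-1 convolution the kernel size does not need to divide the image size, because the kernel slides one voxel at a time. A small sketch of the usual output-size arithmetic (the helper `conv_output_size` and its mode names are just for illustration, following the common "valid" / "same" / "full" convention):

```python
def conv_output_size(n, k, mode="valid"):
    """Output length along one axis for a stride-1 convolution of size n with a kernel of size k."""
    if mode == "valid":   # kernel must fit entirely inside the image
        return n - k + 1
    if mode == "same":    # zero-pad so the output matches the input
        return n
    if mode == "full":    # every position with any overlap
        return n + k - 1
    raise ValueError(f"unknown mode: {mode}")

# Sizes quoted in this thread: a 1024^3 volume with a 17x17x177 kernel.
print([conv_output_size(1024, k) for k in (17, 17, 177)])  # [1008, 1008, 848]
print(conv_output_size(1024, 117))                         # 908
print(conv_output_size(1024, 177, mode="full"))            # 1200
```

Divisibility only becomes relevant when the image is deliberately cut into non-overlapping tiles or a stride larger than 1 is used.
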
Any leads yet? I've recently run into the same problem as the original poster. Could we compare notes?