Hello moderator,
I am computing the sum with the approach below. Here is the code:
__global__ void Mean_680(float *p,float *d_mean,int Col_num,int Row_num)
{
    int y_id=blockIdx.y*blockDim.y+threadIdx.y;
    int x_id=blockIdx.x*blockDim.x+threadIdx.x;
    //Simple but inefficient version (gives the correct result)
    //if (x_id==0&&y_id<75)
    //{
    //    for (int i=1;i<56644;i++)
    //    {
    //        p[y_id*56644]+=p[i+y_id*56644];
    //    }
    //}
    //Reduction version (gives the wrong result)
    int a=Col_num/2;   //number of pairs to add in this step
    int b=Col_num%2;   //1 if the current length is odd, else 0
    int c;
    while (a!=0)
    {
        if (x_id<a&&y_id<Row_num)
        {
            p[x_id+y_id*Col_num]=p[x_id+y_id*Col_num]+p[x_id+a+b+y_id*Col_num];
        }
        c=a+b;         //length of the row that remains after this step
        a=c/2;
        b=c%2;
        __syncthreads();
    }
}
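Launch configuration: threaddim(512,1); blockdim(111,75), i.e. 512x1 threads per block and a 111x75 grid of blocks, with col_num=56644 and row_num=75. p holds 75 rows by 56644 columns of float data. The code is meant to sum along the column direction, so that the first column of p ends up holding the sum of all columns in each row. The commented-out code in the kernel gives the correct result. I am now trying the reduction approach instead, and its result is incorrect. Could the moderator please take a look? Thanks.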
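One thing I suspect, stated as an assumption rather than a confirmed diagnosis: `__syncthreads()` only synchronizes the threads inside one block, while each row here is spread over 111 blocks in x, so one block may start the next reduction step before another block has finished writing the values it needs. Below is a minimal sketch of a variant that runs one reduction step per kernel launch, relying on the fact that launches on the same stream execute in order. The names `reduce_step` and `row_sums` are hypothetical and not part of my original code, and error checking on the CUDA calls is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <vector>

// One reduction step: for each row, add the upper part of the live range
// onto the lower part. `half` is the number of pairs, `offset` = half + (len % 2).
__global__ void reduce_step(float *p, int col_num, int row_num, int half, int offset)
{
    int x_id = blockIdx.x * blockDim.x + threadIdx.x;
    int y_id = blockIdx.y * blockDim.y + threadIdx.y;
    if (x_id < half && y_id < row_num)
    {
        size_t base = (size_t)y_id * col_num;
        p[base + x_id] += p[base + x_id + offset];
    }
}

// Hypothetical host helper: returns the sum of every row of a
// row_num x col_num matrix stored row-major in h_p.
std::vector<float> row_sums(const std::vector<float> &h_p, int col_num, int row_num)
{
    assert(h_p.size() == (size_t)col_num * row_num);

    float *d_p = nullptr;
    size_t bytes = h_p.size() * sizeof(float);
    cudaMalloc(&d_p, bytes);
    cudaMemcpy(d_p, h_p.data(), bytes, cudaMemcpyHostToDevice);

    dim3 threads(512, 1);
    int len = col_num;                 // live elements per row
    while (len > 1)
    {
        int half = len / 2;            // same as `a` in the original kernel
        int offset = half + len % 2;   // same as `a + b` in the original kernel
        dim3 blocks((half + threads.x - 1) / threads.x, row_num);
        // Each launch finishes before the next one starts (same stream),
        // which acts as the grid-wide barrier between steps.
        reduce_step<<<blocks, threads>>>(d_p, col_num, row_num, half, offset);
        len = offset;
    }

    // Copy only column 0 of every row back to the host (strided copy).
    std::vector<float> sums(row_num);
    cudaMemcpy2D(sums.data(), sizeof(float),
                 d_p, (size_t)col_num * sizeof(float),
                 sizeof(float), row_num, cudaMemcpyDeviceToHost);
    cudaFree(d_p);
    return sums;
}
```

With my sizes this would be called as `row_sums(h_p, 56644, 75)`. The additions happen in a different order than in the serial loop, so the float results may differ slightly in the last digits.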