I'm new to CUDA and would like to ask a question:
/** a node in a k-d tree */
#pragma pack(16)
struct kd_node
{
int index;
int index_left;
int index_right;
int ki; /**< partition key index */
double kv; /**< partition key value */
int leaf; /**< 1 if node is a leaf, 0 otherwise */
struct feature* features; /**< features at this node */
int n; /**< number of features */
struct kd_node* kd_left; /**< left child */
struct kd_node* kd_right; /**< right child */
};
#pragma pack()
struct feature
{
int d; /**< descriptor length */
double descr[FEATURE_MAX_D]; /**< descriptor */
void* feature_data; /**< user-definable data */
};
Then I define two struct pointers:
struct kd_node* d_expl;
cudaMalloc((void**)&d_expl,n2*sizeof(struct kd_node));
struct feature* d_tree_feat;
cudaMalloc((void**)&d_tree_feat,n2*sizeof(struct feature));
In the __global__ function (the kernel) I write:
int tid=threadIdx.x+blockIdx.x*blockDim.x;
d_tree_feat[tid]=d_expl[tid].features[0];
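This doesn't seem to work. I suspect it is because I never allocated device memory for the features member inside each element of d_expl. If that is the cause, how should I allocate device memory for it?
Can anyone help? It's urgent: I have been stuck on this problem for a week and have no more time to lose. Thanks in advance.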
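For what it's worth, here is one possible way to handle the nested pointer, as a minimal sketch rather than a definitive fix. cudaMalloc on the kd_node array only reserves room for the nodes themselves; the features pointer stored in each node still holds whatever address was copied in (usually a host address), and a kernel cannot dereference that. The sketch allocates one device pool for all features, copies each node's features into it, patches the features pointer in a host-side copy of each node, and only then copies the nodes to the device. Everything not in the post is an assumption: FEATURE_MAX_D = 128, the helper names upload_nodes, first_feature and run_demo, and the idea that the host nodes sit in one contiguous array and that index_left/index_right are used to walk the tree on the device.

```cuda
#include <vector>
#include <cuda_runtime.h>

#define FEATURE_MAX_D 128  // assumed; the post does not show this value

struct feature {
    int d;                        // descriptor length
    double descr[FEATURE_MAX_D];  // descriptor
    void* feature_data;           // host pointer; not valid on the device
};

struct kd_node {
    int index, index_left, index_right;
    int ki;
    double kv;
    int leaf;
    struct feature* features;  // must hold a DEVICE address inside kernels
    int n;
    struct kd_node* kd_left;
    struct kd_node* kd_right;
};

// Same copy as in the post, with bounds and empty-node guards added.
__global__ void first_feature(const kd_node* nodes, feature* out, int count) {
    int tid = threadIdx.x + blockIdx.x * blockDim.x;
    if (tid < count && nodes[tid].n > 0)
        out[tid] = nodes[tid].features[0];
}

// Hypothetical helper: uploads `count` host nodes plus all their features.
// Returns the device node array; *d_pool_out receives the feature pool.
// Both must later be released with cudaFree. Error checks omitted for brevity.
kd_node* upload_nodes(const kd_node* h_nodes, int count, feature** d_pool_out) {
    size_t total = 0;
    for (int i = 0; i < count; ++i) total += h_nodes[i].n;

    feature* d_pool = nullptr;  // one allocation for every node's features
    cudaMalloc((void**)&d_pool, (total ? total : 1) * sizeof(feature));

    std::vector<kd_node> staged(h_nodes, h_nodes + count);  // host-side copies
    size_t offset = 0;
    for (int i = 0; i < count; ++i) {
        int n = h_nodes[i].n;
        if (n > 0)
            cudaMemcpy(d_pool + offset, h_nodes[i].features,
                       n * sizeof(feature), cudaMemcpyHostToDevice);
        // Patch the pointer so it refers to device memory, not host memory.
        staged[i].features = (n > 0) ? d_pool + offset : nullptr;
        // Host child pointers are meaningless on the device; the index
        // fields are assumed to be what kernels use to walk the tree.
        staged[i].kd_left = staged[i].kd_right = nullptr;
        offset += n;
    }

    kd_node* d_nodes = nullptr;
    cudaMalloc((void**)&d_nodes, count * sizeof(kd_node));
    cudaMemcpy(d_nodes, staged.data(), count * sizeof(kd_node),
               cudaMemcpyHostToDevice);
    *d_pool_out = d_pool;
    return d_nodes;
}

// Small self-check: returns 0 if every node's first feature came back intact.
int run_demo() {
    const int count = 2;
    std::vector<feature> f0(1), f1(3);  // value-initialized, so all zeros
    f0[0].d = 1; f0[0].descr[0] = 1.5;
    f1[0].d = 1; f1[0].descr[0] = 2.5;
    kd_node h_nodes[count] = {};
    h_nodes[0].features = f0.data(); h_nodes[0].n = 1;
    h_nodes[1].features = f1.data(); h_nodes[1].n = 3;

    feature* d_pool = nullptr;
    kd_node* d_nodes = upload_nodes(h_nodes, count, &d_pool);
    feature* d_out = nullptr;
    cudaMalloc((void**)&d_out, count * sizeof(feature));
    first_feature<<<1, 32>>>(d_nodes, d_out, count);

    feature h_out[count];
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out); cudaFree(d_nodes); cudaFree(d_pool);
    if (err != cudaSuccess) return 1;
    return (h_out[0].descr[0] == 1.5 && h_out[1].descr[0] == 2.5) ? 0 : 2;
}
```

Calling run_demo() from main should return 0 on a machine with a working CUDA device. A single pool is used instead of one cudaMalloc per node because it needs only one allocation and keeps the features contiguous; storing an offset into the pool in place of the pointer would probably work just as well. Note that feature_data is copied as a raw host address and must not be dereferenced in a kernel.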