Questions about the C++ Integration sample in the SDK

system · August 23, 2010, 11:21

First of all, what does the following code mean?

    g_data[tid] = ((((data <<  0) >> 24) - 10) << 24)
                | ((((data <<  8) >> 24) - 10) << 16)
                | ((((data << 16) >> 24) - 10) <<  8)
                | ((((data << 24) >> 24) - 10) <<  0);

The description of this sample actually says:
This example demonstrates how to integrate CUDA into an existing C++ application, i.e. the CUDA entry point on host side is only a function which is called from C++ code and only the file containing this function is compiled with nvcc. It also demonstrates that vector types can be used from cpp.
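The directory contains four files: main.cpp, cppIntegration_gold.cpp, cppIntegration_kernel.cu, cppIntegration.cu
Based on this English description, here is how I understand it: the first two files are the C++ part, main only contains an entry point into CUDA, and when the program calls runtest it enters the last two files. Is that right?
I also have a question about this fragment in it: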

    if( cutCheckCmdLineFlag(argc, (const char**)argv, "device") )
        cutilDeviceInit(argc, (char**)argv);
    else
        cudaSetDevice( cutGetMaxGflopsDeviceId() );

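The argc and argv used here are the command-line arguments, right? As far as I remember, I did not pass anything. Also, the call from main to runtest, which enters the CUDA side, takes two extra parameters, const int argc and const char** argv. What are they for?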
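On the argc/argv point: main always receives these from the runtime, even when nothing is typed after the program name. In a normal hosted environment argv[0] holds the program name, so argc is at least 1. The sample appears to forward them only so the CUDA side can look for an optional "device" flag. Below is a minimal sketch of that idea. `hasFlag` and `chooseDevicePolicy` are hypothetical names made up for illustration; `hasFlag` is a stand-in for cutCheckCmdLineFlag, not its real implementation, and the flag spellings it accepts are an assumption.

```cpp
#include <cstring>
#include <string>

// Hypothetical stand-in for cutCheckCmdLineFlag: returns true if any argument
// after the program name, with leading dashes stripped, starts with `name`.
// The real cutil function may accept different spellings.
bool hasFlag(int argc, const char** argv, const char* name) {
    for (int i = 1; i < argc; ++i) {       // argv[0] is the program name
        const char* arg = argv[i];
        while (*arg == '-') ++arg;         // skip leading dashes
        if (std::strncmp(arg, name, std::strlen(name)) == 0) return true;
    }
    return false;
}

// Mirrors the if/else in the sample: use the device named on the command
// line if there is one, otherwise fall back to an automatic choice.
std::string chooseDevicePolicy(int argc, const char** argv) {
    return hasFlag(argc, argv, "device") ? "use device from command line"
                                         : "pick fastest device automatically";
}
```

Calling `chooseDevicePolicy(argc, argv)` from main with no extra arguments takes the else branch, which matches what happens when the sample is started without any options.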

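The last thing I don't understand is that the program ends up printing HELLO WORLD. Where does that come from? Is it what the pile of numeric transformations in the first question produces?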

system · August 23, 2010, 11:34

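Whoa, I looked into it a bit.

I still don't understand why the first question's code does all that left and right shifting, but I checked the ASCII table.
This is the input: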
    int len = 16;
    // the data has some zero padding at the end so that the size is a multiple of
    // four, this simplifies the processing as each thread can process four
    // elements (which is necessary to avoid bank conflicts) but no branching is
    // necessary to avoid out of bounds reads
    char str[] = { 82, 111, 118, 118, 121, 42, 97, 121, 124, 118, 110, 56,
                   10, 10, 10, 10};

The first twelve values do match up with the output. The extra four are just there for alignment, right?
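The match can be checked on the CPU. Subtracting 10 from each of the sixteen values and reading the results as ASCII gives the printed text, and the four trailing 10s become zero bytes that end the string. A small sketch follows; `decode` and `decodeSample` are names made up for this illustration and are not part of the SDK sample.

```cpp
#include <string>

// Subtract 10 from every byte and return the result as a string that stops
// at the first zero byte.
std::string decode(const char* encoded, int len) {
    std::string out;
    for (int i = 0; i < len; ++i) {
        char c = static_cast<char>(encoded[i] - 10);
        if (c == '\0') break;   // the padding bytes (10) turn into terminators
        out.push_back(c);
    }
    return out;
}

// Applies decode() to the input array quoted in the post above.
std::string decodeSample() {
    const char str[] = { 82, 111, 118, 118, 121, 42, 97, 121, 124, 118, 110, 56,
                         10, 10, 10, 10 };
    return decode(str, 16);
}
```

For example, 82 - 10 = 72 is 'H' and 111 - 10 = 101 is 'e', so `decodeSample()` returns "Hello World.".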

system · August 24, 2010, 19:54

    g_data[tid] = ((((data <<  0) >> 24) - 10) << 24)
                | ((((data <<  8) >> 24) - 10) << 16)
                | ((((data << 16) >> 24) - 10) <<  8)
                | ((((data << 24) >> 24) - 10) <<  0);

If data is an int, this is equivalent to:

    union {
        int data;
        char c[4];
    } tmp;
    tmp.data = data;   // view the same 32 bits as four separate bytes
    tmp.c[0] -= 10;
    tmp.c[1] -= 10;
    tmp.c[2] -= 10;
    tmp.c[3] -= 10;
    g_data[tid] = tmp.data;
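The equivalence can be tried out on the host. This is a rough sketch under two assumptions that are mine, not the sample's: it uses an unsigned 32-bit type instead of int so the shifts are well defined in standard C++, and it assumes every byte is at least 10 so that no subtraction borrows into a neighbouring byte. The function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// The shift expression from the kernel: each (data << k) >> 24 isolates one
// byte, 10 is subtracted, and the byte is shifted back into its position.
std::uint32_t subtractViaShifts(std::uint32_t data) {
    return ((((data <<  0) >> 24) - 10) << 24)
         | ((((data <<  8) >> 24) - 10) << 16)
         | ((((data << 16) >> 24) - 10) <<  8)
         | ((((data << 24) >> 24) - 10) <<  0);
}

// The per-byte version: copy the word into four bytes, subtract 10 from
// each, and copy the bytes back into a word.
std::uint32_t subtractPerByte(std::uint32_t data) {
    unsigned char c[4];
    std::memcpy(c, &data, sizeof data);
    for (unsigned char& b : c) b = static_cast<unsigned char>(b - 10);
    std::uint32_t out;
    std::memcpy(&out, c, sizeof out);
    return out;
}
```

Both functions return the same word for the input bytes in this thread. The shift form handles four characters with one 32-bit load and one 32-bit store, which fits the source comment about each thread processing four elements.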

system · August 25, 2010, 07:33


Is the part you wrote at the end a definition from a header file?