A question about bank conflicts

system · April 30, 2010, 03:48

I have a question about bank conflicts.
I am using a GeForce 9500, which should have 16 banks.
Take the following example:
__shared__ float shared[32];
float data = shared[BaseIndex + s * tid];
If BaseIndex = 0 and s = 1,
the accesses should be:
tid shared[BaseIndex + s * tid]
0 shared[0]
1 shared[1]
.
.
.
15 shared[15]
Indices 0 to 15 each fall in a different bank.
What does the access pattern look like if s = 3?
tid shared[BaseIndex + s * tid]
0 shared[0]
1 shared[3]
.
.
.
5 shared[15]
6 shared[16]
.
.
10 shared[30]
11 shared[33] // index is greater than 31; won't this cause an error?
.
.
.
15 shared[45]
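
The stride question can be checked with a small host-side model. This is only a sketch under stated assumptions: 16 banks of 32-bit words, successive words mapped to successive banks, and conflicts counted within one half-warp of 16 threads, which is the usual description of compute capability 1.x parts such as the GeForce 9500. The names `bank_of_word` and `max_conflict_degree` are made up for illustration. The model ignores array bounds: CUDA does not bounds-check shared arrays, so `shared[33]` on a 32-element array is an out-of-bounds access with undefined behavior, not a reported error, and the array would need at least 46 elements for s = 3.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Assumption: compute capability 1.x shared memory, i.e. 16 banks of
// 32-bit words, with conflicts resolved per half-warp of 16 threads.
constexpr int kNumBanks = 16;
constexpr int kHalfWarp = 16;

// Bank holding the 32-bit word at the given index of a shared float array.
int bank_of_word(int word_index) {
    assert(word_index >= 0);
    return word_index % kNumBanks;
}

// Largest number of threads in one half-warp that hit the same bank when
// thread tid reads word base_index + stride * tid. A result of 1 means the
// access is conflict-free. stride must be >= 1 so every thread reads a
// distinct word (the same-word broadcast case is not modeled).
int max_conflict_degree(int base_index, int stride) {
    assert(stride >= 1);
    std::array<int, kNumBanks> hits{};  // zero-initialized per-bank counters
    for (int tid = 0; tid < kHalfWarp; ++tid) {
        ++hits[bank_of_word(base_index + stride * tid)];
    }
    return *std::max_element(hits.begin(), hits.end());
}
```

Under these assumptions, `max_conflict_degree(0, 1)` and `max_conflict_degree(0, 3)` both come out as 1, because 3 shares no common factor with 16 and the 16 threads land on 16 different banks, while `max_conflict_degree(0, 2)` is 2.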
If the float type above is changed to
struct type { float x, y, z; };
tid shared[BaseIndex + s * tid] bank
0 shared[0] 0, 1, 2
1 shared[3] 3, 4, 5
.
.
.
5 shared[15] 15, 0, 1
6 shared[16] 2, 3, 4
.
.
10 shared[30]
11 shared[33]
.
.
.
15 shared[45]
Aren't some banks already being accessed more than once here?
Why, then, does NVIDIA_CUDA_ProgrammingGuide.pdf, section G3.3.4, say:
Three separate reads without bank conflicts if type is defined as

struct type { float x, y, z; };
since each member is accessed with an odd stride of three 32-bit words;
I have read both the English and the Chinese versions and still don't understand it.
I would appreciate any guidance.
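
One possible reading of the quoted passage, offered as a sketch rather than an authoritative answer: the struct load is performed as three separate 32-bit reads (x, then y, then z), and bank conflicts are counted within each read across the half-warp, not across the three reads issued by one thread. The model below assumes 16 banks, a half-warp of 16 threads, and that thread tid reads element `shared[base_index + tid]`, so the same member of consecutive elements is three words apart. The helper names are hypothetical.

```cpp
#include <array>
#include <cassert>

// Assumption: 16 banks of 32-bit words, conflicts counted per half-warp.
constexpr int kNumBanks = 16;
constexpr int kHalfWarp = 16;
constexpr int kWordsPerStruct = 3;  // struct type { float x, y, z; }

// Banks touched by the half-warp when every thread reads the same member
// (0 = x, 1 = y, 2 = z) of its own element shared[base_index + tid].
std::array<int, kHalfWarp> banks_for_member(int base_index, int member) {
    assert(base_index >= 0 && member >= 0 && member < kWordsPerStruct);
    std::array<int, kHalfWarp> banks{};
    for (int tid = 0; tid < kHalfWarp; ++tid) {
        // Word offset of this member inside the flattened array of floats.
        int word = kWordsPerStruct * (base_index + tid) + member;
        banks[tid] = word % kNumBanks;
    }
    return banks;
}

// True when no two threads in the half-warp land on the same bank.
bool is_conflict_free(const std::array<int, kHalfWarp>& banks) {
    std::array<bool, kNumBanks> seen{};  // all false initially
    for (int b : banks) {
        if (seen[b]) return false;
        seen[b] = true;
    }
    return true;
}
```

In this model each of the three member reads is conflict-free on its own. Thread 5 does touch banks 15, 0 and 1 in total, and thread 0 also touches banks 0 and 1, but those accesses belong to different reads (for example, thread 0 reads bank 0 for x while thread 5 reads bank 0 for y), so they never compete in the same request.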