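Multi-GPU parallel computing in MATLAB works by assigning each GPU to one CPU core (thread) of the machine, which MATLAB calls a worker; parallel computing across workers in MATLAB in turn revolves around the pool. A single computer has only a limited number of CPU cores and GPUs, but MATLAB's MDCS (MATLAB Distributed Computing Server) service can link several computers together. You then create a parallel pool, assign a GPU to each worker, and use MATLAB's parallel constructs such as parfor and spmd to run the multi-GPU computation. See http://developer.nvidia-china.co … 7994&extra=page%3D1 for how to set up a distributed cluster.
If you have used MATLAB's CPU parallelism you will already know matlabpool. At present, using multiple GPUs in MATLAB requires a pool with multiple workers, one worker per GPU, that is, each CPU worker drives exactly one GPU. For example: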
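% Open a pool with two workers
matlabpool 2
% Show the GPU that each worker selected by default
spmd
    gpuDevice
end
% Switch worker 1 to the GPU with index 2; worker 2 keeps its default
spmd
    if labindex == 1
        gpuDevice(2);
    end
end
% Show the selection again to confirm the change
spmd
    gpuDevice
end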
If the computer has two GPUs, the final gpuDevice call prints the following:
Lab 1:
ans =
CUDADevice with properties:
Name: 'Quadro FX 370'
Index: 2
ComputeCapability: '1.1'
SupportsDouble: 0
DriverVersion: 5.5000
ToolkitVersion: 5
MaxThreadsPerBlock: 512
MaxShmemPerBlock: 16384
MaxThreadBlockSize: [512 512 64]
MaxGridSize: [65535 65535 1]
SIMDWidth: 32
TotalMemory: 268435456
FreeMemory: NaN
MultiprocessorCount: 2
ClockRateKHz: 720000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 0
DeviceSelected: 1
Lab 2:
ans =
CUDADevice with properties:
Name: 'Tesla K20c'
Index: 1
ComputeCapability: '3.5'
SupportsDouble: 1
DriverVersion: 5.5000
ToolkitVersion: 5
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 5.0330e+09
FreeMemory: 4.9166e+09
MultiprocessorCount: 13
ClockRateKHz: 705500
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 0
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
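The one-worker-per-GPU mapping described above can be generalized. The following is a minimal sketch, not the original author's code. It assumes that Parallel Computing Toolbox is installed, that every GPU in the machine is supported by MATLAB (unlike the Quadro FX 370 in the listing, which reports DeviceSupported: 0), and that the release provides parpool, which replaced matlabpool from R2013b onward. The variable names nGpus, partial and total are purely illustrative.

```matlab
% Sketch only: one worker per GPU, each worker computes on its own device.
nGpus = gpuDeviceCount;            % number of GPUs visible to MATLAB
parpool(nGpus);                    % on older releases: matlabpool with the same size

spmd
    gpuDevice(labindex);           % worker k selects the GPU with index k
    x = gpuArray(rand(1000, 1));   % copy this worker's data to its GPU
    partial = gather(sum(x));      % reduce on the GPU, bring the scalar back
end

% Outside spmd, partial is a Composite: partial{k} is the value from worker k.
total = 0;
for k = 1:nGpus
    total = total + partial{k};
end

delete(gcp('nocreate'));           % close the pool when finished
```

Selecting the device by labindex keeps the assignment deterministic, so no two workers contend for the same GPU as long as the pool size does not exceed gpuDeviceCount.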