parallel.gpu.CUDADevice handle
Package: parallel.gpu
Properties:
Name: 'Quadro 600'
Index: 1
ComputeCapability: '2.1'
SupportsDouble: 1
DriverVersion: 4
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [65535 65535]
SIMDWidth: 32
TotalMemory: 1.0417e+009
FreeMemory: 949575680
MultiprocessorCount: 2
ClockRateKHz: 1280000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
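Background: I have an NVIDIA graphics card with the configuration shown above and want to use it for large-scale sparse matrix-vector multiplication (matrix order 20 million; the matrices are of both structured and unstructured types), for example inside a CG (conjugate gradient) solve.
Question: In theory, what is the maximum speedup this NVIDIA card can achieve over an ordinary 3.4 GHz quad-core CPU? A walk-through of the theoretical analysis would be appreciated.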
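One way to frame the analysis is a back-of-the-envelope estimate like the sketch below. It is not a definitive answer. It takes MultiprocessorCount (2), ClockRateKHz (1280000) and FreeMemory (949575680) from the listing above. Everything else is an assumption to replace with your own numbers: 48 CUDA cores per multiprocessor (typical for compute capability 2.1), CSR storage with 8-byte values and 4-byte indices, 7 nonzeros per row, and the memory bandwidth figures, which are not in the listing and must come from the data sheets or a measurement. Sparse matrix-vector multiplication is usually limited by memory bandwidth rather than arithmetic, so the bandwidth ratio is a more realistic ceiling than the ratio of peak FLOPS. The function names are hypothetical helpers, not part of any library.

```python
# Rough upper-bound estimate for SpMV on a GPU versus a CPU.
# Values marked ASSUMPTION do not appear in the device listing.

def peak_gflops(sm_count, cores_per_sm, clock_ghz, flops_per_core_per_cycle=2):
    """Peak single-precision GFLOPS, counting one fused multiply-add as 2 flops."""
    return sm_count * cores_per_sm * clock_ghz * flops_per_core_per_cycle

def csr_bytes(n, nnz_per_row, value_bytes=8, index_bytes=4):
    """Bytes for an n-by-n CSR matrix plus the input and output vectors."""
    nnz = n * nnz_per_row
    matrix = nnz * (value_bytes + index_bytes) + (n + 1) * index_bytes
    vectors = 2 * n * value_bytes
    return matrix + vectors

def bandwidth_bound_speedup(gpu_bw_gb_s, cpu_bw_gb_s):
    """If both sides are memory-bound, the speedup ceiling is the bandwidth ratio."""
    return gpu_bw_gb_s / cpu_bw_gb_s

if __name__ == "__main__":
    # From the listing: MultiprocessorCount = 2, ClockRateKHz = 1280000.
    # ASSUMPTION: 48 cores per multiprocessor for compute capability 2.1.
    print("GPU peak SP GFLOPS:", peak_gflops(2, 48, 1.28))

    # ASSUMPTION: 7 nonzeros per row (e.g. a 7-point stencil), double precision.
    need = csr_bytes(20_000_000, 7)
    free = 949575680  # FreeMemory from the listing
    print("Bytes needed:", need, "fits in free memory:", need <= free)

    # ASSUMPTION: placeholder bandwidths in GB/s; substitute real figures.
    gpu_bw, cpu_bw = 25.6, 21.0
    print("Bandwidth-bound speedup ceiling:", bandwidth_bound_speedup(gpu_bw, cpu_bw))
```

Under these assumptions the matrix and vectors need roughly 2 GB, about twice the reported free memory, so the problem would have to be split or stored more compactly (for example in single precision, or matrix-free for the structured case) before any speedup question applies.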