OpenCL Bandwidth Test
This is a simple test program to measure the memcopy bandwidth of the GPU. It can currently measure device-to-device copy bandwidth, and host-to-device and device-to-host copy bandwidth for pageable and page-locked memory, using memory-mapped or direct access.
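The memory transfers themselves are ordinary OpenCL code, but how is the timing done? It is implemented in shared/src/shrUtils.cpp. The excerpt below is the branch that is compiled when the Windows switch (_WIN32) is defined.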
// Helper function to return precision delta time for 3 counters since last call based upon host high performance counter
// *********************************************************************
double shrDeltaT(int iCounterID = 0)
{
    // local var for computation of microseconds since last call
    double DeltaT;

    #ifdef _WIN32 // Windows version of precision host timer

        // Variables that need to retain state between calls
        static LARGE_INTEGER liOldCount[3] = { {0, 0}, {0, 0}, {0, 0} };

        // locals for new count, new freq and new time delta
        LARGE_INTEGER liNewCount, liFreq;
        if (QueryPerformanceFrequency(&liFreq))
        {
            // Get new counter reading
            QueryPerformanceCounter(&liNewCount);

            if (iCounterID >= 0 && iCounterID <= 2)
            {
                // Calculate time difference for timer 0. (zero when called the first time)
                DeltaT = liOldCount[iCounterID].LowPart ? (((double)liNewCount.QuadPart - (double)liOldCount[iCounterID].QuadPart) / (double)liFreq.QuadPart) : 0.0;

                // Reset old count to new
                liOldCount[iCounterID] = liNewCount;
            }
            else
            {
                // Requested counter ID out of range
                DeltaT = -9999.0;
            }

            // Returns time difference in seconds since the last call
            return DeltaT;
        }
        else
        {
            // No high resolution performance counter
            return -9999.0;
        }
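DeltaT is computed from the values returned by QueryPerformanceFrequency and QueryPerformanceCounter: the difference between the new and the previous counter reading is divided by the counter frequency, which gives the elapsed time in seconds. The time taken by the memory transfer is measured this way, and the bandwidth is calculated from it.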
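To make the "measure the time, then derive the bandwidth" step concrete, here is a minimal sketch of the same pattern. It is not the SDK's code: deltaSeconds and bandwidthMBps are hypothetical names, std::chrono::steady_clock stands in for QueryPerformanceCounter so that the sketch is portable, an explicit flag replaces the LowPart check for the first call, and treating 1 MB as 2^20 bytes is an assumption.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical portable counterpart of shrDeltaT: returns the seconds elapsed
// since the previous call with the same counter ID (0.0 on the first call),
// or -9999.0 if the counter ID is out of range.
double deltaSeconds(int counterId = 0)
{
    using Clock = std::chrono::steady_clock;

    // State retained between calls, one slot per counter.
    static Clock::time_point last[3];
    static bool started[3] = { false, false, false };

    if (counterId < 0 || counterId > 2) {
        return -9999.0;  // requested counter ID out of range
    }

    const Clock::time_point now = Clock::now();
    double delta = 0.0;
    if (started[counterId]) {
        // duration<double> with the default period is expressed in seconds.
        delta = std::chrono::duration<double>(now - last[counterId]).count();
    }

    // Reset the stored reading so the next call measures from this point.
    last[counterId] = now;
    started[counterId] = true;
    return delta;
}

// Converts a timed run into MB/s, assuming 1 MB = 2^20 bytes.
// Returns 0.0 when the elapsed time is not positive.
double bandwidthMBps(std::size_t bytesPerCopy, int iterations, double seconds)
{
    if (seconds <= 0.0) {
        return 0.0;
    }
    const double totalBytes = static_cast<double>(bytesPerCopy) * iterations;
    return totalBytes / (seconds * 1048576.0);
}
```

In a benchmark, one would call deltaSeconds(0) just before the copy loop, wait for the queued transfers to complete (for example with clFinish), call deltaSeconds(0) again, and pass the returned value to bandwidthMBps together with the buffer size and the iteration count. Copying 32 MB 100 times in 0.5 seconds, for instance, works out to 6400 MB/s.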