FPGA Development Diary

Article index by category: https://msyksphinz.github.io/github_pages , English version: https://fpgadevdiary.hatenadiary.com/

Trying the OpenCL Code Examples (BandwidthTest)

developer.nvidia.com

OpenCL Bandwidth Test 
This is a simple test program to measure the memcopy bandwidth of the GPU. It currently is capable of measuring device to device copy bandwidth, host to device and device to host copy bandwidth for pageable and page-locked memory, memory mapped and direct access.

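The memory transfers themselves are ordinary OpenCL code, but how is the elapsed time measured? The timing helper is implemented in shared/src/shrUtils.cpp. The excerpt below is the code path that is compiled when the Windows switch (_WIN32) is defined.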

// Helper function to return precision delta time for 3 counters since last call based upon host high performance counter
// *********************************************************************
double shrDeltaT(int iCounterID = 0)
{
    // local var for computation of seconds since last call
    double DeltaT;

    #ifdef _WIN32 // Windows version of precision host timer

        // Variables that need to retain state between calls
        static LARGE_INTEGER liOldCount[3] = { {0, 0}, {0, 0}, {0, 0} };

        // locals for new count and counter frequency
        LARGE_INTEGER liNewCount, liFreq;
        if (QueryPerformanceFrequency(&liFreq))
        {
            // Get new counter reading
            QueryPerformanceCounter(&liNewCount);

            if (iCounterID >= 0 && iCounterID <= 2)
            {
                // Calculate time difference for the selected timer (zero when called the first time)
                DeltaT = liOldCount[iCounterID].LowPart ? (((double)liNewCount.QuadPart - (double)liOldCount[iCounterID].QuadPart) / (double)liFreq.QuadPart) : 0.0;
                // Reset old count to new
                liOldCount[iCounterID] = liNewCount;
            }
            else
            {
                // Requested counter ID out of range
                DeltaT = -9999.0;
            }

            // Returns time difference in seconds since the last call
            return DeltaT;
        }
        else
        {
            // No high resolution performance counter
            return -9999.0;
        }

    // (excerpt ends here; the non-Windows branch and the end of the function are not shown)

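DeltaT is computed from values obtained with QueryPerformanceFrequency and QueryPerformanceCounter: the difference between the current and the previous counter reading, divided by the counter frequency, gives the elapsed time in seconds. This is how the time taken by a memory transfer is measured, and the bandwidth is then calculated from it.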
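To make the "elapsed time, then bandwidth" step concrete, here is a minimal portable sketch of the same idea. It is not the code from the NVIDIA sample: it assumes std::chrono::steady_clock in place of QueryPerformanceCounter, times a plain host-side memcpy instead of an OpenCL transfer, and assumes 1 MB = 2^20 bytes. The names deltaT, bandwidthMBps, and measureMemcpyMBps are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Portable stand-in for shrDeltaT (hypothetical name): returns the seconds
// elapsed since the previous call with the same counter ID, 0.0 on the first
// call for that ID, and -9999.0 if the ID is out of range.
double deltaT(int counterId = 0)
{
    using Clock = std::chrono::steady_clock;
    static Clock::time_point last[3];
    static bool started[3] = {false, false, false};

    if (counterId < 0 || counterId > 2) {
        return -9999.0;
    }

    const Clock::time_point now = Clock::now();
    double dt = 0.0;
    if (started[counterId]) {
        dt = std::chrono::duration<double>(now - last[counterId]).count();
    }
    last[counterId] = now;       // remember this reading for the next call
    started[counterId] = true;
    return dt;
}

// Bandwidth in MB/s from bytes moved and elapsed seconds.
// Assumption: 1 MB = 2^20 bytes. Returns 0.0 if no time was measured.
double bandwidthMBps(std::size_t bytes, double seconds)
{
    if (seconds <= 0.0) {
        return 0.0;
    }
    return (static_cast<double>(bytes) / (1024.0 * 1024.0)) / seconds;
}

// Times `iterations` host-to-host copies of `bytes` bytes with timer 0
// and converts the result to MB/s.
double measureMemcpyMBps(std::size_t bytes, int iterations)
{
    assert(iterations > 0);
    std::vector<char> src(bytes, 1), dst(bytes, 0);

    deltaT(0);  // first reading: start (or restart) timer 0
    for (int i = 0; i < iterations; ++i) {
        std::memcpy(dst.data(), src.data(), bytes);
    }
    const double elapsed = deltaT(0);  // seconds since the reading above

    // Read the destination so the copies are not trivially dead code.
    volatile char sink = bytes ? dst[bytes - 1] : 0;
    (void)sink;

    return bandwidthMBps(bytes * static_cast<std::size_t>(iterations), elapsed);
}
```

Calling measureMemcpyMBps(32u << 20, 10), for example, copies 32 MB ten times and returns the measured rate. In the real sample the memcpy would be replaced by the OpenCL transfer being measured, and that transfer has to have completed before the second timer reading is taken.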