
I want to speed up my program with OpenACC, but compiling with g++ -fopenacc ○○.c (or .cpp) gives no speedup at all.

The sample program below, which computes pi with the Leibniz series, also shows no difference in its timings with and without OpenACC.

According to the book it should be much faster, so I'm wondering what is wrong.

Is OpenACC ineffective unless I use the PGI compiler?
Also, I am running on Ubuntu, but when I watch the Windows Task Manager during execution, the GPU performance graph shows no activity.
I thought the problem size might be too small, so I increased the amount of computation, but the GPU still showed no activity. Leibniz(n, 0) and Leibniz(n, 1) took the same time, and the run just became longer.
And how do I specify which GPU to run on when there are multiple GPUs?
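
For reference, a minimal sketch of how checking for OpenACC and selecting a device might look with the OpenACC runtime API. It assumes an NVIDIA device type (`acc_device_nvidia`); the helper names `count_acc_devices` and `select_acc_device` are hypothetical and only for illustration. The `_OPENACC` macro is defined by the compiler only when OpenACC is actually enabled, so the same file also builds without `-fopenacc` and then reports that nothing is available. As far as I know, the `ACC_DEVICE_NUM` environment variable is another way to choose the device without changing the code.

```c
#include <stdio.h>
#ifdef _OPENACC
#include <openacc.h>   /* OpenACC runtime API, only present when OpenACC is enabled */
#endif

/* Hypothetical helper: number of NVIDIA devices the OpenACC runtime can see,
   or -1 when this file was compiled without OpenACC support. */
int count_acc_devices(void)
{
#ifdef _OPENACC
    return acc_get_num_devices(acc_device_nvidia);
#else
    return -1;
#endif
}

/* Hypothetical helper: make device number dev the current device.
   Returns 0 on success, -1 if OpenACC is disabled or dev is out of range. */
int select_acc_device(int dev)
{
#ifdef _OPENACC
    int n = acc_get_num_devices(acc_device_nvidia);
    if (dev < 0 || dev >= n)
        return -1;
    acc_set_device_num(dev, acc_device_nvidia);
    return 0;
#else
    (void)dev;   /* unused without OpenACC */
    return -1;
#endif
}
```

Calling `count_acc_devices()` at the start of `main` and printing the result would show whether the build has OpenACC enabled at all (-1 means it does not) and whether any GPU is visible to it (0 means none).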

When I search for this, I only find pages that have since been removed, and I'm completely stuck, so please give me some advice.

The problem and the measured results
     C: n = 1000000, elapsed time = 0.0468750000 [sec], pi = 3.14159165358977432447
OpenACC: n = 1000000, elapsed time = 0.0468750000 [sec], pi = 3.14159165358977432447
     C: n = 10000000, elapsed time = 0.3593750000 [sec], pi = 3.14159255358979150330
OpenACC: n = 10000000, elapsed time = 0.3593750000 [sec], pi = 3.14159255358979150330
     C: n = 100000000, elapsed time = 3.5000000000 [sec], pi = 3.14159264358932599492
OpenACC: n = 100000000, elapsed time = 3.5156250000 [sec], pi = 3.14159264358932599492
     C: n = 1000000000, elapsed time = 34.9375000000 [sec], pi = 3.14159265258805042720
OpenACC: n = 1000000000, elapsed time = 35.4843750000 [sec], p
Tesla V100
     C: n = 1000000, elapsed time = 0.0654819980 [sec], pi = 3.14159165358977432447
OpenACC: n = 1000000, elapsed time = 0.0576860011 [sec], pi = 3.14159165358977432447
     C: n = 10000000, elapsed time = 0.5476359725 [sec], pi = 3.14159255358979150330
OpenACC: n = 10000000, elapsed time = 0.5475519896 [sec], pi = 3.14159255358979150330
     C: n = 100000000, elapsed time = 5.3251781464 [sec], pi = 3.14159264358932599492
OpenACC: n = 100000000, elapsed time = 5.3305191994 [sec], pi = 3.14159264358932599492
     C: n = 1000000000, elapsed time = 52.7688751221 [sec], pi = 3.14159265258805042720
OpenACC: n = 1000000000, elapsed time = 52.7695045471 [sec], pi = 3.14159265258805042720
Corresponding source code
//
// Leibniz pi, for C, OpenACC
//
// (c) Copyright Spacesoft corp., 2018 All rights reserved.
// Hiro KITAYAMA
//
#include <stdio.h>
#include <math.h>
#include <time.h>
//--------------------------------------------------------------------------------
void Leibniz(const int n, const int acc)
{
    clock_t start = clock();
    double pi = 0.0f;
    #pragma acc kernels if(acc)
    for (int i = 0; i < n; i++)
    {
        pi += (double)(pow(-1, i) / (double)(2 * i + 1));
    }
    pi *= 4.0f;
    clock_t stop = clock();
    fprintf(stdout, "n = %11d, ", n);
    fprintf(stdout, "elapsed time = %.10f [sec], pi = %.20f\n",
        (float)(stop - start) / CLOCKS_PER_SEC, pi);
}
//--------------------------------------------------------------------------------
int main()
{
    for (int n = 1000000; n <= 1000000000; n *= 10)
    {
        fprintf(stdout, "C:");
        Leibniz(n, 0);
        fprintf(stdout, "OpenACC:");
        Leibniz(n, 1);
    }
    return 0;
}
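
One variant that might be worth comparing against the loop above is sketched here. It is not taken from the book: it assumes that the accumulation into `pi` is a sum reduction that may need to be stated explicitly, so it uses `parallel loop` with a `reduction` clause instead of `kernels`, and it replaces the `pow(-1, i)` call with a plain sign computation. The function name `leibniz_pi` is hypothetical. Without `-fopenacc` the pragma is ignored and the loop simply runs serially, so the numerical result can be checked either way.

```c
/* Hypothetical variant of the Leibniz loop with an explicit reduction.
   n   : number of series terms
   acc : nonzero to request offloading via the OpenACC if() clause */
double leibniz_pi(int n, int acc)
{
    double sum = 0.0;

    /* Tell the compiler that sum is accumulated across iterations. */
    #pragma acc parallel loop reduction(+:sum) if(acc)
    for (int i = 0; i < n; i++)
    {
        /* (-1)^i computed without calling pow() */
        double sign = (i % 2 == 0) ? 1.0 : -1.0;
        sum += sign / (2.0 * i + 1.0);
    }

    return 4.0 * sum;
}
```

If the timings of `leibniz_pi(n, 0)` and `leibniz_pi(n, 1)` still match, that would suggest the loop is not being offloaded at all (for example because the compiler was built or installed without GPU offloading support) rather than a problem with the loop itself.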
Supplementary information (FW/tool version, etc.)

g++ is the latest version available as of today.