I have a simple piece of code prepared for testing. Here is the important part:
#pragma omp parallel sections
{
    #pragma omp section
    {
        for (int j = 0; j < 100000; j++)
            for (int i = 0; i < 1000; i++)
                a1[i] = 1;
    }
    #pragma omp section
    {
        for (int j = 0; j < 100000; j++)
            for (int i = 0; i < 1000; i++)
                a2[i] = 1;
    }
}
I compiled the program with the MinGW compiler and the results were as expected. Since I am going to use a computer with Linux only, I compiled the code on Linux (using the same machine), with the GCC 4.7.2 and Intel 12.1.0 compilers. The efficiency of the program decreased: it is now slower than the sequential program (omp_set_num_threads(1)).
I have tried making the arrays private to the threads, but the effect is similar.
Can anyone suggest an explanation?
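For reference, a minimal self-contained version of the test might look like the following (the int array type, the omp_get_wtime timing and the final printf are assumptions added for illustration):

#include <cstdio>
#include <omp.h>

int a1[1000];
int a2[1000];

int main()
{
    // omp_set_num_threads(1);  // uncomment to time the sequential version

    double start = omp_get_wtime();

    #pragma omp parallel sections
    {
        #pragma omp section
        {
            for (int j = 0; j < 100000; j++)
                for (int i = 0; i < 1000; i++)
                    a1[i] = 1;
        }
        #pragma omp section
        {
            for (int j = 0; j < 100000; j++)
                for (int i = 0; i < 1000; i++)
                    a2[i] = 1;
        }
    }

    double end = omp_get_wtime();
    std::printf("time: %f s (a1[0]=%d, a2[0]=%d)\n", end - start, a1[0], a2[0]);
    return 0;
}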
I don't understand what you mean. Do you mean that the difference in efficiency is due to the compiler not knowing how to handle code that has sections within sections?
First off, try a different compiler. In my experience gcc-4.8.0 works better with OpenMP, so maybe try that to start off.
Secondly, use optimisation flags! If you are measuring performance it is only fair to use either -O1, -O2 or -O3. The latter gives the best performance, but it may take short-cuts with mathematical functions that make floating-point operations slightly less accurate.
g++ -fopenmp -O3 name.cpp
You can read more about compiler flags on this page if it interests you.
As an end note, I don't know how experienced you are with OpenMP, but when dealing with loops in OpenMP I use the following:
#pragma omp parallel for
for (int i = 0; i < n; ++i)
    dosomething();
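For instance, a complete compilable version of that pattern might look like this (n, the array b and the body of dosomething are placeholders for illustration, not anything taken from your code):

#include <omp.h>

const int n = 1000;      // placeholder problem size
int b[n];                // placeholder array

void dosomething(int i)  // placeholder body: each iteration does distinct work
{
    b[i] = i;
}

int main()
{
    // the loop iterations are divided among the threads of the team
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        dosomething(i);
    return 0;
}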
Additionally, if you are using nested loops, you can use the collapse clause to inform the compiler to turn the nested loops into a single one (which can lead to better performance):
#pragma omp parallel for collapse(2)
for (int i = 0; i < n; ++i)
    for (int j = 0; j < n; ++j)
        dosomething();
There are things you should be aware of when using collapse, which you can read about here. I prefer manually converting the nested loops into a single one, since in my experience that proves more efficient; see the sketch below.
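To illustrate what I mean by converting them manually, a sketch (using the same n and dosomething placeholders as above):

// Manual alternative to collapse(2): flatten the i/j nest into one loop
// over n*n iterations and recover the indices in the body if needed.
#pragma omp parallel for
for (int k = 0; k < n * n; ++k)
{
    int i = k / n;       // outer index
    int j = k % n;       // inner index
    (void)i; (void)j;    // placeholders; a real body would use them
    dosomething();
}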