Multiwfn official website: //www.umsyar.com/multiwfn. Multiwfn forum in Chinese: http://bbs.keinsci.com/wfn
Dear Tian Lu,
I'm working with new kernels of derivatives. I implemented these kernels in an old version of Multiwfn by modifying the orbderv routine. Unfortunately, recent versions of Multiwfn contain several orbderv routines (standard, for PBC, and for promolecular density). Below is a patch that merges orbderv, orbderv_PBC, and orbderv_pmol into a single orbderv routine. I tested it by integrating the electron density of the Xe atom and it works, but I did not test the other branches.
Internally, I used pointers to pass the necessary primitive-type and coefficient arrays without duplicating code for each unique variable.
The patch:
0001-Merge-orbderv-to-one-routine.patch.txt
This is how my function.f90 looks (note: (1) I applied dos2unix to function.f90, and (2) define.f90 was also changed):
function.f90.txt
Best regards,
Igor
Dear Igor,
I will review and test your subroutine in the coming days. If no detectable performance degradation is observed, the patch will be merged into the official release and I will let you know here.
Best,
Tian
Dear Igor,
A notable performance degradation is observed with the new code. For example, calculating the electron density on a high-quality grid for a system containing 704 atoms takes 31s before and 42s after applying the patch (tested under Windows). Because efficiency is important for Multiwfn, I am sorry to inform you that the patch cannot be merged into the official release.
Best,
Tian
Dear Tian,
Thank you for letting me know!
I will try to profile my patch to reach similar performance.
Best regards,
Igor
Dear Tian,
It seems that I have found the problem...
Here is a new patch, with which I did not notice any performance degradation using the Intel compiler.
0001-Merge-orbderv-routines.patch.txt
Command line for testing:
time echo "1000 10 10 6 32 3,0,0 2 2 32 0,3,0 2 2 32 0,0,3 2 3 0 -1 100 4 1" | tr " " "\n" | OMP_STACKSIZE=200M ./Multiwfn_noGUI H2O.wfn
I used 10 cores (my PC has an Intel(R) Core(TM) i9-10940X CPU @ 3.30GHz) and replicated the H2O molecule (H2O.wfn.txt), calculated at the HF/def2TZVP level of theory, twice along each of the X and Y axes and three times along the Z axis. Then I integrated the electron density. I used a 75.434 grid for the integration; the number of unique GTFs is 2412.
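A minimal sketch of how the piped input in the command above works, assuming a POSIX shell: `tr " " "\n"` puts each space-separated menu answer on its own line, so the program reads them from stdin as if they were typed one by one. The helper name `menu_input` is hypothetical and used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of Multiwfn): turn a space-separated
# list of menu answers into one answer per line on stdout.
menu_input() {
    printf '%s\n' "$1" | tr ' ' '\n'
}

# The first few answers from the command above, printed one per line.
menu_input "1000 10 10 6"

# Answers that contain commas (e.g. 3,0,0) stay on a single line,
# because only spaces are translated into newlines.
menu_input "32 3,0,0 2"
```

In a real run the output of such a helper would be piped into `./Multiwfn_noGUI H2O.wfn`, exactly as in the command line above.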
Timings:
old:
real 0m20.536s user 3m24.962s sys 0m0.301s
new:
real 0m20.570s user 3m25.301s sys 0m0.280s
I suppose the difference in timings comes from a different load on the computer between runs.
I also checked my patch with GCC. Unfortunately, the compilation failed. Below is a patch that fixes this issue. I hope the patch is correct.
Best,
Igor
Dear Igor,
Unfortunately, the performance degradation is still evident. I tested the code on both Windows and Linux, and the relative performance degradation is similar on both. This is my test file:
https://mega.nz/file/qB1BlRjJ#5XiIW8wA5 … 4wdY4EKRjg
After loading it, input
5
1
3
With the old code, the run takes 33s on Win10 64bit with an Intel 10870H (8 physical cores), while it takes 43s with the new code.
Best,
Tian
Dear Tian,
Thank you for providing this file!
I got timings close to yours on my hardware.
Unfortunately, I did not find any way to speed up my patch. Moreover, I got terrible timings for GCC builds with my patch: about 70-80 seconds. However, I found a way to speed up the calculations with `!$omp collapse(2)`. I will provide a patch in another topic.
Best regards,
Igor