Evaluating SPECchem96 (GAMESS)
Contents
-
Program Description
-
Program Statistics
-
Memory Usage
-
Explicit Parallelism
-
Load Balancing
-
Execution Profiles
-
Program Description
-
The General Atomic and Molecular Electronic Structure System (GAMESS) is a publicly available application for ab initio quantum chemistry calculations on a variety of single- and multi-processor architectures. The package models molecules and reactions at the quantum level.
The code is portable: a single version runs on scalar, vector, and parallel systems, whether 32- or 64-bit. The SPEC-HPG version is distributed only for benchmarking and is not supported for research by SPEC or Iowa State University. For a full research version of the GAMESS program, contact:
Dr. Mike Schmidt
Department of Chemistry, Iowa State University, Ames, Iowa 50011
tel: 515-294-9796; FAX: (515)294-0105
email: mike@si.fi.ameslab.gov
An on-line version of the user's guide is available. Additional information is available at the GAMESS home page.
-
Program Statistics
-
Source files: 109389 lines, 3957287 bytes (21% comments, 79% code)
Total subprograms: 865
Subroutines: 813
Functions: 51
Program: 1
Block Data: 0
-
Memory Usage
-
Memory is allocated dynamically, from a fixed-length pool. Wave functions
(integral blocks) are written to disk to avoid recomputation. These files
can be quite large -- over 2 gigabytes for large datasets. The wave
functions can also be recomputed directly for systems with poor I/O
capability or insufficient disk capacity.
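The idea of dynamic allocation from a fixed-length pool can be sketched as follows. This is a minimal illustration in Python, not the GAMESS implementation: the source does not say how the pool is managed, so the stack-style (last allocated, first released) discipline, the class name FixedPool, and its methods are all assumptions made for the example.

```python
class FixedPool:
    """Hypothetical stack-style allocator over one fixed-length work array."""

    def __init__(self, n_words):
        self.n_words = n_words  # total pool length, fixed at start-up
        self.top = 0            # offset of the first free word

    def allocate(self, n):
        """Reserve n words and return the offset of the first one."""
        if n < 0:
            raise ValueError("request must be non-negative")
        if self.top + n > self.n_words:
            free = self.n_words - self.top
            raise MemoryError(f"need {n} words, only {free} free")
        offset = self.top
        self.top += n
        return offset

    def release_to(self, offset):
        """Free everything allocated at or after offset (assumed stack order)."""
        if not 0 <= offset <= self.top:
            raise ValueError("offset is not inside the allocated region")
        self.top = offset


pool = FixedPool(1000)
a = pool.allocate(400)   # first array starts at the bottom of the pool
b = pool.allocate(500)   # second array follows it
pool.release_to(b)       # give the second array back
c = pool.allocate(600)   # the freed space is reused
```

Because the pool never grows, a request that does not fit fails immediately instead of asking the operating system for more memory, which is the practical consequence of a fixed-length pool.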
-
Explicit Parallelism
-
The program has been explicitly parallelized for the distributed-memory
MIMD model. Parallelism is expressed using the TCGMSG message-passing
library. For the benchmark, calls to TCGMSG are filtered to PVM3 through
a library developed at IBM.
Each processor executes the same program, entering the loops over sets of electron shells for which integrals must be computed. Each processor skips most integral blocks, taking only those that it determines to have been assigned to it. The calculation is completed by adding together the partial matrices evaluated by each node from its partial integral list.
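The skip-and-sum pattern described above can be illustrated with a small serial simulation. This is a sketch in Python rather than the Fortran of GAMESS, and it uses no real message passing: the processes are simulated one after another, integral_block is a made-up stand-in for a real integral evaluation, and the round-robin assignment rule is only one possible choice.

```python
def integral_block(i, j):
    """Hypothetical stand-in for the contribution of one integral block."""
    return float(i + 1) * float(j + 1)


def partial_matrix(me, nproc, n):
    """Matrix built by process `me` from only the blocks assigned to it."""
    m = [[0.0] * n for _ in range(n)]
    count = 0
    for i in range(n):
        for j in range(n):
            # Every process walks the same loops but skips blocks
            # that belong to another process (assumed round-robin rule).
            if count % nproc == me:
                m[i][j] += integral_block(i, j)
            count += 1
    return m


def global_sum(partials):
    """Element-wise sum of the partial matrices from all processes."""
    n = len(partials[0])
    return [[sum(p[i][j] for p in partials) for j in range(n)]
            for i in range(n)]


nproc, n = 3, 4
partials = [partial_matrix(me, nproc, n) for me in range(nproc)]
total = global_sum(partials)
serial = partial_matrix(0, 1, n)  # one process computes every block
```

Since each block is evaluated by exactly one process, the summed matrix matches what a single process would have produced on its own.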
The explicitly parallel program is said to scale well up to 200 processors.
-
Load Balancing
-
The input file specifies one of two load balancing schemes:
-
Loop-level balancing: each process takes regular turns, evaluating
every nth block of integrals (for n processes),
skipping over the others.
-
Dynamic load balancing: an integral block is assigned to each
processor as it finishes its previous task.
      SUBROUTINE TWOEI
C
C     ME    = this processor's ID number
C     NPROC = number of processors
C
C     initialize parallel work
C
      IPCOUNT = ME - 1
      NEXT = -1
      MINE = -1
C
C     begin the four loops over the electron shell sets
C
      DO 920 II =
        .
        .
        DO 900 JJ =
          IF (dynamic load balancing) THEN
            MINE = MINE + 1
            IF (MINE.GT.NEXT) NEXT = NXTVAL()
            IF (NEXT.NE.MINE) GO TO 900
          END IF
          .
          .
          DO 880 KK =
            .
            .
            DO 860 LL =
              IF (loop-level balancing) THEN
                IPCOUNT = IPCOUNT + 1
                IF (MOD(IPCOUNT,NPROC).NE.0) GO TO 860
              END IF
              .
              .
C
C     Generate integral block
C     Write to disk or place directly in matrix
C
              .
              .
  860       CONTINUE
  880     CONTINUE
  900   CONTINUE
  920 CONTINUE
      END
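To see why the choice between the two schemes can matter, the following Python sketch simulates both on a list of block costs. It is an illustration under stated assumptions, not a model of GAMESS itself: the cost values are invented, the function names are hypothetical, and the time needed to fetch the next task in the dynamic scheme is ignored.

```python
import heapq


def static_schedule(costs, nproc):
    """Loop-level balancing: process p takes blocks p, p+nproc, p+2*nproc, ..."""
    busy = [0.0] * nproc
    for k, cost in enumerate(costs):
        busy[k % nproc] += cost
    return busy  # total work time per process


def dynamic_schedule(costs, nproc):
    """Dynamic balancing: the next block goes to the first process to finish."""
    # Heap entries are (time at which the process becomes free, process id).
    free_at = [(0.0, p) for p in range(nproc)]
    heapq.heapify(free_at)
    busy = [0.0] * nproc
    for cost in costs:
        t, p = heapq.heappop(free_at)
        busy[p] = t + cost
        heapq.heappush(free_at, (busy[p], p))
    return busy  # finish time per process


costs = [8, 1, 1, 1, 8, 1, 1, 1]  # hypothetical, deliberately uneven
static_finish = max(static_schedule(costs, 2))
dynamic_finish = max(dynamic_schedule(costs, 2))
```

With these made-up costs and two processes, the round-robin schedule happens to hand both expensive blocks to the same process and finishes after 18 time units, while the dynamic schedule finishes after 11. When all blocks cost about the same, the two schemes give similar results and the simpler loop-level scheme avoids the shared counter.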
(*) Michael W. Schmidt et al., "General Atomic and Molecular Electronic Structure System," Journal of Computational Chemistry, Vol. 14, No. 11, pp. 1347-1363 (1993)
-
Execution Profiles
-
The code was profiled extensively on an SGI Challenge (running IRIX 5.3).
Serial execution profiles
Direct SCF (Self-Consistent Field) calculations
No direct SCF calculations (integrals stored to disk)
Gregg M. Skinner (skinner@csrd.uiuc.edu)
Bill Pottenger (potteng@csrd.uiuc.edu)
Fri Jan 19 13:50:01 PST 1996