problem with gcm_to_cclm test

Added by Iya Belova over 1 year ago

Dear colleagues,

We are trying to run the test in the directory ../step_by_step/gcm_to_cclm/, but for some reason the run_int2lm script fails partway through. All files from the directory ../cclm-sp_2.4/step_by_step/gcm_to_cclm/log_int2lm are attached. No files appear in the directory ../step_by_step/gcm_to_cclm/data/int2lm_output. Could you please suggest how to track down or solve this problem?

Kind regards,
Iya Belova

P.S. Maybe this problem is related to the fact that the file int2lm.exe.out-14621 contains the line “Binary name ....: tstint2lm”, whereas the real binary name is int2lm.exe. We could not find where the name tstint2lm comes from.


Replies (10)

RE: problem with gcm_to_cclm test - Added by Burkhardt Rockel over 1 year ago

I assume that you ran the original run_int2lm script without any modifications? In that case it might be a problem with your computer system, i.e. the compiler and its options, the MPI version, etc. If you provide the following information:
  • the name of the computing system you use
  • the name of the compiler and its version
  • the name of the MPI library and its version
  • the Fopts file from /home/dokukin/work/cosmo/cclm-sp_2.4/src/int2lm (as an attachment)

maybe someone from the CLM-Community using a similar configuration can help.

The different binary name is not the cause: it is information that has to be set by the user (see the following snippet from info_int2lm.f90) and it has no effect on the model run.

!   Currently it is not possible with FORTRAN95 to get the information
!   of the full path of binary name like the $0 in C. Additionally
!   we cannot determine on which host(s) the binary is running and the
!   domain of the data spread through the nodes.
!   Therefore this information has to be defined manually. On using info_readnl()
!   this information may be defined within the segment /info_defaults/ which
!   has to reside within the named namelist of your choice. Missing information
!   will be ignored silently.
!   Currently following information may be defined within /info_defaults/:
!   INFO_Options ..: List of print options
!   INFO_BinaryName: Name (best: full path) of the binary
!   INFO_RunMachine: The machine (OS) where the program is running
!   INFO_Nodes ....: Description of the nodes the binary is running
!   INFO_Domain ...: The domain the binary is calculating

RE: problem with gcm_to_cclm test - Added by Iya Belova over 1 year ago

Thank you for the answer.

We changed only the lines that call int2lm.exe and the number of CPUs in the run_int2lm script. In any case, I am attaching it together with the Fopts file used to compile int2lm.

Here is the information about our system
  • system: CentOS 5.2
  • compiler: ifort 10.1
  • mpi: mvapich2 1.0.3
Attachments: Fopts (1.22 KB), run_int2lm (3.78 KB)

RE: problem with gcm_to_cclm test - Added by Burkhardt Rockel about 1 year ago

I just ran the script with nprocx = 1, nprocy = 1, as you did. No problems. Therefore I assume this is a problem with your computing system. I have no experience with CentOS or mvapich2. Hopefully another member of the CLM-Community has and can help you.

RE: problem with gcm_to_cclm test - Added by Iya Belova about 1 year ago

Thank you for your help.
Could you please attach the resulting log files? They could help us track down our problem.

RE: problem with gcm_to_cclm test - Added by Burkhardt Rockel about 1 year ago

The computing system at DKRZ is under maintenance today and tomorrow. I will send you the log files after that.

RE: problem with gcm_to_cclm test - Added by Burkhardt Rockel about 1 year ago

Here are the log and OUTPUT files of the successful job.

RE: problem with gcm_to_cclm test - Added by Iya Belova about 1 year ago

Thank you again.
When I compared your files with ours, I found that the following part is quite different (this is from our file):
“Info about KIND-parameters: iintegers / MPI_INT = 4
1275069467
int_ga / MPI_INT = 4
1275069467”
In your file, 7 appears in place of 1275069467. I have received other int2lm outputs, and they also show 7 in that place. It seems that this could cause problems.
I am not sure, but it seems that the variable MPI_INT is broken for some reason.
I tried adding something like “export MPI_INT=7” to the run_int2lm script, but unfortunately it did not work.

RE: problem with gcm_to_cclm test - Added by Ulrich Schättler about 1 year ago

Hi,
You cannot modify the MPI_INT value. It is set by the MPI library in use and can differ between computers. You should find more information on this value in the documentation of your MPI library (it is the MPI_INTEGER value). There you can check whether the value 1275069467 is correct.
Furthermore, you could check what “Exit code -5” means on your system. Are you running interactively or in batch mode? Maybe you have to increase the stack size for your run (with “ulimit -s unlimited”).
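
As a rough illustration of why the large number need not indicate a broken setup: MPI datatype constants are opaque handles chosen by each library, so 7 on one system and 1275069467 on another can both be valid. The sketch below assumes an MPICH-derived library (MVAPICH2 is one), where, to the best of my knowledge, bits 8-15 of a basic datatype handle hold the type's size in bytes. The helper names mpi_handle_hex and mpi_handle_size are hypothetical; the authoritative check remains the mpif.h header or the documentation of the installed library.

```shell
#!/bin/sh
# Sketch only. Assumption: MPICH-style handle encoding (MVAPICH2 is
# MPICH-derived). Helper names are hypothetical, not part of any MPI tool.

# Print a datatype handle in hexadecimal.
mpi_handle_hex() {
    printf '0x%X\n' "$1"
}

# Extract bits 8-15 of the handle, which MPICH-style libraries use to
# store the size in bytes of a basic datatype.
mpi_handle_size() {
    echo $(( ($1 >> 8) & 255 ))
}

mpi_handle_hex 1275069467     # value printed in the failing log
mpi_handle_size 1275069467    # compare with the "= 4" printed beside it
```

Under that assumption the decoded size agrees with the 4 printed next to iintegers in the log, which suggests the handle itself is sane and the failure lies elsewhere.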

RE: problem with gcm_to_cclm test - Added by Iya Belova about 1 year ago

Thank you for your help.
We found that the problem was caused by trying to run the program on a single core.
It seems that either MPI programs cannot be run on 1 core, or this is a peculiarity of our computing system.
In any case, when we write the following in the script
“npx=4
npy=2
mpirun -np 8”
everything works correctly.
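
A small guard in the run script can catch a mismatch between the processor grid and the number of MPI tasks before the job is launched. This is a minimal sketch, not part of the distributed run_int2lm: the function name check_decomposition is hypothetical, and it assumes the task count is simply the product of the two grid dimensions, with no additional tasks reserved for I/O.

```shell
#!/bin/sh
# Sketch only: check_decomposition is a hypothetical helper.
# Assumption: total MPI tasks = npx * npy (no extra I/O tasks).

check_decomposition() {
    npx=$1      # tasks in x direction
    npy=$2      # tasks in y direction
    ntasks=$3   # value passed to "mpirun -np"
    if [ $(( npx * npy )) -eq "$ntasks" ]; then
        echo "ok: ${npx} x ${npy} = ${ntasks} tasks"
    else
        echo "mismatch: ${npx} x ${npy} != ${ntasks} tasks" >&2
        return 1
    fi
}

# Values from the working configuration quoted above.
check_decomposition 4 2 8
```

Calling the function just before the mpirun line and aborting on a non-zero return status keeps an inconsistent setting from reaching the queue.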

RE: problem with gcm_to_cclm test - Added by Cemre Yürük 9 months ago

Dear all,

We have a problem similar to Iya Belova's. We are trying to run COSMO-CLM (cclm-sp_2.4) at 0.11° resolution using the ERA-Interim dataset for the period from August 1st, 2007 to December 31st, 2009. Although the run period is 29 months, we get the “int2lm finished” message after only 2-3 months of the run period. We previously used cclm-sp_1.5 on the same workstation and did not encounter this kind of problem. Do you have any suggestions regarding this problem?

The Fopts, int2lm_test.log and run_int2lm_eraint_test files are attached.

Best regards,

Cemre Yürük
