Knowledge Base

Article ID: 122 - Last Modified:

When I run Desmond, Jaguar, or QSite jobs on a cluster with an InfiniBand network, the jobs fail and the log file contains the following message. How do I fix this?
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.

The jobs are failing because the locked-memory limit (RLIMIT_MEMLOCK) set in your shell is too low. See Open MPI FAQ entries #14 and #15, listed here:

http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages

Note in particular that shell limits can be set in many places: limits.conf, PAM, and even the resource manager daemons (e.g. SGE, LSF). Logging into a machine and checking the limits with ulimit -a (or the equivalent) may therefore not show the limits that a batch job actually runs under.

To see the shell limits that actually apply in the environment of a cluster job, submit a simple batch script to the queue that prints the output of ulimit -a.
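A minimal sketch of such a batch script is shown here, assuming bash is available on the compute nodes. The function name report_limits is hypothetical, and the submission command and any scheduler directives (for SGE, LSF, or another resource manager) are site-specific and deliberately left out.

```shell
#!/bin/bash
# Sketch of a job script that reports the limits a batch job really runs under.
# Submit it with your resource manager's usual submission command.

report_limits() {
    # Node the job landed on (uname -n prints the host name).
    echo "node: $(uname -n)"

    echo "--- soft limits ---"
    ulimit -S -a

    echo "--- hard limits ---"
    ulimit -H -a

    # The value that matters for the libibverbs warning.
    # bash reports it in kbytes, or as "unlimited".
    echo "locked memory, soft: $(ulimit -S -l)"
    echo "locked memory, hard: $(ulimit -H -l)"
}

report_limits
```

Compare the job's output file with what ulimit -a prints in an interactive login on the same node. If the two differ, the limit is being set somewhere in the batch path rather than in your login shell.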

You can raise the RLIMIT_MEMLOCK limit by adding the appropriate command to your shell startup file:

bash: ulimit -l unlimited
csh/tcsh: limit memorylocked unlimited
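An unprivileged shell cannot normally raise a limit above its hard limit, so the bash command above can fail without any visible effect on the job. The following is a hedged sketch of a more defensive startup-file fragment for bash. The function name raise_memlock is hypothetical, and the fallback to the hard limit is an illustrative addition rather than part of the original recommendation.

```shell
#!/bin/bash
# Sketch: try to remove the locked-memory limit; if that is not permitted,
# raise the soft limit as far as the hard limit allows and warn on stderr.

raise_memlock() {
    local hard

    # Without -S or -H, ulimit sets both the soft and the hard limit.
    if ulimit -l unlimited 2>/dev/null; then
        return 0
    fi

    # Raising the soft limit up to the current hard limit is always allowed.
    hard=$(ulimit -H -l)
    ulimit -S -l "$hard" 2>/dev/null
    echo "warning: locked-memory limit capped at the hard limit ($hard kbytes)" >&2
    return 0
}

raise_memlock
```

If the warning appears, the hard limit itself has to be raised by an administrator in one of the places mentioned above (limits.conf, PAM, or the resource manager daemons).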

Related Articles:

#1481: My parallel Jaguar calculation failed with an "out of memory" error. What is the problem?

To ask a question or get help, please submit a support ticket or email us at help@schrodinger.com.