Do note that the firewall on a CentOS 7 system is enabled by default.
Step 1: To check the status of CentOS 7 FirewallD
# systemctl status firewalld.service
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
Active: inactive (dead)
The output above shows that firewalld is inactive and disabled.
Step 2: To stop the FirewallD
# systemctl stop firewalld.service
Step 3: To completely disable the firewalld service
# systemctl disable firewalld.service
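You can then confirm that the service is stopped and will not start at the next boot:
# systemctl is-active firewalld.service
inactive
# systemctl is-enabled firewalld.service
disabled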
If you are installing software and you see the error “error while loading shared libraries: libXm.so.4”, it is quite easy to solve. Just install the Motif packages:
# yum install motif motif-devel
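To confirm that the library is now visible to the dynamic linker, you can check the linker cache:
# ldconfig -p | grep libXm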
I was installing ABAQUS 2017 on CentOS 7 when I encountered an error. lsb_release prints distribution-specific information. Strangely, the command is missing by default on the CentOS 7 distribution.
[root@node-h001 1]# ./StartGUI.sh
Current operating system: "Linux"
./StartGUI.sh: .: .: line 3: lsb_release: not found
Unknown linux release ""
To fix it, install the redhat-lsb-core package:
# yum install redhat-lsb-core
[root@node-h001 1]# lsb_release
LSB Version: :core-4.1-amd64:core-4.1-noarch
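If you need more than the LSB version string, “lsb_release -a” will also print the distributor ID, description, release and codename:
# lsb_release -a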
Before restarting the NSD nodes, quorum manager nodes or other critical nodes, do check the following first to ensure the file system is in the right order.
1. Make sure all three quorum nodes are active.
# mmgetstate -N quorumnodes
If any machine is not active, do *not* proceed.
2. Make sure the file system is mounted on the machines.
# mmlsmount gpfs0
If the file system is not mounted on some nodes, we should try to resolve that first. (A small script combining both checks is sketched below.)
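If you do these checks regularly, a small wrapper script can run both and flag problems. This is only a sketch: it assumes the default GPFS binary path /usr/lpp/mmfs/bin, a node class named quorumnodes, and the usual three-column mmgetstate output (node number, node name, GPFS state), so adjust it for your cluster.

#!/bin/bash
# pre-restart-check.sh - sanity checks before restarting critical GPFS nodes
export PATH=$PATH:/usr/lpp/mmfs/bin

# 1. Every quorum node must report state "active"
#    (NR > 2 skips the header lines of the mmgetstate output)
mmgetstate -N quorumnodes |
  awk 'NR > 2 && NF >= 3 && $3 != "active" { print $2 " is " $3; bad = 1 } END { exit bad }' ||
  { echo "A quorum node is not active - do NOT proceed"; exit 1; }

# 2. Show where gpfs0 is mounted so any missing nodes can be spotted
mmlsmount gpfs0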
If your cluster shows symptoms of overload and GPFS keeps reporting “overloaded” in its logs like the ones below, you may get long waiters and sometimes deadlocks.
Wed Apr 11 15:53:44.232 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 15:55:24.488 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 15:57:04.743 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 15:58:44.998 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 16:00:25.253 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 16:28:45.601 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 16:33:56.817 2018: [N] sdrServ: Received deadlock notification from
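When the cluster is in this state, you can list the current waiters on a node to gauge how bad things are:
# mmdiag --waiters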
Increase scatterBufferSize to a number that matches the IB fabric
One of the first parameters to tune is scatterBufferSize. According to the wiki, FDR10 can be tuned to 131072 and FDR14 to 262144.
The default value of 32768 may perform OK. If the CPU utilization on the NSD IO servers is observed to be high and client IO performance is lower than expected, increasing the value of scatterBufferSize on the clients may improve performance.
# mmchconfig scatterBufferSize=131072
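The change may only take effect once the GPFS daemon picks it up (on some releases only after a daemon restart); you can check the running value on a node with:
# mmfsadm dump config | grep scatterBufferSize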
There are other parameters that can be tuned, but scatterBufferSize worked immediately for me.
Turn off verbsRdmaSend on the NSD nodes:
# mmchconfig verbsRdmaSend=no -N nsd1,nsd2
Increase verbsRdmasPerNode to 514 for the NSD nodes:
# mmchconfig verbsRdmasPerNode=514 -N nsd1,nsd2
Verify that the settings have taken effect:
# mmfsadm dump config | grep verbsRdmasPerNode
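Note that these verbs* settings may only take effect after the GPFS daemon is restarted on the affected nodes, for example:
# mmshutdown -N nsd1,nsd2
# mmstartup -N nsd1,nsd2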
- Best Practices RDMA Tuning
If you encounter the following issue during an application run and your scheduler is Platform LSF, there is a simple solution.
explicit_dp: Rank 0:13: MPI_Init_thread: didn't find active interface/port
explicit_dp: Rank 0:13: MPI_Init_thread: Can't initialize RDMA device
explicit_dp: Rank 0:13: MPI_Init_thread: Internal Error: Cannot initialize RDMA protocol
MPI Application rank 13 exited before MPI_Init() with status 1
mpirun: Broken pipe
In this case the amount of locked memory was set to unlimited in /etc/security/limits.conf, but this was not sufficient.
The MPI jobs were started under LSF, but the LSF daemons were started with very small locked-memory limits.
Set the amount of locked memory to unlimited in /etc/init.d/lsf by adding the ‘ulimit -l unlimited’ command just after the ### END INIT INFO block:
### END INIT INFO
ulimit -l unlimited
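The new limit only applies to daemons started after the edit, so restart the LSF daemons on the affected nodes. You should then be able to verify from inside a job, for example with an interactive run, which should now report “unlimited”:
# bsub -I "ulimit -l"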
- HP HPC Linux Value Pack 3.1 – Platform MPI job failed