Resolving unreach or unavail nodes in OpenLava-3.0


After configuring OpenLava-3.0 from the tarball by following the instructions in the OpenLava – Getting Started Guide, and after fixing the errors described in OpenLava with LM is Down Error Messages for OpenLava-3.0, you may still see a host reported as unreach or unavail:

HOST_NAME          STATUS       JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV
spms-limeb-c00     unreach         -     16      0      0      0      0      0
spms-limeb-h00     ok              -     16      0      0      0      0      0
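
On a larger cluster it can help to list only the hosts that are not ok. The following is a minimal sketch, assuming the table has the column layout shown above (host name first, status second); the function name non_ok_hosts is made up for illustration and is not part of OpenLava.

```shell
# Print the name and status of every host whose STATUS column is not "ok".
# Reads a host status table (header line first) on standard input.
non_ok_hosts() {
    awk 'NR > 1 && NF >= 2 && $2 != "ok" { print $1, $2 }'
}

# Example: feed the table from this post in as plain text.
printf '%s\n' \
    'HOST_NAME          STATUS       JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV' \
    'spms-limeb-c00     unreach         -     16      0      0      0      0      0' \
    'spms-limeb-h00     ok              -     16      0      0      0      0      0' \
    | non_ok_hosts
# prints: spms-limeb-c00 unreach
```

On a live HeadNode the same filter could presumably be fed directly from the status command, for example bhosts | non_ok_hosts.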

Suggestions:

  1. Check the permissions on the directory where openlava-3.0 resides. Make sure that both the HeadNode and the ComputeNode have the user openlava and the group openlava, and that this user and group have permission on the folder, for example:
    drwxr-xr-x. 10 openlava openlava 4096 Jun 26 00:32 openlava-3.0
  2. Install pdsh on all the compute nodes (see Installing pdsh to issue commands to a group of nodes in parallel in CentOS), then use pdcp to copy /etc/passwd, /etc/shadow and /etc/group to all the nodes:
    # pdcp -a /etc/passwd /etc
    # pdcp -a /etc/shadow /etc
    # pdcp -a /etc/group /etc
  3. Make sure /etc/hosts on both the HeadNode and the ComputeNode lists the short hostname of every node in the cluster. Refrain from putting 2 hostnames on one line.
  4. Check your firewall settings. Make sure that ports 6322:6325 are open.
  5. Ensure that the clocks on the clients and the HeadNode are synchronized with the designated NTP server. If the NTP
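
Suggestions 1 and 3 above can be checked with a couple of small helpers. This is only a sketch, assuming GNU stat (as shipped with CentOS) and awk are available; the function names owner_of and multi_name_lines are made up for illustration, and the path to openlava-3.0 is left as a placeholder because it depends on where you unpacked the tarball.

```shell
# Print "owner:group" for a path (uses GNU stat's -c format option).
owner_of() {
    stat -c '%U:%G' "$1"
}

# Print every line of a hosts file that carries more than one hostname
# after the IP address (more than two fields once comments are stripped).
multi_name_lines() {
    awk '{ sub(/#.*/, "") } NF > 2' "$1"
}

# Usage (run on the HeadNode and on each ComputeNode):
#   owner_of /path/to/openlava-3.0    # placeholder path; should print openlava:openlava
#   multi_name_lines /etc/hosts       # any line printed has 2 or more hostnames
```

Note that the stock localhost entries in /etc/hosts usually carry several aliases and will be listed too; the lines to look at are the ones for the cluster nodes.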