Sun Grid Engine 6.2 Update 2 introduced support for running Windows hosts as worker nodes. Sun Grid Engine, now being relabeled Oracle Grid Engine, is a distributed resource manager used primarily in HPC environments, although the features introduced in Update 5 have led to more widespread use.
Here I’m going to walk through a quick how-to for getting Grid Engine installed and running on Windows hosts. It applies mostly to Windows XP and Windows Server 2003; some of the additional prerequisites required on those hosts are now standard in Windows Server 2008 and Windows 7.
- Follow the detailed instructions at http://wikis.sun.com/display/GridEngine/Microsoft+Services+for+UNIX for an in-depth understanding.
- You must disable Data Execution Prevention (DEP). DEP is not compatible with some parts of SFU and might cause segmentation faults. See http://support.microsoft.com/kb/875352 for more information about DEP. To disable DEP, see http://wikis.sun.com/display/GridEngine/Disabling+DEP. Note this may not be the case for Windows 2008 Server and newer releases.
Installation of SFU
- Download Windows SFU 3.5 http://technet.microsoft.com/en-us/interopmigration/bb380242.aspx
- Create copies of the passwd and group files from your master node that contain only the entries required for user mapping on Windows. These will be used in one of the installation screens.
- Refer to http://wikis.sun.com/display/GridEngine/Microsoft+Services+for+UNIX#MicrosoftServicesforUNIX-ServicesforUNIXInstallation for detailed screen shots.
- Choose Custom install, select Remote Services, and leave the rest as is.
- Enter the correct path of your passwd and group files for User mapping.
- Complete the installation and restart.
- A note on terminology: Interix is the subsystem behind SFU. With SFU installed, the Win32 and Interix subsystems run in parallel.
- Also note that any Windows application can be invoked from within the Interix shell.
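The trimmed passwd and group files mentioned above can be produced by hand, but a small filter saves some typing. This is only a sketch: the function name `filter_entries` and the account names in the usage lines are made up for illustration, and it assumes the usual colon-separated file format with the name in the first field.

```shell
#!/bin/sh
# Print only the entries of a passwd- or group-style file whose first
# colon-separated field (the account or group name) is in the given list.
# Usage: filter_entries FILE NAME [NAME...]
filter_entries() {
    file=$1
    shift
    # Join the wanted names with commas so awk receives them as one variable.
    names=$(printf '%s,' "$@")
    awk -F: -v names="$names" '
        BEGIN {
            n = split(names, list, ",")
            for (i = 1; i <= n; i++)
                if (list[i] != "") want[list[i]] = 1
        }
        ($1 in want) { print }
    ' "$file"
}

# Example (hypothetical names), run on the master node:
#   filter_entries /etc/passwd root alice > passwd.sfu
#   filter_entries /etc/group  root users > group.sfu
```

The two output files are what you would then point the SFU installer at on the user mapping screen.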
Additional but not required
- Install the bootstrap installer from Interop. Follow the directions here. http://www.interopsystems.com/tools/pkg_install.htm
- If your system doesn’t have access to the Internet download the packages from ftp://ftp.interopsystems.com/pkgs/3.5/ to your local machine
- For adding new packages, set PKG_PATH to the packages directory on your local machine.
- Mostly it can be used for adding Bash, because SFU comes with default ksh and csh.
- The binaries are installed in /usr/local/bin
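For the offline case, setting PKG_PATH can be wrapped in a tiny helper that refuses a directory that does not exist. This is a sketch only; the function name `use_local_packages` and the example path are hypothetical, and the actual package installation still follows the Interop instructions linked above.

```shell
#!/bin/sh
# Point PKG_PATH at a local package directory, refusing paths that do not exist.
use_local_packages() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    PKG_PATH=$dir
    export PKG_PATH
}

# Example (hypothetical path holding the downloaded packages):
#   use_local_packages /dev/fs/C/interop-pkgs
```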
Post Installation of SFU / GE installation prep work
- The instructions here can be followed http://wikis.sun.com/display/GridEngine/Microsoft+Services+for+UNIX#MicrosoftServicesforUNIX-PostSFUInstallationTasks , but the instructions provided below will also be useful.
- Open the Control Panel -> Administrative Tools -> Services and check that the Windows Telnet and Remote Shell services are disabled. We will need to run telnet and RSH from Interix instead.
- Uncomment the lines containing telnet and shell from /etc/inetd.conf through one of the Interix shells.
- Restart inetd by running /etc/init.d/inet stop and then /etc/init.d/inet start.
- Check the Windows Firewall. If it is off, there is nothing to do. If it is enabled, either locally or through Group Policy, do the following:
- Add Exceptions for TCP port 23 and 514 for telnet and remote shell access required for GE.
- Also add an exception for the Grid Engine Execution daemon port, in this case 6444.
- Run nmap from another machine to check that ports 23 and 514 are accessible.
- User name mapping is an important step. These instructions may vary if there’s a domain controller.
- Map the user names from the passwd file we used earlier and Windows users
- Refer to http://wikis.sun.com/display/GridEngine/User+Management+for+Sun+Grid+Engine+on+Windows+Hosts for a detailed understanding of user management on Windows for Grid Engine.
- Start -> All Programs -> Windows Services for UNIX -> Service for UNIX Administration -> User Name Mapping
- Depending on your configuration, you may have to choose NIS or Password and Group Files. Either option is fine.
- Click on the Maps tab, check Simple Maps and choose the appropriate Windows domain name.
- Under Advanced Maps click on Show User Maps. Click on List Windows Users and List UNIX Users
- Now just map the Windows user to the appropriate UNIX user name.
- Make sure the following mapping is present (Windows User) Administrator -> (UNIX User) root
- Map any other required users.
- Check the users home directories and map them accordingly and create a profile for users if needed.
- Control Panel -> Administrative Tools -> Computer Management -> Users -> Properties -> Profile
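The inetd.conf edit from the steps above can also be scripted from an Interix shell. The sketch below assumes that the disabled entries look like "#telnet ..." and "#shell ..." (a "#" followed by the service name) and that the local sed understands POSIX character classes; check your own /etc/inetd.conf before relying on it. The function name `enable_inetd_services` and the host name in the usage lines are hypothetical.

```shell
#!/bin/sh
# Print FILE with the leading "#" removed from the telnet and shell service
# lines. Other commented-out services are left untouched.
enable_inetd_services() {
    sed -e 's/^#[[:space:]]*\(telnet[[:space:]]\)/\1/' \
        -e 's/^#[[:space:]]*\(shell[[:space:]]\)/\1/' "$1"
}

# Example:
#   enable_inetd_services /etc/inetd.conf > /tmp/inetd.conf.new
#   cp /tmp/inetd.conf.new /etc/inetd.conf
#   /etc/init.d/inet stop
#   /etc/init.d/inet start
#
# Then, from another machine (winhost is a placeholder name):
#   nmap -p 23,514,6444 winhost
```

Writing to a temporary file first keeps the original inetd.conf intact until you have looked at the result.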
Pre Grid Engine Installation
Following on Windows Host
- Create a directory for Grid Engine installs, typically /opt/gridengine
- Export the following environment variables; add them to /etc/profile.lcl for system-wide settings.
export SGE_ROOT=/opt/gridengine
export SGE_QMASTER_PORT=6445
export SGE_EXECD_PORT=6444
- If you are going to run multiple instances of the execution daemon, don’t set these system-wide. Set them per instance instead, because each instance needs to use a different port.
- Add the Master node and the current node in /etc/hosts
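If you repeat this preparation on several Windows hosts, it helps to make the edits to /etc/profile.lcl and /etc/hosts repeatable. A minimal sketch, assuming a helper I am calling `append_once` (not part of SFU or Grid Engine); the IP addresses and host names in the usage lines are placeholders.

```shell
#!/bin/sh
# Append LINE to FILE unless an identical line is already present.
append_once() {
    file=$1
    line=$2
    # -x: whole-line match, -F: treat LINE as a literal string, -q: quiet.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example, using the values from this post:
#   append_once /etc/profile.lcl 'export SGE_ROOT=/opt/gridengine'
#   append_once /etc/profile.lcl 'export SGE_QMASTER_PORT=6445'
#   append_once /etc/profile.lcl 'export SGE_EXECD_PORT=6444'
#   append_once /etc/hosts '192.0.2.10 master'    # placeholder address and name
#   append_once /etc/hosts '192.0.2.20 winhost'   # placeholder address and name
```

Running the same lines twice leaves the files unchanged, so the snippet is safe to rerun after a failed attempt.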
Following on Master Node
- Run sgepasswd on the Master node and set the passwords for the Windows users. This is used by GE for shell logins.
- When you do "sgepasswd user", it sets the password for user on the default domain.
- Use "sgepasswd -D domain user" to set the password for a user of a specific domain.
- Make sure the Windows host is added as an administrative host.
- Run "qconf -sh" to check if the Windows host is already an administrative node
- Run "qconf -ah hostname" to add the Windows host.
- Type "qconf -mconf" and set the execd_params to enable_windomacc=true.
- Add the Windows Admin name
- qconf -am Administrator
- Generate certificates for the Windows users. The certificates and keys used for encryption also need to be copied over to the Windows machine.
- This page provides detailed info – http://wikis.sun.com/display/GridEngine/Installing+Security+Features#InstallingSecurityFeatures-CSP
- The advanced security can also be used for other hosts, but is a must to run jobs on Windows.
- Generate certificates for a new user
- $SGE_ROOT/util/sgeCA/sge_ca -user <win_user_name>
- Follow Step 6 in the above link
- For mounting NFS on Windows, use this as guidance: http://wiki.gridengine.info/wiki/index.php/Install_and_configure_Grid_Engine_in_heterogenic_environment_on_Linux_and_Windows_with_MPICH2#NFS_on_Windows
- Create a soft link from X:\ or any mount you choose. It is available under /dev/fs/X.
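The administrative host check on the master node (qconf -sh followed by qconf -ah) is easy to wrap so that it can be rerun safely. This is a sketch under the assumption that qconf -sh prints one host name per line and that the name you pass matches the form shown there (short name or FQDN); the function name `ensure_admin_host`, the host name `winhost`, and the user `alice` are hypothetical.

```shell
#!/bin/sh
# Add HOST as an administrative host unless "qconf -sh" already lists it.
ensure_admin_host() {
    host=$1
    # -i: ignore case, -x: whole-line match, -F: literal string, -q: quiet.
    if qconf -sh | grep -qixF -- "$host"; then
        echo "$host is already an administrative host"
    else
        qconf -ah "$host"
    fi
}

# Example, followed by the other master node commands from the list above:
#   ensure_admin_host winhost
#   qconf -am Administrator
#   sgepasswd alice
#   $SGE_ROOT/util/sgeCA/sge_ca -user alice
```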
Grid Engine Installation
- Copy and extract the Grid Engine binaries, common files and the Windows-specific binaries to $SGE_ROOT. Optionally, an NFS mount can be set up instead.
- Copy the cell directory named by CELL_NAME (in most cases, default) to $SGE_ROOT.
- If you get an error with cannot resolve host name, add the windows hostname and IP in the /etc/hosts file.
- Follow the installation instructions at http://wikis.sun.com/display/GridEngine/Printable+Installing+Sun+Grid+Engine+Guide#PrintableInstallingSunGridEngineGuide-HowtoInstallExecutionHosts for installing the Execution daemon.
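Before starting the execution daemon installer, a quick sanity check of the copied cell directory can save a failed run. The sketch below is based on my assumption about the usual layout, where the cell's common directory contains an act_qmaster file naming the master host; the function name `check_cell` is made up, and install_execd is the script described in the linked guide.

```shell
#!/bin/sh
# Return 0 when the cell directory copied from the master looks complete.
# Usage: check_cell SGE_ROOT [CELL]   (CELL defaults to "default")
check_cell() {
    root=$1
    cell=${2:-default}
    common="$root/$cell/common"
    if [ ! -d "$common" ]; then
        echo "missing directory: $common" >&2
        return 1
    fi
    # Assumption: act_qmaster is present in a correctly copied cell.
    if [ ! -f "$common/act_qmaster" ]; then
        echo "missing file: $common/act_qmaster" >&2
        return 1
    fi
    echo "cell '$cell' found under $root"
}

# Example:
#   check_cell "$SGE_ROOT" && cd "$SGE_ROOT" && ./install_execd
```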