Linux
From Solaris to Linux
When the UVa astronomy department hired me, the main project they wanted me to design and implement was a transition from their legacy Sun Solaris network to cheaper Linux-based PCs. Many of the astronomy-related programs had already been ported from one platform and architecture to the other, so a change of platform was already feasible on the software side.
History
Originally, the design goal for introducing Linux PCs into our department was to create a second, parallel network completely separate from the Solaris machines. At the time, it was generally believed that Solaris was more stable and secure, and there was concern among the faculty that Linux would compromise those features. Initially, a network-based Linux home directory server was set up in tandem with the existing Solaris servers. Later, after about 33% of the users had transitioned to Linux, it was decided that the two networks should be merged into one transparent platform hosting clients of both architectures.
Servers
This migration of desktops spun off multiple sub-projects, such as migrating our single Solaris server (which was burdened with mail, web and NIS duties plus general processing) to separate, independent Linux servers.
- Our mail services were migrated to a Postfix-based system with the latest spam-fighting techniques applied.
- Our web services were moved to a newer, higher-capacity Apache system.
- The Linux home directory server was designed as a redundant fail-safe cluster, since it is the most critical part of our network. This server was based on the heartbeat project that is included in recent versions of Fedora Linux.
- Finally, the last (yet very critical!) sub-project was developing and deploying a backup service for our data.
Ultimately, the NFS home directory "cluster" was downgraded to a single host system (heartbeat proved to be unreliable during fail-over situations).
The NIS user password table was eventually migrated to an OpenLDAP database and expanded to include user data for other services, such as Samba (Windows logins), VPN, and user address books. This massive LDAP project also allows for centralized Mac logins, additional website features and other exciting technologies.
The Linux servers have proven to be very stable: several of them have uptimes in excess of a year.
Server Backup System
I developed a custom backup solution for our most important data, built around the rsync program and NFS. The backup program rotates the backups automatically every day. With CentOS 6, I hope to be able to migrate the NFS side of this system to a deduplicating filesystem, which should greatly reduce storage overhead.
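The daily rotation idea can be sketched roughly as follows. This is not the actual backup program, only a minimal illustration under stated assumptions: the snapshot directory names (daily.0, daily.1, ...), the retention count, and the use of rsync's --link-dest option to hard-link unchanged files are all hypothetical choices made for the example.

```python
import shutil
import subprocess
from pathlib import Path


def rotate_snapshots(root: Path, keep: int = 7) -> None:
    """Shift daily.0 -> daily.1 -> ... and drop the oldest snapshot."""
    oldest = root / f"daily.{keep - 1}"
    if oldest.exists():
        shutil.rmtree(oldest)
    # Walk from the oldest surviving slot downwards so nothing is overwritten.
    for i in range(keep - 2, -1, -1):
        src = root / f"daily.{i}"
        if src.exists():
            src.rename(root / f"daily.{i + 1}")


def build_rsync_command(source: str, root: Path) -> list[str]:
    """Build the rsync call that fills daily.0 (hypothetical layout)."""
    cmd = ["rsync", "-a", "--delete"]
    previous = root / "daily.1"
    if previous.exists():
        # Assumption: unchanged files are hard-linked against yesterday's
        # snapshot to save space. --link-dest needs an absolute path.
        cmd.append(f"--link-dest={previous.resolve()}")
    # Trailing slashes make rsync copy directory contents, not the directory.
    cmd += [source.rstrip("/") + "/", str(root / "daily.0") + "/"]
    return cmd


def run_backup(source: str, root: Path, keep: int = 7) -> None:
    """Rotate the old snapshots, then copy the source into a fresh daily.0."""
    root.mkdir(parents=True, exist_ok=True)
    rotate_snapshots(root, keep)
    subprocess.run(build_rsync_command(source, root), check=True)
```

In a setup like the one described, run_backup would be called once a day (for example from cron) with the backup root on an NFS-mounted volume.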
Workstations
Our Linux workstation network is currently based on the CentOS project, which is an open source "rebuild" of Red Hat's Enterprise products. Originally I used Fedora (Core 2, Core 4, Core 5, Fedora 7 and finally Fedora 9) before surrendering to the longer-term stability of CentOS.
As part of the workstation project I've developed a rapid deployment system that installs an up-to-date, heavily modified operating system, customized for our environment, on any commodity PC hardware. Total install time is about 20 minutes per machine (multiple machines can be deployed simultaneously if desired), and the installation requires very little interaction (mostly network settings for new machines and disk partitioning questions). This is all based on the Kickstart scripting system, and uses some PHP scripts on our webserver that query a MySQL database for known machine information.
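The database lookup step can be illustrated with a small sketch. The real system uses PHP and MySQL; the version below uses Python with an in-memory SQLite database purely as a stand-in, and the table name, column names, lookup by MAC address, and example addresses are all assumptions made for the illustration. It returns a Kickstart "network" line for a known machine, and a comment for an unknown one so that the network settings are left to be entered during installation.

```python
import sqlite3


def create_example_db() -> sqlite3.Connection:
    """Create a hypothetical 'machines' table with one known workstation."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE machines ("
        "mac TEXT PRIMARY KEY, hostname TEXT, ip TEXT, "
        "netmask TEXT, gateway TEXT)"
    )
    conn.execute(
        "INSERT INTO machines VALUES (?, ?, ?, ?, ?)",
        ("00:11:22:33:44:55", "ws01.example.edu",
         "192.0.2.10", "255.255.255.0", "192.0.2.1"),
    )
    conn.commit()
    return conn


def network_line(conn: sqlite3.Connection, mac: str) -> str:
    """Return the Kickstart network directive for a machine, if it is known."""
    row = conn.execute(
        "SELECT hostname, ip, netmask, gateway FROM machines WHERE mac = ?",
        (mac.lower(),),
    ).fetchone()
    if row is None:
        # Unknown machine: emit only a comment, leaving the network
        # settings to be supplied at install time (assumed behaviour).
        return "# unknown machine: network settings entered during install"
    hostname, ip, netmask, gateway = row
    return (
        f"network --bootproto=static --ip={ip} --netmask={netmask} "
        f"--gateway={gateway} --hostname={hostname}"
    )


if __name__ == "__main__":
    db = create_example_db()
    print(network_line(db, "00:11:22:33:44:55"))
    print(network_line(db, "aa:bb:cc:dd:ee:ff"))
```

Generating the file on the server at request time means a single template can serve every machine, with only the per-host lines filled in from the database.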
The workstations have also been very stable: several Linux workstations have uptimes in excess of a year.
Workstation Backups
I also modified the backup scripts that I use for the servers to work in a desktop workstation environment. These scripts run daily and back up all user data to secondary storage areas.
HPC Clusters
Our first two HPC clusters, named Hyades and Pleione, were based on the Rocks Linux cluster distribution and off-the-shelf Dell server hardware. They were fairly small systems (48 CPU cores and 16 cores, respectively) and were optimized for different types of code (parallel and single-threaded). The Hyades system, where most of the MPI-based codes were running, was upgraded to InfiniBand networking, which greatly increased the performance of our codes.
The latest-generation HPC cluster is Hyades-NG, which went into service in September 2010 and replaced the Hyades and Pleione clusters. The new cluster is based around diskless blade technology and provides 384 CPU cores and 1 TB of memory, all of which is available exclusively to UVa Astronomy Department researchers. The entire Hyades-NG cluster can communicate through a dedicated InfiniBand network and has significantly more storage, thanks to the relative cheapness of iSCSI storage arrays. This cluster uses the Lustre filesystem for high-performance disk I/O throughput between nodes.
Integration
Ultimately, the users of the Astronomy department computer network need to be able to work on multiple platforms and move data quickly and easily between all the systems and programs they need. A combination of technologies, such as Samba (Windows-to-Unix file sharing), CUPS (centralized printing), LDAP and RADIUS (user account information sharing), VPN (secured networking from remote locations), IMAP (email) and NFS, provides a reliable, convenient way to move almost any type of data between systems.
With the explosion in popularity of Mac OS X, I've also modified the computer environment so that users can simply plug in their Mac equipment and start using it just like any Windows or Linux system.
Last modified November 17, 2010.