Selecting Server Hardware

Executive summary:
The server you select can have a big impact on the reliability of your business. Here are some recommendations.

The Server’s Job
The network server stores the business data. This can include files from office programs such as Word and Excel, and databases that provide shared data such as accounting software.  Most servers also provide network services such as security and connecting computers to the internet.
The idea is to put all your data in one basket, then make that basket very reliable and safe.
By estimating the cost to the business of a server failure, you can choose how reliable (and expensive) a server you need.
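
As a rough illustration of that estimate, the sketch below multiplies the hours the server might be down in a year by what an hour of downtime costs the business. Every figure in it (staff count, hourly cost, lost revenue, outage hours) is a made-up placeholder, and the function name is hypothetical; substitute your own numbers.

```python
def annual_downtime_cost(staff_idled, cost_per_staff_hour,
                         lost_revenue_per_hour, outage_hours_per_year):
    """Estimate the yearly cost of server outages (all inputs are your own estimates)."""
    # Cost of one hour without the server: idle staff plus lost sales.
    cost_per_hour = staff_idled * cost_per_staff_hour + lost_revenue_per_hour
    return cost_per_hour * outage_hours_per_year


# Placeholder figures, not data from any real business.
example = annual_downtime_cost(
    staff_idled=10,
    cost_per_staff_hour=40.0,
    lost_revenue_per_hour=200.0,
    outage_hours_per_year=16,
)
print(f"Estimated annual downtime cost: ${example:,.2f}")
```

If a more reliable configuration costs less than the outages it is likely to prevent, it is probably worth buying.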

Recommendations

  1. Buy a real server from a reliable vendor. Server hardware is built differently than workstation hardware. Don’t try to save money by buying home-built servers or workstations with server software installed. Reliable vendors include IBM/Lenovo, Intel, Cisco, HP, Dell, and a few others with more specialized hardware for specific needs.
  2. Consider redundant power supplies, or keep a spare power supply on hand. The two parts that fail most often are the power supply and the hard drives. Redundant power means you can keep working even if one of the two power supplies fails.
  3. Get server-grade hard drives. Workstation hard drives typically use a SATA interface, which is half-duplex: it can send or receive data, but not both at once. Server hard drives use a SAS interface, which is full-duplex and designed to optimize access to data for multiple users. They come in 10,000 RPM (fast), 15,000 RPM (faster) and solid-state (very fast) versions. Faster is more expensive. Allow room for growth.
  4. Get mirrored hard drives. “Mirrored” (sometimes called RAID 1) means that there are two identical hard drives with identical copies of your data. If one fails, your server keeps working and warns you to replace the failed drive.
  5. Don’t get RAID 5 – This is an older standard that has a single extra drive to provide redundancy. Your software vendor may recommend this. Here’s the supporting documentation to give them on why this is bad:

RAID 5 is deprecated and should never be used in new arrays
From <https://www.storagecraft.com/blog/raid-performance/>


Once one drive fails the load on the remaining drives increases dramatically and suddenly.  The chances of a second drive failing shortly after the first is abnormally high during a parity drive failure event.

This is the primary risk of a RAID 5 array and why they are listed as never recommended.
From <https://community.spiceworks.com/topic/290226-raid-5-array-not-rebuilding>

…recommendation for Raid 5 for all business critical data on any drive type and Raid 50 on 7200 SATA/NLS drives 1TB and larger are no longer best practice. As drives get bigger and bigger, RAID sets take longer to rebuild and the performance and size benefit you get from these RAID levels don’t outweigh the higher risk to reliability.

…. But even today a 7 drive RAID 5 with 1 TB disks has a 50% chance of a rebuild failure. RAID 5 is reaching the end of its useful life.

As well-informed commenter Liam Newcombe notes:

The key point that seems to be missed in many of the comments is that when a disk fails in a RAID 5 array and it has to rebuild there is a significant chance of a non-recoverable read error during the rebuild (BER / UER). As there is no longer any redundancy the RAID array cannot rebuild, this is not dependent on whether you are running Windows or Linux, hardware or software RAID 5, it is simple mathematics. An honest RAID controller will log this and generally abort, allowing you to restore undamaged data from backup onto a fresh array.

From <http://www.zdnet.com/article/why-raid-5-stops-working-in-2009/>
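
The “simple mathematics” in the quote above can be sketched as follows. This is an illustration under stated assumptions, not the article’s own calculation: it assumes an unrecoverable read error rate of 1 in 10^14 bits (a figure commonly published for consumer SATA drives), assumes errors are independent, and uses a hypothetical function name.

```python
import math


def rebuild_failure_probability(surviving_drives, drive_tb, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error while reading
    every surviving drive in full during a RAID 5 rebuild."""
    bits_to_read = surviving_drives * drive_tb * 1e12 * 8  # decimal TB -> bits
    # 1 - (1 - p)^n, computed with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits_to_read * math.log1p(-ure_per_bit))


# 7-drive RAID 5 with 1 TB disks: one drive has failed, six must be read.
p = rebuild_failure_probability(surviving_drives=6, drive_tb=1)
print(f"Chance of a read error during rebuild: {p:.0%}")
```

Under these assumptions the result is a little under 40%, in the same range as the quoted 50% figure, which may rest on slightly different assumptions or rounding. Either way, the risk grows quickly as drives get larger.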

While an array is in a degraded state it typically has less or no redundancy, performance is reduced sharply, and drives are under stress as they provide the data for the rebuild. Mirroring systems aren’t affected as badly, since only a single disk is required to rebuild the mirrored drive…  As drive sizes increased, RAID-5 rebuild times followed, taking hours or even days to reconstruct the data from a lost drive. During this time disks were run full-tilt as the RAID controller recreated the missing drive… This led into an issue [with] similar failure rates between drives in the same manufacturing lot.
From <http://blog.servercentral.com/the-levels-of-raid>

  6. Invest in a server disk controller. Server hard drives work best with a dedicated, intelligent hardware disk controller, which helps manage multiple simultaneous requests for data from the network users. A hardware “RAID” controller with at least 1GB of cache memory can dramatically improve performance.
  7. Use a battery backup (UPS, or uninterruptible power supply) with shutdown software on the server. When the power goes out, there is a chance that the server is in the middle of a critical operation, such as saving your data. A battery backup alone is not enough. You also need it to signal the server to shut down gracefully before it loses power completely.
  8. Have adequate backups. This is a topic for an entire article.
  9. Get the right hardware warranty. Server parts are available only from the vendor. Most vendors provide a warranty on hardware for up to 5 years. Either purchase a spare identical server, or make sure your hardware is under warranty for as long as it is critical to your business.
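
The graceful-shutdown advice in the UPS recommendation above comes down to one comparison: start shutting down while the battery still has more runtime left than the server needs to stop cleanly. The sketch below shows only that decision. The function name, the readings, and the safety margin are hypothetical; the shutdown software supplied with a real UPS reads these values from the hardware and has its own settings.

```python
def should_shut_down(on_battery, runtime_left_s, shutdown_time_s, margin_s=60):
    """Return True when the server should begin a graceful shutdown.

    on_battery      -- True if utility power is out and the UPS is discharging
    runtime_left_s  -- UPS estimate of remaining battery runtime, in seconds
    shutdown_time_s -- how long this server takes to shut down cleanly
    margin_s        -- extra safety buffer (an arbitrary illustrative default)
    """
    if not on_battery:
        return False
    return runtime_left_s <= shutdown_time_s + margin_s


# Illustrative readings only.
print(should_shut_down(on_battery=True, runtime_left_s=900, shutdown_time_s=120))
print(should_shut_down(on_battery=True, runtime_left_s=150, shutdown_time_s=120))
```

With these sample readings, the first call leaves the server running (plenty of battery left) and the second triggers a shutdown.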

And, of course, we can help you with selecting and implementing the right server for your business. Because we charge by the hour as consultants, we don’t have a bias based on hardware profit margins.

-Tim Torian