The role of a system administrator involves three distinct responsibilities:
1. Be ready to prevent the issues that are expected to arise.
2. Be ready to resolve the expected issues that could not be prevented from occurring.
3. Be ready to resolve the issues that are unexpected in nature.
Most of us are well trained to fulfill the first two responsibilities efficiently. But very few of us are actually ready to deal with the third, and the reason is not a lack of technical capability but a “lack of visibility into how the servers we manage are connected to other IT infrastructure components (such as networks, storage, and power supply)”. In this post I am going to shed some light on this third responsibility of the system administrator.
The most unexpected and severe outage for any IT infrastructure could be a complete power supply failure in a data center, caused by a technical or human error. When the power supply fails, it is not only the servers that go down; the other major components, such as network devices and storage devices, go down as well. The figure below shows how the components in a data center are interconnected.