

It’s All About Your Reaction

March 1, 2022
Learn how renovation activities, and other unintentional accidents, can cause significant damage to your network equipment. Then, decide how best to prepare. […]

Create an Effective Disaster Prevention and Recovery Flow Before Disaster Strikes

Disasters often arrive unannounced, but that doesn’t mean we can’t plan for them. Consider the tornados that occurred in December of 2021. At the time of this writing, network service providers are facing damage to buildings and outside plant facilities. While aerial plant and tower-mounted equipment are the most affected, flooding of underground installations is also a concern. High winds damage roofs, allowing water ingress and flooding that damage both equipment and services. Having regional response teams is critical to service restoration.

An unplanned interruption of network services can range from performance degradation to a complete network outage. Natural events, human error, cyberattacks, design flaws, or a combination of these factors can all cause an unplanned interruption. To respond effectively to a disaster, planning is needed to mitigate the effects on health and safety, facility risk exposure, service disruptions, economic loss, potential liability, corporate reputation, and public relations, and to reduce the possibility of future regulatory requirements.

Disasters can occur while communications service providers are already facing budget and resource constraints. While providers need to maximize the reliability of their network infrastructure, they often must also manage converged traditional communications and IT-based networks with differing installation procedures and environments. Thus, major outages related to physical infrastructure continue to occur, resulting in lost revenue from service outages and direct damage costs. These outages can be caused by power outages and backup power failures; facility fires, A/C switch gear fires, and hydrogen gas concerns; and water leaks and floods.

The key elements of disaster prevention and recovery are:

  1. First and foremost, the disaster prevention and recovery plan should be tested to ensure that it can be effectively executed.
  2. In the event of a disaster, evacuate staff if needed.
  3. Determine usable and unusable equipment.
  4. Maintain critical systems and infrastructure needed to keep service available before anything else.
  5. Determine how to maintain service until final repairs can be made.
  6. For large-scale disasters, find available replacement equipment and staff as quickly as possible.
  7. Once final repairs are made, fully restore service.
  8. Prepare necessary insurance claims. 
  9. Discuss lessons learned and areas for improvement.

Review building and network equipment in the following areas:

  • Geographic and local building risks and site security
  • Power and back-up power availability
  • Fire protection (detection, suppression, safety)
  • Equipment selection and installation quality
  • HVAC systems and environment
  • Cabling, bonding, and grounding
  • Facility operations: Operating procedures, safety hazards, building conditions
  • Disaster recovery planning and procedures

Disaster Recovery Plan

Figure 1.

Because this plan becomes the reference for all managers and employees during a critical time, the disaster recovery plan should define the process of resuming normal business operations and repairing or salvaging critical equipment. (See Figure 1.) It’s also essential to include representatives from all critical organizations across the company.

  • The plan must provide for both major and minor disasters, and prepare for natural disasters, such as tornados and general flooding, as well as accidental and purposeful man-made disasters, including cyberattacks.
  • The plan must provide for initial and ongoing employee training as well as the skills needed during the recovery process.
  • The plan must spell out not only which functions are vital, but also the order in which they are restored.
  • And remember: no two disasters are the same.
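The requirement that the plan state both which functions are vital and the order in which they are restored can be written down as a small dependency table. The following is a minimal sketch, not a prescribed method: the function names and their dependencies are hypothetical placeholders, and the ordering is derived with the `graphlib` module from Python's standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical vital functions, each mapped to the functions that must be
# restored before it. These names are placeholders, not from any real plan.
RESTORE_DEPENDENCIES = {
    "commercial_or_backup_power": set(),
    "hvac": {"commercial_or_backup_power"},
    "transport_network": {"commercial_or_backup_power", "hvac"},
    "emergency_calling": {"transport_network"},
    "customer_broadband": {"transport_network"},
}


def restoration_order(dependencies):
    """Return one order in which every function follows its prerequisites."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, function in enumerate(restoration_order(RESTORE_DEPENDENCIES), start=1):
        print(f"{step}. {function}")
```

Because no two disasters are the same, a table like this is a starting point that the recovery team would adjust during an event. Keeping dependencies explicit makes it easier to see what else must be restored first when one function is unavailable.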

Your disaster recovery plan should include information related to:

  • Minimizing interruptions to normal operations.
  • Notifying those affected.
  • Limiting the extent of disruption and damage.
  • Minimizing the economic impact of the interruption.
  • Establishing alternative means of operation in advance.
  • Training personnel in emergency procedures.
  • Planning for contingencies.
  • Providing for smooth and rapid restoration of service.

A good reference to review is NIST Special Publication 800-34, Contingency Planning Guide for Federal Information Systems (Reference 1).

Proactive Response

Let’s focus on the actions needed to respond to specific disasters and to determine the level of damage. In response to a disaster, we must also determine which equipment can remain in use temporarily, which equipment needs to be deactivated or decommissioned and replaced, and which equipment can be left in service.

The most common disaster scenarios encountered include:

  • Water Damage: Flooding or overhead ingress from pipe breaks and roof failures
  • Fire and Smoke: Direct thermal damage, corrosive smoke, or corrosive fire suppression
  • Contamination: Indoor- or outdoor-generated particulate or chemical releases
  • Critical Equipment Failure: Generator, air conditioning, automatic transfer switch (ATS), switch gear, etc.

A. Evaluation of Water Damage

Direct contact with water will rapidly cause electrolytic corrosive damage to electronics. (See Figures 2a and 2b.) However, electrical systems such as switch gear and distribution panels often can be restored. Thermal damage from electrical arcing and short-circuits may also occur, causing additional damage to electronics.

Figures 2a and 2b. Equipment Backplane and Connector Damage

If the electronics are not energized at the time of wetting, returning them to service through cleaning and testing is much easier. In addition to detailed visual inspections of equipment, sample collection for chemical and elemental analysis is critical. It determines the type of corrosion and the impact of contaminants in the water, and it differentiates the types of metals affected and the severity of the damage, from superficial surface oxidation to corrosion of underlying metals. This information is needed to make replacement, restoration, and cleaning recommendations and to ensure long-term reliability. Additional testing considerations may be found in NEMA and NETA guidelines (Reference 2).

Recommended analytical testing routinely performed includes:

  • Ion Chromatography Analysis quantifies anionic and cationic contamination levels in micrograms per unit of surface area. These levels can be compared to industry standards for electronics to aid in restoration or replacement decisions.
  • Environmental Scanning Electron Microscope with Energy Dispersive Spectroscopy (ESEM/EDS) provides elemental analysis of the composition of corrosion.
  • pH or Halogen Paper spot tests determine corrosivity of residues when hydrated.
  • Thermography (Infrared Camera) is used on water-impacted electrical panels.
  • Insulation Resistance or Dielectric Breakdown testing is used on electrical cabling and panels.

B. Evaluation of Fire and Smoke Damage on Electrical Cards

Figure 3. Equipment Fire Damage

While direct thermal damage and combustion in a fire are devastating, the impacts of the corrosive smoke and fire gases generated are typically far-reaching. If not quickly remediated, they may cause extensive damage to electronics well away from the fire event or throughout an entire building. The combustion or thermal decomposition of common halogenated plastics, cable insulation, and polymers used throughout communications and IT facilities releases corrosive halogenated chemicals that may attack metallic surfaces. (See Figure 3.)

It is critical to perform chemical analysis throughout impacted and surrounding areas to map out the deposited concentrations of these anionic contaminants. Once the contamination levels are known, decisions on equipment replacement, restoration, basic cleaning, or no cleaning can be developed and refined for different areas within a facility.
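The mapping from measured contamination levels to a per-area decision can be illustrated with a short sketch. Everything specific here is an assumption made for illustration: the threshold values are invented placeholders (real limits must come from the applicable industry standard and the testing laboratory), the area names are hypothetical, and a single anion level per sample stands in for a full ion chromatography report.

```python
from dataclasses import dataclass

# Hypothetical action thresholds, in micrograms per unit of surface area.
# Placeholder values only; they are not taken from any standard.
THRESHOLDS = [
    (5.0, "no cleaning necessary"),
    (20.0, "basic cleaning"),
    (100.0, "restoration"),
]
ABOVE_ALL_THRESHOLDS = "replacement"


@dataclass
class Sample:
    area: str           # where in the facility the sample was collected
    anion_level: float  # measured anionic contamination for this sample


def recommend(level):
    """Map one measured contamination level to a recommended action."""
    for upper_limit, action in THRESHOLDS:
        if level <= upper_limit:
            return action
    return ABOVE_ALL_THRESHOLDS


def map_facility(samples):
    """Return a recommendation per area based on its worst sample."""
    worst = {}
    for sample in samples:
        worst[sample.area] = max(worst.get(sample.area, 0.0), sample.anion_level)
    return {area: recommend(level) for area, level in worst.items()}


if __name__ == "__main__":
    readings = [
        Sample("fire room", 250.0),
        Sample("adjacent hall", 12.0),
        Sample("adjacent hall", 30.0),
        Sample("second floor", 1.5),
    ]
    for area, action in map_facility(readings).items():
        print(f"{area}: {action}")
```

Using the worst sample in each area is a deliberately conservative choice in this sketch. An actual assessment would also weigh visual inspection and the other analytical tests before settling on a recommendation.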

The chemical analysis also identifies irreparable corrosion, other damage, and chemicals of concern. Many of the same analytical tests described for water residues should be used, including:

  • Ion Chromatography Analysis quantifies anionic and cationic contamination levels in micrograms per unit of surface area. These levels can be compared to industry standards for electronics to aid in restoration or replacement decisions.
  • Environmental Scanning Electron Microscope with Energy Dispersive Spectroscopy (ESEM/EDS) provides elemental analysis of the composition of corrosion.
  • pH or Halogen Paper spot tests determine corrosivity of residues when hydrated.

C. Evaluation of Contamination

Figure 4. Field Sample Chemical Analysis

In addition to water and fire events, other contamination events include renovation activities. Construction or demolition debris, zinc or tin whisker contamination, hygroscopic (moisture-absorbing) pollutant ingress, chemical spills, virus or bacteria decontamination byproducts, refrigerant leaks, diesel spills, and electrolyte leaks are all possible contaminants affecting critical facility equipment. (See Figure 4.) In addition to a detailed visual evaluation and a review of chemical data sheet information, several of the same analytical tests should be employed:

  • Optical Microscopy of affected components characterizes the contaminants.
  • Ion Chromatography Analysis quantifies anionic and cationic contamination levels in micrograms per unit of surface area. These levels can be compared to industry standards for electronics to aid in restoration or replacement decisions.
  • Environmental Scanning Electron Microscope with Energy Dispersive Spectroscopy (ESEM/EDS) provides elemental analysis of the composition of corrosion.
  • pH or Halogen Paper spot tests determine corrosivity of residues when hydrated.

Now you can see why it’s important to plan and prepare for all phases: being ready to react to a disaster, analyzing the damage, and taking steps to minimize service interruptions and recovery efforts.

While you can’t predict disasters, a well-planned and rehearsed Disaster Recovery Flow plan (as illustrated in Figure 1) can help your facilities, team members, and customers recover more effectively.

References and Notes
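1. NIST Special Publication 800-34, Contingency Planning Guide for Federal Information Systems: https://www.nist.gov/privacy-framework/nist-sp-800-34
2. NEMA: http://www.nema.com; NETA: https://www.netaworld.org/home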

For more information, please email [email protected] or visit https://www.ericsson.com/en.
Follow Ericsson on Twitter @ericsson.

About the Author

Ernie Gallo | Director at Network Infrastructure Solutions, a division of Ericsson, Inc.

Ernie Gallo is Director at Network Infrastructure Solutions, a division of Ericsson, Inc., and an IEEE C2 National Electrical Safety Code Committee Member, Subcommittees SC2, SC4, SC5, and SC8. He has more than 40 years of experience in outside plant products, requirements, deployment, and electrical safety codes and standards such as the NEC and NESC.

To learn more about the NESC and NESC products, please visit https://standards.ieee.org/products-programs/nesc/. Follow the IEEE SA on Twitter @IEEESA. We welcome you to contribute to our efforts by joining the NESC.