By: Chuck Wilson, Content Proposal Manager – Identity Solutions, USA Region
Most, if not all, Driver’s License/ID card contracts include provisions for Disaster Recovery (DR) by specifying backup production facilities to ensure a timely response to demand in the face of interruptions at the primary production facilities. Experience has shown that secondary DR facilities operated only in the event of a disaster or for test purposes usually cannot respond adequately to an actual disaster.
First, every disaster is different, and disasters rarely follow the scenarios developed to design and test the system. In a genuine disaster situation, the best response comes after carefully considering the circumstances – something for which there is little time and often even fewer resources.
Second, initiating production in the DR facility assumes that all IT equipment, data tunnels and physical materials are fully functional and ready to go “at the flip of a switch”. This isn’t usually the case – processes and card constructions have simply become too complex to start successfully on a moment’s notice.
Finally, initiating production at the DR facility assumes that the facility’s personnel are familiar with the day-to-day nuances of production – the minor, undocumented procedures that make daily production flow smoothly. Without actually running the program every day, the DR facility doesn’t develop these processes. The truth is, bona fide disasters are often accompanied by a second – hopefully smaller – disaster when the DR facilities are activated.
Disaster Recovery Best Practices for Credential Issuance
A best practice for disaster planning and business continuity is a technique, method, process, or activity that is effective at restoring the operation of an enterprise after a disaster or enterprise interruption event occurs.
With proper processes, checks, and testing, a desired outcome can be delivered with fewer problems and unforeseen complications. Best practices can be defined as the most efficient (least amount of effort) and effective (best results) way of accomplishing a task, based on repeatable procedures that have proven themselves over time.
• Create records of all systems needed and update them periodically – The computer / printer systems used and their configurations must be documented, as well as software and firmware updates. This should include all service packs, fixes, and QFEs (quick fix engineering) that have been applied.
• Validate that all technology is properly installed and configured from the start – a technology solution that is properly implemented in terms of its hardware, firmware, and software will dramatically reduce problems and downtime in the future. Proper initial configuration can also save time and reduce issues with upgrades, hot patches, and other changes.
• Monitor the processes and people to evaluate what is critical – Ensure the facilities, people and infrastructure are in place to meet current needs and contingencies. An assessment that examines and analyzes the environment’s capabilities and requirements on a periodic basis can provide valuable information to help improve efficiency.
• Train staff on how to execute the DR/Business Continuity (BC) procedures – People are the front line when it comes to operational support. Staff who have not been properly trained in the DR/BC processes and procedures will be a hindrance when an event occurs. Everyone must have the knowledge and skills to provide the right support. This not only helps to reduce downtime, but also delivers better performance and a faster ROI through better and wiser use of IT assets. Further, the operations staff and IT staff must practice how to recover from any outages or disasters. This includes restoring all procedures that were in place prior to the system failure.
• Record and disseminate a clear definition for declaring when a disaster or business interruption occurs that will set the DR/BC process into motion – There needs to be a clear process for allocating resources based on their criticality and availability. This will define the “rules of the road” for who does what and when while minimizing the factors that can negatively impact enterprise operations. As part of this process, there must be a clear point of division that separates troubleshooting an issue from issue recovery.
• Integrate the DR/BC processes with change management – Changes are inevitable in any sizable environment. It is difficult to keep up with the flood of new IT applications, emerging technologies, and new tools. That is why it is essential to design, implement, and continuously improve change and configuration management processes.
• Focus on addressing issues before they impact the enterprise – At the current speed of business, after-the-fact fixes do not make the grade. Trouble must be anticipated and headed off before it happens. It is important to identify risks across people, processes, and technologies so that appropriate countermeasures can be implemented. An appropriate level of support must be planned for, including proactive features such as critical patch analysis and change management support.
• When possible, maintain a replicated, fully staffed hot site of the critical system – The best way to create this is to maintain duplicate machines in two separate card personalization facilities, enabling both to perform the same work while retaining the capacity for each site to fully absorb the work of the other in the event of a catastrophic occurrence. This significantly shortens the time from recognition of a downtime event to full DR implementation. The same outcome occurs if two sites, each sized for 100% of the workload, normally process 50% of it each and have failover capability.
• Annually update and test the DR/BC Plan – DR/BC testing should be conducted prior to putting the production system in operation, and repeated at least once per year. The test should include all recovery procedures.
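The hot-site practice above comes down to simple capacity arithmetic, which can be sketched in a few lines of Python. This is only an illustration: the function name and the figures are hypothetical, and capacities and demand are assumed to be expressed in the same unit (for example, percent of daily card volume).

```python
def survives_single_site_loss(site_capacities, total_demand):
    """Return True if the remaining sites can still cover total demand
    after any one site goes offline."""
    if len(site_capacities) < 2:
        # With fewer than two sites there is nothing to fail over to.
        return False
    total_capacity = sum(site_capacities)
    return all(total_capacity - lost >= total_demand for lost in site_capacities)


# Hypothetical figures, in percent of daily card volume.
dual_hot_sites = [100, 100]    # each site can produce the full volume alone
undersized_sites = [50, 50]    # each site is sized only for its normal half share

print(survives_single_site_loss(dual_hot_sites, 100))    # True
print(survives_single_site_loss(undersized_sites, 100))  # False
```

Two sites that each run at 50% of their capacity pass the check, while two sites sized only for their everyday share do not, even though both arrangements look identical during normal production.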
Valid’s credential issuance system was designed from the ground up to be robust and disaster-proof. We maintain dual hot-site facilities, fully staffed with the same equipment, and our two facility staffs share the experience of producing the same credentials for each customer. While we routinely split the production demand between two sites, we maintain the capacity and resources at each site to process 100% of the total credential volume. Valid is committed to providing our clients the best Disaster Recovery posture in the industry! To learn more, please write to email@example.com, or click here to receive a solution consultation.