
Network Operation Center (NOC) Best Practices – Part 3: Processes

This is the third part of our 3-part blog series on Network Operation Center (NOC) best practices. The first post was dedicated to NOC tools. The second provided some useful tips regarding NOC knowledge and skills. In this last part, we’ll address processes.

What are the operational, structured processes that you should implement for effective and repeatable results? Here are our top ones.

Escalation

An escalation table ensures that all team members are clear on the proper protocol and channels for escalating issues. This table should also include all areas and skills covered by the NOC and the people who are trained to cover those areas.

For example, the table below defines the escalation procedure for DB-related problems.

Time Frame | Escalate To               | Method
0+15 mins  | DB on call                | SMS
0+30 mins  | DB on call                | Phone
0+60 mins  | DB Group Leader           | Phone
0+90 mins  | UNIX & DB Project Manager | SMS
0+120 mins | UNIX & DB Director        | SMS

A critical problem that is not solved within 30 minutes is escalated up the management ladder until someone responds or takes ownership. At every step of the process, it is recommended to involve all personnel up to the current level. So when an SMS is sent to the project manager, it is also sent to the DB on call and the Group Leader.
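The escalation table and the rule of involving everyone up to the current level can be sketched in Python. This is a minimal illustration rather than a prescribed implementation: the names EscalationStep, DB_ESCALATION, current_step and recipients are hypothetical, and the sketch assumes the steps are listed in ascending time order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationStep:
    after_minutes: int  # minutes since the incident was opened
    escalate_to: str
    method: str


# Steps copied from the DB escalation table; the structure itself is hypothetical.
DB_ESCALATION = [
    EscalationStep(15, "DB on call", "SMS"),
    EscalationStep(30, "DB on call", "Phone"),
    EscalationStep(60, "DB Group Leader", "Phone"),
    EscalationStep(90, "UNIX & DB Project Manager", "SMS"),
    EscalationStep(120, "UNIX & DB Director", "SMS"),
]


def current_step(elapsed_minutes, steps=DB_ESCALATION):
    """Return the latest step whose time frame has been reached, or None."""
    reached = [s for s in steps if s.after_minutes <= elapsed_minutes]
    return reached[-1] if reached else None


def recipients(elapsed_minutes, steps=DB_ESCALATION):
    """List everyone up to and including the current level, without duplicates."""
    names = []
    for s in steps:
        if s.after_minutes <= elapsed_minutes and s.escalate_to not in names:
            names.append(s.escalate_to)
    return names
```

For example, 95 minutes into an unresolved incident, current_step(95) returns the project manager step, and recipients(95) lists the DB on call, the DB Group Leader and the project manager.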

Prioritization

The process of prioritizing incidents is different in each NOC, and therefore should be clearly defined. Incidents should never be handled on a first come, first served basis. Instead, the shift manager should prioritize incidents and cases based on the importance and impact on the business. Issues that have a greater impact on the business should obviously be handled first.

Understanding the prioritization of incidents in terms of their business impact should be part of the NOC training. The entire team should be familiar with the NOC “Top 10” projects, and have an understanding of what signifies a critical incident. It could be the temperature rising in the data center, a major network cable breaking or a service going down.

Obviously, common sense is very useful. Clearly the shift leader should be able to determine that an incident that jeopardizes the entire data center has a higher priority than a request to verify why an individual server is down.
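Ordering by business impact instead of arrival time can be sketched as a simple sort. This is only an illustration under stated assumptions: the Incident class, the 1 to 5 impact scale and the prioritize function are hypothetical, and in practice the shift manager's judgment decides the impact score.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    title: str
    impact: int     # hypothetical scale: 1 = single server, 5 = entire data center
    opened_at: int  # minutes since shift start, used only as a tie-breaker


def prioritize(incidents):
    """Sort by highest business impact first; older incidents first among equals."""
    return sorted(incidents, key=lambda i: (-i.impact, i.opened_at))


queue = [
    Incident("Verify why one server is down", impact=1, opened_at=5),
    Incident("Data center temperature rising", impact=5, opened_at=20),
    Incident("Service down", impact=4, opened_at=10),
]
ordered = [i.title for i in prioritize(queue)]
```

Although the server request arrived first, the data center incident moves to the front of the queue because its impact is higher.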

Incident handling

The process of handling incidents applies both to NOC operators and shift managers. Both roles should be familiar with the specific process of handling incidents with the greatest impact on users.

The incident handling process should cover issues such as:

  • Full technical solution, if available.
  • Escalation of issue to appropriate personnel.
  • Notification of other users who may be directly or indirectly affected by issues.
  • ‘Quick solution’ procedures or temporary workarounds for more complex problems that may take longer to completely resolve.
  • Incident reporting. An incident report, completed once the incident is resolved, helps improve the service when the next incident occurs, and may also prevent the same incident from recurring.

Employing the proper tools, skills and processes in your NOC will allow you to run more efficient network operations, ensure smooth day-to-day operations and meet the demands of the IT department.

  A 360° view of a Network Operation Center in action using an IT Process Automation tool.

What other processes are you familiar with and would recommend to other NOC managers?

Download eBook: 10 Network Operation Best Practices

Network Operation Center (NOC) Best Practices – Part 2: Knowledge & skills

 

This is the second part of our 3-part blog series on Network Operation Center (NOC) best practices. The first post was dedicated to NOC tools. This part is dedicated to knowledge and skills. By ‘knowledge and skills’ we do not mean the obvious technical knowledge and network know-how your team members need in order to run day-to-day operations, but rather how you can ensure your team’s skills are used to their best potential, and how to keep those skills up to date over time.



Clearly define roles

Definition of roles may vary between data centers and will depend on team size, the IT environment and tasks. Still, there should be a clear distinction between the roles and responsibilities of operators vs. shift supervisors in the NOC.

Why does it matter?

It matters mainly because of decision making. Without clearly defined roles and responsibilities, a disagreement between operators may lead to late decisions and actions, or to no decision being made at all. This may affect customers, critical business services, and urgent requests during off hours.

It should be clearly defined, therefore, that a shift manager makes the final decisions.

Task division

A lack of role definition can also blur the division of tasks between operators and the shift leader.

A shift manager should be responsible for: prioritizing tasks, assigning work to operators based on their skills, verifying that tickets are opened properly and that relevant personnel are notified when required, escalating problems, communicating with management during important NOC events, sending notifications to the entire organization, preparing reports, and making critical decisions that impact many services, such as shutting down the data center in case of an emergency.

Operators, on the other hand, are responsible for handling the technical aspect of incidents, either independently or by escalating to another team member with the required skills. Operators are also responsible for following up and keeping tickets up to date.

While it might sound as if operators lack independence and responsibility, this is not the case. When faced with technical challenges, operators’ input and skills are probably the most critical for resolution and smooth NOC operation. Operators provide additional insights into problems, and can provide creative solutions when the standard procedures fail to work.

Invest in an orientation program for new NOC employees

How often have you started a new job, without receiving any orientation, mentoring or guided training?

Failing to provide proper training to new NOC operators always has consequences. A new NOC operator may not know where to find a procedure or how to execute it; be confused about who should handle a task – the NOC, service desk, or higher level of support; or in a more severe case, take a decision that causes equipment damage or results in downtime of critical business services.

Therefore, an extensive training program should be put in place for new NOC employees. This is definitely a challenge, considering the lack of resources, particularly in small NOCs. Ideally, such a program would consist of one week of classroom training followed by three weeks of hands-on training under the supervision of a designated trainer.

A new employee should only be trained by an experienced member of the NOC, preferably a shift leader. The trainer should be released from all other duties for the entire training period, so that the training is not gradually crowded out by urgent shift tasks.

The training program should be updated on an ongoing basis, and should include topics such as required users and permissions, technical knowledge, known problems, troubleshooting, teams and important contacts.

 

Communication and collaboration

Within your Organization

Establishing a solid communication flow between NOC members and other IT teams has many advantages. It propels professional growth, provides opportunities for advancement in the organization, and makes it easier to approach other teams when assistance is required. Most importantly, it allows NOC personnel to see the larger picture. NOC members who are aware of projects, services and customers’ needs simply provide better service.

A designated member of the NOC should attend weekly change management meetings. That person should communicate any issues or upcoming activities, such as planned downtime, to the rest of the team.

Defining NOC members as focal points for important IT areas, such as NT, UNIX, Network, or a specific project, is another good practice. These members should attend the meetings of the relevant teams, deliver new information and knowledge to the rest of the NOC, and handle specific professional challenges.

Within the NOC Team

Another important form of communication is within the NOC team itself. There are clear advantages to having a strong connection and collaboration between NOC team members. Members are more willing to help each other, information is shared more easily, and the general atmosphere encourages collaboration when addressing problems, as opposed to an individualized approach.

Team communication is a challenge when the NOC team is geographically spread out or located in different countries. Because cultural and language differences can cause confusion and misunderstandings, investing effort in building team communication and collaboration is even more critical.

Which additional skills and knowledge would you recommend for a NOC?

Download eBook: 10 NOC Best Practices