How Slackbots and Ayehu Add Automation to BMC Helix Remedyforce

Author: Guy Nadivi

BMC Helix Remedyforce is a version of the BMC Remedy platform popular among organizations already using Salesforce.com, making it easy to deploy rapidly for IT organizations who value being nimble.

Since that’s a growing segment of the market, and given the surge of interest in chatbots, BMC and Ayehu have partnered to showcase how to add Slackbots and automation to the Helix Remedyforce platform.

BMC Helix Remedyforce provides a robust IT Service Management platform for running an IT organization and supporting the business. It takes a modern customer-focused perspective, and adds very intuitive self-service capabilities that empower non-IT staff to request services and solve problems on their own. BMC Helix Remedyforce comprises numerous modules, including:

  • Self-Service
  • Service Catalog
  • Knowledge Management
  • Service Level Management
  • Dashboards
  • Reporting and Analytics
  • Incident and Problem Management
  • Configuration Management
  • Asset Management
  • Agentless Discovery
  • Client Management
  • Multi-Cloud Data Center Discovery
  • Change and Release Management
  • Mobile Apps for IT and Business
  • Collaboration via Chatter and Chat
  • IT Best Practices and Smart Practices

Together, all this functionality allows BMC Helix Remedyforce to offer a unique value proposition of a short time to value, with light effort, yet still yielding a powerful delivery.

If your organization uses a cloud computing platform like BMC Helix Remedyforce, then being very lean and very responsive is most likely a priority. But there’s a way to take that leanness and responsiveness one level higher to help your organization become a self-driving enterprise through the addition of Slackbots and automation from Ayehu.

At Ayehu, we often talk about the self-driving enterprise, which is our guiding vision that influences every aspect of our automation platform.

What is a self-driving enterprise and how do we define it? Very simply, becoming a self-driving enterprise means becoming less reliant on people, and leveraging intelligent automation to handle more of the robotic kinds of tasks humans really shouldn’t be doing anyway.

Ayehu’s platform comes with numerous features an enterprise needs to become self-driving:

  • SaaS-Ready Multi-Tenancy
  • Agentless architecture
  • Codeless interface
  • And overall it’s very easy to use

It also has two features which really extend automation’s ability to help enterprises become self-driving, and thus less reliant on people:

  • AI and Machine Learning
  • Slackbots, which are an extension of AI and Machine Learning that provide end users with an almost human-like channel as an alternative to calling the help desk every time they have an incident or a request.

Slackbots, of course, are part of the overall chatbot market, which is big and getting bigger. Lest anyone think chatbots are a fad, according to Business Insider, in 2019 the market was worth a bit more than $2.5 billion, and it is forecast to approach $10 billion in 2024!

That’s a compound annual growth rate of over 29% a year. Very impressive growth!
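
The growth rate comes from the standard compound annual growth rate formula. Here is a minimal sketch, assuming round figures of $2.5 billion for 2019 and $10 billion for 2024. The source only gives approximate figures, so with these rounded inputs the result lands a little above the quoted rate, which was presumably computed from the exact market numbers.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.30 means 30% a year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Assumed round figures in billions of dollars, not the exact published values.
rate = cagr(start_value=2.5, end_value=10.0, years=2024 - 2019)
print(f"CAGR: {rate:.1%}")
```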

I think we can safely say that chatbots are here to stay.

Gartner published a report about the chatbot market (“Market Guide for Conversational Platforms,” July 30, 2019, ID G00367775), which calculated that “31% of enterprise CIOs have already deployed conversational platforms.”

That number “represents a 48% year-over-year growth in interest.”

This is a strong leading indicator that the market is ready, if not eager, for conversational AI in the form of things like Slackbots.

One big reason enterprises are so eager for conversational AI and Slackbots is the impact they’re having on one of IT’s biggest KPIs: Cost Per Ticket.

There’s a general industry figure published by Jeff Rumburg of MetricNet, an IT research and advisory practice, that a service desk’s average cost per L1 ticket is $20.

However, if you turn any given service request into a self-help or self-service function with chatbots, you can drive that cost down by 80% to just $4 per L1 ticket. 80%!
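
To see what that reduction could mean for your own service desk, here is a minimal sketch of the arithmetic. The $20 and $4 figures come from the text above; the ticket volume and the share of tickets shifted to self-service are hypothetical inputs you would replace with your own numbers.

```python
def cost_reduction(cost_before: float, cost_after: float) -> float:
    """Fractional reduction in cost per ticket (0.8 means 80%)."""
    return (cost_before - cost_after) / cost_before


def annual_savings(tickets_per_year: int, share_deflected: float,
                   cost_before: float, cost_after: float) -> float:
    """Savings from moving a share of tickets to the cheaper channel."""
    return tickets_per_year * share_deflected * (cost_before - cost_after)


L1_COST = 20.0            # average cost per L1 ticket, from the text
SELF_SERVICE_COST = 4.0   # cost per ticket via chatbot self-service, from the text

print(f"Reduction per ticket: {cost_reduction(L1_COST, SELF_SERVICE_COST):.0%}")

# Hypothetical example: 50,000 L1 tickets a year, 30% handled by a chatbot.
savings = annual_savings(50_000, 0.30, L1_COST, SELF_SERVICE_COST)
print(f"Estimated annual savings: ${savings:,.0f}")
```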

If you’re a CIO, CTO, or any senior IT Executive, and someone tells you that there’s a way to reduce your single biggest expenditure on IT Support by 80%, without reducing service effectiveness (in fact, possibly speeding it up), you’re probably going to want to hear more.

Enterprises are looking at chatbots as a way to divert calls or tickets or work away from the Service Desk, meaning people, and re-routing that load to chatbots, meaning software.

BTW, it’s not just about bottom-line costs and reducing call and ticket volume to the service desk.

There are other value propositions for enterprise IT executives deploying chatbots:

  • Slashing MTTR by accelerating resolutions of incidents and requests
  • Liberating staff from doing tedious work so they’re freed up for more important tasks
  • And last but not least, raising customer satisfaction ratings, an increasingly critical KPI for IT

Today, there’s another big reason to start using chatbots – the Coronavirus COVID-19.

The Coronavirus pandemic is creating a new reality for everyone, and that’s led to widespread adoption of numerous precautions:

  • Washing one’s hands more frequently
  • Not shaking each other’s hands
  • Wearing protective facemasks

Perhaps the most relevant precaution being adopted, from an IT perspective, is the sudden surge in employees and contractors working remotely.

Numerous governments and health officials are imploring organizations to let their employees work from home, wherever possible, as a way of minimizing community transmission of the Coronavirus.

This has created a new reality for those workers, because now that they’re working from home, they can’t just walk over to the help desk cubicle to make a casual request. They might not be able to do it by phone either because the help desk staff is also working from home, and they’re pretty busy right now at most organizations just keeping the lights on.

Wherever remote workers may be though, they can always submit their service requests through a chatbot, and they can do it from either a web or mobile interface 24×7.

The great news about that is that there’s really no training required for someone to start using a chatbot or Slackbot, especially if it’s on their smartphone, an interface they’re already familiar with.

Slackbots can play an increasingly important role in a self-driving enterprise, allowing users to converse with the bot naturally (so to speak), and in their own language. The bot can understand the request, or if not, request clarification. Once it has the information it needs, the bot simply goes out and executes the request. It’s just that straightforward.

In addition to BMC Helix Remedyforce, there are many other systems you can quickly plug into Ayehu, which then acts as an integration hub across just about every platform in your environment.  This allows your users to initiate automated tasks via chatbot for every system you integrate with Ayehu. Best of all, almost every system Ayehu connects to can be seamlessly integrated without writing a single line of code.

If your organization aspires to be a self-driving enterprise, Ayehu automation + BMC Helix Remedyforce + Slack chatbots can provide a powerful combination that adds value to such IT functions as:

  • Incident resolution
  • Alert-driven notification
  • Cross-IT change management
  • Service request management
  • Configuration management and infrastructure provisioning

If you’re interested in test driving Ayehu NG v1.6 with all its cool new features, download your very own free 30-day trial.

https://info.ayehu.com/download-free-30-day-trial-ng

Slash MTTR with Intelligent Automation for AIOps

Author: Guy Nadivi

There seems to be confusion in the marketplace about what exactly the term “AIOps” means, but there’s much less confusion about what it can do: improve IT’s customer satisfaction scores by reducing noise, lowering call volume to the service desk, and slashing MTTR.

These are the types of benefits every IT organization is demanding, and the good news is they’re attainable right now.

Ayehu has partnered with Edge Technologies to show you a vision of what that looks like, and give you a glimpse at the promise that AIOps can bring to your IT organization.

Many of you have worked for years in the IT Operations and Systems Management space. Some of you may recall that in the mid-‘90s, Enterprise Systems Management and Business Service Management (or BSM for short) emerged as new disciplines that would bring together distributed systems and mainframes into a single pane of glass to solve problems. As you may know, Gartner killed off the BSM category in 2016 because vendors failed to deliver on these promised benefits.

In many large enterprises, the picture today still remains the same. Does this scene look familiar to you?

The CIO is still asking, “Why has customer experience dropped for our core service?” The IT Ops Manager is unsure what the cause might be, as everything looks good thanks to fantastic “monitoring.” And the SRE can’t make sense of any of the screens because he or she is suffering from information overload and isn’t sure where to look. No wonder MTTR is high!

Even with today’s AIOps vendors, and a market where new ones seem to be entering the space every week, the promise of universal views into your operations remains elusive.  Nevertheless, it’s still a highly sought-after goal.

So the question is, what is preventing progress towards that goal?

Today, we still continue seeing knowledge and visibility silos across the enterprise from business units, support, operations, and engineering functions all the way through to 3rd-party service providers.

This is one of the main challenges to overcome if AIOps is going to succeed. Internal politics, tool proliferation, and un-integrated workflows continue contributing to the slow adoption of AIOps.

Sound familiar?

The promise of a “single pane of glass” never materialized, leaving teams to use point products with limited integration and different data formats. The result? A huge and costly inventory of tools to manage and operate, leading to more frustration.

It’s widely accepted across the industry that most monitoring dashboards today fail to provide the operational views the business needs.

AIOps aims to fully automate IT Operations workflows, but the reality today is that enterprises still struggle with tool sprawl resulting in the “swivel chair” effect. Your triage and remediation workflows are still very much reactive in nature, but the goal is to prevent incidents from happening in the first place as much as possible, right?

Also, in our experience over the years, the tools used today are more than likely to be replaced at some point, so the best approach is to have a vendor agnostic data visualization and integration solution for your dashboarding needs. The tools supplying the dashboard data feeds will come and go.  Replacing them is a simple configuration change in Edge.

In order to break the knowledge and visibility silo challenges and create intelligent operations dashboards for increased AIOps adoption, think of the process in three parts:

Part 1:   Integrate all required data sources, ranging from customer experience to your enterprise IT domains, to give business and service health views by role. For example, executive, manager, and analyst views.

Part 2:   Integrate your existing event management, monitoring, and IT service management tools at the data and web layers to maximize your existing tool investments, skills, and standard operating procedures to become more proactive than ever before.

Part 3:   Integrate your process automation tools (such as Ayehu) to create convenient and frictionless workflows that can be executed in either attended or unattended mode.

Now that we better understand the problems and obstacles in the way of making progress, let’s walk through the process of creating ideal intelligent operations dashboards for your AIOps initiatives by uniquely combining your data and tools into role-based views of your business and services.

When we think about digital transformation and the outcomes businesses are looking for, one of the goals CIOs have longed to achieve is ensuring that business and enterprise IT are completely aligned. This has been a goal for as long as most of us can remember!

To reflect that in our intelligent operations dashboards, let’s start from the top level, which is a set of first-level business, customer, and end-user experience (EUE) dashboards that appeal to all levels of the organization.

The second level is a triage dashboard, designed to allow teams to quickly identify whether the server, network or application layer is the source of an outage or service health issue.

The third level is a dependency-mapping dashboard that links application, network, and server infrastructure together in topology views to understand the business impact.

The fourth level is individualized dashboards specifically designed for teams and dedicated roles: application, infrastructure, and network monitoring dashboards. This level of dashboard is where SMEs can directly access your existing best-in-class tools using Edge’s unique web UI proxying capability.

The fifth level gives you access to your raw data (logs, events, packet traces, and call stack traces, for example) so that detailed analysis can be performed in the context of the issue being investigated.

By combining your data sources and tools into universal views using a single platform like Edge, you can provide appropriate dashboards to your executives, management, and SMEs that provide them access to the content they need and tasks they need to perform to be successful in their daily jobs.

By combining business and related service health metrics along with the power of integration with your data and tools, you can rapidly identify root cause, fix the problem for good, and slash your key performance indicators such as MTTR. Many Edge customers report having happier customers, greater alignment between business and IT, eradication of visibility silos, and overall better decision making and outcomes from their deployment.

Not least of all, their most valuable assets (people) are more successful in meeting their goals and performing their job tasks.

Now let’s talk a bit about automation.

Digital Transformation is a buzzword you hear a lot about these days.  It doesn’t have one standard definition but can basically be understood to mean the collection of technology, process, and even cultural disruptions an organization adopts to maximize its competitiveness in the 4th industrial revolution.

Those technology disruptions can include things like cloud computing, artificial intelligence, chatbots, and of course automation.

The process disruptions include things like Agile or Six Sigma, and a cultural disruption might be something like repositioning the organization’s focus to be better aligned with the customer journey.

For IT departments, digital transformation ultimately boils down to optimizing and accelerating delivery of computing services, regardless of whether the customer is external or internal.

When it comes to incident monitoring, one thing an IT department can do as part of its digital transformation is to consolidate the visualization of all its various monitoring tools into a single pane of glass, as Edge Technologies enables. A unified dashboard providing a 360° view of operations can also provide an extraordinary opportunity to not only centralize incident monitoring but also to automate incident remediation. That represents a big step forward in the digital transformation of data centers, and a perfect example of how 1 plus 1 can sometimes equal 3.

A recent paper published by Gartner (ID G00390283 – October 9, 2019) advised its readers that an ideal performance monitoring dashboard framework must aim to “Provide for the rapid triage and remediation of performance issues…”.

No argument there. Ayehu and Edge Technologies agree that combining automation with performance monitoring is central to an ideal dashboard framework.  But perhaps the most important word to emphasize in Gartner’s recommendation is “rapid”.

Unfortunately, “rapid” is not an adjective that the vast majority of service desks can use to describe their MTTR today.

MetricNet, the IT consulting firm that publishes benchmarks, performance metrics, and scorecards for a variety of IT-related activities, claims that the average incident MTTR is 8.40 business hours.  If you’re an end user in an organization who just submitted a ticket to the help desk, you do NOT want to hear that it will take an average of 8.40 business hours to remediate your issue.  On the contrary, you want to know that your IT department is doing everything it can to expedite a resolution for your incident, before it starts hampering your personal productivity.

When it comes to MTTR, your mileage may vary of course, depending on your IT organization’s ticket backlog, user population density, and complexity of tickets handled.

Regardless though, one universal factor that’s slowing down almost all IT organizations is the ever-increasing user demand for IT services, which often leads to growing system complexity in your environment to accommodate that growth, and ultimately results in ever increasing pressure on your staff to keep up. 

However, people don’t scale very well.  Even the very best data center workers can only do so much.  At some point, and that point is pretty much right now, automation has got to do more and more of the repetitive, tedious, laborious tasks all this growth in demand for services and increased system complexity is creating.

That’s why consolidating visualization of all your monitoring tools into a single pane of glass, and incorporating automated incident remediation into that dashboard, can give your IT department the critical boost it needs to overcome the lack of human scalability.


Introducing Ayehu NG v1.6 – New Advanced Features

Author: Guy Nadivi

If you’re an existing user of Ayehu NG, or even if you’re just thinking about trying us on for size, you probably know that one of the core strengths of our solution is how easily and quickly you can plug Ayehu into various ITSM platforms, cyber security tools, operating systems, messaging and notification solutions, and increasingly chatbots and AI services.  Almost all of these integrations can be activated seamlessly without writing a single line of code. 

The purpose of providing you with all these pre-built integrations and connectors that make up our ever-expanding ecosystem is to simplify your ability to orchestrate automation across any platform in your environment. All from a single pane of glass!

We add new integrations and their accompanying activities to Ayehu on an ongoing basis, but sometimes that hasn’t been quick enough for some of our customers and prospects.

In our last release of NG, v1.5, we introduced you to its new Activity Designer.

This new functionality allowed you to build your own activities from scratch, in Python, C#, or .NET.  Many of our customers use the Activity Designer to create activities for actions against external systems that we don’t currently connect to. For example, you can use it to connect to Dropbox, Google, or any other third-party system that has accessible APIs.

SDK

In this new version 1.6 of NG, we’ve added a software development kit.  This new SDK means that now, in addition to being able to build custom activities, you can build entire custom integrations!  So if you’d like to integrate Ayehu with a platform we don’t currently have an integration with, you can do it yourself.  This might be especially helpful if you’ve got a homegrown application that’s the only one of its kind, and you want to automate certain tasks for it.  You can do that with the new SDK, and orchestrate the workflows from right inside Ayehu NG, just like you do for your other platforms.

NG-to-NG Migration Tool

In the past, migrating an NG workflow from a pre-production environment like DEV or TEST into PRODUCTION was a bit challenging.

In release 1.6, though, it becomes a breeze with our new NG-to-NG Migration Tool. This tool makes moving workflows from a DEV or TEST environment much simpler because it brings over almost all the entities associated with that workflow into your PRODUCTION environment.

BTW – this comprehensive migration can be done on a single workflow or an entire folder of workflows.

Slack Bot

Many of you have been taking advantage of Ayehu’s integration with Slack to build intelligent bots that provide your end users with powerful self-service remediation capabilities and ease the strain on your service desks.

In version 1.6, we’ve greatly simplified the process of configuring a Slack bot to just one click.  On top of that, we’ve also activated an “Add to Slack” button on our Slack Integration page so you can easily register your bot with Slack.

Configurable Password Policy

Ayehu NG v1.6 now has enhanced security for configuring default password policy. This means that passwords for all new accounts will be much stronger.  You can now set your own default password strength and parameters based on your organization’s security needs.

So if your password standard requires 12 characters, two special characters, and a number, you can now set this as the default and it will be enforced across all local accounts in your environment. If you prefer syncing accounts from your Active Directory, then NG will default instead to the password policies you’ve established in AD.
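
To make the example policy concrete, here is a minimal sketch of how such a rule set could be checked. It is purely illustrative and not how Ayehu NG implements its policy: the `PasswordPolicy` class, the `violations` function, and the choice of `string.punctuation` as the set of special characters are all assumptions.

```python
import string
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PasswordPolicy:
    """Hypothetical policy mirroring the example: 12 characters, 2 specials, 1 digit."""
    min_length: int = 12
    min_special: int = 2
    min_digits: int = 1


def violations(password: str, policy: PasswordPolicy = PasswordPolicy()) -> List[str]:
    """Return a list of human-readable policy violations (empty if compliant)."""
    problems = []
    # Assumption: "special characters" means ASCII punctuation.
    special = sum(ch in string.punctuation for ch in password)
    digits = sum(ch.isdigit() for ch in password)
    if len(password) < policy.min_length:
        problems.append(f"needs at least {policy.min_length} characters")
    if special < policy.min_special:
        problems.append(f"needs at least {policy.min_special} special characters")
    if digits < policy.min_digits:
        problems.append(f"needs at least {policy.min_digits} digit(s)")
    return problems


print(violations("Tr!cky#Passw0rd"))  # compliant, so no violations are listed
print(violations("short1!"))          # too short and only one special character
```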

Updated Installer

Another improvement NG v1.6 introduces is an updated installer, which simplifies installation while also providing greater visibility into the process.

During installation, you can review all the components to be installed in a component selection tree. The most popular components are selected by default, but you can easily deselect the ones you don’t want. We’ve also added a prerequisite check screen to ensure that the installation will complete successfully, and to let you know if any minimum installation requirements are lacking.

Refreshed Login Page

This new feature is more about aesthetics than anything else, but we’ve refreshed the login page with a bit more of a dynamic look and feel to it.


BMC Helix Remedyforce Integration

Finally, this newest release of NG includes an integration for BMC’s Helix Remedyforce. It basically duplicates the existing BMC Remedyforce integration capabilities, but on BMC’s Helix platform. This new integration allows you to Create, Update, and Get records from Helix Remedyforce, as well as execute SOQL queries.
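
For readers unfamiliar with SOQL, here is a minimal sketch of what building such a query might look like. It only assembles the query text and does not show how Ayehu NG submits it. The helper names are hypothetical, and the object name `BMCServiceDesk__Incident__c` and the sample record name are assumptions used for illustration; check your own Remedyforce schema for the actual object and field names.

```python
from typing import Sequence


def soql_quote(value: str) -> str:
    """Wrap a value as a SOQL string literal, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def select_by_field(object_name: str, fields: Sequence[str],
                    where_field: str, where_value: str, limit: int = 10) -> str:
    """Build a simple SELECT ... WHERE field = 'value' LIMIT n query.

    Only the value is escaped; object and field names must come from trusted input.
    """
    return (
        f"SELECT {', '.join(fields)} FROM {object_name} "
        f"WHERE {where_field} = {soql_quote(where_value)} LIMIT {int(limit)}"
    )


# Assumed object name and a made-up record name, for illustration only.
query = select_by_field("BMCServiceDesk__Incident__c", ["Id", "Name"],
                        "Name", "00012345", limit=1)
print(query)
```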


Automation-Driven Employee Onboarding – From Days to Seconds

Author: Guy Nadivi

It’s really true. Automation-driven employee onboarding can reduce the time that process takes from days to seconds. Actually, in some cases, we’ve heard that at some organizations, onboarding is measured in weeks, not just days.

I was at a VMworld conference in San Francisco a few years ago, and struck up a conversation with someone visiting our booth who said that prior to automating their employee onboarding process, it took them 2 weeks to get new hires fully integrated into their company!

When you calculate the cost of the people involved in onboarding that new hire, as well as the productivity time that new hire lost by not being fully onboarded faster, you quickly realize how expensive this process can be. The phrase “time is money” really applies to the onboarding process, which is why automating it frequently turns out to be such a big win for organizations.

According to the Society for Human Resource Management, more commonly referred to as SHRM, the average cost of onboarding an employee is $4,129. Of course, this is an average figure taken from a broad cross-section of companies and industries. Your mileage may vary. Maybe it’s lower. Or maybe it’s higher. MUCH higher.

That $4,129 figure is taken from a 2016 report published by SHRM, and was the most recent number we could find about the cost of onboarding from an independent, reputable source. One thing you can be almost 100% certain of though is that now, towards the end of 2019, that number has likely gone up even higher. How high is hard to say, but there’s no reason to believe that the cost of onboarding employees has gotten any cheaper the last 3 years.

If you’d like to calculate what the cost of onboarding an employee is for your organization, I encourage you to Google “employee onboarding calculator”, and you’ll find numerous websites that can help you determine what your cost is.

There’s probably a pretty good chance that no two organizations have identical onboarding processes. Everyone’s is just a little bit different. Or perhaps A LOT different.

Despite that, there are some aspects to onboarding that are probably universal across most, if not all organizations.

  • First off, onboarding is typically laborious. Many different things need to be done in order to successfully onboard a new hire, and often times these things are primarily manual processes.
  • Onboarding often touches many different things across multiple departments – HR, IT, Facilities Management, etc. The more things your onboarding process touches, the more things that can go wrong, especially if they’re done manually where the potential for human error is always lurking in the background. That’s just a basic application of Murphy’s Law.
  • The process of onboarding can be very time consuming, particularly when some or even all of it is manual. This is true not just for the new hire being onboarded, but also for the people involved in the new hire’s onboarding.

And what are these laborious, error-prone, time-consuming processes I’m referring to? Here’s a list of some, but by no means all, of the tasks involved in many typical onboardings:

  1. Order new employee’s IT equipment
  2. Create new employee’s Active Directory account and email
  3. Add new employee to calendar and mailing lists
  4. Create new employee’s logins for primary systems they’ll use
  5. Provision new employee’s virtual machine(s)
  6. Email new employee all new hire documents from HR, including I-9 and W-4 tax forms
  7. Order new employee’s business cards
  8. Submit request for new employee’s office key and/or ID badge
  9. Create new employee’s IP phone (using API calls to Cisco Unified Comm. Manager, for example)

Guess what? In general, all of these tasks can be automated, saving a lot of work, a lot of time, and a lot of mistakes.
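
As a rough illustration of what automating such a checklist involves, here is a minimal sketch of an ordered workflow that runs each step and records the outcome. Every function and field name is hypothetical, and the steps only update an in-memory record instead of calling real systems such as Active Directory. In practice, a platform like Ayehu would orchestrate the real integrations.

```python
from typing import Callable, Dict, List, Tuple

Hire = Dict[str, str]
Step = Tuple[str, Callable[[Hire], None]]


def create_directory_account(hire: Hire) -> None:
    # Placeholder: derive a username instead of calling a real directory service.
    hire["username"] = (hire["first_name"][0] + hire["last_name"]).lower()


def add_to_mailing_lists(hire: Hire) -> None:
    # Placeholder: record which lists the hire would be added to.
    hire["mailing_lists"] = f"all-staff,{hire['department'].lower()}"


def request_id_badge(hire: Hire) -> None:
    # Placeholder: mark the badge request as submitted.
    hire["badge_request"] = "submitted"


ONBOARDING_STEPS: List[Step] = [
    ("Create directory account", create_directory_account),
    ("Add to mailing lists", add_to_mailing_lists),
    ("Submit ID badge request", request_id_badge),
]


def run_onboarding(hire: Hire, steps: List[Step] = ONBOARDING_STEPS) -> List[Tuple[str, str]]:
    """Run every step in order and return a (step name, outcome) log."""
    log = []
    for name, action in steps:
        try:
            action(hire)
            log.append((name, "ok"))
        except Exception as exc:  # keep going so one failure doesn't block the rest
            log.append((name, f"failed: {exc}"))
    return log


new_hire = {"first_name": "Ada", "last_name": "Lovelace", "department": "IT"}
for step_name, outcome in run_onboarding(new_hire):
    print(f"{step_name}: {outcome}")
```

Recording an outcome for each step is a deliberate choice here: it gives you a simple log of what was done for each new hire, and a failed step doesn’t silently block the rest.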

Doing so has some surprising benefits for an organization.

According to SHRM, 69% of employees who experienced great onboarding were still with the company after 3 years!

Now think back to the average cost of an onboarding, and ask yourself – how long would a new hire need to stay at my organization in order for us to break even on them? Hopefully it’s become a little clearer why the better the onboarding experience for your new hires, the more positive financial impact it can have on your organization.

That’s particularly true when it comes to onboarding executives and other high-pay employees.

According to a Harvard Business Review poll of global executives about their onboarding experience, nearly one third (32%) said their onboarding experience was POOR.

Why does that matter?

It matters because the cost to an organization of replacing a failed executive as a percentage of his or her salary is 213%, according to the Center for American Progress.

If your organization is hiring a highly-paid employee, you really want to make sure that it’s not only the right person, but someone who’s going to stick around long enough to justify the investment in hiring them. Otherwise, it might be much more expensive to lose them.

Now specifically with regards to automating your employee onboarding, here are probably the top 3 benefits:

  1. You’re going to save time, and when it comes to onboarding, the savings often will not be measured in hours but in days or weeks! BTW – I recently learned from Lee Coulter, the Chair of the IEEE Working Group on Standards in Intelligent Process Automation, that a better way to describe time saved by automation is “Hours Returned to the Business”. I think that’s very applicable to automating the onboarding process.
  2. You’re going to save money. That’s not only a result of saving time, but also a product of longer retention for new hires.
  3. Finally, you’re going to reduce errors. The more things you do manually, the more opportunities you have for errors. Plain and simple.

Collectively, all these benefits contribute to a KPI that’s become increasingly important to how your department is graded, and in many cases compensated, at least bonus-wise.

I’m talking of course about customer satisfaction ratings. At a lot of organizations, this has become the central metric, or at the very least an extremely important one, for rating performance.

Automating the onboarding process will undoubtedly improve your customer satisfaction ratings.

Here’s something else to consider.

For many organizations out there these days, a good chunk of their new hires are younger employees. I’m talking about the millennial generation, of course, but also early members of the next demographic wave, often referred to as Gen Z.

You probably know that these generations have been raised on Facebook and mobile apps, and generally prefer interfacing with technology rather than people. We know that empirically because we see it for ourselves just about anywhere we look.

A 23-year-old young lady in my own family told me that she and her friends consider books to be weird. Books apparently are weird because why would you read something written on processed dead trees when you could just use a smartphone and pull it down from the cloud?

So automating the onboarding process for this demographic is likely to be a bigger factor than you might realize in retention of younger employees. If their first impression as new hires at your organization is that you’re having them do things manually, you’re likely to persuade them that your organization may not be the kind of digitally transformed, progressive environment they want to be working at for 8+ hours a day.

Finally, it’s also worth mentioning that if everything discussed up till now applies to onboarding, then it also applies to offboarding, which is basically the same process in reverse.

This is especially true in the case of layoffs when you have a large volume of people who all basically have to be offboarded simultaneously. That’s a recipe for chaos when done manually, but a routine garden-variety process when it’s automated.

Automating the offboarding process also ensures you’ll have properly documented audit logs for compliance and internal review.

Both the onboarding and offboarding processes tend to be low-hanging fruit at most organizations, when it comes to automating procedures that yield big benefits quickly.

How AI Can Reduce Service Desk Ticket Costs from $20 to $4 [Webinar Recap]

Author: Guy Nadivi

It’s the End of the IT Service Desk as We Know it (and We Feel Fine)

If you’ve been paying attention the last few years, you know Digital Transformation is a concept that’s sweeping through many organizations, and fundamentally changing how they operate and deliver value to customers.

There are some very cool, but still somewhat emerging, technologies underpinning this disruption, and you’re no doubt familiar with them. Things such as:

  • Data Science
  • Machine Learning
  • Artificial Intelligence

But in the last couple of years, the emerging technology that seems to have garnered mindshare faster than any of them is chatbots. That’s right! Chatbots are the coolest kids on the digital transformation block, because they assimilate many of the benefits from data science, machine learning, and artificial intelligence into a form that can be used today, and deliver value to your organization and customers right now. As a result, chatbots have emerged as perhaps the most familiar digital transformation experience for end users.

BTW – There isn’t any consensus yet on a single definition of “Digital Transformation”. One thing just about everyone can agree upon though is that shifting more of the laborious, repetitive tasks that people shouldn’t be doing in the first place over to chatbots is a good idea. This becomes especially true when you look at some numbers.

At the 2017 HDI show, Jeff Rumburg, Co-Founder and Managing Partner of MetricNet, an IT research and advisory practice, delivered a presentation on the results of his research into the costs of different service desk access and communication channels. He discovered some amazing disparities.

Jeff found that incidents requiring Vendor Support cost on average a whopping $599 per incident.

If you needed to get IT Support involved (that’s level 3 support), the average cost was $104 per incident.

Desktop Support (level 2) was cheaper, but still relatively expensive at $69 per incident.

Incidents going through the Service Desk, your level 1 support tier, cost $20 per incident. Since level 1 tickets comprise by far the highest volume at most service desks, that’s a logical place to start applying chatbots.

If you can push out incident resolution for level 1 tickets to your end users, enabling them to initiate and remediate their own incidents with chatbots, the cost of support drops down to a very economical $4 per incident. Yeah, wow!
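To make that gap concrete, here is a minimal Python sketch built on the per-incident figures quoted above. The monthly ticket count and the share of tickets resolved by chatbot are illustrative assumptions on our part, not MetricNet data.

```python
# Average cost per incident by channel, as quoted above (USD).
COST_PER_INCIDENT = {
    "vendor_support": 599,
    "it_support_l3": 104,
    "desktop_support_l2": 69,
    "service_desk_l1": 20,
    "self_service_chatbot": 4,
}


def monthly_l1_cost(tickets, deflected_share):
    """Cost of level 1 tickets when a share of them is resolved by chatbot.

    tickets and deflected_share are hypothetical inputs for illustration.
    """
    deflected = tickets * deflected_share
    handled_by_agents = tickets - deflected
    return (handled_by_agents * COST_PER_INCIDENT["service_desk_l1"]
            + deflected * COST_PER_INCIDENT["self_service_chatbot"])


baseline = monthly_l1_cost(tickets=1000, deflected_share=0.0)
with_chatbot = monthly_l1_cost(tickets=1000, deflected_share=0.5)
print(f"All tickets handled by agents: ${baseline:,.0f}")
print(f"Half resolved by chatbot:      ${with_chatbot:,.0f}")
```

Plug in your own ticket volume and a deflection share you consider realistic to see how the numbers move for your service desk.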

At this point, some more skeptical people in IT might be asking – are chatbots a passing fad or are they here to stay? Let’s look at the objective data on that, and see what direction the numbers point to.

Earlier this year, Salesforce.com released a major report entitled the “State of Service”. Nearly a quarter of their respondents (23%) said they currently use AI chatbots and nearly another third (31%) said they plan to use them within 18 months.

That represents a projected growth rate of 136% in the use of AI chatbots over the next year and a half. By any definition, that’s a viral trajectory.
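As a quick check on that figure, the sketch below recomputes the growth rate from the two percentages quoted above. The assumption that each published percentage was rounded to the nearest whole number is ours, not something stated in the report.

```python
current_share = 0.23  # already using AI chatbots (as reported)
planned_share = 0.31  # planning to adopt within 18 months (as reported)

growth = planned_share / current_share

# If each published percentage was rounded to a whole number, the true
# ratio can fall anywhere in this range.
lowest = (planned_share - 0.005) / (current_share + 0.005)
highest = (planned_share + 0.005) / (current_share - 0.005)

print(f"growth from rounded figures: {growth:.0%}")
print(f"range allowed by rounding:   {lowest:.0%} to {highest:.0%}")
```

The rounded inputs give roughly 135%, and the reported 136% sits comfortably inside the range that rounding allows.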

Spiceworks published a report not long ago called “AI Chatbots and Intelligent Assistants in the Workplace”.

One question their survey asked was about utilization of intelligent assistants and chatbots by department. Guess which department uses chatbots more than any other? That’s right – IT.

Another question in that Spiceworks survey specifically asked IT professionals if they agree or strongly agree with a number of different statements. The statement IT professionals overwhelmingly agreed with more than any other was that AI will automate mundane tasks and enable more time to focus on strategic IT initiatives.

Those IT professionals Spiceworks surveyed were right. One of the biggest benefits of chatbots is that they automate many of the robotic, laborious tasks that humans shouldn’t be doing anyway. That frees up those IT professionals to work on more strategic and far more valuable IT initiatives. Which in turn makes those professionals more valuable to their organizations.

Why is offloading that tedious work from IT staff so important? Because Gartner has shown that the biggest budget item for IT Service Desks is personnel. Between 2012 and 2016, the average percentage of a service desk’s budget allocated to labor ranged from 84% – 88%. With digital transformations driving up the demand for IT support, there’s simply no way an organization can hire their way out of this situation, even if they wanted to.

The reality is that quality service desk personnel simply cost too much, and no matter how good those personnel are, they can only keep up with so much volume. At some point the laws of physics reassert themselves, reminding everyone that people simply don’t scale very well. Chatbots though, have infinite scalability.

That limited human capacity to scale, combined with the increased volume of requests for service desk support, is degrading end user experiences.

A 2016 Harvard Business Review Webinar titled “How to Fix Customer Service” revealed that:

  • 81% of consumers say it takes too long to reach a support agent.
  • 43% of customers try to self-serve before calling a contact center.

What that tells you is that waiting for human support has gotten so insufferable, end users are increasingly willing to remediate their own issues. All they need is for IT to enable a channel for them to do that.

What kinds of requests are keeping IT service desks so busy?

Well if you’ve attended any of our previous webinars you might’ve heard us cite a well-quoted statistic from Gartner that as much as 40% of an IT service desk’s call volume is nothing but password resets. 40%!

Another big drain on your service desk? Requests for ticket status updates. Those can comprise as much as 10% of a service desk’s call volume, and we’re citing ourselves (Ayehu) as the source on that.

How do we know? Well, Ayehu knows because our clients tell us which workflows have the biggest impact on reducing call volume to their service desks.

Therefore, if you can use a chatbot to automate just these two processes – password resets and ticket status updates – you could cut call volume to your service desk in half! That’s huge, and it will go a long way towards reducing your service desk ticket costs dramatically.
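Here is that arithmetic as a small Python sketch. The monthly call count is a made-up number for illustration, and since both shares are "as much as" figures, the result is a best case and not a guarantee.

```python
def remaining_call_volume(monthly_calls, automated_shares):
    """Calls left for human agents after automating the given categories."""
    automated = sum(automated_shares.values())
    return monthly_calls * (1 - automated)


# Upper-bound shares of call volume quoted above.
shares = {"password_resets": 0.40, "ticket_status_updates": 0.10}

monthly_calls = 5000  # hypothetical service desk volume
remaining = remaining_call_volume(monthly_calls, shares)
print(f"{monthly_calls} calls -> {remaining:.0f} still need an agent")
```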


Ayehu’s New Advanced Features in NG v1.5 [Webinar Recap]

Author: Guy Nadivi

In response to growing user requests to add more flexibility to the Ayehu NG automation platform, Ayehu has released NG v1.5. This release will significantly expand the scope of what you can automate in your environment, all from a single pane of glass, and we think that makes it a real game changer in the IT orchestration and automation market.

If you’re an existing user of Ayehu NG, or even if you’re just thinking about trying us on for size, you probably know that one of the core strengths of our solution is how easy and quickly you can plug Ayehu into various ITSM platforms, cyber security tools, operating systems, messaging and notification tools, and increasingly chatbots and AI services. Almost all of these integrations can be activated seamlessly without writing a single line of code.

And the purpose of providing you with all these pre-built integrations and connectors that make up our ever-expanding ecosystem, is to simplify your ability to orchestrate automation across any platform in your environment. All from a single pane of glass!

So, here’s what’s really exciting about this new version of NG. We’ve added a “Do It Yourself” capability to allow you to build your own platform-specific activities without the need for Ayehu to do it for you.

From the feedback we’ve received, that’s really going to appeal to those of you who aren’t afraid to roll up your sleeves, do a little coding, and craft your own specific intelligent IT automation activities.

In fact, when you see how easy we’ve made it to build your own activities, we think some of you non-coders might even be tempted to take a crack at it yourself and perhaps fulfill some aspirations on your personal automation wish list.

Without further ado then, let’s dive into what’s new in our latest release of NG, v1.5:

  • Activity Designer – This is the big one. It’s a new feature designed to give users the option to build their own activities, which marks the first time they’re not relying on us to build an activity. You already know we provide an Out-Of-The-Box library of more than 500 no-code, pre-built activities. With the Activity Designer though, customers can now independently develop or modify existing activities in Python, C#, or .NET to extract further value through customization that meets specific needs.
  • GitHub Community Repository – Ayehu now has a new community on GitHub that contains more than 100 of Ayehu’s workflow templates, as well as source code for built-in activities. Customers can use this in conjunction with the Activity Designer to create custom activities based on existing pre-built workflows. The GitHub Community Repository also provides free access to other peer-developed workflow templates and activities which have already been created and contributed to the community. 
  • Ayehu Academy Advanced Courses – We now have two new Ayehu Automation Academy courses – Activity Designer Essentials and Advanced Activity Designer. Together, these courses help train and certify developers in creating new activities using the Activity Designer. The Academy has already certified nearly 1,000 IT automation engineers since its inception earlier this year.

Let’s talk a bit more about the Activity Designer.

Typically, when building a workflow you simply drag and drop activities onto a canvas, and position them in the order you want them to execute. There’s no coding, scripting, or programming of any kind required. All you have to do is configure any particular activity by entering some parameters into a popup window, as shown in the image below:

With the new Activity Designer, you can build your own activities from scratch, in Python, C#, or .NET. We believe this will typically be for a system we haven’t integrated yet, perhaps some home-grown in-house application. But it can also be used to create new custom activities for an existing integration, like ServiceNow or SolarWinds. This is a big deal because now organizations will be able to take previously unintegrated systems and incorporate them into enterprise-wide orchestration and automation via Ayehu’s single pane of glass. The Activity Designer interface is shown in the image below:

Ayehu’s GitHub Community Repository marks an expansion of our presence on GitHub’s open-source community, and can be seen at this link: https://github.com/Ayehu

At the repository, you’ll find:

  • 100+ Ayehu workflow templates
  • Source code for built-in activities

There are many benefits to our users from this new repository, including:

  • Shorter time to value through reuse of existing, pre-built workflows
  • Shorter time to value through customization of open source activities
  • Free access to peer-developed workflow templates and activities

Here’s an example. If we want to see what kinds of workflow templates are already available for Cisco devices, we can just click on the Cisco category, and drill down to all the workflow templates you can access that are Cisco-specific, as seen in the image below:

These new features are also accompanied by new advanced courses created for the Ayehu Academy, which can be found on our website ayehu.com under the Customers menu.

The two new Ayehu Automation Academy courses are:

  • Activity Designer Essentials
  • Advanced Activity Designer

Together, these courses help train and certify developers in creating new activities using the Activity Designer. 

Ayehu recommends getting certified because your new knowledge will enhance 2 areas of interest:

  • Your organization’s automation capabilities
  • Your own personal professional standing.

Furthermore, as this market continues to grow, we anticipate new income opportunities will be created for Certified Activity Designers. The Academy has already certified about 1,000 IT automation engineers despite only opening earlier this year. That’s a reflection of the growing interest in automation, and if you’re one of those IT automation engineers, you’ve positioned yourself very nicely for the growth curve ahead.


How to Predict and Remediate IT Incidents Before They Affect Business Outcomes [Webinar Recap]

Author: Guy Nadivi

The ability to proactively predict and remediate IT incidents BEFORE they occur, rather than react to them after they’ve already happened, is one of the key value propositions of a new IT operations category called AIOps, which stands for Artificial Intelligence for IT Operations.

Leveraging the AI part of AIOps to mitigate problems before they become problems is a game changer for IT. So we’ve partnered with Loom Systems, who like ourselves are a Gartner Cool Vendor in their category, to demonstrate how two best-of-breed providers can integrate their respective platforms to create an enterprise-grade AIOps solution. In doing so, we believe the result is an early glimpse at the self-healing data center of tomorrow, and we think you’ll be intrigued to experience how you can peek over the horizon to see and automatically remediate incidents before they impact end-users.

Let’s start with the obvious question many of you might have on your mind – what is AIOps? It is after all, a term that kind of snuck up on all of us.

The term AIOps, like a lot of buzzwords in our industry, was originated by Gartner. In this case, a Sr. Director Analyst named Colin Fletcher coined it in 2016, and its earliest published appearance (as best I can tell) was in early 2017.

Interestingly though, Colin told me he originally meant the term to refer to Algorithmic IT Operations.

Since then it’s evolved to refer to Artificial Intelligence for IT Operations.

Now we all know how it is in IT marketing. New buzzwords are used to refresh a category and create excitement. So is AIOps basically just a recycling of the term “IT monitoring”? Are IT monitoring and AIOps basically the same? Twins, so to speak, but with different names?

Here’s the definition for IT Monitoring, courtesy of an internet publication many of you are probably aware of called TechTarget:

  “IT monitoring is the process to gather metrics about the operations of an IT environment’s hardware and software to ensure everything functions as expected to support applications and services. Basic monitoring is performed through device operation checks, while more advanced monitoring gives granular views on operational statuses, including average response times, number of application instances, error and request rates, CPU usage and application availability.”

The operative words there are “gather metrics” – “through device operation checks”.

This reflects one of the primary characteristics of IT Monitoring – namely that it’s passive in nature.

And here’s Colin Fletcher’s original definition for AIOps:

“AIOps platforms utilize big data, modern machine learning and other advanced analytics technologies to directly and indirectly enhance IT operations (monitoring, automation and service desk) functions with proactive, personal and dynamic insight. AIOps platforms enable the concurrent use of multiple data sources, data collection methods, analytical (real-time and deep) technologies, and presentation technologies.”

Unlike IT Monitoring, AIOps is proactive and far more sophisticated. So AIOps is a LOT MORE than just IT Monitoring.

At this point you may be asking yourself, “OK, but how can this benefit me?”

As we all know, in today’s Digital Era, most businesses are digital or undergoing a digital transformation, which means that IT systems are replacing many traditional physical business processes, and that in turn means more work for IT Operations.

In fact, IT Operations engineers have become responsible for the customers’ digital experience. When your organization’s systems are misbehaving, underperforming, or worse not working at all, your customers’ satisfaction is affected, which often leads to customer churn.

It’s that simple.

End users often use applications or websites and love how simple and intuitive they can be. In IT though, we all know that building something to look nice and simple, can actually be quite difficult. That’s because there are usually many technologies under the hood that need to work together seamlessly in order for these digital experiences to run smoothly.

As if that wasn’t enough, let’s add some more complexity:

With Cloud Computing on the one hand, and Microservices architectures on the other, things become even more complex, for the following reasons:

  1. Cloud computing means abstraction – that can make it hard to understand how a performance issue on a host will affect other components of your applications.
  2. These environments change dynamically, making it harder to stay on top of everything.
  3. Microservices often require disparate data sources, each generating its own logs and metrics, making tracing and correlation an inherent part of root cause analysis (RCA).

So, the increased complexity of digital business architectures, coupled with the explosion of different data types, and the elevated expectations consumers have these days for seamless end user experiences, makes the life of IT Operations teams quite challenging.

Enter AIOps.

AIOps is a set of tools that enable achievement of optimum availability and performance by leveraging machine learning technologies against massive data stores with wide variance. The big idea here is to use machines to deal with machines.

Here are some examples of the challenges customers often look to address by implementing AIOps:

  • Outage prevention – organizations in the process of cloud migration or architecture change, often look for modern technologies like AIOps to help them prevent outages before the business is affected. This is a marked difference from 2 years ago when the market was just focused on noise reduction. Artificial intelligence and machine learning have raised expectations of how much more is possible.
  • Capturing different data feeds – this means it’s not just about alerts anymore. There’s a huge need to consolidate logs, metrics, and events together, and to make sense out of them as a whole.
  • Consolidation of tools – this one is mainly about the workflow of the users. They’d like AIOps to make their daily lives easier and consolidate everything into one system.

A monitoring architecture for modern enterprises that can do all of the above would be a real-life example of a self-healing architecture.

Everything starts with observability. Many enterprises use one or more infrastructure monitoring tools. Application Performance Management (APM) monitors do a great job in monitoring performance, but are very limited for the application stack and log management, rendering them a bit unhelpful for triage and forensic investigations.

These monitoring tools are usually focused on specific data feeds or IT layers, and they emit alerts when things go wrong. However, these can lead to confusing alert storms.

This is another reason why organizations are beginning to leverage AIOps to work for them and make sense out of it all. Think of AIOps as a robot that turns monotonous data into information you cannot ignore. In our case, turning logs into predictions or early stage detection of an outage.

Now that you know something is about to break, can you prevent it from happening? That’s exactly the idea of self-healing. When working with an intelligent automation platform like Ayehu, you can build simple (or complex) remediation workflows, that can take the alert from Loom Systems and automatically remediate the incident BEFORE it becomes something more calamitous.

In your monitoring architecture, you want the Automation tool to seamlessly interact with both the AIOps solution and your ITSM platform, to open a ticket and update it as you’re taking remedial action.

When configured properly, this architecture can resolve issues before they affect the business, while also documenting what happened for future reference.
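The flow described above can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions, not the Ayehu or Loom Systems API: the Ticket class, the alert fields, and the restart_service step are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class Ticket:
    """Stand-in for an ITSM ticket; not a real ITSM API object."""
    summary: str
    status: str = "open"
    notes: list = field(default_factory=list)


def restart_service(alert):
    """Placeholder remediation; a real workflow would act on the host."""
    return f"restarted {alert['service']} on {alert['host']}"


# Map predicted-incident types to remediation steps (hypothetical names).
REMEDIATIONS = {"service_degradation": restart_service}


def handle_alert(alert):
    """Open a ticket, try to remediate, and record what happened."""
    ticket = Ticket(summary=f"{alert['type']} predicted on {alert['host']}")
    action = REMEDIATIONS.get(alert["type"])
    if action is None:
        # Nothing automated for this alert type, so hand it to a person.
        ticket.notes.append("no automated remediation; escalating to staff")
        ticket.status = "escalated"
        return ticket
    ticket.notes.append(action(alert))
    ticket.status = "resolved"
    return ticket


alert = {"type": "service_degradation", "host": "web-01", "service": "nginx"}
ticket = handle_alert(alert)
print(ticket.status, ticket.notes)
```

The point of the design is that the ticket gets written either way, so you keep the audit trail whether the fix was automatic or handed off to staff.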

Gartner concurs with this approach.

In a paper published earlier this year (ID G00384249 – April 24, 2019), they wrote that:

  “AI technologies play an important role in I&O, providing benefits such as reduced mean time to response (MTTR), faster root cause analysis (RCA) and increased I&O productivity. AI technologies enable I&O teams to minimize low-value repetitive tasks and engage in higher-productivity/value-oriented actions.”

No ambiguity there.

A little further down in the same paper, Gartner gave the following recommended actions, representing their most current advice to infrastructure and operations leaders regarding AIOps and automation:

  “Embark on a journey toward driving intelligent automation. This involves managing and driving AI capabilities that are embedded by infrastructure vendors, in addition to reusing artificial intelligence for operations (AIOps) capabilities to drive end-to-end (from digital product to infrastructure) automation.”

With AIOps + Automation, it’s possible to predict and prevent network outages or other major disruptions by proactively detecting the conditions leading up to them and automatically remediating them BEFORE disaster strikes. Given how costly a service interruption can be to an enterprise, avoiding issues before they happen will be a critical function in the self-healing data center of tomorrow.


How to Create an Outstanding Experience for Your Cherwell ITSM Users [Webinar Recap]


Author: Guy Nadivi and Ayla Anderson, Technology Alliances Manager, Cherwell

The discipline of ITSM has undergone significant evolution since its earliest incarnations. Today with the drive towards automation, increasing use of artificial intelligence, and the push for digital transformation, ITSM occupies an increasingly high-profile position for many organizations. This is especially true as many enterprises are seeking competitive advantages in their customer experience and service quality offerings.

With that in mind, we’ve partnered with Cherwell, an increasingly common ITSM platform choice for many of our customers, to demonstrate how to create an outstanding experience for Cherwell ITSM users.

According to Gartner, it isn’t too surprising why Cherwell’s popularity is on the rise. In a recent report they wrote that “Cherwell continues to enjoy mind share among Gartner clients looking at intermediate ITSM tools. Cherwell was the second most frequently shortlisted vendor by Gartner clients in 2018.” BTW – Cherwell held that same distinction in 2017 as well. So Gartner is seeing the same increase in demand for Cherwell that we’ve been seeing.

Gartner has also identified Cherwell as a “Challenger” in its most recent Magic Quadrant for ITSM tools which was just published in August of 2019. So there seems to be a lot of momentum building in the ITSM market for Cherwell.

Since Ayehu is very customer-driven, we give priority to developing new features and new integrations based on what our customers are asking for the most. As a result, we’ve added some Cherwell-specific functionality lately, and we think that many Cherwell customers will be intrigued to see how much more they can do with the platform, once it’s integrated with Ayehu.

For those not familiar with their ITSM solution, Cherwell transforms the way businesses deliver service. Its technology provides a centralized system through which all services can be managed and monitored. This gives unprecedented visibility to all processes, helping teams measure and manage services more effectively and efficiently.

Along with nearly 100 technology alliance partners (like Ayehu), Cherwell aims to help customers modernize their IT service management. Today, Modern Service Management is foundational to transforming the experience of employees (ITSM users).

Before explaining why though, let’s define Modern Service Management (MSM).

MSM is the evolution from legacy ITSM practices with minimal impacts on the business and its employees, to a philosophy in practice that leverages self-service, automation, visibility and agility to generate business outcomes and improve employee experiences.

The visual below from Forrester Research shows that in the past, automation associated with digital initiatives focused on cost reduction. More recently however, the focus has been around customer experience (CX), as more companies take a customer-centric approach.

Forrester Research

By 2020 the focus will shift to accelerating transformation with both Employee Experience (EX) and CX automation initiatives — because employee experience has a direct correlation to employee happiness and efficiency, which in turn impacts customer experience. As businesses continue moving up the ladder of ITSM maturity, speed and efficiency won’t just be critical for customer facing apps, but will also be expected across the entire organization.

So what’s stopping businesses from transforming their service experiences? In general, they lack a centralized way to architect and automate end-to-end processes across multiple services, systems, and teams. The four primary barriers to achieving this transformation include:

  1. Disparate Systems

Individual services and departments within a business often have their own systems and tools. This not only impacts employee experience, but it impedes businesses from monitoring the performance of services, and cross-functional processes, due to lack of centralized visibility across all systems.

  2. Fragmented Data

Since many services run on legacy databases, integrating data sets across services can be difficult and time consuming.

  3. Manual Service Steps

Most businesses struggle to integrate data, systems, and processes, leaving many teams stuck in an endless cycle of using antiquated systems to get their work done. Whether they’re responding to service requests, onboarding a new employee, or managing the maintenance logs of a fleet of vehicles, this creates inefficiencies and challenges in keeping up with service requests.

  4. Resource Intensiveness Required to Transform Digital Operations

Often, architecting new services and evolving existing processes requires teams of developers to write code. This is both time consuming and expensive.

This is where Cherwell integrated with Ayehu automation can help businesses.

If you’re currently a Cherwell customer or have it as one of your shortlisted vendors, then you may already be asking yourself whether you should add automation to Cherwell. And if you do add automation, what kind of boost would it give to your investment in Cherwell?

To determine that, it helps to look at some costs associated with helpdesk operations.

Based on Ayehu’s research conducting standard helpdesk data assessments for organizations, we’ve discovered that the 5 largest categories of incidents represent as much as 98% of their total tickets!

When those incident numbers get sliced and diced to see how many get handled by Tier 1 vs Tier 2 support, they often reveal a surprise.

As much as 70% of Tier 1 incidents get escalated to Tier 2!

That means that if you can somehow focus your automation efforts on just the 5 largest categories of incidents while they’re still in tier 1, automation is going to provide a very big payoff, not just at your service desk but in your customer satisfaction metrics as well.

Now, let’s take a closer look at what kind of a return we’re looking at from automation.

If we go real conservative by estimating that it costs $20 to remediate a ticket, then multiply that by the number of tickets your helpdesk handles, it’s likely going to add up to some serious money your organization is spending on manually resolving these incidents. (BTW – $20 per ticket is a rough number calculated by Jeff Rumburg of MetricNet for 2017)

Now I’m going to shock some of you. If you automate incident resolution, your cost per ticket drops down to $4, and that’s also playing it conservative. Applying automation to incident resolution has a dramatic effect on your costs, so if you’re looking for a high-impact way to bring savings to your organization’s ITSM costs, automation is a pretty good way to go.

In case you’re wondering what kinds of specific incidents you would likely be automating with Ayehu, here are some of the most common processes we see:

  • Application/Service/Process/Server Restarts
  • Monitoring Application Log Files, looking for specific keywords, and taking some action based on what’s found
  • Low disk space remediation (always a popular thing to automate)
  • Running SQL Queries, perhaps at 3am then compiling the results into a report which gets emailed to appropriate personnel
  • Onboarding and offboarding employees (another popular one)
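To give a flavor of what one of these looks like in code, here is a minimal Python sketch of the detection half of low disk space remediation. It uses only the standard library and is not an Ayehu activity. The 10% threshold and the cleanup message are assumptions for illustration.

```python
import shutil


def is_disk_low(total_bytes, free_bytes, min_free_ratio=0.10):
    """Return True when free space falls below the given ratio of capacity."""
    return free_bytes / total_bytes < min_free_ratio


def check_path(path, min_free_ratio=0.10):
    """Check a mounted path using the standard library's disk usage call."""
    usage = shutil.disk_usage(path)
    return is_disk_low(usage.total, usage.free, min_free_ratio)


if check_path("."):
    # In a real workflow, this is where a cleanup step would be triggered.
    print("Low disk space: trigger cleanup workflow")
else:
    print("Disk space OK")
```

Keeping the threshold test separate from the disk lookup makes the rule easy to verify with known numbers before you point it at a live server.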

And there are many, many more tasks the service desk will want to automate for itself. Now how about the kinds of processes we can push out to end users to remediate in a self-help paradigm?

  • Password Resets or Account Unlocks are an obvious one
  • How about letting users provision their own VMs, whether it’s VMware, AWS, Azure, or Hyper-V?
  • And how about letting them modify or resize a VM’s memory or disk space without any help from the service desk?

When you think about it, there’s really no limit to the kinds of things you can automate, once you’ve integrated Ayehu with Cherwell.


How to Leverage Intelligent Automation to Better Manage Alert Storms [Webinar Recap]

Author: Guy Nadivi

As most of you already know, there’s a digital transformation underway at many enterprise organizations, and it’s revolutionizing how they do business. That transformation though is also leading to increasingly complex and sophisticated infrastructure environments. The more complicated these environments get, the more frequently performance monitoring alerts get generated. Sometimes these alerts can come in so fast and furious, and in such high volume, that they can lead to alert storms, which overwhelm staff and lead to unnecessary downtime.

Since the environments these alerts are being generated from can be so intricate, this presents a multi-dimensional problem that requires more than just a single-point solution. Ayehu has partnered with LogicMonitor to demonstrate how end-to-end intelligent automation can help organizations better manage alert storms from incident all the way to remediation.

The need for that sort of best-of-breed solution is being driven by some consistent trends across IT reflecting a shift in how IT teams are running their environments, and how costly it becomes when there is an outage. Gartner estimates that:

Further exacerbating the situation is the complexity of multi-vendor point solutions, distributed workloads across on-premise data centers, off-premise facilities, and the public cloud, and relentless end-user demands for high availability, secure, “always-on” services.

From a monitoring standpoint, enterprise organizations need a solution that can monitor any infrastructure that uses any vendor on any cloud with any method required, e.g. SNMP, WMI, JDBC, JMX, SD-WAN, etc. In short, if there’s a metric behind an IP address, IT needs to keep an eye on it, and if IT wants to set a threshold for that metric, then alerts need to be enabled for it.

The monitoring solution must also provide an intuitive analytical view of the metrics generated from these alerts to anyone needing visibility into infrastructure performance. This is critical for proactive IT management in order to prevent “degraded states” where services go beyond the point of outage prevention.

This is where automating remediation of the underlying incident that generated the alert becomes vital.

The average MTTR (Mean Time To Resolution) for remediating incidents is 8.40 business hours, according to MetricNet, a provider of benchmarks, performance metrics, scorecards and business data to Information Technology and Call Center Professionals.

When dealing with mission critical applications that are relied upon by huge user communities, MTTRs of that duration are simply unacceptable.

But it gets worse.

What happens when the complexities of today’s hybrid infrastructures lead to an overwhelming number of alerts, many of them flooding in close together?

You know exactly what happens.

You get something known as an alert storm. And when alert storms occur, MTTRs degrade even further because they overwhelm people in the data center who are already working at a furious pace just to keep the lights on.

If data center personnel are overwhelmed by alert storms, it’s going to affect their ability to do other things.

That inability to do other things due to alert storms is very important, especially if customer satisfaction is one of your IT department’s major KPIs, as it is for many IT departments these days.

Take a look at the results of a survey Gartner conducted less than a year ago, asking respondents what they considered the most important characteristic of an excellent internal IT department.

If an IT department performed dependably and accurately, 40% of respondents considered them to be excellent.

If an IT department offered prompt help and service, 25% of respondents considered them to be excellent.

So if your IT department can deliver on those 2 characteristics, about 2/3 of your users will be very happy with you.

But here’s the rub. When your IT department is flooded with alert storms generated by incidents that have to be remediated manually, then that’s taking you away from providing your users with dependability and accuracy in a prompt manner. However, if you can provide that level of service regardless of alert storms, then nearly 2/3 of your users will consider you to be an excellent IT department.

One proven way to achieve that level of excellence is by automating manual incident remediation processes, which in some cases can reduce MTTRs from hours down to seconds.

Here’s how that would work. It involves using the Ayehu platform as an integration hub in your environment. Ayehu would then connect to every system that needs to be interacted with when remediating an incident.

So for example, if your environment has a monitoring system like LogicMonitor, that’s where an incident will be detected first. And LogicMonitor, now integrated with Ayehu, will generate an alert which Ayehu will instantaneously intercept.

Ayehu will then parse that alert to determine what the underlying incident is, and launch an automated workflow to remediate that specific underlying incident.
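To make the "parse the alert, then launch the matching workflow" idea concrete, here is a minimal sketch in Python. It is not Ayehu's implementation. The alert fields (`type`, `host`), the workflow functions, and the `WORKFLOWS` table are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict

# Hypothetical alert shape; real monitoring payloads will differ.
Alert = Dict[str, str]


def free_disk_space(alert: Alert) -> str:
    return f"freeing disk space on {alert['host']}"


def restart_service(alert: Alert) -> str:
    return f"restarting service on {alert['host']}"


def escalate_to_human(alert: Alert) -> str:
    incident = alert.get("type", "unknown")
    return f"no workflow for '{incident}', escalating {alert['host']}"


# Map each incident type to the workflow that remediates it.
WORKFLOWS: Dict[str, Callable[[Alert], str]] = {
    "low_disk_space": free_disk_space,
    "service_down": restart_service,
}


def handle_alert(alert: Alert) -> str:
    """Pick the workflow for the alert's incident type, or escalate."""
    workflow = WORKFLOWS.get(alert.get("type", ""), escalate_to_human)
    return workflow(alert)


if __name__ == "__main__":
    print(handle_alert({"type": "low_disk_space", "host": "web-01"}))
```

A lookup table keeps the routing decision in one place, so supporting a new incident type means adding one entry rather than another branch.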

As a first step in our workflow we’re going to automatically create a ticket in ServiceNow, BMC Remedy, JIRA, or any ITSM platform you prefer. Here again is where automation really shines over taking the manual approach, because letting the workflow handle the documentation will ensure that it gets done in a timely manner, in fact in real-time. Automation also ensures that documentation gets done thoroughly. Service Desk staff often don’t have the time or the patience to document every aspect of a resolution properly because they’re under such a heavy workload.

The next step, and actually this can be at any step within that workflow, is pausing its execution to notify and seek human approval for continuation. Just to illustrate why you might do this, let’s say that a workflow got triggered because LogicMonitor generated an alert that a server dropped below 10% free disk space. The workflow could then go and delete a bunch of temp files to free up space, it could compress a bunch of log files and move them somewhere else, and do all sorts of other things to free up space, but before it does any of that, the workflow can be configured to require human approval for any of those steps.

The human can either grant or deny approval so the workflow can continue on, and that decision can be delivered by laptop, smartphone, email, Instant Messenger, or even via a regular telephone. However, note that this notification/approval phase is entirely optional. You can also choose to put the workflow on autopilot and proceed without any human intervention. It’s all up to you, and either option is easy to implement.
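The approval pause can be pictured as a gate in front of each step. The sketch below assumes a hypothetical `approver` callback standing in for whatever channel delivers the decision (email, phone, chat), and a `require_approval` flag standing in for the autopilot choice. None of these names come from the Ayehu product.

```python
from typing import Callable

FREE_SPACE_THRESHOLD = 0.10  # the 10% free-space figure from the example above


def needs_remediation(total_bytes: int, free_bytes: int) -> bool:
    """True when free space has dropped below the threshold."""
    return free_bytes / total_bytes < FREE_SPACE_THRESHOLD


def run_with_approval(
    step_name: str,
    step: Callable[[], str],
    approver: Callable[[str], bool],
    require_approval: bool = True,
) -> str:
    """Run a step, pausing for a grant/deny decision unless on autopilot."""
    if require_approval and not approver(step_name):
        return f"{step_name}: denied, skipped"
    return f"{step_name}: {step()}"


if __name__ == "__main__":
    if needs_remediation(total_bytes=500, free_bytes=40):
        print(run_with_approval("delete temp files", lambda: "done",
                                approver=lambda name: True))
```

On a real host, `shutil.disk_usage(path)` from the standard library returns the total and free byte counts that `needs_remediation` expects.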

Then the workflow can begin remediating the incident which triggered the alert.

As the remediation is taking place, Ayehu can update the service desk ticket in real-time by documenting every step of the incident remediation process.

Once the incident remediation is completed, Ayehu can automatically close the ticket.

And finally, it can go back into LogicMonitor and automatically dismiss the alert that triggered this entire process. This is how you can leverage intelligent automation to better manage alert storms, while simultaneously eliminating the potential for human error that can lead to outages in your environment.
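Put together, the lifecycle described above (open a ticket, remediate, document each step, close the ticket, dismiss the alert) might be sketched like this. The `Ticket` class is an in-memory stand-in, not a real ITSM client, and the alert is a plain dictionary with assumed `type` and `host` keys.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Ticket:
    """In-memory stand-in for an ITSM ticket (hypothetical)."""
    summary: str
    status: str = "open"
    notes: List[str] = field(default_factory=list)


def remediate(alert: dict, steps: List[Callable[[], str]]) -> Ticket:
    # 1. Open a ticket as soon as the alert is received.
    ticket = Ticket(summary=f"{alert['type']} on {alert['host']}")
    # 2. Run each remediation step, documenting it on the ticket as it completes.
    for step in steps:
        ticket.notes.append(step())
    # 3. Close the ticket, then 4. dismiss the originating alert.
    ticket.status = "closed"
    alert["dismissed"] = True
    return ticket


if __name__ == "__main__":
    alert = {"type": "low_disk_space", "host": "web-01"}
    ticket = remediate(alert, [lambda: "deleted temp files",
                               lambda: "compressed logs"])
    print(ticket.status, ticket.notes, alert["dismissed"])
```

Because every step writes its own note, the ticket ends up with a complete record of the remediation without anyone having to type it up afterwards.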

Gartner concurs with this approach.

In a recently refreshed paper they published (ID G00336149 – April 11, 2019) one of their Vice-Presidents wrote that “The intricacy of access layer network decisions and the aggravation of end-user downtime are more than IT organizations can handle. Infrastructure and operations leaders must implement automation and artificial intelligence solutions to reduce mundane tasks and lost productivity.”

No ambiguity there.

Ayehu

IT Incidents: From Alert to Remediation in 15 seconds [Webinar Recap]

Author: Guy Nadivi

Remediating IT incidents in just seconds after receiving an alert isn’t just a good performance goal to strive for. Rapid remediation might also be critical to reducing and even mitigating downtime. That’s important, because the cost of downtime to an enterprise can be scary. Even scarier though is what can happen to people’s jobs if they’re found to be responsible for failing to prevent the incidents that resulted in those downtimes.

So let’s talk a bit about how automation can help you avoid situations that imperil your organization, and possibly your career.

Mean Time to Resolution (MTTR) is a foundational KPI for just about every organization. If someone asked you “On average, how long does it take your organization to remediate IT Incidents after an alert?” what would your answer be from the choices below?

  • Less than 5 minutes
  • 5 – 15 minutes
  • As much as an hour
  • More than an hour

In an informal poll during a webinar, here’s how our audience responded:

More than half said that, on average, it takes them more than an hour to remediate IT incidents after an alert. That’s in line with research by MetricNet, a provider of benchmarks, performance metrics, scorecards and business data to Information Technology and Call Center Professionals.

Their global benchmarking database shows that the average incident MTTR is 8.40 business hours, but ranges widely, from a high of 33.67 hours to a low of 0.67 hours (shown below in the little tabular inset to the right). This wide variation is driven by several factors including ticket backlog, user population density, and the complexity of tickets handled.

Your mileage may vary, but obviously, it’s taking most organizations far longer than 15 seconds to remediate their incidents.
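If you want to check your own number, MTTR is simply the average time between an incident being opened and being resolved. Here is a minimal sketch, assuming you can export each incident as an (opened, resolved) pair of timestamps. Note that it measures wall-clock time, whereas the MetricNet figure is in business hours.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def mean_time_to_resolution(
    incidents: List[Tuple[datetime, datetime]],
) -> timedelta:
    """Average of (resolved - opened) across all incidents."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum(
        (resolved - opened for opened, resolved in incidents),
        timedelta(),
    )
    return total / len(incidents)


if __name__ == "__main__":
    start = datetime(2019, 6, 3, 9, 0)
    sample = [
        (start, start + timedelta(hours=1)),
        (start, start + timedelta(hours=3)),
    ]
    print(mean_time_to_resolution(sample))
```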

If that incident needing remediation involves a server outage, then the longer it takes to bring the server back up, the more it’s going to cost the organization.

Statista recently calculated the cost of enterprise server downtime, and what they found makes the phrase “time is money” seem like an understatement. According to Statista’s research, 60% of organizations worldwide reported that the average cost PER HOUR of enterprise server downtime was anywhere from $301,000 to $2 million!
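To get a feel for what those hourly figures mean at different remediation speeds, here is a back-of-the-envelope sketch. It assumes cost accrues linearly with time, which is a simplification, and it is not the formula from our eBook. The function name and constant are illustrative.

```python
def downtime_cost(hourly_cost: float, downtime_seconds: float) -> float:
    """Cost of an outage, assuming cost accrues linearly with time."""
    return hourly_cost * downtime_seconds / 3600


LOW_END_HOURLY_COST = 301_000  # low end of the Statista range quoted above

if __name__ == "__main__":
    for label, seconds in [("15 seconds", 15), ("1 hour", 3600)]:
        cost = downtime_cost(LOW_END_HOURLY_COST, seconds)
        print(f"{label}: ${cost:,.2f}")
```

Even at the low end of the range, the gap between an outage measured in seconds and one measured in hours is a couple of orders of magnitude.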

With server downtime being so expensive, Gartner has some interesting data points to share on that issue (ID G00377088 – April 9, 2019).

First off, they report receiving over 650 client inquiries between 2017 and 2019 on this topic, and we’re still not done with 2019. So clearly this is a topic that’s top-of-mind with C-suite executives.

Secondly, they state that through 2021, just 2 years from now, 65% of Infrastructure and Operations leaders will underinvest in their availability and recovery needs because they use estimated cost-of-downtime metrics.

As it turns out, Ayehu can help you get a more accurate estimate of your downtime costs so they’re not underestimated.

In our eBook titled “How to Measure IT Process Automation ROI”, there’s a specific formula for calculating the cost of downtime. The eBook is free to download on our website, and also includes access to all of our ROI formulas, which are fairly straightforward to calculate.

Let’s look at another data point about outages, this one from the Uptime Institute’s 2019 Annual Data Center Survey Results. They report that “Outages continue to cause significant problems for operators. Just over a third (34%) of all respondents had an outage or severe IT service degradation in the past year, while half (50%) had an outage or severe IT service degradation in the past three years.”

So if you were thinking painful outages only happen at your organization, think again. They’re happening everywhere. And as the research from Statista emphasized, when outages hit, it’s usually very expensive.

The Uptime Institute has an even more alarming statistic they’ve published.

They’ve found that more than 70% of all data center outages are caused by human error and not by a fault in the infrastructure design!

Let’s pause for a moment to ponder that. In 70% of cases, all it took to bring today’s most powerful high-tech to its knees was a person making an honest mistake.

That’s actually not too surprising though, is it? All of us have mistyped a keyboard stroke here or made an erroneous mouse click there. How many times has it happened that someone absent-mindedly pressed “Reply All” to an email meant for one person, then realized with horror that their message just went out to the entire organization?

So mistakes happen to everyone, and that includes data center operators. And unfortunately, when they make a mistake that leads to an outage, the consequences can be catastrophic.

One well-known example of an honest human mistake that led to a spectacular outage occurred back in late February of 2017. Someone on Amazon’s S3 team input a command incorrectly that led to the entire Amazon Simple Storage Service being taken down, which impacted 150,000 organizations and led to many millions of dollars in losses.

If infrastructure design usually isn’t the issue, and 70% of the time outages are a direct result of human error, then logic suggests that the key would be to eliminate the potential for human error. And just to emphasize the nuance of this point, we’re NOT advocating eliminating humans, but eliminating the potential for human error while keeping humans very much involved. How do we do that?

Well, you won’t be too surprised to learn we do it through automation.

Let’s start by taking a look at the typical infrastructure and operations troubleshooting process.

This process should look pretty familiar to you.

In general, many organizations (including large ones) do most of these phases manually. The problem with that is that it makes every phase of this process vulnerable to human error.

There’s a better way, however. It involves automating much of this process, which can reduce the time it takes to remediate an IT incident down to seconds. And automation isn’t just faster, it also eliminates the potential for human error, which should radically reduce the likelihood that your environment will experience an outage due to human error.

Here’s how that would work. It involves using the Ayehu platform as an integration hub in your environment. Ayehu would then connect to every system that needs to be interacted with when remediating an incident.

For example, if your environment has a monitoring system like SolarWinds, Big Panda, or Microsoft System Center, that’s where an incident will be detected first. The monitoring system (now integrated with Ayehu) will generate an alert which Ayehu will instantaneously intercept. (BTW – if there’s a monitoring system or any kind of platform in your environment that we don’t have an off-the-shelf integration for, it’s usually still pretty easy to connect to it via a REST API call.)
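As a rough idea of what connecting through a REST API call can involve, the sketch below wraps an alert as a JSON POST request using only Python's standard library. The URL is a placeholder, not a real Ayehu endpoint, and the payload fields are assumptions. Consult the product documentation for the actual webservice contract.

```python
import json
import urllib.request

# Placeholder URL, not a real Ayehu endpoint.
WEBHOOK_URL = "https://automation.example.com/api/events"


def build_event_request(
    alert: dict, url: str = WEBHOOK_URL
) -> urllib.request.Request:
    """Wrap an alert as a JSON POST request (built here, not sent)."""
    body = json.dumps(alert).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_event_request({"type": "low_disk_space", "host": "web-01"})
    print(request.get_method(), request.full_url)
    # urllib.request.urlopen(request) would actually send it.
```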

Ayehu will then parse that alert to determine what the underlying incident is, and launch an automated workflow to remediate it.

As a first step in our workflow we’re going to automatically create a ticket in ServiceNow, BMC Remedy, JIRA, or any ITSM platform you prefer. Here again is where automation really shines over taking the manual approach, because letting the workflow handle the documentation will ensure that it gets done in a timely manner (in fact, in real-time) and that it gets done thoroughly. This brings relief to service desk staff who often don’t have the time or the patience to document every aspect of a resolution properly because they’re under such a heavy workload.

The next step, and actually this can be at any step within that workflow, is pausing its execution to notify and seek human approval for continuation. To illustrate why you might do this, let’s say that a workflow got triggered because SolarWinds generated an alert that a server dropped below 10% free disk space. The workflow could then go and delete a bunch of temp files, it could compress a bunch of log files and move them somewhere else, and do all sorts of other things to free up space. Before it does any of that though, the workflow can be configured to require human approval for any of those steps.

The human can either grant or deny approval so the workflow can continue on, and that decision can be delivered via laptop, smartphone, email, instant messenger, or even regular telephone. However, please note that this notification/approval phase is entirely optional. You can also choose to put the workflow on autopilot and proceed without any human intervention. It’s all up to you, and either option is easy to implement.

Then the workflow can begin remediating the incident which triggered the alert.
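For the low-disk-space example, one of the remediation steps mentioned earlier was compressing log files and moving them somewhere else. A minimal sketch of that single step follows, assuming plain `*.log` files that nothing is actively writing to. The function name and directory arguments are illustrative.

```python
import gzip
import shutil
from pathlib import Path
from typing import List


def compress_logs(log_dir: Path, archive_dir: Path) -> List[Path]:
    """Gzip every *.log file in log_dir into archive_dir, then remove the original."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    archived = []
    for log_file in sorted(log_dir.glob("*.log")):
        target = archive_dir / (log_file.name + ".gz")
        with log_file.open("rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        log_file.unlink()  # free the space only after the archive is written
        archived.append(target)
    return archived
```

Deleting the original only after the compressed copy is fully written means a failure partway through leaves the source log in place.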

As the remediation is taking place, Ayehu can update the service desk ticket in real-time by documenting every step of the incident remediation process.

Once the incident remediation is completed, Ayehu can automatically close the ticket.

Finally, Ayehu can go back into the monitoring system and automatically dismiss the alert that triggered the entire process.

This, by the way, illustrates why we think of Ayehu as a virtual operator which we sometimes refer to as “Level 0 Tech Support”. A lot of incidents can be resolved automatically by Ayehu without any human intervention, and thus without the need for attention from a Level 1 technician.

This then is how you can go from alert to remediation in 15 seconds, while simultaneously eliminating the potential for human error that can lead to outages in your environment.

Gartner concurs with this approach.

In a recently refreshed paper they published (ID G00336149 – April 11, 2019) one of their Vice-Presidents wrote that “The intricacy of access layer network decisions and the aggravation of end-user downtime are more than IT organizations can handle. Infrastructure and operations leaders must implement automation and artificial intelligence solutions to reduce mundane tasks and lost productivity.”

No ambiguity there.

Gartner’s advice is a good opportunity for me to segue into one last topic – artificial intelligence.

The Ayehu platform has AI built-in, and it’s one of the reasons you’ll be able to not only quickly remediate your IT incidents, but also quickly build the workflows that will do that remediation.

Ayehu is partnered with SRI International (SRI), formerly known as the Stanford Research Institute. In case you’re not familiar with them, SRI does high-level research for government agencies, commercial organizations, and private foundations. They also license their technologies, form strategic partnerships (like the one they have with us) and create spin-off companies. They’ve received more than 4,000 patents and patent applications worldwide to date. SRI is our design partner, and they’ve designed the algorithms and other elements of our AI/ML functionality. What they’ve done so far is pretty cool, but what we’re working on going forward is what’s really exciting.

One of the ways Ayehu implements AI is through VSAs, which is shorthand for “Virtual Support Agents”.

VSAs differ from chatbots in that they’re not only conversational, but more importantly they’re also actionable. That makes them the next logical step or evolution up from a chatbot. However, in order for a VSA to execute actionable tasks and be functionally useful, it has to be plugged in to an enterprise grade automation platform that can carry out a user’s request intelligently.

We deliver a lot of our VSA functionality through Slack, and we also have integrations with Alexa and IBM Watson. We’re also incorporating an MS-Teams interface, and looking into others as well.

How is this relevant to remediating incidents?

Well, if a service desk can offload a larger portion of its tickets to VSAs, and provide its users with more of a self-service modality, then that frees up the service desk staff to automate more of the kinds of data center tasks that are tedious, repetitive, and prone to human error. And as I’ve previously stated, eliminating the potential for human error is key to reducing the likelihood of outages.

Speaking of tickets, another informal webinar poll we conducted asked:

On average, how many support tickets per month does your IT organization deal with?

  • Less than 100
  • 101 – 250
  • 251 – 1,000
  • More than 1,000

Here’s how our audience responded:

Nearly 90% receive 251 or more tickets per month. Over half get more than 1,000!

For comparison, the Zendesk Benchmark reports that among their customers, the average is 777 tickets per month.

Given the volume of tickets received per month, the current average duration it takes to remediate an incident, and most importantly the onerous cost of downtime, automation can go a long way towards helping service desks maximize their efficiency by being a force multiplier for existing staff.

Q: What types of notifications can the VSA send at the time of incident?

A: Notifications can be delivered either as text or speech.

Q: How does the Ayehu tool differ from other leading RPA tools available on the market?

A: RPA tools typically do screen automation with an agent. Ayehu is an agentless platform that primarily interfaces with backend APIs.

Q: Do we have to do API programming or other scripting as a part of implementation?

A: No. Ayehu’s out-of-the-box integrations typically only require a few configuration parameters.

Q: Do we have an option to create custom activities? If so, which programming language should be used?

A: In our roadmap, we will be offering the ability to create custom activity content out-of-the-box.

Q: Do out-of-the-box workflows work on all types of operating systems?

A: Yes. You just define the type of operating system within the workflow.

Q: How does Ayehu connect and authenticate with various endpoint devices (e.g. Windows, UNIX, network devices, etc.)? Is it password-less, connection through a password vault, etc.?

A: That depends on what type of authentication is required internally by the organization. Ayehu’s integration with the CyberArk password vault can be leveraged when privileged account credentials are involved. Any type of user credential information that is manually input into a workflow or device is encrypted within Ayehu’s database. Also, certificates on SSH commands, Windows authentication, and localized authentication are all accessible out-of-the-box. Please contact us for questions about security scenarios specific to your environment.

Q: What are all the possible modes in which VSAs can interact with end users?

A: Text, Text-to-Speech, and Buttons.

Q: Can we create role-based access for Ayehu?

A: Yes. That’s a standard function which can also be controlled by and synchronized with Active Directory groups out-of-the-box.

Q: Apart from incident tickets, does Ayehu operate on request tickets (e.g. on-demand access management, software requests from end-users, etc.)?

A: Yes. The integration packs we offer for ServiceNow, JIRA, BMC Remedy, etc. all provide this capability for both standard and custom forms.

Q: Does Ayehu provide APIs for an integration that’s not available out of the box?

A: Yes. There are two options. You can either forward an event to Ayehu using our webservice, which is based on a RESTful API, or from within the workflow you can send messages outbound that are either scheduled or event-driven. This allows you to do things such as make a database call, set an SNMP trap, handle SYSLOG messages, etc.

Q: Does Ayehu provide any learning portal for developers to learn how to use the tool?

A: Yes. The Ayehu Automation Academy is an online Learning Management System we recently launched. It includes exams that provide you an opportunity to bolster your professional credentials by earning a certification. If you’re looking to advance your organization’s move to an automated future, as well as your career prospects, be sure to check out the Academy.

Q: Does Ayehu identify issues like a monitoring tool does?

A: Ayehu is not a monitoring tool like SolarWinds, Big Panda, etc. Once Ayehu receives an alert from one of those monitoring systems, it can trigger a workflow that remediates the underlying incident which generated that alert.

Q: We have 7 different monitoring systems in our environment. Can Ayehu accept alerts from all of them simultaneously?

A: Yes. Ayehu’s integrations are independent of one another, and it can also accept alerts from webservices. We have numerous deployments where thousands of alerts are received from a variety of sources and Ayehu can scale up to handle them all.

Q: What does the AI in Ayehu do?

A: AI is used in several areas: understanding intent through chatbots, recommending workflow designs, and suggesting workflows to remediate events through the Ayehu Brain service. Please contact an account executive to learn more.
