Kishan Dutt

Artificial intelligence for IT operations (AIOps), a term coined by Gartner, refers to multi-layered technology platforms that automate and improve IT operations through advanced analytics and machine learning (ML). AIOps draws on cognitive operations, algorithmic IT operations, and IT operations analytics (ITOA) to detect anomalies proactively, identify patterns and deviations in real time for root cause analysis, and provide predictive insights that support an intelligent response.

In short, AIOps empowers organizations to streamline IT operations, minimize disruptions, and optimize performance for a resilient and intelligent IT landscape. 

AIOps in banking applications: 

With the evolution of banking systems and applications, the number of customer interactions has increased manifold. From less than one interaction per day a few decades ago (a physical bank visit), the figure has grown to hundreds per day across web, mobile, real-time transactions, and chatbots. This number is expected to grow even further with the rise of smart devices and IoT.

As transactions increase, so does the load on the infrastructure, leading to a higher risk of failure. Existing support staff may struggle to keep up with the surge in problems and transactions. Financial viability is a further challenge, as increased transactions do not necessarily lead to increased user fees. Adopting AI-driven operations has therefore become crucial for financial institutions' IT setups. AIOps can enable real-time detection and correction of IT issues with minimal human effort, ensuring smooth operations and an enhanced customer experience.

Upcoming challenges in IT/network operations: 

By 2030, it is expected that there will be 50 billion connected devices worldwide. According to Strategy Analytics, these could range from smartphones and tablets to smart cars or even physical wearables such as watches or glasses. The proliferation of connected devices will encompass various use cases of IoT and spatial computing. For example, home refrigerators may use eCommerce sites to order groceries and bank payment systems to settle bills.  

This exponential growth in connected devices will introduce new challenges for IT and network operations, necessitating innovative approaches like AIOps to handle the complexity and ensure efficient functioning of the interconnected systems. With this evolving landscape, we are seeing the following challenges: 

  • Data overload: The sheer volume and variety of data generated by applications and devices will be overwhelming for IT systems. This will make it difficult to identify and troubleshoot problems. 
  • Security risks: Connected devices can be a major security risk. If they are not properly secured, they can be hacked and used to launch attacks on IT systems. 
  • Complexity: IT environments are becoming more complex, diverse, and dynamic, especially with the adoption of cloud, microservices, containers, serverless, and edge computing.
  • Compliance challenges: Organizations must ensure that they and their users comply with a variety of regulations, such as those related to data privacy and security. This is difficult because the regulatory landscape is constantly changing.
  • Costs: Managing the above challenges with current methods would require a proportional increase in workforce, leading to unsustainable costs, because future users will not pay as much per device for new use cases as they do for current ones.

How do we overcome these challenges?

  • Reduce noise and alert fatigue by filtering out irrelevant or duplicate events and identifying root causes and patterns. 
  • Enhance observability and visibility across the entire IT infrastructure, applications, and services. 
  • Improve performance, availability, and reliability of IT systems by detecting and resolving issues quickly and accurately.
  • Enable proactive and predictive IT operations by anticipating and preventing potential problems before they impact users or business outcomes. 
  • Streamline and automate workflows and processes across IT domains and teams, reducing costs, manual efforts, and human errors. 
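
The first point, filtering out duplicate events, can be sketched in a few lines. This is a minimal illustration, not how any particular AIOps product works: the `Alert` fields, the `deduplicate` function, the five-minute window, and the host and check names are all hypothetical. It keeps the first alert for each (source, check) pair and suppresses repeats that arrive within the window of the last kept one.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Alert:
    source: str          # host or service that raised the alert (hypothetical field)
    check: str           # name of the failing check (hypothetical field)
    timestamp: datetime  # when the alert was raised


def deduplicate(alerts, window=timedelta(minutes=5)):
    """Keep the first alert per (source, check); drop repeats that
    arrive within `window` of the last alert kept for that pair."""
    last_kept = {}
    kept = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        key = (alert.source, alert.check)
        previous = last_kept.get(key)
        if previous is None or alert.timestamp - previous > window:
            kept.append(alert)
            last_kept[key] = alert.timestamp
    return kept


# Illustrative data only: three repeats of one alert, one unrelated alert,
# and a later recurrence outside the suppression window.
base = datetime(2024, 1, 1, 9, 0)
alerts = [
    Alert("core-db", "cpu_high", base),
    Alert("core-db", "cpu_high", base + timedelta(minutes=1)),
    Alert("core-db", "cpu_high", base + timedelta(minutes=2)),
    Alert("payments-api", "latency_high", base + timedelta(minutes=1)),
    Alert("core-db", "cpu_high", base + timedelta(minutes=10)),
]

for alert in deduplicate(alerts):
    print(alert.timestamp.time(), alert.source, alert.check)
```

A fixed time window is the simplest possible rule; real deployments typically also correlate alerts across related components, which this sketch does not attempt.
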
Conceptual diagram of AIOps

How can AIOps solutions help: 
  • Data aggregation: Collect data from various sources, such as monitoring tools, event logs, and configuration files. 
  • Real-time processing: Process data in real time, so that incidents can be identified and responded to quickly.
  • Rules and patterns: Use rules and patterns to identify anomalies and potential problems. 
  • Domain algorithms: Apply domain algorithms to learn from historical data and predict future events. 
  • AI and ML capability: Employ artificial intelligence and machine learning to automate tasks and improve decision-making. 
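
As one illustration of the "rules and patterns" step, the sketch below flags metric values that deviate sharply from recent history. It is a deliberately simple statistical rule (a rolling z-score), not a description of any specific AIOps product; the function name `detect_anomalies`, the window size, the threshold, and the sample latency figures are assumptions chosen for the example.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(values, window=10, threshold=3.0):
    """Return the indices of values that lie more than `threshold`
    standard deviations from the mean of the preceding `window` values."""
    history = deque(maxlen=window)
    anomalies = []
    for index, value in enumerate(values):
        # Only score a value once a full window of history is available.
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            # A perfectly flat history (sigma == 0) is skipped in this sketch.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                anomalies.append(index)
        history.append(value)
    return anomalies


# Illustrative response times in milliseconds with one obvious spike.
latencies_ms = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101, 250, 100]
print(detect_anomalies(latencies_ms))
```

Note that a flagged value still enters the history, so a large spike temporarily widens the baseline; production systems usually handle this with more robust statistics or learned models.
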

Challenges in implementation of AIOps include:

  • Data quality: The quality of the data used by AIOps solutions is critical to their success. If the data is inaccurate or incomplete, the solutions will not be able to make accurate predictions or identify problems. 
  • Complexity: AIOps solutions can be complex to implement and manage. Organizations need to have the right skills and resources in place to successfully deploy and use these solutions. 
  • Cost: AIOps solutions can be expensive to purchase and implement. Organizations need to carefully consider the benefits of AIOps before deciding to invest in these solutions. 
The benefits of AIOps include: 
  • Reduced mean time to detect (MTTD) and mean time to resolve (MTTR): Help organizations identify and resolve incidents more quickly, which can lead to significant reductions in downtime.
  • Improved visibility: Provide organizations with a single view of their IT infrastructure to better understand the relationships between different systems and components. 
  • Enhanced decision-making: Help teams make better decisions about IT operations, such as which incidents to prioritize and which resources to allocate.
  • Increased efficiency: Automate tasks, which can free up IT staff to focus on more strategic activities. 
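
To make the MTTD and MTTR benefit measurable, the two metrics can be computed from incident timestamps. The sketch below assumes a hypothetical `Incident` record with three timestamps, and it assumes both metrics are measured from the moment the fault began; some teams instead measure MTTR from the moment of detection, so the definition should be agreed before comparing numbers.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    started: datetime   # when the fault began
    detected: datetime  # when monitoring or a person noticed it
    resolved: datetime  # when service was restored


def mean_delta(deltas):
    """Average a non-empty collection of timedelta values."""
    deltas = list(deltas)
    return sum(deltas, timedelta()) / len(deltas)


def mttd(incidents):
    """Mean time to detect: average of (detected - started)."""
    return mean_delta(i.detected - i.started for i in incidents)


def mttr(incidents):
    """Mean time to resolve, measured here from fault start (an assumption)."""
    return mean_delta(i.resolved - i.started for i in incidents)


# Illustrative incidents only.
incidents = [
    Incident(datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 10), datetime(2024, 1, 1, 11, 0)),
    Incident(datetime(2024, 1, 2, 12, 0), datetime(2024, 1, 2, 12, 20), datetime(2024, 1, 2, 12, 40)),
]

print("MTTD:", mttd(incidents))
print("MTTR:", mttr(incidents))
```

Tracking these two figures before and after an AIOps rollout gives a simple way to check whether detection and resolution are actually getting faster. The helper raises `ZeroDivisionError` on an empty incident list, which a real report would need to handle.
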

The future of AIOps:

Despite these challenges, the future of AIOps is bright. As AI and machine learning technologies continue to mature, AIOps solutions will become more powerful and sophisticated. They will be able to identify real problems and extract meaningful insights from large volumes of structured and unstructured data while reducing false positives and false negatives. AIOps will be able to consolidate information from multiple dashboards, ticketing systems, and incident-response tools, reducing the number of incidents that IT teams must handle each week and speeding up the response to those that remain. With unsupervised machine learning algorithms, AIOps tools will be able to learn patterns from unlabeled data, helping IT teams overcome challenges in current AIOps implementations.

At QualityKiosk, we possess extensive domain knowledge in BFSI, ITOps, and data science, making us well-positioned to support our clients in adopting, deploying, and utilizing the best platforms, processes, and use cases to derive the utmost benefits from their AIOps journey. 

About the Author
Kishan Dutt, VP and Head of Cloud Engineering, QualityKiosk Technologies

Kishan Dutt has over two decades of experience in telecom, working with global leaders (Ericsson) as well as MTN, Airtel, Orange Group, Telefonica, DT Group (including Sprint/T-Mobile), Vodafone, Indosat, Telenor Asia (DTAC, Digi, TML), and NBN. He has held international roles across Europe, Africa, and APAC, leading multicultural teams of 2,500+ professionals.

Prior to QualityKiosk, Kishan headed the GNOC operations center in India, leading service delivery for 30+ international telecom operators. He has rich experience in network operations, planning and design, optimization, and support.