AI OPS FOR SECURITY


Another small write-up on my understanding of AIOps + security.

Enterprise security is an integral part of the "Operations" department. Provisioning resources, granting access and the authority to make changes, and revoking all of these when needed are in fact important aspects of security. In recent years, a new branch of IT operations has emerged, which some call "Security IT operations". From my perspective, this new branch of security operations exploits the data generated during "regular" operations and enterprise business activity to make intelligent decisions.

Security IT Ops, if based on AIOps, could do magic. AIOps stands for Artificial Intelligence for IT Operations. As data is generated in various areas of the enterprise (user logins, requests for data, writes to the DB, IP traffic / network usage, regional loads, resource availability vs. consumption, system usage and delays, etc.), it is used to learn the "norm" during a learning stage, and then to predict future usage or growth and to report on anomalies.

AIOps needs data from every data point in the enterprise. Data gathering, normalization and analysis are important for identifying correlations between different variables. As I mentioned earlier, for security it is very important to identify how each data point in the enterprise depends on another. For example, someone who stole credentials might log in from a different IP address or location (network data) and access a DB they are not authorized to use (user access data + database metrics), or make a flood of API requests to a server to bring it down (system and network data).
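
To make the stolen-credentials example a bit more concrete, here is a minimal sketch of correlating signals from two data sources. Everything in it is an assumption for illustration: the `Event` record, the source names, and the `KNOWN_IPS` / `ALLOWED_TABLES` baselines, which a real system would learn from history rather than hard-code.

```python
from dataclasses import dataclass


@dataclass
class Event:
    user: str
    source: str   # "network" or "db" (hypothetical source names)
    detail: str   # IP address for network events, table name for db events


# Hypothetical baselines; a real system would learn these from history.
KNOWN_IPS = {"alice": {"10.0.0.5"}}
ALLOWED_TABLES = {"alice": {"orders"}}


def correlate(events):
    """Return users whose events trip at least two independent signals."""
    signals = {}
    for e in events:
        if e.source == "network" and e.detail not in KNOWN_IPS.get(e.user, set()):
            signals.setdefault(e.user, set()).add("new_ip")
        elif e.source == "db" and e.detail not in ALLOWED_TABLES.get(e.user, set()):
            signals.setdefault(e.user, set()).add("unauthorized_table")
    return {user: sorted(s) for user, s in signals.items() if len(s) >= 2}


events = [
    Event("alice", "network", "203.0.113.9"),  # login from an unknown IP
    Event("alice", "db", "payroll"),           # table outside her permissions
    Event("bob", "network", "198.51.100.7"),   # unknown IP only
]
suspicious = correlate(events)
print(suspicious)
```

Either signal alone could be innocent (a new IP may just be a hotel Wi-Fi), which is why the sketch only reports users who trip both.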




There are 5 stages in implementing Artificial Intelligence based IT operations (including security):
1) Data collection
2) Normalization + cleaning
3) Data analytics (charting, graphing, metrics and, importantly, statistical analysis)
4) Machine learning (supervised and unsupervised)
5) Implementing AIOps
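
As a rough illustration of stage 2, here is a minimal sketch of cleaning and normalizing two metrics so that they can be compared on the same scale. The metric names and the cleaning rule (drop missing and negative readings) are my assumptions, and min-max scaling is just one of several possible normalization choices.

```python
def clean(values):
    """Drop missing readings (None) and negatives, assumed invalid here."""
    return [v for v in values if v is not None and v >= 0]


def min_max(values):
    """Rescale readings to the 0..1 range so different metrics are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


raw_logins_per_min = [12, None, 30, -1, 21]   # hypothetical collector output
raw_net_mbps = [100.0, 400.0, 250.0]

logins = min_max(clean(raw_logins_per_min))
net = min_max(clean(raw_net_mbps))
print(logins, net)
```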


One might ask why we need "AI" in security when we have super talented personnel watching over the data analytics created by existing tools. Some might also say, "I have automation tools to revoke or grant access, or to provision on the fly as there is demand, so why do I need 'intelligence' at all?" Good question.

What if your enterprise grows from 10 thousand users to 10 million and your VM footprint grows 100 times? What if your business demand has peaks and valleys during the day? And can you predict a system outage or a security anomaly in a very large scale environment?

The answer is that it is just "impossible" without AIOps. Security should not depend only on long-term data analysis. A combination of long-term and real-time analytics is the way forward. Machine learning is great for this, as it uses historical data for supervised and unsupervised learning. Supervised learning has a dependent variable (the outcome) that is predicted from independent variables (the inputs). Classification is another form of supervised learning, where the outcome is a category rather than a number. In unsupervised learning there is no labeled outcome: the data is thrown at the wall to see which "clusters" emerge.
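
To show the difference on a toy scale, here is a minimal sketch using only the Python standard library. The supervised half is a nearest-centroid classifier trained on labeled history; the unsupervised half is a 1-D k-means with two clusters on unlabeled data. The data, the labels and both function designs are assumptions for illustration, not a recommendation of these particular algorithms.

```python
from statistics import mean

# Supervised: labeled history (requests per minute, label).
labeled = [(5, "normal"), (8, "normal"), (7, "normal"), (90, "attack"), (120, "attack")]


def train_centroids(samples):
    """Learn one average value (centroid) per label."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(vals) for label, vals in by_label.items()}


def classify(centroids, value):
    """Predict the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


# Unsupervised: no labels, just find two groups (1-D k-means, k=2).
def two_means(values, iterations=10):
    """Split values into two clusters; assumes at least two distinct values."""
    c1, c2 = min(values), max(values)   # simple deterministic starting centers
    for _ in range(iterations):
        g1 = [v for v in values if abs(v - c1) <= abs(v - c2)]
        g2 = [v for v in values if abs(v - c1) > abs(v - c2)]
        c1, c2 = mean(g1), mean(g2)
    return sorted(g1), sorted(g2)


centroids = train_centroids(labeled)
print(classify(centroids, 70))
print(two_means([5, 8, 7, 90, 120, 6]))
```

Note that the clustering step finds the two groups without ever being told which one is the "attack"; putting a name on each cluster is still up to a person or a later supervised step.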

Practical implementation of AIOps for security:
1) Anomaly detection: this is based on identifying the norm in user behavior and then flagging user activity that is "suspicious". Once an anomaly has been hit and confirmed as legitimate behavior, the machine learning model needs to re-learn so that it becomes part of the norm in the future.
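
A minimal sketch of this idea, assuming a simple z-score test against a per-user history. The `Baseline` class, the threshold of 3 standard deviations and the sample numbers are all hypothetical; real systems typically use richer models.

```python
from statistics import mean, stdev


class Baseline:
    """Tracks the 'norm' for one user metric, e.g. logins per hour."""

    def __init__(self, history, threshold=3.0):
        self.history = list(history)
        self.threshold = threshold   # z-score cutoff, an assumed value

    def is_anomaly(self, value):
        mu, sigma = mean(self.history), stdev(self.history)
        if sigma == 0:
            return value != mu
        return abs(value - mu) / sigma > self.threshold

    def relearn(self, value):
        """Fold an accepted value back into the norm."""
        self.history.append(value)


baseline = Baseline([4, 5, 6, 5, 4, 6, 5])
first = baseline.is_anomaly(20)    # far outside the learned norm
for _ in range(5):
    baseline.relearn(20)           # the new behavior keeps being accepted
later = baseline.is_anomaly(20)    # now part of the norm
print(first, later)
```

The re-learning step is also the risky part: if values are folded in without review, an attacker's behavior can slowly become the norm.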

2) Root cause analysis: what if the analytics showed that read access to your confidential database has increased by 50%, but by an authorized user? That is an anomaly machine learning can detect. But a root cause analysis needs to be performed to see whether the database access is really legit, whether a process went rogue, or whether someone else stole the credentials and is trying to steal the data.

Not everything in the world can be predicted, as the algorithms have limitations: they depend on the quality and variety of the data. But if there was a breach in the system and it was identified a bit late, performing root cause analysis is the next immediate logical thing to do. Using intelligence and automating the root cause analysis with AIOps would help in identifying the cause of the problem and remedying it ASAP.

3) Prediction: as mentioned earlier, machine learning helps predict future performance and identify possible issues. The operations manager can then perform deeper analysis and take action if necessary. For example, using a prediction model, the operations manager sees that a server will be running at 85% capacity within the next hour. He/she may choose to automate load balancing. But if the prediction is that RAM will be 100% used in the next 30 minutes, manual intervention might be needed to see whether there is a memory leak, orphaned storage left by some process, or something stuck in a loop.
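
As a very simplified sketch of such a prediction, the code below fits a straight line to recent utilization samples and estimates how long until the trend crosses a limit. A linear trend is my assumption here, as are the function names, the 10-minute sampling interval and the sample data; real workloads usually need seasonality-aware models.

```python
def fit_line(ys):
    """Least-squares line through the points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    return slope, intercept


def minutes_until(ys, limit, step_minutes=10):
    """Estimate minutes from the last sample until the trend reaches `limit`.

    Returns None if the trend is flat or falling.
    """
    slope, intercept = fit_line(ys)
    if slope <= 0:
        return None
    steps_at_limit = (limit - intercept) / slope
    remaining_steps = steps_at_limit - (len(ys) - 1)
    return max(0.0, remaining_steps * step_minutes)


cpu = [55, 60, 65, 70, 75]   # percent, one sample every 10 minutes
eta = minutes_until(cpu, limit=85)
# Hypothetical policy: rebalance automatically if the limit is under an hour away.
action = "rebalance" if eta is not None and eta <= 60 else "none"
print(eta, action)
```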

4) Actionable alerting: when the AI program identifies that a user is trying to access a DB they are not authorized to use, should the operator intervene manually, or can a set of policies be triggered automatically to enforce a second or third level of security? This will be impossible to handle manually if your company has millions of users. Also, what good are alerts and alarms if you cannot take action on them, or if someone responds too late? AIOps allows you to stop the damage by taking action in several ways, such as limiting access, changing the resources available, or permanently blocking a user.
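
Here is a minimal sketch of what such automatically triggered policies could look like. The `Alert` record, the risk score (assumed to come from the ML model), the thresholds and the action names are all hypothetical; the point is only that each alert maps to an action without an operator in the loop.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    user: str
    kind: str     # e.g. "unauthorized_db_access" (hypothetical label)
    risk: float   # 0.0 .. 1.0, assumed to come from the ML model


# Policies are checked from highest risk to lowest; the first match wins.
POLICIES = [
    (0.9, "block_user"),
    (0.6, "limit_access"),
    (0.3, "require_second_factor"),
]


def decide(alert):
    """Pick the action for an alert, falling back to logging only."""
    for min_risk, action in POLICIES:
        if alert.risk >= min_risk:
            return action
    return "log_only"


alerts = [
    Alert("alice", "unauthorized_db_access", 0.95),
    Alert("bob", "unauthorized_db_access", 0.65),
    Alert("carol", "new_ip", 0.35),
    Alert("dave", "new_ip", 0.10),
]
actions = {a.user: decide(a) for a in alerts}
print(actions)
```

Keeping the thresholds in one table makes it easy to review and tune how aggressive the automation is, which matters when the strongest action is permanently blocking a user.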

Analytics also plays a vital role in root cause analysis. If the data comes in real time, data analysis can identify the cause of an issue and help remedy it without hurting the end users. Analytics also helps with the long-term planning of the enterprise, since business decisions can be made on long-term data analysis. For example, you see that your business is growing only 4% y/y while the target was 8%. That is something data analytics can help with, by identifying where you are lagging and where there is scope for improvement.

References: Wikipedia, IEEE.org, Gartner.com, and lots of other books and white papers.




