APIs and Machine Learning: What You Need to Know


By Ross Moore

The Importance of APIs and Securing Them

From the consumer side, there’s nothing easier than opening some apps and accessing tons of information from them. On the application side, it’s a completely different view. Microservices, network connections, legal agreements, and development resources – just to name a few – are rushing along at a frightening pace to keep up with customer demands.

Because APIs are used almost universally, securing them – including their uptime – is both of the utmost importance and immensely difficult.

Tracking Activity and Taking Action

Some police departments have long looked for ways to predict crime well in advance, but numerous foundational issues prevent accurate predictions that far ahead.

The stakes are just as high in cybersecurity: sensitive data is at risk, and companies need to anticipate cyberattacks as early as possible. The best way to do that is to use the latest technology to draw the best inferences from current data activity and act accordingly. Prediction at that level of accuracy requires Machine Learning (ML).

A primary ML strength is the ability to scale its data processing. Machine learning algorithms can be used for threat detection by analyzing network traffic to identify suspicious behavior, such as abnormal amounts of data transfers or repeated requests from a single IP.
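As a rough illustration of the "repeated requests from a single IP" case, the sketch below flags IP addresses whose request count is a statistical outlier compared with the rest of the traffic. It is a simple statistical stand-in for what a trained model would do at scale, not a production detector. The function name, the sample addresses (taken from documentation-reserved ranges), and the threshold of 3.5 are all assumptions made for the example.

```python
from collections import Counter
from statistics import median


def flag_noisy_ips(request_ips, threshold=3.5):
    """Return IPs whose request count is an outlier by modified z-score.

    Hypothetical helper: `request_ips` is one source IP per logged request.
    """
    counts = Counter(request_ips)
    if not counts:
        return []
    values = list(counts.values())
    med = median(values)
    # Median absolute deviation: robust to the very outliers we are hunting.
    mad = median([abs(v - med) for v in values])
    flagged = []
    for ip, count in counts.items():
        if mad == 0:
            # No spread to measure against; fall back to a crude multiple
            # of the median (an assumption, tune for real traffic).
            is_outlier = count > 10 * med
        else:
            is_outlier = 0.6745 * (count - med) / mad > threshold
        if is_outlier:
            flagged.append(ip)
    return sorted(flagged)


# Made-up traffic: four ordinary clients and one very chatty one.
sample_log = (
    ["198.51.100.1"] * 3
    + ["198.51.100.2"] * 4
    + ["198.51.100.3"] * 5
    + ["198.51.100.4"] * 4
    + ["203.0.113.9"] * 200
)
print(flag_noisy_ips(sample_log))  # ['203.0.113.9']
```

A real deployment would likely look at many more signals than raw counts, which is where the context discussed next comes in.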

Context is king when using machine learning to keep apps secure.

The role of machine learning in enhancing API security

The contextual use of ML algorithms includes behavior-pattern analysis. Context involves collating information such as how each API is used when it is called and whether the calling entity has a history with the API. No approach can determine every detail with 100% certainty, but the more information available, the better informed the security will be in detecting API abuse.

Why the big push for automation and machine learning in API Security?

Older technology and approaches are slower because they rely on rules. Manually changing rules to match the threats is too slow. Automation with machine learning is required to make changes rapidly and dynamically as needed.

Legacy security tools such as WAFs (at least single- or limited-function WAFs) are based on signature matching and stateless detection, judging each observed request in isolation to decide whether it is anomalous or suspicious. ‘Intelligent CIO’ says, “attackers don’t handcraft malware; they modify existing malware just enough to throw off signature-based defenses. Malware signatures work by creating hashes of known bad files, so the smallest modification prevents a match.”
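The weakness of hash-based signatures is easy to demonstrate. In the minimal sketch below, a blocklist holds the SHA-256 hash of a known-bad payload; appending a single byte to that payload produces a different hash, so the lookup misses it. The payload bytes and function names are placeholders invented for the example, not taken from any real product.

```python
import hashlib


def signature(payload: bytes) -> str:
    """Hash a payload the way a simple signature-based tool might."""
    return hashlib.sha256(payload).hexdigest()


# Placeholder bytes standing in for a known malicious file.
known_bad = b"example malicious payload"
blocklist = {signature(known_bad)}


def matches_signature(payload: bytes) -> bool:
    """Return True only if the payload's hash is on the blocklist."""
    return signature(payload) in blocklist


# The attacker changes one byte; the behavior could be identical.
variant = known_bad + b" "

print(matches_signature(known_bad))  # True
print(matches_signature(variant))    # False
```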

Modern protection, especially with ML, uses stateful detection to identify suspicious activity because attackers now abuse APIs strategically, moving beyond simplistic approaches such as brute force or old-style SQLi. There needs to be defense-in-depth, defense-at-scale, and intelligent defense.
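To make the stateless/stateful distinction concrete, here is a small sketch of a stateful check. Each request on its own looks valid, but the detector remembers what each client has asked for recently and raises a flag when one client touches an unusually large number of distinct objects in a short window, a pattern typical of ID enumeration. The class name, the limit of five objects, and the 60-second window are assumptions for illustration; an ML-based system would typically learn such baselines per client rather than hard-code them.

```python
from collections import defaultdict, deque


class ObjectEnumerationDetector:
    """Hypothetical stateful check: track distinct object IDs per client."""

    def __init__(self, max_distinct=5, window_seconds=60):
        self.max_distinct = max_distinct
        self.window_seconds = window_seconds
        # client_id -> deque of (timestamp, object_id), oldest first
        self._history = defaultdict(deque)

    def observe(self, client_id, object_id, timestamp):
        """Record one request; return True if the client now looks suspicious."""
        history = self._history[client_id]
        history.append((timestamp, object_id))
        # Forget events that have fallen out of the time window.
        while history and timestamp - history[0][0] > self.window_seconds:
            history.popleft()
        distinct_objects = {obj for _, obj in history}
        return len(distinct_objects) > self.max_distinct


detector = ObjectEnumerationDetector()
# One client walks through object IDs 1..6, one request per second.
results = [detector.observe("client-a", obj, float(obj)) for obj in range(1, 7)]
print(results)  # only the sixth request trips the check
```

A stateless filter would see six well-formed requests and pass them all; the signal only exists in the history.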

Benefits of using machine learning in API security

Benefits of using machine learning to secure and improve APIs include:

Improved Threat Detection: Machine learning algorithms can examine enormous volumes of data to find trends and anomalies that can point to a security breach, giving enterprises a more thorough understanding of their security posture.

Adaptability: Machine learning algorithms can learn from previous occurrences and change to counter new threats, giving firms a proactive security strategy that changes as threats do.

Enhanced Efficiency: Machine learning can help security teams respond to threats more quickly and effectively by automating the analysis and prioritization of security alerts, which cuts down on the time it takes to identify and address security issues.

Enhanced User Authentication and Authorization: ML makes it practical to implement risk-based authentication, which monitors user behavior to estimate the likelihood of fraudulent activity, improving the security of user authentication and authorization.

Improved Data Privacy: ML reduces the danger of data breaches by identifying and blocking security threats, helping to protect the privacy and security of sensitive data and to preserve the reputation of enterprises.
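The risk-based authentication idea in the list above can be sketched in a few lines. The example scores a login attempt from a handful of behavioral signals and maps the score to an action: allow, ask for an extra factor, or deny. The signals, the point values, and the thresholds are all invented for illustration; in an ML-driven system the weights would be learned from historical login data rather than fixed by hand.

```python
from dataclasses import dataclass


@dataclass
class LoginAttempt:
    """Hypothetical signals collected for one login attempt."""
    new_device: bool
    unfamiliar_location: bool
    recent_failed_attempts: int


def risk_score(attempt: LoginAttempt) -> int:
    """Return a risk score from 0 to 100 (hand-picked weights, an assumption)."""
    score = 0
    if attempt.new_device:
        score += 30
    if attempt.unfamiliar_location:
        score += 30
    # Cap the contribution of failed attempts at 40 points.
    score += min(attempt.recent_failed_attempts, 4) * 10
    return score


def decide(attempt: LoginAttempt, step_up_at: int = 30, deny_at: int = 70) -> str:
    """Map the risk score to an action."""
    score = risk_score(attempt)
    if score >= deny_at:
        return "deny"
    if score >= step_up_at:
        return "step_up"  # e.g. require a second factor
    return "allow"


print(decide(LoginAttempt(False, False, 0)))  # allow
print(decide(LoginAttempt(True, False, 0)))   # step_up
print(decide(LoginAttempt(True, True, 2)))    # deny
```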

It sounds too good to be true

Returning to the earlier section about the need for good data, ML can only work well when: 

1) there’s an abundance 

2) of quality data 

3) organized 

4) from all relevant sources

5) that can be analyzed in near real-time

The role of machine learning in strengthening API security is to improve the detection and prevention of, and the reaction to, security risks, but it can do so only if the above qualities are present. Attention to detail is required.

Large-scale data analysis using ML algorithms can find trends and abnormalities that might point to a security breach. And ML is a helpful tool for boosting API security since it can learn from previous mistakes and adjust to new dangers. But only if the data is high quality.

Best Practices for Using Machine Learning in API Security

Preparation and Planning: Before implementing machine learning in API security, understand the security requirements and assess the data that will be used to train the algorithms. 

Quality and Accuracy of Data: Ensure that data is clean, accurate, and relevant to the organization’s security needs.

Continuous Monitoring and Maintenance: Regularly review the performance of the algorithms, update them with new data, and adjust their configurations as needed.

Integration with Existing Security Systems: Seamless interoperability is vital when implementing machine learning in API security.

Algorithm Bias and Discrimination: Machine learning algorithms are only as unbiased as the data used to train them; the less bias in the data, the more accurate the results.
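For the continuous monitoring practice above, one common way to review an algorithm's performance is to compare its alerts against what analysts later confirmed, then track precision (how many alerts were real) and recall (how many real incidents were caught) over time. The sketch below assumes a hypothetical list of reviewed events, each recording whether the model flagged it and whether it turned out to be malicious.

```python
def precision_recall(reviewed_events):
    """Compute precision and recall from (flagged, actually_malicious) pairs."""
    true_pos = sum(1 for flagged, bad in reviewed_events if flagged and bad)
    false_pos = sum(1 for flagged, bad in reviewed_events if flagged and not bad)
    false_neg = sum(1 for flagged, bad in reviewed_events if not flagged and bad)
    # Guard against division by zero when there is nothing to measure.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Made-up review results: 3 correct alerts, 1 false alarm,
# 1 missed incident, and 5 correctly ignored events.
reviewed = (
    [(True, True)] * 3
    + [(True, False)] * 1
    + [(False, True)] * 1
    + [(False, False)] * 5
)
precision, recall = precision_recall(reviewed)
print(precision, recall)  # 0.75 0.75
```

A sustained drop in either number is a reasonable trigger for retraining with newer data or adjusting the configuration.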

Data privacy and security concerns

Data Privacy and Security: Using machine learning in API security requires collecting and processing large amounts of data, which can raise privacy concerns. Ensure that sensitive data is protected and that privacy regulations, such as GDPR, are followed. Working toward ISO 27701 is a good goal.

Data Management and Access Control: Implement secure data storage and access controls to prevent unauthorized access.

Security of the Model and its Deployment: The security of the ML model and its deployment environment is crucial. Threat actors may try to compromise the model and its deployment environment to access sensitive data or manipulate the results.

Data Ownership: Ownership of, and responsibility for, the data used to train machine learning algorithms can be unclear. Organizations must clearly define and understand both.

Remember Why You’re Doing This

Determining the right API protection platform isn’t easy; in reality, it shouldn’t be. API security is too easy to get wrong, and protecting the data that traverses the wires is extremely important. And that importance makes it worth doing what it takes to get it right.


Ross Moore is the Cyber Security Support Analyst with Passageways. He was Co-lead on SOC 2 Type 1 implementation and Lead on SOC 2 Type 2 implementation, facilitated the company’s BCP/DR TTX, and is a HIPAA Security Officer. Over the course of his 20-year IT career, Ross has served in a variety of operations and infosec roles for companies in the manufacturing, healthcare, real estate, business insurance, and technology sectors. He holds (ISC)2’s SSCP and CompTIA’s Security+ certifications, a B.S. in Cyber Security and Information Assurance from WGU, and a B.A. in Bible/Counseling from Johnson University. He is also a regular writer at Bora.


Follow Brilliance Security Magazine on Twitter and LinkedIn to ensure you receive alerts for the most up-to-date security and cybersecurity news and information.