How Machine Learning Can Boost Cloud Security


Companies are struggling to keep their networks secure, particularly when they involve cloud computing, but an emerging technology—machine learning—could provide some help to eliminate unauthorized communication in your cloud.

One of the biggest challenges of managing network security is understanding the environment: What assets are present? How are they communicating? What are their dependencies? Have new entities been added? How do they alter communication patterns? How does that affect risk? And the list goes on. Identifying and mapping network topology is hard enough when the network is stable and local, as is the case for on-premises data centers. But cloud exponentially expands the problem of asset tracking. And by all measures, cloud usage is commonplace and growing among companies of all sizes.


Elasticity and scalability are two top reasons organizations love the cloud. That same dynamism, however, ratchets up the amount and complexity of work for infrastructure and security teams. In the era of interconnected everything, the reach of an organization’s network is almost boundless.

Between mobile devices, remote workers, IoT, partners, suppliers, contractors, shadow IT, rapid application deployment, and more, gaining an up-to-date inventory of everything present and communicating on the network is hard enough. Maintaining accurate records when network instances spin up and down or change location is simply not tenable using manual methods or even periodic scans. Automation is the key. However, automated scanning on its own is best suited for inventory. While visibility is always the initial step in managing what you have, there’s so much more to gaining control over your networks and ensuring the security of all software and services communicating there.

The application of machine learning can help your company go from a basic inventory to a thorough understanding of your networks. While automation allows you to collect more data than any human-driven method, it’s machine learning that processes this wealth of data and turns it into actionable insights the company can use to make both strategic and tactical decisions about the security of the network.


This is the third article in a series on cybersecurity. See also “How to Keep Your Cyber Risk Under Control” and “Data Breach Prevention as a Competitive Differentiator.”


The What, How, and When of Network Communications
You need a clear and current view of what assets or entities are on your networks, how they’re communicating, and when those communications happen. Merely identifying the what isn’t enough. Let’s say the operations team finds malware in the environment but sees that it isn’t communicating or trying to communicate with any other asset. This is a much different scenario than if the malware had accessed various applications, services, or devices—and it’s the very reason it’s important to learn and understand communication patterns.
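
As a rough illustration of tracking the what, how, and when together, the sketch below groups connection records into a communication map and reports any source, destination, and port combination that the learned map has never seen. The `Flow` record layout, the asset names, and the helper functions are hypothetical, chosen only for this example. A real deployment would learn from far richer data.

```python
from collections import defaultdict
from typing import NamedTuple


class Flow(NamedTuple):
    """One observed connection (hypothetical record layout)."""
    src: str   # what initiated the connection
    dst: str   # what it talked to
    port: int  # how (destination port)
    hour: int  # when (hour of day, 0-23)


def build_map(flows):
    """Group flows by (src, dst, port) and record the hours each edge was seen."""
    seen = defaultdict(set)
    for flow in flows:
        seen[(flow.src, flow.dst, flow.port)].add(flow.hour)
    return dict(seen)


def unseen_edges(baseline, flows):
    """Return edges present in `flows` that never appear in the baseline map."""
    return sorted(set(build_map(flows)) - set(baseline))


# Assumed sample data: a small learned pattern and one day of new traffic.
baseline = build_map([
    Flow("web", "api", 443, 9),
    Flow("web", "api", 443, 14),
    Flow("api", "db", 5432, 9),
])
today = [
    Flow("web", "api", 443, 10),
    Flow("unknown-proc", "db", 5432, 3),  # not part of the learned pattern
]
print(unseen_edges(baseline, today))  # [('unknown-proc', 'db', 5432)]
```

Dormant malware would produce no edges at all in this map, while malware that reaches a database shows up as a new edge, which is the distinction the scenario above draws.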

Machine learning provides the power to understand patterns and baselines; it’s the why behind the what. This information is important because it adds context: the number of threats that could potentially present a problem to an organization is almost endless, so companies need a way to narrow their scope of concern to the threats that are most likely given their unique business. Machine learning assesses a company’s individual environment and presents a set of data that can be used not only to secure the assets that are present, but also to prevent unauthorized assets from communicating and to block unauthorized connections (which could be an indication that the network has been hijacked).

Who Is Communicating on Your Network?
In a networking context, who often refers to the people accessing network resources, but people need software and services to navigate the network. Therefore, it only makes sense for you to be able to positively identify any requestor. When we’re talking about people as the who, many different attributes constitute an identity. Software and services also have an identity, and machine learning is best poised to build that identity using data sourced from the kernel.

Machine learning not only gathers relevant data about the attributes of communicating assets, but can also determine whether subtle changes to software or services, such as software updates, are legitimate. A person doesn’t get a new identity every time they wear a new piece of clothing. In the same way, software, services, and cloud networks change constantly, but their core identity doesn’t. Security controls, therefore, must be adaptive without adding risk. Machine learning can analyze a combination of software-specific data (such as fuzzy hash and PE header values) and environmental data (such as network namespace, IP address, and port) to build identities for each asset.
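
To make the idea of an identity that survives small changes more concrete, here is a minimal sketch. It is not any vendor’s method. It uses Jaccard similarity over byte n-grams as a crude stand-in for a fuzzy hash, a plain `product_name` string as a stand-in for a PE header value, and an arbitrary 0.7 threshold. All names, fields, and values are assumptions made for illustration.

```python
from dataclasses import dataclass


def shingles(data: bytes, size: int = 4) -> frozenset:
    """Set of overlapping byte n-grams; a crude stand-in for a fuzzy hash."""
    return frozenset(data[i:i + size] for i in range(len(data) - size + 1))


def similarity(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity between two shingle sets, from 0.0 to 1.0."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


@dataclass(frozen=True)
class Asset:
    content: bytes      # stand-in for the binary's bytes
    product_name: str   # stand-in for a PE header value
    namespace: str      # environmental data: network namespace
    port: int           # environmental data: listening port


def same_identity(known: Asset, seen: Asset, threshold: float = 0.7) -> bool:
    """Require exact header and environment matches, but tolerate small content changes."""
    if known.product_name != seen.product_name:
        return False
    if (known.namespace, known.port) != (seen.namespace, seen.port):
        return False
    score = similarity(shingles(known.content), shingles(seen.content))
    return score >= threshold


# Assumed sample data: an original build, a lightly patched build, and an imposter.
original = bytes(range(256))
patched = bytearray(original)
patched[100] = 0                      # simulate a small software update
updated = bytes(patched)
imposter = bytes(reversed(range(256)))

known = Asset(original, "billing-svc", "prod", 8443)
print(same_identity(known, Asset(updated, "billing-svc", "prod", 8443)))   # True
print(same_identity(known, Asset(imposter, "billing-svc", "prod", 8443)))  # False
```

The patched build keeps its identity because most of its content is unchanged, while the imposter fails even though it claims the same name, namespace, and port.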

Using the identity built from machine learning, security controls can then definitively determine whether an asset is authorized to communicate on the network or not. Anything that fails to match an approved identity is prevented from communicating, and as a result, the network becomes better protected against threats.
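
The enforcement step described here amounts to a default-deny check against a set of approved identities. The sketch below shows that logic in its simplest form. The `Identity` fields and the sample entries are hypothetical, and a production control would sit in the network path rather than in a Python list.

```python
from typing import NamedTuple


class Identity(NamedTuple):
    """Hypothetical identity record; real identities would carry more attributes."""
    name: str
    namespace: str
    port: int


def partition(requests, approved):
    """Split requests into allowed and blocked; anything not approved is blocked."""
    allowed, blocked = [], []
    for identity in requests:
        (allowed if identity in approved else blocked).append(identity)
    return allowed, blocked


# Assumed sample data.
approved = {
    Identity("billing-svc", "prod", 8443),
    Identity("report-svc", "prod", 9090),
}
requests = [
    Identity("billing-svc", "prod", 8443),
    Identity("billing-svc", "prod", 4444),  # right name, unapproved port
    Identity("miner", "prod", 3333),        # never approved
]
allowed, blocked = partition(requests, approved)
print(len(allowed), len(blocked))  # 1 2
```

Because the check is an exact match against approved identities, a familiar name on an unexpected port is blocked just like an unknown process.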

Where Network Communication Is Happening
Broadly speaking, where refers to the network in question. That said, in a cloud network, change could occur at any given moment. As such, you need to be able to distinguish necessary and approved communication pathways from those that are unnecessary and therefore increase risk. The sections above explain how machine learning can assess an environment to understand the who, what, how, and when of network communications, but machine learning is also an extremely important factor in securing the pathways software and services use to communicate over your network.

Most organizations’ networks are severely overexposed, meaning that attackers are able to communicate over unmonitored paths or hide in approved software (that is, software that hasn’t been identified via machine learning). Given today’s technology, network overexposure is entirely avoidable. Machine learning can be used to baseline normal activity and behavior and to quantify environmental risk. Learning the ins and outs of how your organization’s networks function from statistical analysis, rather than from basic aggregated data (meaning data not processed with machine learning), provides a more reliable way to control which assets have access to your networks and which ones can communicate across them.
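
One simple way to picture the difference between aggregated data and a statistical baseline is a z-score check: rather than storing a raw total, you learn the mean and spread of normal activity and flag values that fall far outside it. The sketch below is only a minimal example of that idea. The connection counts, the threshold of 3.0, and the function names are assumptions, and real products are likely to use much more sophisticated models.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Summarize normal activity as (mean, sample standard deviation)."""
    return mean(history), stdev(history)


def is_anomalous(value, baseline, z_threshold=3.0):
    """Flag a value whose distance from the mean exceeds z_threshold deviations."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold


# Assumed sample data: outbound connections per minute for one service.
history = [100, 104, 98, 101, 97, 103, 99, 102]
baseline = build_baseline(history)

print(is_anomalous(101, baseline))  # False: within the normal range
print(is_anomalous(400, baseline))  # True: far outside the learned baseline
```

A raw aggregate such as "804 connections in eight minutes" says nothing about whether 400 in a single minute is unusual. The baseline does.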

Impacts of Machine Learning
The sections above explain why and how machine learning can improve your network security and prevent unauthorized communication in your cloud. It’s also important to note the benefits. Actionable insights based on actual environmental data that’s specific to your business and security goals can:

  1. Allow for faster, better-informed decisions
  2. Reduce cybersecurity risk
  3. Provide measurable results

Machine learning may seem like magic in some cases, but (when done correctly) it’s really advanced math, using complex algorithms to determine outcomes. Of course, not all machine learning is created equal. It’s therefore important to ask how your vendor applies machine learning, so you can tell whether it’s true machine learning or just lots and lots of data.

That said, machine learning can have a transformative effect on the efficacy of your security program and enable you to keep malicious communications off your network and out of your cloud.


Katherine Teitler is a cybersecurity speaker and writer, based in Medford, Mass.
