What Is Runtime Security | Palo Alto Networks
Runtime security refers to the continuous, end-to-end monitoring and validation of all activity within containers, hosts, and serverless functions. It works by leveraging application control and allowlisting to establish a baseline of normal behavior for each host, container, serverless function, and other objects within a cloud-native environment. Through real-time observation of file systems, processes, and network activity, runtime security tools detect suspicious or anomalous activity and alert teams as needed.
Real-time security monitoring, of course, isn’t new. For almost two decades, security information and event management (SIEM) platforms have been monitoring application environments for anomalies. So what’s different about cloud-native runtime security?
The difference centers on the environment.
Runtime Security for Modern Applications
By automating security for fast-moving, dynamic applications like those that run in containers, runtime security addresses the unique security and compliance needs of cloud-native environments.
Cloud-native runtime security operates in environments moving so fast that baselines, in the traditional sense, don’t exist. When clusters change as nodes go offline, containers spin up and down, or load balancers redirect traffic between instances, conventional data sources like logs and network traffic are incapable of detecting anomalies.
Runtime security in the cloud-native environment works on a deeper level, establishing a dynamic baseline by interpreting how behavioral trends vary over time. From here, runtime security tools can detect changes in internal container processes, file system activity, and so on, that deviate from the norm — even within environments that rapidly scale.
Put another way, runtime defense is the set of features that provide predictive and threat-based active protection for rapidly changing environments.
- Predictive protection includes capabilities like determining when a container creates an unexpected network socket or runs a process not included in the origin image.
- Threat-based protection includes capabilities like detecting when malware is added to a container or when a container connects to a botnet.
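The predictive checks described above can be sketched as a comparison between observed activity and what the origin image is expected to do. This is a minimal illustration, not any product's implementation; `ImageBaseline`, `predictive_findings`, and the sample process and port values are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageBaseline:
    """Hypothetical baseline of what a container's origin image is expected to do."""
    processes: frozenset
    listening_ports: frozenset


def predictive_findings(baseline, observed_processes, observed_ports):
    """Return findings for observed activity that falls outside the baseline."""
    findings = []
    for proc in sorted(set(observed_processes) - baseline.processes):
        findings.append(f"process not in origin image: {proc}")
    for port in sorted(set(observed_ports) - baseline.listening_ports):
        findings.append(f"unexpected network socket on port {port}")
    return findings


# Assumed sample data: an httpd image that should only listen on 80 and 443.
baseline = ImageBaseline(processes=frozenset({"httpd"}), listening_ports=frozenset({80, 443}))
findings = predictive_findings(baseline, ["httpd", "nc"], [80, 4444])
print(findings)
```

Threat-based protection would add a second source of truth, such as malware signatures or known botnet addresses, rather than relying only on the baseline.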
Models and Rules: Understanding Runtime Security
Runtime security focuses on safeguarding containers during execution, when they are active and most vulnerable to malicious activity. Traditional security tools weren’t designed to monitor running containers.
- The cloud-native runtime environment is unique.
- Runtime security is the only way to secure cloud-native applications at scale.
Using AI and machine learning, runtime security automates the process of modeling healthy activity.
Modeling refers to the process of creating a representation of normal, safe behavior for applications and services running in a cloud-native environment. This representation, or model, serves as a baseline to identify and detect deviations or anomalies that might indicate security threats.
By continuously monitoring and comparing the runtime activities of applications and services against the established model, teams can identify and respond to unauthorized actions, privilege escalations, and other potential incidents.
A runtime security solution like Prisma Cloud implements individual sensors for file system, network, and process activity, each with a unique set of rules and alerting. The unified runtime defense architecture simplifies the administrator experience and provides detail about what the solution learns from each image. Within this framework, runtime defense consists of two main object types — models and rules.
Container Models
Models are generated from the autonomous learning of a container runtime security solution and represent the allowed activities for a given container image across all runtime sensors. They offer administrators an overview of what the system has learned about their images. An Apache image model, for example, would specify the processes that should run within the container and the exposed network sockets.
Models are built from static analysis, like hashing process maps based on Dockerfile ENTRYPOINT scripts, and dynamic behavioral analysis, like observing actual process activity during early container runtime. Models can be in active, archived, or learning mode.
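As a rough sketch of how a model might combine both kinds of analysis, the example below seeds a model from an image's ENTRYPOINT and then extends it with activity observed early in the container's life. The `ContainerModel` class, its fields, and the event tuples are assumptions made for illustration; a real solution would hash process maps and track far more detail.

```python
from dataclasses import dataclass, field


@dataclass
class ContainerModel:
    """Hypothetical per-image model of allowed processes and listening ports."""
    image: str
    processes: set = field(default_factory=set)
    listening_ports: set = field(default_factory=set)

    def learn_static(self, entrypoint):
        """Seed the model with the executable named in the image's ENTRYPOINT."""
        if entrypoint:
            self.processes.add(entrypoint[0].rsplit("/", 1)[-1])

    def learn_dynamic(self, events):
        """Extend the model with (kind, value) events observed at runtime."""
        for kind, value in events:
            if kind == "process":
                self.processes.add(value)
            elif kind == "listen":
                self.listening_ports.add(value)


# Assumed sample input for an Apache image.
model = ContainerModel(image="httpd:2.4")
model.learn_static(["/usr/local/bin/httpd-foreground"])
model.learn_dynamic([("process", "httpd"), ("listen", 80)])
print(sorted(model.processes), sorted(model.listening_ports))
```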
Modeling Capabilities
Some containers, like Jenkins containers, are difficult to model due to their dynamic nature. A container runtime security solution can automatically detect known containers and enhance the model with capabilities, tuning runtime behaviors for specific apps and configurations without changing the learned model.
Learning Mode
Learning mode is when the container runtime security solution performs static or dynamic analysis. Images stay in learning mode for one hour, followed by a 24-hour "dry run" period to ensure model completeness. If behavioral changes are observed during the dry run, the model returns to learning mode for an additional 24 hours. During learning mode, only threat-based runtime events are logged.
Active Mode
Active mode is when the container runtime security solution enforces the model and looks for anomalies that violate it. Active mode begins after the learning mode's 1-hour period. The system monitors for variances against the model, such as unexpected processes.
Archived Mode
Archived mode occurs when a container no longer actively runs a model. Models persist in archived mode for 24 hours before removal. Archived mode serves as a recycle bin for models, ensuring that frequently starting and stopping images don't need to re-enter learning mode.
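The three modes can be read as a small lifecycle. The sketch below encodes only the timings stated above (a one-hour learning period and 24-hour archive retention) and deliberately leaves out the dry-run and relearning behavior. The function name, its parameters, and the `REMOVED` marker are hypothetical and exist only to make the example complete.

```python
from enum import Enum


class Mode(Enum):
    LEARNING = "learning"
    ACTIVE = "active"
    ARCHIVED = "archived"
    REMOVED = "removed"  # Hypothetical marker: the model has been deleted.


LEARNING_HOURS = 1            # Learning period stated in the text.
ARCHIVE_RETENTION_HOURS = 24  # How long an archived model persists.


def model_mode(hours_since_first_seen, running, hours_since_stopped=0.0):
    """Return the mode a model would be in under the simplified timings above."""
    if not running:
        if hours_since_stopped >= ARCHIVE_RETENTION_HOURS:
            return Mode.REMOVED
        return Mode.ARCHIVED
    if hours_since_first_seen < LEARNING_HOURS:
        return Mode.LEARNING
    return Mode.ACTIVE


print(model_mode(0.5, running=True))
print(model_mode(30, running=False, hours_since_stopped=3))
```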
Rules
Rules control how a container runtime security solution uses autonomously generated models to protect an environment. They allow or block activities by sensor and are evaluated together with models to create a resultant policy:
model + allowed activity from rule(s) - blocked activity from rule(s) = resultant policy
For example, if a model allows the httpd process and you want to ensure the bar process is allowed while the foo process is blocked, you can create a rule for all httpd images, add bar to the allowed process list, and add foo to the blocked process list.
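The formula maps directly onto set operations. The following sketch reproduces the httpd example for the process sensor only; `resultant_policy` is a hypothetical helper name, and real policies are evaluated per sensor with richer rule matching.

```python
def resultant_policy(model, allowed, blocked):
    """model + allowed activity from rules - blocked activity from rules."""
    return (set(model) | set(allowed)) - set(blocked)


# The model learned httpd; a rule allows bar and blocks foo.
policy = resultant_policy(model={"httpd"}, allowed={"bar"}, blocked={"foo"})
print(sorted(policy))  # ['bar', 'httpd']
```

Because the subtraction is applied last, a process that appears in both the allowed and blocked lists ends up blocked in this sketch.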
Via models and rules, a runtime protection solution automatically learns how applications behave under different conditions. Users can then distinguish normal shifts in application behavior from those that reflect a security problem.
Components of Container Runtime Security
Identifying new vulnerabilities in running containers relies on knowing what normal looks like — even in dynamic environments. With dozens of microservices to manage and hundreds of containers, serverless functions, and VMs hosting them, teams don’t have time to manually collect behavioral data and configure behavior models. Organizations must leverage enhanced runtime protection capable of identifying and investigating suspicious activities potentially indicating zero-day attacks.
Control Application Behavior
In addition to modeling safe behavior, runtime defenses should automatically define and enforce allowed and disallowed actions for each container, serverless function, or other object in the environment. This includes determining which other containers a given container can communicate with and the type of communication allowed, as well as specifying which data storage volumes it can access. Enforcing these rules is essential for limiting the impact of a potential security breach.
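One way to picture these controls is as default-deny allowlists for communication and storage. The sketch below is illustrative only; the container names, ports, volume names, and helper functions are all assumptions rather than part of any particular tool.

```python
# Hypothetical allowlists: (source, destination) -> permitted ports,
# and container -> volumes it may mount.
ALLOWED_FLOWS = {
    ("frontend", "api"): {8080},
    ("api", "db"): {5432},
}
ALLOWED_VOLUMES = {
    "api": {"config"},
    "db": {"pgdata"},
}


def flow_allowed(src, dst, port):
    """Default deny: a flow is permitted only if explicitly listed."""
    return port in ALLOWED_FLOWS.get((src, dst), set())


def volume_allowed(container, volume):
    """Default deny: a container may access only the volumes listed for it."""
    return volume in ALLOWED_VOLUMES.get(container, set())


print(flow_allowed("frontend", "api", 8080))  # True
print(flow_allowed("frontend", "db", 5432))   # False
print(volume_allowed("api", "pgdata"))        # False
```

With default deny, a compromised frontend container in this example cannot reach the database directly, which is the kind of impact limiting the paragraph above describes.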
Send Meaningful Alerts
Runtime security tools need to automate defenses and alert your team when manual intervention is required. To achieve this, they should monitor and send alerts for suspicious changes in processes, network connections, or file system read/writes within cloud-native infrastructure. They must also be able to decide whether to send an alert based on dynamic alert rules. Static alerting rules are insufficient for addressing the evolving nature of cloud-native threats, given that activity appearing threatening at one moment may prove benign at another.
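To illustrate the difference between a static threshold and a dynamic alert rule, the sketch below flags a value only when it deviates sharply from a rolling window of recent observations. It is a simplified stand-in for the statistical or machine learning techniques a real product might use; the class name, window size, threshold, and sample readings (imagined here as outbound connections per minute) are assumptions.

```python
from collections import deque
from statistics import mean, pstdev


class RollingBaseline:
    """Hypothetical dynamic alert rule based on a rolling mean and deviation."""

    def __init__(self, window=10, threshold=3.0, min_samples=5):
        self.samples = deque(maxlen=window)
        self.threshold = threshold
        self.min_samples = min_samples

    def observe(self, value):
        """Return True if value is anomalous against recent history, then record it."""
        anomalous = False
        if len(self.samples) >= self.min_samples:
            mu = mean(self.samples)
            # Floor the deviation so a flat history does not flag tiny changes.
            sigma = max(pstdev(self.samples), 1.0)
            anomalous = abs(value - mu) > self.threshold * sigma
        self.samples.append(value)
        return anomalous


baseline = RollingBaseline()
readings = [10, 11, 9, 10, 12, 11, 10, 60]
alerts = [r for r in readings if baseline.observe(r)]
print(alerts)  # [60]
```

Because the window keeps moving, a level of activity that triggers an alert today can become part of the baseline later, which a fixed threshold cannot express.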
Integrate with Other Security Solutions
Runtime security represents only one layer of defense that should exist within your organization’s cloud-native security tech stack. Particularly when working with highly distributed, containerized microservices, you’ll want your runtime protection to integrate with security solutions addressing the additional layers of your ecosystem.
- Host Operating System (OS)
- Cloud Network
- Container Orchestration
- Container Registry
- Container Security
Automated data security protections, access control, auditing tools, container image scanners, and so on, are equally important. Your runtime security solution must be able to integrate with other security tools to provide full depth and context for incidents, as well as an understanding of how a threat at one layer of your tech stack (like the runtime environment) impacts another (like data at rest).
Detect Incidents in Real Time
Although runtime security is capable of mitigating the impact of a breach after it occurs, your runtime solution will ideally allow you to find and remediate threats in real time, before they have an opportunity to escalate.
Limit the Blast Radius, Prevent the Breach
By delivering control over file systems, processes, and network activity for each container and serverless function, your runtime security solution should mitigate damage that could result if a security breach occurred within the environment. It should automatically model application-safe behavior and enforce rules that prevent dangerous activity on the container or host, ultimately preventing situations such as a compromised container executing processes that spread to other containers or the host.
Enable Incident Response
Incident response hinges on the data collected by your runtime security solution. By capturing and storing audit data for cloud-native applications, it provides teams with the information needed to understand what went wrong in the wake of an incident, even if the cloud-native environment no longer exists in its earlier form when the investigation occurs.
Best Practices for Optimal Runtime Security
Runtime security best practices serve to safeguard applications and infrastructure from runtime threats. By implementing proactive measures, organizations can minimize vulnerabilities, detect malicious activities, and limit the impact of security breaches.
End-to-End Runtime Coverage
Monitoring only part of your environment or focusing on only key services or infrastructure isn’t enough to detect all security threats. For optimum results, apply runtime security to all layers of your environment and use it to protect both development and production workloads.
Unique Resource Treatment
Because every host, container instance, and serverless function in your cloud-native environment has a unique configuration and behavior, you should model each object separately. Don’t assume all containers will behave the same, not even those based on a common container image. Operating from a sweeping assumption will lead to a sampled approach that limits visibility into security incidents.
System Call Monitoring and Filtering Techniques
At the core of container runtime security is the monitoring and filtering of system calls made by processes within containers. System calls act as an interface between applications and the operating system kernel, allowing applications to request resources or services. By monitoring and controlling these calls, organizations can detect and prevent unauthorized actions, privilege escalations, and other malicious activities.
Falco is an open-source runtime security tool that monitors system calls and network activity, detecting and alerting on suspicious behavior. Seccomp, a Linux kernel feature, filters and restricts the system calls a process can make, providing granular control over the actions of processes in containers.
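As a small illustration of system call filtering, the sketch below generates a seccomp profile in the JSON layout Docker accepts, denying every call by default and allowing only a named few. The helper name is hypothetical, and the four system calls are placeholders; a real workload needs a much longer allowlist, usually derived from observing the application.

```python
import json


def build_seccomp_profile(allowed_syscalls):
    """Build a default-deny seccomp profile that allows only the named calls."""
    return {
        "defaultAction": "SCMP_ACT_ERRNO",  # Calls not listed fail with an error.
        "syscalls": [
            {
                "names": sorted(set(allowed_syscalls)),
                "action": "SCMP_ACT_ALLOW",
            }
        ],
    }


# Illustrative allowlist only; far too small for a real application.
profile = build_seccomp_profile(["read", "write", "exit_group", "futex"])
print(json.dumps(profile, indent=2))
```

A profile saved this way can be passed to a container at start time, for example through Docker's `--security-opt seccomp=<file>` option.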
Comprehensive Vulnerability and Malware Scanning
Regularly scanning containers for known vulnerabilities and malware during runtime is essential in identifying and addressing security risks. Continuous scanning ensures that organizations can detect newly discovered vulnerabilities and take appropriate action to secure their container environments.
Employ a runtime scanning solution that can detect unknown vulnerabilities and malicious code execution. Additionally, consider integrating threat intelligence feeds to stay updated on the latest threats and vulnerabilities affecting container environments.
Advanced Network Segmentation and Traffic Monitoring
Incorporate advanced network segmentation and traffic monitoring techniques by utilizing tools like Cilium or Calico to enforce network policies and enable microsegmentation. Leverage service mesh technologies, such as Istio or Linkerd, to encrypt container-to-container communication and implement fine-grained access controls. Use network monitoring and analysis tools to capture and analyze container traffic, facilitating the detection of anomalies and potential security threats.
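Microsegmentation in Kubernetes is commonly expressed as NetworkPolicy objects, which network plugins such as Calico or Cilium enforce. The sketch below builds one such manifest as a Python dictionary and prints it as JSON. The namespace, the `app` label convention, and the function name are assumptions for the example.

```python
import json


def allow_ingress_from(namespace, target_app, client_app, port):
    """Build a NetworkPolicy that lets only client_app pods reach target_app on one TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": f"{target_app}-allow-{client_app}",
            "namespace": namespace,
        },
        "spec": {
            # Pods the policy applies to (assumes an "app" label is in use).
            "podSelector": {"matchLabels": {"app": target_app}},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [{"podSelector": {"matchLabels": {"app": client_app}}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


policy = allow_ingress_from("shop", target_app="db", client_app="api", port=5432)
print(json.dumps(policy, indent=2))
```

Once a policy selects a pod for ingress, traffic that no rule allows is dropped, so this single policy isolates the database pods from everything except the API pods.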
Compliance and Auditing
Implement and maintain compliance with the CIS Benchmarks for Docker, Kubernetes, and Linux, as well as external compliance regulations and custom requirements. Keep in mind that, by default, Kubernetes APIs offer various easy privilege escalation routes. In a multitenant cluster, using certain features can introduce instability, so proceed cautiously when deploying them.
Policy Engines
Policy engine management solutions like Kyverno and Open Policy Agent (OPA), or a CSPM like Prisma Cloud, help ensure that containers adhere to policies aligned with standards like PCI DSS, HIPAA, GDPR, ISO 27001:2013, and NIST. Custom policies can also be created to enforce organizational standards.
Policy use cases detect a myriad of activities, including account hijacking attempts, backdoor activity, network data exfiltration, unusual protocol use, and DDoS activity. Once a threat is detected, an alert is generated, notifying administrators of the issue so that they can respond quickly. Many policies map to the MITRE ATT&CK Enterprise IaaS Matrix, providing a comprehensive roadmap for securing your cloud assets.
Audit Checks
Implement a regular auditing process that scans all layers of your Kubernetes cluster and configurations to ensure they align with industry standards and best practices. Audits won’t necessarily detect threats in real time, but they will help you stay ahead of security problems or misconfigurations you may be overlooking that could give attackers an entry point to your cluster or applications.
Monitoring and Logging
Implementing monitoring and logging solutions for container activities enables organizations to detect and respond to security incidents in real time, mitigating potential threats and facilitating incident response. Tools like Grafana, Jaeger, Prisma Cloud, and Prometheus provide visibility into container performance and health, enabling proactive management. Key metrics include cluster state, node status, pod availability, memory, disk, and CPU utilization. Monitoring helps identify configuration issues and ensures that containers meet business needs.
Monitoring Level | Metrics | Description |
---|---|---|
Cluster | Cluster Nodes | Measure how many nodes are available, which helps determine the cloud resources required to run the cluster. |
Cluster | Cluster Pods | Measure how many pods are running to help determine if you have sufficient nodes available to handle your overall workload in the event of a node failure. |
Cluster | Resource Utilization | Measure the computing resources utilized by your nodes, including memory, CPU, bandwidth, and disk utilization. |
Pod | Container Metrics | Monitor network utilization, CPU, and memory usage. These metrics, held up to DevOps-prescribed maximum values, determine if pods are running as designed. |
Pod | Application Metrics | These metrics are application-specific and based on business use cases, for example, the number of concurrent users accessing the application, number of entries published or purged, user experience, etc. |
Pod | Kubernetes Scaling and Availability Metrics | By monitoring the orchestration tool and how it handles a specific pod, you can see the number of pod instances at a given moment (compared to the expected number). These metrics will provide health checks of pods and applications, network data, and in-progress deployments. |
Table 6: Strategic runtime metrics
With metrics, teams can understand whether microservices or individual container-based applications are running as expected and meeting desired business needs through scale-out or scale-in automation and analytics based on expected traffic.
Reviewing metrics also proves beneficial when considering horizontal scale-out approaches for container-based applications, microservices, and security-based products like Palo Alto Networks CN-Series firewalls. Having an effective monitoring strategy in place ensures higher uptime for services with minimal degradation and performance issues.
Additionally, understanding resource consumption, service configurations, and usage helps reduce operational and development costs. This insight can assist in daily operations efforts and gauging CI/CD pipeline health.
When selecting a monitoring and logging solution, keep in mind the metrics you’d like to observe. Many tools have the capacity to address a range of reporting for a multitude of applications and integrations.
Monitoring Area | What to Monitor | Questions to Answer |
---|---|---|
Monitoring Kubernetes Clusters and Nodes | Cluster resource usage | Is the cluster infrastructure underutilized? Is the cluster infrastructure over capacity? |
Monitoring Kubernetes Clusters and Nodes | Project and team chargeback | |
Monitoring Kubernetes Clusters and Nodes | Node availability and health | Do we have enough nodes available to replicate the applications? Will we run out of resources? |
Monitoring Kubernetes Deployments and Pods | Missing and failed pods | Are all the necessary pods running for each of the applications or microservices? How many pods are dead or crashing? |
Monitoring Kubernetes Deployments and Pods | Running vs. desired instances | How many instances of each microservice are actually ready? What is the expected number of microservices meant to be ready? |
Monitoring Kubernetes Deployments and Pods | Pod resource usage against requests and limits | Is the pod's resource usage within the configured CPU and memory requests and limits? |
Monitoring Kubernetes Applications | Application availability | Is the application responding? |
Monitoring Kubernetes Applications | Application health and performance | How many requests are we seeing? What is the responsiveness or latency for this application? Do we have any errors? |
Table 7: Additional metrics to consider, depending on use cases aligning with your organizational needs.
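Two of the questions in the table, running versus desired instances and resource usage against limits, can be answered with simple comparisons once the metrics are collected. The sketch below works on plain dictionaries rather than a live cluster; the field names, the 90 percent memory threshold, and the sample values are assumptions for illustration.

```python
def deployment_findings(deployments, memory_ratio=0.9):
    """Flag deployments with too few ready instances or memory near its limit."""
    findings = []
    for d in deployments:
        if d["ready"] < d["desired"]:
            findings.append(f"{d['name']}: {d['ready']}/{d['desired']} instances ready")
        if d["memory_used_mib"] > d["memory_limit_mib"] * memory_ratio:
            findings.append(f"{d['name']}: memory above 90% of limit")
    return findings


# Hypothetical metrics, as they might be exported by a monitoring tool.
sample = [
    {"name": "checkout", "ready": 2, "desired": 3, "memory_used_mib": 200, "memory_limit_mib": 512},
    {"name": "search", "ready": 2, "desired": 2, "memory_used_mib": 480, "memory_limit_mib": 512},
]
report = deployment_findings(sample)
print(report)
```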
Incident Response and Forensics
In the event of a security incident, container runtime security tools can provide valuable data for investigation and remediation. This includes logs, system calls, and other forensic evidence that can help to identify the source of the attack and prevent future occurrences.
Container Escape Prevention
Container escape is a significant threat during runtime. It occurs when an attacker breaches a container's isolation, accessing the host system. Preventing this requires minimizing container privileges and avoiding critical mount points. Following best practices like CIS benchmarks for Docker and Kubernetes is essential.
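A pre-deployment check for the most common escape enablers might look like the sketch below, which inspects the `spec` section of a Kubernetes Pod manifest for privileged containers, shared host namespaces, and sensitive host path mounts. The list of sensitive paths is illustrative rather than exhaustive, and `escape_risks` is a hypothetical helper, not part of any benchmark tooling.

```python
# Illustrative, not exhaustive: host paths that are risky to mount into a container.
SENSITIVE_HOST_PATHS = {"/", "/etc", "/proc", "/var/run/docker.sock"}


def escape_risks(pod_spec):
    """Return settings in a Pod spec that weaken container isolation."""
    risks = []
    if pod_spec.get("hostPID"):
        risks.append("pod shares the host PID namespace")
    if pod_spec.get("hostNetwork"):
        risks.append("pod shares the host network namespace")
    for volume in pod_spec.get("volumes", []):
        path = volume.get("hostPath", {}).get("path")
        if path in SENSITIVE_HOST_PATHS:
            risks.append(f"volume {volume['name']} mounts host path {path}")
    for container in pod_spec.get("containers", []):
        context = container.get("securityContext", {})
        if context.get("privileged"):
            risks.append(f"container {container['name']} runs privileged")
    return risks


# Hypothetical Pod spec with three isolation problems.
pod_spec = {
    "hostPID": True,
    "volumes": [{"name": "sock", "hostPath": {"path": "/var/run/docker.sock"}}],
    "containers": [
        {"name": "builder", "securityContext": {"privileged": True}},
        {"name": "app", "securityContext": {"runAsNonRoot": True}},
    ],
}
risks = escape_risks(pod_spec)
print(risks)
```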
Adopt a Defense-in-Depth Strategy
A defense-in-depth approach to container security implements multiple layers of protection, including runtime security, image scanning, network segmentation, and host security, and helps organizations build a resilient security posture.
For instance, in addition to container network security via containerized next-generation firewalls, container runtime protection can serve as another layer of security to block malware. Runtime protection can also incorporate web application and API security to prevent HTTP-based Layer 7 attacks, such as those in the OWASP Top 10, denial-of-service (DoS) attacks, or malicious bots.
Adopt a Holistic Approach
Container security must be addressed as part of a holistic enterprise cloud security strategy. While it’s tempting to add yet another security tool to the arsenal, addressing container and cloud security separately tends to leave organizations blind to risks that an otherwise integrated strategy would address. Mature organizations see containers as an essential component of their cloud infrastructure and address them with a centralized platform approach, typically leveraging a CNAPP.
If your security team is reactively focused on securing your applications during runtime, take a step back and consider the entire development and deployment process. While it's crucial to ensure the end state (runtime) is secure, concentrating solely on runtime security may cause you to overlook vulnerabilities or early-stage security issues that will likely repeat with a narrow approach.
By working backward, you can evaluate and address security concerns throughout the entire development lifecycle, from design and coding to testing and deployment. The holistic strategy will help you identify and fix issues before they become problems in the runtime environment, reducing the chances of repeating the same security issues.
At-a-Glance Runtime Security Checklist
- Continuously scan applications for real-time threat detection to prevent or arrest attacks.
- Utilize a container runtime security solution to scan containers for known vulnerabilities and provide remediation recommendations.
- Monitor container behavior for abnormal activity.
- Run containers with low-privilege users, following the principle of least privilege.
- Study data from logs, system calls, and other forensics to identify the source of an attack and prevent future occurrences.
- Integrate container security solutions into CI/CD pipelines.
- Stay up to date with the latest threats and vulnerabilities via continuous monitoring and integration of threat intelligence feeds.