While speed and effectiveness are crucial in cybersecurity, operational resilience isn't just an option—it's essential in today's dynamic digital environment. Security outages can result in significant financial losses, reputational damage and regulatory repercussions.
Understanding the dual need to respond rapidly and effectively to new threats while maintaining uptime, we have built optimized update mechanisms that align with the specific requirements of our services. This multifaceted, five-stage approach ensures robust attack prevention, minimizes operational risks, and gives customers flexibility and control over how and when to implement updates.
Our Commitment to Operational Resilience
At Palo Alto Networks, we’re deeply committed to operational resilience, which is foundational to how we innovate and secure our customers. Our approach with Prisma SASE combines advanced infrastructure resiliency and rigorous testing with continuous monitoring and remediation to achieve near-perfect uptime. Our service level agreement (SLA) reflects this commitment with an industry-leading 99.999% service availability.
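To put the 99.999% figure in concrete terms, the short sketch below converts an availability percentage into a yearly downtime budget. It is plain arithmetic rather than a statement of how the SLA is measured: the function name is hypothetical, and the one-year measurement window is an assumption (an SLA may be measured over a different period).

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year (assumed window)


def downtime_budget_minutes(availability_percent: float,
                            period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return the maximum downtime, in minutes, that an availability target allows."""
    unavailable_fraction = 1 - availability_percent / 100
    return period_minutes * unavailable_fraction


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        budget = downtime_budget_minutes(target)
        print(f"{target}% availability -> {budget:.2f} minutes of downtime per year")
```

Under these assumptions, a 99.999% target leaves a budget of roughly 5.26 minutes of downtime per year, compared with about 52.56 minutes at 99.99%.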
1. Infrastructure Resiliency
Prisma SASE resilience is built on a robust architecture composed of three primary components:
- Dataplane: A dedicated dataplane per customer ensures that user-to-application data traffic is performant and secure for each specific organization. This isolation is crucial: any harmful activity in one customer's environment, intentional or otherwise, doesn’t impact other customers. In contrast, shared dataplane models provided by other vendors can share IP addresses among multiple customers, so if one customer suffers a breach, applications may deny-list those IPs for every customer that uses them.
- Control Plane: The control plane orchestrates the dataplane and other services, enabling seamless management and operational control.
- Services: This includes a suite of services, such as Strata Cloud Manager (SCM), Autonomous Digital Experience Management (ADEM), Strata™ Logging Service (SLS), Cloud Identity Engine (CIE), and the Palo Alto Networks advanced cloud-delivered security services such as Advanced DNS Security, Advanced Threat Protection, Advanced URL Filtering (AURL), Advanced WildFire® (AWF), DLP, SaaS Security and more.
We’ve designed each of the above components with built-in redundancy, leveraging zonal, regional, and multicloud strategies to ensure uninterrupted access and functionality. This robust redundancy safeguards your operations against potential disruptions, ensuring continuous access to applications. It also guarantees seamless policy management and security monitoring, which allows adjustments to the configurations and capabilities of your security services within the Strata Cloud Manager (SCM).
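As a rough illustration of why layered redundancy matters, the sketch below computes the combined availability of several redundant copies of a component. It assumes failures are fully independent and that any single healthy copy can serve traffic, which is a textbook simplification. The function name and the input figures are hypothetical and are not Prisma SASE measurements.

```python
def combined_availability(single: float, replicas: int) -> float:
    """Availability of `replicas` independent copies where any one copy suffices.

    `single` is the availability of one copy as a fraction (e.g. 0.99).
    Assumes independent failures, which real zones and regions only approximate.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be a fraction between 0 and 1")
    # The system is down only when every copy is down at the same time.
    return 1 - (1 - single) ** replicas


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} replica(s) at 99% each -> {combined_availability(0.99, n):.6f}")
```

Correlated failures, such as a shared dependency, reduce the benefit in practice. That is one reason to spread redundancy across zones, regions and cloud providers rather than within a single location.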
2. Controlled Upgrades Management
At Palo Alto Networks, we employ a meticulously controlled, phased-release process for software and content updates. Our three-stage approach, described below, helps identify and address any potential issues long before they can affect our customers’ operations.
First, our controlled upgrade process begins with a robust quality assurance phase conducted internally in our Palo Alto Networks testing environments. We subject each upgrade to continuous regression and automation validation, leveraging solution test beds that simulate a wide range of customer deployments and verticals at scale. This extensive quality assurance allows us to validate the stability and performance of upgrades under diverse conditions, reflecting real-world scenarios our customers might encounter.
We then move into the second validation phase, where internal users get access to the upgraded environment in progressively larger groups for careful evaluation. This stage starts with select groups of our sales engineering teams, then expands to include a pool of senior leaders and executives, and finally rolls out to all of our 16,000+ employees around the world. By rigorously validating upgrades internally across different groups, we can detect and remediate issues early, preventing disruptions from ever reaching our broader customer base.
Once our internal validation is complete, we proceed with a phased rollout to our customers, employing a canary upgrade strategy to manage risk and ensure stability.
For dataplane upgrades, this means initially upgrading a single region for each customer, allowing for controlled observation and rapid issue resolution. Customers also have the flexibility to select which location they prefer to upgrade first, offering them greater control over the process.
Similarly, we methodically phase control plane upgrades across multiple production environments, while service updates are rolled out in regional or tenant-based batches to maintain consistency and reliability. Palo Alto Networks closely monitors rollouts with the ability to roll back any updates based on deviations from behavioral baselines within each environment.
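The canary pattern described above can be sketched in a few lines: upgrade one stage at a time, compare an observed metric against a behavioral baseline, and roll everything back if it deviates too far. This is a minimal illustration under assumed names. `Stage`, `phased_rollout`, the callbacks, and the 10% tolerance are all hypothetical and do not represent Palo Alto Networks tooling.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    name: str            # e.g. "canary-region" (hypothetical label)
    targets: List[str]   # regions, environments or tenant batches in this stage


def deviates(baseline: float, observed: float, tolerance: float) -> bool:
    """True when the observed metric differs from the baseline by more than
    `tolerance`, expressed as a fraction of a positive baseline."""
    return abs(observed - baseline) > tolerance * baseline


def phased_rollout(stages: List[Stage],
                   upgrade: Callable[[str], None],
                   rollback: Callable[[str], None],
                   measure: Callable[[Stage], float],
                   baseline: float,
                   tolerance: float = 0.10) -> bool:
    """Upgrade stage by stage; roll back every upgraded target on deviation.

    Returns True if all stages completed, False if the rollout was reverted.
    """
    upgraded: List[str] = []
    for stage in stages:
        for target in stage.targets:
            upgrade(target)
            upgraded.append(target)
        observed = measure(stage)  # metric collected after the stage is upgraded
        if deviates(baseline, observed, tolerance):
            # Revert in reverse order so the most recent change is undone first.
            for target in reversed(upgraded):
                rollback(target)
            return False
    return True
```

Placing the smallest stage first limits how many targets are exposed if a problem appears, which is the main point of a canary strategy.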
3. Monitoring and Proactive Remediation
Proactive management is central to our strategy. We continuously monitor the health and performance of our data and control planes, as well as our hosted services.
For example, if the number of users increases unexpectedly (e.g., during a company all-hands meeting), we can autoscale up to meet the increased demand. Similarly, if we detect congestion on a link within our fabric, we can automatically transition traffic to a better-performing link.
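Both behaviors in this example reduce to simple decision rules, sketched below. The capacity figure, the congestion threshold, and the names `desired_instances`, `Link` and `pick_link` are assumptions made for illustration. The actual scaling and path-selection logic in the service is not described in this article.

```python
import math
from dataclasses import dataclass
from typing import List


def desired_instances(active_users: int,
                      users_per_instance: int,
                      min_instances: int = 1,
                      max_instances: int = 100) -> int:
    """Scale the instance count to the current user load, within fixed bounds."""
    if users_per_instance <= 0:
        raise ValueError("users_per_instance must be positive")
    needed = math.ceil(active_users / users_per_instance)
    return max(min_instances, min(max_instances, needed))


@dataclass
class Link:
    name: str
    latency_ms: float
    utilization: float  # fraction of capacity in use, 0.0 to 1.0


def pick_link(links: List[Link], congestion_threshold: float = 0.8) -> Link:
    """Prefer the lowest-latency link that is not congested.

    Falls back to the lowest-latency link overall if every link is congested.
    """
    if not links:
        raise ValueError("at least one link is required")
    uncongested = [link for link in links if link.utilization < congestion_threshold]
    candidates = uncongested or links
    return min(candidates, key=lambda link: link.latency_ms)
```

For instance, with an assumed capacity of 1,000 users per instance, a jump from 900 to 2,500 active users moves the target from one instance to three.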
We have a close partnership with our cloud providers, through which we receive real-time service status updates. If we detect any disruptions, our site reliability engineering (SRE) team works directly with those partners, enabling swift action to mitigate potential service impact.
Our commitment to always-on monitoring and remediation underpins our operational service resilience, allowing us to prevent issues before they impact our customers and to respond rapidly if they do.
4. Deployment Guidelines for Resilience
To further enhance resilience, Prisma SASE offers deployment guides that optimize the reliability and performance of our services. A key aspect of these guidelines is the implementation of multicloud redundancy for service connections, which ensures consistent and secure connectivity of our service to customer-owned resources and applications.
By deploying service connections with redundancy across multiple availability zones, regions and cloud providers, we provide an extra layer of protection against localized failures, minimizing the risk of service disruptions. We also recommend deploying connections in active/active or active/passive configurations depending on specific use cases and requirements.
This setup allows for dynamic failover capabilities, ensuring that your business operations remain unaffected in the event of an outage or degradation in service quality. These deployment strategies, combined with our robust architecture and proactive management, deliver a resilient and highly available SASE solution that supports your critical business needs.
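The difference between the two configurations can be shown with a small routing sketch. In active/passive mode, traffic uses the highest-priority healthy connection. In active/active mode, flows are spread across all healthy connections. The `Connection` type, the function names, and the modulo-based distribution are hypothetical simplifications, not the product's failover implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Connection:
    name: str            # e.g. a service connection in a given cloud or region
    healthy: bool = True


def route_active_passive(connections: List[Connection]) -> Connection:
    """Return the first healthy connection in priority order.

    The passive connection carries traffic only after the active one fails.
    """
    for connection in connections:
        if connection.healthy:
            return connection
    raise RuntimeError("no healthy service connection available")


def route_active_active(connections: List[Connection], flow_id: int) -> Connection:
    """Spread flows across all healthy connections.

    When one connection fails, its flows are redistributed to the rest.
    """
    healthy = [connection for connection in connections if connection.healthy]
    if not healthy:
        raise RuntimeError("no healthy service connection available")
    return healthy[flow_id % len(healthy)]
```

Active/active uses all available capacity during normal operation, while active/passive keeps steady-state traffic on one predictable path. Which one fits depends on the use case, as noted above.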
5. The Industry’s Only Resilient SASE with Additional Endpoint-based Secure Connectivity
Recognizing that endpoint disruptions can severely impact business continuity, Palo Alto Networks offers a unique advantage with our extended secure connection capabilities.
A standout feature of our SASE solution is Prisma Access Browser, which extends security and resilience to unmanaged devices. It provides consistent, secure connectivity to SaaS applications, private apps and the web for BYOD and contractor use, across any device and location.
Applying this capability to employee BYOD use cases ensures that even if a secure connection from a managed device is disrupted, employees can still securely access critical business applications such as Salesforce and Workday from any device, including personal ones.
Conclusion
Operational resilience is about ensuring that your business can continue to operate smoothly in the face of any challenge. At Palo Alto Networks, we’re dedicated to providing the highest levels of reliability, availability and serviceability across our Prisma SASE solution.
Through our layered approach to operational resilience—including infrastructure resiliency, controlled upgrades management, monitoring and proactive remediation, deployment guidelines for resilience, and additional endpoint-based secure connectivity—we deliver peace of mind and a competitive edge, empowering your organization to thrive in an unpredictable world.
Learn more about the operational resilience of your Palo Alto Networks SASE solution. Reach out to your Palo Alto Networks account representative to get started.