Cloud computing underpins the next wave of digital transformation for organizations. Although most organizations have a preferred cloud provider, they have started to deploy applications and workloads across multiple clouds for the following reasons:
- Avoiding vendor lock-in
- Best-of-breed feature sets from different providers
- Competitive pricing
- Business continuity and disaster recovery
- Application performance and operation compliance across different geographies
- Mergers and acquisitions
Multi-cloud adoption comes with its own set of operational challenges, though. With wide-ranging and disparate data sets originating from remote users, IoT devices, campuses, and intra-cloud and inter-cloud endpoints, managing and monitoring traffic from all these sources quickly becomes complex. In this blog, we briefly describe why network visibility is critical to day-2 operations and how Alkira addresses some of the challenges.
Multi-Cloud Visibility and Monitoring
To deliver the best customer experience, IT teams have to proactively detect and mitigate hot spots and blind spots in their multi-cloud footprint. This is easier said than done: cloud providers expose vast numbers of data points through various visibility tools and in different formats, and it is not trivial to munge them in order to pinpoint infrastructure failures, application performance issues or security exposures, to name a few. It is truly the network’s needle in a haystack.
With robust microservice designs and Infrastructure as Code deployments, cloud-native applications have become flexible and agile, but also complex. An estimated 70% to 80% of cloud traffic is expected to be East-West, and monitoring it today often requires installing third-party agents in every workload. The agents come in different flavors for different platforms (AWS/Azure/GCP) and operating systems, which adds further to the infrastructure complexity.
A robust cloud monitoring platform must provide a unified, holistic picture of the entire multi-cloud and hybrid cloud environment through a single pane of glass for both viewing and management.
The Alkira portal is an enterprise-grade solution that shines a light on your multi-cloud platform with the following key network insights:
- Network connectivity status
- Application and network usage
- BGP health and route visualization
- Cloud inventory management
- Policy-driven metrics
Let’s dive into each of them and their associated benefits.
Network Connectivity Status
To effectively troubleshoot packet drops and latency issues in a multi-cloud environment, the status of network endpoints must be known at all times. The endpoint status can be obtained in one of two ways:
- Periodically collect device stats, error metrics and logs from every node in the network and store all this information in a database. Then process the data for network anomalies at regular intervals and take corrective actions as needed.
- Actively monitor every network endpoint with synthesized probes and update the state based on each probe’s outcome. Unlike the previous method, which requires a push or a pull at preset time intervals, this approach updates the network state in real time.
At Alkira, we took the second option. Once probes detect any network endpoint inconsistencies, alerts are generated and published. These alerts can be consumed using REST APIs, and remedial actions can be taken at the application level to avoid downtime.
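To make the probe-driven approach concrete, here is a minimal sketch of how endpoint state could be updated as each probe result arrives, with an alert raised only when the state changes. It is an illustration under stated assumptions, not Alkira's implementation: the `ProbeMonitor` class, the three-miss threshold and the endpoint name are all hypothetical, and the probe itself is passed in as a plain function so the example runs without a network.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ProbeMonitor:
    """Tracks endpoint state from probe outcomes and records alerts on change."""

    probe: Callable[[str], bool]      # hypothetical: True if the endpoint answered
    failures_to_mark_down: int = 3    # assumed threshold, not taken from the article
    state: Dict[str, str] = field(default_factory=dict)
    misses: Dict[str, int] = field(default_factory=dict)
    alerts: List[dict] = field(default_factory=list)

    def check(self, endpoint: str) -> str:
        """Run one probe and update the endpoint state immediately."""
        if self.probe(endpoint):
            self.misses[endpoint] = 0
            new_state = "up"
        else:
            self.misses[endpoint] = self.misses.get(endpoint, 0) + 1
            is_down = self.misses[endpoint] >= self.failures_to_mark_down
            new_state = "down" if is_down else self.state.get(endpoint, "up")

        old_state = self.state.get(endpoint)
        self.state[endpoint] = new_state
        # Alert only on a transition, so consumers see changes rather than noise.
        if old_state is not None and old_state != new_state:
            self.alerts.append(
                {"endpoint": endpoint, "from": old_state, "to": new_state}
            )
        return new_state


# Scripted probe results stand in for real network probes.
outcomes = iter([True, False, False, False, True])
monitor = ProbeMonitor(probe=lambda endpoint: next(outcomes))
states = [monitor.check("aws-us-east-connector") for _ in range(5)]
print(states)          # ['up', 'up', 'up', 'down', 'up']
print(monitor.alerts)  # one up-to-down alert, then one down-to-up alert
```

Requiring several consecutive misses before declaring an endpoint down is one common way to avoid alerting on a single lost probe; the right threshold depends on probe frequency and how quickly failures must be surfaced.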
Application and Network Usage
The Alkira solution provides network and application-level stats across all cloud providers in all regions. This data provides critical context for understanding performance issues across the entire network. Deep packet inspection is performed on all flows, and granular metrics (bandwidth, latency, drops) are collected and displayed for all endpoints, illuminating congestion hotspots in the cloud fabric. The intuitive view helps determine whether degraded performance is unique to an application, a user or a location, so corrective measures can be taken accordingly.
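As a rough illustration of how per-flow metrics can narrow a problem down to an application, a user or a location, the sketch below groups latency samples by each dimension and picks the one with the widest gap between its best and worst group. The flow records, field names and the gap heuristic are assumptions made for this example; they do not describe how the Alkira portal computes its views.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical flow records; a real system would collect far more samples.
flows = [
    {"app": "crm", "user": "alice", "site": "nyc", "latency_ms": 40},
    {"app": "crm", "user": "bob", "site": "lon", "latency_ms": 200},
    {"app": "mail", "user": "alice", "site": "lon", "latency_ms": 210},
    {"app": "mail", "user": "bob", "site": "nyc", "latency_ms": 50},
]


def mean_latency_by(flows, dimension):
    """Average latency for each distinct value of the given dimension."""
    buckets = defaultdict(list)
    for flow in flows:
        buckets[flow[dimension]].append(flow["latency_ms"])
    return {key: mean(values) for key, values in buckets.items()}


def spread(flows, dimension):
    """Gap between the slowest and fastest group along one dimension."""
    averages = mean_latency_by(flows, dimension)
    return max(averages.values()) - min(averages.values())


def most_likely_dimension(flows, dimensions=("app", "user", "site")):
    """Crude heuristic: the dimension with the largest gap explains the most."""
    return max(dimensions, key=lambda dimension: spread(flows, dimension))


print(mean_latency_by(flows, "site"))  # {'nyc': 45, 'lon': 205}
print(most_likely_dimension(flows))    # site
```

In this sample the applications and users look nearly identical on average, while the two sites differ by 160 ms, which points at the location rather than at any one application or user.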
Shadow IT refers to resources that employees routinely use without the knowledge of enterprise IT teams. These unauthorized resources pose serious security threats to organizations if sensitive data is unwittingly exposed to “bad actors”. For IT and cloud admins, it is vital that this dark data (unclassified and unknown) be brought under IT control so that it complies with organizational connectivity and security policies. With Alkira, customers obtain an extensive view of all applications in their cloud environments. If a data source or an application is not known, appropriate security and compliance policies can be swiftly put in place to minimize risks and disruptions.
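Conceptually, surfacing shadow IT comes down to comparing what is observed on the network with what IT has approved. The sketch below shows that comparison in its simplest form; the application names and the shape of the allow-list are hypothetical.

```python
# Hypothetical allow-list maintained by the IT team.
sanctioned_apps = {"salesforce", "office365", "github"}

# Hypothetical application names identified from observed flows.
observed_apps = ["salesforce", "personal-file-share", "github", "unknown-chat"]


def find_unsanctioned(observed, sanctioned):
    """Return observed applications that are missing from the allow-list."""
    return sorted(set(observed) - set(sanctioned))


print(find_unsanctioned(observed_apps, sanctioned_apps))
# ['personal-file-share', 'unknown-chat']
```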
BGP Health and Route Visualization
As enterprises migrate towards a hybrid cloud environment, assigning and managing IP addresses for the various segments becomes an integral part of the network architecture. Having a concise, cohesive and clear view of all the existing IP routes goes a long way towards managing a robust network. The route visualization dashboard in Alkira’s portal gives a summary of received and advertised routes at the global, regional and individual connector levels.
The solution also constantly monitors the health of BGP sessions on all nodes. If sessions go down, alerts are raised immediately and the relevant routes no longer show in the route visualization dashboard.
Route conflicts are detected automatically, because a misconfigured network can lead to route loops and traffic blackholing that wreak havoc on the network. If the same routes are learnt from two different endpoints with the same preference, a high-priority alert is generated and the routes are invalidated. If the same routes are learnt with different preferences, the best path is chosen based on the preference.
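The conflict rule can be sketched in a few lines. This is a simplified illustration rather than the product's logic: the `resolve_routes` function, the tuple layout and the endpoint names are hypothetical, and the sketch assumes that a higher preference value wins, which the text does not specify.

```python
from collections import defaultdict
from ipaddress import ip_network


def resolve_routes(advertisements):
    """Pick a best endpoint per prefix, or flag a conflict on a preference tie.

    advertisements: iterable of (prefix, endpoint, preference) tuples.
    Assumption: a higher preference value is better.
    """
    by_prefix = defaultdict(list)
    for prefix, endpoint, preference in advertisements:
        by_prefix[ip_network(prefix)].append((preference, endpoint))

    best, alerts = {}, []
    for prefix, candidates in by_prefix.items():
        top = max(preference for preference, _ in candidates)
        winners = sorted({ep for preference, ep in candidates if preference == top})
        if len(winners) > 1:
            # Same prefix, same preference, different endpoints: invalidate.
            alerts.append(
                {"prefix": str(prefix), "endpoints": winners, "severity": "high"}
            )
        else:
            best[str(prefix)] = winners[0]
    return best, alerts


advertisements = [
    ("10.0.0.0/16", "aws-vpc", 100),
    ("10.0.0.0/16", "azure-vnet", 100),  # tie with aws-vpc: conflict
    ("10.1.0.0/16", "aws-vpc", 200),
    ("10.1.0.0/16", "on-prem", 100),     # lower preference: loses
]
best, alerts = resolve_routes(advertisements)
print(best)    # {'10.1.0.0/16': 'aws-vpc'}
print(alerts)  # one high-severity alert for 10.0.0.0/16
```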
Cloud Inventory Management
The Alkira solution is consumed as a service, which means there are no agents or gateways running in the customer VPCs/VNets. At the same time, we have extended this holistic visibility into customer accounts as well. With some simple read permissions, the Alkira solution offers deep insights into the following elements of customers’ environments:
- Deployed cloud resources
- Connectivity between various resources
- Security gaps for the various resources
- Resource utilization and limits
For example, here is a snapshot of what the Alkira portal reports:
- Conflicting CIDRs across various cloud providers
- Virtual networks that access the internet without traversing the Alkira cloud backbone
- Elastic IPs created but never assigned
- VPN connections down for an extended period of time
- VGWs with no VPN connections
- IGWs or VGWs not associated with any VPC
- Security Groups with allow_all access (all IPs and all ports)
Apart from identifying security gaps, the portal minimizes cloud wastage by tagging resources that add no value to the customer network.
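Several of these findings reduce to simple checks once the inventory has been read. The sketch below shows three of them: overlapping CIDRs, unassigned elastic IPs and allow-all security groups. The dictionaries are a made-up inventory format for illustration only; a real implementation would read these details through each cloud provider's APIs.

```python
from ipaddress import ip_network
from itertools import combinations


def conflicting_cidrs(networks):
    """Return pairs of network names whose address ranges overlap."""
    parsed = {name: ip_network(cidr) for name, cidr in networks.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(parsed), 2)
        if parsed[a].overlaps(parsed[b])
    ]


def unassigned_ips(elastic_ips):
    """Return addresses that were allocated but never attached to anything."""
    return [ip["address"] for ip in elastic_ips if ip.get("attached_to") is None]


def allow_all_groups(security_groups):
    """Return groups with at least one rule open to all IPs on all ports."""
    return [
        group["name"]
        for group in security_groups
        if any(
            rule["cidr"] == "0.0.0.0/0" and rule["ports"] == "all"
            for rule in group["ingress"]
        )
    ]


# Hypothetical inventory data.
networks = {
    "aws-prod": "10.0.0.0/16",
    "azure-prod": "10.0.128.0/17",  # falls inside aws-prod
    "gcp-dev": "10.1.0.0/16",
}
elastic_ips = [
    {"address": "203.0.113.10", "attached_to": "web-1"},
    {"address": "203.0.113.11", "attached_to": None},
]
security_groups = [
    {"name": "web", "ingress": [{"cidr": "0.0.0.0/0", "ports": "443"}]},
    {"name": "legacy", "ingress": [{"cidr": "0.0.0.0/0", "ports": "all"}]},
]

print(conflicting_cidrs(networks))        # [('aws-prod', 'azure-prod')]
print(unassigned_ips(elastic_ips))        # ['203.0.113.11']
print(allow_all_groups(security_groups))  # ['legacy']
```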
Metrics for Policy Enforcement
To implement and enforce effective policies, it is critical to compare the traffic profile of each application with that of every other application in the cloud fabric. For example, do streaming media applications take considerably more bandwidth than critical SaaS and enterprise applications? With Alkira’s top application chart, this information is readily available, and policies can be created so that important traffic profiles are never starved. The same applies to endpoints: the top talkers chart identifies the data sources that send the most traffic, and policies can be applied to meter those endpoints if they are not deemed essential.
If robust network connectivity and security form the foundation of a stable multi-cloud network, intuitive visibility is an essential scaffold. Alkira’s clear, comprehensive and cohesive visibility dashboards, coupled with a rich set of APIs, provide fertile ground for innovation, where customers are no longer blindfolded in their cloud deployments.
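Both charts are, at heart, the same aggregation over flow records: sum the bytes per application or per source, then rank. Here is a minimal sketch, assuming hypothetical flow records with `app`, `src` and `bytes` fields.

```python
from collections import Counter

# Hypothetical flow records.
flow_records = [
    {"app": "video", "src": "10.0.1.5", "bytes": 900},
    {"app": "crm", "src": "10.0.2.7", "bytes": 200},
    {"app": "video", "src": "10.0.1.9", "bytes": 600},
    {"app": "crm", "src": "10.0.1.5", "bytes": 100},
]


def top_n(records, key, n=3):
    """Rank the values of `key` by total bytes, largest first."""
    totals = Counter()
    for record in records:
        totals[record[key]] += record["bytes"]
    return totals.most_common(n)


print(top_n(flow_records, "app"))  # [('video', 1500), ('crm', 300)]
print(top_n(flow_records, "src"))  # [('10.0.1.5', 1000), ('10.0.1.9', 600), ('10.0.2.7', 200)]
```

Ranked totals like these are what a metering policy would be based on: here, streaming video accounts for five times the volume of the CRM application.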
Fast-track your network into the cloud era by taking a 30-minute challenge.