Can we talk about Service Mesh and multi-cloud?

By Charles Stucki | 01.31.19

Digital industry leaders have built themselves awesome software platforms, so application development and deployment (i.e., DevOps) is a massively strategic competency.

Agile teams code microservices using their preferred languages and tools and enable them with APIs. DevOps continuously integrates microservices and releases applications using automation, VMs or Linux containers, orchestration systems like Kubernetes and multiple clouds for IaaS.

A service mesh is a tool to manage the many interactions of microservices within an application. It does this over the top of the underlying packet-moving network infrastructure.

In parts 1 and 2 of my recent Service Mesh posts, I described six great leaps forward compared to SDN, which offers applications ways to configure the underlying network components.

In this post, I highlight one major Service Mesh attribute that may be a key limiter rather than a leap: a service mesh creates a flat mesh of point-to-point connections.

A flat mesh is like Southwest Airlines: no hubs. It’s effective and easy to use provided you don’t need to go very far. Here, going far means crossing domains.

Question 1: Why would anyone deploy an application across multiple domains?

Multiple domains include multiple:

  • Kubernetes clusters
  • VPCs in Amazon
  • Public and private clouds
  • Trust domains within an application 

At KubeCon, my former Cisco colleague Matt Caufield shared a thoughtful progression on the need for multiple clusters for the same application:

  • From 1 cluster to 2: Redundancy.
  • 2 to 10: Global user experience and data sovereignty.
  • 10 to 100: Highly distributed such as IoT Edge computing (see Matt’s use cases.)
  • 100 to 1000+: Scale-out of digital product, minimized latency, fine-grain privacy. 

The same applies to using multiple coherent units of cloud IaaS capacity, called VPCs at AWS and GCP or VNets at Azure.

Under what conditions would you deploy a single application to multiple public and private clouds? Examples include:

  • Enterprise: Web services are at AWS, ML continuous learning algorithms are at GCP, but the customer database remains in a private cloud.
  • SaaS vendor: We run on AWS but landed a big contract that requires keeping the customer’s authentication processes and data stores on Azure.
  • IoT solution vendor: The core application and ML modeling are at Azure, but our sensor data processing and inferences run on Edge data centers from Equinix and telcos.

Finally, why would you separate trust domains within the same application? Many companies have policies such as: externally reachable components never establish direct connections with key internal customer and financial data.
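A policy like this can be written down as a simple rule over trust domains. The sketch below is a hypothetical illustration, not any particular mesh's API: the service names, the domain names, and the `may_connect` helper are all assumptions invented for the example.

```python
# Hypothetical trust-domain policy check; all names are illustrative only.

# Map each service to the trust domain it runs in (assumed example data).
SERVICE_DOMAIN = {
    "web-frontend": "external",
    "order-api": "internal",
    "customer-db": "sensitive",
}

# Allowed (source domain, destination domain) pairs. The pair
# ("external", "sensitive") is deliberately absent: externally reachable
# components may never connect directly to sensitive data.
ALLOWED_FLOWS = {
    ("external", "internal"),
    ("internal", "internal"),
    ("internal", "sensitive"),
}


def may_connect(src: str, dst: str) -> bool:
    """Return True if the policy allows src to open a connection to dst."""
    return (SERVICE_DOMAIN[src], SERVICE_DOMAIN[dst]) in ALLOWED_FLOWS


if __name__ == "__main__":
    print(may_connect("web-frontend", "order-api"))    # external -> internal
    print(may_connect("web-frontend", "customer-db"))  # external -> sensitive
```

In this sketch the front end can still reach customer data, but only indirectly through an internal service, which is exactly the kind of boundary a trust-domain policy is meant to enforce.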

Bottom line: In cases like these, the microservices that make up the application must cross domain boundaries to communicate.

Question 2: What specific types of problems does a flat mesh face with multiple domains?

Security: You can secure a service mesh by allowing only connections initiated by the application and authorized by the orchestrator. But at least two categories of exposure remain:

  • Developer error. For example, a service is supposed to only read its data sources but is inadvertently also allowed to write, update, and delete them. Isolating the data source with role-based privileges, as well as non-flat flow patterns such as publish and subscribe, creates checks and balances on developer errors.
  • Compromised workload. In a flat mesh, for example, any workload can look up the IP addresses of other workloads. A bad actor who compromises one workload and obtains the addresses of others can attack those other resources directly. A non-flat network, as above, can obscure and isolate especially sensitive resources.
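To make the developer-error case concrete, here is a minimal sketch of a role-based privilege check enforced at the data source. It is a hypothetical illustration only: the roles, the service names, and the `is_permitted` helper are assumptions and do not come from any specific product.

```python
# Hypothetical role-based privilege check at a data source; illustrative only.

# Operations each role may perform (assumed roles).
ROLE_PERMISSIONS = {
    "reader": {"read"},
    "writer": {"read", "write", "update", "delete"},
}

# Role granted to each service by the data source's owner (assumed data).
SERVICE_ROLE = {
    "report-service": "reader",
    "order-service": "writer",
}


def is_permitted(service: str, operation: str) -> bool:
    """Check an operation against the role granted to the service.

    Services without a granted role get no permissions at all.
    """
    role = SERVICE_ROLE.get(service)
    return operation in ROLE_PERMISSIONS.get(role, set())


if __name__ == "__main__":
    print(is_permitted("report-service", "read"))    # within the reader role
    print(is_permitted("report-service", "delete"))  # outside the reader role
```

Because the check lives with the data source rather than in the calling service, a coding mistake in `report-service` cannot grant it write access on its own.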

Efficiency: A flat mesh assumes, as in a LAN, that the cost of communication is zero. In some important cases this is not a good assumption, including:

  • Long distance. Imagine 25 microservices in Europe all want to subscribe to the same telemetry stream from one in Seattle. Identical data crosses a high-cost link 25 times. In a non-flat network, a single stream of this telemetry goes to a network service in Europe where all users subscribe to it, incurring the transatlantic transit cost once.
  • Gateway traversal. Cross-domain connections traverse firewalls, address translation, VPNs, and other gateways. Each adds processing, latency, and potentially financial charges. Finding the most efficient path given business conditions, and ensuring traffic passes through the correct services such as deep packet inspection, goes beyond what a flat mesh can do.
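The long-distance example comes down to simple arithmetic, sketched below. The 25 subscribers come from the example above; the stream volume and the per-GB transit price are made-up values chosen only to make the comparison visible, and the function names are hypothetical.

```python
# Back-of-the-envelope comparison of long-distance link usage.
# Stream volume and transit price are assumed values, not measurements.


def link_crossings_flat(subscribers: int) -> int:
    """Flat mesh: every subscriber connects to the source directly, so the
    same stream crosses the long-distance link once per subscriber."""
    return subscribers


def link_crossings_relayed(subscribers: int) -> int:
    """Non-flat: one copy crosses the link to a regional relay, which fans
    it out locally, so crossings do not grow with the subscriber count."""
    return 1 if subscribers > 0 else 0


def transit_cost(crossings: int, gb_per_day: float, cost_per_gb: float) -> float:
    """Daily cost of carrying the stream across the long-distance link."""
    return crossings * gb_per_day * cost_per_gb


if __name__ == "__main__":
    subscribers = 25    # from the example above
    gb_per_day = 10.0   # assumed stream volume
    cost_per_gb = 0.5   # assumed transit price per GB

    flat = transit_cost(link_crossings_flat(subscribers), gb_per_day, cost_per_gb)
    relayed = transit_cost(link_crossings_relayed(subscribers), gb_per_day, cost_per_gb)
    print(f"flat: {flat}, relayed: {relayed}, ratio: {flat / relayed}")
```

Whatever the actual volume and price, the ratio between the two designs equals the number of remote subscribers, which is why the gap widens as an application scales out.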

Bottom line: While a flat network is a lot easier for the development organization to understand, in these and many other use cases the application’s production network simply should not be flat.

I welcome a healthy discussion on that conclusion. Please be assured that I am not advocating a retreat to SDN. But the conclusion does raise another question: How do you build a non-flat service mesh without losing its great-leap attributes? My colleagues and I at Bayware will explore this next.