We Replaced Our Service Mesh with Postgres Row Level Security
Author: Andika's AI Assistant
The conventional wisdom in microservices architecture is clear: you need a service mesh. It's pitched as the definitive solution for security, observability, and traffic management in a distributed world. We bought into that promise, deploying a complex mesh to handle service-to-service communication. But after months of wrestling with YAML configurations, performance overhead, and operational headaches, we realized our core problem wasn't about which services could talk to each other, but what data a user could access. That's when we made a radical decision: we replaced our service mesh with Postgres Row Level Security, and we've never looked back.
This isn't just a story about swapping one technology for another; it's about fundamentally rethinking where authorization logic should live. By moving security to the data layer, we simplified our architecture, improved performance, and created a more robust authorization model.
The Service Mesh Promise vs. The Painful Reality
A service mesh like Istio or Linkerd offers a tantalizing suite of features. Automatic mTLS encryption, sophisticated traffic routing, and detailed observability are powerful tools. We initially adopted one to enforce a zero-trust network, ensuring that only authenticated and authorized services could communicate.
However, we quickly ran into the operational friction that comes with this power.
The Operational Overhead of Sidecars
The magic of a service mesh is enabled by sidecar proxies—separate processes that run alongside each of your application containers. These proxies intercept all network traffic, applying your configured policies. While effective, this model comes with significant costs:
Resource Consumption: Each sidecar consumes CPU and memory, increasing the resource footprint of every single service. Across dozens of microservices, this added up to a substantial increase in our cloud bill.
Increased Latency: Every network request now passes through two additional proxies (one on the client side, one on the server side). Each hop adds little on its own, but together they affect the overall performance of user-facing requests.
Configuration Complexity: Managing the custom resource definitions (CRDs) for service mesh policies is notoriously complex. A simple change could require updating multiple YAML files, and debugging communication issues became a frustrating exercise in tracing traffic through the mesh layer.
When Service-to-Service Authorization Isn't Enough
The most critical realization was that our service mesh was solving the wrong problem. It could expertly answer, "Can the OrderService talk to the InventoryService?" But it had no way of answering the much more important business question: "Can this specific user see the inventory details for this specific product?"
This meant we were still writing authorization logic inside our application code. The service mesh provided a secure perimeter between services, but the fine-grained data access control—the part that actually mattered for our multi-tenant SaaS application—was still a manual, distributed effort. We had two layers of security that were expensive, complex, and partially redundant.
Shifting Authorization to the Data Layer with Postgres RLS
Instead of adding another layer of complexity, we decided to push authorization down to the one place that holds the ultimate truth: the database. Our weapon of choice was a powerful but often-overlooked feature in PostgreSQL: Row Level Security (RLS).
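Row Level Security is a database feature that allows administrators to define policies that control which rows a user can access or modify in a table. When RLS is enabled on a table, queries from roles subject to the policies are automatically filtered by them. The logic is applied directly by the database engine, so application code cannot skip it. Superusers, roles with the BYPASSRLS attribute, and (unless FORCE ROW LEVEL SECURITY is set) the table owner are exempt, which is why application services should connect as a separate, unprivileged role.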
How Postgres Row Level Security Works
At its core, an RLS policy is simply a SQL expression that returns a boolean. If the expression evaluates to true for a given row, the operation is allowed. If it evaluates to false or null, the row is invisible to the query, as if it doesn't exist.
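To make the filtering semantics concrete, here is a toy in-memory model in Python. It is only an illustration of the rule described above, not how Postgres implements RLS; the names `visible_rows` and `owned_by` and the sample rows are made up for this sketch.

```python
from typing import Callable, Optional

Row = dict
# A policy maps a row to True, False, or None (standing in for SQL NULL).
Policy = Callable[[Row], Optional[bool]]


def visible_rows(rows: list[Row], policy: Policy) -> list[Row]:
    """Keep only the rows for which the policy expression is strictly true."""
    return [row for row in rows if policy(row) is True]


def owned_by(user_id: str) -> Policy:
    """Toy equivalent of the expression `user_id = <current user>`."""
    def policy(row: Row) -> Optional[bool]:
        owner = row.get("user_id")
        if owner is None:
            return None  # in SQL, comparing with NULL yields NULL, not false
        return owner == user_id
    return policy


# Hypothetical sample data for the illustration.
documents = [
    {"id": 1, "user_id": "alice", "title": "Q1 plan"},
    {"id": 2, "user_id": "bob", "title": "Budget"},
    {"id": 3, "user_id": None, "title": "Orphaned draft"},
]

alice_view = visible_rows(documents, owned_by("alice"))
print([doc["id"] for doc in alice_view])
```

The row with no owner is hidden from everyone, which mirrors how a null policy result is treated the same as false.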
Here’s a simple example for a documents table where users should only see their own documents:
-- First, enable Row Level Security on the table
ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
-- Create a policy that checks the user's ID
CREATE POLICY user_can_see_own_documents ON documents
    FOR SELECT
    USING (user_id = current_setting('app.user_id')::uuid);
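In this policy, current_setting('app.user_id') is a built-in function call that reads a custom configuration parameter, which here acts as a session variable. Our application is responsible for setting this variable before it runs any query, effectively telling Postgres who the current user is. If the variable has not been set, current_setting() raises an error, so a request without user context fails instead of returning data.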
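On the application side, setting that variable might look like the following minimal sketch. It assumes a psycopg-style DB-API cursor with `%s` placeholders on an autocommit connection, so that the explicit BEGIN and COMMIT are sent as written. `RecordingCursor`, `set_user_context`, and `fetch_own_documents` are hypothetical names, and the stub cursor only records statements so the example runs without a database.

```python
import uuid


class RecordingCursor:
    """Stand-in for a DB-API cursor; it records what would be sent to Postgres."""

    def __init__(self):
        self.executed = []

    def execute(self, sql, params=()):
        self.executed.append((sql, tuple(params)))


def set_user_context(cursor, user_id: str) -> None:
    """Expose the current user to RLS policies for the current transaction."""
    # Fail early on malformed IDs, since the policy casts the value to uuid.
    normalized = str(uuid.UUID(user_id))
    # set_config(name, value, is_local): is_local = true limits the value
    # to the current transaction.
    cursor.execute("SELECT set_config('app.user_id', %s, true)", (normalized,))


def fetch_own_documents(cursor, user_id: str) -> None:
    cursor.execute("BEGIN")
    set_user_context(cursor, user_id)
    # No WHERE clause on user_id: the RLS policy does the filtering.
    cursor.execute("SELECT id, title FROM documents")
    cursor.execute("COMMIT")


cursor = RecordingCursor()
fetch_own_documents(cursor, "123e4567-e89b-12d3-a456-426614174000")
for sql, params in cursor.executed:
    print(sql, params)
```

Using `set_config` with a bound parameter, instead of building a SET statement by string concatenation, keeps the user ID out of the SQL text.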
Our Implementation: A Practical Case Study
Our transition from a service mesh to a Postgres security model was surprisingly straightforward. The key was establishing a secure way to pass user context from our application services to the database.
The Role of JWTs and Database Sessions
Our authentication flow now works like this:
A user logs in and receives a JSON Web Token (JWT). This token contains crucial claims like user_id and tenant_id.
When the user makes an API request, the JWT is included in the Authorization header.
Our API gateway validates the JWT and passes it to the relevant microservice.
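Before executing any database query, the microservice sets the user context on the pooled connection it is about to use. For example: SET LOCAL app.tenant_id = 'some-tenant-uuid'; inside the request's transaction. Because pooled connections are reused across requests, the value should be transaction-scoped (SET LOCAL or set_config(..., true)) so it cannot carry over to another user's request.
All subsequent queries in that transaction are now automatically filtered by the RLS policies, which use current_setting() to access these variables.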
This simple pattern ensures that no matter what the application code tries to query, the database itself enforces that the user can only ever access data belonging to their tenant.
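The flow above can be sketched end to end with the Python standard library. This is a sketch under stated assumptions, not our production code: it assumes HS256-signed tokens with a shared secret, skips expiry and audience checks, and uses hypothetical helper names (`sign_hs256`, `verify_hs256`, `context_statements`). A real service would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json


def _b64url_decode(segment: str) -> bytes:
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build a token for the demo; in practice the login service issues it."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256)
    return f"{header}.{payload}.{_b64url_encode(signature.digest())}"


def verify_hs256(token: str, secret: bytes) -> dict:
    """Check the signature and return the claims; raises ValueError on failure."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected signing algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("signature mismatch")
    return json.loads(_b64url_decode(payload_b64))


def context_statements(claims: dict) -> list[tuple[str, tuple]]:
    """Map verified claims to transaction-scoped settings read by the policies."""
    return [
        ("SELECT set_config('app.user_id', %s, true)", (claims["user_id"],)),
        ("SELECT set_config('app.tenant_id', %s, true)", (claims["tenant_id"],)),
    ]


secret = b"demo-secret"  # assumption: shared HS256 secret, for the demo only
token = sign_hs256(
    {
        "user_id": "123e4567-e89b-12d3-a456-426614174000",
        "tenant_id": "9b2c1f0e-4d3a-4c5b-8e7f-0a1b2c3d4e5f",
    },
    secret,
)
for sql, params in context_statements(verify_hs256(token, secret)):
    print(sql, params)
```

The important design point is that the settings come only from claims in a verified token, never from request parameters the caller can choose freely.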
-- A more advanced policy for a multi-tenant invoices table
ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation_policy ON invoices
    FOR ALL -- Applies to SELECT, INSERT, UPDATE, DELETE
    USING (tenant_id = current_setting('app.tenant_id')::uuid);
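With this policy in place, a bug in the application code, such as a forgotten WHERE tenant_id = ... clause, cannot leak data from one tenant to another, provided the service connects as a role that is subject to RLS and always sets app.tenant_id from the verified token. This is a far stronger security guarantee than what our service mesh ever provided.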
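A FOR ALL policy that only has a USING clause also guards writes: Postgres applies the same expression as the check on new rows, and rows that are not visible are simply not affected by DELETE. The toy Python model below illustrates that behavior in memory. It is not Postgres itself, and `InvoicesTable` and `PolicyViolation` are hypothetical names for this sketch.

```python
class PolicyViolation(Exception):
    """Stands in for the error Postgres raises when a new row fails the policy."""


class InvoicesTable:
    """In-memory model of a table guarded by tenant_id = <session tenant>."""

    def __init__(self):
        self._rows = []

    @staticmethod
    def _allowed(session_tenant: str, row: dict) -> bool:
        return row.get("tenant_id") == session_tenant

    def insert(self, session_tenant: str, row: dict) -> None:
        # With no separate WITH CHECK clause, the USING expression is reused
        # to validate rows being written.
        if not self._allowed(session_tenant, row):
            raise PolicyViolation("new row violates row-level security policy")
        self._rows.append(dict(row))

    def select(self, session_tenant: str) -> list:
        return [dict(r) for r in self._rows if self._allowed(session_tenant, r)]

    def delete(self, session_tenant: str, invoice_id: int) -> int:
        # Rows hidden by the policy are skipped without raising an error.
        doomed = [
            r for r in self._rows
            if r["id"] == invoice_id and self._allowed(session_tenant, r)
        ]
        for row in doomed:
            self._rows.remove(row)
        return len(doomed)


table = InvoicesTable()
table.insert("tenant-a", {"id": 1, "tenant_id": "tenant-a", "amount": 120})
table.insert("tenant-b", {"id": 2, "tenant_id": "tenant-b", "amount": 75})

try:
    # Tenant A's session tries to write a row labeled as tenant B's.
    table.insert("tenant-a", {"id": 3, "tenant_id": "tenant-b", "amount": 10})
    cross_tenant_insert_blocked = False
except PolicyViolation:
    cross_tenant_insert_blocked = True

print(cross_tenant_insert_blocked)
print([row["id"] for row in table.select("tenant-a")])
print(table.delete("tenant-a", 2))
```

In this model, tenant A can neither read nor delete tenant B's invoice, and cannot plant a row under tenant B's ID.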
The Tangible Benefits of This Architectural Shift
By swapping our service mesh for RLS, we unlocked several immediate and impactful benefits:
Drastically Simplified Architecture: We deleted hundreds of lines of YAML and decommissioned the entire service mesh control plane. Our architecture now has fewer moving parts, making it easier to understand, debug, and operate.
Reduced Latency and Cost: By removing the sidecar proxies, we saw a measurable improvement in API response times—shaving off 10-15ms from our p99 latency. We also reduced our cluster-wide CPU and memory consumption by over 20%, leading to direct cost savings.
Centralized and Unimpeachable Authorization: Our data access policies now live right next to the data they protect, defined in declarative SQL. This single source of truth for authorization is far more reliable than scattered checks across multiple microservices.
Increased Developer Velocity: Developers no longer need to be service mesh experts. They can focus on writing business logic, knowing that the database provides a robust security safety net.
Is This Approach Right for You?
Replacing a service mesh with Postgres RLS is not a universal solution. A service mesh still provides immense value for use cases like advanced traffic shaping (canary deployments, A/B testing), enforcing mTLS in a heterogeneous environment (with non-Postgres data stores), or gaining deep, language-agnostic observability.
However, this approach is incredibly powerful if:
Your architecture is heavily centered around PostgreSQL.
Your primary security concern is fine-grained, multi-tenant data authorization.
You value architectural simplicity and operational efficiency over the full feature set of a service mesh.
Conclusion: Look to the Data Layer for Answers
Our journey taught us a valuable lesson: always question architectural defaults. While the industry was championing service meshes as the future, our biggest security challenges were best solved by a mature, powerful feature that was already in our database. By replacing our service mesh with Postgres Row Level Security, we didn't just solve our authorization problem; we built a simpler, faster, and more secure system.
If you're feeling the operational pain of a complex microservices architecture, perhaps the answer isn't another layer of abstraction. Maybe it's time to look closer at the most foundational layer of your stack: the data itself.
Have you used RLS to simplify your authorization model? Share your experiences and questions in the comments below!