Stop Buying Vector Databases: The Case for the Unified Data Layer

Rahul

AI/ML Delivery Head, GYSP.tech

1 November 2024 · 8 min read

What you'll take away

The Hidden Cost of the Dedicated Vector DB
What Vectors Actually Are — And What You Actually Need
The Alternatives That Cover 80% of Cases
When a Dedicated Vector Database Is Actually the Right Answer
Validated Outcomes

The default RAG architecture decision for 2024 and 2025 has been: pick a vector database, embed your documents, store the vectors, retrieve by cosine similarity. Pinecone, Weaviate, Qdrant, Chroma, Milvus — the vector database ecosystem exploded to meet demand that outpaced critical evaluation of whether a dedicated vector store was actually the right tool for each use case.

For many production deployments, it is not. A dedicated vector database solves a specific problem — high-throughput, low-latency approximate nearest-neighbour search at billion-vector scale. Most enterprise RAG applications operate at thousands to millions of vectors, not billions, with query patterns that do not require the specialised indexing algorithms that dedicated vector databases are optimised for.
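To put "millions, not billions" in concrete terms, here is a rough back-of-the-envelope sizing sketch. It assumes float32 embeddings and a 1536-dimension model (both are illustrative assumptions, not figures from any specific deployment), and it counts raw vector bytes only, ignoring index and row overhead.

```python
BYTES_PER_FLOAT32 = 4  # assumption: embeddings stored as 32-bit floats


def raw_vector_bytes(num_vectors: int, dimensions: int) -> int:
    """Raw storage for the vectors alone, excluding index and row overhead."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32


def human(n_bytes: float) -> str:
    """Format a byte count using decimal units (1 GB = 10**9 bytes)."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if n_bytes < 1000 or unit == "TB":
            return f"{n_bytes:.1f} {unit}"
        n_bytes /= 1000


DIMENSIONS = 1536  # assumption: a common embedding size; use your model's value

for count in (10_000, 1_000_000, 1_000_000_000):
    print(f"{count:>13,} vectors -> {human(raw_vector_bytes(count, DIMENSIONS))}")
```

Under these assumptions, a million vectors is a few gigabytes, which sits comfortably on a single database server, while a billion runs to terabytes. That gap is where the architectural decision actually changes.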

The Hidden Cost of the Dedicated Vector DB

Every dedicated vector database you add is a new operational dependency: a new infrastructure component to maintain, a new failure mode to handle, a new cost line to manage, and — critically — a new data synchronisation problem. Your product content lives in PostgreSQL. Your vector embeddings live in Pinecone. When product content changes, you need a sync pipeline that keeps the two stores consistent. Sync pipelines fail, lag, and create subtle consistency bugs that are maddening to debug.

The synchronisation problem compounds with scale. In a monolithic RAG architecture where all knowledge base content flows into a single vector index, a document update requires delete-and-re-embed in the vector store, coordinated with the corresponding update in the source system. Without exactly-once semantics and careful transaction design, you will serve retrieval results that point to stale or deleted content.
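When content and embeddings share one database, the delete-and-re-embed step can ride inside the same transaction as the content update. The sketch below illustrates the idea using Python's built-in sqlite3 as a stand-in for a production database. The table names, `fake_embed`, and `split_chunks` are hypothetical placeholders, and a real system would call an actual embedding model.

```python
import json
import sqlite3


def fake_embed(text: str) -> list[float]:
    """Hypothetical stand-in for a real embedding model call."""
    return [float(len(text)), float(text.count(" "))]


def split_chunks(body: str) -> list[str]:
    """Naive sentence splitter, purely for illustration."""
    return [part.strip() for part in body.split(".") if part.strip()]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE documents (id INTEGER PRIMARY KEY, body TEXT NOT NULL);
    CREATE TABLE chunk_embeddings (
        doc_id INTEGER NOT NULL,
        chunk TEXT NOT NULL,
        embedding TEXT NOT NULL  -- JSON-encoded vector
    );
    """
)


def replace_document(conn: sqlite3.Connection, doc_id: int, body: str) -> None:
    """Update a document and its chunk embeddings as one atomic unit."""
    # Compute embeddings before opening the transaction so it stays short.
    rows = [(doc_id, c, json.dumps(fake_embed(c))) for c in split_chunks(body)]
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute(
            "INSERT OR REPLACE INTO documents (id, body) VALUES (?, ?)",
            (doc_id, body),
        )
        conn.execute("DELETE FROM chunk_embeddings WHERE doc_id = ?", (doc_id,))
        conn.executemany(
            "INSERT INTO chunk_embeddings (doc_id, chunk, embedding) VALUES (?, ?, ?)",
            rows,
        )


replace_document(conn, 1, "Blue cotton shirt. Machine washable.")
replace_document(conn, 1, "Blue linen shirt.")

stored = [r[0] for r in conn.execute("SELECT chunk FROM chunk_embeddings WHERE doc_id = 1")]
print(stored)
```

After the second call, no chunk from the old version remains, and there is no window in which a reader could see the new body alongside the old embeddings. With two separate stores, that guarantee has to be rebuilt by hand in the sync pipeline.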

What Vectors Actually Are — And What You Actually Need

A vector embedding is a dense numerical representation of semantic meaning: a list of floats that encodes the content of a text chunk in a form that allows similarity search. The vector itself is not the data you serve; it acts as an index key that helps you find the right data to serve. The distinction matters because it clarifies which storage system the vector actually belongs in.
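A minimal sketch of that idea in plain Python: the vectors are used only to rank, and what comes back to the caller is the text they point to. The chunk IDs, texts, and three-dimensional vectors are made-up toy values; real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy knowledge base: chunk id -> (text to serve, embedding used for lookup).
chunks = {
    "refund-policy": ("Refunds are issued within 14 days.", [0.9, 0.1, 0.0]),
    "shipping": ("Orders ship within 2 business days.", [0.1, 0.9, 0.1]),
    "warranty": ("Hardware carries a 1-year warranty.", [0.7, 0.2, 0.6]),
}


def retrieve(query_vec: list[float], k: int = 2) -> list[tuple[str, str]]:
    """Rank chunks by similarity to the query and return (id, text) pairs."""
    ranked = sorted(
        chunks.items(),
        key=lambda item: cosine_similarity(query_vec, item[1][1]),
        reverse=True,
    )
    return [(chunk_id, text) for chunk_id, (text, _vec) in ranked[:k]]


for chunk_id, text in retrieve([0.8, 0.1, 0.1]):
    print(chunk_id, "->", text)
```

Note that `retrieve` never returns a vector. The embedding does its job during ranking and then drops out, which is why it behaves like an index on the chunk rather than a separate dataset.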

If your documents live in a relational database, the semantic index for those documents belongs alongside them — ideally in the same system, using the same transaction guarantees, with native support for combined semantic + structured filtering. This is exactly what pgvector provides for PostgreSQL.
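As a rough illustration of combined semantic and structured filtering, the sketch below holds a pgvector schema and query as SQL strings, plus a small helper that formats a Python list as a pgvector text literal. The `products` table, its columns, and the three-dimension vector are hypothetical, and nothing here connects to a database. It assumes a PostgreSQL instance with the pgvector extension installed and a driver that accepts `%(name)s` parameters, as psycopg does.

```python
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE products (
    id          bigserial PRIMARY KEY,
    category    text    NOT NULL,
    in_stock    boolean NOT NULL,
    description text    NOT NULL,
    embedding   vector(3)  -- toy dimension; match your embedding model
);
"""

# <=> is pgvector's cosine distance operator (smaller means more similar).
SEARCH_SQL = """
SELECT id, description
FROM products
WHERE category = %(category)s
  AND in_stock
ORDER BY embedding <=> %(query_vec)s::vector
LIMIT %(k)s;
"""


def to_vector_literal(values: list[float]) -> str:
    """Format a list of numbers as pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def search_params(query_vec: list[float], category: str, k: int = 5) -> dict:
    """Build the parameter dict expected by SEARCH_SQL."""
    return {"query_vec": to_vector_literal(query_vec), "category": category, "k": k}


print(search_params([0.1, 0.2, 0.3], "footwear"))
```

With a live connection, the query would typically be run as `cursor.execute(SEARCH_SQL, search_params(...))`. The point of the sketch is the shape of the query: the relational filter and the similarity ordering live in one statement, against one table, under one transaction model.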

The Alternatives That Cover 80% of Cases

pgvector (PostgreSQL extension): Adds a vector column type and distance operators (cosine distance, inner product, L2) directly to PostgreSQL, with HNSW and IVFFlat indexes for approximate search. You get ACID transactions, no separate sync pipeline, full SQL filtering combined with semantic search, and operational familiarity. With an appropriate index and enough memory, it handles millions to tens of millions of vectors, which covers the vast majority of enterprise RAG use cases.
Elasticsearch / OpenSearch with vector fields: If you already run Elasticsearch for full-text search, adding a dense_vector field (knn_vector in OpenSearch) and kNN search brings semantic retrieval to your existing infrastructure without a new dependency. Particularly powerful for hybrid retrieval (BM25 keyword + semantic vector search in a single query).
Snowflake Cortex / BigQuery vector search: For RAG applications whose source knowledge lives primarily in a data warehouse, vector search built into the warehouse removes the cross-system synchronisation problem, because the embeddings sit next to the source rows. Both are maturing rapidly and are suitable for analytics-adjacent RAG applications.
Redis with vector search: For latency-critical applications where sub-millisecond similarity search matters, Redis's vector search module provides in-memory performance. Appropriate when the vector index fits in memory and query speed is the dominant constraint.
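To make the hybrid retrieval idea from the Elasticsearch item concrete, here is a small sketch of reciprocal rank fusion (RRF), one common way to merge a keyword ranking with a vector ranking. This is an illustration of the general technique, not a description of how any particular engine is configured. The document IDs are invented and `k=60` is a conventional default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists; each list adds 1 / (k + rank) to a document's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["doc-7", "doc-2", "doc-9"]  # e.g. from a BM25 query
vector_hits = ["doc-2", "doc-4", "doc-7"]   # e.g. from a kNN query

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that appear in both lists rise to the top, even when neither retriever ranked them first. When both rankings come from the same engine, this fusion step needs no second system to call and no second index to keep in sync.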

When a Dedicated Vector Database Is Actually the Right Answer

Dedicated vector databases earn their operational overhead in specific scenarios: billion-vector scale with strict latency requirements (purpose-built systems such as Pinecone and Weaviate are designed to shard and serve indexes at this scale, where they genuinely outperform pgvector), multi-modal embeddings where you need to index and search across text, image, and audio vectors simultaneously, and cases where your primary workload is pure semantic search with no relational filtering, where a purpose-built tool can be more economical than a general-purpose database with vector capabilities.

Validated Outcomes

Shopify's engineering team published a case study documenting their migration away from a dedicated vector database back to PostgreSQL with pgvector for their product recommendation and semantic search use cases. The reason: at Shopify's actual query volume, pgvector delivered equivalent query latency to their dedicated vector store, while eliminating a separate operational system, its associated infrastructure costs, and its synchronisation complexity. The post explicitly noted that the dedicated vector database had been adopted based on benchmark performance at billion-vector scale — a scale Shopify's use case did not require.

GYSP's RAG architecture engagements begin with a data layer audit that maps actual vector counts, query frequency, and latency requirements against the options available. In practice, more than 70% of enterprise RAG use cases fall within the performance envelope that pgvector or Elasticsearch's native vector search handles without a dedicated vector database. Clients who right-size their vector storage architecture from the start avoid the operational overhead that comes from maintaining a purpose-built vector database that their use case never actually required.

The Unified Data Layer Architecture

The principle underlying all the alternatives above is the same: keep your semantic index in the same system as your source data wherever possible. A unified data layer, where vector embeddings live alongside the records they represent, eliminates synchronisation complexity, reduces operational surface area, simplifies debugging, and often reduces cost. The penalty is a somewhat lower performance ceiling at extreme scale, which most production enterprise applications never approach.

Before you add a vector database to your architecture, ask one question: does my use case genuinely require billion-vector scale or sub-millisecond ANN search? If the answer is no, pgvector or your existing search stack will serve you better with significantly lower operational complexity.

GYSP's AI/ML Development practice designs RAG architectures that minimise operational complexity while meeting performance requirements — selecting vector storage strategies based on actual data scale and query patterns, not technology trends.

“The best architecture for a RAG application is usually the one with the fewest moving parts that meets the performance requirements. Adding a dedicated vector database to a workload that pgvector handles fine is adding complexity without adding value.”
— Rahul, AI/ML Delivery Head — GYSP.tech
