The elephant in the room: how to keep your vector database in sync with your data

As the vector database space has matured, consolidated, and evolved, the developer community is begining to hear echos of the past. Like the great Mark Twain once said, “History doesn’t repeat itself, but it often rhymes.” The rise of vector databases and their relationship with general-purpose databases like Postgres is starting to rhyme with the NoSQL wave of a decade ago. Developers are starting to ask themselves, “Do I really need a dedicated vector store, or can I just use pgvector and call it a day?” Between the VC-funded hype and constant framework churn it’s genuinely hard to get a straight answer about when you actually need a dedicated vector store — and harder still to find practical guidance on what to do once you’ve decided you do.

This article is an attempt to cut through that. It first starts with a qualitative analysis of what the developer community actually thinks by analyzing over 100 community threads from Reddit and Hackernews, then dicusses when pgvector is the right default, acknowledges where dedicated vector stores genuinely win, and finally presents three progressively sophisticated architectures for keeping Postgres and Qdrant in sync — with working code for each.

Vector databases unfiltered-edition

Before we get into sync architectures, it’s worth grounding the conversation in what developers are actually experiencing. I reviewed 110 community threads from Hacker News and Reddit discussing pgvector vs. dedicated vector stores, and the patterns were remarkably consistent. I was left with four key takehome messages:

The dominant decision heuristic is “start with pgvector, graduate if needed”
The #1 reason people migrate toward pgvector is data sync pain
The community has converged on a rough ~10M vector threshold for pgvector viability
The real technical weaknesses of pgvector are metadata filtering and hybrid search, not raw search speed

I detail each in turn below, with representative quotes from the community threads.

The dominant decision heuristic: “Start with pgvector, graduate if needed”

Across developer backgrounds and experience levels, the most common advice is a simple heuristic that goes like this: start with pgvector, and only move to a dedicated vector store when you have a specific, demonstrable reason to.

“My decision tree looks like this: Use pgvector until I have a very specific reason not to.”

“Default to pgvector, avoid premature optimization.”

“The lesson for me isn’t ‘don’t use Pinecone’, but more like ‘did you already max out Postgres?’”

Authors of quotes like these are often people already running postgres for transactional data, who added pgvector as a feature and found it sufficient for their needs. Moreover, the data-scale is often in thousands, not millions. Postgres is revered for its versatility and reliability, and the fact that pgvector is a native extension means it inherits all of that. The barrier to entry is low, and the operational overhead is minimal. This makes it a natural first choice for developers who want to add vector search capabilities without introducing new infrastructure.

The #1 reason people migrate toward pgvector: data sync pain

Of all the operational burdens cited in the corpus, data synchronization is the most consistently painful. Developers who have run a separate vector database alongside their relational database frequently report that keeping two systems in sync was a major headache.

“It was a PITA keeping data synced between pgsql <> qdrant.”

“One of the biggest nightmares with Pinecone was keeping the data in sync… With pg_vector integrated directly into my main database, this synchronization problem has completely disappeared.”

“In my research, Qdrant was also the top contender… but the need to sync two DBs put me off.”

This is essential the core of the problem. More systems means more complexity, more maintenance, more cost, and more surface area for failure. When your vectors live in the same database as your relational data, there’s no sync gap to manage, no reconciliation to run, no failure modes to reason about across two systems.

The ~10M vector threshold

Pgvector, however, is not without limitations. The community has converged on a rough empirical limit for the number of vectors you can store with it. Specifically, pgvector is comfortably viable up to ~10 million vectors. Beyond that, the advice shifts toward dedicated stores or advanced Postgres extensions like pgvectorscale or VectorChord (also monetized, so at that point you’re really just choosing your flavor of vector database).

“PGVector is a good choice when your dataset is smaller than 10M.”

“I think anything under 10M records [can] use PGVector.”

This isn’t a hard wall. There is a VectorChord case study pushing PostgreSQL to 3 billion vectors — but it’s a useful first approximation.

More shortcomings with pgvector: metadata filtering and hybrid search

When dedicated vector stores are defended on technical grounds, the arguments almost always center on two specific capabilities: efficient metadata pre-filtering and native hybrid search (dense + sparse/BM25). Raw search speed is rarely the bottleneck at typical production scales. Indeed, it’s the filter-then-search problem that most often pushes developers toward dedicated stores.

“I think the most relevant weakness for pgvector is the lack of ‘proper’ prefiltering on metadata while leveraging the vector index.”

“Pinecone allows hybrid search, merging dense and sparse vector embeddings that Postgres can’t do AFAIK.”

The hybrid search gap is worth unpacking because it’s often conflated with Postgres’s built-in full-text search — and they’re not the same thing.

Postgres has had excellent full-text search via tsvector/tsquery for years. You can rank results with ts_rank, build GIN indexes on document vectors, and even combine a full-text query with a vector similarity query using UNION or a weighted score. But this is lexical search — it operates on exact term matches, stemming, and stop words. BM25, on the other hand, is a probabilistic model that considers term frequency, inverse document frequency, and document length to rank results. It’s more sophisticated than simple full-text search and often yields better relevance for keyword-based queries. Dedicated stores like Qdrant support this natively.

So, when should you use pgvector?

pgvector is a reasonable choice, but only when a fairly specific set of conditions all hold simultaneously. Namely, you should consider pgvector if the following six conditions are all true:

1. Your vector dataset is small — under 1M vectors. Above this threshold you’ll start hitting index-build times, memory pressure, and recall degradation under load. The community’s empirical ceiling is around 10M, but the comfortable range is much lower.

2. You don’t need accurate metadata filtering. If every search is against the full collection, then post-filtering wont limit you. However, when you need to scope searches to a user, a tenant, a category, or any selective predicate, then pgvector can generate a lot of unnecessary search overhead.

3. Your embeddings are tightly coupled to relational data. If the primary value of your vectors is as an attribute of a row (e.g. a product description embedding stored alongside the product), then colocation genuinely helps. If vectors are the star of the show and your relational data is just metadata, then the operational benefits of colocation are less compelling.

4. You don’t need hybrid search. Your queries are natural language questions against prose content, where dense retrieval alone performs well. Keyword-sensitive workloads (product catalogs, code search, technical documentation) will leave relevance on the table.

5. Postgres is already doing the heavy lifting for your business logic. You have existing transactions, schemas, and ACID guarantees that matter for your application. Adding Qdrant means splitting that concern. If Postgres is already central, the operational argument for colocation holds.

6. Your team is small enough that embedding search logic in SQL is manageable. Typically regarded as an anti-pattern, storing business logic (like search) in the database can be a pragmatic choice for small teams that want to avoid maintaining separate application code for search. If your team is comfortable with SQL and the search logic is relatively simple, pgvector’s colocation can reduce cognitive overhead.

If you check all six boxes, pgvector is a pragmatic starting point. If you’re missing even two or three — especially the filtering or scale ones — you’re going to hit friction sooner than you expect.

Where Dedicated Vector Stores Genuinely Win

To rephrase the above in positive “wins” for dedicated vector stores, there are four key ways in which a dedicated vector store like Qdrant is prefferable to pgvector:

Efficient metadata filtering. Dedicated vector stores like Qdrant are designed to handle selective predicates efficiently. They can leverage indexes on metadata fields to narrow down the search space before performing vector similarity calculations, which is crucial for large datasets.

Native hybrid search. If your relevance depends on a combination of dense similarity and keyword matching, dedicated stores that support BM25 or similar algorithms natively will outperform any Postgres-based workaround.

Scalability beyond 10M vectors. Dedicated vector stores are optimized for large-scale vector data, with features like sharding, distributed indexing, and optimized memory management that allow them to handle tens or hundreds of millions of vectors with low latency.

Decoupled architecture. By separating your vector search from your relational database, you can scale, optimize, and evolve each independently. This is especially beneficial for applications where vector search is a critical path that requires specialized performance tuning.

The Sync Problem (and Why We’re Here)

Given the above analysis, the real “elephant in the room” still remains: keeping your vector database in sync with your relational data. There are many ways to solve this problem, each with its own trade-offs in terms of complexity, consistency guarantees, and operational overhead. The right choice depends on your specific requirements around latency, durability, and tolerance for inconsistency.

In this article, we present three tiers of sync architecture, each progressively more robust and complex. What follows is a walkthrough of all three approaches, using a consistent running example so you can compare them directly.

The Running Example: Fashion Product Catalog

Imagine you’re building a fashion ecommerce site. Postgres is your system of record — it holds your product catalog, handles inventory writes, and backs your order management logic. Qdrant sits alongside it as a dedicated search layer, powering the site’s search bar with hybrid dense + BM25 retrieval. A FastAPI service acts as the backend: it receives product write requests from the rest of your system, keeps both databases in sync, and exposes search endpoints that the frontend calls directly. The only thing that changes across our three tiers is how the sync between Postgres and Qdrant happens.

All three implementations use the same fashion product catalog backed by the H&M Personalized Fashion Recommendations dataset (~106K products, with a 1K-product subsample for the demo).

The schema is deliberately simple:

CREATE TABLE products (
    id              SERIAL PRIMARY KEY,
    article_id      VARCHAR(20) UNIQUE NOT NULL,
    name            VARCHAR(255) NOT NULL,
    description     TEXT,
    product_type    VARCHAR(100),
    product_group   VARCHAR(100),
    color           VARCHAR(50),
    department      VARCHAR(100),
    index_name      VARCHAR(100),
    image_url       VARCHAR(500),
    price           DECIMAL(10,2),
    created_at      TIMESTAMPTZ DEFAULT NOW(),
    updated_at      TIMESTAMPTZ DEFAULT NOW()
);

The Qdrant collection mirrors this with both dense and sparse vectors — using Qdrant Cloud Inference to generate embeddings server-side, which removes the need for a separate embedding pipeline:

client.create_collection(
    collection_name="products",
    vectors_config={
        "dense": models.VectorParams(size=384, distance=models.Distance.COSINE),
    },
    sparse_vectors_config={
        "bm25": models.SparseVectorParams(modifier=models.Modifier.IDF),
    },
)

Each point’s ID is a deterministic UUID derived from the article_id, so both systems share a natural key without maintaining a mapping table. The FastAPI app exposes the same CRUD endpoints across all three tiers:

Method	Endpoint	Description
`POST`	`/products`	Create a product
`PUT`	`/products/{id}`	Full update
`PATCH`	`/products/{id}`	Partial update
`DELETE`	`/products/{id}`	Delete
`GET`	`/search`	Hybrid search (dense + BM25 + RRF)
`GET`	`/search/semantic`	Dense-only
`GET`	`/search/keyword`	BM25-only
`POST`	`/sync/reconcile`	Compare Postgres vs. Qdrant, fix drift

The only thing that changes between tiers is the sync mechanism. The search endpoints are identical — because Qdrant is always the read store, all three tiers produce identical search results once data is synced.

Tier 1: Application-Level Dual-Write

“Just do it in your app code”

Architecture

Every CRUD endpoint writes to Postgres first, then to Qdrant, in the same request handler. If the Qdrant write fails, the error is logged but the request succeeds — Postgres is the source of truth, and a reconciliation job can fix drift later.

The Code

The route handler is exactly what you’d expect: one write after the other, with error handling around the Qdrant call:

@router.post("/products", response_model=ProductResponse, status_code=201)
async def create_product(product: ProductCreate):
    # 1. Write to Postgres first — it is the source of truth
    row = await insert_product(product.model_dump())

    # 2. Write to Qdrant — non-blocking on failure; reconcile catches drift
    try:
        await upsert_product(row)
    except Exception as exc:
        logger.error("Qdrant upsert failed for %s: %s", product.article_id, exc)

    return row

The same pattern applies to every mutating operation: Postgres first, Qdrant second, exceptions caught and logged but not re-raised.

Failure Modes

Failure	Consequence	Mitigation
Qdrant is down	Postgres write succeeds; Qdrant write silently skipped	Logged; `reconcile` fixes drift
Qdrant is slow	Request latency spikes (blocks on Qdrant call)	Client has configurable timeout
Postgres fails after Qdrant write	Orphaned point in Qdrant	Write Postgres first; `reconcile --fix` cleans up
Network partition	Partial writes	Reconciliation script

What This Approach Gets Right

The strongest argument for dual-write isn’t correctness — it’s cognitive simplicity. A new engineer can read this code and immediately understand the entire sync story. There are no background workers, no queues, no separate processes. The request handler is the sync mechanism.

This simplicity has real value for prototypes, internal tools, and early-stage products where iteration speed matters more than operational rigor.

Where It Falls Apart

The write path is coupled to Qdrant availability. If Qdrant has a hiccup — even a brief one — you’re generating drift that has to be cleaned up later. There’s no guarantee that every write will reach Qdrant; you’re relying on reconciliation to eventually make things right.

More subtly: the request latency includes the Qdrant round-trip. For write-heavy workloads, this becomes a bottleneck.

When to Use This

Prototypes and MVPs
Internal tools where occasional inconsistency is tolerable
Low write throughput (< 10K products, < a few hundred writes/day)
Teams that want to ship fast and revisit operational concerns later

Tier 2: Transactional Outbox Pattern

“Never lose a sync event”

Architecture

In this architecture, instead of writing to Qdrant directly from the request handler, we write an event into a sync_outbox table in the same Postgres transaction as the product write. Then, a background worker picks up these events and syncs them to Qdrant asynchronously.

The primary benefit of this architecture is that the outbox event exists if and only if the product write succeeded. There’s no window between the two — they commit atomically.

The Outbox Table

CREATE TABLE sync_outbox (
    id           BIGSERIAL PRIMARY KEY,
    entity_id    VARCHAR(20) NOT NULL,   -- article_id
    operation    VARCHAR(10) NOT NULL,   -- 'upsert' | 'delete'
    payload      JSONB,                  -- product snapshot at write time
    status       VARCHAR(20) DEFAULT 'pending',
    attempts     INT DEFAULT 0,
    max_attempts INT DEFAULT 5,
    last_error   TEXT,
    created_at   TIMESTAMPTZ DEFAULT NOW(),
    processed_at TIMESTAMPTZ
);

Note the payload column: it stores the full product data at write time. The worker doesn’t re-query the products table — by the time it processes an event, the product may have been updated again. The payload is the snapshot that was valid when the event was created.

The Write Path

Every CRUD endpoint opens a single Postgres transaction that writes both the product and the outbox event:

@router.post("/products", response_model=ProductResponse, status_code=201)
async def create_product(product: ProductCreate):
    pool = await get_pool()
    async with pool.acquire() as conn:
        async with conn.transaction():
            row = await conn.fetchrow(INSERT_SQL, *product_values())
            await enqueue_upsert(conn, product.article_id, dict(row))
    return dict(row)

enqueue_upsert inserts a row into sync_outbox on the same connection, within the same transaction. If anything fails, both writes roll back together.

The request returns as soon as the Postgres transaction commits. Qdrant isn’t touched during request handling.

The Worker

A background async task processes outbox events in batches. The critical concurrency primitive here is FOR UPDATE SKIP LOCKED — it lets multiple workers claim events safely without stepping on each other, and without the overhead of a distributed lock:

async def process_batch() -> int:
    pool = await get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch(
            """
            UPDATE sync_outbox
            SET status = 'processing', attempts = attempts + 1
            WHERE id IN (
                SELECT id FROM sync_outbox
                WHERE status IN ('pending', 'failed')
                    AND attempts < max_attempts
                ORDER BY created_at ASC
                LIMIT $1
                FOR UPDATE SKIP LOCKED
            )
            RETURNING *
            """,
            BATCH_SIZE,
        )

    for row in rows:
        await _process_event(dict(row))

    return len(rows)

Events that fail get their attempts counter incremented and their status set back to 'pending' (or 'failed' if they’ve exceeded max_attempts). Qdrant upserts are naturally idempotent by point ID, so duplicate processing is harmless.

Two Worker Modes

The implementation ships with two delivery strategies, selectable via environment variable:

Polling (default): the worker wakes up every N seconds, checks for pending events, and processes them. Simple, robust, and doesn’t require a persistent database connection.

LISTEN/NOTIFY: a Postgres trigger fires NOTIFY sync_outbox_insert when a new row is inserted. The worker wakes up immediately, processes the batch, and goes back to waiting. This gives near-real-time sync without busy-polling.

async def run_listen_worker() -> None:
    conn = await asyncpg.connect(dsn)
    await conn.execute("LISTEN sync_outbox_insert")

    while True:
        try:
            await asyncio.wait_for(conn.wait_for_notify(), timeout=30.0)
        except asyncio.TimeoutError:
            pass  # fall back to sweep anyway

        await process_batch()

The 30-second sweep fallback is important: if a notification is missed (e.g., the worker was down briefly), the polling fallback ensures nothing stays stuck indefinitely.

Failure Modes

Failure	Consequence	Mitigation
Qdrant is down	Events queue in `sync_outbox`	Worker retries with backoff on recovery
Worker crashes	Events remain `pending`	Worker picks up on restart; `SKIP LOCKED` prevents double-processing
Duplicate processing	Same event processed twice	Qdrant upserts are idempotent by point ID
Outbox table grows large	Storage/performance impact	Prune `completed` events older than N days

What This Approach Gets Right

The write path is completely decoupled from Qdrant. If Qdrant is down for an hour, writes succeed normally and events queue up. When Qdrant comes back, the worker drains the queue. The application never blocks on Qdrant.

The delivery semantics are clear and enforceable: at-least-once, durable in Postgres, retried until success or max_attempts. Failed events are visible in the database and inspectable with SQL.

You also get a natural /sync/status endpoint:

@router.get("/sync/status")
async def sync_status():
    return await get_sync_status()
# → {"pending": 0, "failed": 0, "avg_lag_seconds": 1.2}

The Trade-off

You have eventual consistency. There’s a window — typically milliseconds to seconds — between when a product is written to Postgres and when it appears in Qdrant search results. For most applications this is perfectly acceptable. For applications where writes must be immediately searchable (real-time bidding, live inventory with hard availability constraints), this requires a different approach.

You also have a new table to manage: the outbox table grows over time and needs periodic cleanup of completed events.

When to Use This

Most production applications with moderate write throughput
When Qdrant availability shouldn’t impact your write path
When you can tolerate seconds of sync lag
The recommended default for the majority of production Postgres + Qdrant deployments

Tier 3: Change Data Capture with Debezium + Redpanda

“Let Postgres tell Qdrant what changed”

Architecture

CDC is architecturally different from the previous two approaches in a fundamental way: the application code has no awareness of Qdrant. The FastAPI routes are pure Postgres CRUD — they don’t import the Qdrant client, they don’t write to an outbox. Sync is handled entirely in the infrastructure layer.

How It Works

Postgres’s Write-Ahead Log (WAL) is a sequential log of every change to the database — it exists for crash recovery and replication. With wal_level = logical, external consumers can read this log in a structured format.

Debezium connects to Postgres via a logical replication slot and captures every INSERT, UPDATE, and DELETE on the products table as a JSON event. These events are published to a Redpanda topic. A Python consumer service reads from that topic and calls Qdrant.

Postgres Setup

-- Enable in postgresql.conf (or docker-compose command args):
-- wal_level = logical

-- Create a publication for the products table
CREATE PUBLICATION products_publication FOR TABLE products;

Debezium Connector Config

{
    "name": "products-connector",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "postgres",
        "database.port": "5432",
        "database.user": "debezium",
        "database.dbname": "fashiondb",
        "database.server.name": "pgserver",
        "table.include.list": "public.products",
        "plugin.name": "pgoutput",
        "publication.name": "products_publication",
        "slot.name": "qdrant_sync",
        "topic.prefix": "pgserver",
        "transforms": "unwrap",
        "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
        "transforms.unwrap.delete.handling.mode": "rewrite"
    }
}

The ExtractNewRecordState transform is important: it unwraps Debezium’s envelope format into a flat payload, and adds an __op field (c = create, u = update, d = delete) so the consumer doesn’t have to parse the raw Debezium schema.

The Consumer Service

from confluent_kafka import Consumer

def consume_and_sync():
    consumer = Consumer({
        "bootstrap.servers": "redpanda:9092",
        "group.id": "qdrant-sync",
        "auto.offset.reset": "earliest",
        "enable.auto.commit": "false",  # manual commit after successful processing
    })
    consumer.subscribe(["pgserver.public.products"])

    while True:
        msg = consumer.poll(1.0)
        if msg is None:
            continue

        event = json.loads(msg.value())
        op = event.get("__op")  # 'c', 'u', or 'd'

        if op in ("c", "u"):
            upsert_to_qdrant(event)
        elif op == "d":
            delete_from_qdrant(event["article_id"])

        consumer.commit()  # commit only after successful processing

Manual offset commits (committing only after successfully processing each message) give you at-least-once semantics: if the consumer crashes mid-message, it reprocesses from the last committed offset on restart.

Redpanda vs. Kafka

The demo uses Redpanda as the event bus. Redpanda is Kafka-wire-protocol-compatible but ships as a single binary with no ZooKeeper dependency — dramatically simpler to run in a demo environment. The consumer code is identical whether you point it at Redpanda or a full Apache Kafka cluster; just change KAFKA_BOOTSTRAP_SERVERS.

The Killer Feature: Replaying History

CDC’s most underappreciated advantage is replayability. If Qdrant needs to be rebuilt from scratch — new index configuration, migration to a new cluster, disaster recovery — you can do it by replaying the Redpanda topic from the beginning. Every change that ever happened to the products table is recorded in the event stream.

With dual-write (Tier 1) or the outbox (Tier 2), a full rebuild requires running a bulk export from Postgres. With CDC, the event stream is the rebuild mechanism.

Failure Modes

Failure	Consequence	Mitigation
Qdrant is down	Consumer pauses; Redpanda retains events	Consumer retries; resumes from offset on recovery
Redpanda is down	Debezium buffers; WAL grows	Monitor WAL size; Redpanda HA with replication in production
Debezium crashes	WAL retains changes since last checkpoint	Debezium resumes from replication slot on restart
Schema change in Postgres	Connector may need restart	Monitor connector status; test schema migrations in staging

The WAL Disk Bloat Problem

One operational hazard specific to CDC: Postgres holds WAL segments until the replication slot consumer acknowledges them. If your consumer is down for a long time, Postgres can accumulate significant disk usage. Monitor replication slot lag:

SELECT
    slot_name,
    pg_size_pretty(
        pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)
    ) AS lag
FROM pg_replication_slots;

Set up alerting if this exceeds a few GB. In extreme cases (consumer permanently down), you may need to drop the replication slot — which means re-seeding Qdrant from Postgres rather than from the event stream.

Failure Modes

Failure	Consequence	Mitigation
Qdrant is down	Consumer pauses; Redpanda retains events	Consumer retries; resumes from committed offset on recovery
Redpanda is down	Debezium buffers; WAL grows	Monitor replication slot lag; Redpanda HA in production
Debezium crashes	WAL retains changes since last checkpoint	Debezium resumes from slot on restart
Schema change	Connector may need restart	Test migrations in staging; monitor connector status

When to Use This

High write throughput systems where the outbox table would become a bottleneck
When multiple downstream consumers need to react to changes (not just Qdrant)
When application code must be fully decoupled from sync concerns
Teams with existing Redpanda/Kafka infrastructure
When replay-from-scratch capability is a requirement

Tier 3 is genuinely powerful, but it comes with real operational costs: Redpanda and Debezium to deploy, configure, monitor, and upgrade. If you’re not already running streaming infrastructure, think hard before adding it for a single sync use case. Tier 2 handles most production scenarios with far less complexity.

Choosing Your Tier

Here’s the decision framework distilled:

Do you have < 10K products and low write volume?
└── Yes → Tier 1 (dual-write) is fine to start

Does Qdrant downtime need to be invisible to your write path?
└── Yes → Go to Tier 2

Do you need every sync event guaranteed even across restarts?
└── Yes → Go to Tier 2

Do you already run Kafka/Redpanda infrastructure?
└── Yes → Tier 3 is a natural fit

Do multiple services (not just Qdrant) need to react to product changes?
└── Yes → Tier 3

Do you need to be able to rebuild Qdrant from event history?
└── Yes → Tier 3

Otherwise → Tier 2

And the comparison matrix:

Dimension	Tier 1: Dual-Write	Tier 2: Outbox	Tier 3: CDC
Sync complexity	Low	Medium	High
Extra infrastructure	None	Outbox table + worker	Redpanda + Debezium + consumer
Consistency model	Best-effort	At-least-once, eventual	At-least-once, eventual
Write latency impact	Adds Qdrant round-trip	None (async)	None (async)
Qdrant downtime impact	Generates drift	Events queue in Postgres	Events queue in Redpanda
Captures direct SQL	No	No	Yes (all WAL changes)
Replay capability	No	Limited (outbox retention)	Yes (Redpanda retention)
Operational overhead	Minimal	Low–Medium	High
Best for	Prototypes, internal tools	Most production apps	High-throughput, multi-consumer

The important thing: these tiers aren’t permanent decisions. Start with Tier 1. When you hit its limits — Qdrant outages generating too much drift, write latency becoming noticeable — move to Tier 2. Only when Tier 2 becomes a bottleneck or you need the replay capability should you invest in Tier 3.

The Universal Safety Net: Reconciliation

Every sync architecture drifts eventually. Qdrant has a brief outage. A deployment goes wrong. A bug produces a partial write. Hardware fails at the wrong moment. No sync mechanism eliminates this entirely — they just reduce the frequency and severity.

The reconciliation script is what catches the residue. It runs the obvious set comparison:

async def reconcile(fix: bool = False) -> ReconcileResult:
    pg_ids = set(await get_all_article_ids_from_postgres())
    qdrant_ids = set(await get_all_point_ids_from_qdrant())

    missing_in_qdrant = pg_ids - qdrant_ids    # need to sync
    orphaned_in_qdrant = qdrant_ids - pg_ids   # need to delete

    if fix:
        for article_id in missing_in_qdrant:
            product = await get_product(article_id)
            await upsert_product(product)

        if orphaned_in_qdrant:
            await qdrant_client.delete(
                collection_name="products",
                points_selector=orphaned_in_qdrant,
            )

    return ReconcileResult(
        postgres_count=len(pg_ids),
        qdrant_count=len(qdrant_ids),
        missing_in_qdrant=len(missing_in_qdrant),
        orphaned_in_qdrant=len(orphaned_in_qdrant),
        in_sync=len(missing_in_qdrant) == 0 and len(orphaned_in_qdrant) == 0,
    )

This is intentionally simple. The only inputs are “what’s in Postgres” and “what’s in Qdrant” — there’s no event log to replay, no ordering to worry about. Postgres is the source of truth; Qdrant is a derived read store. When they diverge, Postgres wins.

Run this on a schedule — nightly is usually sufficient — and on-demand via the /sync/reconcile endpoint. The ?fix=true query parameter triggers auto-repair. The report-only mode (default) is useful for monitoring: if you’re seeing consistent divergence, that’s a signal that your sync mechanism needs attention.

A more sophisticated implementation would also compare updated_at timestamps for records that exist in both systems — catching cases where the record exists in Qdrant but has stale data. The set comparison catches the most important class of drift (missing or orphaned records) and is cheap to run even at large scale.

Conclusion

The community consensus is right: start with pgvector. It handles the vast majority of production workloads, eliminates the sync problem entirely, and doesn’t require you to operate additional infrastructure. For most applications, the first time you’ll seriously consider a dedicated vector store is when you’ve outgrown pgvector’s metadata filtering or need native hybrid search — and by then, you’ll have enough data about your actual requirements to make a principled decision.

When you do end up running Postgres alongside Qdrant, the sync problem is real but manageable. The three architectures presented here represent a progression of complexity and capability:

Tier 1 is right for prototypes and early-stage products. It’s the kind of code you can read and understand in five minutes.
Tier 2 is the recommended production default. The outbox pattern is battle-tested, gives you durable delivery semantics, and decouples your write path from Qdrant availability.
Tier 3 is for teams with high write throughput, existing streaming infrastructure, or a hard requirement for replay capability.

All three share the same reconciliation safety net. Postgres is always the source of truth. Qdrant is always a derived read store, always rebuildable from Postgres.

The full working code for all three tiers is in the companion repository, along with seed scripts, search endpoints, and reconciliation tooling. Each tier’s README has quick-start instructions you can follow in under ten minutes.

Pick the tier that matches your current requirements, not your imagined future scale. You can always graduate.

The elephant in the room: how to keep your vector database in sync with your data

Vector databases unfiltered-edition

The dominant decision heuristic: “Start with pgvector, graduate if needed”

The #1 reason people migrate toward pgvector: data sync pain

The ~10M vector threshold

More shortcomings with pgvector: metadata filtering and hybrid search

So, when should you use pgvector?

Where Dedicated Vector Stores Genuinely Win

The Sync Problem (and Why We’re Here)

The Running Example: Fashion Product Catalog

Tier 1: Application-Level Dual-Write

Architecture

The Code

Failure Modes

What This Approach Gets Right

Where It Falls Apart

When to Use This

Tier 2: Transactional Outbox Pattern

Architecture

The Outbox Table

The Write Path

The Worker

Two Worker Modes

Failure Modes

What This Approach Gets Right

The Trade-off

When to Use This

Tier 3: Change Data Capture with Debezium + Redpanda

Architecture

How It Works

Postgres Setup

Debezium Connector Config

The Consumer Service

Redpanda vs. Kafka

The Killer Feature: Replaying History

Failure Modes

The WAL Disk Bloat Problem

Failure Modes

When to Use This

Choosing Your Tier

The Universal Safety Net: Reconciliation

Conclusion

Was this page useful?