Processing 10,000 Product Images Without Melting Your Server: Building merchi.ai Chapter 5

    Merchi Team

    The Industrial Scale of Modern Merchandising

    In the previous chapters, we’ve discussed the “Brain” of merchi.ai, our model routing and prompt engineering. But even the most sophisticated AI is useless if the underlying infrastructure buckles under the weight of real-world data. In the retail world of 2026, data doesn’t arrive in neat, single-row packages. It arrives in tidal waves. We are talking about ZIP files containing 10,000 high-resolution images, CSVs with tens of thousands of rows, and raw web-scraped data from legacy manufacturer portals.

    Processing 10,000 images is a fundamentally different engineering challenge than processing one. If you attempt to handle this within a standard request-response cycle, your server will time out, your memory will spike, and your database will lock. As a team of two building merchi.ai, our primary architectural goal was to build a “factory line” that could ingest these massive datasets, break them into atomic tasks, and process them asynchronously without ever melting our infrastructure.

    To achieve this, we moved away from traditional monolithic processing and adopted a “Fan-Out” architecture. By leveraging Supabase Edge Functions for the initial handshake and Trigger.dev for the heavy lifting, we’ve built a system that can perform the equivalent of 125 years of human labour in a single day. This chapter is a technical deep-dive into the ingestion pipeline that makes merchi.ai’s scale possible.

    The Ingestion Gateways: Meeting the Data Where It Lives

    E-commerce teams have diverse workflows. A small boutique might want to upload a single photo of a new arrival, while a global distributor needs to sync their entire seasonal catalogue. We designed multiple upload paths to serve these distinct needs, ensuring that whether the data is a single JPEG or a programmatic JSON stream, the entry point is seamless.

    | Method          | Use Case                                          | Technical Entry Point          |
    | --------------- | ------------------------------------------------- | ------------------------------ |
    | Single Image    | Quick test or one-off product enrichment.         | Direct Supabase Storage Upload |
    | ZIP File        | Bulk uploads where filenames equal Product IDs.   | Edge Function (Extractor)      |
    | Spreadsheet/CSV | Import from existing PIM/ERP systems.             | Edge Function (Parser)         |
    | Web Scraping    | Extracting data directly from supplier websites.  | Trigger.dev Scraping Task      |
    | API (JSON/CSV)  | Programmatic integration for enterprise partners. | API Workspace Endpoint         |

    Each of these methods eventually converges into the same processing pipeline, but the initial handling is specialized. For example, when a user uploads a ZIP file, we don’t just dump it into storage. A Supabase Edge Function intercepts the file, validates the contents, and extracts the images. This prevents us from wasting compute resources on corrupted files or non-image assets. Once validated, the metadata is written to our runs table (the backbone of our processing visibility) and the “fan-out” begins.
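    The validation pass can be sketched as a pure function. This is a minimal illustration, not merchi.ai’s actual Edge Function: the entry shape, accepted extensions, and size limit are all assumptions made for the example.

```typescript
// Sketch of the validation pass an Edge Function might run over extracted
// ZIP entries before anything touches storage. All names and limits here
// are illustrative placeholders.

interface ZipEntry {
  name: string;      // e.g. "SKU-1042.jpg" — the filename doubles as the Product ID
  sizeBytes: number;
}

interface ValidatedImage {
  productId: string;
  name: string;
}

const IMAGE_EXTENSIONS = [".jpg", ".jpeg", ".png", ".webp"];
const MAX_BYTES = 25 * 1024 * 1024; // reject anything over 25 MB (assumed limit)

function validateEntries(entries: ZipEntry[]): {
  accepted: ValidatedImage[];
  rejected: { name: string; reason: string }[];
} {
  const accepted: ValidatedImage[] = [];
  const rejected: { name: string; reason: string }[] = [];

  for (const entry of entries) {
    const lower = entry.name.toLowerCase();
    const ext = IMAGE_EXTENSIONS.find((e) => lower.endsWith(e));
    if (!ext) {
      rejected.push({ name: entry.name, reason: "not an image" });
    } else if (entry.sizeBytes === 0) {
      rejected.push({ name: entry.name, reason: "empty file" });
    } else if (entry.sizeBytes > MAX_BYTES) {
      rejected.push({ name: entry.name, reason: "file too large" });
    } else {
      // Filename minus extension is treated as the Product ID.
      const productId = entry.name.slice(0, entry.name.length - ext.length);
      accepted.push({ productId, name: entry.name });
    }
  }
  return { accepted, rejected };
}
```

    Only the `accepted` list proceeds to the runs table and the fan-out; the `rejected` list is surfaced back to the user immediately, before any compute is spent.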

    The Fan-Out: Orchestrating 10,000 Atomic Tasks

    The core of our reliability lies in how we hand off work from the “Edge” to the “Engine.” We use Trigger.dev as our primary task orchestrator. When a bulk upload is confirmed, the Edge Function creates a run record and triggers a parent task. This parent task’s only job is to “fan out”—it iterates through the 10,000 items and spawns 10,000 individual, atomic child tasks.

    This decoupling is vital. By breaking a massive job into tiny, independent units, we gain several advantages. First, we bypass serverless execution limits. Each individual product enrichment (vision extraction, copy generation, translation) happens in its own isolated environment. Second, it allows for massive concurrency. We can process hundreds of products simultaneously across our distributed worker network.
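    The fan-out itself is conceptually simple: the parent task maps one run into N self-contained child payloads. This sketch shows only that mapping; the payload fields are assumptions, and in practice the resulting array would be handed to the orchestrator’s batch-trigger API rather than returned.

```typescript
// A pure sketch of the fan-out step: one run record becomes N atomic child
// payloads, each carrying everything its task needs. Field names are
// illustrative, not the real merchi.ai schema.

interface ChildPayload {
  runId: string;
  productId: string;
  imagePath: string;
}

function fanOut(
  runId: string,
  items: { productId: string; imagePath: string }[],
): ChildPayload[] {
  // Each item becomes one independent unit of work; no child depends on
  // any other, which is what makes massive concurrency safe.
  return items.map((item) => ({
    runId,
    productId: item.productId,
    imagePath: item.imagePath,
  }));
}
```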

    Each atomic task follows a strict execution script:

    1. Context Retrieval: Fetch the latest model configuration and Writing Knowledge from the database.
    2. Prompt Assembly: Build the task-specific prompt using our template variable system.
    3. AI Execution: Call OpenRouter with the assembled prompt and product assets.
    4. Parsing & Validation: Ensure the AI output matches the required JSON schema.
    5. State Persistence: Store the results and update the individual task status.
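    Steps 3 and 4 are where most failures originate, so here is a condensed sketch of that middle section of the script. The model call is injected as a function so the flow can be shown without a real OpenRouter client, and the two-field output schema is a made-up stand-in for the real one.

```typescript
// Condensed sketch of one atomic task: execute the model call, then parse
// and validate its output before anything is persisted. The schema here
// (title + description) is an illustrative placeholder.

interface TaskResult {
  status: "completed" | "failed";
  output?: { title: string; description: string };
  error?: string;
}

async function runAtomicTask(
  prompt: string,
  callModel: (prompt: string) => Promise<string>, // e.g. wraps an OpenRouter request
): Promise<TaskResult> {
  try {
    const raw = await callModel(prompt);   // Step 3: AI execution
    const parsed = JSON.parse(raw);        // Step 4a: parse the response

    // Step 4b: validate the shape before it can reach the database.
    if (typeof parsed.title !== "string" || typeof parsed.description !== "string") {
      return { status: "failed", error: "output failed schema validation" };
    }
    return { status: "completed", output: parsed }; // Step 5 happens in the caller
  } catch (err) {
    // Malformed JSON or a network error both land here and mark the task
    // failed rather than crashing the whole run.
    return { status: "failed", error: err instanceof Error ? err.message : String(err) };
  }
}
```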

    Handling the Messy Reality: Retries and Partial Failures

    In a factory processing a million products, something will always go wrong. An AI provider might hit a rate limit, an image URL might be temporarily broken, or a specific model might produce a malformed JSON response. In a traditional system, a single error in a batch of 10,000 might crash the entire process. In merchi.ai, we architected for “Partial Failure Resilience.”

    We configure our Trigger.dev tasks with a robust retry policy: maxAttempts: 3 with exponential backoff. If a request to OpenRouter fails due to a transient network error, the task simply sleeps and tries again a few minutes later. This handles 99% of “flaky” API issues without human intervention.
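    The delay schedule behind that retry policy can be sketched as a pure function: exponential growth with full jitter. The base delay and cap below are illustrative values, not the exact Trigger.dev configuration.

```typescript
// Exponential backoff with full jitter, sketched as a pure function.
// Base and cap are assumed values for illustration only.

function backoffDelayMs(attempt: number, baseMs = 5_000, capMs = 300_000): number {
  // attempt 1 → up to 5s, attempt 2 → up to 10s, attempt 3 → up to 20s,
  // always capped at 5 minutes.
  const window = Math.min(capMs, baseMs * 2 ** (attempt - 1));
  // "Full jitter": pick a random point in the window so retrying tasks
  // don't all hammer the provider again at the same moment.
  return Math.floor(Math.random() * window);
}
```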

    For the remaining 1%, we use the runs table to provide real-time feedback to the user. Instead of a binary “Success” or “Failure,” our UI shows a progress bar with granular states: pending, processing, completed, and failed. If 9,995 products succeed and 5 fail, the user can see exactly which 5 failed and why (e.g., “Image too small” or “Invalid SKU format”). They can fix those specific items and re-run only the failures, rather than re-uploading the entire 10,000-image ZIP file.
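    Deriving that progress view, and selecting the input for a “retry failures only” action, reduces to two small queries over the per-task rows. The row shape here is an assumption made for the sketch, not our actual runs table schema.

```typescript
// Sketch: derive granular progress counts from per-task rows, and select
// only the failed rows as the input for a retry run. Row shape is illustrative.

type TaskState = "pending" | "processing" | "completed" | "failed";

interface TaskRow {
  productId: string;
  state: TaskState;
  errorReason?: string; // e.g. "Image too small" or "Invalid SKU format"
}

function summarize(rows: TaskRow[]): Record<TaskState, number> {
  const counts: Record<TaskState, number> = { pending: 0, processing: 0, completed: 0, failed: 0 };
  for (const row of rows) counts[row.state]++;
  return counts; // feeds the progress bar in the UI
}

// "Re-run only the failures": the retry run's input is just the failed rows,
// so a 10,000-item upload never has to be repeated for 5 bad items.
function failedOnly(rows: TaskRow[]): TaskRow[] {
  return rows.filter((r) => r.state === "failed");
}
```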

    The Economics of Scale: Managing Tokens and Storage

    Every image processed in merchi.ai has a literal cost. In 2026, high-resolution vision models are powerful but expensive in terms of token usage. When a user uploads 10,000 images, we are effectively initiating a significant financial transaction with our AI providers. Our pipeline includes an “Economic Guardrail” layer that estimates the cost of a run before it begins.
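    The guardrail itself is a pre-flight arithmetic check. This sketch shows the shape of that check; every number in it (tokens per image, price, budget) is a made-up placeholder, not real provider pricing.

```typescript
// "Economic Guardrail" sketch: estimate a run's cost before fanning out,
// and refuse to start if it would blow the budget. All figures used with
// this function are illustrative, not real pricing.

interface CostEstimate {
  estimatedUsd: number;
  withinBudget: boolean;
}

function estimateRunCost(
  imageCount: number,
  tokensPerImage: number,       // approx. vision input tokens per (resized) image
  usdPerMillionTokens: number,  // the provider's input-token price
  budgetUsd: number,
): CostEstimate {
  const estimatedUsd = (imageCount * tokensPerImage * usdPerMillionTokens) / 1_000_000;
  return { estimatedUsd, withinBudget: estimatedUsd <= budgetUsd };
}
```

    With placeholder numbers, 10,000 images at ~1,000 tokens each and $5 per million tokens estimates to $50, which a guardrail can compare against the workspace budget before a single child task is spawned.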

    To keep costs manageable, we employ several technical optimisations:

    • Image Pre-processing: Before sending an image to a vision model, we resize and compress it at the Edge. Most models don’t need a 20MB RAW file to identify that a shirt is “navy blue.”
    • Caching Intelligence: If a user re-runs a task on the same image with the same prompt, we check our cache before making a new AI call.
    • Batching Metadata: While we process images individually, we batch the database writes to Supabase to reduce the number of active connections and prevent IOPS throttling.
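    The batching optimisation in the last bullet can be sketched as a chunked writer: rows are grouped into fixed-size chunks and each chunk is written in one statement. The writer function is injected and the chunk size of 500 is an assumption for the example.

```typescript
// Batched metadata writes, sketched: group rows into fixed-size chunks and
// issue one write per chunk instead of one per row. Chunk size is illustrative.

function chunk<T>(rows: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < rows.length; i += size) out.push(rows.slice(i, i + size));
  return out;
}

async function batchedWrite<T>(
  rows: T[],
  writeBatch: (batch: T[]) => Promise<void>, // e.g. one multi-row upsert per batch
  size = 500,
): Promise<number> {
  const batches = chunk(rows, size);
  // Sequential writes keep the number of concurrent connections low,
  // which is what avoids IOPS throttling on the database side.
  for (const batch of batches) await writeBatch(batch);
  return batches.length;
}
```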

    Our use of Supabase Storage is also highly structured. Images aren’t just thrown into a bucket; they are organised by tenant_id and run_id. This allows for easy cleanup. If a user deletes a run or if a temporary processing folder is no longer needed, our background workers can sweep the storage buckets to keep our footprint (and costs) lean.

    What We Learnt: The “Thundering Herd” Problem

    During our early scaling tests, we ran into the “Thundering Herd” problem. When we fanned out 10,000 tasks simultaneously, they all tried to query the database for the same Writing Knowledge configuration at the exact same millisecond. This created a massive spike in database connections that threatened to lock our PostgreSQL instance.

    The fix was twofold. First, we implemented a small, randomised “jitter” in our fan-out logic, so tasks didn’t all start at the exact same instant. Second, we began passing the brand configuration as part of the initial task payload. By fetching the “static” brand rules once in the parent task and passing them down to the children, we reduced our database read volume by 99%.
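    Both parts of the fix can be sketched together: the parent fetches the config once and stamps every child with it, plus a random start delay. The 30-second jitter window and payload shape are assumptions for the illustration.

```typescript
// Sketch of the "Thundering Herd" fix: each child payload carries the brand
// config (fetched once in the parent) and a random start delay. The jitter
// window and field names are illustrative.

interface ScheduledChild {
  productId: string;
  startDelayMs: number;
  brandConfig: Record<string, unknown>;
}

function scheduleChildren(
  productIds: string[],
  brandConfig: Record<string, unknown>, // read from the database ONCE, in the parent
  jitterWindowMs = 30_000,
): ScheduledChild[] {
  return productIds.map((productId) => ({
    productId,
    // Spread task start times across the window so 10,000 tasks never
    // hit the database (or the AI provider) in the same instant.
    startDelayMs: Math.floor(Math.random() * jitterWindowMs),
    brandConfig, // children read this from the payload, not the database
  }));
}
```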

    This is the reality of building in public as a small team: you don’t always get the architecture perfect on day one. You build, you break, and you refine. By treating our pipeline as a factory line rather than a simple script, we’ve created a system that is as resilient as it is fast.

    What’s Next?

    With a resilient pipeline in place, we can handle the volume. But how do we ensure the quality of that volume? In the next chapter, we’ll continue to unpack infrastructure and security, exploring multi-tenant architecture, data isolation, and Row Level Security.

    Ready to see our pipeline in action with your own data? Book a Demo or Start Automating with merchi.ai for FREE.