How Smuves Exports 100,000 Records Without Running Out of Memory
Exporting 2,000 HubSpot records from Smuves used to work fine. Exporting 20,000 would crash the worker. Exporting 100,000 was not even worth attempting. The edge function would run out of memory halfway through, the request would die, and the user would get a vague error message with no file to download.
Today, the same export finishes reliably regardless of size. It could be 5,000 records or 500,000. The worker never holds more than a few hundred records in memory at any point. And when the platform tries to kill the function for running too long, the function simply hands off to a fresh copy of itself and keeps going.
This post explains how we rebuilt the export pipeline to survive both memory limits and execution time limits on Supabase Edge Functions, and why the final architecture ended up being simpler than the one it replaced.
The original approach and why it broke
The first version of the export worker was the obvious one. Fetch all records from Typesense, accumulate them in an array, convert the array to CSV or JSON, write the file to Supabase Storage, and send the user a download link.
For small exports this worked perfectly. The entire dataset fit in memory, the conversion was fast, and the file landed in storage before the edge function hit its time limit.
Then agencies started using Smuves with real portals. A single content type might have 70,000 blog posts or 100,000 website pages. The worker would start paginating through Typesense, pushing each page of results into a growing array. Somewhere around record 15,000 to 20,000, the Deno runtime would run out of heap memory and the process would terminate.
The fix seemed obvious at first: just increase the memory. But Supabase Edge Functions run on a managed Deno runtime, and you do not get to pick your memory ceiling. You work within the limits of the platform. So we needed an approach whose memory usage did not scale with dataset size.
Streaming one page at a time to storage shards
The core idea was to stop accumulating records entirely.
Instead of fetching all records and then writing one big file, the worker fetches one page of records from Typesense (250 at a time), converts that page to CSV or JSON, and writes it as a standalone shard to Supabase Storage. Then it moves on to the next page.
Each shard gets named by its cursor position. So you end up with files like shards/0.csv, shards/250.csv, shards/500.csv, and so on. The worker only ever holds one page of records in memory. After writing a shard, the array is discarded and the next page takes its place.
This dropped peak memory usage from O(total records) to O(page size), and the page size is a constant 250 records. A 200,000 record export uses the same amount of memory as a 200 record export. The worker simply writes more shards.
For CSV exports, each shard includes the header row. For JSON exports, each shard is a set of newline-delimited JSON objects. Both formats are designed so that merging shards later is straightforward.
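A minimal sketch of that page-at-a-time loop for the CSV case might look like the following. The fetchPage and writeShard callbacks are hypothetical stand-ins for the Typesense query and the Supabase Storage upload, and the record shape and helper names are assumptions made for illustration.

```typescript
// Hypothetical shapes: the real Typesense and Storage calls are injected,
// so the loop itself stays small and easy to test.
type ExportRecord = Record<string, string | number | null>;
type FetchPage = (offset: number, limit: number) => Promise<ExportRecord[]>;
type WriteShard = (key: string, body: string) => Promise<void>;

const PAGE_SIZE = 250;

// Quote a CSV field only when it contains a comma, a quote, or a line break.
function csvField(value: string | number | null): string {
  const text = value === null ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Every shard carries its own header row, as described above.
function toCsvShard(columns: string[], rows: ExportRecord[]): string {
  const lines = [columns.map(csvField).join(",")];
  for (const row of rows) {
    lines.push(columns.map((c) => csvField(row[c] ?? null)).join(","));
  }
  return lines.join("\n") + "\n";
}

// Writes one shard per page and returns the cursor after the last record.
async function writeShards(
  columns: string[],
  fetchPage: FetchPage,
  writeShard: WriteShard,
  startCursor = 0,
): Promise<number> {
  let cursor = startCursor;
  while (true) {
    // Only this one page is held in memory at any time.
    const rows = await fetchPage(cursor, PAGE_SIZE);
    if (rows.length === 0) return cursor;
    await writeShard(`shards/${cursor}.csv`, toCsvShard(columns, rows));
    cursor += rows.length;
    if (rows.length < PAGE_SIZE) return cursor; // a short page means the end
  }
}
```

Because rows is reassigned on every iteration, the previous page becomes garbage as soon as its shard is written.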
The 45 second problem
Solving the memory problem revealed the next one. Supabase Edge Functions have an execution time limit. The exact number depends on your plan, but the practical ceiling we work within is around 45 to 50 seconds per invocation.
A large export might need to paginate through hundreds of Typesense pages and write hundreds of shards. That takes minutes, not seconds. The function would get killed mid-export, and the user would be left with a partial set of shards and no way to know the export was incomplete.
The solution is something we internally call self-recursion. Before the worker hits the time limit, it saves its progress (the current cursor position and job metadata) to the database, then invokes a fresh copy of itself with an HTTP request that EdgeRuntime.waitUntil keeps alive. The new invocation loads the cursor, picks up where the old one left off, and continues writing shards.
From the outside, it looks like one continuous export. From the inside, it is a chain of short-lived function invocations, each doing a small piece of work and handing off to the next.
The handoff works like this. At the start of each page write, the worker checks how much wall clock time has elapsed since the invocation started. If it is approaching 45 seconds, the worker stops paginating, updates the job record in the database with the current cursor, and fires off a new invocation via a fetch call wrapped in EdgeRuntime.waitUntil. The waitUntil call is important because it tells the runtime to keep the current invocation alive long enough to send the HTTP request, even though the main response has already been returned.
The new invocation starts fresh. It claims the same job from the database, reads the stored cursor, and picks up from the exact shard where the previous invocation stopped. Each invocation also resets the retry counter on the job after a successful batch, so that transient failures in one invocation do not permanently poison the job.
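The time check and handoff can be sketched roughly as follows. This is an illustration under stated assumptions, not the production worker: the JobStore interface, the processPage callback, and the 40 second budget (a safety margin below the roughly 45 second ceiling) are all hypothetical. The handoff is injected as a callback so the loop can run outside the edge runtime.

```typescript
// Hypothetical job-store interface; method names are illustrative.
interface JobStore {
  saveCursor(jobId: string, cursor: number): Promise<void>;
  markComplete(jobId: string): Promise<void>;
}

// Writes one shard starting at `cursor`; resolves to the next cursor,
// or null when there are no more records.
type ProcessPage = (cursor: number) => Promise<number | null>;
type Outcome = "complete" | "handed_off";

// Assumed budget: stop well before the platform kills the invocation.
const TIME_BUDGET_MS = 40_000;

async function runInvocation(
  jobId: string,
  startCursor: number,
  store: JobStore,
  processPage: ProcessPage,
  handoff: (jobId: string) => void,
  now: () => number = Date.now,
): Promise<Outcome> {
  const startedAt = now();
  let cursor = startCursor;
  while (true) {
    // Check the wall clock before starting each page.
    if (now() - startedAt >= TIME_BUDGET_MS) {
      await store.saveCursor(jobId, cursor); // persist progress first
      handoff(jobId); // then wake a fresh invocation
      return "handed_off";
    }
    const next = await processPage(cursor);
    if (next === null) {
      await store.markComplete(jobId);
      return "complete";
    }
    cursor = next;
  }
}

// In the edge function, the handoff might be wired roughly like this
// (WORKER_URL and the request body are assumptions):
//   const handoff = (jobId: string) =>
//     EdgeRuntime.waitUntil(
//       fetch(WORKER_URL, { method: "POST", body: JSON.stringify({ jobId }) }),
//     );
```

Saving the cursor before firing the handoff matters: if the order were reversed, the next invocation could read a stale cursor and rewrite shards that already exist.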
Merging shards on download
Once all shards are written and the job is marked complete, the user clicks a download link. That link hits an API route that does not serve a pre-built file. It builds the file on the fly by streaming shards together.
The download route knows the shard naming convention and the total cursor range from the job record. It does not call storage.list() to discover shards, because listing objects in cloud storage is slow and paginated. Instead, it calculates the shard keys directly from the cursor metadata.
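Computing the keys is simple arithmetic over the cursor range. A small sketch, assuming the job record stores the total record count and that every shard except possibly the last holds a full page; the function name and the extension parameter are hypothetical:

```typescript
// Derives every shard key for a finished job without listing the bucket.
function shardKeys(totalRecords: number, pageSize: number, ext: string): string[] {
  const keys: string[] = [];
  for (let cursor = 0; cursor < totalRecords; cursor += pageSize) {
    keys.push(`shards/${cursor}.${ext}`);
  }
  return keys;
}
```

For example, a 600 record CSV export with a page size of 250 yields shards/0.csv, shards/250.csv, and shards/500.csv.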
For CSV exports, the route streams each shard but strips the header row from every shard except the first. For JSON exports, it concatenates the newline-delimited objects directly. The response uses chunked transfer encoding (Transfer-Encoding: chunked), so the browser starts receiving data immediately without waiting for the entire file to be assembled in memory on the server.
This means the download route also has constant memory usage. It reads one shard, pipes it to the response stream, and moves to the next. A 500MB export does not require 500MB of server RAM.
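The CSV merge can be sketched as an async generator that yields one shard at a time. The readShard callback is a hypothetical stand-in for a storage download, and the header stripping assumes the header row contains no quoted line breaks.

```typescript
// Hypothetical reader: resolves to the full text of one shard.
type ReadShard = (key: string) => Promise<string>;

// Drops the first line (the header row) of a CSV shard.
function stripHeader(shard: string): string {
  const newline = shard.indexOf("\n");
  return newline === -1 ? "" : shard.slice(newline + 1);
}

// Yields the merged file piece by piece; only one shard is held at a time.
async function* mergeCsvShards(
  keys: string[],
  readShard: ReadShard,
): AsyncGenerator<string> {
  let first = true;
  for (const key of keys) {
    const shard = await readShard(key);
    // Keep the header from the first shard only.
    yield first ? shard : stripHeader(shard);
    first = false;
  }
}
```

A route handler could feed each yielded chunk into the response stream as it arrives, so the full file never exists in server memory.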
Handling failures and user cancellation
Exports can fail for a lot of reasons. Typesense might be temporarily unavailable. A storage write might time out. The edge function itself might crash for reasons outside our control.
Every invocation wraps its work in a try/catch. If the current invocation fails, it marks the job as failed in the database, writes an idempotent notification so the user gets exactly one failure message (not one per retry), and exits. On the next cron trigger, the job is retried as long as its retry count has not exceeded the cap.
Users can also cancel an export in progress. The worker checks the job status at the start of each page. If the user has set the status to stopped, the worker exits cleanly without writing any more shards.
The notification system uses a dedupe pattern we call insertNotificationOnce. Before writing a terminal notification (export complete, export failed), the worker checks if one already exists for that job ID. If it does, the write is skipped. This prevents the same notification from appearing multiple times when a job is retried or when two invocations overlap briefly during a handoff.
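The check-then-write pattern can be sketched as below. The NotificationStore interface is hypothetical, and keying the check on both job ID and notification kind is an assumption made for this example.

```typescript
type TerminalKind = "export_complete" | "export_failed";

// Hypothetical storage interface for notifications.
interface NotificationStore {
  exists(jobId: string, kind: TerminalKind): Promise<boolean>;
  insert(jobId: string, kind: TerminalKind): Promise<void>;
}

// Resolves to true when a notification was written, false when it was skipped.
async function insertNotificationOnce(
  store: NotificationStore,
  jobId: string,
  kind: TerminalKind,
): Promise<boolean> {
  if (await store.exists(jobId, kind)) return false; // already notified
  await store.insert(jobId, kind);
  return true;
}
```

A check followed by a write still leaves a small window in which two overlapping invocations could both pass the check. A unique constraint on the notification key in the database would close that window.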
What we learned
The biggest lesson was that constant memory is not a nice-to-have on edge functions. It is a hard requirement. If your memory usage scales with your input size, you will eventually hit the ceiling, and the ceiling is not something you control.
The self-recursion pattern felt unusual at first. Having a function invoke a copy of itself sounds like a recipe for infinite loops. But with proper cursor tracking, retry caps, and status checks, it turns out to be a very reliable way to do long-running work on platforms that are designed for short-lived invocations.
The other thing worth calling out is that the shard-then-merge approach made the system more debuggable. When an export fails, we can look at which shards exist in storage and immediately know how far it got. We can re-run just the missing portion. That kind of visibility was not possible when the export was one monolithic operation that either succeeded or failed with nothing in between.