If you have ever built a Manifest V3 Chrome extension that does anything more ambitious than rendering a popup, you have hit the wall: the service worker goes to sleep, your setTimeout evaporates, and the long-running job you were halfway through silently disappears. Manifest V3 was not designed to host long jobs, and the documentation tells you that — but it does not always tell you what to do instead. This post is the architecture we landed on for BulkMD, an extension that has to process arbitrarily large URL lists without losing state, and the patterns that make it work.
If you have not yet read why BulkMD does bulk processing in the first place, the bulk export walk-through covers the product surface. This post is the engineering counterpart — what is happening behind that progress bar, and why every line of it was chosen with the service-worker lifecycle in mind.
The MV3 service-worker reality
A Manifest V3 service worker is not a long-lived background page. It is a stateless event handler that the browser starts when an event arrives, runs until the event handler resolves, and terminates aggressively when idle. Chrome's current behavior is to keep the worker alive for roughly thirty seconds after the last event, then unload it. The next event spins up a fresh worker process with a fresh module instance — no in-memory state, no open WebSockets, no in-flight timers.
This is a deliberate tradeoff. Background pages in MV2 were security and performance liabilities because they ran continuously, consumed memory, and were easy to abuse. MV3 traded continuous execution for predictable resource use. The cost is that any extension doing work that genuinely takes longer than thirty seconds — bulk processing, long downloads, periodic syncs, ongoing crawls — has to architect around the lifecycle rather than against it.
The patterns below are the ones that have held up for us across hundreds of bulk runs. They are not exotic, but applying them consistently is the difference between a bulk job that finishes and a bulk job that mysteriously stops at item forty-seven.
Treat the service worker as stateless
The first and most consequential rule is that module-level variables in the service worker are not state. They are a cache that vanishes whenever Chrome decides to unload the worker, and you do not control when that is.
A version of this pattern that looks reasonable in MV2 is poisonous in MV3:
// background.ts: DO NOT DO THIS in MV3
let queue: string[] = [];
let processed: string[] = [];
chrome.runtime.onMessage.addListener((msg) => {
  if (msg.type === "enqueue") queue.push(...msg.urls);
  if (msg.type === "next") processOne(queue.shift());
});
The first "enqueue" message arrives, the worker spins up, and the array is populated. Thirty seconds later, well within the time it takes to process the first batch, the worker sleeps. The following "next" message wakes a new worker, and queue is the empty initial value again. Your job is gone. The user sees the progress bar freeze and never recover.
The MV3-shaped version of the same logic delegates state to chrome.storage:
// background.ts: MV3-compliant
async function getState() {
  const { queue = [], processed = [] } =
    await chrome.storage.session.get(["queue", "processed"]);
  return { queue, processed };
}
async function setState(state: { queue: string[]; processed: string[] }) {
  await chrome.storage.session.set(state);
}
// No response is sent, so the promise returned by the async listener is ignored.
chrome.runtime.onMessage.addListener(async (msg) => {
  const state = await getState();
  if (msg.type === "enqueue") {
    state.queue.push(...msg.urls);
    await setState(state);
  }
  if (msg.type === "next" && state.queue.length) {
    const url = state.queue.shift()!;
    state.processed.push(url);
    // Persist the dequeue before starting work, so a worker restart
    // does not hand out the same URL twice.
    await setState(state);
    processOne(url);
  }
});
The cost is a few hundred microseconds per message for the round-trip to storage. The benefit is that every event handler operates on the actual current state regardless of whether the worker is the one that handled the previous message. This is the foundational shift; everything below builds on it.
Note the choice of chrome.storage.session over chrome.storage.local. session is held in memory and is cleared on browser restart, and also whenever the extension is disabled, reloaded, or updated, which is exactly what you want for transient queue state; local survives all of those and is appropriate for user preferences, settings, and the persistent slice of the job that needs to be restorable across full browser shutdowns.
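One way to keep a listener like the one above easy to test is to pull the queue transitions out into pure functions and leave only the storage calls in the handler. The following is a minimal sketch under that assumption; `JobState`, `enqueueUrls`, and `takeNext` are hypothetical names chosen for illustration, not BulkMD's actual code.

```typescript
// Hypothetical shape of the transient job state kept in chrome.storage.session.
interface JobState {
  queue: string[];
  processed: string[];
}

// Returns a new state with the URLs appended; the input is left untouched.
function enqueueUrls(state: JobState, urls: string[]): JobState {
  return { queue: [...state.queue, ...urls], processed: state.processed };
}

// Moves the head of the queue to `processed`.
// Returns the URL to work on, or null when nothing is queued.
function takeNext(state: JobState): { state: JobState; url: string | null } {
  const url = state.queue[0];
  if (url === undefined) return { state, url: null };
  return {
    state: { queue: state.queue.slice(1), processed: [...state.processed, url] },
    url,
  };
}
```

In the worker, the handler would load the state, apply one of these functions, and write the result back, so the transition logic can be exercised without a browser.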
A bounded tab pool that survives restarts
The naïve way to process N URLs is to open N tabs at once. This can exhaust memory and bring the browser down at N as small as 20, and it is a poor neighbor at any N. The right pattern is a bounded pool: open at most K tabs concurrently, dequeue work as each tab finishes, and close finished tabs to free resources.
In MV3 the pool itself cannot live as an in-memory object. The pool is a tuple of (currentlyOpenTabIds[], maxConcurrency) that lives in storage, plus event handlers that react to tab updates. Every transition is mediated by chrome.storage:
const POOL_KEY = "pool";
// Opens a background tab if the pool has a free slot; returns null when it is full.
async function acquireSlot(url: string): Promise<number | null> {
  const { pool = { open: [], cap: 3 } } =
    await chrome.storage.session.get(POOL_KEY);
  if (pool.open.length >= pool.cap) return null;
  const tab = await chrome.tabs.create({ url, active: false });
  pool.open.push(tab.id!);
  await chrome.storage.session.set({ [POOL_KEY]: pool });
  return tab.id!;
}
// Removes the tab from the stored pool first, then closes it if it still exists.
async function releaseSlot(tabId: number): Promise<void> {
  const { pool = { open: [], cap: 3 } } =
    await chrome.storage.session.get(POOL_KEY);
  pool.open = pool.open.filter((id: number) => id !== tabId);
  await chrome.storage.session.set({ [POOL_KEY]: pool });
  await chrome.tabs.remove(tabId).catch(() => {});
}
The release-on-finish wiring goes through chrome.tabs.onUpdated plus a content-script message — when the content script signals that extraction is complete, the worker releases the slot, calls into the queue for the next URL, and attempts to acquire a new slot. If the worker is asleep when the content script sends its "done" message, Chrome wakes it specifically for that event, the state is loaded from storage, and processing continues exactly where it left off.
The tabs.remove().catch() is deliberate. By the time the worker wakes, the user may have closed the tab manually, in which case the promise returned by remove rejects. Silently swallowing the error is correct here: the goal is to remove the tab if it still exists, and a missing tab is functionally the same as a removed one.
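The decision the worker makes after a "done" message (which stored tab ids are still real, and how many queued URLs fit the free slots) can be sketched as two pure functions. This is an illustrative sketch, not BulkMD's implementation: `reconcilePool` and `planDispatch` are hypothetical helpers, and it assumes the list of live tab ids would come from something like chrome.tabs.query in the worker, with each URL in `toOpen` then passed to acquireSlot.

```typescript
// Same shape as the pool object stored under POOL_KEY.
interface Pool {
  open: number[];
  cap: number;
}

// Drops ids of tabs that no longer exist, e.g. tabs the user closed by hand.
function reconcilePool(pool: Pool, liveTabIds: number[]): Pool {
  const live = new Set(liveTabIds);
  return { cap: pool.cap, open: pool.open.filter((id) => live.has(id)) };
}

// Splits the queue into URLs that fit the free slots and URLs that must wait.
function planDispatch(
  pool: Pool,
  queue: string[],
): { toOpen: string[]; waiting: string[] } {
  const free = Math.max(0, pool.cap - pool.open.length);
  return { toOpen: queue.slice(0, free), waiting: queue.slice(free) };
}
```

Reconciling before planning means a tab that vanished while the worker was asleep does not permanently occupy a slot.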
Keeping the worker awake during active work
There is a legitimate concern that a long-running bulk job with no incoming events — every tab is happily extracting in parallel, no messages are coming back — will let the worker sleep mid-job. The browser has no way to know that "work" is happening; it sees no events and unloads the worker.
The MV3-friendly way to prevent this is chrome.alarms. Schedule a minimal periodic alarm while a job is active, and the alarm fires often enough to keep the worker warm without burning CPU:
async function startKeepAlive() {
  await chrome.alarms.create("keep-alive", { periodInMinutes: 0.5 });
}
async function stopKeepAlive() {
  await chrome.alarms.clear("keep-alive");
}
chrome.alarms.onAlarm.addListener((alarm) => {
  if (alarm.name === "keep-alive") {
    // A no-op listener is enough to satisfy the worker-alive contract.
    // The act of being woken to handle the alarm resets the idle timer.
  }
});
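Thirty seconds is the conventional period; since Chrome 120 the minimum for periodInMinutes is 0.5 (thirty seconds), and earlier versions enforce a one-minute minimum. The handler can be a no-op: what matters is that the worker is woken, processes the alarm event, and resets its idle timer. Because the idle timeout is also roughly thirty seconds, treat the alarm as a guaranteed wake-up rather than a promise that the worker never unloads between ticks; with state in storage, either outcome is safe. The cost is one wakeup every thirty seconds; the benefit is that in-flight fetches and content-script calls are far less likely to be stranded by a worker unload.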
Critically, you must stop the alarm when the job finishes. A long-lived keep-alive in an idle extension is a bad neighbor: it wakes the worker twice a minute for no benefit, and it is the kind of behavior that can draw scrutiny in Chrome Web Store review. Tie it to a job-active flag stored in chrome.storage.session so it is genuinely scoped to the duration of work.
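Scoping the alarm to the job-active flag can be sketched as a small reconcile step that compares the flag with whether the alarm currently exists. This is a sketch under assumptions: `AlarmsLike` is a hypothetical, promise-based slice of chrome.alarms introduced so the logic can run outside a browser, and `syncKeepAlive` is an illustrative name.

```typescript
// Hypothetical minimal slice of chrome.alarms (promise-based, as in MV3).
interface AlarmsLike {
  get(name: string): Promise<unknown>;
  create(name: string, info: { periodInMinutes: number }): Promise<void>;
  clear(name: string): Promise<boolean>;
}

const KEEP_ALIVE_ALARM = "keep-alive";

// Creates the alarm only while a job is active and clears it otherwise.
// Safe to call repeatedly: it reports "unchanged" when nothing needs doing.
async function syncKeepAlive(
  alarms: AlarmsLike,
  jobActive: boolean,
): Promise<"created" | "cleared" | "unchanged"> {
  const existing = await alarms.get(KEEP_ALIVE_ALARM);
  if (jobActive && existing === undefined) {
    await alarms.create(KEEP_ALIVE_ALARM, { periodInMinutes: 0.5 });
    return "created";
  }
  if (!jobActive && existing !== undefined) {
    await alarms.clear(KEEP_ALIVE_ALARM);
    return "cleared";
  }
  return "unchanged";
}
```

Calling a helper like this from the alarm handler itself, after reading the flag from chrome.storage.session, would let a stale alarm clean itself up on its next tick even if the code path that normally stops it never ran.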
How big is the practical headroom?
The architecture above is what enables BulkMD to process hundreds of URLs in a single run without losing state. Concrete numbers from our internal stress tests, on a 2024 M3 Pro running Chrome 132:
| Concurrency | Median per-page time | 95th percentile | Successful resumes from forced sleep |
|---|---|---|---|
| 1 tab | 2.1 s | 4.8 s | 100% (50/50) |
| 3 tabs | 850 ms | 2.4 s | 100% (50/50) |
| 6 tabs | 540 ms | 3.1 s | 98% (49/50) |
| 10 tabs | 410 ms | 5.9 s | 92% (46/50) |
Beyond a concurrency of six, the marginal speedup falls off sharply because most pages stop being CPU-bound and start being network-bound; the failure-to-resume rate also creeps up, primarily because some pages take longer to load than the storage-read tolerance. The sweet spot we ship as the default is three tabs, which preserves a 100% resume rate while delivering most of the parallelism gain.
The "successful resumes from forced sleep" column is the most important metric. We force the service worker to sleep mid-run (by quitting and reopening Chrome, by disabling and re-enabling the extension, by leaving the laptop idle long enough for the OS to throttle) and measure whether the job picks back up exactly where it stopped. At the default concurrency, anything below 100% would indicate a bug in the state-machine design; the 100% rates at one and three tabs are the validation that the storage-mediated state pattern is doing its job.
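A cheap unit-level complement to those manual runs is to simulate cold starts against an in-memory storage stand-in. The sketch below is not how the numbers above were produced; `StorageAreaLike`, `memoryArea`, `bootWorker`, and `runWithRestarts` are hypothetical names, and the storage stand-in only approximates chrome.storage.session by copying values on the way in and out.

```typescript
// Hypothetical minimal slice of a chrome.storage area.
interface StorageAreaLike {
  get(keys: string[]): Promise<Record<string, unknown>>;
  set(items: Record<string, unknown>): Promise<void>;
}

// In-memory stand-in that outlives "worker instances"; values are serialized
// on write and parsed on read, so no object references leak between them.
function memoryArea(initial: Record<string, unknown>): StorageAreaLike {
  let data = JSON.stringify(initial);
  return {
    async get(keys) {
      const all = JSON.parse(data) as Record<string, unknown>;
      const out: Record<string, unknown> = {};
      for (const key of keys) if (key in all) out[key] = all[key];
      return out;
    },
    async set(items) {
      const all = JSON.parse(data) as Record<string, unknown>;
      data = JSON.stringify({ ...all, ...items });
    },
  };
}

// A "worker instance": everything it knows comes from storage on each event.
function bootWorker(area: StorageAreaLike) {
  return {
    async handleNext(): Promise<string | null> {
      const raw = await area.get(["queue", "processed"]);
      const queue = (raw["queue"] as string[] | undefined) ?? [];
      const processed = (raw["processed"] as string[] | undefined) ?? [];
      const url = queue[0];
      if (url === undefined) return null;
      await area.set({ queue: queue.slice(1), processed: [...processed, url] });
      return url;
    },
  };
}

// Processes the whole queue, discarding the worker every `restartEvery` events.
async function runWithRestarts(urls: string[], restartEvery: number): Promise<string[]> {
  const area = memoryArea({ queue: urls, processed: [] });
  let worker = bootWorker(area);
  const handled: string[] = [];
  for (let i = 0; ; i++) {
    if (i > 0 && i % restartEvery === 0) worker = bootWorker(area); // simulated cold start
    const url = await worker.handleNext();
    if (url === null) return handled;
    handled.push(url);
  }
}
```

If someone later introduces module-level state into the handler, a test built on this shape should start skipping or repeating URLs, which is the regression it is meant to catch.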
Common MV3 mistakes that break bulk jobs
A handful of patterns recur in extensions that almost work but mysteriously break under load. They are worth naming explicitly.
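The first is using durable storage for high-frequency queue mutations. chrome.storage.local is persisted to disk and capped at about 10 MB in current Chrome (5 MB before Chrome 114) unless the extension requests the unlimitedStorage permission. chrome.storage.sync is far stricter: roughly 100 KB in total, 8 KB per item, and rate-limited writes (120 per minute, 1,800 per hour). A queue that mutates on every URL completion will run into these limits at scale, and a rejected write that nothing checks looks exactly like a silently dropped update. Use chrome.storage.session for transient state, check the result of every write, and only persist a checkpoint to local when the run finishes or pauses.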
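A checkpoint step along those lines might look like the sketch below. It assumes the queue lives under the "queue" and "processed" keys used earlier; `StorageAreaLike`, `CHECKPOINT_KEY`, `saveCheckpoint`, and `restoreCheckpoint` are hypothetical names, and the storage areas are injected so the logic can run without a browser.

```typescript
// Hypothetical minimal slice of a chrome.storage area.
interface StorageAreaLike {
  get(keys: string[]): Promise<Record<string, unknown>>;
  set(items: Record<string, unknown>): Promise<void>;
}

const CHECKPOINT_KEY = "checkpoint"; // hypothetical key name

// Defensive parse: stored data may be missing or have an unexpected shape.
function asStringArray(value: unknown): string[] {
  return Array.isArray(value)
    ? value.filter((v): v is string => typeof v === "string")
    : [];
}

// Copies the transient queue from the session area into the durable area.
// Intended to run when a job pauses or finishes, not on every URL.
async function saveCheckpoint(
  session: StorageAreaLike,
  local: StorageAreaLike,
): Promise<void> {
  const raw = await session.get(["queue", "processed"]);
  await local.set({
    [CHECKPOINT_KEY]: {
      queue: asStringArray(raw["queue"]),
      processed: asStringArray(raw["processed"]),
    },
  });
}

// Restores the last checkpoint into the session area; false if none exists.
async function restoreCheckpoint(
  local: StorageAreaLike,
  session: StorageAreaLike,
): Promise<boolean> {
  const raw = await local.get([CHECKPOINT_KEY]);
  const saved = raw[CHECKPOINT_KEY];
  if (typeof saved !== "object" || saved === null) return false;
  const checkpoint = saved as Record<string, unknown>;
  await session.set({
    queue: asStringArray(checkpoint["queue"]),
    processed: asStringArray(checkpoint["processed"]),
  });
  return true;
}
```

This keeps the durable area down to one write per pause or finish, while every per-URL mutation stays in the session area.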
The second is doing asynchronous work in a chrome.runtime.onMessage handler without returning true synchronously. If the response will be sent asynchronously, the listener has to return true; otherwise the channel closes before your promise resolves. An async listener returns a promise rather than true, so in Chrome versions that do not accept a promise from the listener it does not keep the channel open. This is a common source of the "the message port closed before a response was received" error. Return true from a non-async listener and call sendResponse when the work finishes; using the promise-returning form of chrome.runtime.sendMessage on the sending side does not remove that requirement.
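One way to make the return-true rule hard to forget is a small wrapper that turns an async handler into a listener with the right shape. The sketch below is illustrative: `respondAsync` is a hypothetical helper, and the `{ ok, result }` envelope is an assumption made for the example rather than anything the messaging API requires.

```typescript
type SendResponse = (response: unknown) => void;
type AsyncHandler = (msg: unknown) => Promise<unknown>;

// Wraps an async handler in a listener that returns true synchronously,
// which keeps the message channel open until sendResponse is called.
function respondAsync(handler: AsyncHandler) {
  return (msg: unknown, _sender: unknown, sendResponse: SendResponse): boolean => {
    void handler(msg).then(
      (result) => sendResponse({ ok: true, result }),
      (err) => sendResponse({ ok: false, error: String(err) }),
    );
    return true;
  };
}

// Intended use in the worker (not executed here):
//   chrome.runtime.onMessage.addListener(respondAsync(handleMessage));
```

Because failures are also routed through sendResponse, the sender gets an explicit error instead of a closed port when the handler rejects.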
The third is fetching the same chrome.storage key in a tight loop. Each get/set crosses a process boundary and serializes the data. If you find yourself fetching the queue twenty times in a single message handler, hoist it once at the start and write it once at the end. Resist the urge to make every helper "self-contained" by reading storage on entry.
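The hoisting advice can be captured in a helper that reads once, hands the caller a working copy, and writes once. This is a minimal sketch assuming a promise-based storage area; `StorageAreaLike` and `withStoredState` are hypothetical names, and concurrent handlers would still need their own coordination.

```typescript
// Hypothetical minimal slice of a chrome.storage area.
interface StorageAreaLike {
  get(keys: string[]): Promise<Record<string, unknown>>;
  set(items: Record<string, unknown>): Promise<void>;
}

// Loads the listed keys once, lets the caller mutate the working copy as
// many times as it likes, then writes everything back in a single set().
async function withStoredState(
  area: StorageAreaLike,
  keys: string[],
  update: (state: Record<string, unknown>) => void | Promise<void>,
): Promise<Record<string, unknown>> {
  const state = await area.get(keys);
  await update(state);
  await area.set(state);
  return state;
}
```

Helpers called inside `update` take the working copy as a parameter instead of reading storage on entry, so a handler that touches the queue twenty times still crosses the process boundary only twice.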
The fourth, and the most insidious, is forgetting that content scripts and service workers are different execution contexts. A content script can hold long-lived state for the lifetime of its page; a service worker cannot. If you find yourself wanting to "just keep this object in memory," the right answer is to put it in storage and let the content script maintain its own per-page cache.
TL;DR
Manifest V3 service workers are stateless event handlers, and any architecture that does not treat them as such will fail under load. Move queue state into chrome.storage.session, bound your tab pool, use chrome.alarms to keep the worker alive only while work is genuinely happening, and design every state transition to be resumable from cold storage alone. Done right, a Chrome extension can process hundreds of URLs in a single run, survive a forced restart mid-job, and pick up exactly where it left off.
That is the architecture behind BulkMD's bulk dashboard — the patterns above are not theoretical; they are what ships, and they are why the queue you start at 9 AM is still there when you come back from lunch.
Frequently asked questions
Can I just use a persistent background page instead of a service worker?
Why use chrome.storage.session over IndexedDB?
What's the right cap on concurrent tabs?
Does the keep-alive alarm interfere with battery life?
How do I test that my MV3 extension actually survives worker restarts?
About the author
Independent software engineer building developer tools at Soft Web Grove. Creator and maintainer of BulkMD.