PrivaKit: WebGPU Zero‑Upload AI Workspace for OCR & Transcription

PrivaKit: a client-side Sensitive Data AI Workspace that runs OCR, transcription, and image models entirely in the browser

PrivaKit runs OCR, transcription, and image models entirely in the browser using WebGPU and WASM, so documents, images, and audio never leave your device. The tool is designed to be offline-first.

PrivaKit and the case for local AI processing

PrivaKit is presented as a Sensitive Data AI Workspace that executes inference—OCR, transcription, and image processing—directly inside the user’s browser, with the explicit aim of keeping images, documents, and audio files on the device at all times. The project positions itself for use by privacy‑sensitive professionals in HR, legal, and finance who need to extract text, remove backgrounds, or transcribe recordings without sending confidential material to cloud APIs. This article lays out what PrivaKit reports about its technical approach, how data moves (and crucially, does not move), the verification steps it recommends, and what those choices mean for developers, businesses, and privacy-minded users.

Client-side inference stack: ONNX Runtime Web and Transformers.js

PrivaKit’s inference layer is built on browser-capable machine learning runtimes. The team cites ONNX Runtime Web and Transformers.js as the core inference engines it uses to run models locally. Supported local models are described in two categories: vision models for background removal, upscaling, and OCR, and audio models for transcription—explicitly naming Whisper as an example for the latter. According to the material, model weights are distributed as .onnx files and downloaded to the client.

That stack is designed to bring “server-grade models” to the browser so that model execution can occur entirely on the user’s machine. The documentation emphasizes that the software downloads model weights to the device and that those weights are cached permanently after the first load, framing the flow as “we download the model to you; we never upload your files to the model.”
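The download-once, reuse-locally behavior described above can be sketched as a cache-first loader. This is an illustration, not PrivaKit's code: the names `ModelStore`, `MemoryStore`, and `loadModelWeights` are hypothetical, and the store is injected so the sketch does not assume which browser storage (for example the Cache API or IndexedDB) the project actually uses.

```typescript
// Hypothetical storage interface. In a browser it could be backed by the
// Cache API or IndexedDB; which one PrivaKit uses is not stated in the source.
interface ModelStore {
  get(url: string): Promise<Uint8Array | undefined>;
  put(url: string, bytes: Uint8Array): Promise<void>;
}

// Hypothetical downloader, e.g. a thin wrapper around fetch().
type Downloader = (url: string) => Promise<Uint8Array>;

// Cache-first load: only the first call for a given URL touches the network.
async function loadModelWeights(
  url: string,
  store: ModelStore,
  download: Downloader,
): Promise<{ bytes: Uint8Array; fromCache: boolean }> {
  const cached = await store.get(url);
  if (cached !== undefined) {
    return { bytes: cached, fromCache: true };
  }
  const bytes = await download(url);
  await store.put(url, bytes);
  return { bytes, fromCache: false };
}

// In-memory store used only to demonstrate the flow.
class MemoryStore implements ModelStore {
  private readonly entries = new Map<string, Uint8Array>();

  async get(url: string): Promise<Uint8Array | undefined> {
    return this.entries.get(url);
  }

  async put(url: string, bytes: Uint8Array): Promise<void> {
    this.entries.set(url, bytes);
  }
}
```

The point of the design is the direction of travel: the only bytes that cross the network are model weights moving toward the user, and a second load is served from local storage.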

Hardware acceleration and fallback paths: WebGPU primary, WASM fallback

PrivaKit describes a two-tiered approach to hardware acceleration inside the browser. The primary path uses WebGPU to run inference in parallel on the local GPU; because no request leaves the machine, the project describes inference as having effectively zero network latency. When a compatible GPU or WebGPU implementation is not available, PrivaKit falls back to WebAssembly (WASM), running on the CPU while taking advantage of SIMD instructions where possible. This combination is presented as a way to deliver on-device performance across a range of machines while preserving the local-only processing guarantee.
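A minimal sketch of that selection logic follows, assuming detection is done through the standard `navigator.gpu.requestAdapter()` call. The function name `pickBackend` and the `GpuCapableNavigator` type are hypothetical; the source does not show how PrivaKit performs this check.

```typescript
type Backend = "webgpu" | "wasm";

// Structural subset of the browser's Navigator, declared here so the sketch
// also type-checks outside a browser. `gpu` is absent where WebGPU is not
// implemented.
interface GpuCapableNavigator {
  gpu?: { requestAdapter(): Promise<object | null> };
}

// Prefer WebGPU when an adapter is actually available; otherwise use WASM.
async function pickBackend(nav: GpuCapableNavigator): Promise<Backend> {
  if (!nav.gpu) {
    return "wasm"; // the browser does not expose WebGPU at all
  }
  try {
    // requestAdapter() resolves to null when no suitable GPU is found.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm";
  } catch {
    return "wasm"; // treat any failure as "no usable GPU"
  }
}
```

In a browser this would be called as `pickBackend(navigator)`, and the result would decide which execution backend the inference runtime is asked to use.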

Keeping the UI responsive: Web Workers for concurrency

To avoid blocking the main thread during compute-heavy tasks, PrivaKit offloads heavy inference workloads to background threads using the Web Workers API. The documentation calls out examples such as transcribing a 30-minute audio file as work that would run in a worker, allowing the user interface to remain responsive while inference proceeds in the background.
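One way to structure that hand-off is a small typed message protocol between the page and the worker. The sketch below is an assumption about shape, not PrivaKit's implementation: `Task`, `TaskResult`, `Engine`, and `handleTask` are hypothetical names, and the engine is injected so the handler can be exercised without a browser.

```typescript
// Messages the page sends to the worker.
type Task =
  | { id: number; kind: "ocr"; image: Uint8Array }
  | { id: number; kind: "transcribe"; audio: Float32Array };

// Messages the worker sends back.
type TaskResult =
  | { id: number; ok: true; text: string }
  | { id: number; ok: false; error: string };

// Hypothetical wrapper around whatever runtime performs the inference.
interface Engine {
  ocr(image: Uint8Array): Promise<string>;
  transcribe(audio: Float32Array): Promise<string>;
}

// Runs one task and always resolves with a result message, so the page can
// match replies to requests by id and surface failures in the UI.
async function handleTask(task: Task, engine: Engine): Promise<TaskResult> {
  try {
    const text =
      task.kind === "ocr"
        ? await engine.ocr(task.image)
        : await engine.transcribe(task.audio);
    return { id: task.id, ok: true, text };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    return { id: task.id, ok: false, error };
  }
}

// Browser-only wiring, shown as comments because it needs a worker context.
// The file name is hypothetical.
//   In the worker:  self.onmessage = async (e) =>
//                     self.postMessage(await handleTask(e.data, engine));
//   On the page:    const worker = new Worker(
//                     new URL("./inference.worker.js", import.meta.url),
//                     { type: "module" });
//                   worker.postMessage(task);
```

Because the long-running `await` happens inside the worker, the main thread only sends a message and later receives one, which is what keeps the interface responsive during a long transcription.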

Data flow and the zero-upload claim

PrivaKit provides a data flow diagram that traces how user files are handled from initial page load through inference to final output. The diagram describes these stages:

Page assets (HTML, JavaScript, CSS) are served to the user from a Vercel CDN.
Model weights (*.onnx) are downloaded from Hugging Face or a CDN; this is described as a one-time download that is cached permanently after first load.
A separation line labelled as a firewall indicates that no user data crosses into the internet/cloud side once models and assets have been delivered.
User inputs—images, PDFs, and audio files—are kept as blobs in browser memory.
Those blobs are dispatched to a Web Worker, which invokes ONNX Runtime (and the local compute stack), sending compute to the local GPU when available.
The outputs—masks for background removal, OCR text, or audio transcripts—are rendered to the canvas or DOM and available for user download.

Taken together, the diagram and accompanying notes are the project’s technical statement that inference occurs locally and that user data never leaves the device during that process.

How to verify the local-only processing: recommended audits

PrivaKit provides two audit methods for technical users who want to verify the zero-upload claim themselves.

The Airplane Mode test: Load PrivaKit, wait for the model to download and the UI to report “Ready,” then disconnect from the network (enable airplane mode or unplug Ethernet). With the network disabled, drop an audio recording or image into the tool; the expected result is that the AI processes the file completely offline. The documentation presents this as an accessible, practical test that distinguishes client-side inference from cloud-dependent wrappers.
A DevTools network audit: Open Chrome DevTools and watch the Network tab while selecting and processing a file. The documentation instructs auditors to confirm that no POST requests contain file blobs or form data, and that no external API endpoints are called during inference. This is presented as a more technical check for auditors comfortable with browser tooling.

Both methods are framed as simple and verifiable ways to confirm that user files are not being transmitted during inference.
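The DevTools audit can also be made repeatable by exporting the Network tab as a HAR file and scanning it. The sketch below is not part of PrivaKit: `findSuspectedUploads`, the MIME prefix list, and the 4096-byte body threshold are all assumptions chosen for illustration. The field names (`log.entries`, `request.method`, `request.bodySize`, `request.postData.mimeType`) follow the HAR 1.2 format.

```typescript
// Subset of the HAR 1.2 structure that the audit needs.
interface HarEntry {
  request: {
    method: string;
    url: string;
    bodySize: number; // -1 when the size is unknown
    postData?: { mimeType: string };
  };
}

interface Har {
  log: { entries: HarEntry[] };
}

// Content types that suggest a file is being sent (an illustrative list).
const UPLOAD_MIME_PREFIXES = [
  "multipart/form-data",
  "application/octet-stream",
  "application/pdf",
  "audio/",
  "image/",
];

// Returns requests that look like file uploads: a body-carrying method with
// either a file-like content type or a body larger than maxBodyBytes.
// Small JSON posts, such as analytics pings, fall below the threshold.
function findSuspectedUploads(har: Har, maxBodyBytes = 4096): HarEntry[] {
  return har.log.entries.filter(({ request }) => {
    const method = request.method.toUpperCase();
    if (method !== "POST" && method !== "PUT" && method !== "PATCH") {
      return false;
    }
    const mime = (request.postData?.mimeType ?? "").toLowerCase();
    const looksLikeFile = UPLOAD_MIME_PREFIXES.some((p) => mime.startsWith(p));
    return looksLikeFile || request.bodySize > maxBodyBytes;
  });
}
```

An empty result is consistent with the zero-upload claim for that session; any entry returned is a request worth inspecting by hand rather than proof of a leak.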

What network traffic the site does generate

PrivaKit documents the limited set of network requests users should expect:

HTML, JavaScript, and CSS from privakit.ai to render the UI.
Model files (.onnx) from huggingface.co or a CDN to provide the AI weights; these are downloaded to the user and stored locally after the first load. The project emphasizes that those downloads are one‑time and that the downloaded models are used locally—files are sent to the user, not the other way around.
Anonymous, aggregate analytics calls to a self-hosted analytics instance for usage metrics. The documentation asserts that these analytics collect zero personally identifiable information.

Those three categories are presented as the only expected web-facing interactions; anything beyond them is characterized as a discrepancy the project asks users to report.

Privacy and analytics: no third-party trackers, Plausible self-hosted

PrivaKit states that it avoids Google Analytics and third-party tracking scripts entirely. Instead, the project reports using an open-source Plausible Analytics instance, self-hosted on an isolated lightweight node (specified as 2-core, 4 GB) in Singapore. The privacy posture rests on three claims: no cookies are set, neither IP addresses nor personal identifiers are logged, and the analytics are ad-blocker friendly, meaning that if a browser extension blocks the self-hosted Plausible script, the AI tools themselves remain fully functional. The team invites users to open issues if they find discrepancies relative to these promises, reflecting a stated commitment to data sovereignty.

Who PrivaKit is aimed at and what it does for them

The material explicitly frames PrivaKit for professionals in HR, legal, and finance who handle sensitive documents—employment contracts, identity documents, or confidential meeting recordings—and who would prefer to avoid sending those materials to cloud APIs to perform basic AI tasks like text extraction or background removal. By keeping inference local, PrivaKit’s stated value proposition is privacy preservation for workflows that commonly involve regulated or sensitive data.

Implications for developers and enterprises

PrivaKit’s approach of shipping models to the browser and running inference locally with WebGPU/WASM and ONNX illustrates a broader pattern in which web applications move compute to the client to minimize data egress. The project highlights operational choices that enterprises and developers may consider when privacy is a primary requirement: using browser runtimes capable of local model execution, documenting caching behavior for model weights, and offering straightforward verification steps so technical staff can confirm that no user data leaves the device during inference. The documentation also raises practical operational points: where static site assets are hosted (Vercel CDN), where model binaries are hosted (CDN/Hugging Face), and the choice of self-hosted, privacy-oriented analytics rather than third-party trackers.

Transparency, auditability, and user trust

PrivaKit’s documentation emphasizes auditability as part of its trust model: it gives explicit instructions for local verification and enumerates the limited network interactions users should observe. The project’s openness about where assets and model files come from—and its invitation to open issues if users spot inconsistencies—are positioned as mechanisms to increase accountability. The combination of self-hosted analytics with claims of no cookies and no PII collection is presented as a concrete privacy stance that complements local inference.

PrivaKit’s architecture also highlights trade-offs that organizations will need to weigh for themselves: the requirement to download model weights to each device, and the need for browser and hardware capabilities (WebGPU support or a performant WASM fallback) to achieve the desired local performance. The documentation’s concrete guidance on verifying the zero-upload behavior helps organizations confirm that those trade-offs actually translate to reduced data egress in practice.

PrivaKit’s materials repeatedly stress a single guiding principle: your files stay on your device during inference. That claim is supported in the documentation by a combination of the stack description (ONNX Runtime Web, Transformers.js), the hardware paths (WebGPU primary, WASM fallback), the concurrency model (Web Workers), the explicit data flow diagram, and the two verification techniques it recommends.

If you rely on cloud APIs for OCR, transcription, or image edits today, PrivaKit’s approach suggests an alternative pattern for privacy-sensitive workflows: move the model to the user, run inference locally, and make the verification steps transparent and repeatable.

Looking ahead, the documentation’s emphasis on local execution, explicit verification methods, and privacy-forward analytics suggests an evolving set of expectations for privacy-conscious tools: clear supply-chain signals for model binaries, explicit caching behavior, and auditable, minimal network footprints for UI assets and analytics. Those elements will likely shape how teams evaluate tools for regulated workflows and may influence developer tooling, browser runtime support, and procurement policies for environments that must avoid third‑party data egress.