Invoice Generator: Two-Tier Regex and gpt-4o-mini Parser with react-pdf in Next.js

Invoice Generator, built in a week, converts natural-language lines into polished PDF invoices

Invoice Generator gives freelancers professional PDF invoices from free text, using a regex-first parser, a gpt-4o-mini fallback, and @react-pdf/renderer.

Building an Invoice Generator in a Week and Why It Matters

Invoice Generator is an experiment in rapid product iteration: a small-scope SaaS that turns plain English lines such as "Invoice Acme for 10 hrs at £100/hr" into a finished, downloadable PDF invoice. Built as Week 8 of the NanoCrafts curriculum, the project tested whether the stack and patterns established while building Resume AI Tailor could be re-used to ship a different, useful SaaS quickly. The result is a minimal, pragmatic tool that emphasizes frictionless input, predictable PDF output, and server-side rendering—all designed to validate the reusability of prior work and reveal the engineering trade-offs of shipping fast.

How the NLP Parser Extracts Invoice Data

At the heart of this invoice generator is an NLP parser designed to convert unconstrained user text into structured invoice fields—client name, hours, rate, and currency. Rather than relying solely on large-model interpretation, the system uses a two-tier approach: a deterministic, regex-first path that handles the most common inputs, and an OpenAI fallback for ambiguous or unusual phrasing.

This design aims to minimize latency, cost, and surprising behavior for routine inputs while preserving robustness when user text falls outside the expected patterns. The parser returns an explicit confidence value that the UI exposes as badges—"Auto-filled" when the regex path succeeds, and "AI-assisted" when the model is used—letting users know whether a suggestion can be trusted or should be checked.
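The dispatch between the two tiers can be sketched as follows. This is a minimal illustration, not the project's actual code: the type names, the parseInvoiceText and badgeFor functions, and the "high"/"low" confidence values are assumptions, and only the two badge labels come from the description above. Both parsers are passed in as parameters so the sketch stays self-contained.

```typescript
// Hypothetical shapes: the article does not show the project's real types.
type Currency = "GBP" | "USD" | "EUR";

interface InvoiceFields {
  client: string;
  hours: number;
  rate: number;
  currency: Currency;
}

type Confidence = "high" | "low";

interface ParseResult {
  fields: InvoiceFields;
  confidence: Confidence;
}

// Tier 1 is synchronous and local; tier 2 stands in for the model call.
type FastParser = (text: string) => InvoiceFields | null;
type FallbackParser = (text: string) => Promise<InvoiceFields>;

async function parseInvoiceText(
  text: string,
  fast: FastParser,
  fallback: FallbackParser,
): Promise<ParseResult> {
  const fastFields = fast(text);
  if (fastFields !== null) {
    // Regex matched every required field, so the fallback is never called.
    return { fields: fastFields, confidence: "high" };
  }
  return { fields: await fallback(text), confidence: "low" };
}

// Map the confidence value to the badge label shown next to the form.
function badgeFor(confidence: Confidence): string {
  return confidence === "high" ? "Auto-filled" : "AI-assisted";
}
```

Keeping the confidence value separate from the badge text means the UI can change its wording or colours without touching the parser.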

The Regex Fast Path

The first tier is intentionally simple and local: regular expressions running in the browser or server that extract client name, hours, and rate from compact sentence patterns. Examples that the regex is intended to cover include lines like:

Invoice Acme for 10 hrs at £100/hr
Bill TechCorp 5 hours at $75 per hour
Charge Wonka Co 3hrs £120/hr
Send invoice to Globex for 8 hours, rate £50

When the regex finds all required fields, the parser also infers currency from symbols (£ → GBP, $ → USD, € → EUR), falling back to GBP when no symbol is present. These deterministic matches avoid external API calls and provide instant, low-cost parsing for straightforward inputs.
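A regex fast path of this kind might look like the sketch below. The patterns are assumptions written to cover the four example lines above, not the project's actual expressions, and parseWithRegex is a hypothetical name. Each field gets its own small pattern, and the function returns null when any required field is missing so that a caller can hand the text to the fallback.

```typescript
type Currency = "GBP" | "USD" | "EUR";

interface InvoiceFields {
  client: string;
  hours: number;
  rate: number;
  currency: Currency;
}

const CURRENCY_BY_SYMBOL: Record<string, Currency> = { "£": "GBP", "$": "USD", "€": "EUR" };

// Client: text after the leading verb, up to the first number (optionally preceded by "for").
const CLIENT = /^(?:send\s+)?(?:invoice|bill|charge)\s+(?:to\s+)?(.+?)(?=\s+(?:for\s+)?\d)/i;
// Hours: a number directly followed by "hours", "hour", "hrs" or "hr".
const HOURS = /(\d+(?:\.\d+)?)\s*(?:hours?|hrs?)\b/i;
// Rate: a number after a currency symbol, or after "at"/"rate" when no symbol is given.
const RATE_WITH_SYMBOL = /([£$€])\s*(\d+(?:\.\d+)?)/;
const RATE_BARE = /\b(?:at|rate)\s+(\d+(?:\.\d+)?)/i;

function parseWithRegex(text: string): InvoiceFields | null {
  const input = text.trim();
  const client = CLIENT.exec(input)?.[1]?.trim();
  const hoursText = HOURS.exec(input)?.[1];
  const symbolRate = RATE_WITH_SYMBOL.exec(input);
  const rateText = symbolRate?.[2] ?? RATE_BARE.exec(input)?.[1];

  // Any missing field means the fast path does not apply.
  if (!client || hoursText === undefined || rateText === undefined) return null;

  const symbol = symbolRate?.[1];
  return {
    client,
    hours: Number(hoursText),
    rate: Number(rateText),
    // GBP is the default when no recognised symbol is present.
    currency: (symbol && CURRENCY_BY_SYMBOL[symbol]) || "GBP",
  };
}
```

Splitting the extraction into one pattern per field keeps each expression short and lets word order vary, at the cost of not handling client names that contain digits.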

The OpenAI Fallback and Structured Output

When input fails to match the regex patterns—for example, because of ambiguous wording, missing elements, or unusual structure—the system delegates parsing to a tightly scoped OpenAI model prompt using gpt-4o-mini. The model is asked to return structured JSON representing the invoice fields, which the application parses and maps into the same form UI used by the regex path.

Crucially, the fallback returns a lower confidence level that the UI surfaces with a yellow "AI-assisted" badge. Users always see the same editable form regardless of path, so they can correct or confirm fields before saving. That small confidence indicator reduces cognitive load by communicating how much verification the user should apply to the auto-fill results.
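Because model output is untrusted text, the step that maps the returned JSON into form fields benefits from explicit validation. The sketch below covers only that step and omits the API call itself. The field names, the fromModelJson function, and the choice to default an unrecognised currency to GBP are assumptions rather than details taken from the project.

```typescript
type Currency = "GBP" | "USD" | "EUR";

interface InvoiceFields {
  client: string;
  hours: number;
  rate: number;
  currency: Currency;
}

interface AiAssistedResult {
  fields: InvoiceFields;
  confidence: "low"; // drives the yellow "AI-assisted" badge
}

const CURRENCIES: readonly string[] = ["GBP", "USD", "EUR"];

function isCurrency(value: unknown): value is Currency {
  return typeof value === "string" && CURRENCIES.includes(value);
}

// Parse and validate the raw JSON string returned by the model.
// Returns null when the payload is unusable, so the UI can show an empty form.
function fromModelJson(raw: string): AiAssistedResult | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON at all
  }
  if (typeof data !== "object" || data === null) return null;

  const { client, hours, rate, currency } = data as Record<string, unknown>;
  if (typeof client !== "string" || client.trim() === "") return null;
  if (typeof hours !== "number" || !Number.isFinite(hours) || hours <= 0) return null;
  if (typeof rate !== "number" || !Number.isFinite(rate) || rate < 0) return null;

  return {
    fields: {
      client: client.trim(),
      hours,
      rate,
      currency: isCurrency(currency) ? currency : "GBP",
    },
    confidence: "low",
  };
}
```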

PDF Generation with @react-pdf/renderer

For PDF output the project re-used an integration previously developed for Resume AI Tailor: @react-pdf/renderer. The PDF layout is defined as React components, rendered server-side into a binary buffer that streams back to the browser as a downloadable file. The invoice PDF itself is a single-page document composed of five sections: a branded header (including the NanoCrafts name and invoice number), a bill-to block with client and due date, a line-items table with hours, rate, and amount columns, a totals block showing subtotal, VAT at 20%, and grand total, plus a payment terms footer.

Reusing @react-pdf/renderer substantially reduced implementation time because the author had already solved related server-client rendering conflicts in an earlier project. Rendering proceeds by producing a Node.js Buffer and converting it to a typed array before returning an HTTP Response with appropriate Content-Type and Content-Disposition headers so the browser downloads the PDF as an attachment.
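The conversion and header step described here can be sketched in isolation. In the sketch below the Buffer is taken as a parameter; in the project it would come from renderToBuffer, which is left out so the example runs without the library. The pdfResponse name and the filename format are assumptions.

```typescript
// Wrap an already-rendered PDF in a Response the browser will download.
function pdfResponse(pdf: Buffer, filename: string): Response {
  // Copy the Buffer into a plain Uint8Array, which is accepted as a Response body.
  const bytes = new Uint8Array(pdf);
  return new Response(bytes, {
    headers: {
      "Content-Type": "application/pdf",
      // "attachment" asks the browser to download rather than display inline.
      "Content-Disposition": `attachment; filename="${filename}"`,
      "Content-Length": String(bytes.byteLength),
    },
  });
}
```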

Implementation Gotchas and Server-Side Details

Several practical integration details proved important during the build:

The correct import for the renderer is a named export: renderToBuffer must be imported directly from @react-pdf/renderer rather than accessed off a default export. Using the wrong import pattern leads to runtime errors.
renderToBuffer returns a Node.js Buffer, which is not directly assignable as a Response body in the Next.js App Router. The Buffer must be converted into a Uint8Array (or similar) before constructing the Response.
In Next.js, @react-pdf/renderer must be listed in next.config.ts under serverExternalPackages so that Next.js leaves the library out of its server bundle and loads it through Node.js at runtime. This server-side configuration carried over from the previous project and saved time.
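For the configuration item above, a minimal next.config.ts might look like this. It is a sketch that shows only the one setting discussed; the project's real config file is not shown in the article and may contain other options. The NextConfig type annotation is omitted here to keep the fragment free of imports.

```typescript
// next.config.ts (sketch): only the setting discussed above is shown.
const nextConfig = {
  // Packages listed here are not bundled by Next.js on the server.
  serverExternalPackages: ["@react-pdf/renderer"],
};

export default nextConfig;
```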

The project exposes the PDF through a straightforward endpoint: GET /api/invoices/[id]/pdf. That route fetches the invoice record from the Neon database, validates ownership against the Clerk userId, renders the PDF server-side, and streams the bytes back. The UI links to this route with a plain anchor so downloads can happen without client-side JavaScript.
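The order of checks in that route can be sketched with the external services passed in as functions. Everything named here is hypothetical: getUserId stands in for the Clerk auth call, findInvoice for the database query, and renderPdf for the @react-pdf/renderer step. Returning 404 rather than 403 for another user's invoice is also an assumption, chosen so the response does not reveal which ids exist.

```typescript
interface InvoiceRecord {
  id: string;
  userId: string; // owner of the invoice
}

// Hypothetical dependencies, injected so the sketch runs without Clerk, Neon or react-pdf.
interface PdfRouteDeps {
  getUserId: () => Promise<string | null>;
  findInvoice: (id: string) => Promise<InvoiceRecord | null>;
  renderPdf: (invoice: InvoiceRecord) => Promise<Uint8Array>;
}

async function handleInvoicePdf(invoiceId: string, deps: PdfRouteDeps): Promise<Response> {
  // 1. Require a signed-in user.
  const userId = await deps.getUserId();
  if (userId === null) return new Response("Unauthorized", { status: 401 });

  // 2. Load the invoice and check ownership before doing any rendering work.
  const invoice = await deps.findInvoice(invoiceId);
  if (invoice === null || invoice.userId !== userId) {
    return new Response("Not found", { status: 404 });
  }

  // 3. Render and return the bytes as a download.
  const bytes = await deps.renderPdf(invoice);
  return new Response(new Uint8Array(bytes), {
    headers: {
      "Content-Type": "application/pdf",
      "Content-Disposition": `attachment; filename="invoice-${invoice.id}.pdf"`,
    },
  });
}
```

Because the handler answers a plain GET with an attachment, an ordinary anchor pointing at the route is enough to trigger the download.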

What Carried Over from Resume AI Tailor

A primary motivation behind the NanoCrafts curriculum is compounding knowledge. For Invoice Generator, multiple infrastructure pieces and patterns transferred directly:

Authentication and user management used the same Clerk v7 setup, with proxy.ts in place of middleware.ts to conform to a Next.js 16 convention. Protected routes, async auth calls, and route matcher utilities were reusable.
Database tooling and schema patterns used Drizzle ORM with Neon Postgres, including the same singleton DB pattern and drizzle.config.ts workflow.
The server-side PDF rendering pattern with @react-pdf/renderer had already been worked out in Week 4, meaning the author spent minutes instead of hours on similar problems.
Deployment and CI were pre-established with a Vercel pipeline: GitHub repo connected, environment variables set via the CLI, and predictable production deployments.

Together, these carryovers saved an estimated four hours on the week’s work, according to the developer’s notes—illustrating how earlier investment in stable patterns reduced friction for subsequent projects.

What Was Genuinely New

Despite the reused infrastructure, the build did include new engineering work. The two-tier NLP parser was the central novel component: Resume AI Tailor accepted structured file uploads, while Invoice Generator needed to interpret unconstrained human text. Designing a regex-first approach with an LLM fallback, and exposing a confidence level that directly maps to UI badges, was the most interesting and original engineering effort of the week.

Another new area was optimistic UI for invoice status changes. The invoice dashboard requires immediate feedback when marking an invoice as sent or paid; the interface updates local state instantly and rolls back changes on failure. Though a small pattern, it was necessary for a responsive, real-world dashboard.
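The update-then-roll-back pattern can be sketched independently of any UI framework. The StatusStore interface, the updateStatusOptimistically function, and the "draft" status are assumptions made for illustration; in a React component the store would typically be a piece of component state and save would call the API.

```typescript
type InvoiceStatus = "draft" | "sent" | "paid";

// Minimal stand-in for local UI state.
interface StatusStore {
  get(): InvoiceStatus;
  set(status: InvoiceStatus): void;
}

// Apply the new status immediately, then confirm with the server.
// Resolves to true on success and false when the change was rolled back.
async function updateStatusOptimistically(
  store: StatusStore,
  next: InvoiceStatus,
  save: (status: InvoiceStatus) => Promise<void>,
): Promise<boolean> {
  const previous = store.get();
  store.set(next); // the user sees the change before the request finishes
  try {
    await save(next);
    return true;
  } catch {
    store.set(previous); // request failed: restore the earlier status
    return false;
  }
}
```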

Where Time Was Lost

Not all delays came from product work. The biggest time sink was an environmental tooling bug: a workspace root detection problem involving Turbopack, Tailwind v4, and Next.js 16. When the new project lived within a parent directory that already contained package.json or package-lock.json, Turbopack incorrectly walked up the folder tree and resolved CSS imports from the wrong root. Diagnosing and fixing that issue cost about three hours.

From this experience the author recommends a simple pattern: start each week’s project fresh with create-next-app in a clean directory, then manually copy only the files you need from previous work. That small upfront time investment avoids the kind of environment issues that are easy to overlook and hard to debug.

Schema Trade-offs and Planned Fixes

To ship quickly, the initial database schema simplified invoice persistence: each invoice row stores a single hours value and a single rate value, even though the UI supports multiple line items. In practice, real invoices commonly contain several line items (design work, development, meetings), so the author plans to add a separate line_items table with a foreign key to invoices in a future iteration.

Another deliberate shortcut was hardcoding VAT at 20% in the PDF totals block. While that simplifies the initial implementation and suits UK freelancers, it doesn’t work for U.S. users who don’t charge VAT or for other countries with varying VAT rates. The stated plan is to introduce a vatRate field on invoices with a sensible default, a change that the author says would have taken about thirty minutes to add at the outset.

These examples highlight a recurring trade-off: shipping fast sometimes means deliberately choosing imperfections that will need to be paid back later.
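To make the two planned fixes concrete, the sketch below models line items that reference their invoice by id and computes totals from a per-invoice vatRate instead of a hardcoded 20%. The interfaces, field names, and computeTotals function are hypothetical; the article states only the intent to add a line_items table and a vatRate field with a default.

```typescript
// Hypothetical row shapes for the planned schema.
interface Invoice {
  id: string;
  vatRate: number; // e.g. 0.2 for 20%, 0 for no VAT
}

interface LineItem {
  invoiceId: string; // foreign key to Invoice.id
  description: string;
  hours: number;
  rate: number;
}

const DEFAULT_VAT_RATE = 0.2; // assumed default, matching the current hardcoded value

// Round to two decimal places to keep floating-point noise out of displayed totals.
const roundToCents = (amount: number): number => Math.round(amount * 100) / 100;

function computeTotals(invoice: Invoice, items: LineItem[]) {
  const subtotal = roundToCents(
    items
      .filter((item) => item.invoiceId === invoice.id)
      .reduce((sum, item) => sum + item.hours * item.rate, 0),
  );
  const vat = roundToCents(subtotal * invoice.vatRate);
  return { subtotal, vat, total: roundToCents(subtotal + vat) };
}
```

With two items of 10 hours at 100 and 5 hours at 80, the subtotal is 1400; a vatRate of 0.2 adds 280 for a total of 1680, while a vatRate of 0 leaves the total at 1400.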

Developer Workflow and Deployment Notes

The project’s API routes, database access, authentication, and PDF rendering all ran in a Next.js 16 environment with the App Router. The author emphasized that the download link for each invoice is a standard anchor element pointing at the server route—no client-side JavaScript is required for downloading PDFs. That choice reduces complexity and improves reliability for basic user flows.

For deployment, a conventional Vercel pipeline was used: repository connected to Vercel, production environment variables configured via the CLI, and standard deploy flags. The prior project’s deployment patterns carried over without surprises.

Broader Implications for Developers and Businesses

Invoice Generator demonstrates a practical pattern for shipping small, focused SaaS quickly by combining deterministic parsing with an LLM safety net and reusing proven infrastructure. For developers, the two-tier parser is a useful template: handle common, structured inputs locally to save latency and costs, and fall back to a model only when necessary. Exposing a confidence indicator in the UI is a low-effort design choice that improves user trust and reduces friction in accepting auto-filled fields.

For small businesses and freelancers, the project highlights how even simple automation—typing a line and getting a downloadable invoice—can eliminate repetitive work and reduce cognitive overhead. The server-side PDF rendering approach using @react-pdf/renderer keeps document generation controllable and auditable on the backend, which aligns with business needs for reliable invoicing and storage.

From a product development standpoint, this build underscores the value of investing in reusable patterns (authentication, ORM configuration, and rendering pipelines), which can compound into significant time savings over multiple projects. At the same time, the project illustrates the cost of shortcuts in data modeling and localization: small initial compromises like a schema limited to one line item per invoice or hardcoded VAT create maintenance and product friction that must be addressed in subsequent versions.

What the Author Would Do Differently Next Time

Reflecting on the week’s work, the author identifies three concrete changes they would make on a repeat build:

Start each project from a fresh create-next-app scaffold rather than cloning an existing repo to avoid workspace root detection and tooling resolution bugs.
Persist multiple line items from day one by introducing a line_items table, which better matches real-world invoicing needs.
Make VAT configurable by adding a vatRate field with a default rather than hardcoding 20%, improving international suitability at minimal cost.

These adjustments are framed not as failures but as lessons: practices that save time in the short term can increase friction later, while upfront investments in correct modeling and clear boundaries pay dividends.

The invoice generator is available live at invoice-generator-six-roan-49.vercel.app and the source code is published at github.com/Azeez1314/invoice-generator. The project sits within an ongoing curriculum: Week 9 is already planned and progress updates are shared on social channels.

Looking forward, the patterns validated by this build point toward a practical roadmap: iterate on data modeling to support full line-item persistence, add configuration for tax and currency handling to expand international usefulness, and continue refining the regex-first plus LLM-fallback parsing strategy to balance cost, latency, and reliability. The project is a compact demonstration of how a small, well-scoped product can be used to validate engineering patterns and inform future product choices across invoicing, document generation, and other productivity automation scenarios.