~ $ whoami

Gustavo Paulo

senior backend · AI engineer · cloud

I design and run distributed systems with applied AI in production — from RAG with observability to multi-tenant platforms with queues, fragile integrations and self-managed deploys.

available for projects and opportunities · get in touch

~ $

~ $ ls stack/

Stack & skills

Backend/

Node.js / TypeScript
Python / FastAPI
GraphQL (Apollo, Nexus)
REST

AI & LLMs/

LangChain / LangGraph
Hybrid RAG (dense + BM25 + reranker)
Tool-calling
Langfuse / LangSmith
OpenRouter / Vertex AI / OpenAI

Cloud & Infra/

Docker / Compose
GCP (BigQuery, Vertex, Cloud Storage)
Cloudflare (Pages, Tunnels)
Linux / systemd
Caddy / Nginx
PM2 / Gunicorn

Data/

PostgreSQL (multi-tenant schemas, partial indexes, pgvector)
BigQuery
MongoDB
Redis (BullMQ, Streams, arq)

Frontend/

React + TypeScript
Next.js
Vite
Tailwind + shadcn/ui

~ $ cat contact.md

Contact

If you want to talk about a role, a project or an idea, drop me a line.

· email: [email protected]
· github: @gpaulo00

CV available on request — ask me by email.

~ $ cat about.md

About

Senior backend engineer with 7 years of experience (since 2019). My base is Node/TypeScript and Python, but I'm comfortable crossing domains: database performance, async workers, LLM integration, reporting, and some networking and infrastructure when needed.

I've worked as a tech lead on critical SaaS platforms with a team, and also as a solo developer taking products from zero to production. That mix gave me the judgment to decide when to decouple, when to scale, and when to reduce complexity before it grows.

My current focus is applied AI (RAG, LangGraph agents, tool-calling) and pragmatic cloud/DevOps — the combination that delivers the most value when done well.

~ $ ls projects/featured/

Featured projects

Softwariza AI Triage

FastAPI + LangGraph backend that triages, enriches and answers support tickets for a fiscal ERP using RAG and multi-provider LLMs.

Problem

Support teams for a fiscal-accounting ERP receive thousands of tickets with incomplete information and very domain-specific vocabulary. The system classifies them, asks for missing data, retrieves context from official guides and historical cases, and prepares a response or summary for the human advisor — reducing time-to-first-response and manual triage load.

My role

Sole backend developer (~325 of ~340 commits). I designed and built it end to end: API, LangGraph orchestration, RAG layer, anonymization, XLSX ingestion, review dashboard, observability, migrations and CI.

Technical decisions

LangGraph state graph instead of linear chains: separates collection, classification, retrieval and finalization nodes, with conditional routing (follow-up vs. human escalation).
Hybrid retrieval: dense in Qdrant + BM25 with per-application filter + Vertex AI reranker, with historical boost and adaptive threshold. Better recall for domains with very specific jargon.
Shadow mode + incremental v2 curated dataset to iterate on prompts and models, measuring p85 without touching production.
LiteLLM + Langfuse as a unified layer: switch between Gemini Flash, Flash Lite and OpenAI without touching graph nodes.
Three-layer anonymization before the LLM to avoid exposing sensitive fiscal data to external providers.
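
A minimal sketch of the hybrid retrieval idea, not the production code. It assumes reciprocal rank fusion (RRF) to merge the dense and BM25 rankings, which is a swapped-in technique chosen for illustration. The names `Hit`, `fuse`, `historical_boost` and `keep_ratio` are hypothetical, and the reranker step is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    doc_id: str
    is_historical: bool = False  # hypothetical flag: document is a past ticket


def fuse(dense: list[Hit], sparse: list[Hit], k: int = 60,
         historical_boost: float = 1.2,
         keep_ratio: float = 0.5) -> list[tuple[str, float]]:
    """Merge two ranked lists with RRF, boost historical hits, then cut."""
    scores: dict[str, float] = {}
    historical: set[str] = set()
    for ranking in (dense, sparse):
        for rank, hit in enumerate(ranking, start=1):
            # RRF: each list contributes 1 / (k + rank) to the document score.
            scores[hit.doc_id] = scores.get(hit.doc_id, 0.0) + 1.0 / (k + rank)
            if hit.is_historical:
                historical.add(hit.doc_id)
    for doc_id in historical:
        scores[doc_id] *= historical_boost
    if not scores:
        return []
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # Adaptive threshold (assumed form): relative to the best score, not fixed.
    cutoff = ranked[0][1] * keep_ratio
    return [(doc_id, score) for doc_id, score in ranked if score >= cutoff]


dense_hits = [Hit("a"), Hit("b"), Hit("c")]
bm25_hits = [Hit("b"), Hit("d", is_historical=True)]
top_ids = [doc_id for doc_id, _ in fuse(dense_hits, bm25_hits)]
```

A relative cutoff keeps the result set small when one document clearly dominates and wider when scores are close, which is one plausible reading of "adaptive threshold".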

What it shows

Taking an applied-AI system from zero to production alone — architecture, non-trivial RAG, observability, shadow-mode evaluation and deployment — while keeping the code testable and the domain (fiscal data, sensitive) under control.

Stack

Python 3.11 · FastAPI · LangGraph + LangChain · LiteLLM (Gemini, OpenAI) · Qdrant + BM25 · Vertex AI reranker · PostgreSQL + SQLAlchemy/Alembic · APScheduler · Langfuse · Docker

About the code

Commercial project. I can't publish code or client data; architecture and decisions are verifiable in an interview.

Kati-bot — multi-tenant SaaS for AI chatbots

Multi-tenant platform that lets businesses deploy AI chatbots on WhatsApp and Telegram, configurable without code.

Problem

Small and mid-sized businesses lose sales by not responding quickly on WhatsApp and Telegram. The platform gives them an AI assistant that takes orders, reservations and stock queries, and escalates to a human when needed — configurable from a dashboard without touching code.

My role

Sole developer end to end (~132 commits, single author). Microservice architecture, multi-tenant Postgres model, layered prompt pipeline, tool-calling, nightly self-evaluation, two interchangeable WhatsApp providers, and a React configuration dashboard.

Technical decisions

Schema-per-tenant in Postgres (instead of tenant_id columns) for real data isolation and granular per-customer backups.
Redis Streams as the bus between Gateway and Core, instead of direct HTTP, to decouple channels (Telegram/WhatsApp) from the AI engine and enable retries and backpressure.
Layered prompt (IDENTITY → CAPABILITIES → INSTRUCTIONS → ADJUSTMENTS → KNOWLEDGE → HISTORY) with tools filtered by tenant capabilities in YAML, so the same core serves very different businesses without forks.
Nightly self-improvement loop: a worker evaluates conversations with an LLM and proposes prompt adjustments per tenant, reviewable from the dashboard.
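
A minimal sketch of the layered prompt and capability-filtered tools, not the production code. The layer order comes from the list above; the tool names, the `TOOL_REQUIREMENTS` registry and the shape of `tenant_config` (a dict standing in for the parsed YAML) are hypothetical.

```python
LAYER_ORDER = ("IDENTITY", "CAPABILITIES", "INSTRUCTIONS",
               "ADJUSTMENTS", "KNOWLEDGE", "HISTORY")

# Hypothetical registry: tool name -> capability the tenant must have enabled.
TOOL_REQUIREMENTS = {
    "create_order": "orders",
    "book_reservation": "reservations",
    "check_stock": "stock",
    "escalate_to_human": "handoff",
}


def build_prompt(layers: dict[str, str]) -> str:
    """Join the non-empty layers in the fixed order, each under its header."""
    sections = []
    for name in LAYER_ORDER:
        text = layers.get(name, "").strip()
        if text:
            sections.append(f"## {name}\n{text}")
    return "\n\n".join(sections)


def tools_for_tenant(tenant_config: dict) -> list[str]:
    """Keep only the tools whose required capability the tenant has enabled."""
    enabled = set(tenant_config.get("capabilities", []))
    return [tool for tool, needed in TOOL_REQUIREMENTS.items()
            if needed in enabled]


bakery = {"capabilities": ["orders", "handoff"]}
prompt = build_prompt({"HISTORY": "user: hi", "IDENTITY": "You are Kati."})
bakery_tools = tools_for_tenant(bakery)
```

Because the order lives in one tuple and the tools are filtered from config, two tenants differ only in data, which is the "no forks" property described above.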

What it shows

Taking a full SaaS product from zero to a sellable MVP alone — distributed architecture, real multi-tenancy, LLM integration and ops dashboard — with isolation, decoupling and configurability decisions designed to scale across many customers.

Stack

Python 3.12 (FastAPI, asyncio, uv) · PostgreSQL + pgvector · Redis Streams + arq · OpenRouter (LLMs) · sentence-transformers · React + Vite + TypeScript · Tailwind + shadcn/ui + Tremor · Docker Compose · Caddy

About the code

Commercial project. I can't publish code or client data; architecture and decisions are verifiable in an interview.

Backend Vertebra — accounting and billing automation SaaS

GraphQL backend that automates download, audit, digitization and payment of utility bills at enterprise scale.

Problem

Companies with hundreds of utility accounts (electricity, gas, water, telecom) spend thousands of hours per month downloading invoices from portals, validating them, posting them to ledgers and paying them. The platform automates that cycle end to end, with auditing, alerts and bank reconciliation built in.

My role

Backend lead / principal maintainer. Of ~9,700 total commits, ~7,100 are mine (over 70%). I designed the async worker subsystems, reporting, AI pipelines for bill extraction/digitization, bank reconciliation, alerts, and most of the Prisma schema. I coordinate a team of 4–6 developers.

Technical decisions

BullMQ-based architecture with 79 specialized workers (download, audit, reconciliation, payments, alerts) to isolate failures per external integration and enable retries/backpressure without affecting the API.
Custom AI layer with LangGraph agents, RAG and MongoDB memory, with LangSmith observability, for structured field extraction from heterogeneous invoices (PDF/image) and email classification.
147 Excel reports generated as jobs on disk under a uniform pattern, letting business users request complex extractions without touching BI.
Postgres → BigQuery datastream for analytics without penalizing OLTP; partial indexes and custom SQL functions to speed up critical payment flows.
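
A minimal sketch of per-integration retry policies, written in Python for illustration even though this backend is TypeScript. It shows a generic capped exponential backoff, not BullMQ's internals; the queue names and the `RETRY_POLICIES` values are hypothetical.

```python
def backoff_delays(attempts: int, base_ms: int = 1000,
                   cap_ms: int = 60_000) -> list[int]:
    """Delay before each retry: base_ms * 2^(n - 1), capped at cap_ms."""
    return [min(base_ms * 2 ** (n - 1), cap_ms) for n in range(1, attempts + 1)]


# Hypothetical policies: each external integration gets its own queue and
# its own retry budget, so a slow portal cannot starve payment jobs.
RETRY_POLICIES = {
    "download": {"attempts": 5, "base_ms": 2000},
    "payments": {"attempts": 3, "base_ms": 500},
}


def schedule_for(queue: str) -> list[int]:
    """Return the retry delays (in ms) for jobs on the given queue."""
    policy = RETRY_POLICIES[queue]
    return backoff_delays(policy["attempts"], policy["base_ms"])


download_schedule = schedule_for("download")
payments_schedule = schedule_for("payments")
```

Keeping the policy per queue is what lets a flaky utility portal retry for minutes while payment jobs fail fast and alert.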

What it shows

Sustaining and evolving a long-lived critical system with many fragile integrations (external portals, banks, OCR/AI), prioritizing observability, decoupled queues and automation where manual work existed before. Comfortable crossing domains: DB performance, workers, applied AI, business reporting.

Stack

TypeScript + Node.js · Apollo GraphQL (Nexus) · Prisma + PostgreSQL · BullMQ + Redis · MongoDB (AI memory) · BigQuery · LangChain / LangGraph / LangSmith · MJML · Firebase Admin · Google Cloud Storage · Docker + PM2

About the code

Commercial project. I can't publish code or client data; architecture and decisions are verifiable in an interview.

~ $ ls projects/other/

Other projects

Guruve — conversational orchestrator on WhatsApp Business

Backend for conversational bots with queues, scheduling and reports, serving several enterprise customers from a single platform.

Sustaining a multi-tenant platform in production alone for years, with focus on operational reliability (queues, retries, scheduling, basic observability).

Node.js + Express · BullMQ + Redis · BigQuery · WhatsApp Business Cloud API

Academic management platform (higher-education institute)

Full-stack web system to manage courses, enrollments and official PDF reports for a higher-education institute.

Taking a product alone from data model to automated deploy, making pragmatic packaging and infrastructure decisions (Nuitka + pull-deploy) tailored to an on-premise client without a DevOps team.

Python 3.12 + FastAPI · SQLAlchemy + Alembic · PostgreSQL + JWT · React 19 + TypeScript + Vite

Vsync2 — incremental Postgres → BigQuery sync

Replacement of an Airbyte connector with a daily Python job using a watermark, no streaming and no extra cost.

Recognizing when a managed tool is overkill and replacing it with minimal, idempotent code that is trivial to operate — including the no-data-loss cutover plan, not just the happy path.

Python 3.11 · psycopg 3 (server-side cursor) · google-cloud-bigquery (NDJSON load jobs) · uv
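
A minimal sketch of the watermark idea behind Vsync2, not the actual job. It works on in-memory rows so it runs without Postgres or BigQuery; the `updated_at` column and the `sync_batch` function are hypothetical names.

```python
from datetime import datetime


def sync_batch(rows: list[dict],
               watermark: datetime) -> tuple[list[dict], datetime]:
    """Pick rows changed after the watermark and return the advanced watermark."""
    changed = sorted(
        (row for row in rows if row["updated_at"] > watermark),
        key=lambda row: row["updated_at"],
    )
    # Advance only to the newest row actually seen, so an empty run is a no-op.
    new_watermark = changed[-1]["updated_at"] if changed else watermark
    return changed, new_watermark


source_rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 3)},
    {"id": 3, "updated_at": datetime(2024, 1, 5)},
]
first_run, mark = sync_batch(source_rows, datetime(2024, 1, 2))
second_run, mark_again = sync_batch(source_rows, mark)
```

Running the job twice with the stored watermark loads nothing the second time, which is what makes a daily rerun safe. A strict `>` comparison can miss rows that share the watermark timestamp and commit late, so a real job would likely need an overlap window plus a deduplicating load; that part is an assumption beyond what the sketch shows.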