Nuvlo — Expense Tracker
React · FastAPI · Kafka · PySpark · BigQuery · dbt · Terraform · GCP — 2026

Overview
Nuvlo is a production-deployed personal finance app that connects to real US bank accounts via the Teller API, automatically syncs credit card transactions, and sends payment reminders before due dates. The frontend is deployed to Firebase Hosting and the backend runs on Cloud Run, with all infrastructure provisioned through Terraform.
The project was built in 7 phases, starting with a basic CRUD API and evolving into a full data platform with Kafka event streaming, Apache Airflow orchestration, PySpark batch processing, and a BigQuery analytics warehouse with dbt transformations.

Problem
Managing multiple credit cards across different banks means logging into separate apps to check balances, track spending, and remember due dates. Most missed payments happen not because of insufficient funds, but because a due date slips by unnoticed.
Nuvlo solves this by linking all cards in one place via the Teller API, syncing transactions automatically every day, categorising spending, and sending email reminders before each card's due date. It also runs a weekly analytics pipeline that aggregates spending by category, computes monthly totals, and tracks credit utilisation history: data that most consumer banking apps don't surface directly.

Architecture
The system has three layers.
Application layer — The FastAPI backend handles all API requests: auth, card management, transaction CRUD, Teller webhook events, and email notifications. It connects to PostgreSQL on Cloud SQL via a Unix socket using the Cloud SQL connector. On every transaction write, it publishes an event to a Kafka topic. A background consumer thread inside the same FastAPI process subscribes to that topic and handles downstream processing.
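
A minimal sketch of that publish/consume pattern with kafka-python (listed in the stack below); the topic name, event fields, and downstream handler are assumptions, not taken from the project:

```python
# Sketch of the write-path publish and the in-process consumer thread,
# using kafka-python. Topic name, event fields, and the downstream
# handler are illustrative.
import json
import threading

from kafka import KafkaConsumer, KafkaProducer

TOPIC = "transaction-events"  # hypothetical topic name
BROKER = "localhost:9092"     # hypothetical broker address

producer = KafkaProducer(
    bootstrap_servers=BROKER,
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def publish_transaction_event(txn_id: str, amount: float, category: str) -> None:
    """Called after every transaction write commits."""
    producer.send(TOPIC, {"id": txn_id, "amount": amount, "category": category})

def _consume_loop() -> None:
    consumer = KafkaConsumer(
        TOPIC,
        bootstrap_servers=BROKER,
        group_id="nuvlo-backend",
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )
    for message in consumer:
        handle_downstream(message.value)

def handle_downstream(event: dict) -> None:
    # Placeholder for whatever downstream processing the consumer does.
    print("processing", event)

# Started once at FastAPI startup, alongside the request handlers.
threading.Thread(target=_consume_loop, daemon=True).start()
```
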
Orchestration layer — Apache Airflow runs three DAGs: daily_teller_sync (calls the Teller API for every linked account and upserts new transactions into PostgreSQL), bill_due_checker (scans cards for upcoming due dates and sends reminder emails via Gmail SMTP), and weekly_spark_batch (triggers a PySpark job that reads all transactions from PostgreSQL, computes aggregations, and writes Parquet files to a GCS data lake).
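
The first of those DAGs might be declared like this; a sketch assuming Airflow 2.x, with the task body left as a placeholder (only the DAG name and daily cadence come from the description above):

```python
# Sketch of the daily_teller_sync DAG declaration, assuming Airflow 2.x.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def sync_teller_accounts() -> None:
    # For each linked account: call the Teller API, then upsert any new
    # transactions into PostgreSQL keyed on the Teller transaction ID.
    ...

with DAG(
    dag_id="daily_teller_sync",
    start_date=datetime(2026, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(
        task_id="sync_all_accounts",
        python_callable=sync_teller_accounts,
    )
```
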
Analytics layer — PySpark writes three aggregation tables to the GCS data lake: spending_by_category, monthly_totals, and utilization_history. A separate job loads these into BigQuery. dbt then runs staging models to clean and type-cast the raw tables, and mart models that produce the final analytics-ready views. BigQuery tables are time-partitioned for query efficiency.
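
A sketch of the batch job's shape, assuming a JDBC read from PostgreSQL; the connection details, bucket path, and column names are illustrative:

```python
# Sketch of the weekly batch job: read transactions from PostgreSQL over
# JDBC, aggregate, and write Parquet to the GCS data lake. The JDBC URL,
# credentials, bucket path, and column names are assumptions; the job
# also needs the Postgres JDBC driver and GCS connector on its classpath.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("weekly_spark_batch").getOrCreate()

transactions = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/nuvlo")
    .option("dbtable", "transactions")
    .option("user", "nuvlo")
    .option("password", "<from secret manager>")
    .load()
)

# One of the three aggregation tables named above.
spending_by_category = transactions.groupBy("category").agg(
    F.sum("amount").alias("total_spent"),
    F.count("*").alias("txn_count"),
)

spending_by_category.write.mode("overwrite").parquet(
    "gs://nuvlo-data-lake/spending_by_category/"
)
```
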
Key Features
- Bank account linking — Cards connected via Teller API. Each card stores institution name, card name, last four digits, credit limit, current balance, and due date. Dashboard shows utilisation percentage per card from live balance and limit data.
- Automatic transaction sync — daily_teller_sync DAG runs every 24 hours, fetches transactions from Teller, and upserts into PostgreSQL using the Teller transaction ID as a deduplication key (see the first sketch after this list). Transactions include amount, description, merchant, category, and status.
- Payment reminders — bill_due_checker DAG checks each card's due date daily and sends an HTML email via Gmail SMTP when a due date is approaching, templated with card name, due date, and current balance.
- Two-factor authentication — Two independently toggleable methods: Authenticator App (TOTP via pyotp, QR code scanned with Google Authenticator) and Email OTP (6-digit code, 10-minute expiry). When both are enabled, the user can switch between them on the 2FA screen (see the second sketch after this list).
- Account security flows — Forgot password, change email, and change password are all protected with OTP verification. OTP tokens stored in a dedicated table with a purpose field and hard expiry timestamp.
- Calendar view — Monthly calendar renders each card's due date as a labelled event on the correct day. Clicking a date shows which cards are due and their current balance.
- Spending analytics — Transactions page supports filtering by card, date range, category, and merchant. Weekly PySpark job aggregates into BigQuery marts for category-level and month-level breakdowns.
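
The deduplicating upsert in the transaction-sync feature (first sketch referenced above) comes down to a PostgreSQL ON CONFLICT statement; a minimal sketch with SQLAlchemy Core, assuming a unique teller_id column and a schema trimmed to a few columns:

```python
# Sketch of the deduplicating upsert behind daily_teller_sync, assuming a
# unique constraint on transactions.teller_id. Schema is illustrative.
from sqlalchemy import Column, MetaData, Numeric, String, Table
from sqlalchemy.dialects.postgresql import insert

transactions = Table(
    "transactions",
    MetaData(),
    Column("teller_id", String, primary_key=True),  # Teller ID = dedup key
    Column("amount", Numeric),
    Column("status", String),
)

def upsert_transaction(conn, txn: dict) -> None:
    stmt = insert(transactions).values(
        teller_id=txn["id"],
        amount=txn["amount"],
        status=txn["status"],
    )
    # A re-synced transaction updates in place instead of duplicating.
    stmt = stmt.on_conflict_do_update(
        index_elements=["teller_id"],
        set_={"amount": stmt.excluded.amount, "status": stmt.excluded.status},
    )
    conn.execute(stmt)
```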

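And the authenticator-app method (second sketch referenced above) is a thin wrapper around pyotp; enrolment and verification in a few lines, with illustrative issuer and account names:

```python
# Sketch of the authenticator-app 2FA flow with pyotp. Secret storage and
# the surrounding endpoints are omitted; names are illustrative.
import pyotp

# Enrolment: generate a per-user secret and a provisioning URI, which the
# frontend renders as a QR code for Google Authenticator to scan.
secret = pyotp.random_base32()
uri = pyotp.TOTP(secret).provisioning_uri(
    name="user@example.com", issuer_name="Nuvlo"
)

# Login: verify the submitted 6-digit code against the stored secret.
def verify_totp(stored_secret: str, code: str) -> bool:
    return pyotp.TOTP(stored_secret).verify(code)
```
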
Technical Stack
- Frontend — React 18, Vite, Tailwind CSS (Firebase Hosting)
- Backend — FastAPI, Python 3.11, SQLAlchemy, Alembic (Cloud Run)
- Database — PostgreSQL 16 (Cloud SQL)
- Auth — JWT (python-jose), TOTP (pyotp), Email OTP
- Bank Data — Teller API
- Messaging — Apache Kafka (kafka-python)
- Orchestration — Apache Airflow
- Batch Processing — PySpark
- Data Warehouse — BigQuery
- Transformations — dbt
- Infrastructure — Terraform (GCP)
- CI/CD — GitHub Actions
- Monitoring — Prometheus, Grafana

Deployment
All GCP infrastructure is defined in Terraform and provisioned with a single terraform apply: Cloud SQL PostgreSQL instance, Cloud Run service with Cloud SQL connector, Artifact Registry for Docker images, GCS buckets for the data lake, BigQuery dataset with 4 time-partitioned tables, two service accounts with least-privilege IAM roles, and all required GCP APIs enabled programmatically.
The GitHub Actions pipeline runs on every push to main: it authenticates to GCP, builds and pushes the backend Docker image to Artifact Registry, and deploys to Cloud Run with all environment variables injected (DATABASE_URL via the Cloud SQL Unix socket, SECRET_KEY, SMTP_PASSWORD, TELLER_APPLICATION_ID). It then builds the React frontend with VITE_API_URL set from the Cloud Run deployment output and deploys it to Firebase Hosting.
Live: nuvlo-expense-tracker.web.app · card-tracker-api-y3cegn3hoa-uc.a.run.app · /docs for Swagger UI
