This project ingests ONIX — the international standard for book metadata — from all six trade publishers Fully Booked sources from, parses each feed, merges it into a single Azure-resident catalog database, and serves it to every downstream system that needs it: the public website (direct TLS reads), in-store POS, the on-prem inventory management system, and any future ERP, analytics, or partner integration (via scheduled snapshots).
The catalog is the system of record. The website reads it over a managed TLS connection with Entra ID identity; other systems consume scheduled file snapshots — so the schema or compute tier can evolve without breaking external integrations.
Founded by Jaime Daez in 1997, Fully Booked opened its first Rockwell branch in 2003. What started as distributing a single architecture magazine grew into the Philippines' premier bookstore chain — selling 100,000+ titles, stationery, vinyl records, and collectibles.
Every book title is entered by hand — slow, error-prone, and unscalable across 100,000+ titles.
6 publishers (Ingram, PRH, HarperCollins, Hachette, S&S, Macmillan) send separate ONIX files with no unified intake.
The catalog depends on an aging Windows host + MS SQL Server on-prem — costly, brittle, and a bottleneck for the website.
fullybookedonline.com has grown rapidly since 2021. The catalog backend must match the pace of online demand.
6 ONIX sources
Ingram · PRH · HC
Hachette · S&S
Macmillan
Existing managed FTP server, per-publisher /incoming folders.
Publisher pushes ONIX 3.0 XML to the on-prem managed FTP server. Azure Storage Mover replicates each incoming/ folder to onix-raw hourly.
Blob arrival fires a BlobCreated Event Grid event → the Parser Function writes 5 NDJSON files (products, contributors, prices, subjects, media) to onix-processed.
A second Event Grid subscription triggers the Merge Function on each NDJSON. It bulk-loads via COPY into staging and runs an idempotent MERGE (ISBN-13 keyed) into the catalog DB.
fullybookedonline.com and other consumers read the catalog over TLS — directly from PostgreSQL (Entra ID managed identity) or from scheduled per-format snapshots in Blob.
Already in production at Fully Booked. Operated by client IT. Accepts ONIX pushes from all 6 publishers into per-publisher landing folders. Zero new cost from this project.
Azure-provided agent runs on a small Linux VM. Mounts each publisher's incoming/ folder via SMB (read-only) and replicates new files to the Azure raw archive hourly. Change-detection skips already-transferred files.
Versioned, encrypted archive of every original ONIX file. Lifecycle policy tiers data Hot → Cool → Archive and expires at 1 year for replay.
Reads each new ONIX 3.0 XML file and emits 5 normalized entity files: products, contributors, prices, subjects, media. Stateless, on-demand — no servers to keep running.
Bulk-loads each entity file into staging, then runs an idempotent MERGE into the catalog database. Per-field source-priority rules resolve conflicts between publishers. Tunable by config, no code deploy.
Cloud-managed Azure Database for PostgreSQL — Flexible Server (B1ms). Holds products, contributors, prices, subjects, media, plus an audit log. Encrypted at rest (Key Vault CMK), daily automated backups, point-in-time recovery, TLS-only connections.
| Catalog DB (PostgreSQL Flexible Server B1ms, 32 GB) | ~$23 |
| Azure Monitor, Logs, Alerts, Data Transfer | ~$3 |
| Blob Storage — raw + processed | ~$1 |
| Event Grid + Key Vault (keys + secrets) | ~$1 |
| Parser & Merge Functions (Flex Consumption) | <$1 |
| Storage Mover Uplink (~2–5 GB) | $0 |
| Total | ~$28 |
| Dimension | Azure Version | On-Prem Version |
|---|---|---|
| Project Cost | ₱1,655,808.00 | ₱1,552,320.00 |
| Project Cost (Year 1) MIN | ₱1,672,608.00 | ₱1,770,720.00 |
| Project Cost (Year 1) MAX | ₱1,673,952.00 | ₱2,487,520.00 |
| Recurring monthly cost | ~$28 / month (Azure SE Asia) | $0 |
| Annualised cloud spend | ~$336 / year | $0 |
| One-time infrastructure | ~$500 (Terraform / IAM setup) | ~$3,500–$12,800+ (VM + Windows Server + MS SQL + VMware licences; $0 if reusing existing kit) |
| Data sovereignty | Data resides in Azure Southeast Asia | All data on customer-owned hardware |
| Backup & monitoring | Azure managed (Flexible Server backup, Log Analytics, Alerts) | Customer's existing IT processes |
| Resilience to host outage | Azure absorbs publisher delivery; buffers catch up | Managed FTP server incoming/ buffers catch up |
| Operational complexity | Two stacks to debug (Azure + on-prem) | One host stack — Windows / Python / SQL |
| TCO in 3 years (MIN) | ₱568,944.00 | ₱590,240.00 |
| TCO in 3 years (MAX) | ₱570,288.00 | ₱756,373.33 |
| TCO in 5 years (MIN) | ₱331,193.60 | ₱354,144.00 |
| TCO in 5 years (MAX) | ₱334,790.40 | ₱510,720.00 |
incoming/ folders to the Azure raw blob archive. (2) Cloud pipeline — Event Grid triggers a Parser Function that converts ONIX XML to NDJSON in Blob Storage; a Merge Function then bulk-loads via COPY + MERGE. (3) Catalog DB — Azure Database for PostgreSQL Flexible Server (B1ms).southeastasia (Singapore) — closest to the Philippines.v1 single-zone (~$23/mo). Upgrade to Zone-redundant HA is a portal toggle → ~$47/mo with hot failover.
DB reachable only from merge Function + website egress. TLS enforced at require_secure_transport=on. Read/write credentials separated; Entra ID managed identity for the Function.
Files accumulate on FTP server. On recovery, next hourly job transfers everything pending. No data loss.
Event Grid retries with exponential backoff then dead-letters to a blob container; alert rule pages ops. No partial corruption reaches the merger.
Source-priority config on merger. Direct publisher overrides Ingram. Tunable by config update — no code deploy needed.
Documented playbook: temporarily resize DB one tier up (B2s), run backfill, resize back down to B1ms. ~5-minute managed operation.
Ingestion speaks ONIX 3.0 — the global publisher feed standard. Onboarding the next publisher is a config change, not new code or a new contract.
Plain SQL. Any BI tool, ERP, mobile backend, or AI/ML pipeline reads the catalog directly — no proprietary connector, no middleware tax.
Every ingestion publishes to Event Grid. Search re-index, Slack alerts, recommendation engines, POS sync — each is a new subscriber, not a rebuild.
Terraform end to end. Audit it, migrate regions, or hand it to another team in a day. No black box, no lock-in to us.
Entra ID + managed identities throughout. Future apps inherit the same SSO and authz model — no bolt-on auth later.
SAP catalog sync, mobile commerce, marketplace feeds, AI-driven merchandising — each is one more consumer of the same hub. No re-platform.
Confirm Azure subscription to use (existing or net-new), billing owner, and resource group naming convention.
Confirm DBA ownership of catalog database credentials and quarterly rotation approvals.
Confirm hypervisor capacity for the Storage Mover agent VM (small Linux; SMB to FTP server, outbound 443 to Azure).
Confirm: single-zone Flexible Server for v1 (~$23/mo) or Zone-redundant HA from day one (~$47/mo)?
Confirm managed FTP server can expose each publisher's incoming/ folder via SMB to a service account.
Confirm cutover plan: start fresh from ONIX backfills, one-shot migration, or dual-write window?
Confirm which publishers to onboard first beyond Ingram.
Confirm: public-endpoint + firewall allow-list + TLS acceptable for v1, or add Private Endpoint from day one (+$8/mo)?
Confirm the website's egress IP / network range for the PostgreSQL firewall allow-list.
Confirm v1 scope: international titles only, or include manual-entry fallback for local PH publishers?