Database Design skill

Database Design is an agent skill for AI coding assistants (Claude Code, OpenClaw, Cursor, Codex). Schema design, indexing, migrations, query optimization, and PostgreSQL patterns for production systems. Install with: npx skills-ws install database-design.

devv1.0.0Updated
copied ✓
openclawclaude-codecursorcodex
0 installsVirusTotal: cleanSource code

Database Design

Schema Design Patterns

Normalization Quick Reference

FormRuleWhen to break
1NFAtomic values, no repeating groupsJSONB arrays for tags/metadata
2NFNo partial dependenciesDenormalized read models
3NFNo transitive dependenciesCaching computed fields
BCNFEvery determinant is a candidate keyRarely broken

Denormalization Patterns

-- Materialized counter cache (avoid COUNT queries)
ALTER TABLE posts ADD COLUMN comments_count INT DEFAULT 0;

-- Trigger to maintain it
CREATE FUNCTION update_comments_count() RETURNS TRIGGER AS $$
BEGIN
  IF TG_OP = 'INSERT' THEN
    UPDATE posts SET comments_count = comments_count + 1 WHERE id = NEW.post_id;
  ELSIF TG_OP = 'DELETE' THEN
    UPDATE posts SET comments_count = comments_count - 1 WHERE id = OLD.post_id;
  END IF;
  RETURN NULL;
END; $$ LANGUAGE plpgsql;

Indexing Strategies

TypeUse caseExample
B-treeEquality, range, sorting (default)CREATE INDEX idx_users_email ON users(email)
GINJSONB, arrays, full-text searchCREATE INDEX idx_data ON items USING GIN(metadata)
GiSTGeometric, range types, proximityPostGIS spatial queries
BRINLarge sequential/time-series tablesCREATE INDEX idx_ts ON events USING BRIN(created_at)
CompositeMulti-column queriesCREATE INDEX idx_org_status ON tickets(org_id, status)
PartialSubset of rowsCREATE INDEX idx_active ON users(email) WHERE active = true

Composite index rule: Left-to-right prefix matching. Index on (a, b, c) serves queries on (a), (a, b), (a, b, c) — not (b, c).

Query Optimization

-- Always start here
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) SELECT ...;

Key indicators in query plans:

  • Seq Scan on large tables → missing index
  • Nested Loop with high row counts → consider Hash Join via better stats
  • Rows Removed by FilterActual Rows → index not selective enough
  • High Buffers: shared read → data not cached, check shared_buffers

N+1 Detection and Fixes

// BAD: N+1 with Prisma
const users = await prisma.user.findMany();
for (const u of users) {
  const posts = await prisma.post.findMany({ where: { authorId: u.id } }); // N queries
}

// GOOD: Eager load
const users = await prisma.user.findMany({ include: { posts: true } });

// GOOD: Drizzle with explicit join
const result = await db.select().from(users).leftJoin(posts, eq(users.id, posts.authorId));

Migration Workflow

Zero-Downtime Checklist

  1. Add nullable column (safe, no lock)
  2. Backfill data in batches (UPDATE ... WHERE id BETWEEN $1 AND $2)
  3. Add NOT NULL constraint using ALTER TABLE ... ADD CONSTRAINT ... NOT VALID then VALIDATE CONSTRAINT
  4. Deploy app code using new column
  5. Drop old column after confirmation period
# Migration file naming: YYYYMMDDHHMMSS_description.sql
20260101120000_add_users_role.up.sql
20260101120000_add_users_role.down.sql

Dangerous operations (take ACCESS EXCLUSIVE lock):

  • ALTER TABLE ... ADD COLUMN ... DEFAULT (PG < 11)
  • ALTER TABLE ... ALTER COLUMN TYPE
  • CREATE INDEX without CONCURRENTLY

Always use CREATE INDEX CONCURRENTLY in production.

PostgreSQL Power Features

-- JSONB: query nested data
SELECT * FROM events WHERE payload->>'type' = 'click' AND (payload->'meta'->>'duration')::int > 500;

-- CTE for readability
WITH active_users AS (
  SELECT id FROM users WHERE last_login > NOW() - INTERVAL '30 days'
)
SELECT p.* FROM posts p JOIN active_users u ON p.author_id = u.id;

-- Window function: running total
SELECT date, revenue, SUM(revenue) OVER (ORDER BY date ROWS UNBOUNDED PRECEDING) AS running_total
FROM daily_sales;

-- Table partitioning (range)
CREATE TABLE events (id BIGINT, created_at TIMESTAMPTZ, data JSONB)
  PARTITION BY RANGE (created_at);
CREATE TABLE events_2026_q1 PARTITION OF events
  FOR VALUES FROM ('2026-01-01') TO ('2026-04-01');

Connection Pooling

Use PgBouncer in transaction mode for serverless/high-connection environments:

# pgbouncer.ini
[databases]
mydb = host=127.0.0.1 dbname=mydb
[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20

Rule of thumb: default_pool_size ≈ 2-3× CPU cores of your database server.

Backup Strategy

MethodRPOUse case
pg_dumpPoint-in-timeSmall DBs, dev restore
WAL archiving + pg_basebackupSecondsProduction PITR
Logical replicationNear-realtimeCross-version, selective
# Automated daily backup
pg_dump -Fc --no-owner mydb | zstd > "backup_$(date +%Y%m%d).dump.zst"
# Restore
zstd -d backup_20260101.dump.zst | pg_restore -d mydb --no-owner

References

See references/ for index tuning guides, migration templates, and ORM comparison matrices.