Building a Schema-Aware Database Seeder CLI: Going Beyond Random Data
Seeding a database is easy, but generating realistic and structured data is not. This CLI tool takes a different approach by using schema introspection and naming conventions to automatically produce meaningful, context-aware data for real-world development.

Seeding a database is easy.
Seeding it with useful, realistic, and structured data is not.
After dealing with repetitive seeders across SaaS and ERP projects, I built a CLI tool that does something slightly different:
It generates data based on column naming conventions and database types, not just randomness.
Repo: database-seeder-CLI
The Problem with Most Seeders
Typical approaches fall into two categories:
- Manual seeders → accurate but slow and repetitive
- Random generators → fast but unrealistic
The issue is not generating data.
It is generating data that:
- Matches your schema
- Feels realistic
- Works across multiple tables
- Scales with your database structure
The Approach: Schema-Driven Seeding
Instead of hardcoding values, the CLI inspects your database directly:
SHOW TABLES → DESCRIBE table → infer columns → generate data
From there, it applies two layers of logic:
- Column name heuristics (primary signal)
- SQL type fallbacks (secondary safety net)
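The introspection step could look roughly like the sketch below. It is an illustration, not the tool's actual code: `introspectSchema` is a hypothetical name, and `conn` is assumed to expose an async `query(sql)` that resolves to `[rows]`, the shape a mysql2/promise connection returns.

```javascript
// Sketch only: `conn` is assumed to behave like a mysql2/promise connection,
// where query(sql) resolves to an array whose first element is the row list.
async function introspectSchema(conn) {
  const [tableRows] = await conn.query('SHOW TABLES');
  const schema = {};

  for (const row of tableRows) {
    // SHOW TABLES returns a single column named after the database
    // (for example Tables_in_mydb), so read the first value, not a fixed key.
    const table = Object.values(row)[0];

    // The table name comes from the server itself; backticks guard odd names.
    const [columns] = await conn.query('DESCRIBE `' + table + '`');

    schema[table] = columns.map((col) => ({
      name: col.Field,
      type: String(col.Type).toLowerCase(),
      nullable: col.Null === 'YES',
      autoIncrement: String(col.Extra || '').includes('auto_increment'),
    }));
  }
  return schema;
}
```

The result is a plain object keyed by table name, which keeps the later generation steps independent of the database driver.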
Core Idea: Naming Conventions as Signals
The tool treats column names as intent.
if (field.includes('email')) return faker.internet.email();
if (field.includes('price')) return faker.finance.amount();
if (field.includes('slug')) return faker.helpers.slugify(...);
This simple pattern unlocks a lot:
- email → real emails
- avatar → working image URLs
- tech_stack → JSON arrays of technologies
- tags → structured arrays instead of strings
This makes seed data actually usable for UI, APIs, and testing.
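One way to organize these heuristics is an ordered rule table, sketched below. The names `NAME_RULES` and `valueFromName` are hypothetical, and the generators are trivial stand-ins for the faker calls shown above so the snippet runs without dependencies. The ordering is the point: a plain substring check on 'email' would also match a column like email_verified_at, so narrower patterns go first.

```javascript
// Ordered rule table: the first matching rule wins, so narrower patterns
// (such as *_at timestamps) are listed before broad substring checks.
// The generators are deliberately simple stand-ins for faker calls.
const NAME_RULES = [
  {
    test: (f) => f.endsWith('_at'),
    // MySQL-style DATETIME string: YYYY-MM-DD HH:MM:SS
    make: () => new Date().toISOString().slice(0, 19).replace('T', ' '),
  },
  { test: (f) => f.includes('email'), make: (i) => `user${i}@example.com` },
  { test: (f) => f.includes('slug'), make: (i) => `sample-post-${i}` },
  { test: (f) => f.includes('price'), make: (i) => (9.99 + i).toFixed(2) },
  {
    test: (f) => f === 'tags' || f === 'tech_stack',
    make: () => JSON.stringify(['node', 'mysql']),
  },
];

function valueFromName(field, rowIndex) {
  const name = field.toLowerCase();
  const rule = NAME_RULES.find((r) => r.test(name));
  // undefined means "no opinion", so a type-based fallback can take over.
  return rule ? rule.make(rowIndex) : undefined;
}
```

With this table, `valueFromName('email_verified_at', 0)` yields a timestamp string rather than an email address, and an unrecognized column name yields `undefined`.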
Type-Based Fallbacks
When naming is not enough, the CLI falls back to SQL types:
if (/^(int|bigint)/.test(type)) return faker.number.int({ max: 2147483647 }); // cap at the signed INT maximum; faker's default upper bound is far larger
if (/^(decimal|float)/.test(type)) return faker.finance.amount();
if (type.startsWith('date')) return faker.date.past();
This ensures:
- No column is left unhandled
- Generated values match each column's declared type
- Inserts don’t break due to invalid formats
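A slightly fuller fallback might also read the details inside the type string, such as a varchar length or an enum's options. The sketch below is an assumption about how that could be done, not the tool's implementation; `valueFromType` is a hypothetical name and the values are deterministic placeholders rather than faker output.

```javascript
// Hypothetical fallback: maps a DESCRIBE-style type string to a safe value.
function valueFromType(rawType, rowIndex) {
  const type = rawType.toLowerCase();

  // enum('a','b'): cycle through the declared options. Matched against the
  // raw string so option casing is preserved. Options containing commas
  // are not handled by this simple split.
  const enumMatch = rawType.match(/^enum\((.*)\)$/i);
  if (enumMatch) {
    const options = enumMatch[1].split(',').map((s) => s.trim().replace(/^'|'$/g, ''));
    return options[rowIndex % options.length];
  }

  // tinyint is checked before the wider integer types and kept to 0/1.
  if (/^tinyint/.test(type)) return rowIndex % 2;
  if (/^(smallint|mediumint|int|bigint)/.test(type)) return rowIndex + 1;

  // decimal(p,s): honor the declared scale.
  const decimalMatch = type.match(/^(?:decimal|numeric)\((\d+),\s*(\d+)\)/);
  if (decimalMatch) return (rowIndex + 0.5).toFixed(Number(decimalMatch[2]));
  if (/^(decimal|numeric|float|double)/.test(type)) return rowIndex + 0.5;

  // datetime/timestamp must be tested before the bare "date" prefix.
  const day = String((rowIndex % 28) + 1).padStart(2, '0');
  if (/^(datetime|timestamp)/.test(type)) return `2024-01-${day} 12:00:00`;
  if (/^date/.test(type)) return `2024-01-${day}`;

  // varchar(n)/char(n): never exceed the declared length.
  const charMatch = type.match(/^(?:var)?char\((\d+)\)/);
  if (charMatch) return `sample ${rowIndex}`.slice(0, Number(charMatch[1]));
  if (/text$/.test(type)) return `Sample text ${rowIndex}`;

  // Unknown type: return null so the caller can skip the column.
  return null;
}
```

Respecting declared lengths and scales is what keeps inserts from failing on otherwise plausible values.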
Rich Text Detection (Underrated Feature)
One interesting piece is rich-text handling.
const RICH_TEXT_TYPES = ['longtext', 'mediumtext'];
When a column has one of these types, the CLI can generate Quill-compatible HTML:
- <h2> headings
- <p> paragraphs
- <strong> formatting
- <ul> lists
This is a big deal for:
- CMS systems
- Blog platforms
- SaaS dashboards
Because plain lorem text is not enough when your UI expects formatted content.
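As a rough illustration of the idea, the sketch below assembles a fragment from those four elements. `isRichTextColumn`, `escapeHtml`, and `buildRichText` are hypothetical helpers; the markup is ordinary HTML, and whether a particular editor accepts it should be checked against that editor.

```javascript
const RICH_TEXT_TYPES = ['longtext', 'mediumtext'];

// True when a DESCRIBE-style type string names a rich-text candidate.
function isRichTextColumn(type) {
  return RICH_TEXT_TYPES.includes(type.toLowerCase());
}

// Escape text so generated content cannot break the surrounding markup.
function escapeHtml(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Hypothetical helper: builds a small HTML fragment from plain strings.
function buildRichText({ title, paragraphs, bullets }) {
  const parts = [`<h2>${escapeHtml(title)}</h2>`];

  paragraphs.forEach((text, i) => {
    const body = escapeHtml(text);
    // Bold the first word of the first paragraph so <strong> appears once.
    const html = i === 0 ? body.replace(/^(\S+)/, '<strong>$1</strong>') : body;
    parts.push(`<p>${html}</p>`);
  });

  if (bullets.length > 0) {
    const items = bullets.map((b) => `<li>${escapeHtml(b)}</li>`).join('');
    parts.push(`<ul>${items}</ul>`);
  }
  return parts.join('');
}
```

In a real seeder the strings passed in would come from a text generator such as faker's lorem helpers.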
Interactive CLI Flow
Instead of hardcoding tables, the tool uses an interactive flow:
- Select tables (multi-select)
- Choose row count
- Detect rich-text fields and ask for HTML mode
const selectedTables = await checkbox(...)
const rowCount = await input(...)
const useHtml = await confirm(...)
This makes it flexible across projects without changing code.
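The flow above might be wired together as in the following sketch. It assumes `prompts` provides `checkbox`, `input`, and `confirm` functions that take option objects the way @inquirer/prompts does; the object is injected so the flow can run without a terminal. `collectSeedOptions` and the `schema` shape (table name mapped to an array of `{ name, type }` columns) are assumptions for illustration.

```javascript
// `prompts` is assumed to expose checkbox/input/confirm with the
// option-object signatures used by @inquirer/prompts.
async function collectSeedOptions(prompts, schema) {
  const selectedTables = await prompts.checkbox({
    message: 'Which tables should be seeded?',
    choices: Object.keys(schema).map((table) => ({ name: table, value: table })),
  });

  const rawCount = await prompts.input({
    message: 'Rows per table?',
    default: '10',
    // Return true to accept, or a string to show as the error message.
    validate: (v) => (/^[1-9]\d*$/.test(v.trim()) ? true : 'Enter a positive whole number'),
  });

  // Only ask about HTML when a selected table has a rich-text column.
  const hasRichText = selectedTables.some((table) =>
    schema[table].some((col) => ['longtext', 'mediumtext'].includes(col.type)));

  const useHtml = hasRichText
    ? await prompts.confirm({ message: 'Generate HTML for rich-text columns?', default: true })
    : false;

  return { selectedTables, rowCount: Number(rawCount.trim()), useHtml };
}
```

Skipping the HTML question when no rich-text column is selected keeps the prompt sequence short for tables that do not need it.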
Insert Strategy (Important Detail)
The seeding logic is not just a naive bulk insert.
It:
- Skips auto-increment fields
- Filters out null values
- Handles row-level failures without stopping the process
await conn.execute(filteredSQL, filteredValues);
This is important for real-world usage where:
- Constraints exist
- Some rows might fail
- You don’t want the entire seed process to crash
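The three behaviors listed above can be sketched as a row-by-row loop. This is an illustration under assumptions rather than the tool's code: `insertRows` is a hypothetical name, `conn.execute(sql, values)` is assumed to behave like a mysql2/promise connection, and `columns` is assumed to be an array of `{ name, autoIncrement }` objects.

```javascript
// Wrap an identifier in backticks for use in a MySQL statement.
const quote = (id) => '`' + id + '`';

async function insertRows(conn, table, columns, rows) {
  const result = { inserted: 0, failed: [] };

  // Auto-increment columns are left for the database to fill in.
  const writable = columns.filter((c) => !c.autoIncrement).map((c) => c.name);

  for (const [index, row] of rows.entries()) {
    // Drop null/undefined values so column defaults can apply.
    const names = writable.filter((n) => row[n] !== null && row[n] !== undefined);
    if (names.length === 0) {
      result.failed.push({ index, reason: 'no values to insert' });
      continue;
    }

    const placeholders = names.map(() => '?').join(', ');
    const sql = `INSERT INTO ${quote(table)} (${names.map(quote).join(', ')}) VALUES (${placeholders})`;

    try {
      await conn.execute(sql, names.map((n) => row[n]));
      result.inserted += 1;
    } catch (err) {
      // Record the failure and keep going instead of aborting the run.
      result.failed.push({ index, reason: err.message });
    }
  }
  return result;
}
```

Returning a summary of inserted and failed rows lets the CLI report partial success at the end of a run. The trade-off is speed: one statement per row is slower than a single multi-row insert.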
Why This Actually Matters
This is not just a convenience tool.
It improves:
1. Developer Experience
Spin up realistic environments instantly.
2. Frontend Testing
UI components behave properly with real-looking data.
3. API Validation
Endpoints return meaningful payloads.
4. System Simulation
Get closer to production-like scenarios without using real data.
Trade-Offs
This approach is not perfect.
Naming Dependency
It assumes your schema follows good conventions.
Limited Domain Awareness
It does not understand business rules; it only sees column names and types.
Relationships Still Need Care
Foreign keys and relational consistency are not fully inferred.
Where This Fits Best
This tool is ideal for:
- SaaS development
- Admin dashboards
- Rapid prototyping
- Internal tools
- Developer onboarding
It is less suited for:
- Complex relational simulations
- Domain-heavy test scenarios
Engineering Takeaway
This project highlights something simple but important:
Good developer tools are not about complexity. They are about removing friction intelligently.
Instead of writing more code, this approach automates a task developers do constantly by relying on:
- Schema introspection
- Naming conventions
- Smart defaults
Final Thought
Seeding is often treated as a minor task.
But in practice, it directly affects how fast you can build, test, and iterate.
A small improvement here compounds across every project.
And sometimes, that is where the most practical engineering wins happen.
Author
Jose Albert Arnedo
Full-Stack Engineer focused on ERP systems and SaaS platforms