A Scrum-inspired framework for human-AI collaboration,
optimized for GitHub Copilot CLI.
A lead agent manages the sprint cycle — planning, execution, retros. It dispatches specialized sub-agents for code, tests, security, docs, and reviews. You're the stakeholder — you set direction, not details.
Sprint Planning, Daily Huddles, Sprint Review, Retrospectives — all adapted for human-AI dynamics. Each ceremony has a purpose. Nothing exists because "that's how Scrum works."
Fork it, customize the placeholders, and you have a fully configured AI development process. Agents, skills, process docs, sprint templates — everything included.
Start guided to learn the framework. Graduate to autonomous when you trust the process.
You're at the keyboard. You trigger each ceremony, review AI decisions, approve scope. Interactive sessions — the AI works while you watch.
Start a sprint and walk away. The AI runs for hours — planning, coding, testing, reviewing. You come back to finished work and a push notification.
AI coding assistants are powerful. But without structure, they create as many problems as they solve.
Every session starts from zero. The AI doesn't know what you did yesterday, what decisions were made, or what failed last sprint. Context is lost every time.
The AI happily starts new features while old ones are half-done. No discipline, no focus, no "finish what you started." Your backlog grows while nothing ships.
The AI says "done" without running tests. It claims success without evidence. You merge confidently — and discover the bug in production.
Ad-hoc prompting produces ad-hoc results. Without a defined workflow, there's no way to improve. Every session is a fresh experiment.
| | Ad-hoc Prompting | Spec-Driven Development | AI-Scrum |
|---|---|---|---|
| Approach | Chat until it works | Write full spec upfront, AI implements | Iterative sprints with feedback loops |
| Planning | None | Heavy upfront — spec, tests, architecture | Just-in-time — ICE scoring, sprint-sized |
| Memory | None — context lost each session | Spec is the memory | Sprint logs, issue comments, velocity data |
| Quality | Hope for the best | Test-first from spec | Quality gates, DoD, CI before merge |
| Adapts to change | Instantly — no plan to break | Slowly — spec rewrite needed | Via backlog — changes are welcomed, not feared |
| Process improves | ❌ No | ❌ Spec improves, process doesn't | ✅ Every retro improves tools + workflow |
| Best when | Quick prototypes, one-offs | Well-understood problems, clear specs | Ongoing projects, evolving requirements |
Spec-driven works great for one-shot tasks with clear requirements. AI-Scrum is for projects that live and evolve — where you need sustained quality, not just initial delivery.
Every file in the framework maps to a Copilot CLI feature. No custom tooling needed.
- `.github/copilot-instructions.md` — repository-wide custom instructions.
- `.github/agents/*.agent.md` — custom agents, invoked with `@agent-name`. Each agent has a role, specialized instructions, and domain expertise. 11 agents included.
- `.github/skills/*/SKILL.md` — skills.
- Plan mode (built-in, toggle with `Shift+Tab`) — creates structured implementation plans before touching code. The framework integrates with it.
- Built-in agents — `explore`, `task`, `code-review`, and `general-purpose`: Copilot CLI's native agents with full toolsets.
- SQL database (built-in).

Two roles. Clear boundaries. No ambiguity about who decides what.
Inspired by the Agile Manifesto (2001) — adapted for human-AI collaboration.
That is, while there is value in the items below, we value the items above more.
The best architecture emerges from small, tested diffs — not grand rewrites.
Quality gates are non-negotiable. Every feature gets tests. Every PR gets CI. Every claim gets verification.
The human brings judgment; the agent brings throughput. Play to each side's strengths.
Documentation is not overhead — it's memory. The agent has no memory between sessions.
Escalation is a feature, not a failure. MUST-escalate criteria protect the project from autonomous overreach.
Process improvements compound. Each retro makes the next sprint better. Sprint 10 is radically smoother than Sprint 1.
Evidence before assertions, always. Never say "done" without proof. Trust, but verify — on both sides.
The backlog is sacred. Ideas don't get lost — they get queued. If it's not an issue, it doesn't exist.
Velocity is descriptive, not prescriptive. Track it to understand capacity. Don't use it to pressure.
The agent is not a junior developer — it's a different kind of collaborator. Design the process for what it actually is.
Welcome scope changes — route them through the backlog, not mid-sprint pivots.
Simplicity — maximize the work not done. Every line of code is a liability; every automation is an asset.
Five ceremonies. Each has a slash command. Human-triggered or fully autonomous — your choice.
Stakeholder drops ideas. AI researches, decomposes into concrete issues with acceptance criteria.
/refine
AI triages backlog, scores issues (ICE), selects scope, assigns labels and milestones.
/sprint-planning
AI implements issues with quality gates. Huddle after each issue. Tests required for every feature.
/sprint-start
Demo deliverables, metrics, velocity tracking. Stakeholder accepts or requests changes.
/sprint-review
What went well, what didn't. Process improvements. The process that improves itself.
/sprint-retro
You need an active GitHub Copilot subscription (Pro, Pro+, Business, or Enterprise) and Copilot CLI installed.
# Install Copilot CLI
brew install copilot-cli # macOS / Linux
winget install GitHub.Copilot # Windows
# Enable experimental mode (required for Autopilot)
copilot --experimental
Autopilot mode lets the agent continue working until a task is complete. Activate it with Shift+Tab to cycle to Autopilot mode inside a session. This is especially important for the autonomous variant.
# Guided (interactive, you steer)
gh repo create my-project \
--template trsdn/copilot-scrum-guided \
--public --clone
# Autonomous (multi-hour, unattended)
gh repo create my-project \
--template trsdn/copilot-scrum-autonomous \
--public --clone
Search for {{PROJECT_NAME}}, {{PROJECT_DESCRIPTION}}, and {{OWNER}} across all files and replace with your values.
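One way to do the replacement from the shell — a sketch using GNU `sed` (on macOS use `sed -i ''`); the values below are examples, and the demo runs on a scratch file rather than your real checkout:

```shell
# Demo on a scratch copy: substitute the three placeholders with your values.
tmp=$(mktemp -d)
printf '%s\n' '# {{PROJECT_NAME}} by {{OWNER}}' > "$tmp/README.md"
sed -i \
  -e 's/{{PROJECT_NAME}}/my-project/g' \
  -e 's/{{PROJECT_DESCRIPTION}}/A demo project/g' \
  -e 's/{{OWNER}}/your-handle/g' \
  "$tmp/README.md"
cat "$tmp/README.md"   # → "# my-project by your-handle"
```

In a real checkout, `grep -rl '{{PROJECT_NAME}}' .` lists every file that still contains a placeholder, so you can pipe that list through the same `sed` invocation.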
gh label create "status:planned" --color "0E8A16"
gh label create "status:in-progress" --color "FBCA04"
gh label create "status:validation" --color "1D76DB"
gh label create "priority:high" --color "B60205"
gh label create "priority:medium" --color "F9D0C4"
gh label create "priority:low" --color "C5DEF5"
gh label create "type:idea" --color "D4C5F9"
# Create an idea:
gh issue create --title "idea: Your feature" \
--label "type:idea" --body "Brief description"
# Refine ideas into concrete issues:
/refine
# Plan and run your first sprint:
/sprint-planning
The AI refines your ideas into concrete issues, then plans and executes sprints.
Five principles, ranked by priority. When they conflict, higher beats lower.
The human decides what and why. The AI decides how. Stakeholder-created issues must not be deprioritized, descoped, or closed without approval. ICE scoring is advisory — human intent overrides it.
Complete what you start. New ideas go to the backlog, not into the current sprint. Never abandon work-in-progress to chase shiny objects.
Every change is tested, reviewed, and verified before merge. No exceptions. CI must be green. Evidence must exist.
One feature per PR. ~150 lines ideal, 300 max. Config changes over code changes. Each PR is independently shippable.
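The line budget is easy to gate mechanically. A hypothetical pre-PR check (the function and its thresholds mirror the ~150/300 rule above; the changed-line count would come from something like `git diff --shortstat main...HEAD`):

```shell
# Hypothetical gate for the ~150-ideal / 300-max line budget.
# Argument: total changed-line count for the PR.
check_pr_size() {
  if   [ "$1" -le 150 ]; then echo "ok"      # within the ideal budget
  elif [ "$1" -le 300 ]; then echo "warn"    # acceptable, consider splitting
  else                        echo "split"   # over the hard cap: split the PR
  fi
}
check_pr_size 120   # → ok
check_pr_size 420   # → split
```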
Every retro evaluates the process itself — agents, skills, workflows. Friction gets automated. Failures get root-caused.
Clear rules for when the AI decides alone and when it must ask.
Autonomous execution needs guardrails. Individual commits can be valid while the project silently drifts off course.
Only planned issues may be executed. Discovered work goes to backlog — never into the current sprint. If >2 unplanned issues are created, the AI must escalate.
After each issue: Was this planned? Are files related to sprint scope? Is the goal still achievable? If any check fails — stop and escalate.
At sprint review: holistic git diff --stat across all changes. Flag files that don't relate to any sprint issue. Report planned vs unplanned work.
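The scope check can be sketched as a path filter. The whitelist below is hypothetical — in practice it would be derived from the directories the sprint's planned issues touch:

```shell
# Sketch: flag changed paths that fall outside the sprint's scope.
sprint_scope="src/ docs/sprints/"
in_scope() {
  for prefix in $sprint_scope; do
    case "$1" in "$prefix"*) return 0 ;; esac
  done
  return 1
}
in_scope "src/app.py"      && echo planned || echo unplanned   # → planned
in_scope "scripts/misc.sh" && echo planned || echo unplanned   # → unplanned
```

Piping `git diff --name-only main...HEAD` through a filter like this yields the planned-vs-unplanned report for the review.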
Every issue must meet ALL applicable criteria before closing. No shortcuts.
# Verify CI status before closing:
gh run list

GitHub Issues are the only task system. No external trackers, no internal todo lists.
Issues flow through status labels: `status:planned` → `status:in-progress` → `status:validation`.

Score = Impact × Confidence ÷ Effort (each 1-3)
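The ICE formula maps directly to a small helper. A sketch in shell — since shell arithmetic is integer-only, the result is scaled by 100 so scores still rank cleanly:

```shell
# ICE score with each factor in 1-3, scaled by 100 for integer math.
ice_score() {
  impact=$1; confidence=$2; effort=$3
  echo $(( impact * confidence * 100 / effort ))
}
ice_score 3 2 1   # high impact, medium confidence, low effort → 600
ice_score 2 2 3   # → 133
```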
Milestones replace board columns. Full history preserved.
gh issue edit 42 --milestone "Sprint 5"
gh issue list --milestone "Sprint 5"
gh issue list --milestone "Sprint 5" --state closed
The AI has no memory between sessions. These artifacts are how continuity survives.
Huddle decisions, learnings, blockers — created at sprint start, updated after each issue.
docs/sprints/sprint-N-log.md
Sprint-over-sprint performance tracking. Created at retro, used at planning.
docs/sprints/velocity.md
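Capacity planning only needs a rolling average. A sketch with hypothetical per-sprint issue counts (the real numbers live in `docs/sprints/velocity.md`):

```shell
# Average completed issues per sprint, from hypothetical velocity data.
velocities="5 7 6 8"
sum=0 count=0
for v in $velocities; do
  sum=$((sum + v))
  count=$((count + 1))
done
echo "average velocity: $((sum / count))"   # → average velocity: 6
```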
Traceable audit trail per issue. Huddle results posted as comments.
GitHub Issue #N
Architectural Decision Records. Immutable unless stakeholder approves change.
docs/architecture/ADR.md
Start guided, graduate to autonomous. Same process, same constitution — different level of human presence.
Interactive. Step by step.
You trigger each ceremony, review results, approve scope. Sessions last as long as you're engaged.
Runs for hours. Unsupervised.
Start a sprint and walk away. The AI agent team runs for hours — you come back to finished work and a notification on your phone.