Managing Monorepos at Scale: Lessons from Meta, AWS, and a 15-Repo Migration

Early in my career, I thought polyrepos were the way. Microservices, each in their own repo, independently deployable — it felt clean and modern. Then I joined Meta and worked in their massive monorepo, and it changed how I think about code organization entirely. Anything I needed to change, I could change it right there. Everything was discoverable. One commit could do everything I needed it to.

Then I moved to AWS, back to polyrepo land, and the contrast was painful. Now I’m building my own products with monorepos, and I migrated SID Technologies from 15 separate repositories to one over a weekend.

This post is about what I learned from each approach, the actual migration process with real metrics, and why the monorepo vs polyrepo debate is really an organizational question, not an engineering one.

Two Extremes: Meta vs AWS

Meta’s Monorepo

At Meta, virtually everything lived in one Mercurial repository: Facebook, Instagram, WhatsApp web, internal tools, mobile apps, backend services. Thousands of engineers committing thousands of times per day.

What worked:

  • Cross-cutting changes were routine, not projects
  • One engineer could touch iOS, Android, backend, and web in a single diff
  • Refactoring was safe — your IDE could find all usages across the entire codebase
  • No version management for internal code
  • CI caught integration issues before merge

What was hard:

  • Onboarding required downloading gigabytes of code
  • Search was slow without custom tooling
  • Build times required distributed caching (Buck/Bazel)
  • Merge conflicts were frequent in hot files

And then there was the dead code problem. Our team had this one legacy project — super old, not maintained anymore. We’d get occasional errors from it and have to try to fix them. I eventually figured out we could deprecate it, and I created a commit deleting tens of thousands of lines of code. But there was this one function in one folder — some sort of hex translation function — and because this was ancient code in the main monorepo, it had over 200 references across dozens of different teams’ folders. The obvious move was to delete everything around it and not touch that function.

That’s the monorepo tradeoff in one story: you can find everything, change everything, delete everything — but you also carry the responsibility of not breaking what other teams depend on. That said, dead code isn’t a monorepo problem — it’s a codebase problem. In a polyrepo company, you get entire repos that drift away and nobody even knows they’re still running. At least in a monorepo you can see it.

AWS’s Polyrepo

At AWS, nearly every service lives in its own repository. Thousands of repositories, each with its own CI/CD, deployment pipeline, and ownership.

What works:

  • Clear ownership boundaries (one team, one repo)
  • Independent deployment cadences
  • Technology diversity (each team chooses their stack)
  • Strong security isolation between services
  • Small repositories are fast to clone and search

What’s painful:

  • Discoverability is terrible. Half the time I’m using internal code search to find out how someone else did something and copy their code. I know people have rewritten CDK components all over the place because there’s no standardization. The same infrastructure pattern gets implemented dozens of times by dozens of teams.
  • Cross-cutting changes require coordination across dozens of repos
  • Shared libraries mean version management hell
  • Duplicate code accumulates across services
  • Integration testing requires complex test environments

Neither approach is wrong. They fit their organizational constraints. Meta optimizes for velocity and coordination. AWS optimizes for autonomy and isolation. But here’s the thing I’ve come to believe: polyrepo is fundamentally an organizational choice, not an engineering one. It buys you the same isolation as setting CODEOWNERS and clear team boundaries in a monorepo, except now you’re paying for it with version and dependency management on top.

The Migration: 15 Repos to 1

At SID, we started with the “best practice” of one repo per service. By the time we had 15 services, the overhead was crushing us.

The Breaking Point

I got tired of wiring up cross-repo dependencies and pinning library versions in every repo. I really dislike Git submodules, and the constant problem of being stuck on a stale version because someone didn’t update was eating my time. The immediate trigger was adding organization-level permissions — it required coordinated changes across 9 repositories. Two weeks of work: one week coding, one week dependency management.

The pain points were clear:

Database models everywhere. Our User model was defined in 8 different repositories. When we added a field to the users table, we had to update 8 different type definitions, hope they stayed in sync, and debug runtime errors when they didn’t.

Security fixes were nightmares. Found a security bug in authentication middleware (its own repo): fix it and release a new version of the middleware package, open PRs in 14 consuming repos, wait for 14 code reviews, merge and deploy 14 services, and hope no one is still pinned to the old version. All for a security fix that should have taken 10 minutes.

I just called it. Time to migrate.

Saturday Morning: Structure Planning (3 hours)

sid-monorepo/
├── services/           # Each old repo becomes a directory
│   ├── authentication/
│   ├── billing/
│   └── ...             # 15 services
├── pkg/                # Shared Go packages (consolidated)
├── packages/           # Shared TypeScript (consolidated)
├── db/                 # Database models (single source of truth)
└── tools/              # Deployment scripts, CI tools

Key decisions: services stay independently deployable, shared code moves to pkg/ and packages/ with no more versioning, database models get a single source of truth in db/.
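On the Go side, the no-versioning decision can be as simple as a workspace file at the repo root. This is a minimal sketch; the module layout is hypothetical, so adapt the paths to however your modules are actually split:

```
go 1.21

use (
	./services/authentication
	./services/billing
	./pkg
	./db
)
```

With every module in one workspace, services resolve pkg/ and db/ from local source instead of a published version, so a change to a shared package is visible to every consumer in the same commit.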

Saturday Afternoon: The Migration (8 hours)

For each service, I cloned the old repo and moved contents into the monorepo structure:

git clone git@github.com:sid/authentication-service.git temp
mkdir -p services/authentication
mv temp/* services/authentication/
git add services/authentication
git commit -m "Migrate authentication service"

Did this for all 15 services. Took about 4 hours with breaks.

Then came the hard part — consolidating shared code. I had 8 copies of the User model (slightly different), 5 copies of auth middleware (different versions), 3 copies of the Stripe client (different features). For each, I compared all versions, picked the most complete one, added missing features from the others, and moved it to db/models/ or pkg/.

// Before: 8 different definitions across 8 repos
// After: One definition
// db/models/user.go
type User struct {
    ID              uuid.UUID
    Email           string
    OrganizationID  uuid.UUID  // This field was missing in 3 services
    CreatedAt       time.Time
}

Sunday Morning: CI/CD (6 hours)

Before: 15 separate GitHub Actions workflows, 3,200 lines of CI config total.

After: one workflow with change detection:

name: CI
on: [push, pull_request]

jobs:
  detect-changes:
    runs-on: ubuntu-latest
    outputs:
      services: ${{ steps.filter.outputs.changes }}  # paths-filter exposes matched filter names as 'changes'
    steps:
      - uses: actions/checkout@v3
      - uses: dorny/paths-filter@v2
        id: filter
        with:
          filters: |
            authentication:
              - 'services/authentication/**'
              - 'pkg/authentication/**'
              - 'db/**'
            billing:
              - 'services/billing/**'
              - 'pkg/stripe/**'
              - 'db/**'

  test:
    needs: detect-changes
    if: needs.detect-changes.outputs.services != '[]'
    strategy:
      matrix:
        service: ${{ fromJson(needs.detect-changes.outputs.services) }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Test ${{ matrix.service }}
        run: |
          cd services/${{ matrix.service }}
          go test ./...

Change services/billing? Test only billing. Change db/models? Test everything.

Sunday Afternoon: Deployment and the Origin of Pilum

This is where I hit a new problem. I had services that were Cloud Run containers, Cloud Run Jobs, Cloud Functions, and a frontend — all in the same repo. I didn’t have a way to deploy all of these different service types with one tool. I didn’t want 5 different infrastructure tools.

At Plaid — also a monorepo — they had a ton of services and had built internal tooling around deployment that made it easy. I knew I wanted my services segmented and isolated, not a monolith, but I needed one deployment tool that understood different service types. So I started building Pilum, an open-source multi-cloud deployment orchestrator. That’s its origin story — born from this exact migration.

Sunday Evening: The Bugs (3 hours)

Deployed to staging. Found issues:

  • Circular dependency: Service A imported Service B, Service B imported Service A. Hidden by versioning in polyrepo. Visible immediately in monorepo.
  • Import paths: Missed some imports during migration.
  • Database migrations: Had to reconcile conflicting migrations from different services.

Fixed each, re-deployed, verified. Total time: roughly 24 hours over a weekend. Could have been faster with better planning. Could have preserved git history if I’d been more careful — I skipped it for speed.

The Results

We tracked metrics for 3 months before and 3 months after:

| Metric | Polyrepo (15 repos) | Monorepo | Change |
| --- | --- | --- | --- |
| Time to deploy all services | 45 min (serial) | 8 min (parallel) | -82% |
| PRs for cross-cutting change | 8-15 PRs | 1 PR | -90% |
| Time to complete feature spanning 3 services | 10-14 days | 2-3 days | -75% |
| Developer onboarding time | 2 days | 4 hours | -75% |
| Lines of CI/CD config | 3,200 lines | 400 lines | -87% |
| Time spent on dependency management | 4-6 hours/week | 0 hours/week | -100% |
| Bugs from version skew | 2-3 per month | 0 | -100% |

The Real Polyrepo Pain Points

I don’t think polyrepo “fails” — I think it introduces specific friction that most startups don’t need to pay for.

Discoverability

Where does the code live? In a monorepo, you grep and find it. In polyrepo, you’re searching across dozens of repositories, hoping the naming conventions are consistent, hoping the README is up to date. At AWS, I spend real time just finding how someone else solved a problem. In Meta’s monorepo, I could find it in seconds.

Duplication Without Standardization

Without a shared codebase, teams reinvent the same solutions independently. I’ve seen the same CDK patterns rewritten across dozens of teams at AWS. At SID, we had 8 copies of the User model. In a monorepo, there’s one definition and everyone imports it.

Version Pinning

When service-A pins one version of a shared library, service-B pins another, and service-C needs both — you have a diamond dependency problem. In a monorepo, this can’t happen: everyone is always on the same version.

The Organizational Argument

Here’s the thing: polyrepo gives you the same thing as CODEOWNERS and clear team boundaries in a monorepo. The isolation is real, but it’s organizational isolation, not engineering isolation. You’re paying for that isolation with dependency wiring, version management, and coordination overhead. For a 5-person startup, that cost is pure waste. For Amazon with genuinely independent business units — AWS, Retail, Prime Video, Alexa — the isolation maps to real organizational boundaries and the cost is justified.

When Each Approach Wins

Monorepo wins when:

  • Team size is 1-50 engineers and everyone needs to coordinate
  • Services share data models, auth, and infrastructure
  • You need to iterate and refactor quickly
  • You want one set of tooling, one CI config, one set of standards

Polyrepo wins when:

  • Genuinely independent business units with different customers and roadmaps
  • Different security/compliance requirements (HIPAA service vs marketing site)
  • Open source components that need separate release cadences
  • 200+ engineers in truly autonomous teams

The pattern across the industry: Google, Meta, Microsoft, Stripe, and Uber run monorepos. Amazon, Netflix, and Spotify run polyrepos. Both models work. The difference is organizational structure, not engineering capability.

Common Objections

“Monorepos don’t scale!”

Google has 25,000+ engineers in a monorepo with billions of lines of code. They needed custom tooling (Bazel, CitC, Critique) — but the approach scales. Most startups will never hit those limits.

“A bug in shared code takes down everything!”

In a monorepo, CI catches it before merge. In polyrepo, you catch it weeks later when services finally upgrade to the new version. The blast radius is actually smaller in a monorepo because you find and fix problems atomically.

And honestly — most production incidents aren’t from application code bugs. They’re from configuration changes, database issues, and external dependency outages. The blast radius argument applies to maybe 10% of real incidents.

“Microservices need multiple repos!”

Microservices are an architectural pattern. Repository structure is orthogonal. SID has 19 services in one repo. They deploy independently, scale independently, and have clear ownership. The monorepo is a development choice, not a deployment choice.

The Evolution Path

0-10 engineers: Monolith or 2-3 services in a monorepo. Don’t overthink it.

10-50 engineers: Natural service boundaries emerge. Split along team lines. The monorepo keeps coordination costs low.

50-200 engineers: Domain-driven design matters. Services map to business domains. Strong CODEOWNERS conventions.

200+ engineers: You might consider polyrepo for genuinely independent business units. But Google has 25,000+ engineers in a monorepo, so don’t assume you’ve hit scale limits.

The mistake is treating the 200+ architecture as the starting point. Premature microservices are premature optimization.

Practical Migration Advice

If you’re considering this:

  1. Start small. Pick 3-5 related services. Prove the value. Then expand.
  2. Plan the structure. Decide where shared packages, database models, and tools live. Document it in CONTRIBUTING.md.
  3. Invest in CI. Change detection is critical — your CI must detect which services changed and test only those (plus dependents). Budget time for this.
  4. Use CODEOWNERS. Even in a small team, explicit ownership prevents the “everyone and no one owns this” problem.
  5. Measure before and after. Track deploy time, PRs for cross-cutting changes, dependency management hours. If you can’t measure improvement, you can’t justify the migration.
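For point 4, here’s a sketch of what explicit ownership looks like in a GitHub CODEOWNERS file. The paths follow the structure from this post; the team handles are hypothetical:

```
# .github/CODEOWNERS
# The last matching pattern wins, so put broad rules first.
/pkg/                     @sid/platform-team
/db/                      @sid/platform-team @sid/payments-team
/services/authentication/ @sid/platform-team
/services/billing/        @sid/payments-team
```

GitHub then auto-requests review from the owning team on any PR touching their paths, which is what keeps “one repo” from degrading into “everyone owns everything.”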

The Monorepo Structure That Works

Here’s what SID looks like today — 19 services, growing:

├── services/           # Independent microservices (19 total)
│   ├── authentication/ # User auth, OAuth, tokens
│   ├── billing/        # Stripe integration, subscriptions
│   ├── calendar/       # Calendar management
│   ├── kanban/         # Task boards
│   ├── notifications/  # Push, email, SMS
│   ├── organization/   # Team and org management
│   ├── permissions/    # RBAC, access control
│   └── ...
├── packages/           # Shared TypeScript packages
│   ├── api/            # Generated API clients
│   ├── configs/        # Shared ESLint, TS configs
│   ├── ui/             # Component library
│   └── utils/          # Common utilities
├── apps/               # Client applications
│   ├── web/            # Next.js web app
│   ├── desktop/        # Electron app
│   └── mobile/         # React Native
├── pkg/                # Shared Go packages (30+)
│   ├── authentication/ # Auth utilities
│   ├── middleware/     # HTTP middleware
│   ├── stripe/         # Billing integration
│   └── ...
└── db/                 # Database schemas, migrations

Atomic changes. Shared code without versioning. Consistent tooling. Easy refactoring. One commit can touch the API, the web app, and a backend service — single PR, single code review.

Conclusion

The monorepo vs polyrepo debate isn’t about technology — it’s about how your organization communicates. Conway’s Law still holds: your system architecture will mirror your communication structure.

For most startups with 1-50 engineers and high coordination needs, monorepo wins. You’ll ship faster, refactor safely, and spend zero time on dependency management. For large organizations with genuinely independent business units, polyrepo maps to real organizational boundaries.

I went from polyrepo fan to monorepo convert over the course of my career — Meta showed me what was possible, AWS reminded me what the alternative felt like, and the SID migration proved it was worth the weekend.

Your first step: count how many PRs last month touched multiple repositories or required coordinated releases. If it’s more than a handful, you’re paying the coordination tax daily. The monorepo conversation is worth having.
