Git sparse-checkout

Introduction

When working with large repositories, you often don’t need the entire codebase locally. Maybe you’re only interested in a specific subdirectory, or you’re working on a monorepo where different teams manage different folders. Git’s sparse-checkout feature lets you check out only the files and directories you need, which keeps your working directory small. Combined with a partial clone, it also saves disk space and reduces clone times.


What is Sparse-Checkout?

Git sparse-checkout is a feature that allows you to have a working directory that contains only a subset of the files from your repository. Instead of checking out all files, you can specify patterns to include or exclude specific directories and files.

Benefits:

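  • Reduced disk usage: Only the files you need are written to the working directory (and, with a partial clone, only those are downloaded)
  • Faster operations: Commands such as git status have fewer files to scan in a smaller working directory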
  • Focused development: Avoid distractions from unrelated code
  • Better for large monorepos: Work with specific components without the entire repository

Basic Sparse-Checkout Setup

Here’s how to clone a repository and check out only specific subdirectories:

Method 1: Clone and Configure

Step 1: Clone the repository (without checking out files)

git clone --no-checkout https://github.com/user/large-repo.git
cd large-repo

Step 2: Enable sparse-checkout

git config core.sparseCheckout true

Step 3: Define what you want to include

# Create sparse-checkout file with patterns
echo "path/to/subfolder/*" > .git/info/sparse-checkout
echo "another/important/directory/*" >> .git/info/sparse-checkout
echo "important-file.txt" >> .git/info/sparse-checkout

Step 4: Check out the files

git checkout main  # or whatever branch you want
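
If you’d like to try Method 1 without cloning anything over the network, here is a minimal sketch that runs the same steps against a throwaway local repository. The directory layout, file names, and the demo identity passed to git commit are assumptions made up for this illustration.

```shell
#!/bin/sh
set -eu

# Build a small throwaway repository to stand in for large-repo (hypothetical layout).
work=$(mktemp -d)
git init -q "$work/large-repo"
cd "$work/large-repo"
mkdir -p path/to/subfolder unrelated
echo "wanted" > path/to/subfolder/a.txt
echo "unwanted" > unrelated/b.txt
echo "notes" > important-file.txt
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"
branch=$(git rev-parse --abbrev-ref HEAD)

# Method 1: clone without checking out, enable sparse-checkout, write the patterns.
git clone -q --no-checkout "$work/large-repo" "$work/sparse"
cd "$work/sparse"
git config core.sparseCheckout true
mkdir -p .git/info   # normally already present; created here just in case
echo "path/to/subfolder/*" > .git/info/sparse-checkout
echo "important-file.txt" >> .git/info/sparse-checkout
git checkout -q "$branch"

# List what actually landed in the working directory.
find . -path ./.git -prune -o -type f -print | sort
```

Only the paths matching the patterns should be listed; unrelated/b.txt is still in the repository history, it just isn’t written to disk.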

Method 2: Using Modern Git (v2.25+)

Step 1: Clone with sparse-checkout enabled

git clone --filter=blob:none --sparse https://github.com/user/large-repo.git
cd large-repo

Step 2: Set sparse-checkout patterns

git sparse-checkout init --cone
git sparse-checkout set path/to/subfolder another/important/directory
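
The same flow can be rehearsed offline. The sketch below assumes Git 2.25 or newer and uses a made-up local repository in place of large-repo; it leaves out --filter because a plain local clone doesn’t need it. It also shows a cone-mode detail worth knowing: files in the repository root stay checked out.

```shell
#!/bin/sh
set -eu

# Throwaway source repository (hypothetical layout for illustration).
work=$(mktemp -d)
git init -q "$work/large-repo"
cd "$work/large-repo"
mkdir -p path/to/subfolder another/important/directory unrelated
echo "a" > path/to/subfolder/a.txt
echo "b" > another/important/directory/b.txt
echo "c" > unrelated/c.txt
echo "readme" > README.md
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"

# Clone with sparse-checkout enabled: only root-level files are checked out at first.
git clone -q --sparse "$work/large-repo" "$work/sparse"
cd "$work/sparse"
ls

# Switch to cone mode and pick the directories to materialize.
git sparse-checkout init --cone
git sparse-checkout set path/to/subfolder another/important/directory

# Show the directories currently in the sparse-checkout set.
git sparse-checkout list
```

After the set command, the two chosen directories and README.md are on disk, while unrelated/ is not.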

Practical Examples

Example 1: Android AOSP - Only Framework

# Clone AOSP's frameworks/base repository, but check out only the core and services directories
git clone --filter=blob:none --sparse https://android.googlesource.com/platform/frameworks/base
cd base
git sparse-checkout init --cone
git sparse-checkout set core services

Example 2: Linux Kernel - Specific Architecture

# Clone the Linux kernel, but check out only the ARM architecture, GPIO driver, and shared header directories
git clone --filter=blob:none --sparse https://github.com/torvalds/linux.git
cd linux
git sparse-checkout init --cone
git sparse-checkout set arch/arm drivers/gpio include/linux

Sparse-Checkout Patterns

Cone vs Non-Cone Mode

Cone Mode (Recommended - Git 2.25+):

git sparse-checkout init --cone
git sparse-checkout set src docs tools
  • Simpler syntax
  • Better performance
  • Only accepts directory paths, not file-level patterns (files in the repository root are always included)

Non-Cone Mode (Legacy):

# Manually edit .git/info/sparse-checkout
echo "src/*" > .git/info/sparse-checkout
echo "docs/*.md" >> .git/info/sparse-checkout
echo "!src/temp/*" >> .git/info/sparse-checkout
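
To see how those non-cone patterns interact, here is a small sketch on a throwaway repository. The file names are invented for the demo, and git read-tree -mu HEAD is used as one way to re-apply the patterns to an already checked-out working directory.

```shell
#!/bin/sh
set -eu

# Throwaway repository with a few files to filter (hypothetical names).
repo=$(mktemp -d)/project
git init -q "$repo"
cd "$repo"
mkdir -p src/temp docs
echo "int main(void) { return 0; }" > src/main.c
echo "scratch" > src/temp/scratch.txt
echo "# Guide" > docs/guide.md
echo "binary" > docs/diagram.png
echo "readme" > README.md
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"

# Non-cone mode: patterns use gitignore-style syntax, and the last match wins.
git config core.sparseCheckout true
git config core.sparseCheckoutCone false
mkdir -p .git/info
echo "src/*" > .git/info/sparse-checkout
echo "docs/*.md" >> .git/info/sparse-checkout
echo "!src/temp/*" >> .git/info/sparse-checkout

# Re-apply the patterns to the existing working directory.
git read-tree -mu HEAD

find . -path ./.git -prune -o -type f -print | sort
```

The negated pattern removes src/temp/scratch.txt even though src/* matched its parent directory, and docs/diagram.png is dropped because only the .md files under docs/ were requested. Unlike cone mode, README.md in the root is not kept unless a pattern asks for it.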

Partial Clone Filters:

  • --filter=blob:none: Omit all blobs (file contents) at clone time; Git fetches them on demand when a checkout needs them
  • --filter=blob:limit=1k: Omit blobs of 1 KiB or larger
  • --filter=tree:0: Omit all trees (directory listings) and blobs
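
You can watch a blob:none clone fetch file contents lazily with a sketch like the one below. It assumes Git 2.25 or newer and serves a throwaway repository over a file:// URL; the two uploadpack settings are only there so the local "server" accepts filters and on-demand fetches. The count_missing helper is a name invented for this example.

```shell
#!/bin/sh
set -eu

# Throwaway "server" repository (hypothetical layout).
work=$(mktemp -d)
git init -q "$work/src"
cd "$work/src"
mkdir -p docs tools
echo "readme" > README.md
echo "guide" > docs/guide.md
echo "tool" > tools/run.sh
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"
# Let this repository serve partial clones and on-demand object requests.
git config uploadpack.allowFilter true
git config uploadpack.allowAnySHA1InWant true

# Partial, sparse clone: blobs are skipped until a checkout needs them.
git clone -q --filter=blob:none --sparse "file://$work/src" "$work/partial"
cd "$work/partial"

# Helper (hypothetical name): count objects that are referenced but not yet downloaded.
count_missing() {
    git rev-list --objects --all --missing=print | grep -c '^?' || true
}

before=$(count_missing)
git sparse-checkout init --cone
git sparse-checkout set docs
after=$(count_missing)

echo "missing blobs before: $before, after adding docs: $after"
```

Adding docs to the sparse set makes Git download the blob for docs/guide.md, so the number of missing objects drops, while the contents of tools/ stay on the server.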

Real-World Scenarios

Scenario 1: Documentation Writer

You only need documentation files from a large project:

git clone --filter=blob:none --sparse https://github.com/kubernetes/kubernetes.git
cd kubernetes
git sparse-checkout init --cone
git sparse-checkout set docs api/openapi-spec
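
Needs tend to change after the first checkout. As a rough sketch, assuming Git 2.26 or newer (for git sparse-checkout add) and a made-up local repository standing in for the real project, this is how the selection can be widened later or dropped entirely:

```shell
#!/bin/sh
set -eu

# Throwaway repository standing in for the real project (hypothetical files).
work=$(mktemp -d)
git init -q "$work/project"
cd "$work/project"
mkdir -p docs api/openapi-spec pkg
echo "guide" > docs/guide.md
echo "{}" > api/openapi-spec/swagger.json
echo "package pkg" > pkg/pkg.go
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"

# Start with the documentation only.
git sparse-checkout init --cone
git sparse-checkout set docs

# Later: add another directory without retyping the existing ones.
git sparse-checkout add api/openapi-spec
listed=$(git sparse-checkout list)
echo "$listed"

# Record whether pkg/ is on disk while the sparse set is active.
if [ -e pkg ]; then pkg_while_sparse=present; else pkg_while_sparse=absent; fi

# Finally: turn sparse-checkout off and restore the full working directory.
git sparse-checkout disable
```

set replaces the whole selection, add extends it, and disable brings every tracked file back.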


Performance Comparison

Here’s what you can expect with sparse-checkout on a large repository:

Operation     Full Clone    Sparse-Checkout    Savings
Clone time    10 minutes    2 minutes          80%
Disk usage    2.5 GB        500 MB             80%
Git status    3 seconds     0.5 seconds        83%
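
To get a feel for how much a sparse set trims in a repository of your own, one option is to compare the number of tracked files with the number Git is skipping. The sketch below assumes Git 2.25 or newer and uses a tiny invented repository; git ls-files -t marks skipped entries with an S.

```shell
#!/bin/sh
set -eu

# Tiny throwaway repository (hypothetical files) to measure against.
repo=$(mktemp -d)/repo
git init -q "$repo"
cd "$repo"
mkdir -p docs src
for i in 1 2 3 4 5; do
    echo "source $i" > "src/file$i.c"
done
echo "guide" > docs/guide.md
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"

# Keep only docs/ in the working directory.
git sparse-checkout init --cone
git sparse-checkout set docs

# Tracked files versus files left out of the working directory (status "S").
tracked=$(git ls-files | wc -l | tr -d ' ')
skipped=$(git ls-files -t | grep -c '^S' || true)
echo "tracked: $tracked, skipped by sparse-checkout: $skipped"
```

On a real repository, running the same two counts (plus du and a timed git status) before and after setting the sparse paths gives numbers you can compare with the table above.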

Conclusion

Git sparse-checkout is a powerful feature for working with large repositories efficiently. By selectively checking out only the files and directories you need, you can:

  • Save disk space and bandwidth
  • Focus on relevant code without distractions
  • Work efficiently with monorepos and large projects

Key takeaways:

  • Use --filter=blob:none --sparse for maximum efficiency
  • Prefer cone mode for simpler patterns and better performance
  • Combine with partial clone for very large repositories

This post is licensed under CC BY 4.0 by the author.