Sparse Hyper-Connections Documentation
Sparse Selective Hyper-Connections (SHC) is an efficiency framework for multi-stream residual architectures. It reduces compute and memory cost while maintaining equivalent accuracy.
Key Features
Bounds the spectral norm (ρ ≤ 1) by construction via a closed-form Cayley transform, ensuring stable training at any depth.
Replaces iterative Sinkhorn normalization with closed-form orthogonal matrix generation via the Cayley transform.
Factorized KV cache compression reduces memory from 4× to ~1.2× the baseline through learned low-rank projections.
Optional SSM distillation enables linear-time generation without a KV cache, trading ~1% accuracy for a 4.4× memory reduction.
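The Cayley-transform features above can be illustrated with a minimal sketch. It uses NumPy rather than the SHC library, and the function name cayley_orthogonal, the stream count of 4, and the skew-symmetric parameterization are assumptions made for illustration; this page does not state how SHC parameterizes its mixing matrices. The sketch shows only the general property: a skew-symmetric matrix A maps to an orthogonal Q = (I + A)^{-1}(I - A) in closed form, with no iterative normalization, and the spectral norm of Q is 1.

```python
import numpy as np


def cayley_orthogonal(params: np.ndarray) -> np.ndarray:
    """Map an unconstrained square matrix to an orthogonal one (Cayley transform)."""
    n = params.shape[0]
    # Skew-symmetric part: A^T = -A, so I + A is always invertible.
    a = params - params.T
    eye = np.eye(n)
    # Q = (I + A)^{-1} (I - A); solve() avoids forming the inverse explicitly.
    return np.linalg.solve(eye + a, eye - a)


rng = np.random.default_rng(0)
n_streams = 4  # hypothetical stream count, chosen only for this example
q = cayley_orthogonal(rng.standard_normal((n_streams, n_streams)))

# Orthogonality check: Q^T Q should equal I up to floating-point error.
orth_error = np.abs(q.T @ q - np.eye(n_streams)).max()
# The spectral norm (largest singular value) of an orthogonal matrix is 1.
spectral_norm = np.linalg.norm(q, 2)
print(f"max |Q^T Q - I| = {orth_error:.2e}, spectral norm = {spectral_norm:.6f}")
```

Because the bound holds for any value of the underlying parameters, a product of such matrices across layers cannot amplify a signal, which is the intuition behind the depth-stability claim.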
Quick Installation
pip install sparse-hyper-connections
Quick Start
import torch

from shc.models import SHCTransformer, get_config

# Create model with predefined configuration
config = get_config('500m')  # Options: '500m', '1b', '3b', '7b'
model = SHCTransformer(config)

# Forward pass on random token ids in [0, 32000): batch of 2, 512 tokens each
input_ids = torch.randint(0, 32000, (2, 512))
logits = model(input_ids)

# Generate text
output = model.generate(
    input_ids[:, :10],  # prompt: first 10 tokens of each sequence
    max_new_tokens=100,
    temperature=0.7,
)
Documentation Contents
Getting Started
API Reference
Development
Citation
If you use SHC in your research, please cite:
@article{shc2026,
  title={Sparse Selective Hyper-Connections: A Unified Framework for
         Stable and Efficient Deep Residual Learning},
  author={SHC Research Team},
  journal={IEEE Conference},
  year={2026}
}