Sapiens
Sapiens is a research framework from
Meta AI focused on embodied intelligence
and human-like multimodal learning,
aiming to train agents that can
perceive, rea...
FrankMocap
FrankMocap is a monocular 3D human
capture system that estimates body,
hand, and optionally face pose from a
single RGB image or video. It regresses
parametric...
PyCls
pycls is a focused PyTorch codebase for
image classification research that
emphasizes reproducibility and strong,
transparent baselines. It popularized
familie...
DarkForestGo
darkforestGo is an early
deep-reinforcement-learning Go engine
that combined a convolutional
policy/value network with Monte Carlo
Tree Search (MCTS) to play t...
Video Nonlocal Net
video-nonlocal-net implements Non-local
Neural Networks for video understanding,
adding long-range dependency modeling to
2D/3D ConvNet backbones. Non-local bl...
ConvNeXt V2
ConvNeXt V2 is an evolution of the
ConvNeXt architecture that co-designs
convolutional networks alongside
self-supervised learning. The V2 version
introduces a...
Denoiser
Denoiser is a real-time speech
enhancement model operating directly on
raw waveforms, designed to clean noisy
audio while running efficiently on CPU.
It uses a...
TimeSformer
TimeSformer is a vision transformer
architecture for video that extends the
standard attention mechanism into
spatiotemporal attention. The model
alternates at...
Multimodal
This project, also known as
TorchMultimodal, is a PyTorch library
for building, training, and
experimenting with multimodal,
multi-task models at scale. The li...