All-in-one training for vision models (YOLO, ViTs, RT-DETR, DINOv3): pretraining, fine-tuning, distillation.
Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation.
🔪Swiss-army knife for Android testing and development 🔪 ⛺
PoolFormer: MetaFormer Is Actually What You Need for Vision (CVPR 2022 Oral)
Official implementation of the CVPR 2024 highlight paper: Matching Anything by Segmenting Anything
A book about how to write OS kernels in Rust easily.
This Discord chatbot is incredibly versatile. Powered by the incredibly fast Groq API.
AIL framework - Analysis Information Leak framework. Project moved to https://github.com/ail-project
Upload and download files of up to 4 GiB to and from Telegram using your account
This repository includes the official implementation of OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs.
[CVPR 2024 Highlight] PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics
PANet for Instance Segmentation and Object Detection
Imitation learning benchmark focusing on complex locomotion tasks using MuJoCo.