
Introducing Petals: Crowdsourced AI System for Running 100B+ LLMs at Home

Pretrained language models have proven effective in numerous real-world tasks, and their performance typically improves as they grow larger. However, models with 100B+ parameters, such as OPT-175B and BLOOM-176B, require so much memory and compute that most academics and practitioners cannot run them on their own hardware. To democratize access to these large language models (LLMs), researchers have developed Petals, a framework for collaborative inference and fine-tuning of LLMs over the Internet. A Petals user can act as a client, a server, or both: each server stores a subset of the model's layers, and a client chains several servers together to run inference through the full model. Users can also fine-tune the model, including with parameter-efficient methods such as adapters or prompt tuning.
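The client/server split described above can be illustrated with a toy sketch. This is not the Petals API: `ToyServer`, `build_chain`, and `run_chain` are hypothetical names, the "layers" are simple arithmetic functions rather than transformer blocks, and the greedy rule of picking the server that reaches furthest is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Layer = Callable[[float], float]


def make_layer(i: int) -> Layer:
    # Toy "layer": order-sensitive, so a wrong chain gives a wrong result.
    return lambda x: 2 * x + i


@dataclass
class ToyServer:
    """Hypothetical server hosting the contiguous layer range [start, end)."""
    name: str
    start: int
    end: int
    layers: List[Layer]

    def forward(self, hidden: float, first_layer: int) -> float:
        # Run only the hosted layers from `first_layer` onward.
        for layer in self.layers[first_layer - self.start:]:
            hidden = layer(hidden)
        return hidden


def build_chain(servers: List[ToyServer], num_layers: int) -> List[Tuple[ToyServer, int]]:
    """Pick servers whose ranges cover layers 0..num_layers-1 in order."""
    chain, next_layer = [], 0
    while next_layer < num_layers:
        candidates = [s for s in servers if s.start <= next_layer < s.end]
        if not candidates:
            raise RuntimeError(f"no server hosts layer {next_layer}")
        best = max(candidates, key=lambda s: s.end)  # assumed greedy rule
        chain.append((best, next_layer))
        next_layer = best.end
    return chain


def run_chain(chain: List[Tuple[ToyServer, int]], hidden: float) -> float:
    # The client passes the hidden state from one server to the next.
    for server, first_layer in chain:
        hidden = server.forward(hidden, first_layer)
    return hidden


NUM_LAYERS = 6
all_layers = [make_layer(i) for i in range(NUM_LAYERS)]


def host(name: str, start: int, end: int) -> ToyServer:
    return ToyServer(name, start, end, all_layers[start:end])


servers = [host("A", 0, 3), host("B", 2, 5), host("C", 4, 6), host("D", 0, 2)]
chain = build_chain(servers, NUM_LAYERS)
result = run_chain(chain, 1.0)
print([(s.name, first) for s, first in chain], result)
```

Because the server ranges overlap, each server in the chain starts at the first layer not yet computed, so no layer runs twice. A real system would also have to handle servers that join, leave, or fail mid-request, which this sketch ignores.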

To run 100B+ models efficiently in this setting, the researchers implemented optimizations such as dynamic quantization, prioritizing low-latency connections, and load balancing between servers. They also discuss security, privacy, and incentive mechanisms for Petals participants. Read the full article here.
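To give a feel for what quantizing data at run time involves, here is a minimal sketch of blockwise 8-bit quantization. It is a simplified linear (absmax) variant chosen for clarity, not the scheme Petals itself uses; the function names and the block size are assumptions for illustration.

```python
from typing import List, Tuple

Block = Tuple[List[int], float]  # (int8 codes, scale for this block)


def quantize_blockwise(values: List[float], block_size: int = 4) -> List[Block]:
    """Quantize floats to codes in [-127, 127], one scale per block.

    The scale is computed from the data at call time, which is what makes
    the scheme "dynamic" in this sketch.
    """
    blocks = []
    for i in range(0, len(values), block_size):
        block = values[i:i + block_size]
        absmax = max(abs(v) for v in block)
        scale = absmax / 127 if absmax > 0 else 1.0  # avoid division by zero
        codes = [round(v / scale) for v in block]
        blocks.append((codes, scale))
    return blocks


def dequantize_blockwise(blocks: List[Block]) -> List[float]:
    """Recover approximate floats from the codes and per-block scales."""
    return [c * scale for codes, scale in blocks for c in codes]


hidden = [0.5, -1.0, 0.25, 0.0, 100.0, -50.0]
blocks = quantize_blockwise(hidden, block_size=4)
restored = dequantize_blockwise(blocks)
print(restored)
```

Using a separate scale per block keeps one large value (here `100.0`) from wiping out the precision of small values in other blocks, while each value is sent as one byte plus a shared scale instead of a full float.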

Access the paper, code, and chat tool below:

Credit to the researchers for their work on this project.
