Iterative Combinatorial Brain Surgeon (iCBS): Scalable Pruning of Large Language and Vision Models (LLVMs)

By: Elton Zhu and Serdar Kadıoğlu | November 26, 2024

FCAT collaborated with Amazon Quantum Solutions Lab to propose a new scalable pruning algorithm for large language and vision models.

The Challenge

State-of-the-art large language and vision models (LLVMs) have seen tremendous success, but their scale demands substantial computational resources. The need to balance performance and efficiency has led to growing interest in model compression techniques. With methods such as pruning (removing weights), quantization (storing weights at lower numerical precision), and distillation (training a smaller model to mimic a larger one), researchers aim to streamline these models without sacrificing accuracy.

The Impact

Combining advanced pruning methods, such as the one proposed below, with specialized hardware support for sparse models can significantly decrease the computational power and energy required to run AI models while maintaining their original performance. This can enable the deployment of smaller, more efficient models directly on devices rather than relying on server-side processing, which ultimately helps to enhance data privacy.

The Outcomes

We proposed iterative Combinatorial Brain Surgeon (iCBS), a scalable iterative pruning algorithm that optimizes over small blocks of weights in neural networks using block coordinate descent. Because each step considers only a small block of weights, iCBS can scale to very large models, including LLVMs with billions of parameters, and can achieve higher performance than existing one-shot pruning techniques.
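To make the blockwise idea concrete, here is a minimal toy sketch in Python. It is not the iCBS implementation from the paper, whose details are not given in this article. It assumes a second-order estimate of the loss increase from pruning (with the gradient taken as zero at the trained weights), starts from a one-shot magnitude mask, and then repeatedly re-optimizes the mask on a small random block of weights while holding all other weights fixed. The names `loss_increase`, `magnitude_mask`, `blockwise_prune`, and `toy_problem` are hypothetical, as is the random positive semidefinite matrix used as a stand-in Hessian.

```python
import itertools
import random


def loss_increase(weights, hessian, keep):
    """Second-order estimate of the loss increase from zeroing pruned weights.

    Assumption: the gradient is ~0 at the trained weights, so the estimate is
    0.5 * d^T H d with d_i = -w_i for pruned weights and 0 for kept ones.
    """
    pruned = [i for i, k in enumerate(keep) if not k]
    return 0.5 * sum(
        weights[i] * hessian[i][j] * weights[j] for i in pruned for j in pruned
    )


def magnitude_mask(weights, n_keep):
    """One-shot baseline: keep the n_keep largest-magnitude weights."""
    order = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))
    kept = set(order[:n_keep])
    return [i in kept for i in range(len(weights))]


def blockwise_prune(weights, hessian, n_keep, block_size=4, n_iters=50, seed=0):
    """Refine a magnitude mask by exhaustive search over small random blocks."""
    rng = random.Random(seed)
    keep = magnitude_mask(weights, n_keep)
    best = loss_increase(weights, hessian, keep)
    for _ in range(n_iters):
        block = rng.sample(range(len(weights)), block_size)
        n_keep_in_block = sum(keep[i] for i in block)
        # Try every mask on this block with the same number of kept weights,
        # so overall density is unchanged; weights outside the block are fixed.
        for kept_subset in itertools.combinations(block, n_keep_in_block):
            candidate = list(keep)
            for i in block:
                candidate[i] = i in kept_subset
            value = loss_increase(weights, hessian, candidate)
            if value < best:
                keep, best = candidate, value
    return keep, best


def toy_problem(n=12, seed=1):
    """Random weights and a positive semidefinite stand-in Hessian (A^T A)."""
    rng = random.Random(seed)
    a = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(n)]
    hessian = [
        [sum(a[k][i] * a[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]
    weights = [rng.gauss(0, 1) for _ in range(n)]
    return weights, hessian
```

Calling `blockwise_prune(*toy_problem(), n_keep=6)` returns a keep-mask with exactly six kept weights and an estimated loss increase no larger than that of the magnitude baseline, since a block update is accepted only when it lowers the estimate. Each step enumerates masks only within one block, so the cost per step depends on the block size rather than on the total number of weights.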

For further details on this project, read the full paper.

This website is operated by Fidelity Center for Applied Technology LLC (FCAT®). FCAT experiments with and provides innovative products, services, content and tools, as a service to its affiliates and as a subsidiary of FMR LLC. Based on input and feedback, FCAT is better able to engage in technology research and planning for the Fidelity family of companies. Unless otherwise indicated, the information and items presented are provided by FCAT and are not intended to provide tax, legal, insurance or investment advice and should not be construed as an offer to sell, a solicitation of an offer to buy, or a recommendation for any security by any Fidelity entity or any third-party. Third-party trademarks and service marks are the property of their respective owners. All other trademarks and service marks are the property of FMR LLC or its affiliated companies.

