- 11/26/2024
The Challenge
State-of-the-art large language and vision models (LLVMs) have seen tremendous success, but their massive scale comes at a steep computational cost. The need to balance performance against efficiency has fueled growing interest in model compression. Through techniques such as pruning, quantization, and distillation, researchers aim to shrink these models without sacrificing their accuracy.
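To make the pruning idea concrete, here is a minimal Python sketch of magnitude pruning, the simplest of these techniques: it zeros out the weights with the smallest absolute values. The function name and the use of NumPy are illustrative choices, not part of this project's codebase.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    # Zero out the `sparsity` fraction of weights with the smallest magnitudes.
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # Threshold = k-th smallest absolute value; anything at or below it is pruned.
    threshold = np.partition(flat, k - 1)[k - 1]
    return np.where(np.abs(weights) > threshold, weights, 0.0)

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8))
print(f"sparsity: {np.mean(magnitude_prune(w, 0.25) == 0):.2f}")
```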
The Impact
By combining advanced methods, such as the one proposed below, with specialized hardware support for sparse models, we can significantly reduce the computational power and energy required to run AI models while maintaining their original performance. This makes it possible to deploy smaller, more efficient models directly on devices rather than relying on server-side processing, which in turn helps strengthen data privacy.
The Outcomes
We proposed the iterative Combinatorial Brain Surgeon (iCBS), a scalable pruning algorithm that iteratively optimizes over small blocks of weights in a neural network, in a block coordinate descent fashion. This blockwise approach allows iCBS to scale to very large models, including LLVMs with billions of parameters, while achieving higher performance than existing one-shot pruning techniques.
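As a loose illustration of the blockwise idea (not the paper's actual objective or solver), the sketch below starts from a one-shot magnitude mask and then sweeps over small random blocks of weights, re-deciding which weights to prune within each block under a local quadratic model of the loss. The quadratic cost, the brute-force block solver, and the names `icbs_style_prune` and `block_prune_cost` are all simplifying assumptions made for this example; consult the paper for the real formulation.

```python
import numpy as np
from itertools import combinations

def block_prune_cost(w_b, H_b, prune_set):
    # Quadratic-model loss increase from zeroing the chosen weights:
    # delta sets the pruned coordinates to -w, cost = 0.5 * delta^T H delta.
    d = np.zeros_like(w_b)
    d[list(prune_set)] = -w_b[list(prune_set)]
    return 0.5 * d @ H_b @ d

def icbs_style_prune(w, H, sparsity, block_size=8, n_sweeps=2, seed=0):
    # Toy blockwise loop: begin with a one-shot magnitude mask, then revisit
    # random blocks and re-solve each block's small combinatorial subproblem
    # by brute force, holding the per-block pruned count fixed.
    rng = np.random.default_rng(seed)
    n = w.size
    keep = np.ones(n, dtype=bool)
    keep[np.argsort(np.abs(w))[: int(sparsity * n)]] = False
    for _ in range(n_sweeps):
        for idx in np.array_split(rng.permutation(n), max(1, n // block_size)):
            k = int((~keep[idx]).sum())  # this block's pruning budget
            if k == 0 or k == len(idx):
                continue
            H_b, w_b = H[np.ix_(idx, idx)], w[idx]
            # Enumerate all size-k prune sets in the block; keep the cheapest.
            best = min(combinations(range(len(idx)), k),
                       key=lambda s: block_prune_cost(w_b, H_b, s))
            keep[idx] = True
            keep[idx[list(best)]] = False
    return w * keep

# Tiny demo with a random positive-definite "Hessian" (illustrative only).
rng = np.random.default_rng(0)
w = rng.normal(size=32)
A = rng.normal(size=(32, 32))
H = A @ A.T + np.eye(32)
print(f"sparsity: {np.mean(icbs_style_prune(w, H, sparsity=0.5) == 0):.2f}")
```

Because the toy quadratic cost includes off-diagonal interactions between weights in a block, revisiting blocks can swap which weights are pruned relative to the one-shot mask; that iterative refinement, scaled up with dedicated combinatorial solvers, is the intuition behind the blockwise approach.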
For further details on this project, read the full paper.