Meta’s Llama 4 fashions – Llama 4 Scout and Llama 4 Maverick are right here! These fashions can assist individuals construct extra personalised multimodal experiences, based mostly on giant enhancements in picture and textual content understanding and instruction following, and can accommodate a spread of use circumstances and developer wants. Whether or not you’re constructing apps for reasoning, summarization, or conversational AI, Llama 4 Scout and Maverick ship highly effective efficiency with open entry. Llama 4 fashions may be run, fine-tuned and deployed in Oracle Cloud Infrastructure (OCI) Knowledge Science. Whether or not you’re an information scientist or a developer, OCI presents the infrastructure and instruments to maneuver quick within the evolving world of Generative AI.
What are Llama 4’s enhancements?
Meta’s Llama 4 household contains:
- Llama 4 Scout: A robust multimodal mannequin that helps context window of as much as 10M tokens with 17B energetic parameters, 16 specialists and a complete of 109B parameters that may match on a H100 (with Int4 quantization).
- Llama 4 Maverick: A 17B energetic parameter mannequin with 128 specialists and a complete of 400B parameters, delivering sturdy efficiency to price ratio for reasoning and coding whereas remaining open-weight and customizable and might match on a H100.
The brand new Llama 4 fashions use a mix of specialists (MoE) structure. In MoE fashions, a single token prompts solely a fraction of the whole parameters. MoE architectures are extra compute environment friendly for mannequin coaching and inference and, given a set coaching FLOPs funds, ship increased high quality fashions in comparison with dense architectures. Llama 4 fashions are designed with native multimodality, incorporating early fusion to seamlessly combine textual content and imaginative and prescient tokens right into a unified mannequin spine.
Llama 4 Scout and Llama 4 Maverick fashions can be found as we speak on Meta’s web site llama.com and Hugging Face, a web based mannequin repository. Oracle Cloud Infrastructure (OCI) Knowledge Science is a platform for information scientists and builders to work with open supply fashions powered by OCI’s compute infrastructure with options that assist your complete machine studying lifecycle. You’ll be able to usher in Llama 4 fashions from Hugging Face or Meta to make use of inside OCI Knowledge Science effortlessly.
Working with Llama 4 fashions by way of the Convey-Your-Personal-Container strategy
OCI Knowledge Science helps a Convey Your Personal Container strategy for mannequin deployment and jobs, which lets you deploy and positive tune the Llama 4 fashions. The Convey-Your-Personal-Container strategy requires downloading the mannequin from the host repository, both by way of the Llama web site or Hugging Face, and making a Knowledge Science mannequin catalog entry. Subsequent, you’ll obtain the newest vLLM container and push it to the OCI Registry. The newly launched vLLM 0.8.3 is suitable with the Llama 4 fashions. Then, you possibly can deploy the mannequin or run a positive tuning job with the vLLM container picture within the OCI Registry. As soon as the mannequin is deployed, you’re set to invoke the mannequin with an HTTP endpoint. For extra particulars, please take a look at our tutorials Deploy LLM Fashions utilizing BYOC and Batch Inferencing information.
Llama 4 Scout is a 17 billion energetic parameter mannequin with 16 specialists whereas Llama 4 Maverick is a 17 billion energetic parameter mannequin with 128 specialists. Llama 4 Scout (with Int4 quantization) suits on a H100 whereas Llama 4 Maverick suits on a H100. Working with a H100 in OCI Knowledge Sciences requires a reservation for the form. You are able to do so by submitting a service request and specifying the form and area you have an interest in utilizing the form. For added info on working with GPU in OCI Knowledge Science, please examine this web page.
Coming Quickly: OCI Knowledge Science AI Fast Actions, proven within the picture bellow, is a no code answer for deploying and positive tuning LLMs. AI Fast Actions already helps Llama 3.3 and can quickly assist Llama 4 fashions, offering a simplified course of for working with these fashions.
Get began with OCI Knowledge Science and Llama 4 fashions as we speak!
OCI Knowledge Science lets you keep updated with AI developments. By way of our partnership with Meta, the supply of Llama 4 fashions in OCI Knowledge Science represents a step ahead for anybody trying to construct, deploy, and refine AI options.
Discover:
References
