Generative AI chatbot using Llama 2 on AWS

Blog
15 minute read
September 30, 2023

Jayant Raj

Director, Cloud & Digital, AWS Ambassador, PwC US

This post demonstrates how to build a GenAI chatbot using a private instance of the open-source Llama 2 model, deployed on Amazon SageMaker with the AWS Cloud Development Kit (CDK) and fronted by AWS Lambda and Amazon API Gateway. Llama 2 is a family of pretrained and fine-tuned large language models (LLMs) released by Meta in July 2023. Llama 2 was pretrained on 2 trillion tokens and has a context length of 4k tokens.
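To make the request path concrete, here is a minimal sketch of a Lambda handler that sits between API Gateway and the SageMaker endpoint. It is not the post's actual code. It assumes an API Gateway proxy integration, a JSON request body with a hypothetical `prompt` field, and an endpoint that speaks the Text Generation Inference request and response shape (`inputs` / `generated_text`). A model deployed through JumpStart may expect a different payload. The client is passed in so the logic can be exercised without AWS credentials.

```python
import json


def build_tgi_payload(prompt, max_new_tokens=256, temperature=0.7):
    """Build a request body in the Text Generation Inference shape (assumed)."""
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
        },
    }


def make_handler(runtime_client, endpoint_name):
    """Return a Lambda-style handler bound to a SageMaker runtime client."""

    def handler(event, context=None):
        # An API Gateway proxy integration delivers the body as a JSON string.
        body = json.loads(event.get("body") or "{}")
        prompt = body.get("prompt", "")  # "prompt" is a hypothetical field name
        if not prompt:
            return {
                "statusCode": 400,
                "body": json.dumps({"error": "prompt is required"}),
            }

        response = runtime_client.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType="application/json",
            Body=json.dumps(build_tgi_payload(prompt)),
        )
        # Assumed response shape: a list with one {"generated_text": ...} item.
        result = json.loads(response["Body"].read())
        return {
            "statusCode": 200,
            "body": json.dumps({"answer": result[0]["generated_text"]}),
        }

    return handler
```

In a deployed function, the handler could be created once at module load with `make_handler(boto3.client("sagemaker-runtime"), endpoint_name)`, where `endpoint_name` would typically come from an environment variable set by the CDK stack.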

The post outlines an approach for deploying an open-source LLM on SageMaker and using Chainlit, an open-source Python package, to build a ChatGPT-like user interface for LLM applications. The Llama 2 model is deployed on SageMaker in two ways. The first uses the SageMaker Studio console to deploy the model via SageMaker JumpStart. The second uses AWS CDK to deploy the Llama 2 model from Hugging Face on SageMaker using the Hugging Face Text Generation Inference container.
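For the CDK route, most of the model-specific configuration ends up in the environment variables handed to the Text Generation Inference container. The sketch below shows one way to assemble that mapping. It is an illustration, not the post's stack: the helper name `tgi_environment` and its defaults are hypothetical, and the variable names (`HF_MODEL_ID`, `SM_NUM_GPUS`, `MAX_INPUT_LENGTH`, `MAX_TOTAL_TOKENS`, `HUGGING_FACE_HUB_TOKEN`) are the ones commonly used with the Hugging Face LLM container and should be checked against the container version in use. The default token limits are chosen to fit Llama 2's 4k context.

```python
def tgi_environment(
    model_id,
    num_gpus=1,
    max_input_length=3072,
    max_total_tokens=4096,
    hf_token=None,
):
    """Assemble container environment variables for a TGI-served model.

    All values are strings because container environments are string maps.
    """
    if max_input_length >= max_total_tokens:
        # Leave room for generated tokens within the total budget.
        raise ValueError("max_input_length must be smaller than max_total_tokens")

    env = {
        "HF_MODEL_ID": model_id,
        "SM_NUM_GPUS": str(num_gpus),
        "MAX_INPUT_LENGTH": str(max_input_length),
        "MAX_TOTAL_TOKENS": str(max_total_tokens),
    }
    if hf_token:
        # Gated models such as Llama 2 need a Hugging Face access token.
        env["HUGGING_FACE_HUB_TOKEN"] = hf_token
    return env


if __name__ == "__main__":
    print(tgi_environment("meta-llama/Llama-2-7b-chat-hf"))
```

The resulting dictionary could then be passed as the container `environment` when defining the SageMaker model in the CDK stack. In practice the access token would be read from a secret store rather than hard-coded.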


© 2017 - 2026 PwC. All rights reserved. PwC refers to the PwC network and/or one or more of its member firms, each of which is a separate legal entity. Please see www.pwc.com/structure for further details.
