Gradient Flow

Building LLM-powered Apps: What You Need to Know

Building LLM-powered Applications

The past few weeks have been exciting for developers interested in deploying AI-powered applications. The field is evolving quickly, and it is now possible to build AI-powered applications without having to spend months or years learning the ins and outs of machine learning. This opens up a whole new world of possibilities, as developers can now experiment with AI in ways that were never before possible.

Foundation models, particularly large language models (LLMs), are now accessible to developers with minimal or no background in machine learning or data science. These developers often work in agile teams skilled in rapid iteration, and they can swiftly develop, test, and refine innovative applications of the kind showcased on platforms like Product Hunt. Notably, they move much faster than most data and AI teams.

Building apps that rely on LLMs and other foundation models.

Custom Models

The prevailing approach for developers today is to use proprietary LLMs through APIs. However, as we explained in a recent post, factors such as domain specificity, security, privacy, regulations, and IP protection and control will prompt more organizations to invest in their own custom LLMs. As an example, Bloomberg recently detailed how they built BloombergGPT, an LLM for finance. In addition, several fine-tuned, medium-sized models have captured the attention of both researchers and developers, paving the way for more of them to create their own custom LLMs.

Many organizations that develop foundation models have dedicated teams for safety, alignment, and responsible AI. Teams that opt to build their own custom models should make similar investments.

The recent proliferation of open-source models and tools has significantly expanded the available options for teams seeking to create custom LLMs.

Third-party Integrations

OpenAI recently launched “plugins” for ChatGPT, a feature that lets developers create tools that access up-to-date information, run computations, or use third-party services. Companies such as Expedia, Instacart, and Shopify have already built plugins. Third-party plugins can range from simple calculators to more complex tools such as language translation and Wolfram Alpha integration.
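
To make the idea of tools that a model can call more concrete, here is a minimal Python sketch of a tool registry and dispatcher. It is not OpenAI's plugin API: the `tool` decorator, the `TOOLS` registry, the JSON request shape, and the `multiply` and `today` tools are all hypothetical names chosen for illustration, and the model's output is simulated with a hard-coded string.

```python
import json
from datetime import date

# Hypothetical registry mapping a tool name to its description and function.
TOOLS = {}


def tool(name, description):
    """Register a function as a callable tool (illustrative, not a real plugin API)."""
    def register(fn):
        TOOLS[name] = {"description": description, "run": fn}
        return fn
    return register


@tool("multiply", "Multiply two numbers a and b.")
def multiply(a, b):
    return a * b


@tool("today", "Return today's date in ISO format.")
def today():
    return date.today().isoformat()


def describe_tools():
    """Text that an application could place in a prompt so the model knows what exists."""
    return "\n".join(f"{name}: {entry['description']}" for name, entry in sorted(TOOLS.items()))


def dispatch(model_output):
    """Parse a JSON tool request (assumed format) and run the matching tool."""
    request = json.loads(model_output)
    name = request.get("tool")
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    result = TOOLS[name]["run"](**request.get("arguments", {}))
    return {"tool": name, "result": result}


# Simulated model output; a real application would obtain this from an LLM.
simulated_output = '{"tool": "multiply", "arguments": {"a": 6, "b": 7}}'
response = dispatch(simulated_output)
print(response)
```

The application, not the model, executes the tool, so it can validate the request and reject unknown tool names before anything runs.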

Knowledge Bases

Knowledge graphs and other external data sources can be used to enhance LLMs by providing complementary, domain-specific, factual information. Tools are emerging that connect LLMs to existing data sources and formats, as well as to newer systems such as vector databases. These tools build indices over both structured and unstructured data, which makes in-context learning possible. They also provide an interface for querying the index and obtaining knowledge-augmented output, which improves the accuracy and relevance of the information provided.
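
The index-then-query pattern described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not any particular tool's API: `embed` uses word counts in place of a learned embedding model, `ToyIndex` stands in for a vector database, and `build_prompt` is a hypothetical helper that places the retrieved passages in the prompt so the model can use them in context.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': lowercase word counts (a real system would use a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyIndex:
    """In-memory stand-in for a vector database."""

    def __init__(self, documents):
        self.entries = [(doc, embed(doc)) for doc in documents]

    def query(self, question, k=2):
        """Return the k documents most similar to the question."""
        q = embed(question)
        ranked = sorted(self.entries, key=lambda entry: cosine(q, entry[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


def build_prompt(question, passages):
    """Place retrieved passages in the prompt (hypothetical template)."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


documents = [
    "BloombergGPT is a large language model built for finance.",
    "Vector databases store embeddings for similarity search.",
    "Local Law 144 covers automated employment decision tools.",
]
index = ToyIndex(documents)
question = "Which model was built for finance?"
passages = index.query(question, k=1)
prompt = build_prompt(question, passages)
print(prompt)
```

The resulting prompt would then be sent to an LLM; that call is omitted here because it depends on the provider.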

Serving Models

Software services require several key features to meet the demands of modern computing. They must be responsive, highly available, secure, flexible, and interoperable across platforms and systems, while also being capable of handling large volumes of users and providing real-time processing and analytics capabilities. The deployment of LLMs presents unique challenges due to their size, complexity, and cost.
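
A rough calculation shows why size alone makes serving hard. The sketch below estimates the memory needed just to hold a model's weights and the minimum number of accelerators that implies. The 2 bytes per parameter (16-bit weights) and the 80 GB device size are assumptions for illustration, and the estimate ignores activations, caches, and other runtime overhead.

```python
import math


def weight_memory_gb(num_parameters, bytes_per_parameter):
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


def min_devices(num_parameters, bytes_per_parameter, device_memory_gb):
    """Lower bound on devices needed to hold the weights (no runtime overhead)."""
    return math.ceil(weight_memory_gb(num_parameters, bytes_per_parameter) / device_memory_gb)


params = 175_000_000_000  # a 175B-parameter model
memory_16bit = weight_memory_gb(params, 2)   # assumed 2 bytes per parameter
devices = min_devices(params, 2, 80)         # assumed 80 GB per device
print(memory_16bit, devices)
```

Under these assumptions, a 175B-parameter model needs 350 GB for its weights and at least five 80 GB devices before serving a single request.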

Summary

The proliferation of tools and resources for building LLM-powered applications has opened a new world of possibilities for developers. These tools allow developers to leverage the power of AI without having to learn the complexities of machine learning. As more organizations invest in their own custom LLMs and open-source resources become more widely available, the landscape for LLM-powered applications will become more diverse and fragmented. This presents both opportunities and challenges for developers.

It is important to remember that with great power comes great responsibility. Organizations must invest in safety, alignment, and responsible AI to guarantee that LLM-powered applications are employed for positive and ethical purposes.

An early sign that more tools are on the way: the Winter 2023 YC batch includes new tools to help teams build, customize, deploy, and manage LLMs.

Data Exchange Podcast

1. How Data and AI Happened.  Chris Wiggins is a Professor at Columbia University and the Chief Data Scientist at The New York Times. He is also co-author of How Data Happened, a fascinating historical exploration of how data has been used as a tool in shaping society, from the census to eugenics to Google search. The book traces the trajectory of data and explores new mathematical and computational techniques that serve to shape people, ideas, society, and economies.

2. Uncovering AI Trends: Pioneering Research and Uncharted Horizons.  Jakub Zavrel, the Founder and CEO at Zeta Alpha, discusses the 100 most cited AI papers of 2022, this year’s trending research topics, and the future of language models, multimodal AI, and beyond. He highlights the dominance of transformers, the rise of multimodal models, the significance of synthetic data, custom large language models, chain-of-thought reasoning, and next-gen search technology.


Spotlight

1. Introducing NLP Test.  This much-needed, open source tool helps to improve the quality and reliability of NLP models. It is simple to use and provides comprehensive test coverage, helping to ensure that models are safe, effective, and responsible. The library offers over 50 test types compatible with popular NLP libraries and tasks, addressing model quality aspects such as robustness, bias, fairness, representation, and accuracy before deployment in production systems.

2. Microwave from BNH.  Microwave is a free AI-based bias assessment tool, designed to assist businesses in adhering to New York City’s Local Law 144. This legislation mandates the evaluation of potential biases in automated employment decision-making systems. Microwave has been used to audit AI systems for clients ranging from Fortune 100 companies to software startups, helping them measure and manage AI risks.

3. Training 175B Parameter Language Models at 1000 GPU scale with Alpa and Ray.  Alpa is an open source compiler system for automating and democratizing model-parallel training of large deep learning models. It generates parallelization plans that match or outperform hand-tuned model-parallel training systems, even on the models those systems were designed for. This post discusses the integration of Alpa and Ray to train a 175B parameter model equivalent to GPT-3 (OPT-175B) with pipeline parallelism. The benchmarks show that Alpa can scale beyond 1000 GPUs, achieve state-of-the-art peak GPU utilization, and perform automatic LLM parallelization and partitioning with a one-line decorator.

