AI Model Training or Prompt Engineering? A Cost-Effective Approach

26 Aug, 2023

In today’s ultra-competitive digital landscape, Artificial Intelligence (AI) plays a vital role in delivering smart and adaptive solutions. A key aspect of maximizing AI’s potential lies in choosing between AI model training (also known as fine-tuning) and prompt engineering. This critical decision can have far-reaching implications on your project’s performance, budget, and time to market.

Are you looking to build a sentiment analysis tool for your restaurant management software? Perhaps a chatbot for lead qualification? Or maybe a knowledgebase bot integrated into your Confluence platform? This article will take a deep dive into each approach, debunk popular misconceptions, and provide real-world examples to guide your decision-making process.

On your journey to create an AI-enabled business, we invite you to simplify your decision-making by taking a closer look at fine-tuning, prompt engineering, plug-ins, and embeddings, and understanding when and why to use each in your AI project. By the end, you’ll have a clearer perspective to align your AI strategy with your specific business needs.

Understanding AI Model Training and Fine-tuning

What is AI model training?

Fine-tuning is the process of updating a pre-trained language model’s parameters to adapt it for a specific task. It’s more like calibrating a high-performance machine than building one from scratch.

To shed light on the mechanics, let’s delve a little deeper. Imagine you have an AI model that can distinguish between images of cats and dogs - a classic use-case scenario. This capacity is achieved by initially training the AI model with thousands of labeled images. The model, through studying these examples, learns the unique characteristics or ‘features’ that define each category. Therefore, when presented with an unfamiliar image, it can confidently categorize it as a cat or dog based on these learned features.
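To make this concrete, here is a toy sketch of the same supervised-learning idea in Python with scikit-learn. The numeric ‘features’ and their values are invented stand-ins for what a real vision model would extract from pixels:

```python
from sklearn.linear_model import LogisticRegression

# Invented stand-in features: [ear_pointiness, snout_length].
# A real vision model learns such features from raw pixels.
features = [[0.9, 0.2], [0.8, 0.3], [0.2, 0.9], [0.3, 0.8]]
labels = ["cat", "cat", "dog", "dog"]  # thousands of labeled examples in practice

model = LogisticRegression().fit(features, labels)

# An unfamiliar example is categorized based on the learned features.
print(model.predict([[0.85, 0.25]]))  # -> ['cat']
```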

Figure: Dataset for fine-tuning the AI model

Fine-tuning OpenAI Cost Structure

Now, let’s talk numbers. The cost of fine-tuning an AI model, according to OpenAI, can be divided into training and usage costs. For their GPT-3.5 Turbo fine-tuning service, the published rates (as of August 2023) are as follows:

| Stage | Cost |
| --- | --- |
| Training | $0.008 / 1K tokens |
| Usage (input) | $0.012 / 1K tokens |
| Usage (output) | $0.016 / 1K tokens |

Consider that a GPT-3.5 Turbo fine-tuning job with a training file of approximately 75,000 words (100,000 tokens), trained for three epochs, would therefore cost around $2.40. Noteworthy is the update from OpenAI that GPT-4 is set for fine-tuning availability in the coming fall, bringing even greater customization capabilities to [AI developers](/insights/ai-developers/ai-developers-transforming-traditional-coding/).
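As a sanity check on that figure, here is a minimal sketch of the arithmetic, assuming the August 2023 rates above (pricing may change):

```python
# OpenAI's published GPT-3.5 Turbo fine-tuning rates, August 2023 (subject to change).
TRAINING_RATE = 0.008      # $ per 1K training tokens
USAGE_INPUT_RATE = 0.012   # $ per 1K input tokens when serving the tuned model
USAGE_OUTPUT_RATE = 0.016  # $ per 1K output tokens when serving the tuned model

def training_cost(file_tokens: int, epochs: int = 3) -> float:
    """Training cost scales with total tokens seen: file size x number of epochs."""
    return TRAINING_RATE * (file_tokens / 1000) * epochs

print(f"${training_cost(100_000):.2f}")  # 100K-token file, 3 epochs -> $2.40
```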

Key Application Areas for Fine-tuning an AI Model

The Limitation of Fine-tuning

However, a common misconception is that fine-tuning can be applied universally, in particular to tasks requiring an understanding of unique, domain-specific information. If you were to show the AI model a single document describing a company’s organizational structure and expect it to comprehend the underlying structure and operations, you would be disappointed. It simply doesn’t work that way. Fine-tuning, like initial training, is a focused effort: it presents a multitude of examples so the model learns to label unseen instances.

Myth-busting: Fine-tuning will not enable a model to understand a company’s org structure based on a single document.

Cost-Benefit Analysis

Pragmatic DLT’s Stand: Building a new model from scratch is rarely advisable given the advancements in the field. Starting with third-party Large Language Models (LLMs) from providers like OpenAI or HuggingFace is quicker and more cost-effective.

Fine-tuning is a potent tool when you have a specific, well-defined problem and a dataset to tune the model. It comes at a higher cost but can yield high value when applied to the right kind of tasks.

The Power of Prompt Engineering

Prompt engineering represents an alternative path when considering language model applications for your business. It delivers high value with relatively lower cost implications. Let’s dissect what it entails and the unique benefits it offers:

What is Prompt Engineering?

Contrary to a common misconception, prompt engineering does not involve a hefty round of model training. Instead, it capitalizes on temporary, in-context learning during the phase known as “inference”. In simpler terms, this process involves feeding the language model crucial pieces of information (or ‘prompts’) at the time of use that guide its responses.

This can be anything from finely chosen phrases to specific facts, essentially serving as a real-time nudge that influences the model’s predictions. Two main areas where prompt engineering shines are the delivery of live information and the incorporation of your organization’s unique facts.
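As a minimal sketch of what feeding facts at inference time looks like in practice, consider the OpenAI chat API (the 0.x Python client, current at the time of writing); the company facts below are hypothetical placeholders:

```python
import openai  # pip install openai (0.x client)

# Facts injected at inference time -- no retraining involved.
# These company facts are hypothetical placeholders.
COMPANY_FACTS = """\
- Support hours are 9:00-18:00 CET, Monday to Friday.
- Refunds are processed within 5 business days.
"""

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        # The prompt carries the knowledge; the model's weights are untouched.
        {"role": "system",
         "content": f"You are a support assistant. Answer using only these facts:\n{COMPANY_FACTS}"},
        {"role": "user", "content": "When will I get my refund?"},
    ],
)
print(response.choices[0].message.content)
```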

Figure: Prompt engineering the AI model

Key Characteristics

Cost and Efficiency

Why Choose Prompt Engineering?

The primary advantage of employing prompt engineering is its remarkable ability to shape system responses without the need for computationally expensive model re-training, as the following real-world example shows.

Real-world Application: A Qualification Chatbot

Take, for instance, a qualification chatbot that aids a sales team in lead sorting. Here, prompt engineering can be used to instruct the model to respond to a variety of potential lead inquiries, without requiring full-scale model re-training.

The chatbot can be prompt-engineered to filter inquiries based on predefined sales criteria and assist with lead qualification. Thus, the sales team can not only answer customer queries swiftly but also target potential leads with higher precision.
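A minimal sketch of such a chatbot follows. The qualification criteria and status labels are hypothetical examples, not any client’s actual playbook:

```python
import openai

# Hypothetical qualification criteria -- replace with your own sales playbook.
QUALIFICATION_PROMPT = """\
You are a lead-qualification assistant for a B2B software company.
Ask about company size, budget range, and purchase timeline.
Classify the lead as HOT (budget confirmed, timeline under 3 months),
WARM (interest but no timeline), or COLD (no budget or authority).
End every reply with one line: LEAD_STATUS: <HOT|WARM|COLD>.
"""

def qualify_turn(conversation: list) -> str:
    """Run one chatbot turn; the criteria live in the prompt, not in model weights."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "system", "content": QUALIFICATION_PROMPT}] + conversation,
    )
    return response.choices[0].message.content
```

Changing the sales criteria is a one-line edit to the prompt string, which is precisely the agility discussed below.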

In conclusion, prompt engineering is a smart, effective, and cost-efficient approach to building AI systems that deliver real-world impact. Its versatility lends it an edge over the more tedious and often more expensive fine-tuning process, especially during the initial stages of a project. The balance, as with many strategic choices, lies in understanding when to employ which strategy.

When to Use Prompt Engineering

The Utility of Plug-ins and Embeddings

Understanding the effective use of plug-ins and embeddings is key in the landscape of AI development, particularly when working with Large Language Models (LLMs) like OpenAI’s ChatGPT.

AI model Plug-ins: In essence, plug-ins extend the functionality of your AI, allowing it to integrate with SaaS offerings or other built-in services. For instance, a chatbot can be further developed to integrate with your company’s database to create a richer, more customizable user experience.

Consider a plug-in for customer relationship management (CRM) software. A chatbot enhanced with a CRM plug-in can recognize specific customer queries and leverage customer data from the CRM to answer them more effectively. This makes it possible to provide more personalized customer experiences without the added cost and time of coding these integrations into your AI model from the ground up.

AI model Embeddings also extend an AI’s capabilities, albeit in a different way from plug-ins. They essentially distil the specific knowledge in your database into vector representations; at query time, the most relevant pieces are retrieved and supplied to the AI as context within the prompt. This facilitates a dialogue in which the AI can provide the most pertinent response based on an individual user’s behavior, historical actions, or preferences.

For example, a customer-facing chatbot integrated with your product documentation can provide detailed information and resolve specific product-related queries without human intervention. Equipping your app with your database’s specific knowledge will allow the AI to make accurate recommendations and assist customers in a more personalized and knowledgeable manner.
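In sketch form, the retrieval step behind such a chatbot might look like the following. It uses OpenAI’s text-embedding-ada-002 endpoint (0.x client), and the documentation snippets and query are invented placeholders:

```python
import numpy as np
import openai

def embed(texts):
    """Turn text into vectors using OpenAI's embedding endpoint."""
    result = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return np.array([item["embedding"] for item in result["data"]])

# Invented snippets standing in for your product documentation.
docs = [
    "The X200 router supports up to 128 simultaneous connections.",
    "Firmware updates for the X200 are released quarterly.",
]
doc_vectors = embed(docs)

query_vector = embed(["How many devices can connect to the X200?"])[0]

# ada-002 vectors are unit-length, so a dot product gives cosine similarity.
best_doc = docs[int(np.argmax(doc_vectors @ query_vector))]
# best_doc is then passed to the chat model as context in the prompt.
```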

Recently, OpenAI revealed that answering a question with a neural information retrieval system that uses embeddings is around 5x cheaper than using GPT-3.5 Turbo directly. Compared to GPT-4, the difference is an impressive 250x. Thus, it’s becoming clear that leveraging embeddings can lead to substantial savings and enhance the overall performance of AI systems.

In effect, plug-ins and embeddings provide an avenue to customize AI models for specific requirements without the high cost and complexity of building a new model from scratch. However, companies should consider the overall business goals, available resources, and the degree of customization required when deciding between plug-ins, embeddings, fine-tuning, or even prompt engineering for their AI development activities.

When to Use Plug-ins

When to Use Database Embeddings

Deep Dive: The Fine-tuning Procedure for AI Models

Fine-tuning AI models plays a vital role in AI utilization within businesses. This process involves tailoring a pre-trained AI model to tackle specific issues or cater to unique needs.

To ease comprehension, let’s explore a real-life example – ORTY, a restaurant management system. This system leveraged fine-tuning to perform sentiment analysis on popular review websites, providing restaurant owners with succinct overviews of the prevailing sentiments attached to their businesses. A simple task on the surface, yes, but one with far-reaching implications, especially when you consider the algorithm’s multi-faceted functionality.

To accomplish this task, the Pragmatic DLT team initially employed a basic classification model. They then fine-tuned GPT-3.5 Turbo, leveraging its categorization and filtering capabilities. The objective was twofold: to sift through thousands of reviews while accurately parsing reviewer sentiment, and to categorize those sentiments for easy digestion by restaurant owners.

The fine-tuning process commenced with feeding the AI model thousands of labeled examples, ranging from purely positive reviews to a mix of positive, neutral, and negative feedback. Following this ‘feeding’ process, the AI model was tasked with labeling previously unseen examples, guided by the distinguishing features it had learned.
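For illustration, here is a sketch of what such labeled examples might look like in OpenAI’s chat-format JSONL for GPT-3.5 Turbo fine-tuning, submitted with the 0.x Python client. The reviews are invented placeholders, not ORTY’s actual dataset:

```python
import json
import openai

SYSTEM = "Classify the review sentiment as positive, neutral, or negative."

# Invented labeled examples in OpenAI's chat-format JSONL schema.
examples = [
    {"messages": [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "The pasta was divine and the staff were lovely."},
        {"role": "assistant", "content": "positive"},
    ]},
    {"messages": [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "Food was fine, but we waited 40 minutes for a table."},
        {"role": "assistant", "content": "neutral"},
    ]},
]

with open("reviews.jsonl", "w") as f:
    for row in examples:
        f.write(json.dumps(row) + "\n")

# Upload the file and launch the fine-tuning job.
upload = openai.File.create(file=open("reviews.jsonl", "rb"), purpose="fine-tune")
job = openai.FineTuningJob.create(training_file=upload.id, model="gpt-3.5-turbo")
```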

The results? Quite impressive. The fine-tuned AI model was able to accurately distinguish and appropriately categorize sentiments within restaurant reviews. This effectively enabled the ORTY system to deliver an aggregated and easy-to-understand sentiment overview to restaurant owners.

So, how long did it take to achieve these results? From the onset of the fine-tuning process to getting the application fully ready, it took approximately two months. Restaurant owners expressed high satisfaction levels with the service provided by ORTY, further emphasizing the effectiveness and practical application of fine-tuning.

The final product considerably simplified the ORTY system’s task. This successful endeavor boosted customer satisfaction levels and increased user engagement by 20%, serving as a testament to fine-tuning’s effectiveness.

It’s worth mentioning that serving a fine-tuned model on OpenAI costs roughly eight times more than the base GPT-3.5 Turbo ($0.012 vs. $0.0015 per 1K input tokens). However, relative to building and training a model from scratch, fine-tuning delivers superior performance without a substantial financial outlay.

The decision to fine-tune an AI model relies heavily on task-specific requirements, the task’s complexity, and the expected return on investment. Without a labeled review dataset, fine-tuning would not have been feasible; in such a scenario, prompt engineering is the better option.

Deep Dive: The Prompt Engineering Route

Prompt engineering aims for a more dynamic and flexible AI, capitalizing on temporary, in-context learning at inference time. This stands in contrast to fine-tuning, which relies on a more persistent form of learning from a large corpus.

When considering a project that requires quick iterations and changes throughout its development process, prompt engineering typically comes out on top. A clear testament to this is a Pragmatic DLT case with a customer who needed a qualification chatbot for sales lead generation.

The client was looking for an AI solution that could swiftly adapt to their dynamic, evolving sales environment. The aim was to develop an AI chatbot to qualify leads, saving time for their sales team and providing a seamless customer experience.

Prompt engineering stood out as the approach for a few key reasons:

  1. Shorter iteration cycle: With prompt engineering, the project could swiftly iterate through various instructions and make changes on the fly without extensively retraining the model.

  2. Cost- and time-effective: Prompt engineering sped up the development process since it didn’t require extensive fine-tuning, making it dramatically more cost-effective than rigorous model training.

  3. Real-time adjustments: It granted the capability to adjust responses based on real-time data, ensuring that the chatbot could adapt to different customers’ contexts (see the sketch below).
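Point 1 is worth making concrete: iterating on the chatbot’s behavior amounts to editing a prompt string rather than launching a training job. A sketch, with hypothetical instruction sets:

```python
# Iterating on behavior means editing a prompt string, not retraining the model.
# Both instruction sets below are hypothetical examples.
PROMPTS = {
    "v1": "Qualify leads by asking about budget and timeline.",
    "v2": (
        "Qualify leads by asking about budget, timeline, and team size. "
        "Be concise, and always confirm the visitor's industry first."
    ),
}

ACTIVE_VERSION = "v2"  # flipping this deploys new behavior instantly

def system_prompt() -> str:
    return PROMPTS[ACTIVE_VERSION]
```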

Fast-forward to the end of the project: the client was enormously satisfied. The whole process took about two months, significantly less than the time required to build and fine-tune an AI model from scratch. The resulting chatbot handled sales lead qualification efficiently, with precision and context-awareness.

This pragmatic approach illustrates the power and flexibility of prompt engineering. When an AI API needs to be customized for specific interactive responses and infused with agility, prompt engineering provides a swift, cost-effective solution. It is important to note, however, that the choice ultimately depends on the type and purpose of the project.

Deep Dive: Embedding Data into AI

When deploying AI applications, particularly Natural Language Processing applications powered by language models like ChatGPT, an effective method to provide specific or proprietary knowledge is through data embeddings. In this context, embedding data goes beyond merely training the model with large volumes of general-purpose text. It equips the application with the specific knowledge from your database, thus significantly improving the relevancy of the AI application to your specific field or industry.

Figure: ChatGPT embeddings in Atlassian’s JIRA and Confluence

Applications of Data Embedding

A pivotal example of this use case is Get Report’s Copilot for Confluence - a chatbot deeply integrated into the Confluence knowledgebase. Serving hundreds of enterprise companies, this chatbot enables employees to use ChatGPT effectively in the context of their internal database.

In Get Report’s Copilot application, ChatGPT is not just trained on general web text; it is also equipped with data embedded directly from the company’s Confluence knowledgebase. This ensures that the chatbot’s responses are not merely general but purposefully relevant, providing employees with the precise information they need within the context of their company’s specific operations.

The process of achieving this contextual knowledge involves a series of prompts passed to the AI, providing it with the vital context required to generate suitable responses. This method of equipping the AI with proprietary information from a user’s database has proven both efficient and versatile.
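In sketch form (an illustrative reconstruction, not Get Report’s actual implementation), that context-passing step might look like this:

```python
def build_prompt(question: str, retrieved_pages: list) -> list:
    """Fold retrieved knowledge-base excerpts into the chat prompt as context."""
    context = "\n\n".join(retrieved_pages)
    return [
        {"role": "system", "content": (
            "Answer strictly from the internal documentation excerpts below. "
            "If the answer is not there, say you don't know.\n\n" + context
        )},
        {"role": "user", "content": question},
    ]
```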

When it comes to project length, the data-embedding process for this kind of application usually takes between 3 and 6 months to fully implement, depending on the size and complexity of your database. Feedback from clients who have adopted Get Report’s Copilot indicates high satisfaction rates, primarily due to the impressive increase in operational efficiency and the resulting cost savings achieved by leveraging AI in this manner.

Limitations

Takeaways

Embedding data into an AI application isn’t just about improving results; it’s also about cost efficiency. Consider a scenario where you’re using language models for information retrieval. Asking “What is the capital of Delaware?” in a neural information retrieval system costs around 5 times less than with GPT-3.5 Turbo. Comparing against GPT-4? There’s a massive 250-times difference in cost! Thus, data embedding presents an opportunity for businesses to access powerful AI capability while maintaining control over cost.

By integrating real-time databases through data embedding, AI models like ChatGPT can serve highly specific queries that are tailored to the needs of the enterprise. Companies like Get Report have successfully leveraged this approach, making it a viable strategy for those looking to deploy AI in a context-rich environment.

Comparative Analysis

To make an informed decision on whether to go for AI model fine-tuning, prompt engineering, or plug-ins and embeddings, it’s essential to compare these approaches head-to-head.

Table: Fine-tuning vs. Prompt Engineering vs. Plug-ins & Embeddings

| Criteria | Fine-Tuning | Prompt Engineering | Plug-ins & Embeddings |
| --- | --- | --- | --- |
| Cost-effectiveness | High initial training cost, plus usage costs | Mostly cost-effective | One-time setup; cost-effective for data-centric apps |
| Time to market | Significant training and iteration time | Quick to implement | Moderate setup and testing time |
| Complexity | Training on thousands of labeled examples | Quick and straightforward | Moderate; needs proper data structuring |
| Flexibility | Limited by labeled data quality and quantity | Highly flexible | Limited to database/SaaS capabilities |
| Scalability | Scales well, but cost grows linearly | Redundant token costs pile up | Highly scalable |

Criteria Explained

  1. Cost-Effectiveness

    • Fine-Tuning: Initial costs are high due to training; usage also incurs costs ($0.016 / 1K tokens).
    • Prompt Engineering: Mostly cost-effective, especially if you just append “Be Concise” to your prompt.
    • Plug-ins & Embeddings: One-time setup, especially cost-effective for data-centric applications.
  2. Time to Market

    • Fine-Tuning: Requires significant time for training and iterations.
    • Prompt Engineering: Quick to implement, shorter iteration cycles.
    • Plug-ins & Embeddings: Moderate time required for setting up and testing.
  3. Complexity

    • Fine-Tuning: Involves training on thousands of labeled examples.
    • Prompt Engineering: Quick and straightforward; no formal training required.
    • Plug-ins & Embeddings: Moderate setup time, needs proper data structuring.
  4. Flexibility

    • Fine-Tuning: Limited by the quality and quantity of labeled data.
    • Prompt Engineering: Highly flexible; good for rapid iterations.
    • Plug-ins & Embeddings: Limited to database or SaaS service capabilities.
  5. Scalability

    • Fine-Tuning: Scales well but increases cost linearly.
    • Prompt Engineering: Not ideal for scaling; redundant token costs pile up.
    • Plug-ins & Embeddings: Highly scalable, especially if integrated with robust SaaS services.

Based on these criteria, you can select the most appropriate method that aligns with your business needs, time constraints, and budget.

Summary of Approaches and Recommendations Based on Specific Business Needs

When deciding on the optimal route for your AI projects, understanding the trade-offs associated with each approach is key. Typically, this centers around three interconnected concerns: costs, capabilities, and time to market. This is the usual approach top AI consultants like Pragmatic DLT suggest:

1. Building a New AI Model from Scratch

Though possible, this option is often the most costly and time-intensive. As Pragmatic DLT, a seasoned AI consulting company, advises, the cost of building a new model generally outweighs any reasonable investment given the current advancements in Large Language Models (LLMs) from providers such as OpenAI or HuggingFace. These third-party models already offer highly competitive conversational and analytical capabilities that a newly built model might struggle to match. In light of this, dedicating resources to creating a new AI model from the ground up can be considered impractical and inefficient.

2. Leveraging Third-Party Large Language Models (LLMs) with a Focus on Prompt Engineering

The recommended first step is to start with a third-party LLM from a provider like OpenAI or HuggingFace. These models can perform tasks based on your existing knowledge base, and custom response logic can be engineered via LangChain or OpenAI functions. This approach enables the creation of a production-ready application that is both quick and cost-effective. It provides an early-to-market solution and lets you start delivering value to stakeholders and customers at an accelerated pace.
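As one hedged illustration of the “OpenAI functions” route, a chat completion can be asked to return structured output against a declared function schema; the record_lead function and its fields here are hypothetical:

```python
import openai

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user",
               "content": "We're a 40-person retailer, budget approved, buying this quarter."}],
    functions=[{
        "name": "record_lead",  # hypothetical function name and schema
        "description": "Store a qualified sales lead.",
        "parameters": {
            "type": "object",
            "properties": {
                "company_size": {"type": "integer"},
                "status": {"type": "string", "enum": ["hot", "warm", "cold"]},
            },
            "required": ["status"],
        },
    }],
)
# The model replies with a function call whose JSON arguments you can parse.
print(response.choices[0].message.get("function_call"))
```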

3. Fine-tuning of Base Open Source Models

Once your initial AI solution is in place and performing well, further iterations can involve fine-tuning open-source models. This stage typically involves a more research-oriented approach. However, once completed, you end up with a proprietary model enriched with Intellectual Property (IP). This fine-tuned AI can be a significant asset for further fundraising and for increasing your company’s competitive edge, despite the potentially lengthy timeline to achieve it.

Understanding where your business stands on the axes of time, budget, and technical prowess is key to deciding which approach to take. Pragmatically, starting with prompt engineering powered by third-party LLMs and then iterating towards fine-tuned open-source models is the recommended path. It strikes a balance between costs, capabilities, and time to market, ensuring your AI investment yields maximum returns.
