26 Aug, 2023
In today’s ultra-competitive digital landscape, Artificial Intelligence (AI) plays a vital role in delivering smart and adaptive solutions. A key aspect of maximizing AI’s potential lies in choosing between AI model training (also known as fine-tuning) and prompt engineering. This critical decision can have far-reaching implications on your project’s performance, budget, and time to market.
Are you looking to build a sentiment analysis tool for your restaurant management software? Perhaps a chatbot for lead qualification? Or maybe a knowledgebase bot integrated into your Confluence platform? This article will take a deep dive into each approach, debunk popular misconceptions, and provide real-world examples to guide your decision-making process.
On your journey to create an AI-enabled business, we invite you to simplify your decision-making process by taking a closer look at fine-tuning, prompt engineering, plug-ins, and embeddings, and understanding when and why to use each in your AI project. By the end, you’ll have a clearer perspective to align your AI strategy with your specific business needs.
Fine-tuning is the process of updating a pre-trained language model’s parameters to adapt it for a specific task. It’s more like calibrating a high-performance machine rather than building it from scratch.
To shed light on the mechanics, let’s delve a little deeper. Imagine you have an AI model that can distinguish between images of cats and dogs - a classic use-case scenario. This capacity is achieved by initially training the AI model with thousands of labeled images. The model, through studying these examples, learns the unique characteristics or ‘features’ that define each category. Therefore, when presented with an unfamiliar image, it can confidently categorize it as a cat or dog based on these learned features.
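To make the mechanics concrete, here is a deliberately tiny sketch of the same idea: learning per-category "features" from labeled examples, then labeling an unseen example by similarity. The 2-D feature vectors and the nearest-centroid rule are illustrative stand-ins; real image models learn far richer features, but the labeling workflow is the same.

```python
# Toy illustration of training on labeled examples: a nearest-centroid
# classifier over hypothetical 2-D feature vectors (e.g. ear-shape
# score, snout-length score).

def train(examples):
    """examples: list of ((x, y), label); returns per-label centroids."""
    sums, counts = {}, {}
    for (x, y), label in examples:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {lbl: (sx / counts[lbl], sy / counts[lbl])
            for lbl, (sx, sy) in sums.items()}

def classify(centroids, point):
    """Label an unseen point by its closest learned centroid."""
    def dist2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return min(centroids, key=lambda lbl: dist2(centroids[lbl], point))

labeled = [((0.9, 0.2), "cat"), ((0.8, 0.3), "cat"),
           ((0.2, 0.9), "dog"), ((0.1, 0.8), "dog")]
centroids = train(labeled)
print(classify(centroids, (0.85, 0.25)))  # an unseen example -> "cat"
```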
Now, let’s talk numbers. According to OpenAI, the cost of fine-tuning can be sub-divided into training and usage costs. For their GPT-3.5 Turbo fine-tuning service, the published rates at launch were $0.008 per 1K training tokens, $0.012 per 1K input tokens, and $0.016 per 1K output tokens.
Consider that a GPT-3.5 Turbo fine-tuning job with a training file of approximately 75,000 words (100,000 tokens), trained for three epochs, would therefore cost around $2.40. Noteworthy is the update from OpenAI that GPT-4 is set for fine-tuning availability in the coming fall, bringing even greater customized capabilities to [AI developers](/insights/ai-developers/ai-developers-transforming-traditional-coding/).
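The arithmetic behind that estimate is simple enough to sketch. The $0.008 per 1K training tokens rate matches OpenAI's announced GPT-3.5 Turbo pricing at the time of writing; always check current pricing before budgeting.

```python
# Back-of-the-envelope fine-tuning cost estimate: billed training
# tokens = tokens in the file x number of epochs.

TRAINING_RATE_PER_1K_TOKENS = 0.008  # USD, subject to change

def training_cost(tokens_in_file: int, epochs: int = 3) -> float:
    """Estimated training cost in USD for a fine-tuning job."""
    return tokens_in_file / 1000 * epochs * TRAINING_RATE_PER_1K_TOKENS

# The example from the article: ~100,000 tokens trained for 3 epochs.
print(f"${training_cost(100_000, epochs=3):.2f}")  # -> $2.40
```

Usage (inference) costs for the fine-tuned model come on top of this and scale with traffic, which is why the 8x serving multiplier discussed below matters.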
Key Application Areas for Fine-tuning an AI Model
However, a common misconception is that fine-tuning can be applied universally, in particular to tasks requiring an understanding of unique, domain-specific information. If you were to show the AI model a single document describing a company’s organizational structure and expect it to comprehend the underlying structure and operations, you would unfortunately be disappointed. It simply doesn’t work that way. Fine-tuning, like initial training, is a focused effort: you present a multitude of examples so the model can learn to label unseen instances.
Myth-busting: Fine-tuning will not enable a model to understand a company’s org structure based on a single document.
GPT-3.5 Turbo vs. GPT-4: A fine-tuned GPT-3.5 Turbo can match or even outperform GPT-4 on specific, narrow tasks, at about 50x lower cost.
Fine-tuning Adds Cost: Expect roughly an 8x increase in per-token usage cost for a fine-tuned GPT-3.5 Turbo compared to the base model, on top of the training costs.
Pragmatic DLT’s Stand: Building a new model from scratch is rarely advisable given the advancements in the field. Starting with third-party Large Language Models (LLMs) from providers like OpenAI or Hugging Face is quicker and more cost-effective.
Fine-tuning is a potent tool when you have a specific, well-defined problem and a dataset to tune the model. It comes at a higher cost but can yield high value when applied to the right kind of tasks.
Prompt engineering represents an alternative path when considering language model applications for your business. It delivers high value with relatively lower cost implications. Let’s dissect what it entails and the unique benefits it offers:
Contrary to a common misconception, prompt engineering does not involve a hefty round of model training. Instead, it capitalizes on temporary, in-context learning during the phase known as “inference”. In simpler terms, this process involves feeding the language model crucial pieces of information (or ‘prompts’) at the time of use that guide its responses.
This can be anything from finely chosen phrases to specific facts, essentially serving as a real-time nudge that influences the model’s prediction. Two main areas where prompt engineering shines are delivering live information and injecting your organization’s unique facts.
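A minimal sketch of that real-time nudge: facts are assembled into the prompt at inference time, so updating the model's knowledge means updating a string, not retraining. The `build_prompt` helper and the facts below are illustrative, not a real API.

```python
# Sketch of prompt engineering: inject up-to-date, organization-specific
# facts into the prompt at inference time.

def build_prompt(question: str, facts: list[str]) -> str:
    """Assemble a grounded prompt from live facts and a user question."""
    context = "\n".join(f"- {fact}" for fact in facts)
    return (
        "Answer using ONLY the facts below. If the answer is not in "
        "the facts, say you don't know.\n"
        f"Facts:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_prompt(
    "What are today's opening hours?",
    ["The store opens 9:00-18:00 on weekdays.",
     "Today is a weekday."],
)
print(prompt)  # this string would be sent to the model at inference time
```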
Key Characteristics
Cost and Efficiency
The primary advantage of employing prompt engineering is its remarkable ability to adjust system responses without the need for computationally expensive model re-training. Here are a few compelling reasons to opt for such an approach:
Flexibility: It provides a useful medium to experiment with different instructions quickly. For example, you can try various prompts with your AI before deciding the most effective one that fetches desired results.
Cost-efficiency: While a fine-tuned model on OpenAI costs around 8 times more than the base model, modifying the prompt can achieve a similar impact at a mere fraction of the cost. Thus, your operating expenses can be substantially reduced.
Shorter Iteration Cycle: In the beginning stages of a project, prompt engineering can be a more efficient strategy due to its shorter iteration cycle. This is advantageous for projects that are still being defined and can benefit from quick adjustments.
Take, for instance, the usage of a qualification chatbot aiding a sales team in lead sorting. Here, prompt engineering can be used to instruct the model to respond to a variety of potential lead inquiries, without requiring a full-scale model re-training.
The chatbot can be prompt-engineered to filter inquiries based on predefined sales criteria and assist with lead qualification. Thus, the sales team can not only answer customer queries swiftly but also target potential leads with higher precision.
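The qualification idea can be sketched as follows: the predefined sales criteria are written straight into the system prompt, so changing the qualification rules means editing text, not retraining a model. The criteria and helper below are hypothetical examples, not from a specific project.

```python
# Sketch of a lead-qualification system prompt built from predefined
# sales criteria; swapping criteria requires no model re-training.

QUALIFICATION_CRITERIA = [
    "Company size is 50+ employees",
    "A budget of at least $10k is mentioned",
    "A decision maker is involved in the conversation",
]

def qualification_system_prompt(criteria: list[str]) -> str:
    """Render criteria into a system prompt for the chatbot."""
    rules = "\n".join(f"{i}. {c}" for i, c in enumerate(criteria, 1))
    return (
        "You are a sales assistant. Classify each inquiry as QUALIFIED "
        "or UNQUALIFIED against these criteria, and explain which "
        f"criteria matched:\n{rules}"
    )

print(qualification_system_prompt(QUALIFICATION_CRITERIA))
```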
In conclusion, prompt engineering is a smart, effective, and cost-efficient approach to build AI systems that deliver real-world impact. Its versatile application lends it an edge over the more tedious and often expensive fine-tuning process, especially during the initial stages of a project. The balance, as with many strategic choices, lies in understanding when to employ which strategy.
Understanding the effective use of plug-ins and embeddings is key in the landscape of AI development, particularly when working with Large Language Models (LLMs) like OpenAI’s ChatGPT.
AI model Plug-ins: In essence, plug-ins extend the functionality of your AI, allowing it to integrate with SaaS offerings or other built-in services. For instance, a chatbot can be further developed to integrate with your company’s database to create a richer, more customizable user experience.
Consider a plug-in for customer relationship management (CRM) software. A chatbot enhanced by a CRM plug-in is able to recognize specific customer queries and leverage customer data from the CRM to answer them more effectively. This makes it possible to provide more personalized customer experiences, without the added cost and time of coding these integrations into your AI model from the ground up.
AI model Embeddings also lend themselves to extending an AI’s capabilities, albeit in a different way from plug-ins. They essentially distill the specific knowledge from your database and supply it to the AI through a series of prompts. This facilitates a dialogue in which the AI can provide the most pertinent response based on an individual user’s behavior, historical actions, or preferences.
For example, a customer-facing chatbot integrated with your product documentation can provide detailed information and resolve specific product-related queries without human intervention. Equipping your app with your database’s specific knowledge will allow the AI to make accurate recommendations and assist customers in a more personalized and knowledgeable manner.
Recently, OpenAI revealed that asking a question in a neural information retrieval system that uses embeddings is around 5x cheaper than using GPT-3.5 Turbo. When compared to GPT-4, there’s an impressive 250x difference. Thus, it’s becoming clear that leveraging embeddings can lead to substantial savings and enhance the overall performance of AI systems.
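The retrieval mechanics behind this saving can be sketched in a few lines: documents and the query are mapped to vectors, the closest document is found by cosine similarity, and only that snippet is passed to the model. The tiny hand-made vectors below stand in for real embedding-model output (e.g. from OpenAI's embedding endpoints).

```python
# Sketch of embeddings-based retrieval: find the document whose vector
# is most similar to the query vector, then prompt the model with it.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-D embeddings standing in for real model output.
docs = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "warranty terms": [0.0, 0.2, 0.9],
}

def retrieve(query_vec):
    """Return the document name most similar to the query embedding."""
    return max(docs, key=lambda name: cosine(docs[name], query_vec))

best = retrieve([0.85, 0.15, 0.05])  # a query about refunds
print(best)  # -> "refund policy"
```

Because only the retrieved snippet (not the whole knowledge base) is sent to the model, most queries are answered with a cheap vector lookup plus a short completion, which is where the cost advantage comes from.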
In effect, plug-ins and embeddings provide an avenue to customize AI models for specific requirements without the high cost and complexity of building a new model from scratch. However, companies should consider the overall business goals, available resources, and the degree of customization required when deciding between plug-ins, embeddings, fine-tuning, or even prompt engineering for their AI development activities.
Fine-tuning AI models plays a vital role in AI utilization within businesses. This process involves tailoring a pre-trained AI model to tackle specific issues or cater to unique needs.
To ease comprehension, let’s explore a real-life example – ORTY, a restaurant management system. This system leveraged fine-tuning to perform sentiment analysis on popular review websites, providing restaurant owners with succinct overviews of the prevailing sentiments attached to their businesses. A simple task on the surface, yes, but one with far-reaching implications, especially when you consider the algorithm’s multi-faceted functionality.
To accomplish this task, the Pragmatic DLT team initially employed a basic classification model, then fine-tuned GPT-3.5 Turbo, leveraging its categorization and filtering capabilities. The objective was twofold: to sift through thousands of reviews and accurately parse reviewer sentiments, and to categorize those sentiments for easy digestion by restaurant owners.
The fine-tuning process commenced with feeding the AI model thousands of labeled examples. These examples ranged from purely positive reviews to a mix of positive, neutral, and negative feedback. Following this ‘feeding’ process, the AI model was then tasked with labeling previously unseen examples, being guided by the distinguishing features it had learned.
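Preparing those labeled examples can be sketched as building a JSONL training file in the chat format OpenAI's GPT-3.5 Turbo fine-tuning expects: each line is one example with a system instruction, the user input, and the desired label. The sample reviews below are invented for illustration, not from the ORTY dataset.

```python
# Sketch: convert labeled reviews into fine-tuning JSONL (one JSON
# object per line, each a short chat ending in the desired label).
import json

labeled_reviews = [
    ("The pasta was amazing and staff were lovely!", "positive"),
    ("Waited 40 minutes and the food arrived cold.", "negative"),
    ("Decent place, nothing special.", "neutral"),
]

def to_jsonl(examples) -> str:
    """Render (review, label) pairs as chat-format JSONL."""
    lines = []
    for review, label in examples:
        lines.append(json.dumps({"messages": [
            {"role": "system", "content": "Classify the review sentiment."},
            {"role": "user", "content": review},
            {"role": "assistant", "content": label},
        ]}))
    return "\n".join(lines)

print(to_jsonl(labeled_reviews))  # ready to upload as a training file
```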
The results? Quite impressive. The fine-tuned AI model was able to accurately distinguish and appropriately categorize sentiments within restaurant reviews. This effectively enabled the ORTY system to deliver an aggregated and easy-to-understand sentiment overview to restaurant owners.
So, how long did it take to achieve these results? From the outset of the fine-tuning process to a fully ready application, it took approximately two months. Restaurant owners expressed high satisfaction with the service provided by ORTY, further emphasizing the effectiveness and practical application of fine-tuning.
The final product achieved remarkable results in distinguishing and categorizing sentiments within restaurant reviews. This fine-tuned AI model considerably simplified the ORTY system’s task and provided restaurant owners with an easy-to-digest sentiment overview. This successful endeavor boosted customer satisfaction levels and increased user engagement by 20%, serving as a testament to the fine-tuning effectiveness.
It’s worth mentioning that serving a fine-tuned model costs about eight times more than the base model on OpenAI. However, for the right task, that higher serving cost is repaid by superior performance without a substantial overall financial outlay.
The decision to fine-tune an AI model relies heavily on task-specific requirements, the task’s complexity, and the expected return on investment. Without a labeled review dataset, fine-tuning would not have been feasible; in such a scenario, prompting is the better option.
Prompt engineering aims for a more dynamic and flexible AI, capitalizing on temporary learning during immediate inference. This stands in contrast to fine-tuning which relies on a more persistent form of learning from a large corpus.
When considering a project that requires quick iterations and changes throughout its development process, prompt engineering typically comes out on top. A clear testament to this is a Pragmatic DLT case with a customer who needed a qualification chatbot for sales lead generation.
This client was tirelessly looking for an AI solution that could swiftly adapt to their dynamic and evolving sales environment. The aim was to develop an AI chatbot to qualify leads, saving time for their sales team and providing a seamless customer experience.
Prompt engineering stood out as the approach for a few key reasons:
Shorter iteration cycle: With prompt engineering, the project could swiftly iterate through various instructions and make changes on the fly without extensively retraining the model.
Cost and time-effective: Prompt engineering could speed up the development process since it didn’t require extensive fine-tuning. This factor dramatically increased its cost-effectiveness as opposed to rigorous model training.
Real-time adjustments: It granted the capability to adjust responses based on real-time data, ensuring that the chatbot could adapt to different customers’ contexts.
Fast forward to the end of the project: the client was enormously satisfied. The whole process took about two months, significantly less than the time building and fine-tuning an AI model from scratch would have required. The resulting chatbot was able to efficiently handle sales lead qualification with precision and context-awareness.
This pragmatic approach illustrates the power and flexibility of prompt engineering. When an AI API needs to be customized for specific interactive responses and infused with agility, prompt engineering provides a swift, cost-effective solution to consider. It is important to note, however, that the right choice ultimately depends on the type and purpose of the project undertaken.
When deploying AI applications, particularly Natural Language Processing applications powered by language models like ChatGPT, an effective method to provide specific or proprietary knowledge is through data embeddings. In this context, embedding data goes beyond merely training the model with large volumes of general-purpose text. It equips the application with the specific knowledge from your database, thus significantly improving the relevancy of the AI application to your specific field or industry.
A pivotal example of this use case is Get Report’s Copilot for Confluence, a chatbot deeply integrated into the Confluence knowledgebase. Serving hundreds of enterprise companies, this chatbot enables employees to effectively use ChatGPT in the context of their internal database.
In the Get Report’s Copilot application, ChatGPT is not just trained on general web text, but it is also equipped with data embedded directly from the company’s Confluence knowledgebase. This ensures that the chatbot’s responses are not just general, but purposefully relevant, providing employees with the precise information they need within the context of their company’s specific operations.
The process of achieving this contextual knowledge involves a series of prompts passed to the AI, which provide it with the vital context required to generate suitable responses. This method of equipping the AI with proprietary information from a user’s database has proven to be substantially efficient and versatile.
When it comes to project length, data embedding for this kind of application usually takes three to six months to fully implement, depending on the size and complexity of your database. Feedback from clients who have adopted Get Report’s Copilot indicates high satisfaction rates, primarily due to the impressive increase in operational efficiency and the resulting cost savings.
Embedding data into an AI application isn’t just about improving results; it’s also about cost efficiency. Consider a scenario where you’re using language models for information retrieval. Asking “What is the capital of Delaware?” in a neural information retrieval system costs around 5 times less than with GPT-3.5 Turbo. Comparing against GPT-4, there’s a massive 250 times difference in cost. Thus, data embedding presents an opportunity for businesses to access powerful AI capability while maintaining control over cost.
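Those multipliers are easy to sanity-check with per-query arithmetic. The per-query rates below are hypothetical placeholders chosen to reproduce the quoted ratios; the point is how the ratio between an embeddings-based lookup and a large-model call is computed, not the absolute prices.

```python
# Illustrative per-query cost comparison under assumed (placeholder)
# prices; real rates depend on current provider pricing.

EMBEDDING_COST_PER_QUERY = 0.0001  # assumed: embed query + vector search
GPT35_COST_PER_QUERY = 0.0005      # assumed: short GPT-3.5 Turbo answer
GPT4_COST_PER_QUERY = 0.025        # assumed: same answer on GPT-4

ratio_35 = GPT35_COST_PER_QUERY / EMBEDDING_COST_PER_QUERY
ratio_4 = GPT4_COST_PER_QUERY / EMBEDDING_COST_PER_QUERY
print(f"vs GPT-3.5: {ratio_35:.0f}x cheaper")  # -> 5x
print(f"vs GPT-4:  {ratio_4:.0f}x cheaper")    # -> 250x
```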
By integrating real-time databases through data embedding, AI models like ChatGPT can serve highly specific queries that are tailored to the needs of the enterprise. Companies like Get Report have successfully leveraged this approach, making it a viable strategy for those looking to deploy AI in a context-rich environment.
To make an informed decision on whether to go for AI model fine-tuning, prompt engineering, or plug-ins and embeddings, it’s essential to compare these approaches head-to-head.
The key criteria to weigh are cost-effectiveness, time to market, complexity, flexibility, and scalability.
Based on these criteria, you can select the most appropriate method that aligns with your business needs, time constraints, and budget.
When deciding on the optimal route for your AI projects, understanding the trade-offs associated with each approach is key. Typically, this centers around three interconnected concerns: costs, capabilities, and time to market. This is the approach top AI consultants like Pragmatic DLT typically suggest:
Though possible, this option is often the most costly and time-intensive. As Pragmatic DLT, a seasoned AI consulting company, advises, building a new model generally exceeds reasonable investment parameters given the current advancements in Large Language Models (LLMs) such as those from OpenAI or Hugging Face. These third-party models already offer highly competitive conversational and analytical capabilities that a newly built model might struggle to match. In light of this, dedicating resources to creating a new AI model from the ground up can be considered impractical and inefficient.
The recommended first step is to start with a third-party LLM like OpenAI or HuggingFace. These models can be utilized to perform tasks based on your existing knowledge base, and custom algorithms can be engineered for responses via LangChain/OpenAI functions. This approach enables the creation of a production-ready application that is both quick and cost-effective. It provides an early-to-market solution and allows you to start providing value to stakeholders and customers at an accelerated pace.
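One common way to engineer custom responses on top of a third-party LLM is an OpenAI-functions-style tool definition: the model is told it may call a function, and your own code runs the real lookup against your knowledge base. The `lookup_order` name and its fields below are hypothetical; only the JSON-schema shape follows OpenAI's function-calling convention.

```python
# Sketch of a function definition in the OpenAI function-calling style,
# plus the backend stub your application would execute when the model
# chooses to call it.

lookup_order_tool = {
    "name": "lookup_order",
    "description": "Fetch an order's status from the internal database.",
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string",
                         "description": "Internal order ID"},
        },
        "required": ["order_id"],
    },
}

def lookup_order(order_id: str) -> dict:
    """Stand-in for a real database query."""
    return {"order_id": order_id, "status": "shipped"}

# In production, lookup_order_tool is passed to the chat API; here we
# just simulate the model calling it with extracted arguments.
print(lookup_order("A-1042"))
```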
Once your initial AI solution is in place and performing well, further iterations can involve fine-tuning open-source models. This stage typically involves a more research-oriented approach. However, once completed, you end up with a proprietary model enriched with Intellectual Property (IP). This fine-tuned AI can be a significant asset for further fundraising and for increasing your company’s competitive edge, despite the potentially lengthy timeline to achieve it.
Understanding where your business stands on the axes of time, budget, and technical prowess is essential in deciding which approach to take. Pragmatically, starting with prompt engineering powered by third-party LLMs and then gradually iterating towards fine-tuning open-source models is the most recommended path. It strikes a balance between costs, capabilities, and time to market, ensuring your AI investment yields maximum returns.