BishopPhillips

LARGE LANGUAGE MODELS - Prompt Engineering

LLM Few Shot Prompting
Few Shot Learning

Few-shot learning is a subfield of machine learning that aims to teach AI models to learn from only a small number of labeled training examples. The goal of few-shot learning is to enable models to generalize to new, unseen data samples based on the few samples we give them during the training process. Few-shot learning has applications in many subfields of data science, such as computer vision, natural language processing, robotics, healthcare, and signal processing.

One example of few-shot learning is one-shot learning, which involves training an AI model on a single example of each class. One-shot learning can be used to recognize new objects or patterns with only one example.

Another example of few-shot learning is meta-learning, which involves training an AI model on a set of tasks and using this knowledge to learn new tasks quickly. Meta-learning can be used to improve the performance of AI models on new and unseen tasks.

Few-shot learning has many potential applications in areas such as computer vision, natural language processing (NLP), and robotics. For example, few-shot learning can be applied to various NLP tasks like text classification, sentiment analysis, and language translation. In computer vision, few-shot learning can be used for image classification, object recognition, gesture recognition, and other object-related applications.

If you want to learn more about few-shot learning and its applications, I recommend checking out this article on AIMultiple that provides a comprehensive overview of the topic with examples and use cases.

Few Shot Prompting

In prompt design we can use the concept of single- and few-shot prompting to teach the LLM the type and context of the desired response, and then pose a problem or a question. The LLM will then use the provided context as a kind of template for the desired response.

The idea behind this approach is that we are providing the LLM with an example of the type of answer, or of how to solve a problem, from which it is to extrapolate an answer to a different problem or query.

For example, a one-shot prompt might be of the form:

"A twingle is a marsupial native to Tasmania known for its beautiful song but ugly disposition and foul smell. An example of a sentence that uses twingle is:"

Here we have taught the LLM what a twingle is, provided a little data about it, and lastly instructed it as to the required action. The LLM completes the prompt thus:

A twingle is a marsupial native to Tasmania that is known for its beautiful song but has an ugly disposition and foul smell. Here’s an example sentence that uses the word:

“The twingle’s song was so beautiful that it attracted many tourists, but its unpleasant odor kept them at bay.”

I hope this helps!

Amusingly, BingAI in this case even assists us with a reference to a guide to marsupials native to Tasmania, where we will not, of course, find any reference to "twingles"!  Single-shot prompting is one of the most common ways you are likely to drive a search assistant AI - short of just asking it a question.
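The one-shot pattern above can be sketched programmatically. Here is a minimal Python helper for assembling such a prompt; the function name and wording are illustrative, not part of any real API:

```python
def one_shot_prompt(term: str, definition: str) -> str:
    """Build a one-shot prompt: teach the model a term, then ask for a sentence."""
    return (
        f"A {term} is {definition}. "
        f"An example of a sentence that uses {term} is:"
    )

# Reconstruct the twingle prompt from the text above.
prompt = one_shot_prompt(
    "twingle",
    "a marsupial native to Tasmania known for its beautiful song "
    "but ugly disposition and foul smell",
)
print(prompt)
```

The trailing "is:" is the completion cue: it invites the model to continue the text rather than comment on it.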

Let's contrast the difference between a zero shot prompt and a one shot prompt:

Our zero shot prompt might be:

"An example of a sentence that uses twomple is:"

To which BingAI (set to balanced) replies:

"I’m sorry, but I am not aware of the word “twomple”. It is possible that it is a newly coined term or a misspelling of another word. If you could provide me with more context or information about the word, I might be able to help you better."

So now we add some information making it a one-shot prompt:

"A twomple is a new type of reram memory. An example of a sentence that uses twomple is:"

To which BingAI in balanced mode responds:

"Thank you for providing more information. According to my search results, “twomple” is not a commonly used word in the English language. However, it appears that “twomple” is a term used in the field of computer science to describe a new type of resistive random-access memory (ReRAM). ReRAM is a type of non-volatile memory that can store data even when the power is turned off.

Here’s an example sentence that uses “twomple”: “The new twomple-based ReRAM technology has the potential to revolutionize the computer memory industry”.

I hope this helps!"

A twomple is, of course, nothing of the kind, but we have "trained" the AI to accept that it is for the purposes of this conversation.

Turning to a few shot example:

"If the big red rock qwingle likes to eat rocks and the small green plant wangle likes to eat plants and the large lemon spider twibble likes to eat spiders, using these examples, what does the medium purple fly twimble like to eat?"

The LLM returned:

"Based on the examples provided, we can assume that the medium purple fly twimble likes to eat flies. This is because each of the other creatures mentioned in the examples likes to eat something that shares its name, and so it would make sense for the medium purple fly twimble to like to eat flies as well."

This was surprisingly difficult to formulate, as formulations that excluded the word "If" resulted in the AI attempting to write a long whimsical paragraph about fly twimbles without identifying that they would eat flies.

BingAI then promptly forgot what it had learned, so that even though we were in the same conversation, it could not successfully answer a further nonsense pattern about what the "crimson toffee gingo" would eat, nor could it recall what the red rock qwingle likes to eat.  Later we shall look at an example where the context of the conversation was successfully held throughout, but clearly the LLM struggled with this.
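The naming pattern the LLM had to spot can be made explicit in code. In this sketch the parsing rule - that the creature's food is the third word of a "size colour food creature" phrase - is our assumption about how these invented phrases are structured:

```python
def infer_diet(phrase: str) -> str:
    """Given a phrase of the form 'size colour food creature', return the food word."""
    size, colour, food, creature = phrase.split()
    return food  # pluralisation ("fly" -> "flies") is left to the reader

# The rule holds for all three exemplars from the prompt.
assert infer_diet("big red rock qwingle") == "rock"
assert infer_diet("small green plant wangle") == "plant"
assert infer_diet("large lemon spider twibble") == "spider"
print(infer_diet("medium purple fly twimble"))  # fly
```

This is exactly the extrapolation the few-shot prompt asks the LLM to perform: induce the rule from the exemplars, then apply it to the held-out case.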

Turning to a favourite example of few-shot prompting, we shall consider mathematical reasoning.  First, we shall teach the AI how to determine whether a number is even or odd:

"For the remainder of this conversation remember that an even number must be divisible by 2 and result in an integer with no decimal places needed, so 2 is an even number but 3 is not.  An odd number is any number that is not an even number."

Now we will give it a problem:

"Is it true or false that the sum of the odd numbers in the following list is even: 1, 2, 3, 24, 33 ?"

The LLM replied:

"Based on the definition you provided, we can say that the sum of the odd numbers in the list is odd. This is because the odd numbers in the list are 1, 3, and 33, and their sum is 37, which is an odd number. Therefore, the statement is false."
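The arithmetic the LLM performed can be checked directly. A short Python sketch applying the same definition of odd numbers to the list:

```python
# Apply the taught definition: an odd number is any number not divisible by 2.
numbers = [1, 2, 3, 24, 33]
odds = [n for n in numbers if n % 2 != 0]
total = sum(odds)
print(odds, total, total % 2 == 0)  # [1, 3, 33] 37 False -> the statement is false
```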

Now, in this case we gave it the rule to follow; in few-shot prompting we would normally just give it the examples.  So with a prompt like:

"If a function, we shall call F of a set of numbers gives the following results:

F(1, 2, 3)
A: 6

F(1, 3, 5)
A: 9

F(6, 7, 9)
A:"

Now note that we have introduced a new concept here - the use of "A:", which is a label the LLM recognises as meaning "Answer".  This prompt provided a pair of examples, but no rules for calculating the answer.  The LLM responded with:

"Based on the given information, we can see that the function F takes a set of numbers as input and returns a single number as output.

We are given three examples of the function F applied to different sets of numbers:

  • F(1, 2, 3) = 6
  • F(1, 3, 5) = 9
  • F(6, 7, 9) = ?

We can see that the first two examples have a pattern: the output is equal to the sum of the inputs. For example, in the first example:

F(1, 2, 3) = 1 + 2 + 3 = 6

Similarly, in the second example:

F(1, 3, 5) = 1 + 3 + 5 = 9

Therefore, we can assume that this pattern continues for all inputs. Using this pattern, we can calculate the output for the third example:

F(6, 7, 9) = 6 + 7 + 9 = 22

Therefore, we can conclude that:

  • F(1, 2, 3) = 6
  • F(1, 3, 5) = 9
  • F(6, 7, 9) = 22"

Which is, of course, correct.  Here, using only examples, the LLM was able to deduce the rule that applied and then apply that rule to calculate the answer to the remaining problem.
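The rule the LLM inferred can be stated and checked in a few lines, treating F as simple summation, which is what the exemplars imply:

```python
def F(*numbers):
    """The rule inferred from the exemplars: F is the sum of its arguments."""
    return sum(numbers)

# Verify the inferred rule against the two worked exemplars before applying it.
assert F(1, 2, 3) == 6
assert F(1, 3, 5) == 9
print(F(6, 7, 9))  # 22
```

Checking the rule against the given exemplars first mirrors what the LLM did: it confirmed the pattern held for both examples before extrapolating.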

Bias In Few-Shot Prompting

The mechanism of few-shot prompting is that the LLM is being trained to solve the problem by a series of examples that cover the range of the problem domain.  Thus a question around the effect of bias arises: to what extent does the nature of the exemplars provided impact the outcome?

Testing with longer sets of few-shot prompts reveals the following:

  • Order of exemplars does not matter.  Arranging the exemplars in any order does not appear to impact the accuracy of the result.
  • Balance of exemplars does matter.  Where a problem could be interpreted in more than one way, or offers either choice in a binary outcome (such as a True/False response, or text (attitude) classification), or indeed where the LLM does not actually know the answer but only has assertive answers available in the response domain, the balance of exemplars matters.  In the absence of a clear pattern the LLM will provide the response that matches the majority of the outcomes in the exemplars.  Generally I would advise against prompts that require a binary response; always allow for an "I don't know" or "balanced" option, as forcing a binary response to a prompt that is neither true nor false means the LLM has to effectively guess the answer, and the training set then acts as an artificial constraint rather than a doorway to learning.

You can build an "I don't know" option into the training set by inserting at least one few-shot exemplar with an equivocal outcome, such as:

Q: How many twundlecups comprise a quintocletom?
A: I don't know.

or

Q: Is it true or false that a blintocurple is always green?
A: ?
