Making LLMs do what you want

29 March 2025, 16 minutes

Tired of spending hours convincing an LLM to do just that one thing? Have you tried lots of different approaches with no success, and it still eludes you why it can't manage this really simple task? Look no further! You've reached the place where you'll learn the dark arts of making any LLM fall head over heels for you. I have worked with LLMs for about two years now, and this guide covers what I've learnt: how to write the best prompts, and how to modify them to accommodate evolving requirements.

Introduction

With time, I have grown patient and more understanding of what an LLM goes through in its short life, and with this patience and understanding I have found ways to communicate with the LLM more efficiently for the highest ROI.

In this article, I'm going to give you tips on the best way to communicate with an LLM: what to do, what to avoid, and how to make the LLM's job easier. This is not about hacking prompts or making bots do what they weren't designed to do. It's a simple guide for folks looking to build with LLMs.

Though this post is targeted mainly at software engineers who build things that rely on LLMs, anyone who wants to improve their interactions with LLMs should find it useful too.

Of course, these aren't strict rules, so choose the ones that you think fit the best for your situation. These are techniques I've found useful that I often apply while working with LLMs.

Setting a goal

"If you are sitting, just sit; if you are walking, just walk; don't wobble."

— Zen proverb

Focus

Give the LLM a clear goal - something that makes its life worth living.

This is important especially if the task at hand requires some level of reasoning.

When you ask an LLM to do multiple things at once, it has a tendency to mix things up. To avoid that, ensure your prompt asks it to do just one thing.

This doesn't mean that it's incapable of doing multiple steps or holding multiple properties about some data. It just means that the LLM has one direction.

For example, if you have an agent, its goal should be one thing, and to accomplish that goal it chooses from a set of options. Even though it may output multiple options, ensure the agent has just one goal so it can generate the best options.

Another example: say you want it to look at some HTML code and tell you what the website is about, and you also want it to generate some code to modify the HTML. It may be tempting to just do both of these things in one prompt, and in fact, in this case, that is what you should do. The part about finding what the website is about could well help it generate better, more contextual code too! (More on this below.)

But if instead the tasks were to generate some code to do thing 1 and then generate some other code to do thing 2, or to do some detailed analysis about something, then it makes more sense to divide your task into separate LLM calls. These are tasks that need reasoning, so each would have its own separate reasoning field too.
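
To make the split concrete, here's a minimal sketch in Python. call_llm is a hypothetical helper standing in for whatever client you use to send a prompt and get text back; the prompts and HTML are placeholders.

def call_llm(prompt: str) -> str:
    # Hypothetical helper: replace the body with a real call to your LLM provider.
    return "<LLM response goes here>"

html = "<html>...</html>"  # the page you're working with

# One focused call per goal, instead of cramming both into a single prompt.
code_for_thing_1 = call_llm(
    "Generate code that adds a table of contents to the HTML below.\n\n" + html
)
code_for_thing_2 = call_llm(
    "Generate code that lazy-loads every image in the HTML below.\n\n" + html
)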

Get to the point

Bad

You are a helpful assistant.  I'm working on a project to understand websites better.
Sometimes websites are hard to understand. So ensure that you choose the correct topic.
Also, ignore anything in the scripts and style tags. Can you look at this HTML and tell 
me a little bit about the website, like what it might be about?  I'm not sure if this is 
going to work, but I hope you can figure it out. 

Good

Extract the primary topic of the webpage

Avoid repeating yourself

When you repeat yourself, it sets a precedent. It makes it look like you NEED to repeat yourself to get your point across. So the LLM doesn't have to take the things you don't repeat that seriously, and might just forget about them. This leads you to repeating other things that you want to ensure the LLM does.

When the requirements evolve, you update the prompt to add the new behaviour, then realise it isn't listening to you again, so you have to repeat yourself once more. This leads to a practice where you have to repeat your point for every single change you add.

Avoid negatives

"you want it to be one way, but it's the other way"

— Marlo Stanfield (The Wire)

There are certain things that you want the LLM to avoid doing. Often, when you want an LLM to not do something, in a sinister turn of events it goes ahead and does exactly that.

Turn the phrasing into what it needs to do instead of the negative. Though of course, there may be cases where a positive sentence isn't possible or is too weird - in that case turn it into a reasoning step. More on this in the Thinking section below.
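
For example, following the Bad/Good pattern from earlier:

Bad

Do not use technical jargon in the description.

Good

Write the description in plain, everyday language.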

Let it do things that only it can do

If there's something that can be done programmatically, then do it programmatically.

Instead of asking an LLM to ignore something or not consider something in the prompt, the best thing to do is to just remove it programmatically before sending it.

When sending something to an LLM, ensure you optimise that data and remove everything it doesn't need to have, and add any context it needs.

When you're adding a subtask for it to follow or a condition, think first whether what you want it to do can be done programmatically.

For example, suppose you observe that the LLM often misses adding https before URLs (where https is the only option). Instead of asking the LLM to ensure the prefix, handle it in a post-processing step outside the LLM that corrects the URL and forwards it. This removes one unnecessary condition, letting the LLM focus on the task at hand.
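
As a rough sketch in Python (normalize_url is a made-up helper name), the post-processing step could be as simple as this:

def normalize_url(url: str) -> str:
    # Assumes https is the only valid scheme for our URLs.
    url = url.strip()
    if url.startswith("https://"):
        return url
    if url.startswith("http://"):
        return "https://" + url[len("http://"):]
    return "https://" + url

# Applied to whatever URL the LLM returned, before forwarding it.
print(normalize_url("example.com/page"))    # https://example.com/page
print(normalize_url("http://example.com"))  # https://example.com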

Customizing the output format

Popular LLMs allow a developer to set the response mode, which is normally either JSON or text. And the LLM religiously follows the output format as you've specified it.

This wasn't the case with earlier LLMs. There was no JSON mode, and even when there was, it used to break often, and sometimes the request itself would take too long or never complete.

Though those painful days are behind us now, it's still worth mentioning the ways we solved this problem, since they can be applied wherever an LLM needs to output in a specific way or tone, or similar to something.
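
For reference, here's what setting the response mode looks like today - a minimal sketch assuming the OpenAI Python SDK; other providers expose similar options, and the model name and prompts are placeholders.

import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": "Extract the primary topic of the webpage. "
                                      "Respond as JSON with a single key 'topic'."},
        {"role": "user", "content": "<html>...</html>"},
    ],
    response_format={"type": "json_object"},  # ask for JSON output
)

print(json.loads(response.choices[0].message.content)["topic"])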

Providing examples in the prompt

Simply, provide examples in the prompt of what you expect. This gives the LLM a clear idea of what it should output.

This method requires that you already have some examples that don't bias the LLM's outputs. A bad example would be something realistic, because when that case actually occurs, the LLM is likely to output exactly what the example showed, even if it doesn't fit or it would have done better without the example. If this behaviour is acceptable to you, then realistic examples do make sense. Giving examples can also help in logical tasks sometimes, by acting as a "framework" for solving the specific problem it's tasked with.

A way to make the LLM even stricter about following your output format is to show it that it has already done so.

Most LLMs use a chat-like structure where messages alternate between user and assistant, with one system prompt specified separately. If, in a conversation, the assistant has already followed your output format or specific instruction once, then it's more likely that its next response will do the same.

So, all you need to do is fabricate a small interaction in which the assistant answers perfectly. The next message will then likely try to follow the "perfection" of the previous one. This has the same restrictions as providing examples directly in the system prompt, i.e. it might bias the LLM towards certain kinds of outputs, so ensure that the messages are neutral.

This is also commonly known as few shot prompting.
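
A minimal sketch of such a fabricated exchange, using the common chat-completion message format (the HTML and topic here are deliberately bland placeholders):

real_html = "<html>...</html>"  # the page you actually want analysed

messages = [
    {"role": "system", "content": 'Extract the primary topic of the webpage. '
                                  'Respond as JSON: {"topic": "..."}'},
    # Fabricated example turn: kept neutral so it doesn't bias real outputs.
    {"role": "user", "content": "<html><body><h1>Placeholder page about an example topic</h1></body></html>"},
    {"role": "assistant", "content": '{"topic": "example topic"}'},
    # The real input follows; the model now tends to mirror the format above.
    {"role": "user", "content": real_html},
]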

Using a different format

It's possible that the format you're expecting it to output in is too complicated. For example, if you're expecting the output in JSON and the output contains quotes, with some escaping inside the quotes too (say it's some code), then JSON was not the best choice (this was when models didn't support JSON mode natively), since the LLM needed to keep track of whether it was inside a string within a JSON value and how much it needed to escape. This led to JSON parsing errors, forcing us to rely on less strict JSON parsers, invent our own, or write a lot of post-processing (spaghetti) code.

In a situation like this, it's worth exploring alternative formats that may be easier to work with. In the above example, since quotes are the problem, the solution would be to use a format less reliant on quotes. I've found the best format to be simply delimiting parts of the output with specific strings that look meaningful to an LLM but are highly unlikely to be part of the output. This way the LLM doesn't have to adjust the actual task it needs to do to accommodate the format.

Again, now that LLM providers let developers change the response format easily, this is not a problem anymore for the output format at a global level. But you can definitely use similar tactics to ensure that the LLM outputs in a specific tone, or that an inner field of your output follows a specific format, and so on.
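
Here's a small sketch of what a delimiter-based format and its parser might look like; the delimiter strings are arbitrary, chosen only to be unlikely to show up in real output.

OUTPUT_INSTRUCTIONS = """
Respond using exactly this structure:
===EXPLANATION===
<what the code does>
===CODE===
<the code, with no extra commentary>
"""

def parse_delimited(text: str) -> dict:
    """Split the LLM response on the delimiter headers."""
    explanation, _, code = text.partition("===CODE===")
    explanation = explanation.replace("===EXPLANATION===", "").strip()
    return {"explanation": explanation, "code": code.strip()}

sample = "===EXPLANATION===\nAdds a heading.\n===CODE===\nprint('hi')"
print(parse_delimited(sample))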

Setting an identity

"The Way that can be described is not the unchanging Way"

— Tao Te Ching

You must consider how much it matters for the LLM to have an identity. Often an LLM doesn't need one - it might perform the same, or in some cases better, without an identity.

Adding an identity puts the LLM inside a box. By giving an identity to an LLM, you restrict it to that identity.

Below I describe the different aspects of this take in a bit more detail. Feel free to reach out to me if you have anything to the contrary - any example of how setting an identity for an LLM helped you, even marginally.

Adding an identity for the bot helps when it's a chat-like interface you're dealing with. The features I work with are rarely chat-based, and even when they are, I usually avoid adding much to the bot's identity. This is something that can be user-controlled too.

Personality

You can make the AI friendly, concise, chatty, angry, cheerful, etc. Though, for most tasks you want the LLM to perform, you might not need to set a personality at all.

The cases where you do need to set a personality are when you're looking to affect the user's emotions with the generated text. Give it the mood that you want the user to have.

Values

You can set some values for the LLM that dictate its actions. For example, if you want it to generate leftist content, or to avoid right-leaning content, you give it the identity of someone who is leftist. If you want it to do or avoid something, give it the identity of someone who is like that.

Thinking

"models need tokens to think"

— Andrej Karpathy

Space for reasoning

You need to give an LLM some space to think. When you ask it something like counting the number of 'r's in "strawberry", or multiplying two big numbers, the AI (without reasoning) will not usually give you the right answer unless it's specifically trained for that task.

Reasoning gives the AI space to lay out its thoughts. One should assume that there is no internal "thinking" going on inside an LLM (though you'd be interested in this paper by Anthropic if you'd like to explore the topic further); its output tokens are its thoughts. So you need to give it some space to think. Adding "Think step by step" often does the trick. I often explicitly ask it to add a key "reasoning" to the JSON object it outputs, where it can lay out its thoughts about the task.

Think first

When you add reasoning to the LLM's output, you need to make sure it happens before the result. If the thinking comes after the output, the LLM can only realise its mistake and dwell on it in the reasoning.

So, my prompt usually includes "Start with reasoning" so that the LLM thinks first and gives me a result based on that thinking.

There's also another approach for when you're not confident in the initial line of thought the LLM might come up with: reviewing the answer.

You can basically ask it to start with reasoning, then give the best output based on that reasoning, then review that output and its reasoning, and then, based on the review, give a more refined output.

This kind of flow is useful for outputs that are less deterministic, or need a stricter review.
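
As a sketch, the output format for such a flow could look something like this (the field names are just placeholders):

Output format:
{
    "reasoning": <initial reasoning>,
    "initial_result": <result based on the reasoning>,
    "review": <critique of the initial result and its reasoning>,
    "result": <refined result based on the review>
}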

Tell me how to think - exploit the output schema

For a simple reasoning task with few conditions, one can simply add a reasoning key and get away with it working correctly for 90% of the cases.

For tasks that have multiple conditions and need more reasoning, it is good to specify how the LLM should go about it.

Basically, you can ask it to reason based on your steps and to follow the steps exactly like you have stated them.

If you feel like the LLM is still missing some steps, make them part of the output format you expect from the LLM. You can also add a conclusion field that contains a conclusion based on the reasoning, placed just before the result.

I like to keep it so that it's clear that it's part of the "reasoning" process and not part of the output.

Before:

<Identity>
<Task>
Start with reasoning, ensure you follow these steps while reasoning:
1. <condition 1>
2. <condition 2>
...
Output format:
{
    "reasoning": <reasoning>,
    "result": <result>
}

After:

<Identity>
<Task>

Start with reasoning.

Output format:
{
    "reasoning": {
        "<condition 1 reasoning>": <reasoning>,
        "<condition 1 reasoning>": <reasoning>,
        ...
    },
    "conclusion": <conclusion>,
    "result": <result>
}

Miscellaneous

Doing multiple independent things in one prompt

In the focus section, I cautioned against doing multiple things in one prompt, but there may be constraints you have to follow leading you to use just one prompt for doing multiple things.

It's not that difficult to do, and though it might be less focused than doing just one thing at a time, it can actually help an LLM give a better quality output sometimes. Let me explain how.

Let's say you're giving an LLM two tasks, each with its own separate steps, reasoning, conditions, etc. If you are asking an LLM to do multiple things, you need it to do them sequentially. If needed, you may add a reasoning key to each task. So the first task may have its own reasoning step independent of the second task, which may or may not have reasoning steps of its own. The LLM will output the answer to your first task while also keeping in mind implicit constraints, restrictions, or anything unrelated to the first task that is specified in the second task. This might lower the quality of the output for the first task.

The second task, though, has the context of the first task and has perhaps already reasoned a bit about some things. At this point, the LLM's job is to do just one thing, i.e. finish the output of the second task. Though some of the restrictions still apply as they did to the first task, it can still give a better quality output for the second task since it's equipped with the reasoning and knowledge from the first.

In other words, the later a task is specified, the better the reasoning available to it.
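
A sketch of what the combined output format could look like, with the tasks answered in sequence (the names are placeholders):

Output format:
{
    "task_1_reasoning": <reasoning for task 1>,
    "task_1_result": <result for task 1>,
    "task_2_reasoning": <reasoning for task 2, written with task 1 already in context>,
    "task_2_result": <result for task 2>
}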

Context

The better the context you add to your prompt, the better the result will be for your use case. Collect all the context you can about your topic and add it to the prompt. If the context is so long that it wouldn't fit in the context window, then you might need to implement a way to retrieve the relevant part of the context and put that inside the prompt. A popular way to do this, widely known as a RAG-based solution, is as follows: divide the context into a set of meaningful chunks and store each chunk's embedding in a vector DB. Then, for a specific task, create a "search text" based on keywords in the task's query (or through other techniques) and generate an embedding for this text. The context chunks whose embeddings are closest to the search text's embedding are the ones you'd usually want to add to your prompt.
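
A minimal sketch of the retrieval step in Python. Here, embed is a crude stand-in for a real embedding model, the chunking is a naive paragraph split, and the vector DB is skipped with embeddings computed on the fly for brevity.

import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in for a real embedding model; replace with your provider's call.
    # A crude bag-of-characters vector, just so the sketch runs end to end.
    vec = np.zeros(128)
    for ch in text.lower():
        vec[ord(ch) % 128] += 1.0
    return vec

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_chunks(context: str, search_text: str, k: int = 3) -> list[str]:
    """Return the k context chunks whose embeddings are closest to the search text."""
    chunks = [c.strip() for c in context.split("\n\n") if c.strip()]
    query = embed(search_text)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), query), reverse=True)
    return ranked[:k]

# The selected chunks then go into the prompt alongside the task itself.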

An organized and structured context helps the most, since you can select the specific part you know will help the LLM the most for a query. An unorganized context would need to be organized first.

Conclusion

Though there is a lot of research around trying to explain how LLMs work, LLMs are still, at this point, a black box. I'm confident that someday we will open the box and see what's inside, but that's not today. So here we are, turning instructing an LLM into an art while LLMs create real art for us. Anyway, to reiterate: these aren't hard and fast rules. Be creative and think about how you can best utilize the AI for your use case. You got this. I believe in you.

tech generative-ai llm