AI Agents' onboarding
Last week I was sitting with a friend of mine who is working on an AI co-pilot for X startup.
(I know, I know. A lot of my posts start with “I was sitting with a friend”. This is a gentle way of signalling that I have friends. These are real people, people. And real discussions).
We were discussing agents. I have never let my basic knowledge of things stop me from forming opinions, so why stop now? This is what I shared with him.
This may not be new to people who are deep into AI and agents. But for an average product person it might be interesting.
There are 2 types of agents. Agent as colleague. Agent as copilot.
Colleague agents are much more expensive. Take Devin, for example. He is supposed to be a junior developer you can delegate tasks to and who then works independently. Cursor Agent (Composer) and Replit Agent, on the other hand, are examples of copilots. They take some tasks off you, but I assume they enhance your ability as a developer rather than working independently. And they are priced accordingly: 500 dollars a month for Devin versus 20 dollars for Composer.
So how do you work with each type of agent? I told my friend that if you think of Devin as a real colleague, an employee in your organisation, then its onboarding should be like a real colleague’s. Let’s take the example of a new agent. We have decided to buy Jagan, a data analyst agent created by Data Friend Labs. What should the workflow look like in an ideal scenario?
This may already be happening with some agents today, but let’s say we don’t know of any and we’re using some good old-fashioned first principles thinking. How do you onboard an agent?
A regular employee joins. They have a manager. They have an onboarding buddy. They get an email id. They get a Slack account. They have an onboarding checklist created by their manager. Now suppose you want this colleague, Jagan, to be as useful as a human employee. How would you do its onboarding?
A normal human would probably do 1-1s with key people. Read a lot of documentation. If they have questions, they could ping their onboarding buddy or manager. Do some onboarding projects. At Facebook, you might even do your first PR on your first day.
No human is going to remember every word of every document they read during their onboarding. They create an onboarding folder, bookmark all the Google Doc links and take notes from the 1-1s with key stakeholders. Is the process the same for Jagan?
What is Jagan anyway? His brain is powered by some LLM. His (local) memory is the docs he has read. There will be some RAG-based system to extract relevant information depending on the task. He will have access to tools, which means his account would be created on Tableau, Metabase, Amplitude, AppsFlyer, Google suite and every other tool he needs to do his job.
Do we really need stakeholder interviews? Will employees be interested in talking to an AI colleague to provide context? Or will we just create the relevant documents to give Jagan context on each stakeholder?
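To make the "RAG based system" idea a bit more concrete, here is a minimal sketch in Python. It assumes Jagan's memory is just a dictionary of onboarding docs, and it uses plain word overlap as a stand-in for the embedding search a real RAG setup would more likely use. The document names, their contents and the `retrieve` function are all made up for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(task, docs, top_k=2):
    """Rank docs by how many distinct words they share with the task.

    Word overlap is only a toy replacement for embedding similarity.
    Docs with no overlap at all are dropped.
    """
    task_words = set(tokenize(task))
    scored = []
    for title, body in docs.items():
        overlap = len(task_words & set(tokenize(body)))
        scored.append((overlap, title))
    # Highest overlap first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for score, title in scored[:top_k] if score > 0]


# Hypothetical onboarding docs that Jagan has "read".
onboarding_docs = {
    "metrics_glossary": "GMV is gross merchandise value. Revenue excludes refunds.",
    "stakeholders": "The growth team owns acquisition dashboards.",
    "pipeline": "Orders land in the orders table every hour via the ETL pipeline.",
}
```

Whatever `retrieve` returns for a task would then be pasted into the LLM prompt along with the task itself, which is why the quality of the onboarding docs matters so much.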
A normal employee would work pretty independently, only going to their manager when they get stuck. I expect Jagan would be stuck a lot. LLMs are good, and with each new generation they get more capable. But the context window is still limited. Yes, you can use RAG to get relevant information from the docs, but we have seen Cursor agents mess up so much. Especially during onboarding, do they keep going back to their manager when they get stuck? Remember that most LLMs today work on pull rather than push. ChatGPT never pings you or sends information on its own. You ask ChatGPT. You tag some files and ask Cursor to do something. Even agent platforms mostly take a request from a user and complete the task.
Do we have a progress md file for Jagan, where he writes after each interaction? People recommend using a progress md for Cursor, where you update the progress, so that when you ask Cursor something, it is always up to date on what is happening with the project. Jagan can have many interactions during the day, just like a human colleague. So does it write everything down? How does it know what information to write? How does it find out what is important to discuss on Slack? Does Jagan attend the standups? Does it even have a performance review?
Devin was an interesting experiment in how AI colleagues will coexist with human workers. But the reviews aren’t great. And the price means that people like me can’t just try it out for fun. Also, Devin has been more like Cursor agent than an actual colleague. As far as I can see on X dot com, people tag Devin to help with PRs and menial tasks. You don’t tag your human colleague like that. They work as equals. A manager decides which tasks are assigned to them based on their maturity, skills and experience.
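Going back to the progress md idea for a moment, here is a rough sketch of what "Jagan writes after each interaction" could look like in Python. The entry format, the file name and the `log_progress` and `read_progress` helpers are my own assumptions, not how Cursor or any agent product actually does it.

```python
from datetime import datetime, timezone
from pathlib import Path


def log_progress(path, task, outcome, open_questions=(), now=None):
    """Append one interaction to a markdown progress file and return the entry."""
    now = now or datetime.now(timezone.utc)
    lines = [f"## {now:%Y-%m-%d %H:%M} {task}", f"- Outcome: {outcome}"]
    for question in open_questions:
        # Open questions are what the agent should take back to its manager.
        lines.append(f"- Open question: {question}")
    entry = "\n".join(lines) + "\n\n"
    with Path(path).open("a", encoding="utf-8") as progress_file:
        progress_file.write(entry)
    return entry


def read_progress(path):
    """Return the whole log, or an empty string if nothing has been written yet."""
    progress = Path(path)
    return progress.read_text(encoding="utf-8") if progress.exists() else ""
```

Calling `log_progress("progress.md", "Weekly GMV report", "Draft shared", ["Which table has refunds?"])` after an interaction appends one section to the file, and `read_progress` gives back the full history to include in the next prompt. This sketch writes everything, so it does not answer the harder question of what is worth writing down.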
Copilot is also interesting. A copilot is broad and used by every employee, so its context is shared. For example, a copilot for data analysts, let’s call it Data X, needs to understand the schema, the data pipeline and which data should be retrieved from which table. But setting up the copilot takes time. Even with Cursor, a lot of people just read something on Twitter and figure out how to make Cursor better. Cursorrules, progress md, product description md, asking Cursor to go through the steps before executing: all of these came from random Twitter users. Democratised learning.
But coding is mostly dependent on the language you write code in. So .cursor has rules for different languages. Collecting data and creating artefacts is much more nuanced. For example, GMV may be called GTV in some companies. Sales can be calculated differently. I remember reading a report on the state of food delivery from Momentum Works, and it had so many disclaimers noting that Grab’s revenue is calculated this way, Shopee’s that way and whatnot. GAAP is only for public companies. And real companies have many more definitions and terminologies. So how do you set up Data X?
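One way to picture the terminology problem is as a glossary that Data X has to learn for each company. Below is a minimal Python sketch of that idea. The metric definitions, table name, column names and helper functions are all hypothetical. A real setup would pull them from the company's own docs and schema.

```python
# Hypothetical glossary: canonical metric -> how this company names and computes it.
GLOSSARY = {
    "gmv": {
        "aliases": {"gtv", "gross transaction value", "gross merchandise value"},
        "table": "orders",
        "expression": "SUM(order_value)",
    },
    "net_sales": {
        "aliases": {"sales", "net sales"},
        "table": "orders",
        "expression": "SUM(order_value - discount - refund)",
    },
}


def resolve_metric(term, glossary=GLOSSARY):
    """Map whatever the user typed to a canonical metric name, or None if unknown."""
    needle = term.strip().lower()
    for name, spec in glossary.items():
        if needle == name or needle in spec["aliases"]:
            return name
    return None


def build_query(term, glossary=GLOSSARY):
    """Turn a metric name into a SQL string using the company's own definition."""
    name = resolve_metric(term, glossary)
    if name is None:
        # An unknown term is a cue to ask a human, not to guess a definition.
        raise KeyError(f"Unknown metric {term!r}; ask a human and add it to the glossary")
    spec = glossary[name]
    return f"SELECT {spec['expression']} AS {name} FROM {spec['table']}"
```

With this glossary, a user who asks for "GTV" and a user who asks for "GMV" end up with the same query, while another company could plug in a different expression for the same name. Filling in that glossary is most of the onboarding work.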
Notice the difference. Here you are the one getting onboarded to Data X (employees being onboarded to a new platform), whereas with Jagan it is a new employee getting onboarded to your company.
Do you start dumping documents on it? Do you need the vendor (let’s say Data Inc) to come in and create a middle layer that maps how you would give tasks to the copilot in natural language to how it would query? Or can you platformise it somehow, for example by letting it ask questions in the first week to get context about your business? You can start throwing documents at it. You need Data Inc to come in and set up the RAG system. You need FDEs (forward deployed engineers), as in the case of Palantir. Palantir is high ticket and its cost structure allows that, but that may not work for a bottom-up data copilot. How much can you abstract away? Context windows will get bigger. Maybe you will use fine-tuned open source models instead of the frontier models, but every company is unique and you will still need a lot of manual work to set up Data X.
I spent so much time on onboarding because onboarding is important. The quicker the onboarding, the smoother the process and the quicker the aha moment, the more likely people are to be wowed by these AI products and actually use them to augment or replace their employees. If I were a PM trying to get into AI, I would be obsessed with mastering onboarding. Onboarding PM, or PM for activation, was an important role in web2. It’s going to be even more important for people building AI products.