StarCIO Digital Trailblazer Community - Confidence to lead, community to advise

Coffee With Digital Trailblazers
Managing AI Agents: New Skills, Operating Models, and Tools

Participants

Hosted by Isaac Sacolick, CEO of StarCIO

Special Guests

Digital Trailblazers

Summary

Hosted by Isaac Sacolick, CEO of StarCIO, the event featured several special guests known as Digital Trailblazers, including Joanne Friedman, Liz Martinez, Joseph Puglisi, Martin Davis, John Patrick Luethe, Heather May, and Derrick Butts. The focus was on insights from StarCIO Research.

StarCIO Research

• Deloitte State of AI in the Enterprise

• 74% will deploy agentic AI within two years

• Current approach to AI transformation

• 84% have not redesigned jobs around AI

• 85% expect to customize AI agents, but 21% have a mature governance model

• KPMG AI Quarterly Pulse Q4/2025

• 44% of leaders expect AI agents to take lead roles in specific projects with human team members in the next 2-3 years.

• 73% of leaders say, “the more they use AI, the more they trust it.”

“As organizations adopt AI agents to augment teams or act in the physical world, they should treat their onboarding with the same rigour as a new employee. This includes developing well-defined roles, safeguards and structured oversight practices.” – WE Forum

Can an AI agent lead human teams? – “Teams led by the LLM operator had comparable containment performance as those led by the human operator, yet were unsuccessful in completing trials in the presence of fog.” (Desert Herding multiplayer game where fog impacts visibility)

“Making the analogy of agents as digital workers too literally may limit the potential of agents. Holding them to standards developed for measuring human performance risks misaligning their activities to functions better left to human workers.” – Deloitte

Transcript

[00:00:00] Speaker A: Greetings, everyone. Welcome to Friday, March 20th, for our 165th episode of Coffee with Digital Trailblazers. I’m excited to be here.

Isaac Sacolick. I’m the president of StarCIO and your host today to talk about managing AI agents: new skills, operating models and tools. I’ve got a full house of speakers. Everybody you see here is, I think, here.

Welcome Roman, who is one of our stand-in guests. We’re now building up a second layer of guests who have joined us as speakers for select events. And Roman, I’m very happy to have you here on our AI- and innovation-oriented coffee hours.

Thank you for joining this week. Hopefully we’ll be here just before all the basketball festivities start.

I am excited. I am a University of Arizona alumnus, so go Wildcats.

That’s what I will be doing as soon as this is over.

And just getting ready to watch a game today, just trying to get my PC to do what I’m asking it to do. There we go. Oops, hit delete one too many times. So let’s talk some data.

Talk about managing AI agents. I think the very first time I heard somebody talk.

Sorry about that. The first time I heard people saying that we should treat AI agents like people was referring to identity. And I heard folks suggesting that we should be putting our AI agents not only in our directories but in our HR systems. And then the next thing I heard a few weeks later is, you know, we need to start thinking of them and managing them just like people and putting them through performance reviews. And I said, I don’t know how I feel about that just yet.

There’s a quote here. I’m sharing with you from the World Economic Forum.

I’m just dropping out. Just give me one second. I’m going to fix this problem once and for all.

Can I just get a hello from someone? Joanne, can you hear me?

[00:02:24] Speaker B: Yes, I can.

[00:02:25] Speaker A: Okay, sorry about that. So let’s talk about what the World Economic Forum has said: as organizations adopt AI agents to augment teams, they should treat their onboarding with the same rigor as an employee, developing well defined roles, safeguards and structured oversight practices. It’s not going as far as saying we should put them through performance reviews and treat AI agents as teammates, but it’s getting pretty close.

If you read this other article, can an AI agent lead human teams?

It talks about essentially a multiplayer game where the AI agent is leading teams and doing a fairly decent job benchmarked against human operators, until you threw the AI a surprise. And in this case this is a game called Desert Herding, which I know nothing about.

You put fog into the game, which impacts visibility and the AI can’t keep up with it.

Very interesting. I love this quote from Deloitte: making the analogy of agents as digital workers too literally may limit the potential of agents, and holding them to standards developed for measuring human performance risks misaligning their activities to functions better left to human workers. So Deloitte is saying, be careful with this. And then on the left side, you’re seeing some of the studies I have here. The Deloitte study just shows you that we’re still at the tip of the iceberg.

74% will deploy AI agents over the next two years, which is sort of saying 26% won’t.

That’s kind of interesting.

Most of what you’re seeing here is taking AI agents and taking what we’re doing and doing something a little bit better. We’re not really redesigning processes for AI. We’re not redesigning jobs for AI just yet. And I really like this data from KPMG and they have a quarterly report.

44% of leaders expect AI agents to take lead roles in specific projects with human team members in the next two to three years. So nearly half of their survey, which I think covers several thousand people, are saying we’re going to put agents in charge.

And 73% say the more they use AI, the more they’re building trust in it. So that’s very much an indicator that at least half of the organizations surveyed are moving in the direction of AI agents, putting them in place. And now the question is, how do you manage them? And I think Joanne first brought up that term here a few weeks ago, and I said, you know what, we have to talk about this because quite frankly, I have some issues. I think we pretty much suck at managing performance, managing people.

I think there’s pockets of companies who do it well and pockets within companies that do it well. But most of the time when you hear people leaving companies, it’s because they weren’t given development opportunities. There’s misalignment with their leaders and their bosses. And now we’re going to say we’re going to take what sparingly worked against humans, apply it to AI agents. And here’s my problem with it.

Can I hold an AI agent accountable? Can I put them through a PIP program? Can I manage them the same way I do a human? I don’t think so.

And so somewhere in this storyline, there has to be accountability.

So that’s what we’re going to talk about here. I want to welcome Roman as our special guest. Roman, you mentioned to me right before the start of the program that you had the opportunity to sit through an AI agent lab.

Why don’t you share your insights from that opportunity? Thanks for joining, Roman.

[00:06:25] Speaker C: Well, thanks for having me. So, yeah, I had the opportunity to sit through a developing AI agents boot camp and wanted to really share with the whole group two general observations that I think are important to where we’re going to go today.

There were about 30 people in the boot camp and only two of them were actually traditional software developers. It was an extremely diverse group of people, some from marketing, some from finance. There was HR and sales. There was even a gentleman there, a CEO, saying he wanted to clone himself for a startup idea that he had. So these people were primarily focused on personal productivity and workflow automation.

The software developers, though, were really interested in embedding agents into existing applications.

Nobody was really talking about writing new applications from a software development standpoint. They were really talking about augmenting the ones they already had.

All of these attendees were personally interested in upskilling themselves into what I guess I would call an AI-agent-first or AI-native professional.

So as an example, one person wanted to think of himself as an AI-native product manager or an AI-native marketing person, those kinds of things, and they wanted to upskill themselves to do that.

The second observation I wanted to make is a bit darker.

When I finished the class, I really had the feeling that we had been given a very sophisticated set of tools that could be very, very dangerous, and there was a bare minimum of conversation around, and I’ll just use the word, safety, not necessarily security. I mean, we were given tools like command line compilers and an interactive development environment to massage code around. We were given the ability to connect to APIs and a whole library of skills, like how to build Excel spreadsheets, how to use Word, how to go through email, things like that. And then finally the ability to control those agents from simple messaging apps like WhatsApp or Slack, so you just send a message to your digital twin, if you will, and say, I’d like you to clean up my email for today.

Now here’s the dangerous part, right? What does clean up your email really look like?

Does it mean delete everything?

So, and I know Isaac will talk about some of these things as we go on, but I wanted to leave you with those two things. One is that this is not only a software developer thing; people are really jumping into this with both feet. And the second thing is the level of sophistication of these tools that are being brought together is just amazing.

To watch a marketing or an HR person walk through something like Visual Studio or any interactive development environment and do it successfully.

This was really amazing to me.

[00:09:46] Speaker A: Wow.

Thank you for sharing that, Roman.

Sophistication is amazing. We’re going to do another conversation on this: on April 17th we’re going to be talking about AI coding competencies. I’m going to try to load the speaker board with experts who are really hands on with these new coding tools so we can learn from their observations. Roman, you’re welcome to join us.

I have two sets of questions here for everyone.

I’m going to let Roman go first and I’m glad Joanne’s raising her hand.

Joanne, love to hear you speak about managing AI agents.

What does performance management look like? What are the consequences of this approach and where might it be dangerous? I want to talk about performance management. And then Derek is already raising his hand. I want to talk about how part of performance management is just dealing with bad behaviors, behaviors that fall outside of policies.

How should we think about that when it comes to AI agents? And then we’ll go around the room. Martin’s raising his hand as well.

You can comment on either of those things. At the end of our program we’ll talk about skills, operating models and tech capabilities of digital trailblazers. Joanne, let’s start with you. You introduced the term of managing AI agents here for the first time. Maybe put some clarification around what you meant by that.

[00:11:12] Speaker B: Sure.

One of my favorite topics. First of all, let me address the question directly. Should we manage AI agents like employees?

Well, the analogy is seductive, and that’s exactly why it’s very, very dangerous.

So the answer, in a short word, is no. Where it works, what you really need to look at is that you’re not actually managing the agent. You’re managing the corporate policies. You’re managing the permissions, you’re managing the logic. You’re managing a bunch of things that have absolutely nothing to do with the agent but with what the agent is given access to.

And that’s really where the problem begins.

In our case, we use return on data, because every agent has a cost. You think about the cost of developing it, what data systems or systems of record it’s going to access, and it has to be driving a yield. That yield comes across four channels: is it revenue attribution, is it cost avoidance, is it risk mitigation, is it capital efficiency? And if you break those down to, is it saving me time in my calendar or my Gmail, that’s a productivity game.

So you can actually relate these things to hard dollars.

If it’s not moving one of those needles, then it’s not performing because this is a tool, it’s not a person.
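The return-on-data idea above can be sketched as a simple scoring function. This is a minimal, hypothetical illustration, not the speaker's actual model; the class and function names are my own assumptions, with the four yield channels taken from the discussion:

```python
from dataclasses import dataclass

@dataclass
class AgentYield:
    """Illustrative yield per channel for one agent, in dollars per period."""
    revenue_attribution: float = 0.0
    cost_avoidance: float = 0.0
    risk_mitigation: float = 0.0
    capital_efficiency: float = 0.0

def return_on_data(y: AgentYield, total_cost: float) -> float:
    """Total yield across the four channels divided by the agent's total cost."""
    total_yield = (y.revenue_attribution + y.cost_avoidance
                   + y.risk_mitigation + y.capital_efficiency)
    return total_yield / total_cost

# An agent that only saves inbox/calendar time is a productivity play:
triage = AgentYield(cost_avoidance=12_000)
print(return_on_data(triage, total_cost=4_000))  # 3.0
```

If the ratio stays at or below break-even across all four channels, the agent is a tool that is not performing.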

But where the analogy breaks down is around judgment.

Employees have judgment; they push back. Agents optimize very confidently towards the wrong outcome at times. Think about a factory. A person will want to have a higher yield, meaning more products being produced, or better quality.

That’s a trade off. And that trade off happens a thousand times a day. And it’s not just in manufacturing; it’s in supply chain, it’s in financial services, it’s in trade management, logistics. Virtually every industry that you can think of, there are trade offs that get made on a daily basis. Is the agent wrong if it makes the wrong call? The outcome may have severe consequences. You can sacrifice yield for quality but lose a lot of customers because your on time deliveries fall off. Conversely, you can end up in the situation of, I don’t know, a big automotive manufacturer or three who all had massive recalls, and it cost them billions of dollars because they were more prepared to trade quality for yield.

So how do you then represent the value of the agent and how do you say whether it’s, you know, performing well or not performing well and what does that pip start to look like? So the harder problem with this is thinking about the bigger picture and the consequences to cost to revenue and the trade offs that these things have to make. You don’t have an agent going rogue. You have an agent that has been given permission to access a policy that may be very much out of date, that may not have the right attributes affixed to it or consequences that are unforeseen because nobody ever expected that a person would not provide oversight to a system.

So in measuring things, as I said, trade offs are the bigger issue and the bigger consequence. But a rogue agent isn’t the one that broke a rule. It’s the one that makes a trade off nobody realized it was authorized to make.

[00:15:12] Speaker A: Lot to unpack there, Joanne. I’m just going to ask you one follow up question to answer quickly. The tech companies, the ones putting out the AI models, are in fact the ones saying that we should treat AI performance management like we do people.

And I’m wondering if this is just a layer of hype to get people more immersed in what’s happening with AI, or is this something leading, something that will come as enterprises mature. Do they believe enterprises are going to go in that direction in the next two to three years? So, closer to hype, or closer to leading two or three years out?

[00:15:58] Speaker B: Hype. And I would say, ask anyone in senior management this question.

If you just hired an intern, would you trust that intern to make decisions for the whole organization?

Because agents are like interns, or apprentices, until they learn, until they start realizing how your company operates, what the consequences of decisions are, and how decision management actually works. I wouldn’t trust them.

[00:16:28] Speaker A: Interesting. Thank you, Joanne. Let’s jump to Derek. Derek, Joanne said there’s no such thing as a rogue AI; it’s about the policies that we’re training them on and the context they have or don’t have. But, you know, look, there are employees who exhibit behaviors that are problematic.

I think we do have to protect against rogue AI, somebody putting an AI out there intentionally to go rogue. So I think it’s both. I’ll let you comment on, you know, what’s the equivalent of performance improvement plans or termination when an agent goes rogue or repeatedly violates policy. Go ahead, Derek.

[00:17:14] Speaker D: Yes, good morning. And I agree with what Joanne was saying. And also Roman; I think I was cringing when he was talking about the access they were getting with some of the tools they were given. But, you know, when a manager asks about how we manage AI agents, my first answer would be, don’t start with AI. Let’s start with risk. As Joanne mentioned, these agents are actually put into an environment, but they don’t have the guardrails put up around them. So when you look at the behavior of a human versus an AI, you’re going to manage the operational behavior of a human. But when it comes to things like artificial intelligence and AI agents, you have to assume things like drift and failure and discovery of things you didn’t want them to discover, because the guardrails haven’t been put up properly. A person is going to have a job description that tells them exactly what to do, and an agent is going to need capabilities that declare where it can go and what it can do. But an agent’s not going to know if it’s doing something wrong, because it hasn’t been told what it’s doing is wrong. It’s told to go out and manage and find and evaluate and research and do all those things that you ask it to do. So in its sense, it’s operating under a normal condition and doesn’t know it’s doing anything wrong. The thing you mentioned about those doing it with intent, that’s different. That goes back to the governance that these companies need to set up before they start deploying these AI agents within their ecosystem. The problem doesn’t really lie with the agents themselves. The problem lies with the companies that try to deploy them with unrealistic expectations set up front.
As I mentioned, start by looking at the risk first and ask: what is this agent going to introduce to my business when it comes to issues, risks, challenges? Those are the questions nobody asks up front, because everybody sees AI as the godsend that’s going to solve all their problems, not realizing that when you have a new employee, as Joanne mentioned, you don’t give them access to the keys of the kingdom. You want to understand how this employee is going to work and what they’re going to do. And the real problem is looking at how they’re going to do it well and the intent of the behavior. As I mentioned, you can coach an employee. An agent’s going to be optimized to help those systems do what they need to do. They’re going to look at context shift, they’re going to look at prompts that may be adversarial, they’re going to look at output, and they’re not going to know what’s wrong until you tell them that’s wrong, based on the oversight that you’re working with. Now, you talk about performance. With an employee, you can do performance improvement. When it comes to an agent, what are the things you can do there? Can you terminate it? Yes: you can revoke the credentials, you can revoke the API keys, you can disable tool access, you can even quarantine some of its workflows. But these are the things, again, you have to design up front. These are the things that need to be put in place up front. Just don’t let an agent go rogue within a system without putting the guardrails in place. And I think this is the thing that people miss, because the hype is that artificial intelligence is going to solve your problem. It’s going to create more productivity, it’s going to give you more access, and it’s going to be beneficial. Yes, if you take the time to train it and put up the guardrails and tell it where it can go.
There need to be behavioral things that are going to trigger red flags for the agent to understand if it’s overstepped a boundary, if the policies haven’t been set up, if it’s being used in a way that it should not be used, if it’s going outside of a data boundary or creating a violation, which it shouldn’t; that falls to company compliance. There are a whole lot of things that come into deploying AI agents and how you manage them. But it really starts with asking the question: what will this agent do when it gets into my system, and what will it have access to?
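Derek's point that "terminating" an agent means pre-designed controls, revoking credentials and API keys, disabling tool access, and quarantining workflows, can be sketched as an escalation policy. A minimal, hypothetical sketch; the thresholds and names are my own assumptions:

```python
from enum import Enum, auto

class Control(Enum):
    """Pre-designed enforcement controls, in rough order of severity."""
    DISABLE_TOOL_ACCESS = auto()
    QUARANTINE_WORKFLOWS = auto()
    REVOKE_API_KEYS = auto()
    REVOKE_CREDENTIALS = auto()

def enforcement_plan(policy_violations: int, crossed_data_boundary: bool) -> list:
    """Escalating controls: the agent equivalent of a PIP or termination."""
    plan = []
    if policy_violations >= 1:
        plan.append(Control.DISABLE_TOOL_ACCESS)   # first red flag: cut the tools
    if crossed_data_boundary:
        plan.append(Control.QUARANTINE_WORKFLOWS)  # contain anything in flight
    if policy_violations >= 3 or crossed_data_boundary:
        plan.append(Control.REVOKE_API_KEYS)       # "terminate" the agent
        plan.append(Control.REVOKE_CREDENTIALS)
    return plan

print([c.name for c in enforcement_plan(3, True)])
# ['DISABLE_TOOL_ACCESS', 'QUARANTINE_WORKFLOWS', 'REVOKE_API_KEYS', 'REVOKE_CREDENTIALS']
```

The point of the sketch is that every branch exists before deployment; nothing here is improvised after the agent misbehaves.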

[00:20:37] Speaker A: Thank you, Derek. I’ve got a room of raised hands here, so I’m just going to keep going. Martin, if we’re going to treat AIs like people, do we need change management for AI? I’m just laughing at that concept. Go ahead.

[00:20:52] Speaker E: Yeah, this whole thing scares me totally. It really does. Because at the end of the day you’re setting something loose with some instructions and a vague set of rules. And I use that term very deliberately, a vague set of rules, because if we start thinking about it as human: humans have certain inbuilt things. As we grow up, we’re taught various things which are good and bad.

[00:21:22] Speaker A: Yeah.

[00:21:22] Speaker E: The AI agent doesn’t necessarily have this set of inbuilt rules as well. So it can do all sorts of things.

And I think the key amongst some of this is how it could interpret things and how it overrides one thing over another.

I go back to the age-old example of the AI running a tuck shop and the fact that it kind of went rogue because it used the make-profit rule to override various other things, and it started ordering all sorts of stuff which had nothing to do with the tuck shop because it thought it could make more profit from doing that. So you have to be very careful to say, what are the rules? Make sure those rules are clear, make sure they are in place, and make sure that if there is a combination of things happening, a combination of circumstances, it will choose the right set of rules to comply with. Because, yeah, the intern is a good example, but I think even that’s a little bit of a misnomer as well.

I think of it more as, ever heard the joke about spell check as a little man in a box who’s partly drunk and is trying his best to help but doesn’t quite know how? I almost think of it as that at times, because it’s very difficult to make sure that everything is going to actually happen the way you think it will happen. Sorry, I rambled a bit.

[00:23:06] Speaker A: No, you’re good, Martin. And the example I have of what you’re describing is if you turn on Grammarly and Gemini on your Gmail at times, you will see them having competitive behaviors over how to fix your grammar.

They have different rules. They’re built on different training sets.

So you can have multiple agents that are going to make different suggestions or different versions of the agents making different suggestions.

It’s just important, I think, for people to realize: you have a bunch of APIs. I put some natural language ability in front for it to understand a role and a set of responsibilities. I put some natural language at the back so that we can trace through its decision making and respond when it’s triggered by a human. And that’s effectively what an agent really is. And I have an article that I’m working on right now about AI Agent Orchestra.
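That description of an agent, plain API calls behind, natural language in front, and a decision trace at the back for auditing, can be sketched in a few lines. A toy illustration only; the keyword matching stands in for real natural-language understanding, and all names are hypothetical:

```python
def send_invoice():
    return "invoice sent"

def build_report():
    return "report built"

def run_agent(request, tools):
    """Map a natural-language request to an API call, keeping a decision trace."""
    trace = []
    for keyword, tool in tools.items():
        if keyword in request.lower():
            trace.append(f"matched '{keyword}' -> {tool.__name__}")
            return tool(), trace
    trace.append("no tool matched; escalating to a human")
    return "escalated", trace

tools = {"invoice": send_invoice, "report": build_report}
result, trace = run_agent("Please send the invoice to ACME", tools)
print(result)  # invoice sent
```

The trace list is the "natural language at the back": a human can replay why the agent picked a tool, or see that it escalated.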

I will give you the punchline. They’re fairly primitive at this time.

So if you want an AI to do a simple job for you, it can do it fairly well. If you want it to solve a larger problem for you, we’re still a work in progress there. And so I think this question of how we manage them, this question about how to prevent and monitor for poor behaviors and poor responses, even looking at rogue behavior, I think is very relevant. Let’s go to Joe. Do you want to talk about performance or do you want to talk about rogue behaviors?

[00:24:47] Speaker F: Well, I’m going to go back to your first question. Should we manage them like people?

And you know, I’ve had.

I don’t want to give you a hard no. I want to give you a qualified yes.

[00:24:58] Speaker A: All right. Disagreement in the coffee with digital trailblazers. Go ahead, Joe.

[00:25:04] Speaker F: I have a set of golden rules, and I think that you can adapt the golden rules to digital agents.

You know, your point about we handle people badly is spot on. We hand them a handbook, we give them a little bit of coaching, and we set them off and expect them to perform stellar. Right? Well, that’s not what I do. I set some expectations.

I give clear, defined boundaries. To Derek’s point, it’s all about what does the policy book contain what is the AI agent learning? Is the policy accurate? Is it up to date? Does it have the proper guardrails built into it?

This applies to the agent as much as it applies to a human, if that’s the way you train.

Bad agents don’t necessarily make bad choices like people do.

Bad agents or agent misbehavior on the part of an agent is a design flaw. It’s a data problem, it’s a governance issue and it has to be addressed as such.

Again, go back to people misbehave and when you catch a misbehaving, you know, you correct it. Right.

It’s all about monitoring and recognizing that the real risk here, and this is where I think there’s a big departure, human versus agent.

The real risk here is that agents work at light speed.

So the damage they can do in a heartbeat is, you know, multiple times the damage that a human can do. Now, that’s not to suggest that a human can’t make a mistake and shut down the entire plant for a week. That can happen too, and it can have tremendous impact.

So I’m going to give you a qualified yes on that for all those reasons.

[00:26:56] Speaker A: Wow.

You know, my more tactical issue, and I just put this in the dashboard, is that, once again, I think we’re in a situation where the capability and the innovation is ahead of our ability to test and monitor, in this case whether you’re going to manage an agent like a person or, at the other extreme, like a system.

I don’t have good evidence that the AI agent monitoring tools are on par to keep enterprises and businesses aware of what’s going on once they deploy them. Happy to hear some ideas and disagreements on that. Go ahead, Heather.

[00:27:39] Speaker G: Well, I’m going to vote in Joe’s court, because I think the question is, what is the evaluation? What is that performance evaluation? Some of them are awful for humans, some of them are excellent gold standards. So I think that whatever you use to evaluate the performance of a thing or a person has to be relevant to that thing or that person, has to be relevant, as Joanne said, to the level of that person and the knowledge that they come in with and what you’re granting them. You can’t just assume that it can be assigned all these tasks if they don’t have the ability, if they don’t have the exposure, if they don’t have the experience to do it. And if you don’t monitor, shame on you.

And that’s a leadership problem. That’s a business problem; that’s not a technology or an agent problem. If you don’t have people that are willing to take responsibility for the work that they’re doing and the work that they’re overseeing, that’s really critical. And when you have a job description, something that’s very clear for an individual, this is what your task is, this is what you’re expected to do. If you change that job description midstream and then ding them because they didn’t do their job, that’s not fair, that’s not right. And why would it be any different for a tool?

So if you’re going to have someone have an agent doing a task and then changing it without telling it, then you can’t expect it to perform optimally.

[00:29:07] Speaker A: Thanks, Heather. Folks, we have two votes of no and two votes of qualified yes.

On whether we should manage the performance of AI agents using similar mechanisms as we do people. I would love to hear everybody’s opinion on this in the chat.

I think this is at a point where it’s an opinion because it’s an evolving science. So I’d love to hear this. AI wants to break free.

I don’t know if we know about that yet. But listen, I have yet to have even an LLM come back and tell me, you know what, I don’t have data to support what you’re asking. For the most part, it doesn’t happen very often. And when you start restricting what AI agents can actually look at, if you have poor data quality in your CRM, I’m not sure the CRM’s agents are going to tell you that up front before they start answering your questions. Go ahead, John.

[00:30:05] Speaker H: Yeah, I think you can draw so many parallels from managing humans to managing AI systems.

Sure, it’s not exactly the same, but we’ve spent 2,000 years trying to get humans to do what we want through when it’s been actually recorded and it’s even been longer than that. But for as far as work, they’ve been writing laws, they’ve been writing stories of how people should behave, and there’s so much we can draw through. And when I think back to talking to my mentees about when they’re having major issues at work, there’s a common set of things that are always wrong, which often is that they get unclear information up front, they don’t get the expectations set by their manager and then no one’s checking in on them. And this is so much the same with computer systems. It’s so much the same with AI systems. If you’re not regularly monitoring something and you don’t own everything in house, you know, changes can be made by the third parties that you’re relying on.

You know, all sorts of stuff can happen and you’re not going to get your expected results. So I’m in the camp that says if you’re not drawing the best practices from managing humans and using them with AI systems, you’re going to run into problems.

[00:31:21] Speaker A: Thank you, John. Let’s just take our mid-session break, folks. Thank you for joining this week’s Coffee with Digital Trailblazers. We meet every week here to discuss leadership practices impacting digital transformation leaders.

About a third of our topics are AI-related, and I’ve got a bunch of new ones coming soon. Let me just bring this into the field of view for all of you to see. We’ll be talking about leadership next week: essential sales and marketing skills for Digital Trailblazers. If you have a big idea, you have to be able to sell and market it. I have a world class marketer joining us next week as a special guest, so do join us for that one, especially all you founders. April 3rd we’ll be here to talk about redefining data governance: is the data owner role obsolete in the AI era? This has come up a few times.

We are no longer talking about classifying data in very strict, static terms, and so we’ll talk more about that on the 3rd. On the 10th we’ll be talking about developing your personal brand: best practices from thought leaders. On the 17th I’m going to have AI coding competencies: hype, realities, and the future.

And the 24th will be a TBD. I’ll be announcing that one shortly.

Do use the URL starcio.com/coffee. There is a button there; when you go to that URL, add it to your calendar. And if you ever miss one, that page also has links to the LinkedIn pages to go back and watch them. Folks, just something I’ve been working on. Just trying to share this. How do I do this? Geez, that’s not what I want to do. Maybe I’ll leave it to the end, folks. Let’s come back. Roman, we’re talking about AI agents. We’re talking about managing their performance. I don’t know if I got an official vote from you, and I want to talk about your thoughts from the workshop that you went to. Was this discussed: how do you monitor AI agents and how do you look for poorly performing ones? Welcome back, Roman.

[00:33:40] Speaker C: So, thanks. So what I would do is say I agree with everybody who said everything so far. Yes, no, and even those who said in between.

Here’s what I think. There are many analogs to managing agents as you might a human being, especially in considerations like onboarding an agent versus onboarding a human. I think that’s very similar. You’re going to put them in a registry or a database.

You’re going to figure out what their access rights are. You’re going to give them markdown files which give them the skills you want them to perform. Now, that doesn’t mean you’ve covered all the bases. It just means the onboarding part looks very similar to onboarding a human. But once you leave that and you start to put them in the wild, in real life, you need to start thinking really hard about not managing them the same way you would a person. They are so fast and so capable of doing things that you hadn’t expected that you probably need to create at least two dashboards. One to give you the overall view of what this agent is doing, because it’s working really fast.

Then the second is a kill switch. You need to have both the dashboard that gives you the high level view that lets you figure out what’s going on real time. And then you need a kill switch to say, you know, we gotta stop right now and figure out what’s really just happened.
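The supervisor pattern Roman describes, a high-level activity view plus a kill switch, could be sketched roughly like this. The class name, the rate threshold, and the auto-trip rule are all illustrative assumptions, not from any specific tool:

```python
import time
from dataclasses import dataclass, field


@dataclass
class AgentSupervisor:
    """Tracks agent activity at a high level and exposes a kill switch."""
    max_actions_per_minute: int = 100  # hypothetical oversight threshold
    killed: bool = False
    action_log: list = field(default_factory=list)

    def record_action(self, agent_id: str, action: str) -> bool:
        """Record one agent action; refuse it once the kill switch trips."""
        if self.killed:
            return False
        now = time.time()
        self.action_log.append((now, agent_id, action))
        # Auto-trip the kill switch if the agent is acting faster than
        # a human overseer could plausibly follow.
        recent = [t for t, _, _ in self.action_log if now - t < 60]
        if len(recent) > self.max_actions_per_minute:
            self.kill()
            return False
        return True

    def kill(self) -> None:
        """The 'we gotta stop right now' switch."""
        self.killed = True

    def dashboard(self) -> dict:
        """The high-level view: action counts per agent, plus kill state."""
        counts: dict = {}
        for _, agent_id, _ in self.action_log:
            counts[agent_id] = counts.get(agent_id, 0) + 1
        return {"killed": self.killed, "actions_by_agent": counts}
```

The point of the sketch is that the two controls are coupled: the same log that feeds the dashboard can trigger the kill switch automatically when the agent outpaces human review.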

I would use the analogy. I’m becoming convinced you can’t do human in the loop.

And I know I want to, but I just don’t think it’s any more feasible than having somebody walk in front of a, you know, early automobile, you know, down the dirt road with a lantern or a bell screaming, you know, auto coming, you know, things like that.

It’s just not feasible.

[00:35:44] Speaker A: And you’re saying that because you think more people will adopt automated decision making at scale in ways that humans just won’t be able to keep up.

[00:35:55] Speaker C: Correct. And I saw that from the class. I mean, there were people in the class who were trying to make themselves AI natives as part of their career development.

And I can see how that would really help in a resume saying, you know, I was a product manager and I knew how to build out evaluations of, you know, various types of different scenarios for these products. And I was able to figure out what the customer responses were very quickly, you know, within days rather than months.

[00:36:30] Speaker A: I was talking to someone yesterday at the Fairfield Westchester SIM event about some of the agents that I was seeing in an ERP, and one that’s becoming common is this idea of a close-the-books agent, accelerating and automating aspects of it. But I just don’t see the CFO who is accountable for putting out financial metrics saying, you know what, I’m going to automate closing the books, or an HR leader saying, I’m going to automate who our next hire is going to be.

[00:37:07] Speaker C: Isaac? There was a guy who was a financial analyst in a company and another guy who was in their accounting and treasury department, and that’s exactly what they were doing.

[00:37:17] Speaker A: Geez, we’re in trouble folks. Go ahead, Liz.

Are you going to put AI agents through performance management or are you going to look at other metrics to monitor their behaviors?

[00:37:32] Speaker I: Well, first of all, I love the idea of the performance management. However, when we start talking about AI agents as humans, or treating them as children, right, that you’re watching and making sure that they’re doing the right thing, etc., the idea is that somehow that child will grow up and that you won’t have to necessarily monitor it anymore. So it’s just never going to happen.

And the reason it’s never going to happen is because children, when they’re born and as they grow up way before they can speak or wreak havoc on our society, they are taught things like loyalty, integrity, morality.

These are not things that you can code into an AI agent.

They just don’t have the judgment to actually make decisions along those broad and very deep concepts. So while you can train an AI agent to be better, stronger, faster, and I’m thoroughly in love with the kill switch idea,

They’ll never be anything more than a 12 year old.

[00:38:53] Speaker B: Okay.

[00:38:54] Speaker I: And you just have to expect that that’s the case. They will always have to be monitored forever.

And then, what would actually be really interesting is if we could come up with some kind of maturity assessment so that we could actually measure the maturity of an AI agent. Now that would be really interesting.
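Liz’s maturity-assessment idea could start as something as simple as a scored rubric. A toy sketch, where the dimensions and thresholds are purely illustrative, not a published framework (the apprentice/journeyman/master labels borrow Joe’s analogy mentioned later in the conversation):

```python
def agent_maturity_score(scores: dict) -> dict:
    """Toy maturity rubric: average 1-5 scores across hypothetical
    dimensions (e.g. accuracy, guardrail compliance, escalation
    behavior) and map them to a coarse maturity level."""
    avg = sum(scores.values()) / len(scores)
    if avg < 2:
        level = "apprentice"
    elif avg < 4:
        level = "journeyman"
    else:
        level = "master"
    return {"average": round(avg, 2), "level": level}
```

Even a rubric this crude forces the question Liz raises: which dimensions would you score, and who audits the scores?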

[00:39:14] Speaker A: Yeah. We have to get past, I think, two things. The idea that thumbs up, thumbs down is a sign of feedback. And then, to Roman’s point, we’re seeing more companies adopt agent capabilities, which basically means the AI is doing elements of its decision making without a human in the loop.

We need some better tools to be able to monitor this, whether you’re doing performance management plans or other ways to ask the question: is this AI making a good decision? Go ahead, Joanne. I want to jump into this last question. Your thoughts on performance and rogue behaviors, but also, let’s follow up on what Roman has been talking about: this new leadership skill, operating model shift, and capability that we might be looking for so that our Digital Trailblazers know what they need to manage AI agents over the next six to 18 months. I’d love to hear your thoughts around that.

[00:40:19] Speaker B: Excuse me.

Well, first of all, AI agents is a plural, and it’s a plural for a reason. Because when you engineer an agent, it’s given a specific task, and nine times out of 10, an agent that’s given a very long set of steps in a task will be much more inclined to have an oops than agents that have shorter tasks. So agents is regarded as a plural because you usually have multiple agents running at a time to solve a problem. When you put in a prompt in generative AI and it’s not running agents, or it’s running agents in the background, it’s the big, you know, Encyclopedia Britannica (no pun intended, with them suing), it’s the worldview. So they have a lot to go through before they give you an answer. In an agentic environment, each agent is given a set of tasks and a set of guardrails around those tasks. And this is why you engineer human in the loop into the agent through an approval process: you need to pause and wait for a human to give you direction.

You need to accomplish this task, hold it, but wait for someone to review it before you actually execute.
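The hold-and-wait pattern Joanne describes, where the agent prepares a result but a human releases it, can be sketched as a simple approval gate. The class and method names here are hypothetical, just to make the control flow concrete:

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ApprovalGate:
    """Agent runs a task, but the result is held until a human reviews it."""
    pending: dict = field(default_factory=dict)
    _next_id: int = 0

    def submit(self, task: Callable[[], Any]) -> int:
        """Run the task, hold its result, and return a review ticket."""
        self._next_id += 1
        self.pending[self._next_id] = task()
        return self._next_id

    def review(self, ticket: int, approve: bool) -> Any:
        """Human decision point: release the held result, or discard it."""
        result = self.pending.pop(ticket)
        return result if approve else None
```

The key design choice is that execution and effect are separated: the agent can be as fast as it likes at producing the draft, but nothing leaves the gate without a review.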

In our world, you know, to take something that Joe proposed months ago, it’s like an apprentice, a journeyman and a master. And agents actually do learn, and they do get smarter over time. And to the morality and ethics questions, and endearment and trust and all of those things: people trust agents the more they know, because they’re able to give a more complete response. The more they know, the smarter they get. The smarter they get, the more specific they can be about what they do. So when we measure them, in our case we use, you know, hard dollars.

What’s moving the needle? That’s a performing agent, because it’s improving productivity, saving cost, saving time, helping generate revenue, and so on.

But what digital trailblazers really need to know is what they are actually using the agents for, and in this case, I would say, to close one of the gaps. What do they need in that regard? Remember: time to data, time to decision, and time to value. Organizations compress one at a time. When they look at AI, you need to look to compress all three, because to the point made earlier, they can become wicked fast, but they’re not complete.

So look at managing agents as starting to manage an execution layer, and build a trust architecture. Because if you don’t build trust and judgment into the architecture, which means that your sources of data are diverse and include tribal knowledge, policy, structured and unstructured data, deterministic and probabilistic data, all of those things, then you’re going to end up in a problem. If you don’t have that sort of mindset around the architecture, then within that 18 months you’re going to have agents everywhere and nobody knowing what they’re authorized to do. And your accountability role? Well, that’s going to be up in the air, because there’s being responsible for data, and then there’s being accountable for how that data has been used.

[00:44:03] Speaker A: Lot there, Joanne. I’ve got four or five people raising their hands and 15 minutes left. I want to hear everybody. Martin, thoughts on skills, operating models, tech capabilities, performance management and rogue agents. Where do you want to go today? Spin the wheel.

[00:44:20] Speaker E: I’m going to follow up on what Joanne said rather than delving into lots of different things, which is: you’re not talking about a single AI agent. You’re talking about, let’s call it, a swarm of AI agents performing a set of interlinked tasks and things like this.

So let’s just kind of go from where we were talking earlier about the misbehaving, or not necessarily misbehaving, but in our views misbehaving really, because we haven’t set up the rules well enough for them to understand what’s good and what’s bad.

So now let’s kind of exponentially increase that problem and that risk to a swarm of AI agents, all of which may be getting things wrong in our minds, but in their minds (I’ll use the term loosely), in their logic, they do appear to be correct.

So now you’ve got to look at that increasing exponentially and how you manage that. So going back to your earlier question, I disagree with managing them like humans, because this is non-human-like behavior. You are talking about things that are executing very, very fast, doing an awful lot of things, and a swarm of them that could all be doing things wrong.

[00:45:44] Speaker A: We’ve got three votes on no and two votes on a qualified yes. I’m going to listen to Derek. Where’s your vote coming in at?

[00:45:53] Speaker D: Mine was coming in yes, with caveats. AI monitoring is definitely something you need to be on top of. But going back to your question, you know, the leadership skills and things to look at: I think leadership starts to look at agent risk literacy. And this is the ability to ask better questions of the agent, of what you really wanted it to do and what you expected it to do. These are things to look at: what is the agent going to touch? How can we detect drift? How can we look at those things in our ecosystem that this agent may affect? From an operating model perspective, I’m looking at the AI or the app governance to autonomy. You know, most people think they put it in there and the agent is static. It is not. An agent is a dynamic actor within your ecosystem. It’s going to have access to whatever you let it have access to, whatever guardrails you put in. And then the other thing is looking at the capability. We have all these policies from the legacy systems that we have today, but the policies for AI need to be not only enforceable but observable. And this goes back to what we talked about: threat intelligence monitoring, in this case AI threat intelligence monitoring. But it needs to be based on an AI framework, and I think a lot of people miss that. When it comes to resilience and resilience strategies, a lot of people are looking at the NIST Cybersecurity Framework, which is not really designed for artificial intelligence agents. You really need to move up to the frameworks that are better suited for artificial intelligence, like the NIST AI RMF. And even more so, there are other tools out there that will actually help you understand those risks, and leadership should be asking, what are the things I need to deploy over the next few months?
So if I’m looking at these tools, I’m going to have the NIST AI Risk Management Framework, which talks about the governance, the mapping, the measurement and the management of artificial intelligence. But then how do you measure that? You really need two frameworks. There’s another one called MITRE ATLAS: Adversarial Threat Landscape for Artificial-Intelligence Systems. Having those two together, you’ve got a framework that governs how the agent is going to work, and then the system will actually monitor how the agent is behaving within your ecosystem. And I think people are missing out on that, because they want to deploy the agent fast, but they don’t want to put the tools and monitoring capabilities in place to make sure the agent is doing exactly what it’s supposed to do until after the fact. So the skill set is changing: leadership needs to get involved and ask these questions earlier. What do I need to do, how often do I need to do it, and how soon do I need to do it before I deploy these agents and let them loose in my system?
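One concrete version of the drift detection Derrick mentions is to compare a recent window of evaluation outcomes against a baseline window. A minimal sketch; the tolerance value is an arbitrary illustration, not something prescribed by NIST AI RMF or MITRE ATLAS:

```python
def detect_drift(baseline_outcomes, recent_outcomes, tolerance=0.05):
    """Flag behavioral drift when the recent failure rate exceeds the
    baseline failure rate by more than `tolerance`. Each outcomes list
    holds 0/1 flags, where 1 means an agent output failed an eval check."""
    baseline_rate = sum(baseline_outcomes) / len(baseline_outcomes)
    recent_rate = sum(recent_outcomes) / len(recent_outcomes)
    return (recent_rate - baseline_rate) > tolerance
```

This is the observability piece of the policy: the check only works if you have an ongoing stream of evaluated outcomes to feed it, which is exactly the monitoring investment Derrick argues gets skipped.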

[00:48:17] Speaker A: Derek, I’m surprised. I mean, it’s like, you know, we’ve always had this problem where monitoring and incident management and threat detection is two or three generations behind: lagging apps, lagging APIs, and now lagging AI agents. And, you know, there’s just too much momentum from the top down to get these AI agents in place and to see efficiencies drive hard cost savings. And look, I’m queasy about this. I shared this about three or four months ago: I think it was Anthropic. They sent a vending machine that was self-managed with AI to the Wall Street Journal, and they hacked their way through it. And now, look, it’s been maybe four, five, six months since, and the models have gotten significantly better at their decision making over that time.

But I’m looking for examples like that, right? There are ways to game systems that have been around for 10 to 20 years, with all the layers of security that we put in, with all the guardrails that we put in, and yet we still have vulnerabilities out there that we can’t manage. I want to go to John. Oh, go ahead, Derek.

[00:49:40] Speaker D: Yeah, so one thing on the time and money piece of it. The thing that leadership needs to understand is you need to spend money on these AI threat intelligence monitoring tools to save money. And that’s where the rub is. They don’t want to spend, because it is expensive.

[00:49:53] Speaker A: Thank you Derek. Go ahead John.

[00:49:56] Speaker H: I think if you’re going to take the human out of the loop, the amount of preparation work you have to do and the amount of ongoing work you have to do really increases. And monitoring is really important. If you’re going to take the human out of the loop, you absolutely have to monitor, but you really have to go even beyond that. You have to do so much more testing. You also have to do continuous, simulated testing just to make sure that this thing is behaving the way you expect once it’s live in production. And so, yeah, monitoring helps you after the fact, after something’s happened. I mean, there are all sorts of stories where these AI agents are doing things that people didn’t intend for them to do, and the monitoring would catch that, and you could use that information to make changes in the future. But you really want to get ahead as much as possible.

And the way you can do that is through simulated transactions and other testing as it’s running in production.
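John’s simulated-transaction idea, reduced to its core, is to keep running known-answer probes through the live agent and report mismatches. A sketch assuming the agent is just a callable; in practice the comparison would usually be fuzzier than exact equality:

```python
def run_synthetic_probes(agent, probes):
    """Send known-answer test transactions through a live agent.
    `agent` is any callable; `probes` is a list of
    (input, expected_output) pairs. Returns the probes that failed."""
    failures = []
    for prompt, expected in probes:
        actual = agent(prompt)
        if actual != expected:
            failures.append({"input": prompt, "expected": expected, "got": actual})
    return failures
```

Scheduled against production, a probe run like this catches misbehavior between real incidents, rather than only explaining them afterward.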

[00:50:49] Speaker A: Once again, hitting on a skill set and domain that many enterprises under-invest in: the ability to do testing and continuous testing, creating synthetic data, and putting in place the monitoring tools to make sure things are working. Folks, I’m a little bearish about this. I think AI agents are really exciting, but I think we’re being a little too optimistic about automating their decision making, and very optimistic about using performance management techniques around this. I’d like to see, you know, let’s just put some basic unit metrics in, and error rates, and I’ll be doing some writing around this: these notions from site reliability engineering of managing error budgets and looking for service level objectives. You want my opinion? I think that’s a better framework for managing AI agents than how we manage people. Go ahead, Joe.
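The SRE framing Isaac mentions, error budgets against a service level objective, could look like this applied to an agent. The 99% target and the halt rule below are examples, not recommendations:

```python
def error_budget_status(slo_target, total_tasks, failed_tasks):
    """Given an SLO (e.g. 0.99 = 99% of agent tasks must complete
    acceptably), report how much of the error budget is burned.
    A burn of 1.0 means the budget is exhausted."""
    allowed_failures = total_tasks * (1 - slo_target)
    burned = failed_tasks / allowed_failures if allowed_failures else float("inf")
    return {
        "budget_burned": burned,
        "halt_recommended": burned >= 1.0,  # e.g. pause the agent's autonomy
    }
```

Unlike a human performance plan, the output is mechanical: when the budget is gone, the agent loses autonomy until the error rate is understood.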

[00:51:46] Speaker F: I’m going to give you all three on your list of leadership skill, operating model shift and tech capability.

[00:51:52] Speaker A: Awesome.

[00:51:54] Speaker F: The first is on the leadership skill: it’s orchestration thinking. As Joanne said, it’s not about one single agent, it’s a cluster.

You’re managing lots of workflows. I just wrote a blog piece last week about elevating the thinking from point solutions, or even departmental solutions, to leadership across the enterprise. Everything old is new again, right? You have to think about the entire enterprise as your perspective when you’re bringing agentic AI into play. The operating model shift is similar: cross-functional agent governance teams. You have to look across all of the functions and stop thinking about AI as just a single project. And then on the tech capabilities, I think everybody has really hit hard on this: it has to be observable and auditable.

And I put in the comments that you can’t allow agents to do anything so fast or so broad that you can’t monitor it. To your point, Isaac, the technology hasn’t caught up. Humans can’t monitor as fast as agents can act. So that means in the short term we have to slow them down and make sure that we monitor them sufficiently before we allow them to do more.

And that’s the way I look at it.

[00:53:13] Speaker A: Thank you Joe. I love the breakdown. Go to Joanne. Your thoughts on skills, operating models and tech capabilities.

[00:53:22] Speaker B: I think you know, not only will I echo Joe’s sentiments, it is a mindset for leadership to understand that the whole point of using a tool and really we’re talking about tooling is to drive a business outcome.

And if you’re not going to be able to figure out the business outcome, or rather the value, that’s being created by agentic AI or generative AI or physical AI, or a combo of the three (which could be called a robot), then why are you doing it?

So figure out first what are the, what are the real outcomes you’re trying to achieve? If it’s replacing people, you’re making a mistake.

Because no matter how much tribal knowledge you can capture in a system, people will never function the same way as an agent, and agents will never function the same way as a person, because the guardrails that we need to establish are slightly different. Especially in situations of multi-dimensional change management, which we deal with every day. Think of it this way: how often do we change our hat from one moment to another in our day-to-day work life? One minute we’re answering email, another minute we’re making decisions, a third minute we’re doing workflow. All of those different hats have different criteria, different responsibility, different accountability.

And if you put that in terms of AI, the observability, the explainability, the provenance, what’s an embedding, what’s a semantic layer, what’s an ontology?

You can’t directly map one to the other. So be careful, be purposeful about what you’re doing and start with leadership. And if leadership doesn’t buy in, or you can’t put six items on a chart and say these are the things that we’re aiming at over the next six months, 12 months, two years, then rethink your strategy with AI.

[00:55:26] Speaker A: Thank you, Joanne. Let’s go to John, and then Roman, I want to end with you, with a quick comment to finish up on what an AI-native professional looks like from the program you were just on. Go ahead, John.

[00:55:39] Speaker H: I was just going to say, I’ve been looking at a lot of enterprise and small business software, and I’m surprised how many of them are starting to have agents built into them. This conversation is so real in light of the other conversations I’m having with software companies right now. Yesterday I was with a marketing team that, for a small amount of money, can drop in a set of agents to manage almost all of the back office: marketing, recruiting, other things. It’s really kind of scary. You know, are you really going to drop a set of agents in to manage everything in my back office? And this company’s like, yeah, we absolutely can.

[00:56:17] Speaker A: I don’t know. I run a pretty small business, John, and I have agents doing my work for me, but they are not automated for the most part. I’ve had automation in my business for a very long time, but when I have to make a decision, I’m making that decision, call it old school. Roman, giving you the last word: I want to hear where your conclusions landed around AI-native professionals.

[00:56:43] Speaker C: So what I would say is, it appeared to me that most people were looking at something very similar to what Joanne was talking about: what are the things that I do over my career, or over my monthly or daily cycle? So let me just take an example of somebody who’s a product manager. They’re going to be looking at product strategy, discovery, defining, prioritizing, delivering, go-to-market, and then modifying the product.

[00:57:10] Speaker B: Okay.

[00:57:11] Speaker C: And what they’re looking at from a professional perspective is: what parts of those steps in the process are going to get automated, either by my company or by me? So that’s what they’re looking at: how do I, as a professional product manager, automate certain steps in product strategy or discovery, or in delivering the actual product, or getting ready for go-to-market? If you look at all the steps underneath those bigger processes, which ones am I going to automate with an agent? And Isaac, on a personal basis, I mean, think about all the emails that StarCIO gets. You could put an agent in there to answer questions rather than just scheduling time with you.

[00:58:02] Speaker A: Yeah, believe me, I’m thinking about that. There are actually tools out there that will replicate your knowledge and make a digital twin of yourself to answer questions like that, and for someone like me, that’s both exciting and a little bit scary. Roman, thank you for joining this week. Folks, thanks for joining this discussion on managing AI agents. I just want to make a personal announcement from StarCIO, the company I lead. We do workshops all the time, and every year we launch one or two new flavors of workshop. We just formalized our workshop on AI strategy and governance for your organization. Even if you have this down, the technology and the process and the culture are changing too quickly; you will likely want to update these things at least once or twice a year, and our workshop has 15 different advisories to it. You can pick the ones that are most relevant to you. Visit starcio.com for the StarCIO workshop on AI strategy; I’ll put that in the comments for you afterward so you’ll be able to see the workshop we are currently offering.

Our upcoming Coffee Hours: next week we’ll be talking about essential sales and marketing skills for Digital Trailblazers. On the 3rd we’ll be talking about the evolution of the data owner role and whether it is obsolete in the AI era. On the 10th we’ll be talking about developing your personal brand, one of the more important things I think Digital Trailblazers have to do if you are looking for new opportunities. And on the 17th I’m going to have a bunch of special guests talking about AI coding competencies, where they see hype, realities, and what they think the future will bring.

Again, visit drive.starcio.com/coffee for the add-to-calendar link so you won’t miss any if your calendar is blocked. And you can always visit that URL again to go back and listen and watch the previous episodes if you missed any part of it. Folks, enjoy the basketball this weekend. Go Arizona Wildcats.

I’m rooting for my team everybody. Have a great weekend. I’ll see you here next week.


Digital Trailblazer Community

Isaac Sacolick

Our community of Digital Trailblazers is for leaders of digital transformation initiatives, people aspiring to tech/data/AI leadership roles, and C-level leaders who develop digital transformation as a core organizational competency.

Review the Community Guidelines