217: Anthropic Just Dropped Their Internal Data Playbook (copy this)

Help us become the #1 Data Podcast by leaving a rating & review! We are 67 reviews away!

Anthropic just dropped their entire internal data playbook. Here's what they're doing and how it affects your career.

💌 Join 30k+ aspiring data analysts & get my tips in your inbox weekly 👉 https://datacareerjumpstart.com/newsletter

🆘 Feeling stuck in your data journey? Come to my next free "How to Land Your First Data Job" training 👉 https://datacareerjumpstart.com/training

👩‍💻 Want to land a data job in less than 90 days? 👉 https://datacareerjumpstart.com/daa

👔 Ace The Interview with Confidence 👉 https://datacareerjumpstart.com/interviewsimulator

📄 Read Anthropic's full data playbook 👉 https://claude.com/blog/how-anthropic-enables-self-service-data-analytics-with-claude

⌚ TIMESTAMPS

00:00 – Anthropic dropped their data playbook

02:39 – Why AI analytics keeps failing

05:24 – How they hit 95% accuracy

09:24 – What a Claude skill is

14:39 – None of this is actually new

17:09 – Still hiring data people

Avery Smith: 00:00:00

So Anthropic, the makers of Claude, literally just dropped an absolute masterclass on how they analyze data internally. They posted a blog post that is four thousand five hundred words, and there's a lot in there. So I summarized that entire blog post, and I will explain it to you like you're five years old in today's episode, and you can literally steal it and learn how to analyze data just like a Claude data analyst. So this is what Claude is actually claiming. They're claiming that they now do self-serve analytics, which is kind of a funny phrase. Basically, it means allowing non-technical people, non-data analysts, to do data analytics in easy ways, and this has been a thing for the last decade or so. In fact, one of the main reasons why Tableau and Power BI became so important with dashboards is that they allow business people, non-technical people, to analyze their data in predefined ways. It's been really hard to do for the last ten, fifteen years. Now, basically, Anthropic just tweeted that they are able to get ninety-five percent accuracy on all of their business analytics queries with Claude, which is crazy. That basically means that if anyone has some sort of an analytics question, they can answer it now with ninety-five percent accuracy using this internal playbook. So what are they actually doing, and how can you replicate it in your own organization, or how can you bring this to an interview to make you a more marketable aspiring data analyst? Basically, like I said, self-serve analytics has always kind of sucked. It's when non-technical people are analyzing the data sets, and there are basically two different ways to do it. Option A is you open it up to everyone, which basically means you have non-data analyst people trying to analyze data, and a lot can go wrong. You can get really messy, different queries, maybe messy dashboards, conflicting definitions, those types of things. 
Or you lock it all down, which basically means that you create a bajillion different types of dashboards, but it never really answers anyone's question when they want it, the way they want it. And that's been tricky in the past. So now there's AI, and now you can give someone Claude or ChatGPT and point it to a database, and you can have them ask ChatGPT or Claude questions about the database. But there's a big issue. Number one, we all think the AI doesn't hallucinate, doesn't lie, doesn't make things up, and it does, and it can be wrong. And number two, it gives everyone the impression of, "Oh, this is a hundred percent accurate," but it's not, and that can cause a lot of issues. So AI is a great solution for self-serve analytics, but it causes a lot of problems as well. So how did Anthropic actually solve it? Because what they're claiming is that ninety-five percent of their business analytics queries are now solved automatically by Claude, and they're ninety-five percent accurate, which is a big claim. That's basically like, "Hey, Claude is now our company data analyst, essentially." Now, I will mention here that the data team can now work on bigger and better problems that are less SQL-monkey questions, right? So it's not like they're getting rid of their data analysts or their data scientists. It's just that you don't have to do as much ad hoc reporting, and you can focus on more important things. And just managing this Claude infrastructure, creating this company-wide self-serve analytics platform, is a beast, and we'll get to that here in a second. Basically, in this article, their thesis is that data is very different than software. If you've heard about Claude or Codex for programming and software engineering, it can do those things really, really well out of the box, because coding has lots of right answers. 
There are ways to test things. There's documentation that goes with code. And all that infrastructure can basically catch hallucinations. It's a more solved problem. Analytics is quite a bit different because there's only one right answer, and you don't really know what the right answer is. There's no way to actually test what the answer is, versus in programming, you're like, "Does this box open up if I click the button?" You can test that. Here there's no way to know. If I ask Claude for the mean of our sales over the last month, you really have to go actually run the query yourself to make sure that Claude's not giving you a false answer. So their argument is, we're not having issues with code generation. It's basically all of the context and verification that goes around solving a business analytics problem. And LLMs historically have been pretty bad at this for a multitude of reasons. One is that we give it unclear directions. I don't know about you guys, but if you're anything like me, you don't necessarily give Claude or ChatGPT the most specific instructions on planet Earth, and there's some ambiguity. And the problem with that is, it can go into the database and it thinks it knows what you're talking about, but it finds a different column, or it's not using the same definition. You're basically not on the same page as ChatGPT unless you give really explicit directions. Number two, there's data staleness, which basically means that your database is constantly changing over time. Definitions change, tables change, and these LLMs are not really good at keeping up with that. They don't have the business context, the domain context that you may have as a human being on the other side, like, "This is why we made those changes," you know, "This is why it's better," so on and so forth. 
And then number three is it just doesn't know where to find the right thing. Like, it thinks the data's in there, it's looking, but it's not entirely sure.

Avery Smith: 00:05:23

So here's what Anthropic did to try to solve this problem, and they're calling it Anthropic's Agent Analytics Stack. There are basically four different stages right here, and each one is built to take one of those previous problems that we talked about and solve it. The first one is data foundations, and basically, it just means you have really solid data foundations. It means you're very clear on what a table is, what it actually has, what a row represents, what a column represents, and how often it's updated. Number two is you only have one source of truth, and the idea is if you have a sales table in your database, you don't have another sales table in your database. There's only one sales table, and that is the sales table. There are no other sales tables. Some of you guys listening who are more junior data analysts or aspiring data analysts might be thinking, "Well, that makes sense. Why would it ever be a different case?" The issue is when you get to large organizations, something like Anthropic, or ExxonMobil when I worked there, you gotta think that there are literally seventy thousand plus employees, and all of them might need access to that table, and they might need it slightly different. So you might have someone that's like, "Oh, this is their sales table, but we only need the weekly averages," so they create the weekly average sales table. And then there's someone else who's like, "Oh, well, we actually only need the sales from Monday, Wednesday and Friday," and so they create this other table. And basically, you just get a bajillion versions of really the same table. So one source of truth is really important here. Number three, they develop skills. These are like Claude skills for LLMs that specifically do a repeated task with specific instructions and maybe even some accompanying code to make it really repeatable. LLMs have inherent randomness built into them. 
They are non-deterministic, as in you don't get the same answer every time you ask the same question, and skills help make it more deterministic, so that there actually is a specific answer. This is exactly what you should be doing. So it's basically instructions and almost code files to follow every single time this gets asked. And the fourth one is validation, and that is making sure that the LLMs are actually doing what you think they are and validating their answers. So let's dive in a little bit deeper. Like I said, layers one and two are basically just having good data governance and good data foundations. One source of truth. They also make sure that they have little descriptions for each one of the different tables that describe what the table is and what it isn't. LLMs are really good at reading text, so if you add a little bit of text with your tables that explains what's going on, the LLM understands the context a little bit better versus just looking at the rows and the columns and guessing. You can think of this as a README file for your data. In building software, in software engineering, in programming, we've always had README files. If you're unfamiliar, you can think of a README file as a summary of what's going on in your code base. All of these different folders, all these different files, all these different code scripts, what's going on. So it's just a human way to describe what's going on in your code, or your different databases in this case. And they also feed it company knowledge maps. So for this system, they give it roadmaps, org charts, decisions, so a bunch of business context that isn't data. It's not data related. 
It's all business and domain related, but that extra information helps the LLMs make smarter choices on how to analyze the data based off of what the context says.

So they actually tried an experiment here, which I thought was really interesting, where they basically took all the data analysts' and all the data scientists' old SQL files, and they said, "Here, Claude, learn from these. These are all the things that our engineers and our analysts and our data scientists have done over time. Learn from it." And it actually didn't really help, which was really interesting. It didn't know what code to use when. They found that there was a right answer eighty percent of the time, but Claude wasn't good at pulling that answer out. And so what's actually been the biggest unlock is having skills. That went from twenty-one percent accuracy in analyzing data to ninety-five percent accuracy in analyzing data. And if you're unfamiliar with what a Claude skill is, I think they have some equivalent in ChatGPT and OpenAI and Codex. But basically an LLM skill, an AI skill, is a reusable step-by-step pattern to follow. Think of it almost like a recipe for LLMs to follow. So like I said, the majority of the time they're written kind of like a human would write them, and it's just like, "Hey, AI, do exactly this. Step one, step two, step three. Look out for this. Be aware of this." And it might have some code files specifically like, "This is what your code should look like if you generate code." So it's essentially a senior analyst's thoughts written down on paper for a specific task. You might have a skill on how to create a bar chart, or you might have a skill on how to do a hypothesis test or A/B testing or something like that. 
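
To make the "recipe" idea above concrete, here is a minimal sketch in Python. It is not Anthropic's skill file or format; the `Skill` class, the `monthly-revenue` example, and the `sales` and `revenue_usd` names are all hypothetical, made up only to show numbered steps plus "look out for this" notes rendered as plain-text instructions.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """A written-down recipe for one recurring analytics task (hypothetical shape)."""
    name: str
    description: str
    steps: list[str]
    watch_outs: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Turn the recipe into plain-text instructions that could go in a prompt."""
        lines = [f"Skill: {self.name}", self.description, "", "Steps:"]
        lines += [f"{i}. {step}" for i, step in enumerate(self.steps, start=1)]
        if self.watch_outs:
            lines += ["", "Watch out for:"]
            lines += [f"- {item}" for item in self.watch_outs]
        return "\n".join(lines)


# Hypothetical example; the table and column names are invented for illustration.
monthly_revenue = Skill(
    name="monthly-revenue",
    description="Report average revenue for a given month.",
    steps=[
        "Query only the canonical `sales` table.",
        "Filter rows to the requested calendar month.",
        "Return the mean of `revenue_usd` and the row count.",
    ],
    watch_outs=["Do not use derived copies such as weekly-average tables."],
)

print(monthly_revenue.render())
```

The point of writing it down this way is that the same steps get handed to the model every time the question comes up, instead of relying on whatever the person happened to type.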
And it's basically like you have your team get together and write down exactly what the process is. It's like a standard operating procedure that you'd give to a junior analyst, "Hey, follow this," except now the junior data analyst is Claude or an AI.

One issue they saw was that if you don't actually update these skills, if you don't constantly add to them and improve them, the accuracy slides over time. They actually were at ninety-five percent accuracy, and then they dropped down to sixty-five percent accuracy in only a few weeks. So you need to make sure you're updating your skills. And the last thing is they wanted to make sure that their skills were everywhere. So analytics is really changing. You probably haven't seen this in big organizations yet. It's just kind of rolling out to maybe these more frontier trillion-dollar companies, and maybe small solopreneurs like me. But the way that we do data analytics is changing. So obviously, in the past, you'd use Excel to do data analytics, and there are still literally billions and billions of Excel files that we will analyze in Excel. But gradually, ten, fifteen years down the road, I'm not sure if that will be the case. We will probably be analyzing data in a different way than we are now. And before you get really scared and you're like, "Oh my gosh, this is awful, AI's coming for my job," well, just think about this. Basically, Power BI came out fifteen years ago. So fifteen years ago, there were basically no dashboards. Tableau was around, but not super popular at the time, yet it was about to be. About twenty eighteen it started to get really popular. So yes, the way that we analyze data changes over a decade. That's the truth. And just know that right now we are moving into analyzing our data with these chatbots, and those chatbots may be in multiple different places. 
So for example, at my company, I try to analyze data on my YouTube watches or my podcast listens, and I've been trying to automate that as much as I can or make it easier for me to follow all these analytics. And so we actually have a bot that will help me with these analytics, where I can just ask it natural language questions like, "How many views did the last YouTube video get?" or, "How many listens did this podcast episode get?" And we can actually do that on a website that I've built and also in our Slack. So they want to make sure that they have the truth and these skills available everywhere, whether you're coding, whether you're using a website or a dashboard, or whether you're in Slack. So those are the keys to having good skills in your organization. And the last thing is, even if it has a good skill, how do you know that it's correct? That's what we call verification. So what Anthropic's doing, what Claude's doing, is for any analytics they do, they have the sources in the footer. This is where we got this information. This is how we calculated it. This is the table we used. That way it's very clear, and you could look at the table and be like, "Oh, that is the right table," or, "It's not even the right table." They also have a freshness and a version stamp on every data model that shows how old it is. Think about if your data changes over time. They're basically timestamping everything, so that way you know, okay, we can trace it back to this database on this day, type of a thing. They're also doing correction harvesting, which is a really fancy way to say they're giving the AI feedback. So every time that this Claude data analyst gets something wrong, the humans are saying, "Hey, you actually did this wrong. You know, you're supposed to grab from database A, and you grabbed it from database B." 
Or maybe you did your query wrong some way. And every time that feedback goes from the human to the agent, the agent actually updates itself, and it's like, "Oh, okay, I'm gonna mark that as something to try in the future." And the last thing they add is, before it gives any answer back to the human, they run a second agent against it, and that's called an adversarial review. Basically, if you are the AI data analyst and you come up with an answer and you're like, "The average revenue over the last month was thirty thousand dollars," this adversarial review comes in and says, "Is it though? Does that actually make sense? It's been this for the last month and this for the month before. Are you a hundred percent sure?" It's basically trying to prove the first agent incorrect before actually giving the information to the human. So it's almost like a peer review, a double check from an agent to make sure that the analytics is correct.

So this might be really interesting to some of you guys, and this might be really scary to some of you guys. You're like, "Oh my gosh, these AI agents are coming for my job." Well, the first thing I'll tell you is that none of this is actually new. It's just kind of packaged in a fancy, prettified way. If you literally take AI out of this, it's just pure data fundamentals, things that we've had for decades. We've talked about this for years. Yes, it's good to have good data quality. Yes, it's good to have good data governance, to actually know what tables mean and what columns mean and what rows mean. Yes, we should repeat our analysis when we can. If we can analyze the data in a uniform way, we should do that. And yes, we should have verification. If I do an analysis, someone else should check it to make sure it all looks good. This is not new. It's just AI-fied, essentially. 
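
The verification ideas described above (a sources footer, a freshness and version stamp, and a second-pass review that tries to poke holes in the answer) can be sketched in a few lines of Python. This is an illustration under loud assumptions, not Anthropic's implementation: the `Answer` fields, the `footer` and `review` functions, and the fifty percent tolerance are all hypothetical, and the simple rule-based `review` only stands in for what the episode describes as a second AI agent.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Answer:
    """An analytics answer plus the provenance a reader needs to check it."""
    value: float
    source_table: str   # hypothetical table name
    calculation: str    # how the number was computed
    model_version: str  # version stamp of the data model
    refreshed_on: date  # when the underlying data was last refreshed


def footer(answer: Answer, today: date) -> str:
    """Build the 'where did this come from' footer, including data age."""
    age_days = (today - answer.refreshed_on).days
    return (
        f"Source: {answer.source_table} | Calc: {answer.calculation} | "
        f"Version: {answer.model_version} | Data age: {age_days} day(s)"
    )


def review(answer: Answer, recent_values: list[float], today: date,
           tolerance: float = 0.5, max_age_days: int = 7) -> list[str]:
    """Try to find reasons to doubt the answer; an empty list means none found."""
    objections = []
    if (today - answer.refreshed_on).days > max_age_days:
        objections.append("Data is stale.")
    if recent_values:
        baseline = sum(recent_values) / len(recent_values)
        if baseline and abs(answer.value - baseline) / abs(baseline) > tolerance:
            objections.append("Value is far from recent months.")
    return objections


today = date(2024, 6, 10)
answer = Answer(30000.0, "sales", "mean(revenue_usd), May", "v12", date(2024, 6, 9))
print(footer(answer, today))
print(review(answer, recent_values=[29000.0, 31000.0], today=today))
```

If the reviewer returns any objections, the answer would go back for another look instead of straight to the person who asked, which is the peer-review behavior the episode describes.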
The next thing is, this is actually a ton of work to do, and really I don't see a whole lot of companies being able to pull this off other than, like, Anthropic, for example, because Anthropic has literally trillions of dollars. They're growing like crazy. They have tons of employees. But all that documentation, all that governance, all that quality, all that metric mapping and adding all the business information to Claude, it takes hundreds of hours. It takes so much time. And that's before we even talk about maintenance, like we talked about how they slipped from ninety-five percent accuracy to sixty-five percent accuracy by not maintaining their skills. There's so much upfront work and so much maintenance work on this that it's insane. I'm not the only person who actually noticed this. Kristen Lum said, "This work takes hundreds and hundreds of upfront hours at any moderately sized organization, and that's not even counting maintenance." So there is tons of work to be done even if this is working, even if this is set up, at normal companies. I mean, I'm not ExxonMobil. I haven't been at ExxonMobil in five years. I have no clue where they're at. I have no insight. A lot of people that I knew there no longer work there. But just the security and privacy concerns that Exxon would have about all of this would take years to solve, and that's not even implementing and setting it up. Maybe that's changed, I don't know. But my point is, for these large organizations, even ones with billions of dollars, this is gonna be difficult to pull off. The crazy thing about all this is they literally just gave this out. They literally give you a skill sheet, a skill file that you can just copy and use for your own personal analysis, or you can use it on your team's and organization's analysis. 
I have a little part of it right here, or you can just go to the blog post and find the full file. My point here, though, is with all these things that we have to be doing for AI to become a good data analyst, Anthropic's not getting rid of the data analyst right now. They have four hundred roles open, and at least eight of them are in data. They have four thousand seven hundred and forty-two employees on LinkedIn, and one thousand four hundred and seventy-eight of them deal with data, and a hundred and ninety-six of them are data analysts. So if this company that has mastered the ninety-five percent accuracy AI data analyst is still hiring data people, I think that data jobs aren't going away. This is the company that, if they could get rid of humans, they would, right? If you've heard the CEO talk about it, he thinks it's happening, and you don't really see that in their hiring numbers yet. My point of view is this is literally going to free you up to do higher value work, including creating and maintaining systems like this. Like I said, you guys as data analysts are the people best suited for the AI period. You guys know numbers, and if you can pair numbers with AI, you're going to be undefeated. You're gonna be employed for a really long time, and just the fact that you're listening to this right now tells me you're one of those people, because you're interested in data, you're interested in AI, and if you can really carve a niche that's AI plus data, I think you're gonna land an awesome job. I think you're gonna get promoted to an awesome job. I think you're gonna make a lot of money in your career for a really long time. So if you found this fascinating, my name's Avery Smith. Please hit subscribe because I really want to talk about how data and AI intertwine over the next six months, and I want you to be on this journey. I will see you in the next episode.