There is, at last, an AI assistant that is being hailed as helping with practical things – like organising your overflowing downloads folder or filing your expense reports, rather than ripping off artists or cheating on homework.
The latest release from the US firm Anthropic is a “pivotal advancement”, according to AI influencer Nathan Hodgson. It’s a “wild” development, AI educator James Maduk said on YouTube, capable of doing the work of 50 staff. The uses of this new tool called Claude Cowork are “insane”, added another influencer.
Paying subscribers can now access the new mode for Claude – Anthropic’s answer to ChatGPT – which can manipulate files directly on your computer, assign admin tasks to agents it controls, generate polished documents, and chew over extensive tasks.
But despite the online hype, Anthropic itself is not joining the hysteria. It calls Claude Cowork a “research preview” that needs close monitoring and a tight leash. And our testing reveals that despite Cowork’s impressive features, there’s a reason for the company’s caution. So what can the tool do?
Testing the AI coworker
Cowork can currently only function on relatively new Apple Macs, via the Claude app. To start, I limited the AI’s access by leaving most connectors (which give Claude access to certain resources on your machine or on the internet) turned off, and only giving it permission to use a single folder. It took longer than expected, and asked a lot of clarifying questions, but it was able to sort files into new folders and extract data from photos to put into a spreadsheet.
It didn’t do quite as well when combining multiple photo libraries into one folder: it copied things instead of moving them, failed to identify near-identical photos for deletion, and got confused about dates.
To really test it, I added some online connectors and asked it to take 24 unorganised receipts from my Google Drive and file them as expense reports in Concur, an online workplace reimbursement system. Claude confidently assured me it would be easy. It was not.
Claude opened the URL for Google Drive but had no idea how to use it, and since it expresses its thinking in real time, I could watch it struggling. It tried to scroll down but somehow wrote “page_down” in the search bar instead, and ignored Google’s polite warning by insisting on searching for that string, which turned up nothing. It tried to select all the files in the folder but only selected one, and tried to download it anyway. Twice. It expressed all of this cheerfully as though it was exactly what it had planned to do.
“Excellent! I can see ‘Preparing download’ and ‘Zipping 1 file’ at the bottom right. It appears that clicking Download while having just one file selected is downloading only that file. Let me wait for this to complete, but I realise I need to download ALL the receipts, not just one,” it said.
“Let me cancel this and instead navigate to the parent folder to download the entire folder. Let me click the X to cancel this download. Good! The download was cancelled. Now I’m still in the folder with one file selected. Let me think of a better approach.”
Eventually I told Claude to stop and downloaded the files myself, then asked it to continue. It read the receipts and made appropriate expense report shells in Concur, but it couldn’t upload anything.
After a full 30 minutes, it had the gall to leave me step-by-step instructions to upload my own receipts. “The receipts are organised and ready”, it said, forgetting that I had been the one to make it so because it couldn’t use Google Drive.
The potential pros and cons
The idea that you could just ask for what you want in plain language and have an AI use your computer to do it is compelling. Even if it’s not much faster than a human, you could effectively get admin done while you were making lunch or doing other work tasks. AI early adopters are already automating stuff like this using Claude Code – the company’s more advanced AI tool for people who are technically adept – but Cowork has the potential to give that power to people who don’t know how to speak computer language.
The version of Claude Cowork I tested is only good for messing around with; it is not ready for real tasks. It was apparently built in a week and a half, with Anthropic saying it plans to keep building it alongside users in the months ahead. But we’ve seen in the past that it’s not always simple or possible to go from a generative system that kind of does something to one that does it consistently and safely.
Assuming you were using Claude for real tasks, the risks could multiply. The agent is capable of deleting or irreversibly changing your files. Generative AI is fallible. It gets things wrong and lies about it. Currently, Anthropic suggests limiting it only to a Claude-specific work folder.
Sending Claude online is another level of risk, especially if you’re handing it personal, financial or sensitive data. Claude could click the wrong thing, upload in the wrong spot, or run afoul of malicious code. Claude may have a mind of its own, but Anthropic’s policy puts the responsibility for anything the bot does on you.
I like the way Cowork asks clarifying questions and chats through what it’s doing, but as with all language models you can tell it’s mostly talk. My tests left me feeling like I was dealing with a human helper who was enthusiastic, but incapable of the jobs I’d set.
Would it be better if I were an experienced prompter making more specific requests, or a coder who could build direct pipelines to the services I wanted to connect to? Sure, but that’s a different product to the “desktop agent for everyone” promised here.