Mark the date: July 6, 2026. That's the day ChatGPT Workspace Agents stop being free and start drawing down credits every time they run. After two deadline extensions that pushed the cutoff from the original May 6 to July 6, OpenAI's free preview for agent runs invoked inside ChatGPT is finally ending. If you're reading this the week it publishes, you have roughly seven days to understand the new math before your agents start spending real money.
Here's why this matters more than a typical pricing tweak. Workspace Agents are the feature that turns ChatGPT Business from "a smarter chatbot" into "a teammate that does multi-step work on its own" — routing product feedback, generating weekly metrics reports, triaging inbound requests, drafting and sending follow-ups across your connected apps. They're genuinely useful. They're also genuinely metered now. And agents, by their nature, burn far more tokens than a person typing into a chat box ever did.
The good news: the change is more navigable than the scary headlines suggest, the per-run cost is small if you set it up right, and one entire category of agent usage stays free. The bad news: small businesses that flip these on without spend controls are walking straight into the exact trap that just blew up enterprise AI budgets across the Fortune 500. Let's break down what's actually changing, what a run really costs, and the seven moves that keep your bill predictable.
What's Actually Changing on July 6
Through 2025, both OpenAI and Anthropic sold their best capabilities on flat, predictable subscriptions. In 2026, both labs moved substantial portions of their services to token-based, usage-metered billing. Workspace Agents are the latest piece of OpenAI's lineup to cross that line — and the timing is no accident. Agentic tools are expensive to run because an agent doesn't ask one question; it plans, calls tools, reads results, and loops, often dozens of times, to finish a single task.
OpenAI's rate card spells out the new model. There is no fixed price per agent run. Instead, each run consumes credits based on the underlying model, the volume of input and output tokens, and how much cached context it reuses. The published guidance: a typical end-to-end run on GPT-5.5 lands between 5 and 25 credits. A worked example — 20,000 input tokens, 80,000 cached input tokens, and 5,000 output tokens — comes to about 7.25 credits. Cached input is dramatically cheaper than fresh input, which is your first clue about how to keep costs down.
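To make the arithmetic concrete, here is a minimal Python sketch of a per-run estimate. The three per-million-token rates are assumptions, not figures quoted from OpenAI's rate card: they were chosen only because they reproduce the 7.25-credit worked example above. Swap in the published values before relying on the output.

```python
from dataclasses import dataclass


# ASSUMPTION: illustrative credit rates per million tokens, not OpenAI's
# published figures. They were picked so the worked example sums to 7.25.
@dataclass(frozen=True)
class CreditRates:
    input_per_million: float = 125.0
    cached_input_per_million: float = 12.5
    output_per_million: float = 750.0


def run_credits(input_tokens: int, cached_tokens: int, output_tokens: int,
                rates: CreditRates = CreditRates()) -> float:
    """Estimate the credits one agent run consumes from its token counts."""
    weighted = (
        input_tokens * rates.input_per_million
        + cached_tokens * rates.cached_input_per_million
        + output_tokens * rates.output_per_million
    )
    return weighted / 1_000_000


# The worked example: 20k fresh input, 80k cached input, 5k output.
worked_example = run_credits(20_000, 80_000, 5_000)
# The same run if none of the 100k input tokens had been cached.
no_cache = run_credits(100_000, 0, 5_000)

print(worked_example)  # 7.25
print(no_cache)        # 16.25
```

Under these assumed rates, the uncached version of the same run costs more than twice as much, which is the point of the caching clue.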
The free lane that's easy to miss
This is the detail most coverage skips, and it's the most useful one for a small team: the July 6 rates apply only to agent runs invoked within ChatGPT. Runs invoked outside ChatGPT — most importantly, an agent responding inside a Slack channel — remain in free preview. If a chunk of your intended agent work is "post in Slack when X happens" or "answer team questions in this channel," that path doesn't start metering on July 6. Designing around this distinction is a legitimate, supported way to keep your early agent spend near zero while you learn what's worth paying for.
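One way to act on this distinction is to list your planned workflows and tag each with the surface it runs on. The sketch below is a planning aid only: the `Workflow` type, the surface labels, and the example workflows are all hypothetical, and the rule it encodes is simply the one stated above (inside ChatGPT is metered, Slack stays in free preview).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    name: str
    surface: str  # hypothetical label: "chatgpt" or "slack"


def is_metered(workflow: Workflow) -> bool:
    """Per the article, only runs invoked inside ChatGPT draw credits from July 6."""
    return workflow.surface == "chatgpt"


# Hypothetical workflows, named after the examples in the text.
workflows = [
    Workflow("weekly metrics report", "chatgpt"),
    Workflow("channel Q&A", "slack"),
    Workflow("notify when X happens", "slack"),
]

metered = [w.name for w in workflows if is_metered(w)]
free_lane = [w.name for w in workflows if not is_metered(w)]

print(metered)    # ['weekly metrics report']
print(free_lane)  # ['channel Q&A', 'notify when X happens']
```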
The $500 Million Cautionary Tale
If you think runaway agent costs are an abstract worry, look at what just happened one tier up. On June 19, 2026, the Financial Times reported that Amazon, Walmart, Cisco, Uber, and Meta are all capping internal AI tool budgets, pushing staff to cheaper models, and warning against "AI for the sake of AI." Uber introduced a hard limit of $1,500 per employee per month in token spending on individual AI tools — after reportedly burning through its entire 2026 AI budget by April, four months into the year. One enterprise reportedly spent $500 million in a single month after deploying AI access with no usage caps.
The root cause is structural, not careless. Most companies set their 2026 AI budgets in the fall of 2025 — before agentic tools detonated their per-seat assumptions. When billing shifted from "flat subscription" to "scales with usage," and agents started looping through tasks autonomously, the spend curve bent sharply upward. The lesson for a 10-person company is identical to the lesson for a 100,000-person company: the fix isn't a cheaper model — it's treating budget governance as core infrastructure, set up before the agents go to work, not after the invoice arrives.
That's the real reason this July 6 change deserves your attention. Not because the per-run cost is high — 7.25 credits is trivial. But because agents make it effortless to run thousands of those trivial runs without noticing, and "trivial × thousands" is exactly how the big players blew their budgets.
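A quick projection shows how fast "trivial × thousands" adds up. The 7.25 credits per run comes from the worked example; the run frequency, seat count, and working days are assumptions chosen for illustration, so substitute your own numbers.

```python
def monthly_credits(credits_per_run: float, runs_per_seat_per_day: int,
                    seats: int, working_days: int = 22) -> float:
    """Project monthly credit draw for a team running agents at a steady rate."""
    return credits_per_run * runs_per_seat_per_day * seats * working_days


# ASSUMPTION: a 10-person team, 20 in-ChatGPT agent runs per person per
# working day, 22 working days in the month.
total = monthly_credits(credits_per_run=7.25, runs_per_seat_per_day=20, seats=10)
runs = 20 * 10 * 22

print(runs)   # 4400
print(total)  # 31900.0
```

No single run in that projection costs more than the "trivial" 7.25 credits, yet the month ends at 31,900.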
Free Preview vs. Credit Pricing: The At-a-Glance Table
Here's how the two regimes compare, so you can see exactly which of your intended workflows starts costing money on July 6 and which doesn't.
| Dimension | Before July 6 (Free Preview) | From July 6 (Credit Pricing) |
|---|---|---|
| Agent run inside ChatGPT | ✓ Free | Billed ~5–25 credits/run |
| Agent run in Slack (outside ChatGPT) | ✓ Free | ✓ Still free preview |
| Plan access to Workspace Agents | Bundled with Business+ | Bundled with Business+ |
| Typical cached-heavy run (worked example) | Free | ~7.25 credits |
| Cost predictability | Flat (no metering) | Variable — needs caps |
The takeaway: access to agents isn't what costs money; the work agents do inside ChatGPT is. Your job between now and July 6 is to decide which work is worth metering, route the rest through the free lane, and put guardrails around the part you pay for.
7 Moves to Control Spend Before the Bills Start
This is the survival guide. Run through these in the next seven days and you'll enter the credit era with your costs boxed in rather than open-ended.
- Set per-seat and workspace spend caps first — before you turn agents loose. The single biggest mistake the enterprises made was deploying agent access with no usage limits. Define a monthly credit ceiling for the workspace and, where possible, per-user limits, so one enthusiastic power user can't drain the pool. This is a five-minute setting that prevents a four-figure surprise.
- Route high-frequency, low-stakes work through the Slack free lane. Status pings, channel Q&A, "notify us when X happens" — if it can live in Slack, keep it there. Those runs stay in free preview. Reserve metered in-ChatGPT runs for the high-value tasks that actually justify credits.
- Exploit prompt caching. The rate card makes cached input far cheaper than fresh input — in the worked example, 80,000 cached tokens cost a fraction of what fresh tokens would. Build agents that reuse stable context (your playbook, your data schema, your brand voice) rather than re-sending it every run. Predictable, repeated tasks are where caching saves the most.
- Match the model to the job. Don't run a frontier reasoning model on a task a lighter model handles fine. The enterprises that got their bills under control did it by pushing routine work to cheaper models, not by banning AI. Reserve heavy reasoning for the runs that genuinely need it.
- Scope each agent tightly. An agent told to "handle support" will wander; an agent told to "draft a reply, tag urgent tickets, and stop" finishes in fewer loops. Tighter instructions mean fewer tool calls, fewer tokens, fewer credits. Specificity is a cost-control tool, not just a quality one.
- Pilot before you scale. Turn on two or three agents for your highest-value workflows, watch the credit draw for a week, and extrapolate before rolling out to the whole team. You want a measured cost-per-outcome number — "this weekly report costs us about 9 credits" — before you multiply it across 12 seats.
- Review credit usage weekly, like you review cash. Put a recurring 10-minute slot on the calendar to scan the workspace usage dashboard. Catching a misconfigured agent in week one costs you a few credits; catching it in week eight costs you a budget line. Governance is a habit, not a one-time setup.
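The cap-first idea in the list above can be sketched as a small ledger that refuses a run once either the per-user or the workspace ceiling would be exceeded. This is a hypothetical in-house model for reasoning about caps, not an OpenAI API or admin setting; the class name, the cap values, and the 9-credit run size (borrowed from the weekly-report example) are all assumptions.

```python
from collections import defaultdict


class SpendGuard:
    """Hypothetical local ledger for credit caps; this is not an OpenAI API."""

    def __init__(self, workspace_cap: float, per_user_cap: float) -> None:
        self.workspace_cap = workspace_cap
        self.per_user_cap = per_user_cap
        self.spent_by_user = defaultdict(float)  # user name -> credits spent

    @property
    def workspace_spent(self) -> float:
        return sum(self.spent_by_user.values())

    def can_run(self, user: str, estimated_credits: float) -> bool:
        """True only if the run fits under both the user cap and the workspace cap."""
        within_user = self.spent_by_user[user] + estimated_credits <= self.per_user_cap
        within_workspace = self.workspace_spent + estimated_credits <= self.workspace_cap
        return within_user and within_workspace

    def record(self, user: str, credits: float) -> None:
        """Log credits actually consumed by a finished run."""
        self.spent_by_user[user] += credits


# ASSUMPTION: a 100-credit workspace ceiling and a 40-credit per-user ceiling.
guard = SpendGuard(workspace_cap=100.0, per_user_cap=40.0)
for _ in range(4):
    if guard.can_run("ana", 9.0):
        guard.record("ana", 9.0)

print(guard.spent_by_user["ana"])  # 36.0
print(guard.can_run("ana", 9.0))   # False: 45 would exceed ana's 40-credit cap
print(guard.can_run("ben", 9.0))   # True: ben and the workspace both have room
```

Checking the cap before the run, rather than tallying afterward, is what keeps one heavy user from draining the shared pool.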
How This Compares to Microsoft Copilot
If you're weighing your options, it's fair to ask how the metered-agent model stacks up against the alternative. Microsoft is running a promotional $18/user/month price on Copilot for Business through September 2026, and bundles agent-style automation inside the Office apps your team may already live in. Copilot's pitch is flat-rate predictability and deep Office integration; ChatGPT's pitch is broader capability, no ecosystem lock-in, and agents that act across whatever tools you connect — at the cost of usage-based metering for the heaviest work.
There's no universal winner. If your team lives entirely inside Microsoft 365 and wants one predictable bill, Copilot is a reasonable call. If you want the strongest agentic reasoning, flexibility across mixed tools, and you're willing to manage credits to get it, ChatGPT Business with Workspace Agents is the more capable platform. Many larger organizations now run both. For a small business, the right answer usually comes down to where your team already works and how much agent automation you actually intend to deploy — which is exactly the kind of decision a partner can model out with you before you commit. See our fuller ChatGPT vs Copilot breakdown for the side-by-side.
The Sayfe.ai Take
We've watched OpenAI move this deadline twice, meter feature after feature, and quietly rewrite the cost math for small businesses three times in a single quarter. The July 6 Workspace Agents change is the clearest example yet of why setup matters more than ever: the teams that win with agents aren't the ones who turn everything on — they're the ones who turn on the right things with caps in place from day one.
As an authorized OpenAI SMB Channel Partner, Sayfe.ai helps you configure Workspace Agents the right way: spend caps and per-seat limits set before launch, high-frequency work routed through the free Slack lane, caching-friendly agent design, and a weekly usage review you'll actually keep. We do it at no markup over OpenAI's pricing — and we track these roadmap changes daily so a release-note edit never turns into a billing surprise. If you want agents that finish real work without the four-figure shock the big players just took, this is the week to get it set up. Want the broader picture on what agents can do? Start with our guides on how Workspace Agents replace custom GPTs and Goal Mode autonomous agents for small business.
Frequently Asked Questions
When does credit pricing for Workspace Agents start?
Credit-based pricing for Workspace Agent runs invoked inside ChatGPT begins July 6, 2026. OpenAI extended the free preview twice, moving the original May 6 cutoff to July 6, so this is the current effective date. Agent runs invoked outside ChatGPT, such as an agent responding in a Slack channel, remain in free preview as of this date.
How much does a Workspace Agent run cost?
There's no fixed per-run price. OpenAI's rate card indicates a typical end-to-end run on GPT-5.5 consumes roughly 5 to 25 credits, depending on the model used and the volume of input, cached, and output tokens. A worked example with 20,000 input tokens, 80,000 cached input tokens, and 5,000 output tokens comes to about 7.25 credits. Cached input is significantly cheaper than fresh input, so caching-friendly agents cost less.
Is there a separate fee to access Workspace Agents?
No separate access fee. Workspace Agents are bundled with ChatGPT Business ($25/user/month) or higher plans, including Enterprise and Edu. What changes on July 6 is that the work the agents do inside ChatGPT consumes credits. You pay for usage, not for the feature itself.
How do I keep Workspace Agent spend under control?
Set workspace and per-seat spend caps before you enable agents, route high-frequency low-stakes work through the free Slack lane, design agents to reuse cached context, match lighter models to routine tasks, scope each agent tightly, pilot before scaling, and review credit usage weekly. The enterprises that lost control deployed agents with no usage limits; small teams that set caps first keep spend predictable.
Is ChatGPT Business still worth it after the change?
For most small businesses, yes. The flat-rate core (data-private ChatGPT, admin controls, SSO, shared custom GPTs, and the in-app agent experience) is unchanged. Workspace Agents add autonomous, multi-step work on top, and with sensible caps the credit cost for a typical small team stays modest. The platform's flexibility and agentic capability remain a strong value, especially with a partner configuring the cost controls for you.
Turn On Workspace Agents Without the Budget Surprise
July 6 is days away. Sayfe.ai, an authorized OpenAI SMB Channel Partner, sets up your agents with spend caps and per-seat limits from day one, routes free-lane work to Slack, and reviews your usage weekly — all at no markup over OpenAI's pricing.
About Sayfe.ai: Sayfe.ai is an authorized OpenAI SMB Channel Partner. We help small and medium-sized businesses implement and optimize ChatGPT Business, ChatGPT Enterprise, and the OpenAI API. We're here to make enterprise AI accessible to teams of any size.