Nine days ago, Anthropic launched Claude for Small Business. I started using it that same day, and I have run it through real client work every day since. The experience changed how I think about what is possible for small businesses in Fargo-Moorhead right now, not in some future year, but this month.

The term “frontier model” sounds technical. It just means the most capable AI systems available at a given time, the ones at the leading edge of what the technology can do. These are not simple chatbots. They draft proposals, analyze contracts, write code, explain technical ideas in plain language, and reason through complex problems. And they do all of this at a speed that removes waiting from the equation.

Speed That Changes How You Work

Here is where the numbers get interesting. Early AI tools like the original GPT-4, released in early 2023, generated text at roughly 20 to 30 tokens per second. A token is approximately three-quarters of a word. At that pace, generating an 800-word blog draft took 35 to 45 seconds. Long enough that you stared at a progress bar. Long enough to break your thinking.

Today’s frontier models work at a completely different scale.

| Model | Tokens Per Second | Approx. Words Per Minute |
| --- | --- | --- |
| GPT-4 (original, 2023) | 20-30 | 900-1,400 |
| Claude Sonnet 4.6 (2026) | 40-60 | 1,800-2,800 |
| Claude Haiku 4.5 (2026) | 80-120 | 3,700-5,500 |
| GPT-4o (2026) | 116-156 | 5,400-7,200 |
| Gemini 2.0 Flash (2026) | 250 | ~11,500 |

Source: Artificial Analysis LLM Leaderboard, May 2026

At 5,400 words per minute, GPT-4o can produce a full proposal faster than you can read the opening paragraph. At 11,500 words per minute, Gemini 2.0 Flash generates text at roughly 40 times the pace of an average human reader. The bottleneck is no longer the model’s output speed. The bottleneck is knowing what to ask for.
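The conversions behind those figures are simple arithmetic. A minimal sketch, using the rough three-quarters-of-a-word-per-token ratio quoted above (actual ratios vary by text and tokenizer):

```python
# Convert a model's token throughput to words per minute, and estimate
# how long a draft of a given length takes to generate.
# WORDS_PER_TOKEN is the article's approximation, not an exact constant.

WORDS_PER_TOKEN = 0.75

def words_per_minute(tokens_per_second: float) -> float:
    """Token throughput -> approximate words of output per minute."""
    return tokens_per_second * WORDS_PER_TOKEN * 60

def seconds_for_draft(word_count: int, tokens_per_second: float) -> float:
    """Approximate wall-clock seconds to generate `word_count` words."""
    tokens_needed = word_count / WORDS_PER_TOKEN
    return tokens_needed / tokens_per_second

# Original GPT-4 at a midpoint of ~25 tokens/sec:
print(round(words_per_minute(25)))        # ~1,125 words per minute
print(round(seconds_for_draft(800, 25)))  # ~43 seconds for an 800-word draft
```

Running the same numbers at 128 tokens per second, a midpoint for GPT-4o, brings the 800-word draft down to about eight seconds, which is the shift from "wait" to "conversation" described below.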

What This Feels Like in Real Work

The speed shift is not just a benchmark number. It changes the texture of how you use these tools.

When a model takes 40 seconds to respond, you commit to one prompt and wait. When it takes two seconds, you iterate. You try a different angle, tighten the constraints, adjust the tone. The faster the model, the more you can treat it like a real-time conversation instead of a form submission. That shift produces better output, because iteration produces better thinking.

I have used Claude for Small Business every day since its May 4 launch. In these nine days, the work has included drafting client-facing reports, preparing talking points for a technical presentation, summarizing a 40-page document in under 10 seconds, and building a first draft of a competitive analysis that would have taken three hours of manual research. Every task produced usable output on the first or second attempt.

The quality holds up under specific instructions. When I give the model a particular voice to match or a specific structure to follow, it maintains those parameters throughout the document. That consistency is what makes frontier models useful for professional work, not just personal experiments.

Fargo-Moorhead Is Already in the AI Infrastructure Race

The AI expansion is not abstract for this region. Applied Digital is building the Polaris Forge 2 facility in Harwood, North Dakota, eight miles north of downtown Fargo. The project carries 200 megawatts of computing capacity dedicated to AI and high-performance computing workloads, with approximately $5 billion in contracted revenue over 15 years. That puts one of the largest AI infrastructure investments in the upper Midwest on Fargo-Moorhead's doorstep.

North Dakota State University is responding on the education side. Starting in fall 2026, NDSU will award full-ride scholarships to 30 students per incoming class for a new program focused on AI ethics, technology, and society. That investment signals something real: the regional institutions are treating AI as a permanent part of the economic landscape, not a trend to monitor from a distance.

Infrastructure follows demand. As more companies in the metro integrate AI into daily operations, the local talent pipeline, technical support, and available tooling will continue to grow. Businesses that build AI skills now will have a head start when that pipeline matures.

Who Is Using These Tools and What They Are Getting

Adoption among small businesses has climbed fast. The U.S. Chamber of Commerce found that 58 percent of small businesses used generative AI in 2025, up from 40 percent in 2024. A QuickBooks survey put daily AI use among small businesses at 68 percent. These are not pilot programs or executive demos. These are businesses running AI tools as part of regular operations.

The productivity numbers support the adoption curve. The Federal Reserve published research in April 2026 showing that AI tools save workers an average of 2.2 hours per week, or about 5.4 percent of total work hours. For a team of five people, that is more than 570 hours recovered over the course of a year. That is time redirected to client work, business development, or getting ahead of problems before they become crises.
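The team-level figure follows directly from the weekly one. A quick check of the arithmetic, using the Fed's 2.2 hours per week:

```python
# Annual hours recovered for a small team, from the cited 2.2 hours/week figure.
HOURS_SAVED_PER_WEEK = 2.2
TEAM_SIZE = 5
WEEKS_PER_YEAR = 52

annual_hours = HOURS_SAVED_PER_WEEK * TEAM_SIZE * WEEKS_PER_YEAR
print(round(annual_hours))  # 572 hours per year, matching "more than 570" above
```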

The businesses seeing the largest gains share one trait. They do not try these tools once and set them aside. They build AI into the workflow the way they built email into the workflow two decades ago, as a utility that runs in the background of nearly every task rather than a tool reserved for special occasions.

The Skills That Actually Matter

Getting strong output from a frontier model takes some skill, but far less than most people expect. The biggest shift is learning to write clear, specific prompts. Vague instructions produce vague results. Detailed, constrained instructions produce work that is close to usable on the first attempt.

A solid prompt answers four questions before the model starts: What is the task? Who is the audience? What tone or format should it follow? What should the output leave out? Answering those four questions in writing before you submit will do more for your results than anything else you can change.
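One way to make those four questions a habit is to turn them into a template. A minimal sketch; the helper function and field wording are illustrative, not part of any particular AI product:

```python
# Assemble a prompt that answers the four questions before the model starts.
# The four-part structure is the checklist above; the example values are hypothetical.

def build_prompt(task: str, audience: str, tone_format: str, leave_out: str) -> str:
    """Combine the four answers into a single, constrained prompt."""
    return (
        f"Task: {task}\n"
        f"Audience: {audience}\n"
        f"Tone and format: {tone_format}\n"
        f"Leave out: {leave_out}\n"
    )

prompt = build_prompt(
    task="Draft a one-page proposal for a website redesign",
    audience="A non-technical small-business owner",
    tone_format="Plain language, short paragraphs, a bulleted cost summary",
    leave_out="Jargon, filler phrases, and any pricing we have not confirmed",
)
print(prompt)
```

The value is not the code itself but the forcing function: you cannot fill in the template without deciding what you actually want, which is the work that makes the model's first attempt usable.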

That skill is teachable. We offer classes that cover prompt writing, practical workflow integration, and how to apply these tools to the kind of work Fargo-Moorhead businesses actually do. The goal is not to produce AI experts. The goal is to cut the learning curve from six months of trial and error to a few hours of structured practice, so businesses can start seeing real returns without the frustration that comes from figuring everything out alone.

What Changes When the Wait Goes Away

The practical effect of fast, capable AI is a shorter feedback loop. You ask, you get a response, you refine. That cycle used to take minutes. Now it takes seconds. The difference sounds small, but it compounds quickly across a full workday.

This is not about replacing people. It is about removing the friction from tasks that eat chunks of the day without demanding much creative thinking. Writing a first draft, summarizing a long document, answering a question that would have required 20 minutes of research: these are the places where fast AI earns its place in a real workflow.

April and May 2026 brought a wave of new model releases from every major AI provider, including Anthropic, OpenAI, and Google. Each release pushed the capability ceiling higher. The gap between the tools available today and the tools available 12 months ago is large enough to change what is practical for everyday business tasks.

The businesses that benefit most from this moment will be the ones that take it seriously, learn how to direct these tools well, and commit to building AI into how they already work. The speed is there. The capability is there. The only variable is whether you are using it.