Cloud

139 posts
AI

Backstage as the developer portal for AI services

AI services need a catalog the same way every other internal platform does. The wiki approach falls over the moment you have more than a handful of models. Backstage with a thin AI plugin layer is the pattern that holds, and it is a direct callback to catalog discipline.

Sid Smith · 6 min read
A close-up of a metal junction box on a dark wall with multiple thick cables of different colors entering and leaving it cleanly
Cloud

A short defense of the boring middleware

The interesting work in any AI system lives in the model and the application layer. The boring middleware between them (auth, rate limits, retries, logging, request shaping) is what makes the system actually work. That boring part is worth defending.

Sid Smith · 5 min read
AI

Self-hosted Forgejo and Harbor: the sovereign AI substrate

If your AI infra depends on third-party container images, you don't control your supply chain. The setup: Forgejo on store-01 as the source-of-truth git host, Harbor on engine-01 as the registry plus image-signing layer. This is the sovereign-infra argument, and why mirroring is now non-negotiable.

Sid Smith · 7 min read
AI

Running AI workloads on Kubernetes: patterns that hold up

Not every AI workload belongs on Kubernetes. Some belong nowhere else. This post covers the patterns that hold up (separating CPU and GPU tiers, sizing autoscaling for serving versus batch, picking the right foundation) and the ones that fall apart at the first real load.

Sid Smith · 7 min read
A heavy ornate brass padlock locked through a thick chain on a dark wooden surface, partially encircling a polished computer chip
AI

Vendor lock-in in the AI era is worse than 2010 cloud lock-in

Cloud lock-in in 2010 was bad. AI lock-in in 2025 is worse, for reasons most teams aren't thinking about. The data, the prompt patterns, the agentic surface, the fine-tunes: none of it ports cleanly. It's worth being clear about why before you commit.

Sid Smith · 5 min read
AI

CI/CD for AI models: the pipeline shape that holds up

Tekton, Argo CD, GitHub Actions, Jenkins X: four answers to model deploys. You can't unit-test a model, so eval suites become the test substitute. Also covered: versioning, rollback, and blue-green serving, with pipeline config as the Decisions as Code surface, projected per environment.

Sid Smith · 6 min read
A close-up of a metal server rack with neatly organized fiber-optic cables and faint glowing LEDs along the cable paths
Automation

Why your IaC pipeline is the right place to put AI

The infrastructure-as-code review pipeline is one of the highest-leverage places to deploy AI in an engineering org, and almost nobody is doing it well. The reasons it's underused are mostly accidental rather than principled.

Sid Smith · 5 min read
Six woven rope strands of different muted colors converging from different directions toward a single central knot on a dark fabric surface
Cloud

AWS Strands and the agent-framework wars

AWS shipped its own agent SDK two weeks ago, joining LangGraph, Semantic Kernel, the OpenAI Agents SDK, the Anthropic Agents SDK, and a half-dozen others. The question isn't which one's best. It's whether the framework layer is still where the differentiation happens at all.

Sid Smith · 4 min read
A vast factory floor with many small autonomous robotic figures moving in patterns and no human supervisors visible
AI

Microsoft Build 2025: agents everywhere, governance nowhere

Build was the agents conference Microsoft has been preparing for. The agents pitch lands. The governance story underneath it is exactly as unfinished as it was a year ago, and the contradiction is starting to show.

Sid Smith · 5 min read
An open vintage accountant's ledger with hundred-dollar bills tucked between pages and a brass calculator beside it
Cloud

Cost-modeling AI workloads with FinOps eyes

The per-token price is the easy line. Everything else (the retries, the context overhead, the agentic tool calls, the egress, the GPU reservation underneath the API) is where the actual bill comes from.

Sid Smith · 5 min read