Cloud

139 posts
AI

The enterprise AI stack: substrate, platform, applications

Three layers, top to bottom: applications (the AI features users see), platform (model registry, serving, observability, governance), substrate (K8s, GPUs, storage). Decisions as Code runs through every layer. Centralize the decisions. Project them everywhere.

Sid Smith, 7 min read
Cloud

Why traceability dies in most platforms

Every platform starts with traceability as a goal. Most lose it by month six. The predictable failure modes: log-format drift, ID-namespace collisions, the 'we'll add structured logging later' debt, and the asymmetric incentive to write logs but never read them. What survives, and why.

Sid Smith, 8 min read
AI

Argo Workflows for AI pipelines: RAG indexing, fine-tuning, eval suites

Argo Workflows for the long-running, branching, fan-out AI ops pipelines that don't fit a CI runner: RAG indexing jobs, fine-tuning runs, eval suite execution. WorkflowTemplates as the Decisions as Code surface: same pipeline shape, different inputs per project.

Sid Smith, 6 min read
Cloud

When HIPAA shapes the system before you write a line of code

Most teams treat HIPAA as a feature you bolt on at the end. The teams who've actually shipped under it know it shapes the data model, the audit log, the auth model, and the deletion semantics before you draw a single arrow on the diagram.

Sid Smith, 8 min read
Cloud

GPUaaS in late 2025: who's left and what they cost

The GPU-as-a-service market consolidated through 2025. The crowded field of neoclouds is smaller than it was, and the surviving providers are more differentiated. A survey of who's still around and what they actually cost.

Sid Smith, 5 min read
AI

Observability for AI workloads: Prometheus, Grafana, Loki

Latency, token throughput, cost-per-request, queue depth, drift. Prometheus for the metrics, Loki for the prompts and responses (with PII discipline), Tempo for tracing across agent calls. The dashboard JSON itself as a Decisions as Code surface, projected per environment.

Sid Smith, 6 min read
AI

Vector databases on Kubernetes: Qdrant, Weaviate, Milvus

Qdrant vs Weaviate vs Milvus on K8s: the foundation question for retrieval. StatefulSets, persistent volumes, replication, and the operational reality. RAG indexing patterns at homelab scale on engine-01, and the decisions that change shape at fleet scale.

Sid Smith, 6 min read