ai, machine-learning, production, llm

From ML Experiments to Production AI: My Journey

I have been experimenting with machine learning since 2016, from TensorFlow models to LLM-powered agents. Here is what actually works in production versus what stays in notebooks.

My first real machine learning project was a TensorFlow model in 2016 that predicted server load patterns for a client's infrastructure. The model worked great in a Jupyter notebook. Getting it into production took three months of additional engineering. That gap between "works in a notebook" and "runs reliably in production" has defined most of my ML experience since.

Between 2016 and 2020, I built a handful of ML projects: sentiment analysis for customer feedback, image classification for a manufacturing client, and a recommendation engine for an e-commerce platform. Every project followed the same arc. The model training was the easy part. Data pipelines, model serving, monitoring for drift, and handling edge cases consumed 80% of the effort.
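To make "monitoring for drift" concrete, here is a minimal sketch of one common check, the population stability index (PSI), which compares the distribution of a feature at training time with what the model sees in production. The function name, the bin count, and the 0.2 alert threshold are illustrative assumptions, not details taken from any of the projects above.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float],
    live: Sequence[float],
    bins: int = 10,
) -> float:
    """PSI between a reference sample and a live sample of one numeric feature."""
    lo, hi = min(reference), max(reference)
    # Equal-width buckets over the reference range; fall back to 1.0 if all values are equal.
    width = (hi - lo) / bins or 1.0

    def bucket_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp out-of-range live values into the first or last bucket.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor keeps log() defined when a bucket is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    ref = bucket_shares(reference)
    cur = bucket_shares(live)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Assumed rule of thumb: treat a PSI above 0.2 as a signal worth investigating.
DRIFT_THRESHOLD = 0.2
```

A scheduled job could compute this per feature against a stored training sample and raise an alert when the value crosses the threshold. The right threshold and bin count depend on the data, so treat both as starting points.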

The LLM wave that started in 2023 changed my perspective on AI in production. Instead of training custom models for each use case, you could prompt a foundation model and get useful results immediately. The tradeoff is cost and latency instead of training time and data requirements. For most business applications, that tradeoff is worth it.

I have integrated Claude and GPT-4 into several production applications over the past two years. The pattern that works best is using LLMs for tasks where "pretty good" is acceptable and a human reviews the output: content summarization, code review suggestions, and data extraction from unstructured text. In each of these cases, the AI handles the bulk of the work and a human catches the mistakes.
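The review pattern can be sketched as a small queue in which nothing the model writes is published until a person approves it. This is an illustration under assumptions, not the code from those applications: `Draft`, `ReviewQueue`, and the `generate` callable are hypothetical names, and `generate` stands in for whatever wrapper you have around the model call.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Draft:
    source_text: str
    model_output: str
    status: str = "pending"  # pending -> approved | rejected
    final_text: Optional[str] = None


@dataclass
class ReviewQueue:
    # Hypothetical wrapper around an LLM call: takes input text, returns model output.
    generate: Callable[[str], str]
    drafts: List[Draft] = field(default_factory=list)

    def submit(self, source_text: str) -> Draft:
        """Run the model and park its output until a human looks at it."""
        draft = Draft(source_text, self.generate(source_text))
        self.drafts.append(draft)
        return draft

    def approve(self, draft: Draft, edited_text: Optional[str] = None) -> None:
        """Accept the output as-is, or accept the reviewer's corrected version."""
        draft.final_text = edited_text if edited_text is not None else draft.model_output
        draft.status = "approved"

    def reject(self, draft: Draft) -> None:
        draft.status = "rejected"

    def publishable(self) -> List[str]:
        """Only human-approved text ever leaves the queue."""
        return [d.final_text for d in self.drafts if d.status == "approved"]
```

Keeping both `model_output` and `final_text` also gives you a record of how often reviewers had to correct the model, which is a useful quality signal over time.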

ClawDE is my most ambitious AI project. Building an AI agent system taught me that the real challenge is not the model but the orchestration layer: managing context, coordinating parallel agents, handling failures gracefully, and keeping costs predictable. Those are engineering problems, not ML problems, and that is where my 17 years of software engineering experience matters more than any ML specialization.
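As a rough illustration of three of those concerns (parallel agents, failure handling, and cost control), here is a minimal asyncio sketch. It is not ClawDE's implementation. `Orchestrator`, `BudgetExceeded`, and the agent signature (an async callable returning a result and its cost in USD) are assumptions made for the example, and context management is left out entirely.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple

# Assumed agent shape: an async callable that takes a task and returns (result, cost_usd).
Agent = Callable[[str], Awaitable[Tuple[str, float]]]


class BudgetExceeded(RuntimeError):
    """Raised when no budget is left to start another agent call."""


class Orchestrator:
    def __init__(self, budget_usd: float, max_retries: int = 2,
                 max_parallel: int = 4, backoff_s: float = 0.5):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0
        self.max_retries = max_retries
        self.max_parallel = max_parallel
        self.backoff_s = backoff_s

    async def _run_one(self, agent: Agent, task: str, slots: asyncio.Semaphore) -> str:
        async with slots:  # cap how many agents run at once
            for attempt in range(self.max_retries + 1):
                # Checked before each call, so in-flight calls can still overshoot slightly.
                if self.spent_usd >= self.budget_usd:
                    raise BudgetExceeded(f"spent {self.spent_usd:.2f} USD")
                try:
                    result, cost = await agent(task)
                except Exception:
                    if attempt == self.max_retries:
                        raise
                    # Exponential backoff; failed attempts are not charged in this sketch.
                    await asyncio.sleep(self.backoff_s * 2 ** attempt)
                    continue
                self.spent_usd += cost
                return result
        raise AssertionError("unreachable")

    async def run_all(self, agent: Agent, tasks: List[str]) -> list:
        slots = asyncio.Semaphore(self.max_parallel)
        # return_exceptions=True keeps one failed task from cancelling the rest.
        return await asyncio.gather(
            *(self._run_one(agent, t, slots) for t in tasks),
            return_exceptions=True,
        )
```

Calling `asyncio.run(Orchestrator(budget_usd=5.0).run_all(my_agent, tasks))` returns one entry per task, either a result or the exception that task ended with, so the caller decides how to handle partial failure. A real system would also need to account for the cost of failed calls and persist spend across runs.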