largo.dev

Production AI tutorials on embeddings, transformers, retrieval, and deployment. Complete, runnable code.
https://largo.dev/
en-us

GPU Sizing for ML Workloads
https://largo.dev/tutorials/production-ml/gpu-sizing-for-ml/
Learn to calculate VRAM requirements, select the right AWS instance, and optimize costs. Includes real benchmarks and a Python sizing calculator.
Sat, 24 Jan 2026 00:00:00 GMT

Experiment Tracking with MLflow and Langfuse
https://largo.dev/tutorials/production-ml/experiment-tracking/
Set up experiment tracking for ML models with MLflow and LLM observability with Langfuse. Includes hyperparameter sweeps, model registry, and cost tracking.
Sat, 24 Jan 2026 00:00:00 GMT

CI/CD for Machine Learning
https://largo.dev/tutorials/production-ml/ml-cicd/
Build a complete ML pipeline with GitHub Actions: data validation, model training, automated testing, and staged deployment to production.
Sat, 24 Jan 2026 00:00:00 GMT

Model Serving on AWS
https://largo.dev/tutorials/production-ml/model-serving/
Deploy ML models to production with optimized inference: torch.compile vs ONNX benchmarks, FastAPI serving patterns, and AWS deployment options.
Sat, 24 Jan 2026 00:00:00 GMT

ML Monitoring and Drift Detection
https://largo.dev/tutorials/production-ml/ml-monitoring/
Monitor production ML models with data drift detection, performance tracking, and automated alerting. Includes working Python implementations.
Sat, 24 Jan 2026 00:00:00 GMT

ML Security Best Practices
https://largo.dev/tutorials/production-ml/ml-security/
Secure your ML infrastructure with IAM roles, secrets management, VPC configuration, and input validation. Practical patterns for production systems.
Sat, 24 Jan 2026 00:00:00 GMT

What It Takes to Be a Senior Machine Learning Engineer
https://largo.dev/articles/senior-mle-guide/
A roadmap to the skills, knowledge, and practices that separate senior MLEs from the rest, with links to hands-on tutorials for each area.
Sat, 24 Jan 2026 00:00:00 GMT

Building an AI Trading Agent with Claude and News Signals
https://largo.dev/tutorials/agents/trading-agent/
Build an automated trading agent that extracts market signals from news using Claude Haiku, executes trades via Alpaca, and manages positions with trailing stops and sentiment monitoring.
Sat, 17 Jan 2026 00:00:00 GMT

Cross-Attention Fusion: Combining Text Embeddings with Structured Features
https://largo.dev/tutorials/embeddings/cross-attention-text-tabular-fusion/
Concatenation is the default. Here's why cross-attention works better for combining text embeddings with tabular data, and how to implement it in PyTorch.
Wed, 14 Jan 2026 00:00:00 GMT

DeepSeek V3.2: Frontier Reasoning at 6x Lower Cost
https://largo.dev/tutorials/transformers/deepseek-v3-architecture/
Technical deep dive into DeepSeek V3.2's architecture: DeepSeek Sparse Attention (DSA), integrated reasoning with tool-use, and how it achieves IMO gold-medal performance.
Sat, 03 Jan 2026 00:00:00 GMT

2026 Frontier LLM Architectures: MLA, iRoPE, mHC, and the Race for Efficiency
https://largo.dev/articles/frontier-llm-architectures-2026/
Technical comparison of DeepSeek V3.2, Llama 4, Gemini 3, and Qwen3 architectures, plus DeepSeek's mHC innovation expected in V4.
Sat, 03 Jan 2026 00:00:00 GMT

Data Models for AI Applications: Pydantic vs Python Built-ins
https://largo.dev/tutorials/production-ml/data-models-for-ai-applications/
Compare Python's data modeling options for AI/ML applications. Learn when to use dataclasses, TypedDict, or Pydantic for API responses, embeddings metadata, and agent tool contracts.
Thu, 01 Jan 2026 00:00:00 GMT

CFP Oracle: Semantic Search for College Football History
https://largo.dev/tutorials/retrieval-systems/cfp-oracle/
Build a semantic search system to find historically similar College Football Playoff games using Amazon S3 Vectors and Bedrock embeddings.
Thu, 01 Jan 2026 00:00:00 GMT

Getting Started with Amazon S3 Vectors
https://largo.dev/tutorials/retrieval-systems/s3-vectors-getting-started/
Build a semantic search system using AWS's new serverless vector storage. Store millions of embeddings in S3 with sub-second query times and serverless pricing.
Wed, 31 Dec 2025 00:00:00 GMT

2025: The Year AI Got a Reality Check
https://largo.dev/articles/2025-year-in-review/
From DeepSeek's January bombshell to vibe coding going mainstream, here's what actually changed for AI practitioners in 2025.
Wed, 31 Dec 2025 00:00:00 GMT

Mamba for Predictive Maintenance: State Space Models vs Transformers
https://largo.dev/tutorials/production-ml/mamba-predictive-maintenance/
Compare Mamba's selective state space architecture against LSTM and Transformer for hard drive failure prediction. Learn when SSMs beat attention.
Mon, 29 Dec 2025 00:00:00 GMT

Build a Community Christmas Tree with AI-Generated Ornaments
https://largo.dev/tutorials/agents/christmas-tree-ornament-generator/
Create a shared Christmas tree where visitors add AI-generated ornaments using Amazon Nova Canvas, with defense-in-depth content moderation using Bedrock Guardrails and Claude.
Mon, 22 Dec 2025 00:00:00 GMT

Build a Holiday Cocktail Agent with TheCocktailDB
https://largo.dev/tutorials/agents/holiday-cocktail-agent/
Create an AI bartender that suggests cocktails based on weather, searches by ingredient, and generates party menus with shopping lists.
Mon, 22 Dec 2025 00:00:00 GMT

Building a Fishing Report Agent with AWS Strands
https://largo.dev/tutorials/agents/fishing-report-agent/
Create an AI agent that combines tide, weather, and marine data to generate fishing reports. Learn tool-calling patterns with the Strands SDK, NOAA APIs, and Claude on AWS Bedrock.
Sun, 21 Dec 2025 00:00:00 GMT

Bi-Encoders: Fast Semantic Search at Scale
https://largo.dev/tutorials/embeddings/bi-encoders-semantic-search/
Learn how bi-encoders enable sub-millisecond semantic search over millions of documents. Build a complete search system with sentence-transformers, FAISS indexing, and production-ready Python code.
Sun, 21 Dec 2025 00:00:00 GMT