| Management number | 231975903 | Release Date | 2026/06/18 | List Price | US$3.44 | Model Number | 231975903 | ||
|---|---|---|---|---|---|---|---|---|---|
| Category | |||||||||

Build a production-ready AI system with Google's most capable open-weight models. No API keys. No vendor lock-in. Your data stays on your hardware.

Google's Gemma 4 family gives you state-of-the-art reasoning in models you can download, inspect, and deploy anywhere. This book shows you exactly how to use them.

You will build one continuous project across all 14 chapters: a support ticket triage assistant that classifies issues, assigns priorities, generates summaries, extracts structured data, drafts replies, and flags escalations. Every code example connects to a real system you can extend and deploy.

What you will build:

- A domain model with strict type safety, enums, and validation
- Tokenization and chat template pipelines for Gemma 4's E4B, 26B MoE, and 31B dense variants
- Inference services with batched generation, decoding strategies, and JSON extraction
- Prompt engineering workflows with versioned templates and A/B evaluation
- A full triage pipeline: ingestion, PII redaction, classification, and escalation routing
- LoRA and QLoRA fine-tuning on your own labeled ticket data
- Evaluation harnesses with statistical rigor (McNemar's test, confusion matrices, failure mode analysis)
- Quantized inference with INT8 and NF4 for deployment on constrained hardware
- A FastAPI service with health checks, graceful shutdown, and artifact versioning

What makes this book different:

- One project, start to finish. No disconnected toy examples. Each chapter advances the same system from first inference to deployed service.
- Open weights, full control. Gemma 4 runs on your infrastructure. Process sensitive customer data without sending it to a third-party API.
- Production-first mindset. Memory estimation before model loading. Left-padding for batched generation. Temporal splits for honest evaluation. Safety architecture you own.
- Written by a practitioner. Hard-won lessons from real pipelines, not repackaged documentation.

Covers Gemma 4 models released April 2026, including the E4B edge model (128K context, 12GB VRAM), the 26B Mixture-of-Experts model (256K context, only 4B active parameters), and the 31B dense model for maximum accuracy.

Prerequisites: Intermediate Python. Basic familiarity with PyTorch tensors. No prior NLP or transformer experience required.

Stop paying per token. Start building AI systems you actually own.

| ASIN | B0GWS4FD1B |
|---|---|
| XRay | Not Enabled |
| Language | English |
| File size | 658 KB |
| Page Flip | Enabled |
| Word Wise | Not Enabled |
| Print length | 695 pages |
| Accessibility | Learn more |
| Screen Reader | Supported |
| Publication date | April 10, 2026 |
| Enhanced typesetting | Enabled |