
Your developers are already running AI locally: Why on-device inference is the CISO’s new blind spot
For the last 18 months, the CISO playbook for generative AI has been relatively simple: Control the browser. Security teams tightened cloud access security broker (CASB) policies, blocked or monitored traffic to well-known AI endpoints, and routed usage through sanctioned gateways. The operating model was clear: If sensitive data leaves the network for an external API call, we can observe it, log it, and stop it. But that model is starting to break. A quiet hardware shift is pushing large language model (LLM) usage off the network and onto the endpoint. Call it Shadow AI 2.0, or the “bring your own model” (BYOM) era: Employees running capable models locally on laptops, offline, with no API c...