A Holistic View of GitHub Copilot Security
Many people have wondered whether GitHub Copilot generates mostly secure or vulnerable code — and I’ll talk about that — but the field of security can, and should, encompass so much more. Copilot interacts with software reliability, privacy, legal risks (not just copyright), developer behavior, complicated ethical questions, and human power structures, right down to the FOSS ecosystem and the nature of our own jobs. If AI tools really have the power to fundamentally transform our civilization, then we who build and use AI owe it to ourselves to think beyond just how it works and what it can do. Let me give you a broad overview of all the areas we need to talk about to make sure that Copilot and tools like it serve our needs as people — which is what both “security” and “software engineering” really boil down to.
I’ll start by showing how Copilot performs when prompted with tasks that tend to produce vulnerable code, how well it finds and fixes existing vulnerabilities, and the ways your interactions with it can affect the quality of its results. I’ll look at how Copilot can enhance learning and improve practices like documentation and unit testing, but also show the pitfalls of relying on the AI for critical thinking. That will lead into the difficulties that arise from Copilot’s training data and from how Copilot fundamentally works: not just the quality of the data, or whether it could be poisoned or manipulated through prompt injection, but how OpenAI gathered the data in the first place. How do we create a sustainable, symbiotic relationship between FOSS developers who share code with each other, and large language models that need to ingest that code to generate more — especially when those models are proprietary and expensive? If I do my job right, you’ll leave with more questions than answers, and you’ll start talking them through with each other. It’s time to shape the world we want to live and code in.