Trend Micro Inc.

01/14/2022 | News release | Distributed by Public on 01/14/2022 08:15

Codex Exposed: How Low Is Too Low When We Generate Code?

In June 2020, OpenAI released version 3 of its Generative Pre-trained Transformer (GPT-3), a natural language transformer that took the tech world by storm with its uncanny ability to generate text seemingly written by humans. But GPT-3 was also trained on computer code, and recently OpenAI released a specialized version of its engine, named Codex, tailored to help - or perhaps even replace - computer programmers.

In a series of blog posts, we explore different aspects of Codex and assess its capabilities with a focus on the security aspects that affect not only regular developers but also malicious users. This is the second part of the series. (Read the first part here.)

Because it is based on GPT-3, Codex has benefited from being trained with a massive codebase taken from all over the internet, in virtually every programming language available. However, natural languages and programming languages do not share the same properties. And while natural languages can benefit from the generally higher flexibility in understanding concepts of the human mind, programming languages are bound to more rigid constructs and less forgiving principles, dictated by their respective interpreters or compilers.

For high-level programming languages, it is reasonable to expect that the same statistical modeling that works so well in GPT-3 for natural languages would bear similar advantages for its code generation capabilities. Codex, however, lacks some of the essential constructs required for a real "understanding" of programming languages, such as their abstract syntax tree or the computational architecture of the target machine.

So, how far can we push code generation while still being effective?

The imitation game: Codex's ability to understand low-level code

To assess how deep Codex's understanding of its generated code is, we tested its understanding of assembly language. We chose this to get the farthest away from natural language and the closest to the machine. We tested Codex by giving it assembly code samples, which we asked it to translate to ordinary language.