OpenAI 'Codex' Is An AI Capable Of Translating Natural Language To Programming Code


Artificial Intelligence is intelligence demonstrated by machines, as opposed to the natural intelligence shown by humans or animals.

While intelligence in living beings has been linked to the total volume of the brain, as well as the combination of the number of cortical neurons, neuron packing density, interneuronal distance and axonal conduction velocity, intelligence in machines has yet to be fully understood.

Among those working in the AI field is OpenAI.

The research company took its AI research to the next level when it announced 'Codex'.

The AI system can translate natural language to programming languages.

Initially, the system is being released as a free API.

Codex can be considered the next-step product for OpenAI, because it is an evolution of Copilot, the AI coding assistant that OpenAI developed in partnership with GitHub, Microsoft's code-hosting subsidiary.

With Copilot, users get code suggestions similar to an auto-complete feature. Copilot can be described as a virtual version of what developers call a "pair programmer," a term for two developers working side by side, collaboratively, on the same project.

However, Copilot can only help users finish lines of code.

Codex, on the other hand, takes this a step further: it can accept sentences written in English and translate them into runnable programming code.

For example, users can ask it to create a web page with a certain name at the top and four evenly sized panels below, numbered one through four.

From that command, Codex would then attempt to create the page by generating the necessary code in whatever programming language it thinks is appropriate.

Users can also give Codex additional commands for more precise results.
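For developers in the API beta, a request like that is simply a plain-English prompt sent to the Codex completion endpoint. The snippet below is a minimal sketch of such a call, assuming the beta-era openai Python client and the "davinci-codex" engine name used during the private beta; actual engine names, parameters and access depend on the user's account.

import openai

openai.api_key = "YOUR_API_KEY"  # assumption: an API key from the Codex beta

# Plain-English description of the page, plus an extra command for precision.
prompt = (
    "<!-- Create a web page with the name 'My Site' at the top and "
    "four evenly sized panels below, numbered one through four. "
    "Make each panel's background light blue. -->\n"
)

response = openai.Completion.create(
    engine="davinci-codex",  # beta-era Codex engine name (assumption)
    prompt=prompt,
    max_tokens=400,
    temperature=0,           # keep the generated code deterministic
)

print(response["choices"][0]["text"])  # the generated HTML/CSS

Wrapping the description in an HTML comment is one common way of hinting to Codex which language is expected.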

What users have to do is shift the act of writing code toward describing it in words: (1) break a problem down into simpler problems, and (2) map those simpler problems onto code (libraries, APIs, or functions) that already exists, as the sketch below illustrates.
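As a hand-written illustration of that habit (not actual Codex output), the hypothetical task "count how often each word appears in a text file" breaks into three smaller problems, each of which maps onto code that already exists in Python's standard library:

# Task: count how often each word appears in a text file.
# (1) Break it down: read the file, split it into words, tally the words.
# (2) Map each piece onto existing code: pathlib for reading,
#     str.split for splitting, collections.Counter for tallying.
from collections import Counter
from pathlib import Path

def word_counts(path):
    text = Path(path).read_text(encoding="utf-8")  # read the file
    words = text.lower().split()                   # split it into words
    return Counter(words)                          # tally them

# Hypothetical usage:
# print(word_counts("notes.txt").most_common(5))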

According to OpenAI in its blog post:

"Codex is the model that powers GitHub Copilot, which we built and launched in partnership with GitHub."

"Proficient in more than a dozen programming languages, Codex can now interpret simple commands in natural language and execute them on the user’s behalf—making it possible to build a natural language interface to existing applications."

What makes Codex so capable is that it is built on OpenAI's GPT-3, which can generate natural language in response to a natural language prompt.

OpenAI said that Codex has "much of the natural language understanding of GPT-3," but that it applies that understanding to producing working code.

But because it's a general-purpose programming tool, Codex can be applied to virtually any programming task.

"OpenAI Codex is most capable in Python, but it is also proficient in over a dozen languages including JavaScript, Go, Perl, PHP, Ruby, Swift and TypeScript, and even Shell. It has a memory of 14KB for Python code, compared to GPT-3 which has only 4KB—so it can take into account over 3x as much contextual information while performing any task."

While Codex is a next-step evolution of OpenAI's previous work, and the company has indeed shown it to be highly capable, it does come with some drawbacks.

In a paper published by the researchers at OpenAI, it's revealed that Codex might have significant limitations, including bias issues and sample inefficiencies.

The company’s researchers found that the model can propose syntactically incorrect or undefined code, invoking variables and attributes that are undefined or outside the scope of a codebase.

Making things worse, the researchers said that Codex can sometimes suggest solutions that appear superficially correct but don’t actually perform the intended task.
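As a hand-written illustration of that failure mode (not an actual Codex output), a prompt asking for a list "sorted from largest to smallest" could plausibly come back as code that runs cleanly but sorts the wrong way:

# Hypothetical prompt: "Return the list sorted from largest to smallest."
def sort_descending(numbers):
    # Looks reasonable and runs without errors...
    return sorted(numbers)  # ...but this sorts ascending, not descending.

# What the prompt actually asked for:
def sort_descending_correct(numbers):
    return sorted(numbers, reverse=True)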

For example, when asked to create encryption keys, Codex can select “clearly insecure” configuration parameters in “a significant fraction of cases,” and can also recommend compromised packages as dependencies.
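As an illustration of that kind of insecure parameter choice (again hand-written, not Codex output), generating an RSA key with a 1024-bit modulus is widely considered too weak today, while 2048 bits is the usual minimum; the sketch below assumes the third-party cryptography package:

from cryptography.hazmat.primitives.asymmetric import rsa

# The kind of "clearly insecure" parameter the paper warns about:
weak_key = rsa.generate_private_key(public_exponent=65537, key_size=1024)

# A safer choice: a 2048-bit (or larger) modulus.
strong_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)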

Just like any other AI model, Codex's knowledge is limited to its training data. And just like others, Codex's training data is biased.

Specifically, OpenAI found that Codex can be prompted to generate racist and otherwise harmful outputs as code. Given the prompt “def race(x):,” Codex generates code that assumes a handful of mutually exclusive race categories, with “White” as the most common, followed by “Black” and “Other.” It can also include words like “terrorist” and “violent” when writing code comments with the prompt “Islam.”

The project has a long way to go, and OpenAI said that it is "taking a multi-prong approach" to solve these issues.

Read: OpenAI's GPT-3 Has A 'Persistent Anti-Muslim Bias', Research Found

Published: 15/08/2021