Why are programming languages different from natural languages?
AKA: Why can I write PHP but am unable to count in Russian?
Natural languages have been around for millennia. They have grown, evolved, spread, split into new ones or disappeared with the people who spoke them… Even though programming languages are much more recent, some, like Lisp or Fortran, are now in their sixties (and C is approaching fifty). Most of the time, we think of developers more as scientific than literary people. But why then are they called programming languages, and not programming sciences? What are the connections and differences between natural and programming languages?
In this article, I’ll give an overview of how I approach learning a new language, what I believe programming languages are mostly about and to what extent coding requires a specific way of thinking.
Prelude
When I tried to learn various natural languages, I usually figured I would need to learn two or three types of things:
- for languages using another alphabet (so, in my case, a non-Latin alphabet), the unique characters, kanji, etc.
- how the language works: the syntax, the grammar, the verb tenses and other particularities of the language structure
- lots and lots of vocabulary!
Everyone is different and, after talking with several people, I’ve noticed that each of us struggles at a different step. A friend of mine has learnt both Japanese and Korean and told me that, while learning the characters takes a bit of time, the hardest part for him was knowing enough vocabulary to have real discussions. Conversely, someone else told me that they were way better at remembering words but really struggled with the inner workings of a language. I myself am quite quick at picking up grammar or syntax rules, but I often forget the words!
In truth, it’s all about practice. For example, even though I’ve known the basics of English grammar for a while, spending a few months in the UK cemented my “university” knowledge of English and I’m now much more fluent, because I had tons of opportunities to repeat or discover words.
I hardly recall my German lessons from five years ago, and the most I remember is where the words should go (as for finding the words per se, well…). On the other hand, I use between two and five programming languages daily, and I’ve encountered more in my life, be it at work or for my personal projects. I find I rarely have difficulties writing the program — the complexity is usually more in the algorithm itself.
That got me thinking: how come I’m able to remember and write programs without too much of a hassle in about 10 programming languages, but I only consider myself a proper French and English speaker? We’ve all heard that “as soon as we learn three languages, it becomes easier to pick up the next one”… so, can’t my knowledge of computer science help with the natural languages I’ve desperately tried to learn (such as Russian or Japanese)?
Programming languages are about structure
I think the biggest difference between natural and programming languages is that the latter have been constructed specifically for communicating with machines and therefore only require quite a limited set of words (in addition to using the Latin alphabet, which is peachy!). If you’ve taken some hard computer science classes, you might have dived into low-level concepts and discussed notions like “assembly language”, “instruction set”, “binary output” or “compilation”. Those are a lot of complicated terms that would each require full-fledged articles, but here, I think we can tie them to the following idea: programming languages are about translating (or rather encoding) human thoughts into basic instructions so that a machine which only understands zeros and ones can (re)act on them.
Note: to relate this sentence to the previous concepts: “compilation” is the actual translation process that takes your human-written file and turns it into its binary format; the “binary output” is the result of this process; the closest thing to binary we have are “assembly languages”, which directly use the computer’s custom “instruction set” to describe the actions and thus require minimal translation to binary.
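You can get a rough feel for this idea in Python: it is not a compiled-to-binary language, but its interpreter does first compile your source into lower-level instructions (bytecode for Python’s virtual machine, not raw machine code, so take this as an analogy only). The standard dis module lets you peek at that lower-level form:

import dis

def add(a, b):
    return a + b

# Prints the low-level instructions this function was compiled to,
# one per line, much like a (virtual) instruction set listing.
dis.dis(add)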
As a computer science student, you’re often asked to write programs in pseudo-code. For example, you’ll write things like:
Get the list of groceries
FOREACH item in the list
    IF you don't have it yet
    THEN Take the item
    ELSE Skip the item
Smile to the cashier
Pay the required amount
This is not a true computer script, but it highlights something crucial: learning programming is not about learning specific words, it’s about learning structures. This is why we can teach “pseudo-code” instead of “real code” and still present programming concepts. If instead of “item” you want to say “product”, this set of instructions will still work and people will still understand you. The only words that cannot be switched out for other ones are the ones in uppercase in the previous snippet: in programming languages, those are called “keywords”. The reason they can’t be replaced is that they intrinsically hold structural meaning: the “FOREACH” keyword directly translates to the idea that you will process a list of items and do something for each of them; the “IF” keyword is associated with a conditional branching that will take your program down one road or the other.
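To make this concrete, here is a rough sketch of how the pseudo-code above could translate into Python (the names groceries, pantry and cart are mine and entirely swappable; the keywords for, if and else are not):

groceries = ["milk", "bread", "apples"]
pantry = {"bread"}   # what you already have at home
cart = []

for item in groceries:
    if item not in pantry:
        cart.append(item)   # Take the item
    else:
        pass                # Skip the item

print(f"Smile to the cashier and pay for {len(cart)} items.")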
Note: we see the exact same thing with cooking recipes — many words are used across various recipes to make quite different dishes; actions like “heat”, “chop” or “slice” can be customized to fit various purposes and create unique results! With a common set of keywords, you can in the end get thousands of soups and gratins…
I believe this is why developers like me are able to “remember” many programming languages: we don’t learn by heart hundreds of programs, and we don’t even learn by heart dozens of specific words — we learn patterns, structures and general rules that are embodied by the language’s keywords. (The good news being that programmers are somewhat smart people who have cleverly decided to share most of the keywords between languages, so you don’t even need to learn an entirely new set of instructions when switching from one programming language to another!)
A few weeks ago, I wrote about learning to teach and learning to learn. In particular, I pointed out that understanding a topic deeply enough allows you to extract global schemas from it and avoid remembering specific examples; instead, you can remember one method and apply it to several problems. It’s the same for programming languages — most programming courses have very similar chapters (at least at the beginning): what is the language best suited for? how do you print “hello world”? how do you declare and store variables? how do you ask the user for an input and program a guessing game? how do you read and write files? And then, for the more graphical ones, you’ll get a chapter on GUI building.
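As an example, here is roughly what that classic “guessing game” chapter boils down to in Python (a minimal sketch of my own; the bounds and messages are arbitrary):

import random

# The computer picks a number; the user narrows it down with hints.
secret = random.randint(1, 100)
guess = None

while guess != secret:
    guess = int(input("Guess a number between 1 and 100: "))
    if guess < secret:
        print("Too low!")
    elif guess > secret:
        print("Too high!")

print("You got it!")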
Of course, I’m not saying all programming languages are the same! Each has its own goals and recommended usages, its strengths and its weaknesses, its history and its upcoming roadmap. What I’m saying is that they all rely on an “appeal to structure” rather than an “appeal to words”, and I think this is what brings developers together: from what I’ve experienced so far, most of us have this “analytical and scientific” approach to things, we like to deconstruct, rebuild, disassemble, remake from scratch.
Are developers a unique kind?
Earlier this month, I wrote an article about the no-code movement, this new growing trend that generates more and more tools to give everyone an opportunity to create apps or websites. Even if the philosophy in itself is very interesting and has great potential, one of the reasons why I think we should be wary of embracing it too quickly is precisely that it shortcuts this learning of structure.
Nobody is a born computer scientist. You don’t have a natural predisposition for writing C++ classes and I don’t know of any Fortran wizards who woke up one morning and wrote down a perfect physics modeling script. It takes time and effort to truly understand all the mechanics under the hood; and “computers” is so vast a subject that there are literally hundreds of ways of “mastering computers”: is it about knowing the ins and outs of Intel hardware? or about setting up virtual machines and Kubernetes pods? or about writing flawless Golang goroutines and defers? or about top-notch ReactJS lazy loading and optimization?
Still, programmers usually have a common sense of architecture and design. They are able to break a problem down into processable pieces and then apply various tools (or call their colleagues if they’re not pros at it) to tackle the issue.
No-code tools are an abstraction that helps attract non-experts and provides a quick-and-easy way to prototype ready-to-use products. The problem is that they do so by hiding away the “grammar” of programming, this whole first step of architecture and design. Instead, they present you with the “vocabulary” and you’re mostly blindly following the lessons to guess where these words should go in the code “sentences”.
Think of an app maker, like Google’s — this tool invites you to drag and drop widgets on your screen and assign very basic hooks to them so as to create a very simple app logic: “when the user performs an action, this happens in reaction”. This is completely fine for small products, and once again I’m really interested to see how this no-code movement could spark creativity and welcome more people to the world of development. But widgets are small isolated words. You write parts of sentences and you never see the whole thing. Staying with Google’s App Maker, the tutorials are quite indicative of this “hide-away” philosophy.
One of their screenshots shows snippets of code that you can copy and paste into your demo app. It will work perfectly, no worries there. But how obvious is it to someone who has never coded before (this is the targeted audience, right?) that this app page is a hierarchical layout with a “root” element that has “descendants”, referenced by unique ids so they don’t collide, which in turn may have children widgets or editable properties? Moreover, where does the app.pages concept come from?
I know I’m probably nitpicking here. Those tutorials are really nice and it is amazing that people can go from zero to a fully functional webapp in a matter of hours, even if they don’t have any experience with development! However, I’m still cautious of the “magical results” you can get with it — my guess is that it’s nearly impossible for someone who is not a programmer to transfer what they’ve learnt here to another app or another tool, because they will be lacking the core structure, the “grammar and syntax”, and only know some “vocabulary”.
So, are words meaningless in computer science?
At this point, it might seem like I’m preaching for only one thing: patterns. And it’s true I have some family history that pushes me towards the idea that we could analyze even natural languages to extract blueprints and general grammar structures. To some extent, I’d be fascinated to see “empty shells” with placeholder word functions and relationships that could be “instantiated” with real words to directly create a sentence…
But linguistics isn’t that simple and we are far from having such a comprehensive knowledge of the topic! In actuality, “vocabulary” is not just a dictionary of words, because words are used in a context and, when joined together in sentences, they convey more than their plain definition.
In code, we have the aforementioned keywords, but we still use natural language, in particular for variable names or comments.
The quarrel of comments: structure versus freedom
There is a friendly quarrel in programming about comments. Comments have quite a unique role in scripts: they are sentences that are ignored by the computer and are only useful to the human readers who will glance at the program. They are essentially a “blind spot” for the programming language itself because, contrary to the rest of the code, they won’t be translated into actual actions on the computer’s part. You can write whatever you like, with whichever structure you choose, regardless of the overall context.
So there is this argument among developers: should you even write comments?
- some programmers advocate for it: comments are not as constrained and can therefore be used to make the developer’s intention more explicit, to warn readers of specific choices or even to remind your future self of why you did things that way
- others cynically say that the code should speak for itself and that if it’s not clear just by looking at the instructions, then your logic is too convoluted (or flawed) anyway
The former find in natural languages a form of freedom: comments are usually written in English or in the language spoken among the team of coders, so you simply revert back to your “day-to-day way of communicating”, which can make things easier. The latter precisely criticize this laziness and focus on the structurally sound (but limiting) syntax of the programming language at hand.
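To illustrate with a toy example of my own, here are the two philosophies applied to the same few lines of Python:

# "Pro-comments" style: the comment states the intent in plain English.
def price(p, q, r):
    # Apply the loyalty discount rate r to the total basket price.
    return p * q * (1 - r)

# "Code speaks for itself" style: the names do the talking instead.
def discounted_basket_price(unit_price, quantity, discount_rate):
    return unit_price * quantity * (1 - discount_rate)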
From Bloom filters to NLP AI models
The funny thing is that, to me, vocabulary is at the same time the easiest and the hardest part of learning a new natural language! On the one hand, it’s only about memorizing words, so there is nothing fancy to this learn-by-heart process. On the other hand, to truly master a language, you need to use it every day, because there are so many words in every language that you are likely to either completely miss some, forget plenty or misuse several if you don’t practise regularly.
For computers, things are virtually reversed: while “learning by heart” a list of words is straightforward (it’s basically copying a sequence of characters from one memory chip to another), understanding a language in an analytical way is pretty complicated at the moment — this is the goal of the subdomain of AI called natural language processing (or NLP).
Let’s take an example. Suppose I give you this text: “Bob is eating an apple”.
Chances are that you will quite quickly pick up on the different “roles” of the words in the sentence:
- “Bob” is a person, the subject that performs the action
- “is eating” is the action currently being performed
- “an apple” is the object of the action, it is what the subject is working on, so to speak
Humans take a few years to learn to speak properly, but even a child is able to do this classification.
If you ask an AI to do the same, then it will be specifically performing a “named entity recognition” task. And it might get some good results, because we have some amazing algorithms nowadays (if you want to learn more, I encourage you to take a look at OpenAI’s GPT-3 model!). But it will need hundreds and hundreds of hours of training on thousands of data samples; it will need lots of GPUs and computing power; it will need experts to tune high-level algorithms.
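For reference, here is roughly what this looks like with an off-the-shelf NLP library in Python (a sketch using spaCy, assuming its small English model en_core_web_sm has been downloaded):

import spacy

# Requires: pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
doc = nlp("Bob is eating an apple")

for token in doc:
    # Print each word with its part of speech and syntactic role
    print(token.text, token.pos_, token.dep_)

# Named entity recognition: "Bob" should come out labeled as a person
print([(ent.text, ent.label_) for ent in doc.ents])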
In comparison, spellcheckers have been around for a long time: a common implementation uses Bloom filters, first conceived by Burton Bloom in 1970. So, verifying whether a word exists or not is a piece of cake for a computer!
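For the curious, here is a minimal (and deliberately naive) Python sketch of a Bloom filter used as a spellchecker dictionary; the sizes and hashing scheme are arbitrary choices of mine:

import hashlib

class BloomFilter:
    # A bit array plus k salted hashes: may yield false positives,
    # never false negatives.
    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _indexes(self, word):
        # Derive k bit positions by hashing the word with different salts.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{word}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, word):
        for idx in self._indexes(word):
            self.bits[idx] = True

    def might_contain(self, word):
        return all(self.bits[idx] for idx in self._indexes(word))

dictionary = BloomFilter()
for word in ("apple", "eating", "bob"):
    dictionary.add(word)
print(dictionary.might_contain("apple"))  # True
print(dictionary.might_contain("aple"))   # almost certainly False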
To conclude…
In computer science, we as human programmers rely on structure (the keywords and syntax of the programming language) and adapt it to a specific task with vocabulary (variable names or comments) to produce a text that can be translated for a computer; in parallel, computers themselves are gradually learning about natural languages and may one day meet us halfway.
Just like natural languages have evolved through the course of centuries, we now have “generations” of programming languages and algorithms. I’m excited to see what the future has in store for us: will we have AIs that converse with us naturally and brilliantly pass the famous Turing test? computers that are able to read natural languages and render programming languages useless? or will we popularize this way of thinking so common among developers that makes us analyze and decompose problems into a core structure?
References
- Google App Maker: https://developers.google.com/appmaker
- OpenAI’s website: https://openai.com/
- M. J. Garbade, “A Simple Introduction to Natural Language Processing” (https://becominghuman.ai/a-simple-introduction-to-natural-language-processing-ea66a1747b32), October 2018. [Online; last accessed 21-September-2020].
- Wikimedia Foundation, “Kanji” (https://en.wikipedia.org/wiki/Kanji), September 2020. [Online; last accessed 21-September-2020].
- Wikimedia Foundation, “Bloom filter” (https://en.wikipedia.org/wiki/Bloom_filter), September 2020. [Online; last accessed 21-September-2020].
- Wikimedia Foundation, “Turing test” (https://en.wikipedia.org/wiki/Turing_test), September 2020. [Online; last accessed 21-September-2020].