Forth Readability

last modified: December 5, 2014

Programs written in the ForthLanguage can be hard for some programmers to read for these reasons:

All of these problems can be surmounted by anyone willing to learn Forth. But it is certainly true that someone who only knows C and Java probably won't be able to read the average Forth program at first glance.

Inflammatory material moved to ForthFlames -- verec 27Aug2001

One aspect of common Forth practice that affects readability is the use of Forth as a MetaLanguage. In other words, many Forth programmers use the ForthLanguage to define domain-specific or application-specific languages for solving particular problems, and then write the high-level definition of the applications in those languages. This leads to a lot of inconsistency of style, syntax, and semantics. For example, a Forth program that analyzes astronomical data may look absolutely nothing like a Forth program that controls industrial robots. If the programmer does not document the created language, and if the reader is not familiar with the domain, then it can be difficult for the reader to understand the meaning of the application-specific language. And if the programmer is not a skilled language designer, a big mess can result. --KrisJohnson

Compare: Many C programmers use C to define domain-specific or application-specific libraries for solving particular problems, and then write the high-level definition of the applications using those libraries. This leads to inconsistencies of style, syntax, and semantics. For example, a C program that analyzes astronomical data may look absolutely nothing like a C program that controls industrial robots. If the programmer does not document the created library, and if the reader is not familiar with the domain, then it can be difficult for the reader to understand the meaning of the library. And if the programmer is not a skilled library designer, a big mess can result.

Should we be surprised that not understanding the domain, nor the code to act in it, would confuse the programmer, regardless of the language? --BillTrost

The difference is that in a C program, no matter who writes it, function definitions look like function definitions, function calls will look like function calls, array operations will look the same, arithmetic operators work the same, pointers work the same, #include directives work the same, and so on. A Forth programmer is more likely to invent a whole new "syntax" for a domain-specific language, which will be gibberish to one who has not figured out the programmer's intention.

The difference is that in a Forth program, no matter who writes it, words are always whitespace delimited. The uniform method by which both variables and functions are referenced leads to highly expressive programs, without rival at all until the advent of FunctionalProgramming as a household term. As a result, once you know the notation, Forth reads like prose, not like mainframe-era job control language. Array operations all look the same unless factored out (+ @ or + !), arithmetic operators work the same, pointers are all Forth knows so they must work the same, Forth uses requires or load to hierarchically load dependent modules (though this is bad form; structuring your code like a book is preferred), and so on. A Forth programmer is likely to invent a whole new syntax for a domain-specific language because that allows the most compact and correct representation of a problem's solution, which is gibberish only to those who have not bothered to read the documentation of the program. A C programmer has consistent syntax, but as a result, you lose the ability to express the problem at the problem's most natural level of abstraction. You end up with a total mess of a program, where the details of syntax often weigh more heavily on the program's design than the intended problem solution! -- SamuelFalvo
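As a rough illustration of the uniform-reference point and the "+ @" idiom, here is a minimal sketch. It assumes a standard Forth such as gforth. The names samples, #samples, bias, sample, adjusted and total are invented for this example and do not come from any program discussed on this page.

```forth
\ Hypothetical example; every name below is made up for illustration.
create samples  10 , 20 , 30 ,          \ a three-cell array
3 constant #samples                     \ a constant
variable bias   5 bias !                \ a variable

\ The "+ @" array idiom, factored out into one named word.
: sample ( index -- n )  cells samples + @ ;

\ Arrays, constants, variables and colon definitions are all
\ referenced the same way: by writing the word's name.
: adjusted ( index -- n )  sample  bias @ + ;
: total ( -- n )  0  #samples 0 do  i adjusted +  loop ;
```

With these definitions loaded, 1 sample leaves 20 on the stack and total leaves 75.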

In many programming languages, variable names carry a lot of the clues about what the code is doing. In Forth, this is missing. That's one reason why Forth style tends to lots and lots of small words - you use just as many names to explain what's going on, but you keep them in different places.
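A small sketch of what "keeping the names in different places" can look like. Both versions compute the same thing. All names here (x, y, squared, sum-of-squares-v, sum-of-squares) are hypothetical and chosen only for illustration.

```forth
\ Hypothetical example: both words compute x*x + y*y.

\ Style 1: variable names carry the meaning.
variable x   variable y
: sum-of-squares-v ( x y -- n )
  y !  x !
  x @ x @ *   y @ y @ *   + ;

\ Style 2: no variables; the word name "squared" carries the meaning instead.
: squared ( n -- n*n )  dup * ;
: sum-of-squares ( x y -- n )  squared  swap squared  + ;
```

Either way, 3 4 followed by the word leaves 25. The second version has no variable names at all, but it gained one more word name.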

Interesting observations, but somewhat skewed. I am amused that I almost never see these criticisms of AssemblyLanguage. Perhaps that's because ASM isn't *supposed* to be a "high level" language.

Kris' observations above are very pertinent. Forth is, in many ways, a "language assembly" kit (vs AssemblyLanguage). It has all the low-level power you need and the power to spin really high-level abstractions.

No seatbelts (just like AssemblyLanguage). You don't want to put this kind of raw power into careless or clumsy hands. Though those who are careless and clumsy either learn from their mistakes really fast, or move on to other languages that come pre-equipped with roll-cages.

I've written really awful Forth (a full sketch program in 48 lines) and really clean Forth (a modem control and dialer program). I've written pretty C and ugly C. Ditto AssemblyLanguage.

In the end, the programmer, not the language, determines readability. -- GarryHamilton

In many programming languages, variable names carry a lot of the clues about what the code is doing. In Forth, this is missing. That's one reason why Forth style tends to lots and lots of small words - you use just as many names to explain what's going on, but you keep them in different places.

The end result is that the program is expressed in terms a human reader can understand. Humans do not think in terms of abstract nouns, but rather imperatives. When you look at a set of instructions, what's clearer to you, (a) "Insert the aforementioned article into the previously described receptacle," or (b) "Insert the plug into the wall socket." Both say the same thing, but the first requires the reader to keep a larger context in his/her head, while the second is simple, direct, and to the point. I would rephrase (a) as "Insert it there." Now which is clearer?

This is not to say that variables are a bad thing. If a language is designed to use variables, such as most applicative languages, then that's perfectly OK. But Forth isn't an applicative language -- it's concatenative. It's designed to not use variables, and the structure of the language reflects that. If you try to write C code in Forth, your resulting program will be an unreadable mess. If you try to write Forth code in C, the result won't even compile. If you write C code in C, it'll be clear and readable to a C programmer. If you write Forth code in Forth, it'll be clear and readable to a Forth programmer. (That's true, but Forth has less chance of being understood by someone that doesn't know the language than a program written in C. Compare them visually: ForthAndCsample. In my experience, I can teach someone who isn't a coder the rudiments of Forth in 15 minutes. Teaching them C takes much longer. --SamuelFalvo)

Huh? If you show the two examples to a person who doesn't know any programming languages, the C example isn't gonna be easier to understand than the terse Forth example. In fact the C example would probably be harder to understand because there's more weird syntax such as i++ and for( ; ; ). Most people exposed to programming get exposed to an imperative language, like C, Pascal, or Java. They forget the learning curve that exists. I think the learning curve for Forth is similar to that for a person's first imperative language.

The PrettyPrinter for GNU Forth (gforth command 'see word') demonstrates one rather readable CodingStyle:
 : star
   [char] * emit ;
 : stars
   0 DO star LOOP cr ;
 : square
   dup 0 DO dup stars LOOP drop ;
 : triangle
   1 DO i stars LOOP ;
 : tower
   dup triangle square ;
 : main
   cr 7 stars cr 3 triangle cr 6 tower ;
This code is actually a fine example of how a beginner would code in Forth. Predominantly vertical, because he's not accustomed to keeping stack state in his head. Fortunately, this code can be cleaned up significantly with experience.
 : stars      0 do [char] * emit loop cr ;
 : square     dup 0 do  dup stars  loop drop ;
 : triangle   1 do  i stars  loop ;  \ starts at 1, as above: "0 stars" would make DO wrap around
 : tower      dup triangle square ;
 : main       cr 7 stars  cr 3 triangle  cr 6 tower ;

Note the use of spaces to separate phrases within the same definition.

If it helps the reader to understand, feel free to annotate your code like so:

( let )

 : stars      0 do [char] * emit loop cr ;
 : square     dup 0 do  dup stars  loop drop ;
 : triangle   1 do  i stars  loop ;  \ starts at 1: "0 stars" would make DO wrap around
 : tower      dup triangle square ;

( in )

 : main       cr 7 stars  cr 3 triangle  cr 6 tower ;

However, I'd hazard a guess that the code above is not written at the right level of abstraction. Someone who is not an experienced Forth coder will find it only marginally easier to read than JayLanguage.

So, at the risk of introducing concepts using strange notations in one place, we can make the code easier to read (even if slightly more verbose) elsewhere:

 : invoke >r ; ( I've discovered using "call" primitive in GForth on 64-bit capable CPUs is buggy )

 : repeated: begin dup while r@ invoke 1- repeat drop rdrop ;

 : stars repeated: [char] * emit ;

 : row over stars ;

 : onASide repeated: row cr ;

 : square dup onASide drop ;

 : triangle 0 do i stars cr loop ;

 : tower dup triangle square ;

So, sometimes, with a little imagination and a wee bit of experience, you actually can write readable Forth after all. Who would have thought? :) --SamuelFalvo

Ooh, that is really neat. I'd been wondering if you could do functional programming in Forth. Elegant, but it didn't work for me in gforth 0.6.2 on Linux 2.6.26-1-amd64 (or in a 32-bit chroot), so I came up with the following alternative for "invoke" that simply monkeys with the return stack rather than relying on "call". I also had to do a little magic to make sure the repeated phrase had access to the expected stack contents:
 : invoke >r ;
 : repeated: begin dup while r@ swap >r invoke r> 1- repeat drop rdrop ;

-- BillTrost
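For readers who want to try this variant, here is a hedged usage sketch. It assumes a Forth, such as gforth, where a return address can be moved between the stacks with r@ and >r and where rdrop exists; on other systems the trick may not work. The words hits and hit-times are invented for the example. The use of dup in row is an inference from tracing the stack by hand, not something stated above.

```forth
\ The two definitions above, with stack comments added.
: invoke ( addr -- )  >r ;
: repeated: ( n -- )
  begin dup while  r@ swap >r invoke r>  1- repeat  drop rdrop ;

\ Hypothetical test word: add 1 to a counter n times.
variable hits
: hit-times ( n -- )  repeated:  1 hits +! ;

\ With this variant the count is parked on the return stack while the
\ phrase runs, so the phrase sees the caller's stack without the count.
: stars ( n -- )  repeated:  [char] * emit ;
: row ( n -- n )  dup stars ;      \ dup here, where the earlier listing used over
: onASide ( n count -- n )  repeated:  row cr ;
: square ( n -- )  dup onASide drop ;
```

After 0 hits ! 3 hit-times, the counter holds 3, and 3 square should print three rows of three stars.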

That's true, but Forth has less chance of being understood by someone that doesn't know the language than a program written in C.

I don't understand why it's important for people who don't know a language to be able to read code written in it. Can someone explain?

ForthLanguage, ExampleForthCode, ForthIsDead, ForthPortability, ForthReusability, ForthPessimism

