Forth, interpreted or compiled?

Question

Supposedly Forth programs can be "compiled" but I don't see how that is true if they have words that are only evaluated at runtime. For example, there is the word DOES> which stores words for evaluation at runtime. If those words include an EVALUATE or INTERPRET word then there will be a runtime need for the dictionary.

To support such statements it would mean the entire word list (dictionary) would have to be embedded inside the program, essentially what interpreted programs (not compiled programs) do.

This would seem to prevent you from compiling small programs using Forth because the entire dictionary would have to be embedded in the program, even if you used only a fraction of the words in the dictionary.

Is this correct, or is there some way to compile Forth programs without embedding the dictionary? (maybe by not using runtime words at all ??)

Lars Brinkhoff · Accepted Answer · 2015-11-02 06:00:53Z


Forth programs can be compiled with or without word headers. The headers include the word names (called "name space").

In the scenario you describe, where the program may include run-time evaluation calls such as EVALUATE, the headers will be needed.

The dictionary can be divided into three logically distinct parts: name space, code space, and data space. Code and data are needed for program execution, names are usually not.
A normal Forth program will usually not do any runtime evaluation. So in most cases, the names aren't needed in a compiled program.
The code after DOES> is compiled, so it's not evaluated at run time. It's executed at run time.
Even when names are included, they usually don't add much to program size.
Many Forths do have a way to leave the names out of a program. Some have a switch to remove word headers (the names). Others have cross compilers which keep the names in the host system during compile time, but generate target code without names.
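
The split between name space and code space can be sketched with a toy model. This is Python, not Forth, and every name in it (`make_system`, `compile_colon`, `evaluate`) is made up for illustration; it only assumes that a compiler resolves names once, at compile time, and that the compiled body keeps references to code rather than names.

```python
# Toy model, not a real Forth: the "name space" is consulted only while
# compiling or while evaluating text at run time.

def make_system():
    names = {}  # name space: word name -> code (stands in for the headers)

    def compile_colon(name, source):
        # Resolve every name NOW; the body keeps references to code only.
        body = [names[token] for token in source.split()]

        def run(stack):
            for code in body:
                code(stack)

        names[name] = run
        return run

    def evaluate(text, stack):
        # Run-time evaluation of text: this is what needs the name space.
        for token in text.split():
            if token in names:
                names[token](stack)
            else:
                stack.append(int(token))  # ValueError for an unknown word

    names["DUP"] = lambda s: s.append(s[-1])
    names["*"] = lambda s: s.append(s.pop() * s.pop())
    return names, compile_colon, evaluate


names, compile_colon, evaluate = make_system()
square = compile_colon("SQUARE", "DUP *")

stack = [7]
evaluate("SQUARE", stack)        # text lookup works while headers exist
with_headers = stack.pop()

names.clear()                    # "strip the headers"

stack = [7]
square(stack)                    # already-compiled code still runs
without_headers = stack.pop()

try:
    evaluate("7 SQUARE", [])     # text lookup now fails: no names left
    evaluate_ok = True
except ValueError:
    evaluate_ok = False
```

In this model the compiled `square` keeps working after the names are gone, and only the text-driven `evaluate` breaks, which mirrors the point that names are needed for EVALUATE but not for ordinary execution.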

edited Nov 2, 2015 at 6:00

answered Nov 1, 2015 at 7:18

Lars Brinkhoff


4 Comments

Tyler Durden Over a year ago

Yes, but the DOES> clause could have the EVALUATE or INTERPRET word in it. I have updated my question to clarify this.

Julian Fondren Over a year ago

@TylerDurden, any word can have EVALUATE in it. There's very little that's special about the DOES> part of a word.

Tyler Durden Over a year ago

I have accepted this answer due to the long time with no further answers, but it does not completely answer my question because it does not address the fundamental issue of whether a program needs to include the entire dictionary in a "compiled" program and whether that disqualifies Forth programs from being considered to be compiled.

Lars Brinkhoff Over a year ago

If you need more information on some aspect of this, feel free to post a new question!

agc · Answer · 2021-09-09 02:38:44Z

No, the entire dictionary need not be embedded or compiled. All that need remain is the set of words actually used, plus the words those depend on (their parents, grandparents, and so on). Even the names of the words aren't necessary; the word locations are enough. Forth code compiled by such methods can be about as compact as it gets, rivaling or even surpassing assembly language in executable size.
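
As a rough illustration of "words used, and their parent words", here is a small Python sketch that walks a dependency table from an entry point and keeps only what is reachable. The table and the names `DEFINITIONS`, `words_needed` and `MAIN` are invented for the example; this is not a description of how any particular Forth compiler selects words.

```python
# Hypothetical dependency table: each word lists the words its definition calls.
DEFINITIONS = {
    "DUP": [], "*": [], "+": [], ".": [], "EMIT": [],
    "SQUARE": ["DUP", "*"],
    "SUM-OF-SQUARES": ["SQUARE", "+"],
    "CR": ["EMIT"],                     # defined, but never used by MAIN
    "MAIN": ["SUM-OF-SQUARES", "."],
}


def words_needed(entry, definitions):
    """Return the entry word plus everything it depends on, transitively."""
    needed, pending = set(), [entry]
    while pending:
        word = pending.pop()
        if word not in needed:
            needed.add(word)
            pending.extend(definitions[word])
    return needed


kept = words_needed("MAIN", DEFINITIONS)
dropped = set(DEFINITIONS) - kept
```

Here `CR` and `EMIT` end up in `dropped` because nothing reachable from `MAIN` calls them, so a compiler working this way would have no reason to emit them.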

Proof by example: Tom Almy's ForthCMP, an '80s-'90s MSDOS compiler that shrank executable code way down. Its README says:

 .    Compiles Forth into machine code -- not interpreted.

 .    ForthCMP is written in Forth so that Forth code can be executed
      during compilation, as is customary in Forth applications.

 .    Very fast -- ForthCMP compiles Forth code into an executable file
      in a single pass.

 .    Generated code is extremely compact. Over 110 Forth "primitives"
      are compiled in-line. ForthCMP performs constant expression
      folding, strength reduction, register optimization, DO...LOOP
      optimization, tail recursion, and various "peephole"
      optimizations.

 .    Built-in assembler.

4C.COM runs under emulators like dosemu or dosbox.

A "Hello World" compiles into a 117 byte .COM file, a wc program compiles to a 3K .COM file (from 5K of source code). No dictionary or external libraries, (aside from standard MSDOS calls, i.e. the OS it runs on).

Brian Fox · Answer · 2021-01-04 02:52:23Z

Forth can be a bear to get your head around from the outside because there is NO standard implementation of the language. Much of what people see is from the early days of Forth, when the author (Charles Moore) was still massaging his own thoughts. Or worse, it comes from homemade systems that people call Forth because they have a stack, but that are not really Forth.

So is Forth Interpreted or Compiled? Short answer: both.

Early years: Forth had a text interpreter facing the programmer. So Interpreted: Check

But... the ':' character enabled the compiler, which "compiled" the addresses of the words in the language. So it was "compiled", but not as native machine code: it was lists of addresses where the code was in memory. The clever part was that those addresses could be run with a list "interpreter" that was only 2 or 3 instructions on most machines and a few more on an old 8-bit CPU. That meant it was still pretty fast and quite space efficient. These systems are more of an image system, so yes, the system goes along with your program, but some of those system kernels were 8K bytes for the entire run-time, including the compiler and interpreter. Not heavy lifting.

This is what most people think of as Forth. See JonesForth for a literate example. (This was called "threaded code" at the time, not to be confused with multi-threading.)
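
To make the "lists of addresses" idea concrete, here is a toy address interpreter in Python. It is only a sketch in the spirit of indirect threading: the memory layout, the `DOCOL` and `EXIT` markers as used here, and the helper names are assumptions for the example, and the loop in `execute` stands in for the few machine instructions a real system would use.

```python
DOCOL = object()  # code-field marker: the cells that follow are addresses
EXIT = object()   # code-field marker: return to the calling definition

memory = []       # one flat memory holding code fields and address lists


def add_primitive(code):
    memory.append(code)             # a primitive is just its "native code"
    return len(memory) - 1          # its address serves as its token


def add_colon(body):
    addr = len(memory)
    memory.append(DOCOL)            # code field
    memory.extend(body)             # the compiled list of addresses
    memory.append(EXIT_XT)          # end of definition
    return addr


EXIT_XT = add_primitive(EXIT)
DUP = add_primitive(lambda s: s.append(s[-1]))
MUL = add_primitive(lambda s: s.append(s.pop() * s.pop()))
SQUARE = add_colon([DUP, MUL])          # like  : SQUARE DUP * ;
FOURTH = add_colon([SQUARE, SQUARE])    # like  : FOURTH SQUARE SQUARE ;


def execute(xt, data):
    rstack, ip, w = [], None, xt
    while True:
        code = memory[w]
        if code is DOCOL:
            rstack.append(ip)       # remember where the caller was
            ip = w + 1              # start walking this word's address list
        elif code is EXIT:
            ip = rstack.pop()       # resume the caller
        else:
            code(data)              # primitive: run it directly
        if ip is None:              # back at the top level: done
            return data
        w = memory[ip]              # fetch the next address from the list
        ip += 1


result = execute(FOURTH, [3])
```

Nothing in `execute` looks at a word name or at source text; the body of `SQUARE` in `memory` is just integers pointing at other cells.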

Around 1990, Forth gurus and Chuck Moore began to realize that a Forth language primitive could be as little as one machine instruction on modern machines, so why not just compile the instruction rather than the address? This became very useful with 32-bit machines, since the address was sometimes bigger than the instruction. They could then replace the little 3-instruction interpreter with the native CALL/RETURN instructions of the processor. This was called subroutine threading. The front-end interpreter did not disappear. It simply kicked off native code subroutines.

Today, commercial Forth systems generate native code, inline many or most primitives, and do many of the other optimization tricks you see in modern compilers. They still have an interpreter facing the programmer. :-)

You can also buy (or build) Forth cross-compilers that create standalone executables for different CPUs that include multi-tasking, TCP/IP stacks and guess what, that text interpreter can be compiled into the executable as an option for remote debugging and configuration if you want it.

So is Forth Interpreted or Compiled? Still both.

Albert van der Horst · Answer · 2022-04-27 11:51:10Z


You are right that a program that executes INTERPRET (EVALUATE, LOAD, INCLUDE etc.) is obliged to have a dictionary. That is hardly a disadvantage, because even a 64-bit executable is a mere 50 K for Linux or MS-Windows. Modern single-board computers like the MSP430 can hold the whole dictionary in flash memory. See ciforth and noforth respectively. Then there is scripting. If you use Forth as a scripting language, it is similar to Perl or Python: the script is small and doesn't contain the whole language. It does require, though, that the language is installed on your computer.

In the case of really small computers you can resort to cross compiling, or to an umbilical Forth, where the dictionary stays on a host computer that communicates with and programs the target over a serial line. These are special techniques that are normally not needed. You can't use INTERPRETing code on the SBC in those cases, because obviously there is no dictionary there.

Note: mentioning the DOES> instruction doesn't serve to make the question clearer. I recommend that you edit this out.

edited Apr 27, 2022 at 11:51

answered Apr 27, 2022 at 11:23

Albert van der Horst


4 Comments

Tyler Durden Over a year ago

Ok, fair enough, but if there is an eval in there, then at the end of the day, it's an interpreter I would suppose, because ultimately you are turning text into code via the eval, so by definition that is not compiled. So, for example, if you solicit input from the user, and then run that input through EVALUATE, then essentially your program is executing completely arbitrary interpreted instructions. So, in retrospect, it seems like FORTH can be considered to be compiled unless the code contains an eval-type statement, thus it seems to be kind of a hybrid language having both characteristics.

Albert van der Horst Over a year ago

@Durden I'm confused. If you use Python and you say "print 12/5", is that compiling or interpreting? In interactive mode in Python one can define a function. Do you call that interpreting or compiling?

Tyler Durden Over a year ago

Python is an interpreted language only. Interpretation is when there is an engine ("the interpreter") that reads source code as text, scans that text, parses it, then navigates the parse tree executing any instructions found along the way. In compiled code, there is no source code being dynamically processed, only CPU binary instructions.

Albert van der Horst Over a year ago

I switched to Python because you may have a better chance of understanding. You understand Python so far. The situation in Forth is exactly the same. Also eval("1+2") in Python is the same as "1 2 +" EVALUATE in Forth.

Brian Fox · Answer · 2023-12-30 00:02:37Z

I think it is worth adding that early Forth "compilers" were compiling the "execution token" (XT) of words that exist in the virtual machine (VM). The XTs were in fact just the address of a piece of native code or the address of a list of addresses.

Since it is a virtual machine you must interpret that address, yes. But it is not EVAL. It is not looking up text. It is running code at an address.

The magic piece is that this "interpreter" for such an indirect-threaded VM is 3 instructions on a "bog-standard" CPU. If you used "direct threading" it is 2 instructions. So it's pretty quick.

But all modern commercial systems for 32-bit or larger machines generate native code, because threaded code typically saves no space on these machines.
