papierkorb:4th_lesson_3
**Dies ist eine alte Version des Dokuments!**
Lesson 3
\ Lesson 3 - How Forth Works
\ The Forth Course
\ by Richard E. Haskell
\ Dept. of Computer Science and Engineering
\ Oakland University, Rochester, MI 48309
comment:
Lesson 3
HOW FORTH WORKS
3.1 VARIABLES 3-2
3.2 MORE ABOUT VARIABLES -- FETCH AND STORE 3-3
3.3 CONSTANTS 3-5
3.4 THE FORTH COLON WORD : 3-6
3.5 ARRAYS 3-7
3.6 THE RETURN STACK 3-8
3.7 CODE WORDS 3-9
3.8 THE FORTH DICTIONARY 3-10
3.9 TABLES 3-11
3.10 CHARACTER OR BYTE DATA 3-12
3-11 FINDING DICTIONARY ADDRESSES 3-13
3-12 THE NAME FIELD 3-14
3-13 OPERATION OF THE F-PC INNER INTERPRETER 3-15
EXERCISES 3-17
3.1 VARIABLES
The Forth word VARIABLE is a defining word used to define
variable names. If you type
VARIABLE my.name
then Forth will create a new dictionary entry called my.name.
All dictionary entries have the same general form, consisting of
a header (made up of a view field, a name field and a link field)
and a body (made up of a code field and a parameter field). In
F-PC the header and the body are physically stored in separate
segments.
(In the 8086/8088/80286 the 1 Mbyte address space is divided into
64 Kbyte segments. A physical address is specified by a 16-bit
segment address, seg, and a 16-bit offset address, off. The total
address is written as seg:off. The segment address can begin on
any 16-byte boundary, called a paragraph. Therefore, to address
any byte in memory, one has to specify both a segment address and
an offset address.)
For the word MY.NAME the dictionary entry will look like this
HEADER ________
view field address ---> VFA | VIEW |
|------|
link field address ---> LFA | LINK |
|------|
name field address ---> NFA | 7 M |
| Y . |
| N A |
| M E |
|------|
pointer to CFA -------> | ^CFA | -------|
|------| |
Head Segment YSEG |
|
BODY ________ |
code field address ---> CFA | CODE | <------|
|------|
parameter field address ---> PFA | 0 0 |
|------|
Code Segment ?CS:
The view field contains a character count equal to the offset
address of the beginning of the colon definition within the file.
It is used to locate the colon definition when you ask to VIEW
the definition of a word.
The link field contains a pointer to the LFA of a previously
defined word.
The name field contains the name, beginning with the character
count and followed by 1-31 characters.
The pointer to CFA contains the offset address of the CFA in the
code segment. This segment address is given by the F-PC word
?CS:.
The code field contains code that will be executed when the name
in the name field is interpreted. This is called direct threaded
code and is used by F-PC. Many Forths use indirect threaded code
in which the code field contains a pointer to the code that is
executed. For a VARIABLE the code field contains the three bytes
corresponding to the instruction CALL >NEXT where >NEXT is the F-PC
inner interpreter to be described later in this lesson. The CALL
instruction automatically puts the address of the next instruction
on the stack. In this case it is not the address of an instruction
at all but rather the parameter field address.
The parameter field contains different things for different kinds
of words. For a VARIABLE word, the parameter field contains the
16-bit value of the variable. An initial value of zero is stored
in the parameter field when a VARIABLE is defined.
When you type the name of a VARIABLE the CALL >NEXT instruction in
the code field will be exectuted which has the effect of leaving
the parameter field address on the stack. If you type
my.name .
the PFA of my.name will be printed. Try it.
3.2 MORE ABOUT VARIABLES -- FETCH AND STORE
Forth words:
! ( n addr -- ) ( "store" )
stores the value of n at addr
6 my.name !
will store the value 6 at the PFA of my.name.
@ ( addr -- n ) ( "fetch" )
reads the value at addr and puts it on the stack.
my.name @ .
will print the value of my.name.
Stack variables
The system variable SP0 contains the value of the stack pointer
for an empty stack. Thus,
SP0 @
will return the address of the stack pointer when nothing is on
the stack.
The Forth word SP@ returns the address of the last item pushed on
the stack. That is, it is the current value of the stack pointer.
The Forth word DEPTH returns the number of items on the stack.
It is defined as follows:
: DEPTH ( -- n )
SP@ SP0 @
SWAP - 2/ ;
Note that since each element on the stack is a word that contains
two bytes, the number of bytes on the stack must be divided by 2
to give the number of words on the stack.
3.3 CONSTANTS
The Forth word CONSTANT is a defining word that lets you define
constants. For example, if you type
25 CONSTANT quarter
the name quarter will be entered into the dictionary as follows:
________
VFA | VIEW |
|------|
LFA | LINK |
|------|
NFA | 7 Q |
| U A |
| R T |
| E R |
|------|
| ^CFA | -------|
|------| |
Head Segment YSEG |
|
________ |
CFA | CODE | <------|
|------|
PFA | 25 |
|------|
Code Segment ?CS:
The code field contains the three bytes corresponding to the
instruction CALL DOCONSTANT. The code at DOCONSTANT pops the
PFA from the stack (where it was put by the CALL instruction)
and then pushes on the stack the value stored at the PFA. Thus,
if you type
25 CONSTANT quarter
and then type
quarter .
the value 25 will be printed.
NOTE: The value stored in CONSTANTs are 16-bit signed numbers.
3.4 THE FORTH COLON WORD :
The Forth word : ("colon") is also a defining word that lets you
define new Forth words. When you type
: squared DUP * ;
the colon word : is executed. It creates your Forth word
squared as a dictionary entry that looks like this:
________ ________
VFA | VIEW | |-------> | DUP | ES:0
|------| | |------|
LFA | LINK | | | * |
|------| ES = LSO + XSEG |------|
NFA | 7 S | | |UNNEST|
| Q U | | |------|
| A R | | List Segment XSEG
| E D | |
|------| |
| ^CFA | -------| |
|------| | |
Head Segment YSEG | |
| |
________ | |
CFA | CODE | <------| |
|------| |
PFA | LSO | --------------|
|------|
Code Segment ?CS:
The code field contains the three bytes corresponding to the
instruction JMP NEST. The code at NEST is part of the inner
interpreter whose operation will be described later in this lesson.
The parameter field contains a list segment offset, LSO, whose value
is added to the base list segment address contained in the variable
XSEG. The resulting segment address is stored in the register ES
and the list of code field addresses making up the definition of
squared is stored in memory starting at address ES:0.
UNNEST is the address of another routine that is part of the Forth
inner interpreter.
3.5 ARRAYS
Suppose you want to create an array that contains five 16-bit values.
If you type
VARIABLE my.array
Forth will create the dictionary entry my.array that contains a
single 16-bit value in its parameter field.
________ |
CFA | CODE | <------|
|------|
PFA | 00 |
|------|
Code Segment ?CS:
Here we have not bothered to show the head segment associated with
the definition of my.array. Note that the parameter field contains
2 bytes to store the 16-bit value.
The Forth word ALLOT will add n bytes to the dictionary in the code
segment where n is the value on the stack when ALLOT is executed.
Thus,
8 ALLOT
will add 8 bytes, or 4 words to the dictionary. The code segment
part of the dictionary entry my.array will then look like this.
_______________ |
CFA | CODE | <------|
|-------------|
PFA | my.array(0) |
|-------------|
PFA + 2 | my.array(1) |
|-------------|
PFA + 4 | my.array(2) |
|-------------|
PFA + 6 | my.array(3) |
|-------------|
PFA + 8 | my.array(4) |
|-------------|
Code Segment ?CS:
To print the value of my.array(3) you could type
my.array 3 2* + @ .
3.6 THE RETURN STACK
When you type a number it is put on the parameter stack. All of
the arithmetic operations and words such as DUP, ROT, DROP, SWAP,
and OVER operate on numbers that are on the parameter stack.
Forth also has a second stack called the return stack. The return
stack is used by the Forth inner interpreter to store the return
address of the next word in a colon definition. It is also used
by certain Forth words such as DO.
You can also use the return stack IF YOU ARE CAREFUL. You can move
a number from the parameter stack to the return stack temporarily
if you are sure to move it back before the end of a colon definition.
Otherwise, the proper return address will not be on top of the return
stack, and the inner interpreter will not be able to find its way
back to the next word to execute.
The following Forth words use the return stack, R:
>R ( n -- ) ( "to-R" )
pops the top element of the parameter stack
and pushes it onto the return stack.
Ex. 3 >R
moves 3 to the top of the return stack and
leaves the parameter stack empty.
R> ( -- n ) ( "from-R" )
pops the top element of the return stack
and pushes it onto the parameter stack.
R@ ( -- n ) ( "R-fetch" )
copies the top element of the return stack
to the top of the parameter stack.
Here is a possible definition of ROT:
: ROT ( n1 n2 n3 -- n2 n3 n1 )
>R \ n1 n2
SWAP \ n2 n1
R> \ n2 n1 n3
SWAP ; \ n2 n3 n1
3.7 CODE WORDS
Code words are Forth words that are defined in terms of 8086
machine code rather than in terms of other Forth words. Of
course, eventually some real 8086 machine code must be executed!
The inner interpreter is written in machine code. In addition,
many of the F-PC Forth words are written in machine code to
maximize their speed of operation. In Lesson 7 we will show you
how you can write your own Forth code words.
Since F-PC uses direct-threading, the machine code in a code word
is stored directly at the CFA in the code segment. Here are
examples of how some F-PC primatives are defined. In each case
the header of the word is stored in the head segment in the same
way as described previously for VARIABLES, CONSTANTS, and colon
definitions.
DUP
_______________ |
CFA | POP AX | <------|
|-------------|
| PUSH AX |
|-------------|
| PUSH AX |
|-------------|
| JMP >NEXT |
|-------------|
Code Segment ?CS:
SWAP
_______________ |
CFA | POP DX | <------|
|-------------|
| POP AX |
|-------------|
| PUSH DX |
|-------------|
| PUSH AX |
|-------------|
| JMP >NEXT |
|-------------|
Code Segment ?CS:
@ ("Fetch")
_______________ |
CFA | POP BX | <------|
|-------------|
| PUSH [BX] |
|-------------|
| JMP >NEXT |
|-------------|
Code Segment ?CS:
3.8 THE FORTH DICTIONARY
The Forth dictionary consists of a linked list of all words that
have been defined. These words can be variables, constants,
colon definitions or code words. The names of all of these words
are stored in the head segment and linked by means of the link
field pointers. The code field of each word is pointed to by the
code field pointer in the header. The code field always contains
real executable code and therefore must be in the 8086 code segment.
In a colon definition the list of CFA's for each word in the
definition is stored in a separate list segment that is pointed to
by a pointer at the PFA in the code segment.
When you define a new word using the colon definition, the
compilation process consists of storing the word in the dictionary.
F-PC links the name of your word into one of 64 threads using a
hashing scheme to increase the speed at which words can be found when
searching the dictionary.
The next available address in the code segment of the dictionary
is given by HERE. That is, HERE is a Forth word that returns the
next available address on the stack. The variable DP, the dictionary
pointer, contains this next available dictionary address. The word
HERE is just
: HERE ( -- n )
DP @ ;
The outer interpreter is what is being executed when you boot up
a Forth system and the ok prompt is displayed. When you type a
word and press <Enter> the outer interpreter searches the dictionary
for the word you typed. If it finds the word, it executes it by
executing the code located at the code field address. If it doesn't
find the word, it executes a word called NUMBER that tries to convert
your word to a valid number. If it succeeds, it pushes the value
of the number on the stack; otherwise, it displays the <- What?
message that tells you that is doesn't understand your word.
We will take a closer look at the operation of the inner interpreter
in Section 3.13.
3.9 TABLES
A table is like a constant array. You could create an array and
then fill the values of the array using the ! store word.
Another way to create a table is to use the Forth word CREATE
which works the same way as VARIABLE except that no space is
reserved in the parameter field. For example, if you type
CREATE table
you will create the following dictionary entry.
table
________ |
CFA | CODE | <------|
|------|
PFA <---- HERE
Code Segment ?CS:
As with the case of a VARIABLE the code field contains the three bytes
corresponding to the instruction CALL >NEXT where >NEXT is the F-PC
inner interpreter. The CALL instruction will leave the parameter
field address on the stack when the word table is called.
At this point the dictionary pointer, DP, contains the value of the
PFA of table. The forth word comma (,) will store the value on the
stack at the location pointed to by the dictionary pointer; that is,
at the next available location in the dictionary. Therefore, if you
type
CREATE table 5 , 8 , 23 ,
the following dictionary entry will be created:
table
________ |
CFA | CODE | <------|
|------|
PFA | 5 | 0
|------|
| 8 | 1
|------|
| 23 | 2
|------|
Code Segment ?CS:
You could now define a new word called @table as follows:
: @table ( ix -- n )
2* table \ 2*ix pfa
+ @ ; \ @(pfa + 2*ix)
For example, 2 @table
will return 23 to the top of the stack.
3.10 CHARACTER OR BYTE DATA
Character (ASCII) data can be stored in one byte. Data can be
stored and retieved from single bytes using the following Forth
words:
C, ( c -- ) ("C-comma")
Store least significant byte (LSB) of value on top of
the stack at HERE (next available location in the
dictionary).
C! ( c addr -- ) ("C-store")
Store LSB of value on top of the stack at addr.
C@ ( addr -- c ) ("C-fetch")
Fetch the byte at addr. Leave as LSB on stack.
You can create a table containing byte values rather than word
values by typing
CREATE table 5 C, 8 C, 23 C,
You can then define a word called C@table as follows:
: C@table ( ix -- c )
table + C@ ;
2 C@table will then return 23 to the top of the stack.
Note the difference between C@table and @table defined in
Section 3.9.
3.11 FINDING DICTIONARY ADDRESSES
The following words can be used to locate and examine Forth
dictionary entries:
' ( -- cfa ) ("tick")
The statement ' table
will leave the CFA of table on the stack.
>NAME ( cfa -- nfa ) ("to-name")
Converts the code field address, CFA, (in the code segment)
to the name field address, NFA, (in the head segment).
>LINK ( cfa -- lfa ) ("to-link")
Converts the code field address, CFA, (in the code segment)
to the link field address, LFA, (in the head segment).
>BODY ( cfa -- pfa ) ("to-body")
Converts the code field address, CFA, (in the code segment)
to the parameter field address, PFA, (in the code segment).
You can go to the code field address by using
BODY> ( pfa -- cfa ) ("from-body")
NAME> ( nfa -- cfa ) ("from-name")
LINK> ( lfa -- cfa ) ("from-link")
You can also go from NAME to LINK and from LINK to NAME using
N>LINK ( nfa -- lfa ) ("name-to-link")
L>NAME ( lfa -- nfa ) ("link-to-name")
The Forth word HEX will change the base used to print out values to
hexadecimal. The word DECIMAL will change the base back to decimal.
You can change to any base by storing the value of the base in the
variable BASE. For example, the definition of HEX is
: HEX 16 BASE ! ;
Note that the base must be DECIMAL when this definition of HEX is
loaded.
The Forth word U. prints the top value on the stack as an unsigned
number between 0 and 65535, or if in the HEX mode, between 0000 and
FFFF. As an example, to print the hex address of the name field of
the word OVER type
HEX ' OVER >NAME U. DECIMAL
The Forth word LDUMP ( seg off #bytes -- ) can be used to get a
hex dump of #bytes of memory starting at address seg:off.
For example, type
YSEG @ ' OVER >NAME 20 LDUMP
and see if you can see the name field of OVER.
3.12 THE NAME FIELD
When you define a new word such as TEST1 using a colon definition
the following name field is created:
Precedence bit ----| |--- Smudge bit Hex value
| |
___________________________________
NFA | 1 | 0 | 0 | Name char count = 5 | 85
|---------------------------------|
| 0 | Character #1 T = 54H (hex) | 54
|---------------------------------|
| 0 | Character #2 E = 45H | 45
|---------------------------------|
| 0 | Character #3 S = 53H | 53
|---------------------------------|
| 0 | Character #4 T = 54H | 54
|---------------------------------|
| 1 | Character #5 1 = 31H | B1
|---------------------------------|
If the precedence bit is set, the word will be executed immediately.
Immediate words will be discussed in Lesson 9.
If the smudge bit is set, the word will not be able to be found in
a dictionary search. This bit is set while a colon definition is
being compiled.
Type in the following dummy colon definition:
: TEST1 ;
Then examine the name field by typing
YSEG @ ' TEST1 >NAME 10 LDUMP
The six hex values shown in the figure above should be displayed.
Note that the most significant bit is set to 1 in the first and last
byte in the name field. The maximum number of characters actually
stored in the name field is the value stored in the variable WIDTH.
For example,
10 WIDTH !
will cause a maximum of 10 characters to be stored in the name field.
F-PC sets the default value of WIDTH to 31 -- its maximum possible
value.
3.13 OPERATION OF THE F-PC INNER INTERPRETER
The following diagram illustates the operation of the F-PC inner
interpreter.
cubed
________ |
CFA | CODE | <------|
|------| _________
PFA | LSO | -------- +XSEG -------> | DUP | ES:0
|------| |-------|
Code Segment ?CS: |<------- |squared| IP = ES:SI
| |-------|
| | * |
| |-------|
|<- W = AX = current word pointer <-| |UNNEST |
| |-------|
| List Segment XSEG
| squared
| ________ |
|-----> CFA | CODE | <------|
|------| ________
PFA | LSO | -------- +XSEG -------> | DUP | ES:0
|------| |------|
Code Segment ?CS: | * |
|------|
|UNNEST|
|------|
List Segment XSEG
>NEXT LODSW ES: \ Load AX with CFA at ES:SI & inc SI
JMP AX \ Execute the code at the CFA in AX
NEST XCHG BP,SP \ Push IP = ES:SI
PUSH ES \ on the return stack
PUSH SI
XCHG BP,SP
MOV DI,AX \ AX = CFA of word to execute
MOV AX,3[DI] \ Get LSO at PFA
ADD AX,XSEG \ and add to XSEG
MOV ES,AX \ Put this sum in ES
SUB SI,SI \ Make new IP = ES:SI = ES:0
JMP >NEXT \ Go to >NEST
UNNEST XCHG BP,SP \ Pop IP = ES:SI
POP SI \ from the return stack
POP ES
XCHG BP,SP
JMP >NEXT \ Go to >NEXT
The inner interpreter contains the three routines >NEXT, NEST, and
UNNEST. An interpreter, or instruction pointer, IP, points to the
memory location in the list segment containing the code field address
of the next word to be executed. In F-PC this instruction pointer
is made up of the two registers ES:SI. Suppose this is pointing to
the CFA of squared in the definition of cubed as shown in the
previous figure. The routine >NEXT puts this CFA into a word
address register, W, (which is AX in F-PC) increments IP (SI) by 2
so that it will point to the next word (*) in the current definition,
and then executes the code at the CFA in W.
If this is a colon definition as it is in this case then the code
at the CFA is a jump to the routine NEST. As shown above NEST will
push IP (both ES and SI) on the return stack so that the program can
later find its way back to the next word in cubed when UNNEST is
executed. NEST then gets the list segment offset, LSO, for the word
squared, adds it to the base list segment address in XSEG and stores
this value in ES. It then sets SI to zero so that the new IP is
given by ES:0, which points to the first word in the definition of
squared. It then jumps to >NEXT which will repeat the process, this
time executing the first word in squared, namely, DUP.
Since DUP is a code word, its actual machine code is located at its
CFA. This code will therefore be executed when >NEXT is executed.
The last instruction in the definition of DUP is another jump to
>NEXT. But now the IP will have been incremented to point to the
CFA of *. Again this is a code word which will be executed and then
jump to >NEXT again.
The last word in a colon definition is UNNEST. The CFA of UNNEST is
added to the list segment in the dictionary when the semi-colon in a
colon definition is executed. The code field of UNNEST contains the
machine code shown above which pops IP (SI and ES) from the return
stack and then jumps to >NEXT. Since this is the IP pushed on the
stack by NEST when squared was executed, it will point to the word
following squared in the definition of cubed. This is the word *
which is the next word to be executed.
This is how Forth works. Colon definitions are stored as lists of
CFA's in the list segment. When the CFA to be executed is that of
another colon definition, the IP is incremented and pushed on the
stack and then changed to point to the first CFA in the definition
of the new word to be executed. When the CFA to be executed is that
of a code word, then the actual machine code located at the CFA is
executed. This process continues with a jump to >NEXT following the
execution of each word.
EXERCISE 3.1
Define the colon words
: squared DUP * ;
and
: cubed DUP squared * ;
Use the F-PC words ' ("tick"), >LINK ("to-link"), and
LDUMP (seg off #bytes -- ) to answer the following questions:
1) What is code segment address ?CS:?
2) What is the head segment address stored in YSEG?
3) What is the list segment address stored in XSEG?
4) What is the CFA of squared?
5) What is the LFA of squared?
6) What is the NFA of squared?
7) What is the PFA of squared?
8) What is the CFA of cubed?
9) What is the LFA of cubed?
10) What is the NFA of cubed?
11) What is the PFA of cubed?
12) Draw a picture of the header of squared showing the
hex values in all locations. What is the value stored
in the ^CFA location? Draw a picture of CFA and PFA fields
as well as the list segment dictionary entries for the word
squared. Show all addresses and values stored in the
dictionary.
13) What is the LFA of the dictionary entry defined just prior to
cubed? What is the name of that word?
14) What is the address of NEST?
15) What is the CFA of DUP?
16) What is the CFA of *?
17) What is the address of UNNEST?
comment;
papierkorb/4th_lesson_3.1755364027.txt.gz · Zuletzt geändert: 2025-08-16 19:07 von mka