Forth-eV Wiki

Lesson 10

\       Lesson 10 - Forth Data Structures
\       The Forth Course
\       by Richard E. Haskell
\          Dept. of Computer Science and Engineering
\          Oakland University, Rochester, MI 48309

comment:



                                Lesson 10

                          FORTH DATA STRUCTURES


                10.1  ARRAYS                            10-2

                10.2  LINKED LISTS                      10-5

                10.3  RECORDS                           10-13



































10.1  ARRAYS

        Much of the information in this lesson is based on material found
        in the book "Object-oriented Forth" by Dick Pountain, Academic
        Press, 1987.  We will extend those ideas so that the data structures
        can take advantage of all of the memory in the system.

        The F-PC words ALLOC ( #para -- #para segment flag )
        and DEALLOC ( segment -- flag ) use the DOS function calls
        AH = 48H and AH = 49 respectively to allocate and release memory.
        Using these words we can define the following more convenient
        words to allocate and release memory:
comment;

: alloc.mem         ( size -- segment )
                  PARAGRAPH ALLOC         \ DOS alloc INT 21H - AH=48H
                  8 =
                  ABORT" Not enough memory to allocate "
                  NIP ;                    \ discard #para allocated

: release.mem       ( segment -- )
                    DEALLOC                \ DOS INT 21H - AH=49H
                    ABORT" Failed to deallocate segment "
                    ;

comment:
        The word alloc.mem expects the size of the block you want to
        allocate (in bytes) on the stack and returns the segment address
        of the allocated block.  The F-PC word

        : PARAGRAPH     15 + U/16 ;

        will convert the number of bytes requested to the number of
        16-byte paragraphs.

        The word release.mem will release the memory allocated by alloc.mem.
        You must first push on the stack the segment address of the block
        of memory you want to release.  (This must be the segment address
        returned by a previous call of alloc.mem).

        Suppose you want to create an array of a certain size in extended
        memory and then use @L and !L to fetch and store values in this
        array.  We might define the following defining word:
comment;

        : array         ( size +++ )
                        CREATE
                           2* DUP alloc.mem ,   \ save seg address
                           ,                    \ save array size in bytes
                        DOES>
                           @ ;




comment:
        Then, for example,

                1000 array array.name

        will create a dictionary entry called array.name, allocate 1000
        words of memory and store the segment address of the allocated
        block of memory and the size of the array in the parameter field
        of array.name.  When array.name is later called it will leave the
        segment address of the array on the stack.  The dictionary entry
        of array.name will be stored in memory as follows:

                               array.name
                      ________        |
                  CFA | CODE | <------|
                      |------|                          ________
                  PFA | seg  | ---------------------->  |      | seg:0
                      |------|                          |------|
                      | size |                          |      |
                      |------|                          |------|
                  Code Segment ?CS:                     |      |
                                                        |------|
                                                        |      |
                                                        |------|
                                                        |      |
                                                        |------|
                                                      Array Segment


        To access the value of the array element array.name(5), for example,
        you would type
                        array.name 5 @L

        The problem with using this scheme for extended memory arrays is
        that it will fail if you make a turnkey system of your program.
        Making a turnkey system will strip all headers from the dictionary
        and create an .EXE file that contains all of your program words
        together with all the F-PC words.  When you save this system the
        code segment part of any arrays that you have defined will be saved
        but the memory allocated for the actual array will be lost.  This
        means that when the turnkey program later runs it must somehow
        allocate any memory it needs for arrays and store the segment
        address of the array in the PFA of the array name.

        We can modify the definition of array to be used in turnkey systems
        as follows:
comment;

        : array.tk      ( size +++ )
                        CREATE
                           0 ,                  \ fill in seg address later
                           2* ,                 \ save array size in bytes
                        DOES>
                           @ ;

\       Note that if you now type

                1000 array.tk array.name

comment:
        you will create the dictionary entry array.name and save the size
        of 1000 but will not allocate any memory for the array at this point.
        Memory can later be allocated for all arrays using the following
        words:
comment;

        : alloc.array   ( cfa -- )
                        >BODY DUP 2+ @          \ get size in bytes
                        alloc.mem               \ allocate memory
                        SWAP ! ;                \ save seg at PFA

        : allocate.arrays       ( -- )
                        [ ' array.name ] LITERAL alloc.array ;

comment:
        The word allocate.arrays would contain a similar line for each
        array that you had defined in the program.  You would include
        the word allocate.arrays as part of the initiallization of your
        program.  This will allow memory to be allocated for each of your
        arrays even in turnkey systems.

        You can release all memory allocated to arrays using the following
        words:
comment;

        : release.array         ( cfa -- )
                        >BODY @                 \ get segment address
                        release.mem ;           \ and release it

        : release.all.arrays    ( -- )
                        [ ' array.name ] LITERAL release.array ;

comment:
        You would add similar lines to release.all.arrays for each array
        whose memory you want to release.















10.2  LINKED LISTS

        In this section we will write a number of words for creating and
        maintaining linked lists.  (See Chapter 3 in Pountain's book).

        Each node in the linked list will contain 4 bytes.  The first two
        will be a pointer to the next node and the last two will contain
        the 16-bit value associated with the node.

                _____________    _____________     _____________
                | ptr |value|    | ptr |value|     |  0  |value|
                -------------    -------------     -------------
                   |               ^ |               ^
                   |               | |               |
                   |---------------| |---------------|

        When adding values from a given linked list we will get a node
        from a large pool of nodes in a free list and when we delete a
        value from a list we will return the node to the free list.  We
        will allocate a large block of memory for the free list and then
        link all of the nodes in the free list as follows:

                        _________
                        |       | <list.seg>:0
                        |-------|
                        | head^ |--|        :2
                        |-------|  |
                     |--|  ptr  |<-|        :4
                     |  |-------|
                     |  | value |
                     |  |-------|
                     |->|  ptr  |--|        :8
                        |-------|  |
                        | value |  |
                        |-------|  |
                     |--|  ptr  |<-|        :12
                     |  |-------|
                     |  | value |
                     |  |-------|
                     |->|  ptr  |--|        :16
                        |-------|  |
                        | value |  |
                        |-------|  |
                        |  ptr  |<-|
                        |-------|

        Available nodes start at offset address 4 within the segment
        <list.seg> and occur at multiples of 4 bytes thereafter.  The
        head pointer of the free list is at address <list.seg>:2.  The
        value at <list.seg>:0 is not used.  The following words will
        create this free list.
comment;



\ ------------------------------------------------------
\ Variables and Constants
DECIMAL
0    CONSTANT nil
2    CONSTANT [freelist.head]
0    VALUE    <list.seg>
[freelist.head]    VALUE    [list.offset]

\ ------------------------------------------------------
\ Allocate memory

: release.seglist       ( -- )
                <list.seg> ?DUP
                IF
                    DEALLOC 0=            \ DOS INT 21H - AH=49H
                    IF
                       0 !> <list.seg>
                    ELSE
                       ABORT" Failed to deallocate <list.seg> "
                    THEN
                THEN ;

: alloc.seglist         ( size -- )
                  release.seglist
                  2* 2* 4 +                \ 4 bytes/node + head
                  alloc.mem                \ allocate memory
                  !> <list.seg> ;          \ <list.seg> = base segment address

\ ------------------------------------------------------
\ Create freelist
\ Nodes:  | ptr | val |

: allocate.freelist      ( size -- )
                  DUP alloc.seglist               \ size
                  [list.offset] 2+                \ next ptr addr
                  <list.seg> [list.offset] !L     \ store at current ptr
                  2 +!> [list.offset]             \ make next ptr current ptr
                  1 DO                            \ do size-1 times
                        [list.offset] 4 +            \ next ptr addr
                        <list.seg> [list.offset] !L  \ store at current ptr
                        4 +!> [list.offset]       \ make next ptr current ptr
                  LOOP
                  nil <list.seg> [list.offset] !L  \ make last ptr nil
                  4 +!> [list.offset] ;            \ [list.offset] --> eolist

: freelist        ( -- seg offset )
                  <list.seg> [freelist.head] ;








\ ------------------------------------------------------
\  Node manipulation words

\       The following word will insert a node at address seg:node
\       after a node whose pointer is at address seg:list.

: node.insert   ( seg list seg node --- )       \ insert after seg:list
                2OVER @L                        \ s l s n @l
                ROT 2 PICK                      \ s l n @l s n
                !L                              \ s l n
                -ROT !L ;

\       The following word will remove the node following the pointer at
\       seg:list and leave the address of the removed node, seg:node,
\       on the stack.  If seg:list is the header, this word removes the
\       first node in the list.  If the list is empty, it leaves seg:0.

: node.remove   ( seg list -- seg node )
                2DUP @L                         \ s l @l
                2 PICK SWAP DUP                 \ s l s @l @l
                IF                              \ s l s @l
                  2SWAP 2OVER @L                \ s @l s l @@l
                  -ROT !L                       \ s n
                ELSE                            \ s l s 0
                  2SWAP 2DROP                   \ s 0
                THEN ;

\       To get a node you just remove one from the free list using getnode.

: getnode       ( --- seg node )
                freelist node.remove ;

\       To put a node at seg:node back in the free list you use freenode.

: freenode      ( seg node --- )
                freelist 2SWAP                  \ seg list seg node
                node.insert ;


















\       The word newlist will create a new list header in the code segment.
\       The PFA of this list header will contain the offset address in the
\       segment <list.seg> of the list header.

: newlist       ( +++ )
                CREATE
                        nil ,            \ fill in node addr later
                DOES>           ( -- seg list )
                        <list.seg>  SWAP @ ;

\       To create a new list called sample.list you would type

                newlist sample.list

\       You would then create the header for this list in the segment
\       <list.seg> by including the following line in the word
\       fill.newlists.

        : fill.newlists ( -- )
              getnode DUP [ ' sample.list ] LITERAL >BODY ! nil -ROT !L ;

comment:
        This technique is used to make the lists available in a turnkey
        system in much the same way we did it for arrays.  Before you can
        use any of these data structures you must allocate the memory in
        your program using a word such as
comment;

        : init.data.structures  ( -- )
                        allocate.arrays
                        1200 allocate.freelist
                        fill.newlists ;

\       So that you can test the words in this lesson we will go ahead
\       and execute
                        init.data.structures

\       when you FLOAD LESSON10.

















\       To push the value 5 on the top of the list sample.list you would type

\                5 sample.list push

\       using the following word push:

: push          ( value seg list -- )
                getnode ?DUP
                IF                              \ v s l s n
                   4 ROLL 2 PICK 2 PICK         \ s l s n v s n
                   2+ !L node.insert
                ELSE
                   ." no free space " ABORT
                THEN ;

\       To pop the top value from the list sample.list you would type

\               sample.list pop

\       using the following word pop:

: pop           ( seg list -- value )
                node.remove ?DUP
                IF                              \ s n
                  2DUP freenode                 \ put node back in freelist
                  2+ @L                         \ get value
                ELSE
                   ." empty list " ABORT
                THEN ;

\       To print out the contents of the list sample.list you would type

\               sample.list .all

\       using the following word .all:

: .all          ( seg list -- )                 \ print list contents
                BEGIN                           \ s l
                  OVER SWAP @L ?DUP             \ s n  n
                WHILE
                  2DUP 2+ @L .                  \ s n
                REPEAT
                DROP ;

\       To reclaim all of the nodes in sample.list you would type

\               sample.list kill

\       using the following word kill:

: kill          ( seg list -- )                 \ reclaim list space
                BEGIN                           \ s l
                  2DUP node.remove ?DUP         \ s l s n  n
                WHILE freenode                  \ s l
                REPEAT DROP 2DROP ;

\ -------------------------------------------------------------
\ List tests
\       The following word will check to see if a particular value is
\       in a list.  For example,

\               5 sample.list ?in.list

\       will return a true flag over the 5 if the value 5 is in the list.

: ?in.list      ( val seg list -- val f )
                >R FALSE -ROT R>                \ 0 v s l
                BEGIN                           \ 0 v s l
                  ROT 2 PICK 2 PICK             \ 0 s l v s l
                  @L ?DUP                       \ 0 s l v n  n
                WHILE
                  3 PICK SWAP                   \ 0 s l v s n
                  2+ @L OVER =                  \ 0 s l v f - true if v'=v
                  IF NIP NIP NIP TRUE EXIT      \ v tf
                  THEN                          \ 0 s l v
                  -ROT OVER SWAP @L             \ 0 v s n
                REPEAT
                NIP NIP SWAP ;                  \ v ff


\       The word ?pop can be used to pop the top of the list if the list
\       is not empty.  If the list is empty, this word leaves a false flag
\       on top of the stack.  This word is useful if you are not sure when
\       the list will be empty and you don't want to abort if it is.

: ?pop          ( seg list -- value tf | ff )   \ ff if list is empty
                node.remove ?DUP
                IF                              \ s n
                  2DUP freenode                 \ put node back in freelist
                  2+ @L TRUE                    \ get value
                ELSE
                  DROP FALSE
                THEN ;

\       The word ?list.empty will return a true flag if the list is empty.

: ?list.empty           ( seg list -- f )
                2DUP ?pop                       \ try to pop
                IF                              \ if something in list
                  -ROT push FALSE               \ push it back - set false
                ELSE
                  2DROP TRUE                    \ else, set true
                THEN ;








\       The word findpos< will find the position of the node after which
\       to insert a value so that values will be stored in the list
\       in ascending order.  For example, to insert the value 35 into the
\       list sample.list so that the list is maintained in ascending order
\       you would type
\                       35 sample.list findpos< push

: findpos<      ( val seg list -- val seg node )
                BEGIN                           \ v s l
                  ROT 2 PICK 2 PICK             \ s l v s l
                  @L ?DUP                       \ s l v n  n
                WHILE
                  3 PICK SWAP                   \ s l v s n
                  2+ @L OVER >                  \ s l v f - true if v'>v
                  IF
                    -ROT EXIT                   \ v s l
                  THEN                          \ s l v
                  -ROT OVER SWAP @L             \ v s n
                REPEAT
                -ROT ;                          \ v s l



\       The word findpos> will find the position of the node after which
\       to insert a value so that values will be stored in the list
\       in descending order.  For example, to insert the value 35 into the
\       list sample.list so that the list is maintained in descending order
\       you would type
\                       35 sample.list findpos> push

: findpos>      ( val seg list -- val seg node )
                BEGIN                           \ v s l
                  ROT 2 PICK 2 PICK             \ s l v s l
                  @L ?DUP                       \ s l v n  n
                WHILE
                  3 PICK SWAP                   \ s l v s n
                  2+ @L OVER <                  \ s l v f - true if v'<v
                  IF
                    -ROT EXIT                   \ v s l
                  THEN                          \ s l v
                  -ROT OVER SWAP @L             \ v s n
                REPEAT
                -ROT ;                          \ v s l












\       The following word can be used to find the address of the
\       nth node in a list.  For example, to get the value in the
\       5th node of the list sample.list you would type

\               sample.list 5 traverse.n 2+ @L

: traverse.n    ( seg list n -- seg addr )      \ find address on nth node
                ?DUP
                   IF                         \ s l n
                      0 DO                    \ s l
                        OVER SWAP             \ s s l
                        @L DUP 0=             \ s n f
                          IF
                             ." Beyond list end " ABORT
                          THEN
                      LOOP                    \ s n
                   THEN ;                     \ s l  if n=0



\       The following word can be used to find the number of nodes in
\       a list.  For example,

\               sample.list get.#nodes .

\       will print the number of nodes in the list sample.list.

: get.#nodes    ( seg list -- n )
                0 -ROT                          \ 0 s l
                BEGIN                           \ cnt s l
                   OVER SWAP                    \ cnt s s l
                   @L ?DUP                      \ cnt s @l @l | cnt s 0
                WHILE                           \ cnt s @l
                   ROT 1+ -ROT                  \ cnt+1 s @l
                REPEAT
                DROP ;                          \ cnt

comment:

















10.3  RECORDS

        For a discussion of Forth records see Chapter 1 in the Pountain
        book.  The following examples are based on information in that
        chapter.

        The words in this section can be used to produce a rather flexible
        linked record system in which each record is a separate segment in
        memory and these records can be linked by pointers which are fields
        in the record.  Any number of different record types can be defined
        and any number of record instances can be created and linked to the
        entire structure in a hierarchical system.  The various field in a
        given record can be of varying size.

        We will illustrate the use of these record words by considering
        a simple example of a student record system.  Each student will
        be assigned the following record:

                  ________
        sr.head:0 | ^SR  |-----|
                  |------|     |
                               |        ___________
                               |------> |  size   | <SR.NODE>:0
                                        |---------|
                                        | ^next   | <SR.NODE> [NEXT.SR]
                                        |---------|
                                        | ^name   | <SR.NODE> [NAME.SR]
                                        |---------|
                                        | ^addr   | <SR.NODE> [ADDR.SR]
                                        |---------|
                                        | ^data   | <SR.NODE> [DATA.SR]
                                        |---------|

        The header sr.head:0 contains the segment address of the first
        student record.  The first element in the <SR.NODE> segment contains
        the number of fields in the current record.  The first field at
        offset address [NEXT.SR] contains a pointer (segment address) to
        the next student record.  The second field at offset address
        [NAME.SR] contains a pointer to a segment containing the student's
        the next student record.  The third field at offset address
        [ADDR.SR] contains a pointer (segment address) to an address
        record.  This record might contain separate fields for street,
        city, state and zip code.  The fourth field at offset address
        [DATA.SR] contains a pointer (segment address) to an data
        record.  This record might contain various fields for data such
        as sex, age, class, major, GPA, or any other information.

        These records can be created with the following words:






VARIABLE total.bytes    2 total.bytes !

                \ Declare a field name
: field         ( n +++ )
                CREATE
                   total.bytes @ ,              \ store offset
                   total.bytes +!               \ bump offset count
                   IMMEDIATE
                DOES>           ( seg pfa -- seg off )
                   @                            \ get field address
                   STATE @                      \ if compiling
                   IF
                      [COMPILE] LITERAL         \ ...bind early
                   THEN ;

                \ Make an instance of a record type (internal use only)
: make.instance         ( seg off n --- seg )
                DUP alloc.mem                   \ allocate fields
                TUCK 0 !L                       \ store instance size
                DUP 2SWAP !L                    \ store new seg at seg:off
                IMMEDIATE ;

                \ Create the record defining word
: define-record          ( +++ )
                CREATE
                   total.bytes @ ,              \ store instance size
                   2 total.bytes !              \ reset the count
                DOES>           ( seg off -- seg' )
                   @ make.instance ;

1       array   sr.head

: sr.list       ( -- seg off )
                sr.head 0 ;

\  The following fields are offset addresses into the sr node
2 field [NEXT.SR]            \ pointer (seg addr) to next node
2 field [NAME.SR]            \ pointer (seg addr) to student name
2 field [ADDR.SR]            \ pointer (seg addr) to student address record
2 field [DATA.SR]            \ pointer (seg addr) to student data
define-record SR-REC

        Note that the word field is a defining word that defines names
        corresponding to the offset addresses in the student record
        <SR.NODE>.  When these words are created the value in the
        variable total.bytes is stored in the PFA of the created word
        and then the value of total.bytes in incremented by the value
        on the stack when field is called.  (Note that total.bytes starts
        with an initial value of 2).  This technique will produce the
        correct offset addresses for fields of different width.  Fields
        can also be added or subtracted without having to worry about
        changing the offset addresses.



        The statement
                        define-record SR-REC

        will now create a word called SR-REC that itself is used later
        to create instances of a student record.

        To complete the example, suppose we define the following records:

\ The following fields are offsets into the student data node
2 field [SEX.D]                         \ sex - 1 char counted string M or F
11 field [BIRTH.D]                      \ date of birth - M/D/YR  string
11 field [ENTER.D]                      \ date of enterance - M/D/YR  string
2 field [MAJOR.D]                       \ major code
2 field [GPA.D]                         \ GPA x 100
define-record DATA-REC

\ The following field is an offset addr of the name node
24 field [NAME.FN]                       \ student name - counted string
define-record NAME-REC

\ The following fields are offset addresses into the address node
16 field [STREET.AD]                    \ street address
16 field [CITY.AD]                      \ city
3 field [STATE.AD]                      \ state - 2 char abbrev
11 field [ZIP.AD]                       \ zip code
define-record ADDR-REC

\ ------------------------------------------------------
0       VALUE   <SR.NODE>               \ SR node seg address
0       VALUE   <NODE.NAME>             \ name node seg address
0       VALUE   <NODE.ADDR>             \ address node seg address
0       VALUE   <NODE.DATA>             \ SR data node seg address

\       The following words are used to create and delete a student record:

: >end.of.SR.list          ( seg list -- seg end.of.list.node )
                BEGIN                           \ s\l
                  2DUP @L ?DUP                  \ s\l\@l\ @l
                WHILE                           \ s\l\@l  or \s\l
                  NIP NIP [NEXT.SR]             \ @l\off
                REPEAT ;

: make.SR.record     ( seg off -- )
                >end.of.SR.list
                SR-REC DUP !> <SR.NODE>
                DUP 0 SWAP [NEXT.SR] !L
                DUP [NAME.SR] NAME-REC !> <NODE.NAME>
                DUP [ADDR.SR] ADDR-REC !> <NODE.ADDR>
                [DATA.SR] DATA-REC !> <NODE.DATA> ;

: zero.<nodes>          ( -- )
                0 !> <SR.NODE>
                0 !> <NODE.NAME>
                0 !> <NODE.ADDR>
                0 !> <NODE.DATA> ;

: release1.SR           ( ^SR -- )
                DUP [NAME.SR] @L release.mem
                DUP [ADDR.SR] @L release.mem
                DUP [DATA.SR] @L release.mem
                release.mem ;

: release.all.SR        ( seg off -- )
                2DUP @L ?DUP
                IF
                   BEGIN
                      DUP [NEXT.SR] @L
                      SWAP release1.SR ?DUP
                   WHILE
                   REPEAT
                   0 -ROT !L
                THEN
                zero.<nodes> ;

comment:
        To add a record you would type

                sr.list make.SR.record

        You could then add data to the various fields either from the
        keyboard or from another disk file.  For example,

                345 <NODE.DATA> [MAJOR.D] !L

        will store the value of 345 in the appropriate major field.
comment;