\ Lesson 10 - Forth Data Structures \ The Forth Course \ by Richard E. Haskell \ Dept. of Computer Science and Engineering \ Oakland University, Rochester, MI 48309 comment: Lesson 10 FORTH DATA STRUCTURES 10.1 ARRAYS 10-2 10.2 LINKED LISTS 10-5 10.3 RECORDS 10-13 10.1 ARRAYS Much of the information in this lesson is based on material found in the book "Object-oriented Forth" by Dick Pountain, Academic Press, 1987. We will extend those ideas so that the data structures can take advantage of all of the memory in the system. The F-PC words ALLOC ( #para -- #para segment flag ) and DEALLOC ( segment -- flag ) use the DOS function calls AH = 48H and AH = 49 respectively to allocate and release memory. Using these words we can define the following more convenient words to allocate and release memory: comment; : alloc.mem ( size -- segment ) PARAGRAPH ALLOC \ DOS alloc INT 21H - AH=48H 8 = ABORT" Not enough memory to allocate " NIP ; \ discard #para allocated : release.mem ( segment -- ) DEALLOC \ DOS INT 21H - AH=49H ABORT" Failed to deallocate segment " ; comment: The word alloc.mem expects the size of the block you want to allocate (in bytes) on the stack and returns the segment address of the allocated block. The F-PC word : PARAGRAPH 15 + U/16 ; will convert the number of bytes requested to the number of 16-byte paragraphs. The word release.mem will release the memory allocated by alloc.mem. You must first push on the stack the segment address of the block of memory you want to release. (This must be the segment address returned by a previous call of alloc.mem). Suppose you want to create an array of a certain size in extended memory and then use @L and !L to fetch and store values in this array. We might define the following defining word: comment; : array ( size +++ ) CREATE 2* DUP alloc.mem , \ save seg address , \ save array size in bytes DOES> @ ; comment: Then, for example, 1000 array array.name will create a dictionary entry called array.name, allocate 1000 words of memory and store the segment address of the allocated block of memory and the size of the array in the parameter field of array.name. When array.name is later called it will leave the segment address of the array on the stack. The dictionary entry of array.name will be stored in memory as follows: array.name ________ | CFA | CODE | <------| |------| ________ PFA | seg | ----------------------> | | seg:0 |------| |------| | size | | | |------| |------| Code Segment ?CS: | | |------| | | |------| | | |------| Array Segment To access the value of the array element array.name(5), for example, you would type array.name 5 @L The problem with using this scheme for extended memory arrays is that it will fail if you make a turnkey system of your program. Making a turnkey system will strip all headers from the dictionary and create an .EXE file that contains all of your program words together with all the F-PC words. When you save this system the code segment part of any arrays that you have defined will be saved but the memory allocated for the actual array will be lost. This means that when the turnkey program later runs it must somehow allocate any memory it needs for arrays and store the segment address of the array in the PFA of the array name. We can modify the definition of array to be used in turnkey systems as follows: comment; : array.tk ( size +++ ) CREATE 0 , \ fill in seg address later 2* , \ save array size in bytes DOES> @ ; \ Note that if you now type 1000 array.tk array.name comment: you will create the dictionary entry array.name and save the size of 1000 but will not allocate any memory for the array at this point. Memory can later be allocated for all arrays using the following words: comment; : alloc.array ( cfa -- ) >BODY DUP 2+ @ \ get size in bytes alloc.mem \ allocate memory SWAP ! ; \ save seg at PFA : allocate.arrays ( -- ) [ ' array.name ] LITERAL alloc.array ; comment: The word allocate.arrays would contain a similar line for each array that you had defined in the program. You would include the word allocate.arrays as part of the initiallization of your program. This will allow memory to be allocated for each of your arrays even in turnkey systems. You can release all memory allocated to arrays using the following words: comment; : release.array ( cfa -- ) >BODY @ \ get segment address release.mem ; \ and release it : release.all.arrays ( -- ) [ ' array.name ] LITERAL release.array ; comment: You would add similar lines to release.all.arrays for each array whose memory you want to release. 10.2 LINKED LISTS In this section we will write a number of words for creating and maintaining linked lists. (See Chapter 3 in Pountain's book). Each node in the linked list will contain 4 bytes. The first two will be a pointer to the next node and the last two will contain the 16-bit value associated with the node. _____________ _____________ _____________ | ptr |value| | ptr |value| | 0 |value| ------------- ------------- ------------- | ^ | ^ | | | | |---------------| |---------------| When adding values from a given linked list we will get a node from a large pool of nodes in a free list and when we delete a value from a list we will return the node to the free list. We will allocate a large block of memory for the free list and then link all of the nodes in the free list as follows: _________ | | <list.seg>:0 |-------| | head^ |--| :2 |-------| | |--| ptr |<-| :4 | |-------| | | value | | |-------| |->| ptr |--| :8 |-------| | | value | | |-------| | |--| ptr |<-| :12 | |-------| | | value | | |-------| |->| ptr |--| :16 |-------| | | value | | |-------| | | ptr |<-| |-------| Available nodes start at offset address 4 within the segment <list.seg> and occur at multiples of 4 bytes thereafter. The head pointer of the free list is at address <list.seg>:2. The value at <list.seg>:0 is not used. The following words will create this free list. comment; \ ------------------------------------------------------ \ Variables and Constants DECIMAL 0 CONSTANT nil 2 CONSTANT [freelist.head] 0 VALUE <list.seg> [freelist.head] VALUE [list.offset] \ ------------------------------------------------------ \ Allocate memory : release.seglist ( -- ) <list.seg> ?DUP IF DEALLOC 0= \ DOS INT 21H - AH=49H IF 0 !> <list.seg> ELSE ABORT" Failed to deallocate <list.seg> " THEN THEN ; : alloc.seglist ( size -- ) release.seglist 2* 2* 4 + \ 4 bytes/node + head alloc.mem \ allocate memory !> <list.seg> ; \ <list.seg> = base segment address \ ------------------------------------------------------ \ Create freelist \ Nodes: | ptr | val | : allocate.freelist ( size -- ) DUP alloc.seglist \ size [list.offset] 2+ \ next ptr addr <list.seg> [list.offset] !L \ store at current ptr 2 +!> [list.offset] \ make next ptr current ptr 1 DO \ do size-1 times [list.offset] 4 + \ next ptr addr <list.seg> [list.offset] !L \ store at current ptr 4 +!> [list.offset] \ make next ptr current ptr LOOP nil <list.seg> [list.offset] !L \ make last ptr nil 4 +!> [list.offset] ; \ [list.offset] --> eolist : freelist ( -- seg offset ) <list.seg> [freelist.head] ; \ ------------------------------------------------------ \ Node manipulation words \ The following word will insert a node at address seg:node \ after a node whose pointer is at address seg:list. : node.insert ( seg list seg node --- ) \ insert after seg:list 2OVER @L \ s l s n @l ROT 2 PICK \ s l n @l s n !L \ s l n -ROT !L ; \ The following word will remove the node following the pointer at \ seg:list and leave the address of the removed node, seg:node, \ on the stack. If seg:list is the header, this word removes the \ first node in the list. If the list is empty, it leaves seg:0. : node.remove ( seg list -- seg node ) 2DUP @L \ s l @l 2 PICK SWAP DUP \ s l s @l @l IF \ s l s @l 2SWAP 2OVER @L \ s @l s l @@l -ROT !L \ s n ELSE \ s l s 0 2SWAP 2DROP \ s 0 THEN ; \ To get a node you just remove one from the free list using getnode. : getnode ( --- seg node ) freelist node.remove ; \ To put a node at seg:node back in the free list you use freenode. : freenode ( seg node --- ) freelist 2SWAP \ seg list seg node node.insert ; \ The word newlist will create a new list header in the code segment. \ The PFA of this list header will contain the offset address in the \ segment <list.seg> of the list header. : newlist ( +++ ) CREATE nil , \ fill in node addr later DOES> ( -- seg list ) <list.seg> SWAP @ ; \ To create a new list called sample.list you would type newlist sample.list \ You would then create the header for this list in the segment \ <list.seg> by including the following line in the word \ fill.newlists. : fill.newlists ( -- ) getnode DUP [ ' sample.list ] LITERAL >BODY ! nil -ROT !L ; comment: This technique is used to make the lists available in a turnkey system in much the same way we did it for arrays. Before you can use any of these data structures you must allocate the memory in your program using a word such as comment; : init.data.structures ( -- ) allocate.arrays 1200 allocate.freelist fill.newlists ; \ So that you can test the words in this lesson we will go ahead \ and execute init.data.structures \ when you FLOAD LESSON10. \ To push the value 5 on the top of the list sample.list you would type \ 5 sample.list push \ using the following word push: : push ( value seg list -- ) getnode ?DUP IF \ v s l s n 4 ROLL 2 PICK 2 PICK \ s l s n v s n 2+ !L node.insert ELSE ." no free space " ABORT THEN ; \ To pop the top value from the list sample.list you would type \ sample.list pop \ using the following word pop: : pop ( seg list -- value ) node.remove ?DUP IF \ s n 2DUP freenode \ put node back in freelist 2+ @L \ get value ELSE ." empty list " ABORT THEN ; \ To print out the contents of the list sample.list you would type \ sample.list .all \ using the following word .all: : .all ( seg list -- ) \ print list contents BEGIN \ s l OVER SWAP @L ?DUP \ s n n WHILE 2DUP 2+ @L . \ s n REPEAT DROP ; \ To reclaim all of the nodes in sample.list you would type \ sample.list kill \ using the following word kill: : kill ( seg list -- ) \ reclaim list space BEGIN \ s l 2DUP node.remove ?DUP \ s l s n n WHILE freenode \ s l REPEAT DROP 2DROP ; \ ------------------------------------------------------------- \ List tests \ The following word will check to see if a particular value is \ in a list. For example, \ 5 sample.list ?in.list \ will return a true flag over the 5 if the value 5 is in the list. : ?in.list ( val seg list -- val f ) >R FALSE -ROT R> \ 0 v s l BEGIN \ 0 v s l ROT 2 PICK 2 PICK \ 0 s l v s l @L ?DUP \ 0 s l v n n WHILE 3 PICK SWAP \ 0 s l v s n 2+ @L OVER = \ 0 s l v f - true if v'=v IF NIP NIP NIP TRUE EXIT \ v tf THEN \ 0 s l v -ROT OVER SWAP @L \ 0 v s n REPEAT NIP NIP SWAP ; \ v ff \ The word ?pop can be used to pop the top of the list if the list \ is not empty. If the list is empty, this word leaves a false flag \ on top of the stack. This word is useful if you are not sure when \ the list will be empty and you don't want to abort if it is. : ?pop ( seg list -- value tf | ff ) \ ff if list is empty node.remove ?DUP IF \ s n 2DUP freenode \ put node back in freelist 2+ @L TRUE \ get value ELSE DROP FALSE THEN ; \ The word ?list.empty will return a true flag if the list is empty. : ?list.empty ( seg list -- f ) 2DUP ?pop \ try to pop IF \ if something in list -ROT push FALSE \ push it back - set false ELSE 2DROP TRUE \ else, set true THEN ; \ The word findpos< will find the position of the node after which \ to insert a value so that values will be stored in the list \ in ascending order. For example, to insert the value 35 into the \ list sample.list so that the list is maintained in ascending order \ you would type \ 35 sample.list findpos< push : findpos< ( val seg list -- val seg node ) BEGIN \ v s l ROT 2 PICK 2 PICK \ s l v s l @L ?DUP \ s l v n n WHILE 3 PICK SWAP \ s l v s n 2+ @L OVER > \ s l v f - true if v'>v IF -ROT EXIT \ v s l THEN \ s l v -ROT OVER SWAP @L \ v s n REPEAT -ROT ; \ v s l \ The word findpos> will find the position of the node after which \ to insert a value so that values will be stored in the list \ in descending order. For example, to insert the value 35 into the \ list sample.list so that the list is maintained in descending order \ you would type \ 35 sample.list findpos> push : findpos> ( val seg list -- val seg node ) BEGIN \ v s l ROT 2 PICK 2 PICK \ s l v s l @L ?DUP \ s l v n n WHILE 3 PICK SWAP \ s l v s n 2+ @L OVER < \ s l v f - true if v'<v IF -ROT EXIT \ v s l THEN \ s l v -ROT OVER SWAP @L \ v s n REPEAT -ROT ; \ v s l \ The following word can be used to find the address of the \ nth node in a list. For example, to get the value in the \ 5th node of the list sample.list you would type \ sample.list 5 traverse.n 2+ @L : traverse.n ( seg list n -- seg addr ) \ find address on nth node ?DUP IF \ s l n 0 DO \ s l OVER SWAP \ s s l @L DUP 0= \ s n f IF ." Beyond list end " ABORT THEN LOOP \ s n THEN ; \ s l if n=0 \ The following word can be used to find the number of nodes in \ a list. For example, \ sample.list get.#nodes . \ will print the number of nodes in the list sample.list. : get.#nodes ( seg list -- n ) 0 -ROT \ 0 s l BEGIN \ cnt s l OVER SWAP \ cnt s s l @L ?DUP \ cnt s @l @l | cnt s 0 WHILE \ cnt s @l ROT 1+ -ROT \ cnt+1 s @l REPEAT DROP ; \ cnt comment: 10.3 RECORDS For a discussion of Forth records see Chapter 1 in the Pountain book. The following examples are based on information in that chapter. The words in this section can be used to produce a rather flexible linked record system in which each record is a separate segment in memory and these records can be linked by pointers which are fields in the record. Any number of different record types can be defined and any number of record instances can be created and linked to the entire structure in a hierarchical system. The various field in a given record can be of varying size. We will illustrate the use of these record words by considering a simple example of a student record system. Each student will be assigned the following record: ________ sr.head:0 | ^SR |-----| |------| | | ___________ |------> | size | <SR.NODE>:0 |---------| | ^next | <SR.NODE> [NEXT.SR] |---------| | ^name | <SR.NODE> [NAME.SR] |---------| | ^addr | <SR.NODE> [ADDR.SR] |---------| | ^data | <SR.NODE> [DATA.SR] |---------| The header sr.head:0 contains the segment address of the first student record. The first element in the <SR.NODE> segment contains the number of fields in the current record. The first field at offset address [NEXT.SR] contains a pointer (segment address) to the next student record. The second field at offset address [NAME.SR] contains a pointer to a segment containing the student's the next student record. The third field at offset address [ADDR.SR] contains a pointer (segment address) to an address record. This record might contain separate fields for street, city, state and zip code. The fourth field at offset address [DATA.SR] contains a pointer (segment address) to an data record. This record might contain various fields for data such as sex, age, class, major, GPA, or any other information. These records can be created with the following words: VARIABLE total.bytes 2 total.bytes ! \ Declare a field name : field ( n +++ ) CREATE total.bytes @ , \ store offset total.bytes +! \ bump offset count IMMEDIATE DOES> ( seg pfa -- seg off ) @ \ get field address STATE @ \ if compiling IF [COMPILE] LITERAL \ ...bind early THEN ; \ Make an instance of a record type (internal use only) : make.instance ( seg off n --- seg ) DUP alloc.mem \ allocate fields TUCK 0 !L \ store instance size DUP 2SWAP !L \ store new seg at seg:off IMMEDIATE ; \ Create the record defining word : define-record ( +++ ) CREATE total.bytes @ , \ store instance size 2 total.bytes ! \ reset the count DOES> ( seg off -- seg' ) @ make.instance ; 1 array sr.head : sr.list ( -- seg off ) sr.head 0 ; \ The following fields are offset addresses into the sr node 2 field [NEXT.SR] \ pointer (seg addr) to next node 2 field [NAME.SR] \ pointer (seg addr) to student name 2 field [ADDR.SR] \ pointer (seg addr) to student address record 2 field [DATA.SR] \ pointer (seg addr) to student data define-record SR-REC Note that the word field is a defining word that defines names corresponding to the offset addresses in the student record <SR.NODE>. When these words are created the value in the variable total.bytes is stored in the PFA of the created word and then the value of total.bytes in incremented by the value on the stack when field is called. (Note that total.bytes starts with an initial value of 2). This technique will produce the correct offset addresses for fields of different width. Fields can also be added or subtracted without having to worry about changing the offset addresses. The statement define-record SR-REC will now create a word called SR-REC that itself is used later to create instances of a student record. To complete the example, suppose we define the following records: \ The following fields are offsets into the student data node 2 field [SEX.D] \ sex - 1 char counted string M or F 11 field [BIRTH.D] \ date of birth - M/D/YR string 11 field [ENTER.D] \ date of enterance - M/D/YR string 2 field [MAJOR.D] \ major code 2 field [GPA.D] \ GPA x 100 define-record DATA-REC \ The following field is an offset addr of the name node 24 field [NAME.FN] \ student name - counted string define-record NAME-REC \ The following fields are offset addresses into the address node 16 field [STREET.AD] \ street address 16 field [CITY.AD] \ city 3 field [STATE.AD] \ state - 2 char abbrev 11 field [ZIP.AD] \ zip code define-record ADDR-REC \ ------------------------------------------------------ 0 VALUE <SR.NODE> \ SR node seg address 0 VALUE <NODE.NAME> \ name node seg address 0 VALUE <NODE.ADDR> \ address node seg address 0 VALUE <NODE.DATA> \ SR data node seg address \ The following words are used to create and delete a student record: : >end.of.SR.list ( seg list -- seg end.of.list.node ) BEGIN \ s\l 2DUP @L ?DUP \ s\l\@l\ @l WHILE \ s\l\@l or \s\l NIP NIP [NEXT.SR] \ @l\off REPEAT ; : make.SR.record ( seg off -- ) >end.of.SR.list SR-REC DUP !> <SR.NODE> DUP 0 SWAP [NEXT.SR] !L DUP [NAME.SR] NAME-REC !> <NODE.NAME> DUP [ADDR.SR] ADDR-REC !> <NODE.ADDR> [DATA.SR] DATA-REC !> <NODE.DATA> ; : zero.<nodes> ( -- ) 0 !> <SR.NODE> 0 !> <NODE.NAME> 0 !> <NODE.ADDR> 0 !> <NODE.DATA> ; : release1.SR ( ^SR -- ) DUP [NAME.SR] @L release.mem DUP [ADDR.SR] @L release.mem DUP [DATA.SR] @L release.mem release.mem ; : release.all.SR ( seg off -- ) 2DUP @L ?DUP IF BEGIN DUP [NEXT.SR] @L SWAP release1.SR ?DUP WHILE REPEAT 0 -ROT !L THEN zero.<nodes> ; comment: To add a record you would type sr.list make.SR.record You could then add data to the various fields either from the keyboard or from another disk file. For example, 345 <NODE.DATA> [MAJOR.D] !L will store the value of 345 in the appropriate major field. comment;