en:pfw:string_handling
Current revision: en:pfw:string_handling [2024-06-05 22:53], section [Generic Forth], edited by willem. Previous revision: [2023-09-04 18:22], removed, external edit (unknown date), 127.0.0.1.
{{pfw:
====== String handling ======

===== The idea =====

Character strings are mostly associated with dynamic memory and garbage collection.
That is overkill for the string handling used in most Forth programs.
In particular, we can get by with buffers that are statically allocated using CREATE.
It is still useful to lift the manipulation of single characters to the manipulation of strings as a whole.

We define a few words that make string manipulation in Forth a little smoother.\\
Original idea: Albert Nijhof & [[https://

  * Manipulate files
  * Start programs
  * Add, delete and use folders/
  * Etc.

===== Construction of strings =====

Strings in Forth are described by an address and a length. When a string is stored in memory, its length is stored in front of the characters.
There are two possible layouts. The classic layout stores the length in a single byte.

These are the so-called counted strings, as shown in the picture:

{{https://

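As a minimal sketch of the byte-count layout, a counted string can be laid down by hand with ''C,''. The name ''GREETING'' is made up for illustration. The standard word ''COUNT'' then turns the address of the count byte into an address and length pair:

<code forth>
\ Hypothetical example: build a counted string by hand.
create GREETING            \ GREETING leaves the address of the count byte
  5 c,                     \ count byte: five characters follow
  char H c,  char e c,  char l c,  char l c,  char o c,

GREETING c@ .              \ prints 5
GREETING count type        \ COUNT ( c-addr -- adr len ), prints Hello
</code>

The count byte limits such a string to 255 characters, which is one reason for the cell-count variant.
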
===== Pseudo code =====

<code>
Function: $VARIABLE
  reserve a buffer for the count-byte + '
  Alternatively"
  Define: ( maxlen "
    Save maxlen & buffer-address
  Action: ( -- s )
    Leave address of string variable

Function: $@ ( s -- c )
  Read counted string from address
Function: $+! ( c s -- )
  Extend counted string at address
Function: $! ( c s -- )
  Store counted string at address
Function: $. ( c -- )
  Print counted string
Function: $C+! ( char s -- )
  Add one character to counted string at address
</code>

The original idea also contains: <
See the reference in the introduction.

Two tools, based on an idea by Albert Nijhof:

<code>
Function: -HEAD ( adr len i -- adr' len' ) cut first '
Function: -TAIL ( adr len i -- adr len' ) cut last '
</code>
However, these fly in the face of the goals mentioned in the introduction.
We promised to get rid of characters: never count characters, only concern ourselves with strings.

A better example in this context is:
<code>
Function: -TRAILING ( c -- c' ) remove trailing blank space from string.
Function: -LEADING
</code>
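
''-TRAILING'' is part of the standard String word set. ''-LEADING'' is not a standard word. A minimal sketch could look like this, assuming the stack effect ''( adr len -- adr' len' )'' by symmetry with ''-TRAILING'' and using the standard String word ''/STRING'' (''DEMO'' is a made-up name):

<code forth>
\ Sketch only: skip leading blanks, mirroring the standard -TRAILING.
: -LEADING ( adr len -- adr' len' )
  begin  dup  while          \ stop when the string is empty
    over c@ bl =  while      \ stop at the first non-blank character
    1 /string                \ drop one character from the front
  repeat then ;

: DEMO ( -- )  s"    hello   " -LEADING -trailing type ;
DEMO   \ prints hello
</code>

Neither word counts characters on behalf of the caller, which is the point made above: the program only ever passes whole strings around.
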
===== Generic Forth =====

The idea of strings is that a character string (s) is in fact a counted string <

<code forth>
: $VARIABLE
    here swap 1+ allot align    \ Reserve RAM buffer
    create
    does>

: C+!   ( n a -- )      >r  r@ c@ +  r> c! ;                \ Increment byte at a by n
: $@    ( s -- c )      count ;                             \ Fetch string
: $+!   ( c s -- )      >r  tuck  r@ $@ +  swap move  r> c+! ;  \ Extend string
: $!    ( c s -- )      0 over c!  $+! ;                    \ Store string
: $.    ( c -- )        type ;                              \ Print string
: $C+!  ( char s -- )   dup >r  $@ + c!  1 r> c+! ;         \ Add char to string
</code>
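
The following is a self-contained sketch of the byte-count words with a short usage example. The body of ''$VARIABLE'' after ''CREATE'' is an assumption derived from the pseudo code (save the buffer address, leave it at run time). The names ''MSG'' and ''DEMO'' are made up for illustration. Some systems already define words such as ''$@'' with other meanings, so expect redefinition warnings there.

<code forth>
\ Assumption: $VARIABLE keeps the buffer address in the body of "name".
: $VARIABLE ( maxlen "name" -- )
  here swap 1+ allot align     \ reserve count byte + maxlen characters
  create ,                     \ define name, save the buffer address
  does> @ ;                    \ run time: ( -- s )

: C+!   ( n a -- )        >r  r@ c@ +  r> c! ;
: $@    ( s -- adr len )  count ;
: $+!   ( adr len s -- )  >r  tuck  r@ $@ +  swap move  r> C+! ;
: $!    ( adr len s -- )  0 over c!  $+! ;
: $.    ( adr len -- )    type ;
: $C+!  ( char s -- )     dup >r  $@ + c!  1 r> C+! ;

20 $VARIABLE MSG               \ room for 20 characters

: DEMO ( -- )
  s" Forth" MSG $!             \ MSG holds "Forth"
  s"  rocks" MSG $+!           \ MSG holds "Forth rocks"
  [char] ! MSG $C+!            \ MSG holds "Forth rocks!"
  MSG $@ $. ;
DEMO   \ prints Forth rocks!
</code>

None of these words checks the new length against ''maxlen'', so the caller has to keep the total within the reserved buffer.
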

Here is a version where the count is stored in a cell; it is hardly different.
Note that it uses the non Generic Forth word ''
the [[https://

<code forth>
: $VARIABLE
    here swap CELL+ allot align    \ Reserve RAM buffer
    create
    does>

: $@    ( s -- c )      @+ ;                                \ Fetch string
: $+!   ( c s -- )      >r  tuck  r@ $@ +  swap move  r> +! ;   \ Extend string
: $!    ( c s -- )      0 over !  $+! ;                     \ Store string
: $.    ( c -- )        type ;                              \ Print string
: $C+!  ( char s -- )   dup >r  $@ + c!  1 r> +! ;          \ Add char to string
</code>
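
The cell-count code uses ''@+'', which is not a standard word. Judging only from how ''$@'' uses it, it has to turn the address of the count cell into the address of the first character plus the count. A minimal sketch under that assumption (''CSTR'' is a made-up name):

<code forth>
\ Assumed definition: fetch a cell and step the address past it.
: @+ ( a -- a' x )  dup cell+ swap @ ;

: $@ ( s -- adr len )  @+ ;

\ Hypothetical cell-counted string: count cell first, then the characters.
create CSTR  3 ,  char a c,  char b c,  char c c,
CSTR $@ type   \ prints abc
</code>

''CREATE'' aligns the data space, so the count cell of ''CSTR'' sits at an aligned address.
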

===== Implementations =====

Have a look at the subdirectories for implementations for different systems.

  * String word sets
  * [[https://
  * [[https://
  * [[https://
  * [[https://
  * Etc.

Note that Albert Nijhof'

^ Name ^ Alt-name
| ''
| ''
| ''
| ''
| ''

===== String tools =====

Two string tools as implemented by Albert Nijhof.\\
  - ''
  - ''

<code forth>
\ Extra: cut i characters from a string, with underflow protection
: -TAIL ( adr len i -- adr len' )   0 max over min - ;
: -HEAD ( adr len i -- adr' len' )  0 max over min tuck - >r + r> ;
\ -HEAD and -TAIL do not store anything.
</code>
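
A short usage sketch of the two tools (the definitions are repeated so the example stands alone; ''DEMO'' is a made-up name). It also shows the underflow protection: a count larger than the string is clipped to the string length.

<code forth>
: -TAIL ( adr len i -- adr len' )   0 max over min - ;
: -HEAD ( adr len i -- adr' len' )  0 max over min tuck - >r + r> ;

: DEMO ( -- )
  s" Hello, World"  7 -HEAD  type cr     \ prints World
  s" Hello, World"  7 -TAIL  type cr     \ prints Hello
  s" abc"  10 -HEAD  nip . ;             \ prints 0 : i is clipped to len
DEMO
</code>

Both words only adjust the address and length on the stack, so the original string in memory stays untouched.
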