en:pfw:marsaglia_s_xorshift_for_arm
Differences
This shows you the differences between two versions of the page.
en:pfw:marsaglia_s_xorshift_for_arm [2023-09-04 18:16] – ↷ Page moved from pfw:marsaglia_s_xorshift_for_arm to en:pfw:marsaglia_s_xorshift_for_arm uho | en:pfw:marsaglia_s_xorshift_for_arm [2024-10-08 23:03] (current) – [Raspberry Pi 3b+ with wabiForth] jeroenh | ||
---|---|---|---|
Line 5: | Line 5: | ||
The Forth version of the randomisation routines is the same on any processor as only standard Forth words are used. But the ARM-processor can do a neat | The Forth version of the randomisation routines is the same on any processor as only standard Forth words are used. But the ARM-processor can do a neat | ||
- | trick: it can do q cycle (dup, shift and xor) in 1 opcode!! And as | + | trick: it can do 1 cycle (dup, shift and xor) in 1 opcode!! And as |
most Forths include an assembler it is an interesting exercise to see how | most Forths include an assembler it is an interesting exercise to see how | ||
much faster the routine is when coded in assembly. | much faster the routine is when coded in assembly. | ||
This example is coded using wabiForth on a Raspberry 3b+, but the principle is the same for any ARMv8 Aarch32 processor. | This example is coded using wabiForth on a Raspberry 3b+, but the principle is the same for any ARMv8 Aarch32 processor. | ||
- | The routine uses two registers named top and w. Top contains the top of the stack, w is a scratch | + | The routine uses three registers named top, v and w. Top contains the top of the stack, |
Line 25: | Line 25: | ||
w, w, w, 5 lsl#, eor, | w, w, w, 5 lsl#, eor, | ||
| | ||
- | w, top, str, \ save new value in seed | + | |
+ | v, top, str, \ save xor' | ||
top, w, mov, | top, w, mov, | ||
| | ||
- | ] ; 6 inlinable | + | ] ; 7 inlinable |
</ | </ | ||
Line 42: | Line 44: | ||
1 seed 32bit Forth: | 1 seed 32bit Forth: | ||
2 seed 32bit Forth: | 2 seed 32bit Forth: | ||
- | 1 seed 32bit assembly: | + | 1 seed 32bit assembly: |
--------------------------- | --------------------------- | ||
</ | </ | ||
Line 48: | Line 50: | ||
Time measured is the number of CPU-cycles required to put a | Time measured is the number of CPU-cycles required to put a | ||
random number on the stack with a given method. The routine in assembly | random number on the stack with a given method. The routine in assembly | ||
- | is 4 times as fast as the corresponding routine in Forth. Which is a | + | is 3 times as fast as the corresponding routine in Forth. Which is a |
- | nice speed-up of the routine. | + | decent |
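For reference, the "dup, shift and xor" cycle that the page text describes can be sketched in standard Forth roughly as below. This is a minimal sketch and not the page's own listing: the name `xorshift`, the shift counts 13 and 17 (Marsaglia's usual 32-bit triple; only the final `5 lsl#` step is visible in the diff above) and the start value are assumptions, and the code assumes a Forth with 32-bit cells.

```forth
\ Sketch of a one-seed 32-bit xorshift generator (assumes 32-bit cells).
\ The shift counts 13, 17, 5 are an assumption; only the last step
\ ( 5 lshift ) is confirmed by the assembly line shown in the diff.

variable seed
1 seed !                  \ any non-zero start value; 0 would stay 0 forever

: xorshift ( -- u )
   seed @
   dup 13 lshift xor      \ one "cycle": dup, shift, xor
   dup 17 rshift xor      \ second cycle, logical shift right
   dup  5 lshift xor      \ third cycle; in assembly: w, w, w, 5 lsl#, eor,
   dup seed ! ;           \ store the new state and leave it on the stack
```

With the seed set to 1 as above, the first call `xorshift u.` prints 270369 on a 32-bit system. Each `dup n lshift xor` line corresponds to the single ARM `eor` instruction with a shifted operand, which is where the speed-up discussed in the text comes from. On a 64-bit Forth the left shifts would need an extra mask to stay within 32 bits.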
en/pfw/marsaglia_s_xorshift_for_arm.1693844210.txt.gz · Last modified: 2023-09-04 18:16 by uho