en:pfw:marsaglia_s_xorshift_for_arm

Differences

This shows you the differences between two versions of the page.

en:pfw:marsaglia_s_xorshift_for_arm [2023-09-04 18:16] – ↷ Page moved from pfw:marsaglia_s_xorshift_for_arm to en:pfw:marsaglia_s_xorshift_for_arm uho
en:pfw:marsaglia_s_xorshift_for_arm [2024-10-08 23:03] (current) – [Raspberry Pi 3b+ with wabiForth] jeroenh
Line 5: Line 5:
  
 The Forth version of the randomisation routines is the same on any processor as only standard Forth words are used. But the ARM-processor can do a neat
-trick: it can do cycle (dup, shift and xor) in 1 opcode!! And as
+trick: it can do cycle (dup, shift and xor) in 1 opcode!! And as
 most Forths include an assembler it is an interesting exercise to see how
 much faster the routine is when coded in assembly.
 This example is coded using wabiForth on a Raspberry Pi 3b+, but the principle is the same for any ARMv8 AArch32 processor.
 
-The routine uses two registers named top and w. Top contains the top of the stack, w is a scratch register.
+The routine uses three registers named top, v and w. Top contains the top of the stack, v and w are scratch registers.
  
  
Line 25: Line 25:
   w, w, w,  5 lsl#, eor,
   
-  w, top, str,         \ save new value in seed
+  v, v, w, eor,        \ xor old seed value with generated random number
+  v, top, str,         \ save xor'ed value in seed
   top, w, mov,
   
-  ] ; inlinable
+  ] ; inlinable
 </code>
  
Line 42: Line 44:
     1 seed 32bit Forth:     40c
     2 seed 32bit Forth:     60c
-    1 seed 32bit assembly:  10c
+    1 seed 32bit assembly:  13c
     ---------------------------
 </code>
Line 48: Line 50:
 Time measured is the number of CPU-cycles required to put a
 random number on the stack with a given method. The routine in assembly
-is 4 times as fast as the corresponding routine in Forth. Which is a
-nice speed-up of the routine.
+is 3 times as fast as the corresponding routine in Forth. Which is a
+decent speed-up of the routine.
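
The change to the assembly routine in the hunk at line 25 can be read as the following C sketch. It is a sketch under assumptions, not the page's actual routine. The diff shows only the last shift step (`5 lsl#`), so the first two steps (left 13, right 17) are assumed from Marsaglia's common 32-bit xorshift triple. It is also assumed that `top` holds the address of the seed on entry and that `v` already holds the old seed value. The function names are hypothetical.

```c
#include <stdint.h>

/* Shared xorshift step. Only the final "<< 5" appears in the diff;
   the 13 and 17 shifts are ASSUMED (Marsaglia's common 32-bit triple). */
static uint32_t xorshift_step(uint32_t w)
{
    w ^= w << 13;  /* assumed, not shown in the diff */
    w ^= w >> 17;  /* assumed, not shown in the diff */
    w ^= w << 5;   /* w, w, w, 5 lsl#, eor, */
    return w;
}

/* Old revision: the generated number itself becomes the new seed. */
static uint32_t xorshift_old(uint32_t *seed)
{
    uint32_t w = xorshift_step(*seed);
    *seed = w;              /* w, top, str, */
    return w;               /* top, w, mov, */
}

/* New revision: the seed becomes (old seed XOR generated number). */
static uint32_t xorshift_new(uint32_t *seed)
{
    uint32_t v = *seed;     /* v: old seed value (assumed loaded earlier) */
    uint32_t w = xorshift_step(v);
    v ^= w;                 /* v, v, w, eor, */
    *seed = v;              /* v, top, str, */
    return w;               /* top, w, mov, */
}
```

Under these assumptions, both variants return the same number on the first call from the same seed. They differ only in what is stored back, so later outputs diverge. The new revision adds one `eor` instruction, which may account for part of the increase from 10c to 13c in the table.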
  
en/pfw/marsaglia_s_xorshift_for_arm.1693844210.txt.gz · Last modified: 2023-09-04 18:16 by uho