ELF file that only has an offset thread into a dynamically linked symbol table and interpreter.
Build an ELF file that can execute the code to build itself.
North Runner compatible with the x86 runner that can mmap a file to execute.
Get out of the Turing Tar Pit. Make a useful program like a game.
A fully self-contained binary that can output its own code, which can then be used to rebuild it.
- minimal features
- full features
FFB15684:-5155192 > 3 integer make-typed-pointer
FFB15678:-5155204 > dup print-instance
FFB1567C:0 integer (B364283F) < value
0:4 pointer<any> name 16843078 1010146
4:4 uint<32> byte-size 0 0
8:4 pointer<any> super 0 0
12:4 pointer<any> data 671089408 28000300
Should print like a struct with a single field.
Collection of functions that specialize on one or more argument types.
Would extend the . and -> operators with mini dictionaries.
Might be a base to build struct fields.
interface Animal
def say
hello
end
def walk
end
end
struct: Duck
value field: flying
Duck implements Animal
def say
quack
end
end
struct: Fish
Fish implements Animal
def say
bloop bloop
end
def walk
flop
end
end
Duck make-instance
dup Animal -> say ( quack )
dup Animal -> walk
dup Instance -> print
Fish -> new
dup Animal -> say ( bloop bloop )
dup Animal -> walk
dup Instance -> print
interface Number
def +
end
end
int<32> implements Number
def + arg1 int<32> coerce arg0 int-add 2 return1-n end
end
float<32> implements Number
def + arg1 float<32> coerce arg0 float32-add 2 return1-n end
end
3.14 2.0 Number . +
3.14 make-float<32> 2.0 make-float<32> Number -> +
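A minimal Python model of the Animal example above, not North code: the interface holds default definitions, and each implementing type gets a mini dictionary of overrides. The `Interface` class and its `implement` and `send` methods are names invented for this sketch.

```python
class Interface:
    """Hypothetical model of an interface: default words plus per-type overrides."""

    def __init__(self, name, defaults):
        self.name = name
        self.defaults = dict(defaults)  # word -> default definition
        self.impls = {}                 # type name -> mini dictionary of overrides

    def implement(self, type_name, words):
        self.impls[type_name] = dict(words)

    def send(self, type_name, word, *args):
        # The type's mini dictionary wins; otherwise use the interface default.
        overrides = self.impls[type_name]
        definition = overrides.get(word, self.defaults[word])
        return definition(*args)


animal = Interface("Animal", {"say": lambda: "hello", "walk": lambda: None})
animal.implement("Duck", {"say": lambda: "quack"})
animal.implement("Fish", {"say": lambda: "bloop bloop", "walk": lambda: "flop"})
```

Duck only overrides say, so `animal.send("Duck", "walk")` falls through to the interface default, mirroring the empty `def walk end` in Animal.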
Does need a limit on stack-find. Add repeat-word?
Seems to be doing a misaligned ldr-pc. patch-ldr-pc! might be doing the calculation wrong.
Needs constants defined in the interp and out dictionaries: output constants would overlap with builder constants in top level.
MVP: Load linux.4th before cross.4th, escaped strings and tty-img[ available in the builder
Builder needs abilities to load files pre-runner, post-runner, and pre-cross.
Builder could use a single option with a value to flag runner, interp, and cross libraries.
north/words needs immediates loaded before the runner redefines def, or a way to switch between system and output mode / dictionaries.
Backported the needed TTY functions.
Needs:
" src/lib/case.4th" load core-init alias> defconst> const> " src/demos/tty/drawing.4th" load color-init
def fn s[ hello world ] fn[ swap write-line ] map-seqn end
Using defined? allows undefined symbols to be safely used.
Include into the builder
Actually copying fields at the end of the build is sounding best while keeping byte size up to date.
struct: creates a system struct and an empty struct with an associated word.
field:, inherits: and create-field functions add to system.
End of build: structs get updated, fields copied.
Startup: traverse all instances and add cs to pointers?
Rename offset32 to something like literal+cs or cs+int32? Then offset is free for relative pointer literals: eip + literal. eip+int32?
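As a sketch of the two literal kinds discussed above, the Python below treats a cs+int32 literal as an offset from the code segment base and an eip+int32 literal as an offset from the current instruction pointer. The function names and the 32 bit wraparound are assumptions for illustration.

```python
MASK32 = 0xFFFFFFFF


def cs_plus_int32(cs, literal):
    """Absolute address of a literal stored relative to the code segment base."""
    return (cs + literal) & MASK32


def eip_plus_int32(eip, literal):
    """Absolute address of a literal stored relative to the instruction pointer."""
    return (eip + literal) & MASK32


def relocate(offsets, cs):
    """Startup-style pass: turn every stored offset into a real pointer."""
    return [cs_plus_int32(cs, offset) for offset in offsets]
```

The `relocate` helper is one possible shape for the "traverse all instances and add cs" idea; it assumes the set of pointer fields is already known.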
load-core needs less use if it’s compiled in. Actually crashes if core is already loaded.
- files…: list of files
- -e: eval string
- -i: always prompt
- -D var=value: set var to value before any interpreting; may need a type indicator
- -D [data|return]-stack=number: stack sizes, location
- -v: verbosity
- -d: debug; may duplicate the above?
- -I: add search path
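The option list above can be prototyped with Python's argparse. This is only a model of the proposed interface: the program name `north`, the destination names, and letting -e, -D and -I repeat are assumptions, not decisions recorded in these notes.

```python
import argparse


def build_parser():
    """Hypothetical command line matching the option list in the notes."""
    p = argparse.ArgumentParser(prog="north")
    p.add_argument("files", nargs="*", help="list of files")
    p.add_argument("-e", dest="eval", action="append", default=[], help="eval string")
    p.add_argument("-i", dest="interactive", action="store_true", help="always prompt")
    p.add_argument("-D", dest="defines", action="append", default=[],
                   metavar="var=value", help="set var to value before interpreting")
    p.add_argument("-v", dest="verbosity", action="count", default=0, help="verbosity")
    p.add_argument("-d", dest="debug", action="store_true", help="debug")
    p.add_argument("-I", dest="search_paths", action="append", default=[],
                   help="add search path")
    return p


def parse_defines(pairs):
    """Split each var=value pair; values stay strings (no type indicator yet)."""
    defines = {}
    for pair in pairs:
        name, _, value = pair.partition("=")
        defines[name] = value
    return defines
```

Stack sizes then arrive as ordinary defines, for example `-D data-stack=1024`, which is where the type indicator question shows up: the value is still a string at this point.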
One use is trig functions for float32 and float64. Another is using the interpreter as a calculator.
Has no thumb2 so no coprocessor, no float.
Build included lib/math/float32.4th with constants computed using floats.
Add hardfloat, softfloat, and nofloat to the platform string?
A features list supplied to the builder?
Runtime detection of features? FPU? Thumb 1 or 2? Division?
runner/imports.4th crashed a bootstrap to static build. > stage1 checked worked around.
Buffer cells need to have a larger char field. Should have an indication of, and an option for, the terminal’s encoding. Internally utf32 will be used. No need to encode for utf8 if the terminal is utf32.
Move boot/cross.4th into src/cross/interp.4th? src/cross/words/interp.4th? src/cross/interp/words.4th?
Output cell-size: Use out-cell-size in cross compiling and other output words. out-op-size could replace -op-size too.
Placing each in separate dictionaries could work. Having defop/endop load/unload them could work for all but macros. Builder adds those words?
Only cell sized math. No floats. Minimal syscalls. No debugging aids. Barely able to load-core. A build option to strip unused words? Same words as SectorForth?
Output too? Buffered output: dumped out in the select loop when ready?
No aliases. Normalized vocab.
Standards may require a full ASN.1 stack.
Three sets of immediates:
- interpreter: top level, interpretable, used in evaluated defs
- cross compiling: interpretable, only used when [cross] compiling
- output: compiled into binary, listed in binary’s immediates.
Defines constants and immediates needed during compilation, and generates accessors needed in compiled output.
Use data var.
Call code field that then jumps to data.
Fixed?
Little endian puts LSB at the lower address. Stack ordering has LSB at the higher address. But byte order in code needs to be consistent on big and little endian systems, which may need 64 bit support in the integer reader instead of faking it.
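To make the byte order point concrete, here is a small Python illustration using the standard `struct` module and `int.to_bytes`. The `write_code_cell` helper is hypothetical, and picking little endian as the fixed code order is an assumption of the sketch.

```python
import struct

value = 0x11223344
little = struct.pack("<I", value)  # LSB at the lower address: 44 33 22 11
big = struct.pack(">I", value)     # MSB at the lower address: 11 22 33 44


def read_uint(data, byteorder):
    """Decode bytes with an explicit byte order instead of the host's."""
    return int.from_bytes(data, byteorder)


def write_code_cell(cell, size=4):
    """Emit a cell in one fixed order so the output is identical on any host."""
    return cell.to_bytes(size, "little")
```

A 64 bit cell goes through the same path with `size=8`, which is the case the integer reader would need to support rather than fake.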
When TTY output to a pipe is desired?
s" places the string onto the data stack, polluting the binary. Special interp version for defproper?
Would only work when the token does not span reads.
Vulkan on Android doesn’t report any devices to 32 bit code.
The ability to dump the program to source code into a loadable and buildable format.
Find equivalent words to add immediate and/or immediate-as after the definition.
Arrays, strings, lists, (function) pointers
Standard Forth uses these for stream output. Switch to < or > like standard stack ops?
,ins breaks the rule on ,word and .word.
CASE
N OF ... ENDOF
else...
ENDCASE
begin ... condition until
begin ... while condition... repeat
max init do ... loop
leave
return
+loop
0sp - zero stack to init
rot a b c -- b c a
-rot a b c -- c a b
pick -> 1 + overn
nip -> swap drop
tuck a b -- b a b
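The stack effects listed above can be checked against a Python list used as the data stack, with the top of stack at the end of the list. This models only rot, -rot, nip, tuck and 0sp; pick is left out because the indexing of overn is not specified here. The function names are stand-ins for the words.

```python
def rot(stack):
    """a b c -- b c a"""
    stack.append(stack.pop(-3))


def minus_rot(stack):
    """a b c -- c a b"""
    stack.insert(-2, stack.pop())


def nip(stack):
    """a b -- b (same as swap drop)"""
    del stack[-2]


def tuck(stack):
    """a b -- b a b"""
    stack.insert(-2, stack[-1])


def zero_sp(stack):
    """0sp: reset the stack to its initial, empty state."""
    stack.clear()
```

rot followed by -rot leaves the stack unchanged, which makes a quick sanity test for any real implementation.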
lshift rshift arshift
include file : loads file
include? file : loads file if it’s not already loaded
forget file : unload the file’s definitions (a word to free and forget?)
anew : called when entering a new file for bookkeeping for forget.
?
+!
struct: name
type field: name
...
Executable words that can be rebound with IS.
defer motd
' hello is motd
motd ( calls hello )
what's motd ( -> ' hello )
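A small Python model of the defer / is / what's behavior shown above. `Deferred`, `is_` and `whats` are names made up for the sketch; raising on an unbound word is an assumption about the desired behavior.

```python
class Deferred:
    """Hypothetical model of a deferred word that can be rebound at run time."""

    def __init__(self, name):
        self.name = name
        self.target = None

    def is_(self, target):
        """Model of IS: rebind the deferred word to another executable word."""
        self.target = target

    def whats(self):
        """Model of what's: return the word currently bound."""
        return self.target

    def __call__(self, *args):
        if self.target is None:
            raise RuntimeError(f"deferred word {self.name} is unbound")
        return self.target(*args)


def hello():
    return "hello"


motd = Deferred("motd")
motd.is_(hello)
```

Calling `motd()` now runs `hello`, and a later `motd.is_(other)` changes every caller at once because callers hold the deferred word, not its target.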
[IF] and other bracketed conditions behave like #if in C.
{ arg1 arg0 | local0 – result }
Use the reader.
#BEGIN_SRC
' hello literal hey assert-equals
#END_SRC
For op specific data: if the word is in R1, can that be used to address the data field for…? perhaps not for init.
Lacks thumb2 and therefore division and coprocessor ops.
Call frames, stack & data pointer math
Needs to interpret input when called while not reading additional input.
Have return1-n now.
#+BEGIN
def f ( x y z -- a b ) a b returns 3 2 end
def f ( x y z – a b ) [ a b ] return end
[ x y z ] f => [ a b ]
4 1 2 + dup 3 overn f
#+END
Need to better handle targets and loading their sources. Too much duplication. Pass sources in as args from Makefile? Every file requires what it needs?
Cat in the better compiler. Cat in just the assembler.
const> var> load
Creating dictionary entries: make-dict-entry create dict-entry accessors compiling-read with immediates: reuse comments & strings string appending
Dictionary entries that are and have real pointers. All their fields need CS added. Threads too: offset & indirect. Data stack: relative or absolute?
concat-seq down-stack uses revmap-stack? stack-find?
16 bit op codes: needs int32, literal, etc. to be immediates that write proper sized bytes to op sequence.
Would be nice to have colon definitions as code words.
Bash specifically.
load needs file opening and reading with a reader stack.
" returns a pointer & length when bash cross compiles.
" returns just a pointer in interp.
Maintaining the length somewhere is good.
s" c" tmp" d"; some only make sense when interpreting at top level.
Touches words that take a pointer or a pointer/length pair.
| fn   | TL storage | def storage    | returns               |
|------+------------+----------------+-----------------------|
| c"   | stack      | chars length   |                       |
| d"   | data       | data           | pointer length        |
| s"   | stack      | data           | pointer length        |
| tmp" | buffer     | pointer length |                       |
| "    | ??         | ??             | bash: pointer         |
|      |            |                | cross: pointer length |
|      |            |                | interp: pointer!      |
| fn   | TL storage | def storage    | returns        |
|------+------------+----------------+----------------|
| c"   | stack      | chars length   |                |
| d"   | data       | data           | pointer length |
| s"   | stack      | data           | pointer length |
| tmp" | buffer     | pointer length |                |
| "    | stack      | data           | pointer        |
Write code segment, data segment, and stack to an ELF blob. Each part needs a segment and program headers to load to same memory location. Dynamic linking would move these.
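A rough Python sketch of the program headers this would need, assuming a little endian ELF32 file and 4096 byte pages. The base address, the layout policy, and the helper names are illustrative assumptions; the field order follows the standard Elf32_Phdr structure.

```python
import struct

PT_LOAD = 1
PF_X, PF_W, PF_R = 1, 2, 4
PAGE = 4096
EHDR_SIZE, PHDR_SIZE = 52, 32  # ELF32 header and program header sizes


def align_up(n, alignment=PAGE):
    """Round n up to the next multiple of alignment (a power of two)."""
    return (n + alignment - 1) & ~(alignment - 1)


def program_header(offset, vaddr, filesz, memsz, flags, align=PAGE):
    # Elf32_Phdr: p_type p_offset p_vaddr p_paddr p_filesz p_memsz p_flags p_align
    return struct.pack("<8I", PT_LOAD, offset, vaddr, vaddr, filesz, memsz, flags, align)


def layout(code, data, stack_size, base=0x10000):
    """Three PT_LOAD headers that place code, data and stack at fixed addresses."""
    code_off = align_up(EHDR_SIZE + 3 * PHDR_SIZE)
    data_off = align_up(code_off + len(code))
    code_addr = base + code_off
    data_addr = base + data_off
    stack_addr = align_up(data_addr + len(data))
    return [
        program_header(code_off, code_addr, len(code), len(code), PF_R | PF_X),
        program_header(data_off, data_addr, len(data), len(data), PF_R | PF_W),
        # The stack occupies no file bytes; memsz alone reserves zeroed memory.
        program_header(0, stack_addr, 0, stack_size, PF_R | PF_W),
    ]
```

Each file offset is kept congruent to its virtual address modulo the page size, which loaders require for PT_LOAD segments. Fixed addresses like these are what dynamic linking would move.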
To unify interpretation of tokens and indirect threads.
Zero needed at end of definitions for decompile. [Data] segment needs to be aligned at 4096 bytes.
DRY up with comp' made immediate as ' to use compiling-dict.
move defining/*-boot files to interp/boot/defining, or put arch specific files under a cross/${arch}/
The locals…
values, pointers, sequences, offset code, live frames
Pointers to sequences of unknown size are one problem.
Calling ops like any other procedure makes subroutine call threading easy.
Needs an explicit BYE. exit gets out of a thread, restoring eip.
Mixing threading types? Puts responsibility on enter and exit to return to the right procedure caller.
EIP, LR
branch-link possible jump table
Inline exec
exp has a trick reusing results, powers of two can bit shift
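One reading of this note, taking exp to mean integer exponentiation: square-and-multiply reuses each squared result for the next bit of the exponent, and powers of two reduce to a shift. The Python below is a sketch under that assumption; the function names are invented.

```python
def power(base, exponent):
    """Square-and-multiply: each squaring is reused for the next exponent bit."""
    if exponent < 0:
        raise ValueError("negative exponents are not handled in this sketch")
    result = 1
    while exponent:
        if exponent & 1:
            result *= base
        base *= base      # reuse: base now holds the next power-of-two power
        exponent >>= 1
    return result


def power_of_two(exponent):
    """2**exponent as a single bit shift."""
    return 1 << exponent
```

The loop runs once per bit of the exponent instead of once per unit, so `power(3, 5)` takes three iterations.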
Top level interpreter and cross compiling ideally use the same vocabulary. Need to be able to enter and exit the cross compiling vocabulary. Likewise with the assemblers. Constants should appear in both environments. Compiling code should be able to alter the compiling environment.
IF ELSE THEN CASE OF ENDOF ENDCASE s” ” ’ s[
create create> lookup drop-dict
var> const> defcol def : immediate immediate-as string-const> symbol>
Used everywhere. Nice to be optional.
Combine multiple sets. Mix and match on a per file basis?
Drop in primitives for modules.
Store the dictionaries in a structure. Save and switch to them at will. Bit like a fork. Marks with dict and idict?
Can be mixed together.
Prefixed.
Essentially a list of word lists.
Default user to TopLevel.
Integration with files?
Lexical scoping.
Still doesn’t handle the mixed code segments.
module TopLevel endmodule
module A module B def sq arg0 arg0 * 1 return1-n end end
module C def sq arg1 arg1 * arg0 arg0 * 2 return2-n end end end
4 A :: B :: sq
A :: B include 5 sq
module D A :: B include
def mag arg1 sq arg0 sq + 2 return1-n end end
A :: C module E arg0 include def mag arg1 sq arg0 sq + 2 return1-n end end
module F ’ D :: mag import-as> mag-int end
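The module examples above amount to nested word lists with a search order. A Python model, assuming lookup goes own words first, then included modules (most recent first), then the parent; the `Module` class and `resolve` helper are hypothetical.

```python
class Module:
    """Hypothetical model: a word list with nested modules and includes."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.words = {}
        self.modules = {}
        self.included = []

    def module(self, name):
        if name not in self.modules:
            self.modules[name] = Module(name, self)
        return self.modules[name]

    def include(self, other):
        self.included.append(other)

    def lookup(self, word):
        if word in self.words:
            return self.words[word]
        for mod in reversed(self.included):
            if word in mod.words:
                return mod.words[word]
        if self.parent is not None:
            return self.parent.lookup(word)
        raise KeyError(word)


def resolve(root, path):
    """Resolve a prefixed path such as 'A :: B :: sq' starting from root."""
    *module_names, word = [part for part in path.split() if part != "::"]
    mod = root
    for name in module_names:
        mod = mod.modules[name]
    return mod.words[word]


top = Module("TopLevel")
b = top.module("A").module("B")
b.words["sq"] = lambda x: x * x
d = top.module("D")
d.include(b)
d.words["mag"] = lambda x, y: d.lookup("sq")(x) + d.lookup("sq")(y)
```

Here D sees sq only because it includes A :: B, matching the `module D A :: B include` line. The sketch does not address the mixed code segments problem.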
var> const> alias> defcol => defcolon def :
Scheme style symbol table
Head points to a Type that has a caller attribute. Tail points to the definition sequence.
[ exprs… ] => sequence ‘[ exprs… ] => sequence of resolved, but unexecuted, symbols
def name value def name s[ exprs… ]
def name [ exprs… ] def name colon[ exprs… ]
def name fun[ exprs… ] def name begin[ exprs… ] def name fun( args… ) exprs… end
def name fun exprs… end def name begin exprs… end def name fun( args… )[ exprs… ]
def name [ args… ] do exprs… end def name [ args… ] { exprs… }
Need to restore state. Globals make this tough, but compiler object with output stack, immediates, and words can handle that.
Repeated call sequences that have no side effects and return the same values on each call can set a generated binding.
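In Python terms this is memoization: the first call computes the value and later calls reuse a stored binding. The sketch below assumes the compiler already knows the call is free of side effects; `bind_pure` and the `calls` log are invented for the illustration.

```python
def bind_pure(fn):
    """Cache the result of a side-effect-free call, keyed by its arguments."""
    bindings = {}

    def wrapper(*args):
        if args not in bindings:
            bindings[args] = fn(*args)  # first call: generate the binding
        return bindings[args]           # later calls: reuse it

    wrapper.bindings = bindings
    return wrapper


calls = []  # instrumentation only, to show how often the body really runs


@bind_pure
def square(x):
    calls.append(x)
    return x * x
```

Calling `square(4)` twice runs the body once; the second call is answered from `square.bindings`.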
Calls in a definition are indexed from the register. Dictionary specified at compile time by specifying a type.
func> tokens+ func: tokens+
Difference in the interpretation of what gets read and returned.
func[ tokens… ] func [ tokens… ]
Reads in a colon definition.
func< (types|atoms)+ > func < (types|atoms)+ >
Needed for creating generic types via generator functions. Interpretation semantics: at minimum, words looked up, value placed on stack. ‘>’ completes the read with word values on stack.
func( tokens+ ) func{ tokens+ }
Immediates?
func” chars*” func/ chars*/
Easy(?) enough to implement algorithms to securely and efficiently interact with the world.
A sequence: ptr -> type, length, *data -> memory Even functions. Arguments are too. Calls would push the FP, return address, and 2 plus the number of arguments, and then the new frame pointer.
#+NAME: todos
grep --exclude \*~ -Hn -E "todo|fixme" -r ./src | sed -E -e 's/(.+):([0-9]+):(.*)\( +(todo.*|fixme.*) +(.*) +\)/\4 \5 [[file:\1::\2]]/g' -e 's:todo:TODO:g' -e 's:fixme:FIXME:g' | tee >(wc -l)