|
| Sexpr_t | mk_error (const char *fmt,...) CHECKPRINTF_pos2 |
| Sexpr_t | mk_error (Sexpr_t irritants, const char *fmt,...) CHECKPRINTF_pos3 |
In general, it is best not to do anything complex at static construction time. There is no guarantee that dependencies will be ready. In particular, use of the terminal may cause the system to crash. Most applications should leave any construction activities to more purposeful code at runtime, using only cells obtained through cons.
|
|
| Cell (Word_t typ, Word_t w1, Word_t w2) |
|
| Cell (Word_t typ, Int_t w1, Sexpr_t w2) |
|
| Cell (Word_t typ, Sexpr_t w1, Sexpr_t w2) |
Note that the type field occupies all of byte 0, although not all bits are required. This speeds up the very common type-checking operation, because no mask is required to extract the type bits.
The flags are in byte 1, and a mask operation is required to access the individual flag fields.
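The layout described above can be illustrated with a minimal sketch. `DemoHeader` and the `DEMO_FLAG_*` bits are hypothetical names invented for this example; the real Cell layout and flag values are not shown here. The point is only that the type is read with a plain byte load, while individual flags need a mask.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical header, for illustration only; not LambLisp's actual Cell.
struct DemoHeader {
    std::uint8_t bytes[4] = {0, 0, 0, 0};

    // The type owns all of byte 0, so no mask is needed to read or write it.
    int  type() const { return bytes[0]; }
    void type(int t)  { bytes[0] = static_cast<std::uint8_t>(t); }

    // Flags share byte 1, so individual flags are set, cleared, and tested with masks.
    int  flags() const          { return bytes[1]; }
    void flags_set(int f)       { bytes[1] = static_cast<std::uint8_t>(bytes[1] | f); }
    void flags_clr(int f)       { bytes[1] = static_cast<std::uint8_t>(bytes[1] & ~f); }
    bool flag_test(int f) const { return (bytes[1] & f) != 0; }
};

// Hypothetical flag bits, chosen only for the example.
constexpr int DEMO_FLAG_A = 0x01;
constexpr int DEMO_FLAG_B = 0x02;
```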
|
| void | zero () |
| Int_t | type (void) |
| void | type (Int_t t) |
| Int_t | flags (void) |
| void | flags_set (Int_t f) |
| void | flags_clr (Int_t f) |
| void | rplaca (Word_t p) |
| void | rplacd (Word_t p) |
| Ptr_t | get_car_addr () |
| Word_t | get_car (void) |
| Word_t | get_cdr (void) |
| Bool_t | as_Bool_t () |
| Char_t | as_Char_t () |
| Int_t | as_Int_t () |
| Real_t | as_Real_t () |
| Ptr_t | as_Ptr_t () |
| Charst_t | as_Charst_t () |
| Bytest_t | as_Bytest_t () |
| CharVec_t | as_CharVec_t () |
| ByteVec_t | as_ByteVec_t () |
| Portst_t | as_Portst_t () |
| Int_t | as_numerator () |
| Int_t | as_denominator () |
Hashing is used extensively throughout LambLisp. Each Cell has a hash value, calculated as follows:
If the Cell is a symbol, the Cell's hash value is the hash value of the symbol.
Otherwise, the address of the Cell is hashed, and that is the result.
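The two-case rule above can be sketched as follows. `HashDemoCell` and its fields are hypothetical, and `std::hash` stands in for whatever address hash LambLisp actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical stand-in for a Cell; only the fields needed for hashing are shown.
struct HashDemoCell {
    bool        is_symbol   = false;
    std::size_t symbol_hash = 0;  // assumed to be stored when the symbol is created

    std::size_t hash() const {
        // A symbol reports the hash of the symbol itself.
        if (is_symbol) return symbol_hash;
        // Any other cell is identified by its address, so the address is hashed.
        return std::hash<const void*>{}(static_cast<const void*>(this));
    }
};
```

Under this rule, two symbol cells carrying the same symbol hash compare equal by hash, while a non-symbol cell hashes consistently for as long as it stays at the same address.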
|
| Int_t | hash_sexpr (void) |
| Int_t | hash_contents (void) |
| Int_t | hash (void) |
| Sexpr_t | set (Int_t typ, Word_t w1, Word_t w2) |
| Sexpr_t | set (Int_t typ, Int_t a, Sexpr_t b) |
| Sexpr_t | set (Int_t typ, Sexpr_t a, Sexpr_t b) |
| Sexpr_t | set (Bool_t b) |
| Sexpr_t | set (Char_t c) |
| Sexpr_t | set (Int_t i) |
| Sexpr_t | set (Real_t r) |
| Sexpr_t | set (Port_t &p) |
| Sexpr_t | set (Int_t typ, Int_t a, Charst_t b) |
| Sexpr_t | set (Int_t typ, Int_t a, Bytest_t b) |
| Sexpr_t | set (Int_t typ, Int_t a, CharVec_t b) |
| Sexpr_t | set (Int_t typ, Int_t a, ByteVec_t b) |
If the cell type is already known, then these accessors can be used to efficiently access the cell contents.
|
| Sexpr_t | prechecked_anypair_get_car () |
| Sexpr_t | prechecked_anypair_get_cdr () |
| Int_t | prechecked_sym_heap_get_hash () |
| Charst_t | prechecked_sym_heap_get_chars () |
| void | prechecked_sym_heap_get_info (Int_t &hsh, Charst_t &chars) |
| CharVec_t | prechecked_str_heap_get_chars () |
| CharVec_t | prechecked_str_ext_get_chars () |
| CharVec_t | prechecked_str_imm_get_chars () |
| void | prechecked_bvec_imm_set_length (Int_t l) |
| Charst_t | prechecked_gensym_get_chars () |
| void | prechecked_gensym_get_info (Int_t &hsh, Charst_t &chars) |
| Sexpr_t | prechecked_error_get_irritants () |
| Sexpr_t | prechecked_error_get_str () |
| Charst_t | prechecked_error_get_chars () |
The any and mustbe accessors perform type checking and throw an error if an improper access is attempted. The coerce operators perform C coercion on their operand if possible, and otherwise throw an error.
|
| Bool_t | mustbe_Bool_t () |
| Char_t | mustbe_Char_t () |
| Int_t | mustbe_Int_t () |
| Real_t | mustbe_Real_t () |
| Sexpr_t | mustbe_any_str_t () |
| Sexpr_t | mustbe_cppobj_t () |
| CPPDeleterPtr | prechecked_cppobj_get_deleter () |
| Ptr_t | prechecked_cppobj_get_ptr () |
| CPPDeleterPtr | any_cppobj_get_deleter () |
| Ptr_t | any_cppobj_get_ptr () |
| void | any_cppobj_get_info (CPPDeleterPtr &d, Ptr_t &p) |
| Real_t | coerce_Real_t () |
| Real_t | coerce_Int_t () |
| Sexpr_t | anypair_get_car () |
| Sexpr_t | anypair_get_cdr () |
| Sexpr_t | error_get_str () |
| Sexpr_t | error_get_irritants () |
| Charst_t | error_get_chars () |
| Int_t | any_sym_get_hash () |
| Charst_t | any_sym_get_chars () |
| void | any_sym_get_info (Int_t &hsh, Charst_t &chars) |
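The difference between the prechecked, mustbe, and coerce accessor families can be sketched as below. `AccessDemoCell`, its type codes, and the helper constructors are hypothetical, and `std::runtime_error` is only a stand-in for LambLisp's own error mechanism, which is not shown in this section.

```cpp
#include <cassert>
#include <stdexcept>

enum AccessDemoType { ADEMO_INT, ADEMO_REAL };  // hypothetical type codes

struct AccessDemoCell {
    AccessDemoType typ;
    union { long i; double r; } u;

    // prechecked: the caller has already verified the type, so no check is made.
    long prechecked_int() const { return u.i; }

    // mustbe: verify the type and throw on an improper access.
    long mustbe_int() const {
        if (typ != ADEMO_INT) throw std::runtime_error("not an integer");
        return u.i;
    }

    // coerce: convert with an ordinary C conversion if possible, otherwise throw.
    double coerce_real() const {
        if (typ == ADEMO_REAL) return u.r;
        if (typ == ADEMO_INT)  return static_cast<double>(u.i);
        throw std::runtime_error("cannot coerce to real");
    }
};

inline AccessDemoCell demo_int(long v)    { AccessDemoCell c; c.typ = ADEMO_INT;  c.u.i = v; return c; }
inline AccessDemoCell demo_real(double v) { AccessDemoCell c; c.typ = ADEMO_REAL; c.u.r = v; return c; }
```

The prechecked form is the cheapest and is appropriate only after a type test has already been made, for example inside a branch that has just dispatched on the type.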
LambLisp supports several subtypes of strings. At the time of writing, there are heap-allocated strings (read-write), strings whose characters are stored outside of LambLisp's managed memory (aka external or EXT strings), and short immediate strings that are contained completely within a single cell. Additional subtypes (such as load-on-demand strings) may be added in future.
There are functions of the form any_xxx() that can be used with any subtype, and there are type-specific functions of the form any_xxx_yyy() for use where the type is already known, with yyy being a code hint.
|
| CharVec_t | any_str_get_chars () |
| Int_t | any_str_get_length () |
| void | any_str_get_info (Int_t &len, CharVec_t &chars) |
Within the LambLisp virtual machine, a Lisp vector is referred to as svec. This is an array of S-expressions having a fixed dimension. There are also bytevectors; these are an array of bytes, also of fixed dimension.
As with strings, there are several subtypes of S-expression vectors. There is a heap-allocated vector, which may be of any size. A second type of heap-allocated vector is always sized to a power of 2; it is provided to support efficient hash tables. There are immediate vectors, which may hold 0, 1, or 2 elements. The 2-element immediate vector can also be used as a hash table. This can reduce search time by half without requiring any heap allocation, at the cost of 1 extra cell allocation.
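One common reason a power-of-2 size helps hash tables is that the bucket index can be taken with a bit mask instead of a division. The sketch below shows that general technique; `demo_bucket_index` is a hypothetical helper, and whether LambLisp computes its indices this way is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: map a hash value to a bucket in a table whose size is a
// power of 2. For such sizes, (hash & (size - 1)) equals (hash % size).
inline std::size_t demo_bucket_index(std::size_t hash, std::size_t size_pow2) {
    return hash & (size_pow2 - 1);
}
```

With a size of 2, as in the 2-element immediate vector, the index is just the low bit of the hash. Entries are split across two buckets, so on average only about half of them need to be searched for any one key.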
Bytevectors are also diverse, having heap, external, and immediate variants. Heap bytevectors are allocated, obviously, on the system heap, and the heap space is freed when the bytevector is garbage-collected.
External bytevectors operate on bytes provided externally to LambLisp's memory manager. This space may have been dynamically allocated from outside LambLisp, or may be located in read-only memory. When a C++ object is injected into LambLisp, it can optionally be provided with a garbage collector callback; in that case the external object can be automatically garbage collected when it is no longer used in the Lisp program.
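The optional callback described above can be sketched as a pointer paired with a deleter function. All names here (`DemoCppObj`, `DemoDeleter`, `demo_finalize`, `DemoSensor`) are hypothetical; the actual `CPPDeleterPtr` signature is not given in this section, so a plain `void (*)(void*)` is assumed.

```cpp
#include <cassert>

using DemoDeleter = void (*)(void*);  // assumed callback shape

// Hypothetical stand-in for a cell that refers to a C++ object.
struct DemoCppObj {
    void*       ptr;
    DemoDeleter deleter;  // may be null: the object is then left alone
};

// What a collector might do once the cell is found to be unreachable.
inline void demo_finalize(DemoCppObj& o) {
    if (o.deleter && o.ptr) o.deleter(o.ptr);
    o.ptr = nullptr;
}

// Example external object with a live-instance counter for demonstration.
static int demo_live_sensors = 0;
struct DemoSensor {
    DemoSensor()  { ++demo_live_sensors; }
    ~DemoSensor() { --demo_live_sensors; }
};
inline void demo_delete_sensor(void* p) { delete static_cast<DemoSensor*>(p); }
```

An object that lives in read-only memory or is owned elsewhere would be registered with a null deleter, so finalizing its cell releases nothing.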
Immediate bytevectors are contained within a LambLisp Cell. The maximum size is limited by the word size of the underlying platform.
Within the LambLisp virtual machine, there are generic functions of the form any_xvec_xxx() that can operate on any xvec (svec or bvec) subtype, as well as type-specific functions of the form xvec_yyy_xxx(), where yyy is a code hint for the cell storage type (heap, immediate, ROM).
|
| void | any_svec_get_info (Int_t &Nelems, Sexpr_t *&elems) |
| Sexpr_t * | any_svec_get_elems () |
| Int_t | any_bvec_get_length () |
| ByteVec_t | any_bvec_get_elems () |
| void | any_bvec_get_info (Int_t &Nelems, ByteVec_t &elems) |
These functions convert a Cell (or parts of a Cell) to a printable representation of the S-expression contents of the Cell. Because environments are often included in the descendants of the Cell being printed, the depth of recursion into environments is limited.
|
| String | cell_name (void) |
| Charst_t | type_name (Int_t typ) |
| Charst_t | type_name (void) |
| Charst_t | gcstate_name (void) |
| String | dump () |
| String | str (Sexpr_t sx, Bool_t as_write_or_display, Int_t env_depth, Int_t max_depth) |
| String | str (Bool_t as_write_or_display, Int_t env_depth, Int_t max_depth) |
| String | str (Bool_t as_write_or_display, Int_t env_depth) |
| String | str (Bool_t as_write_or_display) |
| String | str (Int_t env_depth) |
| String | str (void) |
The Cell class is the foundational class for the LambLisp Virtual Machine.
Nearly all the Cell methods are inline for performance. A Cell has only getters and setters; there are no other side effects. Cell fields are changed only through explicit requests, and not as the result of any other mutations. This means that Cells do not participate in garbage collection, which is managed from outside the Cell class. Indeed, there is no concept within the Cell that there might be a plurality of them; only the behavior of a single Cell is defined.
A Cell may be one of several types. Historically, the number of types varied with the variant of Lisp. In LambLisp, there are Cell types that correspond directly to the types described in the Scheme RxRS specifications (integers, procedures etc), and there are additional types that implement underlying behavior to support the higher-level language (thunks, dictionaries).
For example, LambLisp supports several types of strings internally, depending on whether the characters are stored on the heap, in externally-provided memory, or immediately within the cell (for fast operations on short strings). Each of these string types is compatible with the string type in the Scheme RxRS specifications. Likewise, bytevectors have external and immediate variants.
LambLisp also supports specialized pair types, such as the LambLisp dictionary. These act as Lisp pairs for purposes of allocation and garbage collection, but their specialized operators are executed by the underlying LambLisp virtual machine and therefore operate at C++ speed.
The Cell type enumeration is purposefully ordered in such a way as to allow efficient integer comparisons instead of a C switch in most cases, enhancing runtime performance as well as garbage collection.
| Cell enumeration characteristics: |
| Cells requiring submarking |
| Cells requiring finalizing |
| Simple atoms: immediate cell types like bool char int real etc |
| External (non-GC) objects and pointers to native C++ code |
| Cells that point to C++-allocated objects, paired with optional GC finalizers. |
| "well-known singleton atoms" such as NIL, undefined, and void types. |
| Pair - the Cell type that responds to the Lisp pair? predicate. |
| Other pairs - specialized pair types |
With this ordering of Cell types, the solution to many common cases during expression evaluation can be obtained with an inequality, rather than a C++ switch statement. This provides a significant performance boost, because the most common cases are checked first, and the total number of cases is reduced.
| Cell optimized type tests |
| (Types < "simple atoms") need specialized GC marking and heap reclamation. |
| (Types <= "cells requiring finalizing") need specialized GC heap reclamation. |
| (Types >= T_NIL) are lists. |
| (Types > T_NIL) are pairs (but not only Lisp pairs, extended pairs too). |
| (Types != T_PAIR) are atoms, including numbers, strings, vectors, and singletons such as NIL. |
| (Types > T_PAIR) are extended pairs, used for specialized lists known to LambVM, but receiving regular pair GC processing. |
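The inequality tests in the table can be sketched with a small ordered enumeration. The enumerators below are hypothetical and far fewer than the real set; only their relative order follows the description above, with `DEMO_NIL` and `DEMO_PAIR` standing in for `T_NIL` and `T_PAIR`.

```cpp
#include <cassert>

// Hypothetical, abbreviated enumeration ordered as described above.
enum DemoType {
    DEMO_SVEC_HEAP,  // requires submarking
    DEMO_STR_HEAP,   // requires finalizing
    DEMO_BOOL,       // first of the simple atoms
    DEMO_INT,
    DEMO_NIL,        // well-known singleton; stands in for T_NIL
    DEMO_PAIR,       // stands in for T_PAIR
    DEMO_DICT        // an extended pair type
};

// Each test is a single integer comparison rather than a switch.
inline bool demo_needs_special_gc(DemoType t) { return t <  DEMO_BOOL; }
inline bool demo_is_list(DemoType t)          { return t >= DEMO_NIL;  }
inline bool demo_is_any_pair(DemoType t)      { return t >  DEMO_NIL;  }
inline bool demo_is_extended_pair(DemoType t) { return t >  DEMO_PAIR; }
```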
There are cases where the ordering does not help so much. For example, the LambLisp type system allows for several types of strings. Because the type system is ordered according to GC requirements, these are not adjacent in the enumeration and are less amenable to inequality tests. For these cases, there is a cell feature matrix available to be queried by type and feature. This allows the determination to be made with an array lookup, which is faster than several sequential tests or a C switch, putting a permanent cap on the cost of a type check. It also allows other details such as is-immediate? to be implemented inexpensively.
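A feature matrix of this kind can be sketched as a table of bit flags indexed by type. The type codes, feature bits, and `demo_has_feature` below are hypothetical; the actual matrix layout and feature names are not given in this section.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical type codes; the string subtypes are deliberately not adjacent.
enum DemoKind { DK_STR_HEAP, DK_INT, DK_STR_EXT, DK_STR_IMM, DK_PAIR, DK_COUNT };

// Hypothetical feature bits.
constexpr std::uint8_t DF_IS_STRING    = 0x01;
constexpr std::uint8_t DF_IS_IMMEDIATE = 0x02;

// One entry per type, in enumeration order.
constexpr std::uint8_t demo_features[DK_COUNT] = {
    DF_IS_STRING,                    // DK_STR_HEAP
    DF_IS_IMMEDIATE,                 // DK_INT
    DF_IS_STRING,                    // DK_STR_EXT
    DF_IS_STRING | DF_IS_IMMEDIATE,  // DK_STR_IMM
    0                                // DK_PAIR
};

// One array lookup and one mask, regardless of how many types share the feature.
inline bool demo_has_feature(DemoKind k, std::uint8_t feature) {
    return (demo_features[k] & feature) != 0;
}
```

The cost of a query stays constant as subtypes are added, since a new string subtype only needs a new table entry.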
The garbage collector states are ordered for similar reasons. See the garbage collector chapter for details on the Cell life cycle.