by David Leonard, 2004.
The Simple ECMAScript Engine ('SEE') is a parser and runtime library for the popular ECMAScript language. ECMAScript is the official name for what most people call JavaScript:
[ECMAscript] is based on several originating technologies, the most well known being JavaScript (Netscape) and JScript (Microsoft). The language was invented by Brendan Eich at Netscape and first appeared in that company's Navigator 2.0 browser. It has appeared in all subsequent browsers from Netscape and in all browsers from Microsoft starting with Internet Explorer 3.0. (ECMA-262 standard, 1999)
SEE [almost] fully complies with ECMAScript Edition 3 and with JavaScript 1.5. It has compatibility modes that allow it to run scripts developed under earlier versions of JavaScript, Microsoft's JScript and LiveScript.
⚠ Note: At the time of writing, the only non-compliant feature of SEE is the behaviour of String.toLowerCase() and String.toUpperCase() for characters in the non-ASCII range of Unicode.
This documentation is intended for developers wishing to incorporate SEE into their applications. It explains how you can use SEE to:
This documentation does not explain the ECMAScript language, nor discuss how to build the library on your system.
SEE includes an example application, called see-shell, which allows interactive use of the interpreter and demonstrates how to write host function objects.
I will use the phrase host application to mean your application, or any application that uses the SEE runtime environment as an auxiliary to some primary purpose. Examples of host applications are web browsers and scriptable XML processors.
Throughout this documentation, references are made to the C functions and macros provided by the SEE library. To avoid definitional redundancy and to improve precision, the reader is encouraged to examine the SEE header files to find the precise definitions and arguments of each function or macro. Signatures for C macros are given, but you should understand that the compiler cannot normally typecheck your use of those macros.
Where literal C code is used, it is typeset in a monospace font, like this:
if (failed) { abort(); }
Similarly, ECMAScript code is typeset in a sans serif font, like this:
window.location = "about:blank";
Elided code is indicated with an ellipsis: ...
Compiling SEE requires an ANSI C compiler. Although the SEE library is essentially self-contained, it does depend on you (the host application developer) providing the following:
SEE uses scripts from GNU autoconf to determine if these
are available, and also to determine other system-dependent
properties.
Host applications should #include <see/see.h> to access all the macros and function prototypes.
⚠ Note: A future release of SEE will use IBM's ICU library for Unicode support.
(As a developer you may find the need to edit header files and configure scripts to make SEE compile on your system. I would be interested in hearing what changes were needed so that future releases can supply this automatically for other users. Please send mail to leonard@users.sourceforge.net.nospam.)
The first step in executing ECMAScript program text with SEE is to create an interpreter instance. Each interpreter represents a reusable execution context. When created, it is initialised with all the standard ECMAScript objects (such as Math and String).
First, have your application allocate storage for a
SEE_interpreter
structure and then call
SEE_interpreter_init()
to initialise that structure.
void SEE_interpreter_init(struct SEE_interpreter *interp);
A pointer to the initialised SEE_interpreter
structure
is required for almost every function that SEE provides.
SEE supports multiple independent interpreter instances. This is useful, for example, in an HTML web browser application, where each window may need its own interpreter instance because the variables and bindings to built-in objects must be different and separate in each one.
SEE's functions are not inherently thread-safe,
but multiple different interpreters can be safely used by
different threads.
This is because all data used by the library is attached to the
SEE_interpreter
structure; there are no global
data structures.
Interpreters can remain
completely independent of each other in this way if you:
Here is an example where the storage has been allocated on the stack, and consequently the interpreter only exists until the function returns.
void example()
{
        struct SEE_interpreter interp_storage;

        SEE_interpreter_init(&interp_storage);
        /* now the interpreter is ready */
}
There is no mechanism for explicitly destroying an initialised interpreter; instead, SEE relies on the garbage collector to reclaim all unreferenced storage. If you want finalization semantics, you will need to arrange that yourself.
If SEE encounters an internal error (such as memory exhaustion, memory corruption, or a bug), it calls the global function pointer SEE_abort, passing it a pointer to the interpreter in context and a short descriptive message.
The SEE_abort hook initially points to a wrapper function that simply calls the C library function abort(). You can set the hook early if you want to handle errors more gracefully.
Its signature is:
extern void (*SEE_abort)(struct SEE_interpreter *interp, const char *msg);
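The hook pattern described above can be sketched in plain C. This is a minimal illustration only: it does not include SEE's headers, and `demo_context`, `demo_abort`, `demo_internal_error` and `demo_run_guarded` are hypothetical stand-ins, not part of the SEE API. Whether SEE tolerates a hook that returns is not stated here, so the replacement handler below leaves through longjmp() and the failing code never resumes.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct SEE_interpreter. */
struct demo_context { const char *name; };

/* Default handler: report the message, then abort(). */
static void demo_default_abort(struct demo_context *ctx, const char *msg)
{
    (void)ctx;
    fprintf(stderr, "fatal: %s\n", msg);
    abort();
}

/* Global hook, analogous to SEE_abort; the host may overwrite it. */
void (*demo_abort)(struct demo_context *ctx, const char *msg) = demo_default_abort;

static jmp_buf demo_recover;            /* host-side recovery point */
const char *demo_last_error = NULL;     /* message seen by the hook */

/* Replacement handler: remember the message and jump back to the host. */
static void demo_jump_abort(struct demo_context *ctx, const char *msg)
{
    (void)ctx;
    demo_last_error = msg;
    longjmp(demo_recover, 1);
}

/* What a library would do when it detects an internal error. */
static void demo_internal_error(struct demo_context *ctx, const char *msg)
{
    assert(demo_abort != NULL);
    (*demo_abort)(ctx, msg);
}

/* Installs the hook, triggers a simulated error, returns 1 if it was caught. */
int demo_run_guarded(void)
{
    demo_abort = demo_jump_abort;       /* set the hook early */
    if (setjmp(demo_recover) != 0)
        return 1;                       /* arrived here from the hook */
    demo_internal_error(NULL, "simulated internal error");
    return 0;                           /* not reached while the hook jumps away */
}
```

Note that the hook receives a NULL context in this sketch; a real handler should likewise be prepared for a missing interpreter pointer.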
SEE uses a garbage collecting memory allocator. It has global function pointers for memory allocation that the host application can configure. These hooks must be set up before any interpreter instances are created.
SEE manages memory by calling through the following function pointers, which you can change:
void * (*SEE_mem_malloc_hook)(struct SEE_interpreter *interp, unsigned int size);
void (*SEE_mem_free_hook)(struct SEE_interpreter *interp, void *ptr);
void (*SEE_mem_exhausted_hook)(struct SEE_interpreter *interp);
If SEE was compiled with Boehm-gc support, SEE_mem_malloc_hook
is initialised to point to a wrapper
around the GC_malloc()
function.
Otherwise, it is initialised as NULL
and your application must
set it before use.
If you intend to hook in your own memory allocator, be aware that any of
these hooks may be called with a NULL
argument,
indicating unknown context.
They must not throw exceptions, but may return NULL
on failure.
Currently, SEE_mem_free_hook
is unused,
although future versions may use it.
It should be left at its default value, NULL.
Instead of calling the hooks directly, application code should use these three convenient macros to allocate storage:
SEE_NEW()
- allocate structure storage in the context of an interpreter,
returning a pointer of given type
SEE_NEW_ARRAY()
- allocate storage for an array of elements of the given type
SEE_ALLOCA()
- allocate storage for an array on the stack (see alloca()
)
T * SEE_NEW(struct SEE_interpreter *interp, type T);
T * SEE_NEW_ARRAY(struct SEE_interpreter *interp, type T, int length);
T * SEE_ALLOCA(int length, type T);
A usage example is:
char *buffer = SEE_NEW_ARRAY(interp, char, 30);
These macros check for a memory allocation failure indicated by the
SEE_mem_malloc_hook
returning NULL.
In this event they will
assume an out-of-memory condition and call
the SEE_mem_exhausted_hook
.
This hook defaults
to a function that simply calls SEE_abort
.
Your application may prefer to change the SEE_mem_exhausted_hook
to handle this situation more gracefully.
It is worth familiarizing yourself with the macro definitions to
see what they do.
See <see/mem.h>
for the definitions.
Why is SEE so dependent on a garbage collector? Why doesn't it use reference counting?
This subsection is a short diversion on answering this good question. I have asked myself the same thing about other applications that use garbage collectors. I'll justify SEE's reliance on a garbage collector with the following reasons:
malloc()
and free()
)
would have significantly increased the
complexity, development time, run-time performance and code size of the library.
This would in turn affect those properties of the host application.
There are various convincing documents that explain why a garbage collector is a better general software engineering choice than (say) explicit reference counting.
(See the Advantages and Disadvantages of Conservative Garbage Collection.)
longjmp()
, because references on the stack would
become memory leaks or introduce too much fragile 'finally' code.
SEE's ultimate purpose is to execute user scripts. A full script, or a self-contained fragment of script is referred to as ECMAScript program text. You should execute program text using the following general strategy:
SEE_interpreter
(§2);
SEE_input
unicode character stream to
transport the
ECMAScript program text to SEE
(§4.2);
SEE_Global_eval()
to parse and
evaluate the stream;
The SEE_Global_eval() function can optionally return the value associated with the last statement executed. In a non-interactive environment this value is meaningless, and the result pointer ('res') given to SEE_Global_eval() may safely be NULL.
void SEE_Global_eval(struct SEE_interpreter *interp, struct SEE_input *input, struct SEE_value *res);
The program text is parsed and executed in one step with this
mechanism. If the evaluated text contains function definitions, the
function-objects created inside the interpreter will contain a
'precompiled' copy of the function text. This means it is safe
to destroy the input after it has been passed to
SEE_Global_eval()
.
Although the rest of this document explains the library API in detail, a complete, but simple example of using the SEE interpreter follows:
#include <stdio.h>
#include <stdlib.h>
#include <see/see.h>

/* Simple example of using the interpreter */
int
main()
{
        struct SEE_interpreter interp_storage, *interp;
        struct SEE_input *input;
        SEE_try_context_t try_ctxt;
        struct SEE_value result;
        char *program_text = "Math.sqrt(3 + 4 * 7) + 9\n";

        /* Initialise an interpreter */
        SEE_interpreter_init(&interp_storage);
        interp = &interp_storage;

        /* Create an input stream that provides program text */
        input = SEE_input_utf8(interp, program_text);

        /* Establish an exception context */
        SEE_TRY(interp, try_ctxt) {
                /* Call the program evaluator */
                SEE_Global_eval(interp, input, &result);

                /* Print the result */
                if (result.type == SEE_NUMBER)
                        printf("The answer is %f\n", result.u.number);
                else
                        printf("Unexpected answer\n");
        }

        /* Runs whether or not an exception was thrown */
        SEE_INPUT_CLOSE(input);

        /* Catch any exceptions */
        if (SEE_CAUGHT(try_ctxt)) {
                printf("Unexpected exception\n");
        }

        exit(0);
}
When this program is compiled, linked against the SEE library and the garbage collector library, and run, it should respond with:
The answer is 14.567764
This works because the value of the last executed statement in the
program_text
is stored in result
.
Calling SEE_Global_eval()
is essentially the same
as using ECMAScript's built-in eval()
function.
SEE uses Unicode character stream sources known as 'inputs' to consume (scan and parse) ECMAScript program text. An input is a stream of 32-bit Unicode UCS-4 characters. The stream is read, one character at a time, through its 'get next character' callback function.
The SEE library provides some useful stream constructors.
Each constructor creates a new SEE_input structure, initialised for reading the source it is supplied.
SEE_input_file()
- streams from a stdio FILE
pointer, and
understands Unicode byte-order marks in that file
SEE_input_utf8()
- streams the contents of a null-terminated char
array, and
assumes 7-bit ASCII or UTF-8 encoding
SEE_input_string()
- streams the contents of a SEE_string
structure
(which uses UTF-16 encoding, see §5.3)
struct SEE_input *SEE_input_file(struct SEE_interpreter *interp, FILE *f, const char *filename, const char *encoding);
struct SEE_input *SEE_input_utf8(struct SEE_interpreter *interp, const char *s);
struct SEE_input *SEE_input_string(struct SEE_interpreter *interp, struct SEE_string *s);
If these constructors do not adequately meet your needs, you are encouraged to develop your own. They're quite easy to do if a bit fiddly. I recommend you find the source to one of the above and modify it to do what you want.
The rest of this section describes the input API in detail, with a view towards custom input streams.
Why streams instead of strings? SEE uses a stream API for inputs rather than (say) a simple UCS-4 or UTF-8 string API, because Unicode-compliant applications will usually have a much better understanding of the encodings they are using than will SEE. With only a small amount of effort, streams provide this flexibility while avoiding unnecessary duplication or text storage.
Inputs are described by SEE_input structures. These are functionally similar to stdio's FILE type, or Java's ByteReader classes, except that they stream fully-decoded Unicode characters.
The SEE_input
structure is the focus of the API and maintains
the input's stream state and provides a pointer to its access (callback)
methods.
struct SEE_input {
        struct SEE_inputclass *inputclass;
        SEE_boolean_t eof;
        SEE_unicode_t lookahead;
        ...
};

struct SEE_inputclass {
        SEE_unicode_t (*next)(struct SEE_input *input);
        void (*close)(struct SEE_input *input);
};
The inputclass
member
indicates the access methods.
It is a pointer to a SEE_inputclass
structure. This class structure
contains function pointers to the two methods next()
and
close()
.
The next()
method should advance the input pointer, update the
eof
and lookahead
members of the
SEE_input
structure, and return the old value of
lookahead
.
SEE's scanner calls next()
repeatedly, until
the eof
member becomes true.
If the next()
method encounters an encoding error, it should
return SEE_INPUT_BADCHAR
and try to recover.
It can throw an exception if it wants to, but SEE does not attempt to
handle that: the application or user program will receive it.
If you don't particularly care about Unicode, it is helpful to know that 7-bit ASCII is a direct subset of Unicode, so you can just pass each of your ASCII chars as a 32-bit SEE_unicode_t masked with 0x7f. (See the references.)
The close()
method should deallocate any operating system
resources acquired during the input stream's construction.
By convention, SEE will not call the close()
method
of any application-supplied input. The onus is on the caller to close the
inputs supplied to SEE library functions.
For this reason, you should use the 'finally' behaviour described
in §4.3 to clean up a possibly failed stream.
The SEE_input
structure represents the current state of the
input stream.
Most importantly, the lookahead
field must always reflect the
next character that a call to next()
would return.
Once initialised, the filename
, first_lineno
and
interpreter
members of the SEE_input
structure
should not be changed.
The lookahead
and eof
members
should also be initialised before the structure is given to SEE.
You are encouraged to read the source code to the three constructors listed at the beginning of this section.
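The contract between next(), lookahead and eof can be sketched without SEE's headers. The block below is an illustration under stated assumptions: `demo_input`, `demo_inputclass`, `demo_unicode_t`, `demo_input_ascii_init` and `demo_input_drain` are hypothetical stand-ins that mirror the structures shown above, and the filename, first_lineno and interpreter members are left out. It streams a null-terminated 7-bit ASCII string, masking each byte with 0x7f as suggested earlier.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int demo_unicode_t;    /* stand-in for SEE_unicode_t */

struct demo_input;

struct demo_inputclass {
    demo_unicode_t (*next)(struct demo_input *input);
    void (*close)(struct demo_input *input);
};

struct demo_input {
    const struct demo_inputclass *inputclass;
    int eof;                    /* true once the stream is exhausted */
    demo_unicode_t lookahead;   /* what the next call to next() returns */
    const char *cursor;         /* private state: first unread byte */
};

/* Advance the stream, update eof/lookahead, return the old lookahead. */
static demo_unicode_t ascii_next(struct demo_input *input)
{
    demo_unicode_t old = input->lookahead;

    if (*input->cursor == '\0')
        input->eof = 1;
    else
        input->lookahead = (demo_unicode_t)(*input->cursor++ & 0x7f);
    return old;
}

static void ascii_close(struct demo_input *input)
{
    input->cursor = NULL;       /* nothing to release for an in-memory string */
}

static const struct demo_inputclass ascii_class = { ascii_next, ascii_close };

/* Constructor: lookahead and eof must be valid before the first next(). */
void demo_input_ascii_init(struct demo_input *input, const char *text)
{
    assert(input != NULL && text != NULL);
    input->inputclass = &ascii_class;
    input->eof = 0;
    input->lookahead = 0;
    input->cursor = text;
    ascii_next(input);          /* prime lookahead, or set eof for empty text */
}

/* Read like a scanner would: call next() until eof, then close the input. */
size_t demo_input_drain(struct demo_input *input, demo_unicode_t *out, size_t max)
{
    size_t n = 0;

    while (!input->eof && n < max)
        out[n++] = input->inputclass->next(input);
    input->inputclass->close(input);
    return n;
}
```

Priming the lookahead inside the constructor is one way to satisfy the rule that lookahead and eof are initialised before the structure is handed over; it also makes an empty string report eof immediately.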
Callers will use these convenience macros to call input methods on a constructed input stream, rather than calling through the class structure directly:
SEE_INPUT_NEXT()
-
Consumes and returns the next Unicode character from the stream
SEE_INPUT_CLOSE()
-
Releases any resources obtained by the stream
SEE_unicode_t SEE_INPUT_NEXT(struct SEE_input *input);
void SEE_INPUT_CLOSE(struct SEE_input *input);
SEE's exceptions are implemented using C's setjmp()/longjmp() mechanism. SEE provides macros that establish a try-catch context, and test later if a try block terminated abnormally (i.e. due to a thrown exception). Typical code that uses try-catch looks like this:
struct SEE_interpreter *interp;
struct SEE_value *e;
SEE_try_context_t c; /* storage for the try-catch context */
...
SEE_TRY(interp, c) {
        /*
         * Now inside a protected "try block".
         * The following calls may throw exceptions if they want,
         * causing the try block to exit immediately.
         */
        do_something();
        do_something_else();
        /*
         * Because the SEE_TRY macro expands into a 'for' loop,
         * avoid using 'break', or 'return' statements.
         * If you must leave the try block, use 'continue;',
         * or throw an exception.
         */
}
/* Code placed here always runs. */
do_cleanup();
if ((e = SEE_CAUGHT(c))) {
        /* Handle the thrown exception 'e', somehow. */
        handle_exception(e);
        /* or you can throw it up to the next try-catch like so: */
        SEE_THROW(interp, e);
}
...
Do not return, goto or break out of a try block; the macro does not check for this, and the try-catch context may not be restored properly, causing all sorts of havoc.
Exceptions thrown outside of any try-catch context will cause the interpreter to abort.
If you are not interested in catching exceptions, and only want the 'finally' behaviour, use the following idiom:
SEE_TRY(interp, c) {
        do_something();
}
do_finally(); /* optional */
SEE_DEFAULT_CATCH(interp, c);
The signatures of these macros are:
SEE_TRY(struct SEE_interpreter *interp, SEE_try_context_t ctxt) { stmt... }
struct SEE_object *SEE_CAUGHT(SEE_try_context_t ctxt);
void SEE_THROW(struct SEE_interpreter *interp, struct SEE_object *exception);
void SEE_DEFAULT_CATCH(struct SEE_interpreter *interp, SEE_try_context_t ctxt);
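To see why a try block built on setjmp()/longjmp() behaves this way, here is a small self-contained sketch of a try macro that expands into a 'for' loop. It is not SEE's implementation: `DEMO_TRY`, `DEMO_CAUGHT`, `demo_throw` and `demo_run` are hypothetical names, exceptions are plain C strings, and the current context lives in a global rather than in an interpreter structure.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_try {
    jmp_buf env;
    struct demo_try *previous;      /* enclosing try context, if any */
    const char *volatile thrown;    /* NULL when nothing was thrown */
    int done;
};

static struct demo_try *demo_current = NULL;

/* The loop body runs at most once; the increment restores the old context. */
#define DEMO_TRY(c)                                              \
    for ((c).previous = demo_current, demo_current = &(c),       \
             (c).thrown = NULL, (c).done = 0;                    \
         !(c).done && setjmp((c).env) == 0;                      \
         (c).done = 1, demo_current = (c).previous)

#define DEMO_CAUGHT(c) ((c).thrown)

/* Unwind to the innermost try block, or abort if there is none. */
static void demo_throw(const char *what)
{
    struct demo_try *c = demo_current;

    assert(what != NULL);
    if (c == NULL)
        abort();                    /* thrown outside any try context */
    c->thrown = what;
    demo_current = c->previous;     /* restore before leaving the block */
    longjmp(c->env, 1);
}

/* Returns the caught message or NULL; *steps counts statements that ran. */
const char *demo_run(int should_throw, volatile int *steps)
{
    struct demo_try c;

    DEMO_TRY(c) {
        ++*steps;
        if (should_throw)
            demo_throw("boom");
        ++*steps;                   /* skipped when an exception was thrown */
    }
    ++*steps;                       /* always runs, like a 'finally' clause */
    return DEMO_CAUGHT(c);
}
```

In this sketch a 'break' or 'return' inside the block would skip the loop's increment expression, leaving `demo_current` pointing at a context whose stack frame is about to disappear. A 'continue' is harmless because it still runs the increment. That is the same hazard the warning above describes.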
Eventually, your host application will want to pass numbers, strings and complex value objects about, through the SEE interpreter, to and from the user code. This section describes the C interface to ECMAScript values.
The ECMAScript language has exactly six types of value. They are:
undefined
null
true
and false
The SEE_value
structure can represent values of all of
these types.
struct SEE_value {
        enum { ... } type;
        union {
                SEE_boolean_t boolean;
                SEE_number_t number;
                struct SEE_string * string;
                struct SEE_object * object;
                ...
        } u;
};
The first member, type
, is the discriminator,
and must be one of the enumerated values
SEE_UNDEFINED
, SEE_NULL
,
SEE_BOOLEAN
, SEE_NUMBER
, SEE_STRING
or
SEE_OBJECT
.
Depending on the type,
you can directly access the corresponding value of a
SEE_value
.
If the value variable is declared as:
struct SEE_value v;
then the value that it holds is directly accessed through
its union member, v.u
.
The following table shows which member of v.u is valid for each value of v.type:
v.type | Valid member | Member's type |
---|---|---|
SEE_UNDEFINED | n/a | |
SEE_NULL | n/a | |
SEE_BOOLEAN | v.u.boolean | SEE_boolean_t |
SEE_NUMBER | v.u.number | SEE_number_t |
SEE_STRING | v.u.string | struct SEE_string * |
SEE_OBJECT | v.u.object | struct SEE_object * |
Two other types (SEE_COMPLETION
and SEE_REFERENCE
)
are only used internally to SEE and are not documented here.
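The discriminated-union pattern behind SEE_value can be shown in a few lines of standalone C. This is only a sketch with hypothetical stand-in types (`demo_value`, `demo_type`, `demo_describe`) covering four of the six types; the point is that code should always switch on the type field before touching a union member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

enum demo_type { DEMO_UNDEFINED, DEMO_NULL, DEMO_BOOLEAN, DEMO_NUMBER };

struct demo_value {
    enum demo_type type;        /* discriminator: says which member is valid */
    union {
        int boolean;
        double number;
    } u;
};

/* Writes a short description of v into buf and returns buf. */
const char *demo_describe(const struct demo_value *v, char *buf, size_t len)
{
    assert(v != NULL && buf != NULL && len > 0);
    switch (v->type) {
    case DEMO_UNDEFINED:        /* no union member is valid */
        snprintf(buf, len, "undefined");
        break;
    case DEMO_NULL:             /* no union member is valid */
        snprintf(buf, len, "null");
        break;
    case DEMO_BOOLEAN:
        snprintf(buf, len, "%s", v->u.boolean ? "true" : "false");
        break;
    case DEMO_NUMBER:
        snprintf(buf, len, "%g", v->u.number);
        break;
    }
    return buf;
}
```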
To convert/coerce values into values of a different type, use the utility functions described in §5.1.
To create new values in struct SEE_value
structures,
use the following initialisation macros. They first set the type
field and then copy the second parameter into the appropriate union field.
It is fine to use a local variable for a struct SEE_value
,
because the garbage collector can see what is being used from the stack.
void SEE_SET_UNDEFINED(struct SEE_value *val);
void SEE_SET_NULL(struct SEE_value *val);
void SEE_SET_OBJECT(struct SEE_value *val, struct SEE_object *obj);
void SEE_SET_STRING(struct SEE_value *val, struct SEE_string *str);
void SEE_SET_NUMBER(struct SEE_value *val, SEE_number_t num);
void SEE_SET_BOOLEAN(struct SEE_value *val, SEE_boolean_t bool);
Most SEE_values are passed between SEE library functions using pointers. This is because the general contract is that the caller supplies storage for the return value, while other pointer arguments are treated as read-only. Conventionally, the result value pointer is provided as the last argument to these functions and is named res.
⚠ Note:
The SEE_VALUE_COPY()
macro breaks this convention
by instead following the better-known idiom of memcpy()
, and
placing the destination first.
Avoid storing a struct SEE_value
as a pointer.
Instead, extract and copy values into storage using the following macro:
void SEE_VALUE_COPY(struct SEE_value *dst, struct SEE_value *src);
A simple pitfall to avoid when passing values to SEE functions is to use value storage as both a parameter to the function and as the return result storage. Do not do this. It is possible that the function will initialise its return storage before it accesses its parameters.
The ECMAScript language specification provides for conversion functions that the host application developer may find useful. They convert arbitrary values into values of a known type:
SEE_ToPrimitive()
- Returns a non-object value. It calls the
object's DefaultValue()
method
(see §6.3)
SEE_ToBoolean()
- Returns a value of type SEE_BOOLEAN
SEE_ToNumber()
- Returns a value of type SEE_NUMBER
SEE_ToInteger()
- Returns a value of type SEE_NUMBER
that is also a finite integer
SEE_ToString()
- Returns a value of type SEE_STRING
SEE_ToObject()
- Returns a value of type SEE_OBJECT
using the String
,
Number
and
Boolean
constructors
void SEE_ToPrimitive(struct SEE_interpreter *interp, struct SEE_value *val, struct SEE_value *hint, struct SEE_value *res);
void SEE_ToBoolean(struct SEE_interpreter *interp, struct SEE_value *val, struct SEE_value *res);
void SEE_ToNumber(struct SEE_interpreter *interp, struct SEE_value *val, struct SEE_value *res);
void SEE_ToInteger(struct SEE_interpreter *interp, struct SEE_value *val, struct SEE_value *res);
void SEE_ToString(struct SEE_interpreter *interp, struct SEE_value *val, struct SEE_value *res);
void SEE_ToObject(struct SEE_interpreter *interp, struct SEE_value *val, struct SEE_value *res);
The undefined and null types have exactly one implied value each, namely
undefined
and null
.
⚠ Note:
null
is not an object type, and is
not related to C's NULL
constant.
Boolean types (SEE_boolean_t
) have values of either true (non-zero) or false (zero).
Number values (SEE_number_t
) are IEEE 754 signed floating
point numbers, normally corresponding to the C compiler's built-in
double
type.
The following macros may be used to find information about a number value.
(They assume that the type
is SEE_NUMBER
):
SEE_NUMBER_ISNAN()
- return true if the number is NaN (not a number)
SEE_NUMBER_ISPINF()
- return true if number is +∞
SEE_NUMBER_ISNINF()
- return true if number is -∞
SEE_NUMBER_ISINF()
- return true if number is ±∞
SEE_NUMBER_ISFINITE()
- return true if the number is finite (none of the above)
int SEE_NUMBER_ISNAN(struct SEE_value *val);
int SEE_NUMBER_ISPINF(struct SEE_value *val);
int SEE_NUMBER_ISNINF(struct SEE_value *val);
int SEE_NUMBER_ISINF(struct SEE_value *val);
int SEE_NUMBER_ISFINITE(struct SEE_value *val);
SEE also provides constants SEE_Infinity
and SEE_NaN
which may be stored in number values, but should not be used to compare
number values. Use the macros mentioned previously, instead.
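The reason for preferring classification macros over comparison is easiest to see with NaN, which under IEEE 754 compares unequal to every value, itself included. The sketch below uses only the standard C99 <math.h> facilities; the `demo_` functions are hypothetical helpers and say nothing about how the SEE_NUMBER_IS*() macros are implemented.

```c
#include <math.h>

/* Wrong: an equality test against NaN is never true. */
int demo_is_nan_by_compare(double x)
{
    return x == NAN;
}

/* Right: ask for the classification instead of comparing. */
int demo_is_nan(double x)
{
    return isnan(x) != 0;
}

/* Mirrors the five categories listed above as a short label. */
const char *demo_classify(double x)
{
    if (isnan(x))
        return "nan";
    if (isinf(x))
        return x > 0 ? "+inf" : "-inf";
    return "finite";
}
```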
Numbers (and other values) may be converted to integers using the functions
SEE_ToInt32()
, SEE_ToUint32()
or
SEE_ToUint16()
.
SEE_int32_t SEE_ToInt32(struct SEE_interpreter *interp, struct SEE_value *val);
SEE_uint32_t SEE_ToUint32(struct SEE_interpreter *interp, struct SEE_value *val);
SEE_uint16_t SEE_ToUint16(struct SEE_interpreter *interp, struct SEE_value *val);
SEE provides three data types for integers:
SEE_uint16_t
- 16 bit unsigned integer
SEE_uint32_t
- 32 bit unsigned integer
SEE_int32_t
- 32 bit signed integer
String values are pointers to SEE_string structures that hold UTF-16 strings.
The structure is defined something like this:
struct SEE_string {
        unsigned int length;
        SEE_char_t *data;
        ...
};
The useful members are:
length
- Length of string content
data
- Read-only storage for the string content (UTF-16 characters)
Be aware that other strings may come to share the string's data, such
as by forming substrings.
A string's content must not be modified after construction because of this
risk. However, the length
field of a string may be changed to a smaller value
at any time without concern.
The SEE_char_t
type represents each Unicode character in the
string. It is equivalent to a 16-bit unsigned integer.
To manipulate a string, first create a new string using one of the following:
SEE_string_new()
- create a new, empty string
SEE_string_dup()
- create a new string with duplicate content
SEE_string_concat()
- create a new string by duplicating
two other strings
SEE_string_sprintf()
- create a new string using
printf
-like arguments (forced to 7-bit ASCII)
SEE_string_vsprintf()
- create a new string using
vprintf
-like arguments (forced to 7-bit ASCII)
struct SEE_string *SEE_string_new(struct SEE_interpreter *interp, unsigned int space);
struct SEE_string *SEE_string_dup(struct SEE_interpreter *interp, struct SEE_string *s);
struct SEE_string *SEE_string_concat(struct SEE_interpreter *interp, struct SEE_string *s1, struct SEE_string *s2);
struct SEE_string *SEE_string_sprintf(struct SEE_interpreter *interp, const char *fmt, ...);
struct SEE_string *SEE_string_vsprintf(struct SEE_interpreter *interp, const char *fmt, va_list ap);
And then, before passing your new string to any other function, append characters to it using the following:
SEE_string_addch()
- append a UTF-16 character
SEE_string_append()
- append contents of another string
SEE_string_append_int()
- append a signed integer's
representation in base 10
void SEE_string_addch(struct SEE_string *s, SEE_char_t ch);
void SEE_string_append(struct SEE_string *s, const struct SEE_string *sffx);
void SEE_string_append_int(struct SEE_string *s, int i);
Once a new string has been passed to any other SEE function, it is generally unwise to modify its contents in any way. It is OK to share a string between different interpreters if the string is guaranteed not to be modified, and the garbage collector can cope with it.
All strings in SEE use UTF-16 encoding, meaning that in some cases
you may need to be aware of Unicode 'surrogate' characters. If the host
application really needs UCS-4 strings (which is subtly different to UTF-16),
you will need to write your own converter function. Use the implementation of
SEE_input_string()
(§4.2) as
the basis for such a converter.
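As a starting point, a UTF-16 to UCS-4 converter can be sketched independently of SEE. The function below works on plain `uint16_t` and `uint32_t` arrays rather than on SEE_string, and its name and the choice of U+FFFD as the substitute for malformed input are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BADCHAR 0xFFFDu    /* U+FFFD REPLACEMENT CHARACTER (assumed policy) */

/*
 * Converts srclen UTF-16 code units into UCS-4 code points.
 * dst must have room for srclen elements; returns the number written.
 */
size_t demo_utf16_to_ucs4(const uint16_t *src, size_t srclen, uint32_t *dst)
{
    size_t i = 0, n = 0;

    assert(src != NULL && dst != NULL);
    while (i < srclen) {
        uint32_t c = src[i++];

        if (c >= 0xD800 && c <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (i < srclen && src[i] >= 0xDC00 && src[i] <= 0xDFFF)
                c = 0x10000 + ((c - 0xD800) << 10) + (uint32_t)(src[i++] - 0xDC00);
            else
                c = DEMO_BADCHAR;
        } else if (c >= 0xDC00 && c <= 0xDFFF) {
            /* Low surrogate with no preceding high surrogate. */
            c = DEMO_BADCHAR;
        }
        dst[n++] = c;
    }
    return n;
}
```

For example, the code units 0xD834 0xDD1E form a surrogate pair and decode to the single code point U+1D11E, so the output can be shorter than the input but never longer.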
⚠ Note:
The SEE_string_sprintf()
and
SEE_string_vsprintf()
functions only generate Unicode
characters that lie in the 7-bit ASCII subset of Unicode.
Other string functions provided are:
SEE_string_substr()
- create a read-only substring
SEE_string_literal()
- create a copy of the string, escaping chars and
enclosing it in double quotes ("
)
SEE_string_fputs()
- output the string to the stdio file using UTF-8 encoding,
returns EOF
on error
SEE_string_cmp()
- compares two strings, like strcmp()
struct SEE_string *SEE_string_substr(struct SEE_interpreter *interp, struct SEE_string *s, int index, int length);
struct SEE_string *SEE_string_literal(struct SEE_interpreter *interp, const struct SEE_string *s);
int SEE_string_fputs(const struct SEE_string *s, FILE *file);
int SEE_string_cmp(const struct SEE_string *s1, const struct SEE_string *s2);
If you find yourself comparing strings a lot, you may find it easier to
compare internalised strings.
These are strings that are kept in a fast
hash table and may be compared equal using pointer equality.
The SEE_intern()
function returns an 'internalized' copy of the
given string and is very fast on already-interned strings.
It is worth using in lieu of SEE_string_cmp()
if the strings
are likely to be intern'ed already. (For example, all property names in
the standard library are.)
struct SEE_string *SEE_intern(struct SEE_interpreter *interp, struct SEE_string *s);
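The idea behind interning can be sketched with ordinary C strings. This is not SEE's intern table: `demo_intern` and its fixed-size chained hash table are hypothetical, entries are never freed (SEE would leave that to the garbage collector), and the hash function is an arbitrary choice. What it shows is the property the text relies on: equal content always yields the same pointer, so later comparisons need only pointer equality.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_BUCKETS 64

struct demo_entry {
    struct demo_entry *next;    /* chain of entries sharing a bucket */
    char *text;                 /* the canonical copy */
};

static struct demo_entry *demo_table[DEMO_BUCKETS];

static unsigned demo_hash(const char *s)
{
    unsigned h = 5381;

    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h % DEMO_BUCKETS;
}

/* Returns the canonical copy of s, creating it on first use. */
const char *demo_intern(const char *s)
{
    unsigned h;
    struct demo_entry *e;

    assert(s != NULL);
    h = demo_hash(s);
    for (e = demo_table[h]; e != NULL; e = e->next)
        if (strcmp(e->text, s) == 0)
            return e->text;     /* already interned: cheap path */

    e = malloc(sizeof *e);
    if (e == NULL)
        abort();
    e->text = malloc(strlen(s) + 1);
    if (e->text == NULL)
        abort();
    strcpy(e->text, s);
    e->next = demo_table[h];
    demo_table[h] = e;
    return e->text;
}
```

Once two strings have both been passed through `demo_intern()`, `a == b` answers the question that `strcmp(a, b) == 0` would otherwise have to.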
SEE supports statically initialised strings. If you have a large number of strings to create and use (e.g. properties and method names) over many interpreter instances, statically initialised strings can save space, and improve performance.
A statically initialised string, 'Hello, world
',
would look like this:
/* Example of a statically-initialised UTF-16 string */
static SEE_char_t hello_world_chars[12] = {
        'H', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd'
};
static struct SEE_string hello_world = {
        12,                     /* length */
        hello_world_chars       /* data */
};
The main problem with static strings is finding an elegant way to initialise the strings' content. There is no simple way in ANSI C to have the compiler convert common ASCII strings into UTF-16 arrays. The approach taken by SEE in supporting all the standard ECMAScript objects and methods, is to generate C program text from a file of ASCII strings during the build process.
If an application wishes to internalise strings across interpreters,
it must add all its global strings into the global
intern table before creating any interpreters. This is done by calling
SEE_intern_global()
for each global string.
void SEE_intern_global(struct SEE_string *str);
When creating global strings, the application can either use
the static initialisation technique described above, or create
interpreter-less strings by passing a NULL
interpreter pointer
to the various string creation functions. Such strings should be immediately
placed into the global intern table.
ECMAScript uses a prototype-inheritance object model with simple named properties. More information on the object model can be found in the ECMA-262 standard, and in other JavaScript references.
This section describes how in-memory objects can be accessed and manipulated (the 'client interface'), and also how host applications can expose their own application objects and methods (the 'implementation interface').
Object instances are implemented as in-memory structures, with an
objectclass
pointer to a table of operational methods.
Object references are held inside values with a type field
of SEE_OBJECT
(see §5).
All object values are pointers to object instances.
The pointers are of type struct SEE_object *
.
No object pointer in a SEE_value
should ever point to
NULL
.
When I know that I am dealing with objects, I find it convenient to work with struct SEE_object * pointers directly instead of using struct SEE_value.
To use an existing object instance, you should interact with it using only the following macros:
SEE_OBJECT_GET()
- retrieve a named property or return undefined
('o.prop
')
SEE_OBJECT_PUT()
- create/update a named property
('o.prop = val
')
SEE_OBJECT_CANPUT()
- returns true if the property can be changed
SEE_OBJECT_HASPROPERTY()
- tests for existence of a property
SEE_OBJECT_DELETE()
- delete a property; returns true on success
('delete o.prop
')
SEE_OBJECT_DEFAULTVALUE()
- returns the string or number value associated with the object
SEE_OBJECT_CONSTRUCT()
- call object as a constructor
('new o(...)
')
SEE_OBJECT_CALL()
- call object as a function ('o(...)
')
SEE_OBJECT_HASINSTANCE()
- return true if the objects are related
('x instanceof o
')
SEE_OBJECT_ENUMERATOR()
- create a property enumerator
('for (i in o) ...
')
void SEE_OBJECT_GET(struct SEE_interpreter *interp, struct SEE_object *obj, struct SEE_string *prop, struct SEE_value *res);
void SEE_OBJECT_PUT(struct SEE_interpreter *interp, struct SEE_object *obj, struct SEE_string *prop, struct SEE_value *val, int flags);
int SEE_OBJECT_CANPUT(struct SEE_interpreter *interp, struct SEE_object *obj, struct SEE_string *prop);
int SEE_OBJECT_HASPROPERTY(struct SEE_interpreter *interp, struct SEE_object *obj, struct SEE_string *prop);
int SEE_OBJECT_DELETE(struct SEE_interpreter *interp, struct SEE_object *obj, struct SEE_string *prop);
void SEE_OBJECT_DEFAULTVALUE(struct SEE_interpreter *interp, struct SEE_object *obj, struct SEE_value *hint, struct SEE_value *res);
void SEE_OBJECT_CONSTRUCT(struct SEE_interpreter *interp, struct SEE_object *obj, struct SEE_object *thisobj, int argc, struct SEE_value **argv, struct SEE_value *res);
void SEE_OBJECT_CALL(struct SEE_interpreter *interp, struct SEE_object *obj, struct SEE_object *thisobj, int argc, struct SEE_value **argv, struct SEE_value *res);
int SEE_OBJECT_HASINSTANCE(struct SEE_interpreter *interp, struct SEE_object *obj, struct SEE_value *instance);
struct SEE_enum *SEE_OBJECT_ENUMERATOR(struct SEE_interpreter *interp, struct SEE_object *obj);
⚠ Note:
The last four macros (SEE_OBJECT_CONSTRUCT(), SEE_OBJECT_CALL(), SEE_OBJECT_HASINSTANCE(), SEE_OBJECT_ENUMERATOR()) will not check if the object has a NULL pointer for the corresponding object method. Calling them on an unchecked object will probably result in a memory access violation (e.g. segmentation fault).
The following macros return true if the object safely provides
those methods:
SEE_OBJECT_HAS_CALL() - object can be called with SEE_OBJECT_CALL()
SEE_OBJECT_HAS_CONSTRUCT() - object can be called with SEE_OBJECT_CONSTRUCT()
SEE_OBJECT_HAS_HASINSTANCE() - object can be called with SEE_OBJECT_HASINSTANCE()
SEE_OBJECT_HAS_ENUMERATOR() - object can be called with SEE_OBJECT_ENUMERATOR()
int SEE_OBJECT_HAS_CALL(struct SEE_object *obj);
int SEE_OBJECT_HAS_CONSTRUCT(struct SEE_object *obj);
int SEE_OBJECT_HAS_HASINSTANCE(struct SEE_object *obj);
int SEE_OBJECT_HAS_ENUMERATOR(struct SEE_object *obj);
When storing properties in an object with SEE_OBJECT_PUT(), a flags parameter is required. In normal operation, this flag should be supplied as zero, but when populating an object with its properties for the first time, the following bit flags can be used:
Flag | Meaning |
---|---|
SEE_ATTR_READONLY | Future assignments (puts) on this property will fail |
SEE_ATTR_DONTENUM | Enumerators will not list this property and will hide inherited prototype properties of the same name (see §6.2) |
SEE_ATTR_DONTDELETE | Future deletes on this property will fail |
A property enumerator is a mechanism for discovering the properties that an object contains. The language exercises this with its for (var v in ...) construct. The results of the enumeration need not be sorted, nor even be in the same order each time.
Calling SEE_OBJECT_ENUMERATOR() returns a newly created enumerator, which is a pointer to a struct SEE_enum. Once obtained, the following macros can be used to access the enumerator:
SEE_ENUM_NEXT() - return a pointer to a property name string, or NULL when the properties have been exhausted
SEE_ENUM_RESET() - rewind the enumerator to the start again
struct SEE_string *SEE_ENUM_NEXT(struct SEE_interpreter *interp, struct SEE_enum *e, int *flags_return);
void SEE_ENUM_RESET(struct SEE_interpreter *interp, struct SEE_enum *e);
Enumerators can assume that the underlying object does not change during enumeration. A suggested strategy for a caller that does need to remove or add an object's properties while enumerating them is to first create a private list of its property names, ensuring that it has exhausted the enumerator before attempting to modify the object.
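The snapshot strategy suggested above can be sketched in isolation. The following is a minimal illustration that deliberately does not use the SEE API: toy_obj, toy_enum, toy_enum_next() and toy_delete() are hypothetical stand-ins that only play the roles of an object, SEE_ENUM_NEXT() and SEE_OBJECT_DELETE(), so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX 8

/* Toy stand-ins, not SEE's API: an "object" is a small array of names. */
struct toy_obj  { const char *names[TOY_MAX]; int count; };
struct toy_enum { const struct toy_obj *obj; int pos; };

/* Plays the role of SEE_ENUM_NEXT(): next name, or NULL when exhausted. */
static const char *
toy_enum_next(struct toy_enum *e)
{
    return e->pos < e->obj->count ? e->obj->names[e->pos++] : NULL;
}

/* Plays the role of SEE_OBJECT_DELETE(): returns 1 on success, 0 otherwise. */
static int
toy_delete(struct toy_obj *o, const char *name)
{
    int i;

    for (i = 0; i < o->count; i++)
        if (strcmp(o->names[i], name) == 0) {
            o->names[i] = o->names[--o->count];
            return 1;
        }
    return 0;
}

/* Delete every property whose name starts with prefix; returns the count. */
static int
toy_delete_prefixed(struct toy_obj *o, const char *prefix)
{
    const char *snapshot[TOY_MAX];
    struct toy_enum e = { o, 0 };
    const char *name;
    int n = 0, i, deleted = 0;

    assert(o->count <= TOY_MAX);
    /* Phase 1: exhaust the enumerator into a private list. */
    while ((name = toy_enum_next(&e)) != NULL)
        snapshot[n++] = name;
    /* Phase 2: only now modify the object. */
    for (i = 0; i < n; i++)
        if (strncmp(snapshot[i], prefix, strlen(prefix)) == 0)
            deleted += toy_delete(o, snapshot[i]);
    return deleted;
}
```

The point of the two phases is that the enumerator never observes a changing object; with the real library, the same shape would apply with SEE_ENUM_NEXT() filling the private list before any put or delete.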
When a host application wishes to expose its own 'host objects' to ECMAScript programs, it must use the object implementation API described in this section.
All SEE objects are in-memory structures starting with a struct SEE_object:
struct SEE_object {
    struct SEE_objectclass *objectclass;
    struct SEE_object *Prototype;
};
Normally, this structure is part of a larger structure that maintains the object's private state. For example, native Number objects could be implemented with the following:
struct number_object {    /* example implementation of Number */
    struct SEE_object object;
    SEE_number_t number;
};
Keeping the object part at the top of the number_object structure means that pointers of type struct number_object * can be cast to and from pointers of type struct SEE_object *. This is a general idiom: begin all host object structures with a field member of type struct SEE_object named object.
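As a minimal sketch of this idiom, the following compiles without the SEE headers by declaring stand-in copies of the two structures shown above. The typedef of SEE_number_t as double is an assumption made only for illustration, and number_as_object(), object_as_number() and number_value() are hypothetical helper names, not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in declarations so the sketch compiles without SEE installed;
 * real code would include the SEE headers instead. */
struct SEE_objectclass;                 /* opaque here */
struct SEE_object {
    struct SEE_objectclass *objectclass;
    struct SEE_object *Prototype;
};
typedef double SEE_number_t;            /* assumption: a floating-point type */

struct number_object {
    struct SEE_object object;           /* must be the first member */
    SEE_number_t number;
};

/* Upcast: a pointer to the structure also points at its first member. */
static struct SEE_object *
number_as_object(struct number_object *no)
{
    return &no->object;
}

/* Downcast: only valid if obj really is the head of a number_object. */
static struct number_object *
object_as_number(struct SEE_object *obj)
{
    return (struct number_object *)obj;
}

/* Hypothetical helper: read private state through the generic pointer. */
static SEE_number_t
number_value(struct SEE_object *obj)
{
    assert(offsetof(struct number_object, object) == 0);
    return object_as_number(obj)->number;
}
```

The downcast is unchecked here. A host application would normally confirm that the object really is one of its own (for example by comparing its objectclass pointer against the expected class) before casting.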
Although the ECMAScript language does not use classes per se, SEE's internal object implementation does use a class 'abstraction' to speed up execution and make implementation re-use easier. Each object has a field, object.objectclass, that must be initialised to point to a struct SEE_objectclass that provides the object's behaviour. The class structure looks like this:
struct SEE_objectclass {
    struct SEE_string *  Class;         /* mandatory */
    SEE_get_fn_t         Get;           /* mandatory */
    SEE_put_fn_t         Put;           /* mandatory */
    SEE_boolean_fn_t     CanPut;        /* mandatory */
    SEE_boolean_fn_t     HasProperty;   /* mandatory */
    SEE_boolean_fn_t     Delete;        /* mandatory */
    SEE_default_fn_t     DefaultValue;  /* mandatory */
    SEE_enumerator_fn_t  enumerator;    /* optional */
    SEE_call_fn_t        Construct;     /* optional */
    SEE_call_fn_t        Call;          /* optional */
    SEE_hasinstance_fn_t HasInstance;   /* optional */
};
The application generally provides this structure in static storage, as most of its members are function pointers or strings known at compile time. A member marked optional should be set to NULL if it is meaningless.
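The role of NULL in optional members can be illustrated with a reduced model. The sketch below is not the SEE API: toy_objectclass, toy_object, toy_has_call() and toy_call_checked() are hypothetical stand-ins with simplified signatures, showing only how a statically allocated class with a NULL Call member is detected before the call is attempted.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for illustration only; the real structures have more
 * members and different function signatures. */
struct toy_object;
typedef int (*toy_call_fn_t)(struct toy_object *self, int arg);

struct toy_objectclass {
    const char *Class;       /* mandatory */
    toy_call_fn_t Call;      /* optional: NULL when not callable */
};

struct toy_object {
    struct toy_objectclass *objectclass;
};

static int
doubler_call(struct toy_object *self, int arg)
{
    (void)self;
    return 2 * arg;
}

/* Class structures usually live in static storage. */
static struct toy_objectclass doubler_class = { "Doubler", doubler_call };
static struct toy_objectclass plain_class   = { "Plain",   NULL };

/* Plays the role of SEE_OBJECT_HAS_CALL(). */
static int
toy_has_call(const struct toy_object *obj)
{
    return obj->objectclass->Call != NULL;
}

/* Returns 0 and stores the result on success, -1 if not callable. */
static int
toy_call_checked(struct toy_object *obj, int arg, int *res)
{
    assert(obj != NULL && res != NULL);
    if (!toy_has_call(obj))
        return -1;
    *res = obj->objectclass->Call(obj, arg);
    return 0;
}
```

With the real library the equivalent guard is to test SEE_OBJECT_HAS_CALL() (or the matching HAS macro) before using SEE_OBJECT_CALL(), since the calling macros themselves do not check for NULL.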
The object methods marked mandatory (Get, Put, etc.) are never NULL, and should provide the precise behaviours that SEE expects on native objects. These behaviours are fully described in the ECMA-262 standard, and are summarised in the following table:
Method | Behaviour |
---|---|
Get | retrieve a named property (or return undefined) |
Put | create/update a named property |
Delete | delete a property or return 0 |
HasProperty | returns 0 if the property doesn't exist |
CanPut | returns 0 if the property cannot be changed |
DefaultValue | turns the object into a string or number value |
Construct | constructs a new object; as per the new keyword |
Call | the object has been called as a function |
HasInstance | returns 0 if the objects are unrelated |
enumerator | allow enumeration of the properties (see above) |
It is up to the host application to provide storage for the properties, and so forth. The simplest strategy is to ignore property calls to Put and Get that are meaningless. To this end, if the host object does not want to expend effort supporting some of the mandatory operations, it can use the corresponding 'do-nothing' function(s) from this list:
SEE_no_get()
SEE_no_put()
SEE_no_canput()
SEE_no_hasproperty()
SEE_no_delete()
SEE_no_defaultvalue()
SEE_no_enumerator()
The Prototype field of an object instance can be set to one of:
Object_prototype,
NULL (meaning no prototype), or
another object.
If it is NULL, it is recommended you provide a toString() method (to help with debugging).
Once the host application has constructed its own objects that conform to the API, they can be inserted into the 'Global object' as object-valued properties.
The 'Global object' is an unnamed, top-level object whose sole purpose is to 'hold' all the built-in objects, such as Object, Function, Math, etc., as well as all user-declared global variables. The host application can access it through the Global member of the SEE_interpreter structure.
SEE provides support for a special kind of object class called native objects. Native objects maintain a hash table of properties, implement the mandatory methods (plus enumerator), and correctly observe the Prototype field.
struct SEE_native {
    struct SEE_object object;
    struct SEE_property *properties[SEE_NATIVE_HASHLEN];
};
An application can create host objects based on native objects. First, place a struct SEE_native at the beginning of a structure:
struct some_host_object {
    struct SEE_native native;
    int host_specific_info;
};
Then, use the following object methods, either directly in the SEE_objectclass structure, or by calling them indirectly from method implementations:
SEE_native_get()
SEE_native_put()
SEE_native_canput()
SEE_native_hasproperty()
SEE_native_delete()
SEE_native_defaultvalue()
SEE_native_enumerator()
The host application will likely want particular pieces of C code to be callable from the runtime environment. This simply requires constructing an object whose Prototype field points to Function.prototype, and whose objectclass's Call method points to a C function that contains the desired code.
The convenience function SEE_cfunction_make() performs this construction. It takes a pointer to the C function, a name for the function, and an integer indicating the expected number of arguments. (The integer becomes the function object's length property, which is advisory only.)
struct SEE_object *SEE_cfunction_make(struct SEE_interpreter *interp, SEE_call_fn_t func, struct SEE_string *name, int argc);
⚠ Note:
Objects returned by SEE_cfunction_make()
should really only
be used in the interpreter context in which they were created, but the
current version of SEE does not check for this. (Because cfunction objects
are essentially read-only after construction, and if memory allocation
operates independently of the interpreters, sharing cfunction objects
across interpreters will be OK, but it is not recommended for future
portability.)
The C function must conform to the SEE_call_fn_t signature. This is demonstrated below, with math_sqrt(), which is the actual code behind the Math.sqrt object:
/* Implementation of Math.sqrt() method */
static void
math_sqrt(interp, self, thisobj, argc, argv, res)
    struct SEE_interpreter *interp;
    struct SEE_object *self, *thisobj;
    int argc;
    struct SEE_value **argv, *res;
{
    struct SEE_value v;

    if (argc == 0)
        SEE_SET_UNDEFINED(res);
    else {
        SEE_ToNumber(interp, argv[0], &v);
        SEE_SET_NUMBER(res, sqrt(v.u.number));
    }
}
The arguments to this function are described in the following table:
Argument | Purpose |
---|---|
interp | the current interpreter context |
self | a pointer to the object called (Math.sqrt here) |
thisobj | the this object (the Math object in this case) |
argc | number of arguments |
argv | array of value pointers, of length argc |
res | uninitialised value location in which to store the result |
A common convention in all ECMAScript functions is that unspecified arguments should be treated as undefined, and extraneous arguments should just be ignored.
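The argument convention can be sketched with a simplified model. In the following, arguments are plain doubles rather than struct SEE_value pointers, and a missing argument is represented by NaN (which is what converting undefined to a number yields in ECMAScript). The names arg_number() and toy_add() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Simplified stand-in: arguments are plain doubles rather than
 * struct SEE_value pointers, and "undefined" is modelled by NaN. */
static double
arg_number(int argc, const double *argv, int i)
{
    assert(i >= 0);
    if (i >= argc)
        return NAN;         /* unspecified argument: behave as undefined */
    return argv[i];
}

/* Hypothetical two-argument function; arguments beyond the second
 * are never read, so extraneous arguments are silently ignored. */
static double
toy_add(int argc, const double *argv)
{
    double a = arg_number(argc, argv, 0);
    double b = arg_number(argc, argv, 1);

    return a + b;
}
```

Funnelling every argument access through one helper keeps the bounds check against argc in a single place, which is the main design point of the sketch.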
If the function uses thisobj, it should check any assumptions made about it, especially if it is expected to be a host object, because method functions can easily be attached to other objects.
Occasionally, a host application will wish to take some user text and create a callable function object from it. An example of this problem is in attaching the JavaScript code from HTML attributes onto form elements of a web page.
One way to achieve this is to invoke the Function constructor object with the SEE_OBJECT_CONSTRUCT() macro, passing it the formal arguments text and body text as arguments. (See the ECMAScript standard for details on the Function constructor.)
Another way, that is more convenient if the user text is available as an input stream, is to use the SEE_Function_new() function:
struct SEE_object *SEE_Function_new(struct SEE_interpreter *interp, struct SEE_string *name, struct SEE_input *param_input, struct SEE_input *body_input);
where any of the name, param_input and body_input parameters may be NULL (indicating to use the empty string). The returned function object may be called with the SEE_OBJECT_CALL() macro.
Host applications sometimes need to convey errors to ECMAScript programs. Errors in ECMAScript are typically indicated by throwing an exception with an object value. The thrown objects conventionally have Error.prototype somewhere in their prototype chain, and provide message and name properties, which Error.prototype reads to generate a human-readable error message.
Host applications can conveniently construct and throw error exceptions using the following macros:
void SEE_error_throw_string(struct SEE_interpreter *interp, struct SEE_object *error_constructor, struct SEE_string *text);
void SEE_error_throw(struct SEE_interpreter *interp, struct SEE_object *error_constructor, const char *fmt, ...);
void SEE_error_throw_sys(struct SEE_interpreter *interp, struct SEE_object *error_constructor, const char *fmt, ...);
These convenience macros construct a new error object, and throw it as an exception using SEE_THROW(). The object thrown is given a message string property that reflects the rest of the arguments provided to the called macro. The SEE_error_throw_sys() macro works like SEE_error_throw() but appends a textual description of errno using strerror().
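To make the "format, then append strerror()" behaviour concrete, here is a minimal sketch in standard C. It is not SEE's implementation: format_sys_message() is a hypothetical helper, and the ": " separator between the formatted text and the system description is an assumption chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of SEE: formats fmt/... into buf, then
 * appends ": " and strerror(errnum). The separator is an assumption.
 * Returns 0 on success, -1 if formatting failed or the text was truncated. */
static int
format_sys_message(char *buf, size_t len, int errnum, const char *fmt, ...)
{
    va_list ap;
    size_t used;
    int n;

    assert(buf != NULL && len > 0);
    va_start(ap, fmt);
    n = vsnprintf(buf, len, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= len)
        return -1;
    used = (size_t)n;
    n = snprintf(buf + used, len - used, ": %s", strerror(errnum));
    if (n < 0 || (size_t)n >= len - used)
        return -1;
    return 0;
}
```

The error number is passed in explicitly rather than read from errno inside the helper, because intervening library calls may overwrite errno; a caller would typically save errno immediately after the failing call.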
The error_constructor argument should be one of the error constructor objects found in the SEE_interpreter structure:
Member | Meaning |
---|---|
Error | runtime error |
EvalError | error in eval() |
RangeError | numeric argument has exceeded allowable range |
ReferenceError | invalid reference was detected |
SyntaxError | parsing error |
TypeError | actual type of an operand different to that expected |
URIError | error in a global URI handling function |
A simple example:
if (something_is_wrong)
    SEE_error_throw(interp, interp->Error, "something is wrong!");
Although Error is usually sufficient for most errors, host applications can create their own error constructor object with the SEE_Error_make() convenience function. Only one constructor of the same name should be created per interpreter.
struct SEE_object *SEE_Error_make(struct SEE_interpreter *interp, struct SEE_string *name);
SEE provides backward-compatibility with earlier versions of JavaScript and JScript. These features ought never be used, since JavaScript program authors should be mindful of standards. Nevertheless, this section documents the compatibility modes that SEE supplies.
The behaviour of the SEE library is modified on a per-interpreter basis, by passing special flags to a variant of the interpreter's initialisation routine, SEE_interpreter_init_compat(). This function otherwise behaves just like SEE_interpreter_init() (see §2).
void SEE_interpreter_init_compat(struct SEE_interpreter *interp, int flags);
The flags parameter is a bitwise OR of the constants described in the following table.
⚠ Note: The following compatibility flag names may change in the future.
Flag | Behaviour |
---|---|
SEE_COMPAT_UTF_UNSAFE | Treat 'overlong' UTF-8 encodings as valid Unicode characters. |
SEE_COMPAT_UNDEFDEF | Don't throw a ReferenceError when an undefined global property is used. Instead, return the undefined value. This violates step 3 of §8.7.1 of ECMA-262, but it seems that many interpreters are flexible on this point. It was originally a JavaScript 1.5 thing I believe. |
SEE_COMPAT_262_3B | Enable optional features in ECMA-262 ed3 Appendix B: |
SEE_COMPAT_SGMLCOM | The lexical analyser will treat the 4-character sequence '<!--' much like the '//' comment introducer. This is useful in HTML SCRIPT elements. |
SEE_COMPAT_EXT1 | Random, unsorted extensions, mainly relating to behaviour of older JavaScript: |
The SEE library contains various debugging facilities that are omitted if it is compiled with the NDEBUG preprocessor define. These functions are intended for the developer to use while debugging the application, and not for general use.
void SEE_PrintValue(struct SEE_interpreter *interp, struct SEE_value *val, FILE *file);
void SEE_PrintObject(struct SEE_interpreter *interp, struct SEE_object *obj, FILE *file);
void SEE_PrintString(struct SEE_interpreter *interp, struct SEE_string *str, FILE *file);
void SEE_PrintTraceback(struct SEE_interpreter *interp, FILE *file);
If debugging the library itself, it is worth reading the source code to find the debug flag variables that can be turned on by the host application to enable verbose traces during execution.
Defining the NDEBUG preprocessor symbol when building the library also disables (slow) internal assertions that would otherwise help show up application misuse of the API.
© David Leonard, 2004. This documentation may be entirely reproduced and freely distributed, as long as this copyright notice remains intact, and either the distributed reproduction or translation is a complete and bona fide copy, or the modified reproduction is substantially the same and includes a brief summary of the modifications made.
$Id: USAGE.html,v 1.14 2004/08/15 01:47:49 d Exp $