Newton 2.x Q&A: NewtonScript Object Sizes

One of the Newton 2.x OS Q&As
Copyright © 1997 Newton, Inc. All Rights Reserved. Newton, Newton Technology, Newton Works, the Newton, Inc. logo, the Newton Technology logo, the Light Bulb logo and MessagePad are trademarks of Newton, Inc. and may be registered in the U.S.A. and other countries. Windows is a registered trademark of Microsoft Corp. All other trademarks and company names are the intellectual property of their respective owners.

For the most recent version of the Q&As on the World Wide Web, check the URL:


http://www.newton-inc.com/dev/techinfo/qa/qa.htm

This document was exported on 7/23/97.

NewtonScript Object Sizes (6/30/94)

These descriptions document current OS formats only; we reserve the right to extend or change the implementation in future releases.

Generic
NewtonScript objects reside in one of four places: the read-write NewtonScript memory, pseudo-ROM memory, a package, or ROM. In earlier MessagePad platforms, these objects are aligned to 8-byte boundaries. In the Newton 2.0 OS, objects in the NewtonScript memory are aligned to 4-byte boundaries. Inside Newton 2.0 packages, you can optionally align objects to 4-byte boundaries (with NTK's "tighter object packing" checkbox). Alignment causes a very small amount of memory to be wasted, usually less than 2%.
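
As a rough illustration of the alignment cost, the following Python sketch rounds an object size up to the next 4-byte or 8-byte boundary. The helper name `align_up` is hypothetical and is not part of any Newton API; it only models the arithmetic.

```python
def align_up(size: int, boundary: int) -> int:
    """Round size up to the next multiple of boundary (a power of two)."""
    return (size + boundary - 1) & ~(boundary - 1)

# A 38-byte object is padded to 40 bytes under either alignment.
print(align_up(38, 4), align_up(38, 8))

# A 20-byte object needs no padding at 4-byte alignment,
# but is padded to 24 bytes at 8-byte alignment.
print(align_up(20, 4), align_up(20, 8))
```

The padding is at most 3 bytes per object with 4-byte alignment and at most 7 bytes with 8-byte alignment.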

The Newton Object System has four built-in primitive classes that describe an object's basic type: immediates, binary objects, arrays, and frames. The NewtonScript function PrimClassOf will return an object's primitive type.

Immediates and Magic Pointers
Immediates (integers, characters, TRUE and NIL) and magic pointers are stored in a 4-byte structure containing up to 30 bits of data and 2 bits of primitive class identification.
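
To give a feel for what 30 bits of data allow, here is a minimal Python sketch of the integer range. It assumes that integer immediates are signed two's-complement values using all 30 data bits; that assumption, and the names used, are illustrative only.

```python
DATA_BITS = 30  # data bits in a 4-byte immediate, per the description above

# Assumption: integers are signed two's complement and use all 30 data bits.
MIN_IMMEDIATE_INT = -(1 << (DATA_BITS - 1))
MAX_IMMEDIATE_INT = (1 << (DATA_BITS - 1)) - 1

def fits_in_immediate(n: int) -> bool:
    """Return True if n fits in a signed 30-bit field."""
    return MIN_IMMEDIATE_INT <= n <= MAX_IMMEDIATE_INT

print(MIN_IMMEDIATE_INT, MAX_IMMEDIATE_INT)  # -536870912 536870911
```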

Referenced Objects
Binaries, arrays and frames are stored as larger separate objects and managed through references. A reference is a four-byte object. The binary objects, frames, or arrays themselves are stored separately as objects containing a so-called Object Header.

Object Header
Every referenced object has a 12-byte header that contains information concerning size, flags, class, lock count and so on. This information is implementation-specific.

Symbols
A symbol is a binary object that contains a four-byte hash value and a name, which is a null-terminated ASCII string. Each symbol uses 12 (header) + 4 (hash value) + length of name + 1 (null terminator) bytes.
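
The symbol formula above can be sketched in Python as follows. The function name `symbol_size` is hypothetical, and the result ignores any alignment padding.

```python
HEADER_SIZE = 12  # object header
HASH_SIZE = 4     # four-byte hash value

def symbol_size(name: str) -> int:
    """Bytes used by a symbol: header + hash + ASCII name + null terminator."""
    return HEADER_SIZE + HASH_SIZE + len(name.encode("ascii")) + 1

print(symbol_size("Slot1"))  # 12 + 4 + 5 + 1 = 22
```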

Binary Objects
A binary object contains a 12-byte header plus space for the actual data (allocated in 8-byte chunks).

Strings
Strings are binary objects of class (or a subclass of) String. A string object contains a 12-byte header plus the Unicode characters plus a null termination character. Note that Unicode characters are two-byte values. Here's an example:

    "Hello World!"

This string contains 12 characters, so the text occupies 24 bytes. Adding the null termination character gives 24 + 2 bytes, and adding the object header gives 24 + 2 + 12 bytes, so the object is 38 bytes in total. Note that we have not taken into account any possible savings if the string was compressed (using the NTK compression flags).
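
The same calculation can be written as a short Python sketch. The function name `string_size` is hypothetical; the sketch ignores compression and alignment, and assumes every character takes exactly two bytes.

```python
HEADER_SIZE = 12  # object header
CHAR_SIZE = 2     # Unicode characters are two-byte values

def string_size(text: str) -> int:
    """Bytes used by an uncompressed string object, including its terminator."""
    return HEADER_SIZE + CHAR_SIZE * len(text) + CHAR_SIZE

print(string_size("Hello World!"))  # 12 + 24 + 2 = 38
```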

Rich Strings
Rich strings extend the string object class by embedding ink information within the object. Within the Unicode text, a special character, kInkChar, marks the position of an ink word. The ink data is stored after the null termination character. Ink size varies depending on stroke complexity.

Array Objects
Array objects have an object header (12 bytes) and an additional four bytes per element, which hold either the immediate value or a reference to a referenced object. To calculate the total space used by an array, you need to take into account the memory used by any referenced objects in the array.

Here's an example:

    [12, $a, "Hello World!", "foo"]

We have a header (12 bytes) plus four bytes per element (12 + (4 * 4) bytes). The integer and character are immediates, so no additional space is used, but we refer to two string objects, so the total is 12 + (4 * 4) + 38 + 20 = 86 bytes. We have not taken into account savings concerning compression. Note that the string objects could be referred to by other arrays and frames as well, so the 38- and 20-byte structures are stored only once per package.
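
A minimal Python sketch of this calculation follows. All names are hypothetical. Python strings stand in for NewtonScript string objects, a small `Char` class stands in for character immediates, and the sketch counts every string it meets, so it does not model strings shared between several arrays or frames.

```python
from dataclasses import dataclass

HEADER_SIZE = 12  # object header
ELEMENT_SIZE = 4  # one immediate or one reference per element

@dataclass(frozen=True)
class Char:
    """Stand-in for a NewtonScript character immediate such as $a."""
    value: str

def string_size(text: str) -> int:
    # Header + two bytes per character + two-byte null terminator.
    return HEADER_SIZE + 2 * len(text) + 2

def array_size(elements: list) -> int:
    """Array header and elements, plus any string objects it refers to."""
    total = HEADER_SIZE + ELEMENT_SIZE * len(elements)
    for element in elements:
        if isinstance(element, str):  # referenced object: add its own size
            total += string_size(element)
        # ints and Char values are immediates and live inside the element
    return total

print(array_size([12, Char("a"), "Hello World!", "foo"]))  # 86
```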

Frame Objects
We have two kinds of frames: frames that don't have a shared map object, and frames that do. We take the simple case first (no shared map object).

The frame is maintained as two array-like objects. One, called the frame map, contains the slot names, and the other contains the actual slot values. A frame map has one entry per symbol, plus one additional 4-byte value.

The frame map uses a minimum of 16 bytes. If we add the frame's object header to this, the minimal size of a frame is 28 bytes. Each slot adds 8 bytes to the storage used by the frame (two array entries). Here's an example:

    {Slot1: 42, Slot2: "hello"}

We have a fixed overhead of 28 bytes (the object header plus the minimal frame map), and in addition we have two slots, for a total of 28 + (2 * 8) = 44 bytes. This does not take into account the space used for each of the slot name symbols or for the string object. (The integer is an immediate, and so is stored in the array.)
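
The unshared case can be sketched in Python as follows. The function name `unshared_frame_size` is hypothetical, and the result excludes the slot name symbols and any referenced slot values, as in the text.

```python
HEADER_SIZE = 12   # frame object header
MIN_MAP_SIZE = 16  # frame map with no slots
SLOT_SIZE = 8      # one map entry plus one value entry, four bytes each

def unshared_frame_size(slot_count: int) -> int:
    """Bytes used by a frame that owns its map, excluding symbols and values."""
    return HEADER_SIZE + MIN_MAP_SIZE + SLOT_SIZE * slot_count

print(unshared_frame_size(0))  # 28, the minimal frame
print(unshared_frame_size(2))  # 28 + (2 * 8) = 44
```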

Multiple similar frames (having the same slots) can share a frame map. This saves space, reducing the space used per frame (for many frames all sharing the same map) to the same as used for an array with the same number of slots. If just a few frames share the frame map, we need to take into account the amortized map size that the frames share. So the total space for N frames sharing a map is N * 28 bytes of header per frame, plus the size of the frame map, plus the size of the values for the N frames.

Here's an example of a frame that could share a map with the previous example:

    {Slot1: 56, Slot2: "world"}

We have a header of 12 bytes. In addition, we have two slots (2 * 4), and an additional 16 bytes for the size of a map with no slots; all in all, 36 bytes. We should also take into account the shared map, which is 16 bytes, plus the space for the two symbols.
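
To see how the shared map amortizes, here is a Python sketch of the figures above. The function names are hypothetical, and the shared map size is passed in as a parameter rather than computed; the symbols and any referenced slot values are excluded.

```python
HEADER_SIZE = 12     # frame object header
EMPTY_MAP_SIZE = 16  # per-frame map with no slots
VALUE_SIZE = 4       # one value entry per slot

def shared_frame_size(slot_count: int) -> int:
    """Bytes used by one frame whose slot names live in a shared map."""
    return HEADER_SIZE + VALUE_SIZE * slot_count + EMPTY_MAP_SIZE

def total_size(frame_count: int, slot_count: int, shared_map_size: int) -> int:
    """Bytes used by frame_count frames plus the single map they share."""
    return frame_count * shared_frame_size(slot_count) + shared_map_size

print(shared_frame_size(2))   # 12 + (2 * 4) + 16 = 36
print(total_size(10, 2, 16))  # (10 * 36) + 16 = 376
```

The cost of the shared map is paid once, so its share per frame shrinks as more frames use it.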

When do frames share maps?

1. When a frame is cloned, both the copy and the original frame will share the map of the original frame. A trick to make use of this is to create a common template frame, and clone this template when duplicate frames are needed.

2. Two frames created from the same frame constructor (that is, the same line of NewtonScript code) will share a frame map. This is one reason to use RelBounds to create the viewBounds frame: it means there will be a single viewBounds frame map in the part produced.

Note: These figures are for objects in their run-time state, ready for fast access. Objects in transit or in storage (packages) are compressed into smaller stream formats. Different formats are used (and different sizes apply) to objects stored in soups and to objects being streamed over a communications protocol.