apgcode

From LifeWiki
(Redirected from Apgsearch format)
Jump to: navigation, search

apgsearch and its search results database, Catagolue, use apgcodes to classify and denote patterns.

Encoding objects

apgcodes consist of a prefix and a suffix, separated by an underscore; both the prefix and the suffix are alphanumeric strings. The prefix further consists of a two-character type and a number. The types are:

The number following the type represents the population for a still life, or otherwise the period of the object.

In codes for xs, xp, and xq, the suffix following the underscore is a representation of the object in Extended Wechsler Format (see below); in codes for yl, the suffix consists of additional information encoding higher-order periods of the pattern.

Unencodable objects

In their original form, apgcodes are not guaranteed to encode still lifes, oscillators and spaceships exceeding a 40×40 bounding box.[note 1] These are instead reported by apgsearch and censused by Catagolue as oversized patterns, using the following types:

  • ov_s for an oversized still life;
  • ov_p for an oversized oscillator;
  • ov_q for an oversized spaceship.

These types are then followed by a number representing the population for a still life, or otherwise the period of the object, as in the non-oversized case. apgcodes of this form do not uniquely identify a specific object, but rather represent classes of objects.

An extension to apgcodes, dubbed "greedy apgcodes" (see below), was later used to relax the 40×40 bounding box constraint, although some sufficiently large patterns remain canonically unencodable.

Unclassifiable objects

Objects which apgsearch cannot classify as above are denoted by one of:

  • zz_EXPLOSIVE
  • zz_LINEAR
  • zz_QUADRATIC
  • zz_REPLICATOR
  • PATHOLOGICAL (e.g. high-period oscillators)

Chaotic-growth (zz_*) patterns are classified according to certain heuristics; the labels chosen do not necessarily reflect the nature of the object encountered, so an object classified as e.g. zz_REPLICATOR need not be an actual replicator.

Like oversized apgcodes, these apgcodes do not uniquely identify specific objects.

Extended Wechsler Format

Non-oversized still lifes, oscillators and patterns are encoded in extended Wechsler format, an extension of a pattern notation developed by Allan Wechsler in 1992.

  • The pattern is separated into horizontal strips of five rows.
    • Each strip, n columns wide, is encoded as a string of n characters in the set {0, 1, 2, ..., 8, 9, a, b, ..., v} denoting the five cells in a vertical column corresponding to the bitstrings {'00000', '00001', '00010', ..., '01000', '01001', '01010', '01011', ... '11111'}.
    • The characters 'w' and 'x' are used to abbreviate '00' and '000', respectively, and the symbols {'y0', 'y1', y2', ..., 'yx', 'yy', 'yz'} are used to encode runs of between 4 and 39 consecutive '0's.
    • Extraneous '0's at the ends of strips are not included in the encoding. (Note that in particular, blank five-row strips are represented by the empty string.)
  • The character 'z' separates contiguous five-row strips.

Examples

xq4_27deee6 encodes a HWSS:

Xq4 27deee6 annotated.png

xs31_0ca178b96z69d1d96 encodes a 31-bit still life; note how the 'z' separates the two five-row strips:

Xs31 0ca178b96z69d1d96 annotated strip1.png
Xs31 0ca178b96z69d1d96 annotated strip2.png

xp30_w33z8kqrqk8zzzx33 encodes a trans-queen-bee-shuttle; note how five-row strips are represented by the empty string:

Xp30 w33z8kqrqk8zzzx33 annotated strip1.png
Xp30 w33z8kqrqk8zzzx33 annotated strip2.png
(10 blank rows omitted)
Xp30 w33z8kqrqk8zzzx33 annotated striplast.png

xp2_31a08zy0123cko is a quadpole on ship; note how trailing zeros in the first five-row strip are not encoded:

Xp2 31a08zy0123cko annotated strip1.png
Xp2 31a08zy0123cko annotated strip2.png

Canonical form

In order to enforce a canonical form, there are further rules regarding encoding:

  • The leftmost column and uppermost row must each contain at least one live cell. (This gives a canonical position.)
  • A canonical orientation and phase must be determined. For example, with the caterer (p3 oscillator with no symmetry), there are three phases and eight orientations, so we have 24 possible encodings. A total order on these encodings is defined as follows:
    • Shorter representations are preferred to longer representations;
    • For representations of the same length, lexicographical ASCII ordering is applied, and preference given to earlier strings.

This gives, for any still-life, oscillator or spaceship, an unambiguous canonical code to represent the pattern. It has several desirable properties:

  • Compression: it is much more compact than RLE or SOF for storing very small patterns, and often even beats the common name ('xp15_4r4z4r4' is shorter than 'pentadecathlon')!
  • Character set: it only uses digits, lowercase letters and the underscore, so can be safely used in filenames and URLs.
  • Human-readability: the prefix means that we can instantly see whether a particular object is a still-life (and if so, what size), oscillator (and if so, what period) or spaceship (and if so, what period). It also means that the string is instantly recognised as being an encoding of an object ('xp2_7' is obviously a blinker, whereas the digit 7 on its own with no extra context is ambiguous).

Adapting apgcodes to larger patterns

As apgcodes were originally limited to patterns fitting into a 40×40 bounding box, different ways of extending them to allow unambiguously encoding larger patterns were proposed.[1] In their original form, apgcodes are not guaranteed to be able to encode larger patterns, due a lack of well-definedness for runs of 40 or more consecutive zeroes, and apgsearch versions up to 3.x would report such patterns as oversized instead of attempting to encode them.

Extended ("greedy") apgcodes were eventually adopted to allow larger patterns to be encoded. These avoid the problem by stipulating that a run of n zeroes be encoded as follows:

  1. If n is less than 40, the run is encoded as in regular apgcodes, using the characters w, x, and y0 .. yz as appropriate.
  2. If n is equal to or greater than 40, the run is encoded as yz (representing the first 39 zeroes), following by the encoding for a run of n - 39 zeroes according to this definition.

That is to say:

Extended apgcodes zeroes encoding.png

This extension, which is used on the LifeWiki for larger patterns, retains compatibility with existing apgcode decoders. Adapting encoders to produce extended apgcodes is similarly straightforward.

Greedy apgcodes were adopted for apgsearch 4.x[2], subject to the following constraints:

  • (width + 2) * (height + 2) <= 10000;
  • the resulting apgcode, without the prefix, cannot exceed 1280 bytes.

Adapting apgcodes to multistate rules

Work is ongoing on extending apgcodes to multistate (Generations) rules.[3] For this, apgcodes encoding still lifes, oscillators and spaceships will contain multiple suffixes, separated by underscores.[note 2]

Each suffix will encode one 'layer' of the pattern, with patterns in an n-state rule having 2+⌈log2(n-2)⌉ layers, where layer 0 is 'live', layer 1 is 'refractory', and the remaining layers implement a binary counter.

More precisely, a pattern in an n-state Generations rule is represented as follows:[4][5]

  1. "Dying" states are renumbered to 2, 6, 10, 14, 18, ..., in reverse chronological order.
  2. Each state is represented as a binary number.
  3. An extended Wechsler representation is generated for each bit position, starting with the least significant bit.
  4. The resulting representations, now called "layers" are concatenated with underscores to form a representation of the entire pattern.

To obtain the pattern's canonical object, representations are generated for each phase and each possible orientation of the pattern, and the canonical representation is chosen as described in the "Canonical form" section above. Finally, a suitable prefix describing the pattern's overall behavior is prepended.

In 3-state rules, this simplifies to the two layers directly representing cell states 1 and 2, respectively.

Examples

For example, the following spaceship in Brian's Brain (Generations rule /2/3, aka B2/S/G3) is represented by the apgcode xq4_3482h8a_03482lx8:

x = 11, y = 5, rule = /2/3 AB2.AB$AB.AB$.AB.A4.B$2.AB4.A.B$4.AB2.B! [[ GRID GRIDMAJOR 0 GPS 2 AUTOSTART THUMBLAUNCH THUMBSIZE 2 TRACKLOOP 4 -1 0 ]]
Catagoluehere
(click to open LifeViewer)

As Brian's Brain is a 3-state rule, the two layers of the code directly represent cells in states 1 (live) and 2 (dying); note how x represents 000 in the second layer:

Xq4 3482h8a 03482lx8 layer1 annotated.png Xq4 3482h8a 03482lx8 layer2 annotated.png

Putting both layers together yields the complete pattern:

Xq4 3482h8a 03482lx8 annotated.png

For a more complex example, consider the following reflectorless rotating oscillator in an 11-state version of Conway's Game of Life (B3/S23/G11), xp32_ss2gzw11_wsfm76cg_x86668_wke2024_ws6i1_x1:

x = 9, y = 6, rule = 23/3/11 3.B.F$2.ADCIG$2A2D3IH$2AFG3.I$2ADAF3.J$2.2A! [[ GRID GRIDMAJOR 0 GPS 16 AUTOSTART THUMBLAUNCH THUMBSIZE 2 LOOP 32 WIDTH 640 HEIGHT 640 ZOOM 32 ]]
Catagoluehere
(click to open LifeViewer)

In order to encode this pattern, states 2, 3, ..., 10, 11 are renumbered 34, 30, ..., 6, 2; six layers are then used to encode each bit of these new state numbers:

Xp32 ss2gzw11 wsfm76cg x86668 wke2024 ws6i1 x1 layer1 annotated.png Xp32 ss2gzw11 wsfm76cg x86668 wke2024 ws6i1 x1 layer2 annotated.png Xp32 ss2gzw11 wsfm76cg x86668 wke2024 ws6i1 x1 layer3 annotated.png Xp32 ss2gzw11 wsfm76cg x86668 wke2024 ws6i1 x1 layer4 annotated.png Xp32 ss2gzw11 wsfm76cg x86668 wke2024 ws6i1 x1 layer5 annotated.png Xp32 ss2gzw11 wsfm76cg x86668 wke2024 ws6i1 x1 layer6 annotated.png

Putting the layers together yields the complete pattern:

Xp32 ss2gzw11 wsfm76cg x86668 wke2024 ws6i1 x1 annotated.png

Limitations

apgcodes can be ambiguous if a rule is neither specified nor implicitly assumed, with the same code being assigned to different patterns evolving differently. For example, the code xp2_2a54 describes at least three different period-2 oscillators in outer-totalistic rules which are all titled as "clock"; further variants exist in non-totalistic rules:

Evolution of xp2_2a54 across different rules
Xp2 2a54 b3s23.gif Xp2 2a54 b34s.gif Xp2 2a54 b4s1.gif
B3/S3 to B35678/S02345678 B34/S to B345678/S0245678 B4/S1 to B45678/S01245678

Also see

Notes

  1. Patterns exceeding a 40×40 bounding box can still be encoded using "classic" apgcodes if they match either fit within such a box in at least one phase, or if, alternatively, they do not contain runs of more than 39 zeroes in their extended Wechsler representation.
  2. That is to say, these apgcodes will match the following regular expression: x[spq][0-9]+(_[0-9a-z]+)*

References

  1. Extending apgcodes to larger patterns (discussion thread) at the ConwayLife.com forums
  2. Re: apgsearch v3.1 (discussion thread) at the ConwayLife.com forums
  3. Re: apgsearch v3.1 (discussion thread) at the ConwayLife.com forums
  4. Re: Catagolue browser extension (discussion thread) at the ConwayLife.com forums
  5. Re: Catagolue browser extension (discussion thread) at the ConwayLife.com forums