Difference between revisions of "Apgcode"

From LifeWiki
Jump to navigation Jump to search
m (Links)
(Unsatisfactory un-explanation of yl suffixes)
Line 10: Line 10:
 
These are followed by a number. For xs or ov_s this is the [[population]] of the still life while for the others it is the [[period]] of the object.
 
These are followed by a number. For xs or ov_s this is the [[population]] of the still life while for the others it is the [[period]] of the object.
  
The codes for xs, xp, xq, and yl are followed by an underscore, then a string which is a representation of the object in Extended Wechsler Format, described below.
+
The codes for xs, xp, and xq are followed by an underscore, then a string which is a representation of the object in Extended Wechsler Format, described below. Codes for yl are followed by an underscore and additional information.
  
 
Additionally, objects which apgsearch cannot classify as above are denoted by one of:
 
Additionally, objects which apgsearch cannot classify as above are denoted by one of:

Revision as of 20:49, 18 April 2016

apgsearch and the search results database, Catagolue, classify and denote patterns as follows. Most objects are stored as two alphanumeric strings separated by an underscore. The 'prefix' refers to everything before the underscore and may begin with:

Oversize still lifes, oscillators, and spaceships, larger than 40 by 40, have a prefix beginning with ov_s, ov_p, or ov_q, respectively.

These are followed by a number. For xs or ov_s this is the population of the still life while for the others it is the period of the object.

The codes for xs, xp, and xq are followed by an underscore, then a string which is a representation of the object in Extended Wechsler Format, described below. Codes for yl are followed by an underscore and additional information.

Additionally, objects which apgsearch cannot classify as above are denoted by one of:

  • zz_EXPLOSIVE
  • zz_LINEAR
  • zz_QUADRATIC
  • zz_REPLICATOR
  • PATHOLOGICAL

Extended Wechsler Format

This is an extension of a pattern notation developed by Allan Wechsler in 1992. A string of n characters in the set {0, 1, 2, ..., 8, 9, a, b, ..., v} denotes a strip of five rows, n columns wide. Each character denotes five cells in a vertical column corresponding to the bitstrings {'00000', '00001', '00010', ..., '01000', '01001', '01010', '01011', ... '11111'}. For instance, xq4_27deee6 corresponds to a HWSS:

27deee6
.**....
**.****
.******
..****.
.......

The character 'z' separates contiguous five-row strips. For example, xs31_0ca178b96z69d1d96 is:

0ca178b96
...**.**.
..*.*.*.*
.*..*...*
.**..***.
.........

69d1d96
.*****.
*.....*
*.*.*.*
.**.**.
.......

The characters 'w' and 'x' are used to abbreviate '00' and '000', respectively. So xp30_w33z8kqrqk8zzzx33 is the trans-queen-bee-shuttle:

0033
..**
..**
....
....
....

8kqrqk8
...*...
..***..
.*...*.
*.***.*
.*****.

[10 blank rows omitted]

00033
...**
...**
.....
.....
.....

Note that extraneous '0's at the ends of strips are not included.

Finally, the symbols {'y0', 'y1', y2', ..., 'yx', 'yy', 'yz'} correspond to runs of between 4 and 39 consecutive '0's. A good example is the quadpole on ship, xp2_31a08zy0123cko:

31a08
**...
*.*..
.....
..*.*
.....

0000123cko
....*.*...
.....**...
.......**.
.......*.*
........**

Canonical form

In order to enforce a canonical form, there are further rules regarding encoding:

The leftmost column and uppermost row must each contain at least one live cell. (This gives a canonical position.)

Any '0's on the ends of strips are ignored, and {'00', '000', '0000', '00000', ...} are always replaced with {'w', 'x', 'y0', 'y1', 'y2', ...}.

(Theoretically, runs of more than 39 zeroes should be replaced by 'yz' followed by the coding for the remaining zeroes. At the moment apgsearch labels everything larger than 40-by-40 as 'oversized' and refuses to process it.)

A canonical orientation and phase must be determined. For example, with the caterer (p3 oscillator with no symmetry), there are three phases and eight orientations, so we have 24 possible encodings. Define a total order on these encodings as follows:

  • Prefer shorter representations to longer representations;
  • For representations of the same length, apply lexicographical ASCII ordering (and give preference to earlier strings).

This gives, for any still-life, oscillator or spaceship, an unambiguous canonical code to represent the pattern. It has several desirable properties:

Compression: it's much more compact than RLE or SOF for storing very small patterns, and often even beats the common name ('xp15_4r4z4r4' is shorter than 'pentadecathlon')!

Character set: it only uses digits, lowercase letters and the underscore, so can be safely used in filenames and URLs.

Human-readability: the prefix means that we can instantly see whether a particular object is a still-life (and if so, what size), oscillator (and if so, what period) or spaceship (and if so, what period). It also means that the string is instantly recognised as being an encoding of an object (xp2_7 is obviously a blinker, whereas the digit 7 on its own with no extra context is ambiguous).