Can you run multiple instances of it on different accounts on your computer?
I don't know enough about the internal workings of Mac OS X to answer that question. You can certainly run multiple instances, and since a single instance can't be parallelised over multiple cores without OpenMP, you'll want to run one instance per core (so 2 on a dual-core machine, 4 on a quad-core, etc.).
If it's anything like Linux, you might be able to use either disown or nohup, depending on your situation.
What happens if you close the lid of your computer?
I think that depends on your personal settings. My (Windows 7) laptop essentially goes into stasis and is revived when the lid is opened, thereby resuming the search. Again, I don't know how Macs behave.
What makes it 7 times faster? The coding language, or did you execute tasks differently?
That's the question I can answer. The time taken to run a soup in version 1.x (Golly + Python) was roughly 7000 ms on my computer:
- 1400 ms running QuickLife to stabilise soups;
- 5600 ms interpreting Python, talking to Golly, changing rules, recognising patterns, etc.
Clearly the bottleneck is the second of these bullet points. By rewriting everything in C++ and optimising, it no longer requires Python, Golly, or rule-changing, and the 5600 ms can be reduced roughly twentyfold, down to 300 ms:
- 1400 ms running QuickLife to stabilise soups;
- 300 ms doing everything else.
So now it takes 1700 ms per soup, making it four times faster than the original apgsearch. The bottleneck is now running QuickLife, which is a highly-optimised algorithm and therefore hard to improve. However, QuickLife is designed to run arbitrary rules, so it has to process cells serially (well, four at a time, using a 65536-entry lookup table) -- whereas further parallel speedups are possible by hard-coding B3/S23 into bitwise operations (cf. Michael Simkin's LifeAPI).
Hence I wrote a bespoke, highly-optimised algorithm (mostly in x86 assembly!) to take advantage of parallelisation and the processor architecture. It's called Life128 because it processes 128 cells in parallel using Streaming SIMD Extensions, a set of 128-bit vector instructions available on modern x86-64 processors. Here is the source code for Life128 (warning: not for the faint-hearted!):
https://gitlab.com/apgoucher/apgnano/bl ... /life128.h
It splits the universe into slightly-overlapping 32-by-32 tiles and uses the assembly routine to compute each tile 2 generations into the future (computing only the inner 28-by-28 square, hence the overlap). The routine is even faster than I'm making out: each CPU core has multiple ALUs which can work in parallel, so this 882-instruction assembly routine actually runs in about 320 clock cycles!
If a tile is unchanged (i.e. is period-2), then it won't be recalculated until it actually can change (as a result of interference from neighbouring tiles). Since, at any time in the evolution of a methuselah, the majority of the populated portion of the universe is occupied by period-2 ash, this significantly reduces the number of tiles to compute. Another (very minor!) optimisation is staggering the tiles in a brick-wall fashion, so that each tile need only communicate with 6 neighbours rather than 8.
To summarise, the resulting algorithm is twice as fast as QuickLife, so the current breakdown of time per soup resembles this:
- 700 ms running Life128 to stabilise soups;
- 300 ms doing everything else.
There's not really a clear bottleneck any more, and both halves are quite intensively optimised, so I decided to stop there. The total time is now 1000 ms per soup, which is indeed seven times faster than the original version.
Overall, great work! Nice job!
Thanks!