The new generator works on the same 'combinatorial coordinates' system
as the more recently written Hats and Spectre generators.
When I came up with that technique for the Hats tiling, I was already
tempted to rewrite the Penrose generator on the same principle,
because it has a lot of advantages over the previous technique of
picking a randomly selected patch out of the subdivision of a huge
starting tile. It generates the exact limiting distribution of
possible tiling patches rather than an approximation based on how big
a tile you subdivided; it doesn't use any overly large integers or
overly precise floating point to suffer overflow or significance loss,
because it never has to even _think_ about the coordinates of any
point not in the actual output area.
But I resisted the temptation to throw out our existing Penrose
generator and move to the shiny new algorithm, for just one reason:
backwards compatibility. There's no sensible way to translate existing
Loopy game IDs into the new notation, so they would stop working,
unless we kept the old generator around as well to interpret the
previous system. And although _random seeds_ aren't guaranteed to
generate the same result from one build of Puzzles to the next, I do
try to keep existing descriptive game IDs working.
So if we had to keep the old generator around anyway, it didn't seem
worth writing a new one...
... at least, until we discovered that the old generator has a latent
bug. The function that decides whether each tile is within the target
area, and hence whether to make it part of the output grid, is based
on floating-point calculation of the tile's vertices. So a change in
FP rounding behaviour between two platforms could cause them to
interpret the same grid description differently, invalidating a Loopy
game on that grid. This is _unlikely_, as long as everyone uses IEEE
754 double, but the C standard doesn't actually guarantee that.
We found this out when I started investigating a slight distortion in
large instances of Penrose Loopy. For example, the Loopy random seed
"40x40t11#12345", as of just before this commit, generates a game
description beginning with the Penrose grid string "G-4944,5110,108",
in which you can see a star shape of five darts a few tiles down the
left edge, where two of the radii of the star don't properly line up
with edges in the surrounding shell of kites that they should be
collinear with. This turns out to be due to FP error: not because
_double precision_ ran out, but because the definitions of COS54,
SIN54, COS18 and SIN18 in penrose.c were specified to only 7 digits of
precision. And when you expand a patch of tiling that large, you end
up with integer combinations of those numbers with coefficients about
7 digits long, mostly cancelling out to leave a much smaller output
value, and the inaccuracies in those constants start to be noticeable.
But having noticed that, it then became clear that these badly
computed values were also critical to _correctness_ of the grid. So
they can't be fixed without invalidating old game IDs. _And_ then we
realised that this also means existing platforms might disagree on a
game ID's validity.
So if we have to break compatibility anyway, we should go all the way,
and instead of fixing the old system, introduce the shiny new system
that gets all of this right. Therefore, the new penrose.c uses the
more reliable combinatorial-coordinates system which doesn't deal in
integers that large in the first place. Also, there's no longer any
floating point at all in the calculation of tile vertex coordinates:
the combinations of 1 and sqrt(5) are computed exactly by the
n_times_root_k function. So now a large Penrose grid should never have
obvious failures of alignment like that.
The old system is kept for backwards compatibility. It's not fully
reliable, because that bug hasn't been fixed - but any Penrose Loopy
game ID that was working before on _some_ platform should still work
on that one. However, new Penrose Loopy game IDs are based on
combinatorial coordinates, computed in exact arithmetic, and should be
properly stable.
The new code looks suspiciously like the Spectre code (though
considerably simpler - the Penrose coordinate maps are easy enough
that I just hand-typed them rather than having an elaborate auxiliary
data-generation tool). That's because they were similar enough in
concept to make it possible to clone and hack. But there are enough
divergences that it didn't seem like a good idea to try to merge them
again afterwards - in particular, the fact that output Penrose tiles
are generated by merging two triangular metatiles while Spectres are
subdivisions of their metatiles.
This uses a tile shape very similar to the hat, but the tiling
_structure_ is totally changed so that there aren't any reflected
copies of the tile.
I'm not sure how much difference this makes to gameplay: the two
tilings are very similar for Loopy purposes. But the code was fun to
write, and I think the Spectre shape is noticeably prettier, so I'm
adding this to the collection anyway.
The test programs also generate a pile of SVG images used in the
companion article on my website.
The 'core' library contains almost all the same objects as 'common',
but leaves out hat.c. And the auxiliary program 'hatgen' now links
against that slightly reduced core library instead of 'common'.
This avoids a dependency loop: one of hatgen's jobs is to generate
hat-tables.h, but hat-tables.h is a dependency of it.
Of course, the generated hat-tables.h is already committed, so this
doesn't present a bootstrapping problem in a normal build. But if
someone modifies hatgen.c in order to regenerate hat-tables.h, and
does so in a way that makes it uncompilable, they can't rebuild hatgen
and try again! Of course you can always revert changes with git, but
it's annoying to have to. Better to keep the dependencies non-cyclic
in the first place.
Having stated the principle in the previous commit, I should apply it
consistently. A source file linked into the Puzzles library of common
support code should not also define a main() under ifdef.
This commit only goes as far as the _library_ support modules. It
would be a much bigger job to do the same for all the actual _puzzles_
that have test main()s or standalone-solver main()s. And it's not
necessary, because modifying one of those source files only triggers a
rebuild of _one_ puzzle, not absolutely everything. (Not to mention
that it's quite likely the puzzle and the test main() will need to be
modified in conjunction anyway.)
As in the previous commit, this has required exposing a few internal
API functions as global, and maybe editing them a bit. In particular,
the one-shot internal function that divvy_rectangle() loops on until
it succeeds is now exposed as divvy_rectangle_attempt(), which means
the test program doesn't have to condition a failure counter into the
real function.
I've thrown away penrose-vector-test completely, because that didn't
look like a test program with any ongoing use at all - it was surely
vestigial, while James was getting the vector representation up and
running in the first place.
I noticed while hacking on hat-test recently that it's quite awkward
to be compiling a test main() program that lives in a source file also
built into the Puzzles support library, because every modification to
main() also triggers a rebuild of the library, and thence of all the
actual puzzles. So it's better if such a test main() has its own
source file.
In order to make hat-test work standalone, I've had to move a lot of
hat.c's internal declarations out into a second header file. This also
means making a bunch of internal functions global, which means they're
also in the namespace of programs other than hat-test, which means in
turn that they should have names with less implicit context.
The big mathematical news this month is that a polygon has been
discovered that will tile the plane but only aperiodically. Penrose
tiles achieve this with two tile types; it's been an open question for
decades whether you could do it with only one tile. Now someone has
announced the discovery of such a thing, so _obviously_ this
mathematically exciting tiling ought to be one of the Loopy grid
options!
The polygon, named a 'hat' by its discoverers, consists of the union
of eight cells of the 'Kites' periodic tiling that Loopy already
implements. So all the vertex coordinates of the whole tiling are
vertices of the Kites grid, which makes handling the coordinates in an
exact manner a lot easier than Penrose tilings.
What's _harder_ than Penrose tilings is that, although this tiling can
be generated by a vaguely similar system of recursive expansion, the
expansion is geometrically distorting, which means you can't easily
figure out which tiles can be discarded early to save CPU. Instead
I've come up with a completely different system for generating a patch
of tiling, by using a hierarchical coordinate system to track a
location within many levels of the expansion process without ever
simulating the process as a whole. I'm really quite pleased with that
technique, and am tempted to try switching the Penrose generator over
to it too - except that we'd have to keep the old generator around to
stop old game ids being invalidated, and also, I think it would be
slightly trickier without an underlying fixed grid and without
overlaps in the tile expansion system.
However, before coming up with that, I got most of the way through
implementing the more obvious system of actually doing the expansions.
The result worked, but was very slow (because I changed approach
rather than try to implement tree-pruning under distortion). But the
code was reusable for two other useful purposes: it generated the
lookup tables needed for the production code, and it also generated a
lot of useful diagrams. So I've committed it anyway as a supporting
program, in a new 'aux' source subdirectory, and in aux/doc is a
writeup of the coordinate system's concepts, with all those diagrams.
(That's the kind of thing I'd normally put in a huge comment at the
top of the file, but doing all those diagrams in ASCII art would be
beyond miserable.)
From a gameplay perspective: the hat polygon has 13 edges, but one of
them has a vertex of the Kites tiling in the middle, and sometimes two
other tile boundaries meet at that vertex. I've chosen to represent
every hat as having degree 14 for Loopy purposes, because if you only
included that extra vertex when it was needed, then people would be
forever having to check whether this was a 13-hat or a 14-hat and it
would be nightmarish to play.
Even so, there's a lot of clicking involved to turn all those fiddly
individual edges on or off. This grid is noticeably nicer to play in
'autofollow' mode, by setting LOOPY_AUTOFOLLOW in the environment to
either 'fixed' or 'adaptive'. I'm tempted to make 'fixed' the default,
except that I think it would confuse players of ordinary square Loopy!
I've changed my mind already. The other versions of fuzzpuzz all have
different command-line interfaces anyway, so I think the best approach
is to just accept that and decide that precisely how fuzzpuzz works
isn't a defined API. Fuzzing is inherently not an end-user activity, so
I think it's acceptable to make it a bit inconsistent.
This means that in Clang builds you get the non-libFuzzer version of
fuzzpuzz by default (so you can use it with other fuzzers), but if you
turn on WITH_LIBFUZZER then you'll get the libFuzzer version instead.
For AFL++ and Honggfuzz, our approach is to build a standard fuzzpuzz
binary with extra hooks for interacting with an external fuzzer. This
works well for AFL++ and tolerably for Honggfuzz. LibFuzzer, though,
provides its own main() so that the resulting program has a very
different command-line interface from the normal one. Also, since
libFuzzer is a standard part of Clang, we can't decide whether to use it
based on the behaviour of the compiler.
So what I've done, at least for now, is to have CMake detect when we're
using Clang and in that case build a separate binary called
"fuzzpuzz-libfuzzer" which is built with -fsanitize=fuzzer, while the
ordinary fuzzpuzz is built without. I'm not sure if this is the right
approach, though.
Unlike AFL, Honggfuzz's compiler wrapper doesn't provide a convenient
preprocessor macro, so we have to have CMake detect the existence of
HF_ITER. Also the resulting program can't run outside of Honggfuzz, so
maybe some additional cleverness is called for there as well. Still, it
makes Honggfuzz go ten times faster, which is nice.
The call to cliprogram() doesn't actually add the target 'fuzzpuzz' on
that platform, so the subsequent target_include_directories fails. Fix
is to condition target_include_directories on the build_cli_programs
flag.
It relied on reading gamedesc.txt to find a list of puzzle binaries to
run. But gamedesc.txt is now specific to the Windows build (since it
contains Windows executable names), and isn't available in the Unix
cmake build directory.
Fixed by making a simpler gamelist.txt available on all platforms.
Most of these aren't especially useful, but if we're going to have
them in the code base at all, we should at least ensure they compile:
bit-rotted conditioned-out code is of no value.
One of the new programs is 'galaxieseditor', which borrows most of the
Galaxies code but changes the UI so that you can create and remove
_dots_ instead of edges, and then run the solver to see whether it can
solve the puzzle you've designed. Unlike the rest, this is a GUI
helper tool, using the 'guiprogram' cmake function introduced in the
previous commit.
The programs are:
- 'combi', a test program for the utility module that generates all
combinations of n things
- 'divvy', a test program for the module that divides a rectangle at
random into equally-sized polyominoes
- 'penrose-test', a test program for the Penrose-tiling generator
used in Loopy, which outputs an SVG of a piece of tiling
- 'penrose-vector', a much smaller test program for the vector
arithmetic subroutines in that code
- 'sort-test', a test of this code base's local array sorting routine
- 'tree234-test', the exhaustive test code that's been in tree234.c
all along.
Not all of them compiled first time. Most of the fixes were the usual
kind of thing: fixing compiler warnings by removing unused
variables/functions, bringing uses of internal APIs up to date. A
notable one was that galaxieseditor's interpret_move() modified the
input game state, which was an error all along and is now detected by
me having made it a const pointer; I had to replace that with an extra
wrinkle in the move-string format, so that now execute_move() makes
the modification.
The one I'm _least_ proud of is squelching a huge number of
format-string warnings in tree234-test by interposing a variadic
function without __attribute__((printf)).
This is similar in concept to Minesweeper, in that each clue tells you
the number of things (in this case, just 'black squares') in the
surrounding 3x3 grid section.
But unlike Minesweeper, there's no separation between squares that can
contain clues, and squares that can contain the things you're looking
for - a clue square may or may not itself be coloured black, and if
so, its clue counts itself.
So there's also no hidden information: the clues can all be shown up
front, and the difficulty arises from the game generator choosing
which squares to provide clues for at all.
Contributed by a new author, Didi Kohen. Currently only has one
difficulty level, but harder ones would be possible to add later.
At least, for the Unix build, so as to support Debian stable and a
couple of prior Ubuntu LTSes.
Not much needed to change in the cmake scripts; the only noticeable
difference was that the 'install' command needs an explicit RUNTIME
DESTINATION.
This completely removes the old system of mkfiles.pl + Recipe + .R
files that I used to manage the various per-platform makefiles and
other build scripts in this code base. In its place is a
CMakeLists.txt setup, which is still able to compile for Linux,
Windows, MacOS, NestedVM and Emscripten.
The main reason for doing this is because mkfiles.pl was a horrible
pile of unmaintainable cruft. It was hard to keep up to date (e.g.
didn't reliably support the latest Visual Studio project files); it
was so specific to me that nobody else could maintain it (or was even
interested in trying, and who can blame them?), and it wasn't even
easy to _use_ if you weren't me. And it didn't even produce very good
makefiles.
In fact I've been wanting to hurl mkfiles.pl in the bin for years, but
was blocked by CMake not quite being able to support my clang-cl based
system for cross-compiling for Windows on Linux. But CMake 3.20 was
released this month and fixes the last bug in that area (it had to do
with preprocessing of .rc files), so now I'm unblocked!
CMake is not perfect, but it's better at mkfiles.pl's job than
mkfiles.pl was, and it has the great advantage that lots of other
people already know about it.
Other advantages of the CMake system:
- Easier to build with. At least for the big three platforms, it's
possible to write down a list of build commands that's actually the
same everywhere ("cmake ." followed by "cmake --build ."). There's
endless scope for making your end-user cmake commands more fancy
than that, for various advantages, but very few people _have_ to.
- Less effort required to add a new puzzle. You just add a puzzle()
statement to the top-level CMakeLists.txt, instead of needing to
remember eight separate fiddly things to put in the .R file. (Look
at the reduction in CHECKLST.txt!)
- The 'unfinished' subdirectory is now _built_ unconditionally, even
if the things in it don't go into the 'make install' target. So
they won't bit-rot in future.
- Unix build: unified the old icons makefile with the main build, so
that each puzzle builds without an icon, runs to build its icon,
then relinks with it.
- Windows build: far easier to switch back and forth between debug
and release than with the old makefiles.
- MacOS build: CMake has its own .dmg generator, which is surely
better thought out than my ten-line bodge.
- net reduction in the number of lines of code in the code base. In
fact, that's still true _even_ if you don't count the deletion of
mkfiles.pl itself - that script didn't even have the virtue of
allowing everything else to be done exceptionally concisely.