I've had a lot of Puzzles nightly builds fail recently in the NestedVM
stage, with a 'jar' command producing a message along these lines:
java.util.zip.ZipException: attempt to write past end of STORED entry
at java.base/java.util.zip.ZipOutputStream.write(ZipOutputStream.java:337)
at jdk.jartool/sun.tools.jar.Main.copy(Main.java:1250)
at jdk.jartool/sun.tools.jar.Main.copy(Main.java:1263)
at jdk.jartool/sun.tools.jar.Main.addFile(Main.java:1211)
at jdk.jartool/sun.tools.jar.Main.create(Main.java:879)
at jdk.jartool/sun.tools.jar.Main.run(Main.java:319)
at jdk.jartool/sun.tools.jar.Main.main(Main.java:1681)
Suppressed: java.util.zip.ZipException: invalid entry size (expected 0 but got 786 bytes)
at java.base/java.util.zip.ZipOutputStream.closeEntry(ZipOutputStream.java:288)
at java.base/java.util.zip.ZipOutputStream.finish(ZipOutputStream.java:361)
at java.base/java.util.zip.DeflaterOutputStream.close(DeflaterOutputStream.java:238)
at java.base/java.util.zip.ZipOutputStream.close(ZipOutputStream.java:378)
at jdk.jartool/sun.tools.jar.Main.create(Main.java:854)
... 2 more
It's hard to work out exactly what this error dump means, and
web-searching for the error message isn't much help because the same
exception can occur in application code using java.util.zip, and most
mentions on the web are about that, and not about what I want to know,
which is why it might happen in the 'jar' program in particular.
However, the clues visible in that message suggest that 'jar' had
somehow got confused about the size of one of the files it was adding
to the jar archive, in that it initially decided it was 0 bytes long
and later found it was longer. That suggests a problem of excessive
parallelism between the build steps, perhaps due to a missing
dependency in the makefile, which might plausibly cause the 'jar' step
to be running already while some file it needs to read is still being
written. (Which would also explain why it doesn't happen every time.)
An eyeball review of cmake/platforms/nestedvm.cmake didn't find any
obvious missing dependencies. But I vaguely remembered that in some
other context I'd had trouble with cmake 'add_custom_command'. So in
this commit I replace all those custom commands with custom _targets_,
listing the previous OUTPUT files as BYPRODUCTS. And then the
dependencies are written using the target names, instead of the file
names.
I don't fully understand why this should make a difference. But it
seems more reliable in a soak test, and still builds the right things,
so I'll commit it and see if it makes the flakiness in the actual
nightly builds stop happening.
Apparently, some compilers can't work out that the pattern
if (!t->root) { special-case handler followed by early return }
n = t->root;
while (n) { ... }
will execute the while loop at least once, on the grounds that the
_first_ test for n being non-NULL must pass, because we initialised n
from t->root which can't be NULL on any code path where we didn't take
the early return. So they might give an uninitialised-variable warning
for the variable 'ki', which is initialised inside the while loop.
Compilers, eh. But it's easy enough to turn the while into a do-while,
so that even the least alert compiler will know it runs at least once.
Apparently some compilers can't work out that new_window() will always
write to its error-message parameter if it returns failure, so they
complain at the call site that 'error' might be used uninitialised.
Fix by explicitly initialising it. (To NULL, which really _shouldn't_
stop the compiler from warning, because surely that's just as bad if
it reaches the following printf!)
Also, while I'm at it, move it into the block where it's used, so it
doesn't look as if it might pervade the whole of main().
I don't expect this to actually come up in any circumstance, but it
prevents a warning in some versions of gcc that would otherwise arise
from the use of 'int' to compute the input size: if gcc isn't
confident that the int is positive, then it complains that possible
inputs to malloc might be in the region of 2^64 - (small multiple of a
negative 32-bit int).
I would hope malloc would fail in any case on such an input, so
failing a couple of lines earlier makes no important difference.
Annoyingly, stdint.h is missing in my NestedVM build setup (though it
has stdbool.h - it's not _totally_ C90). So I have to check that at
cmake time.
Also, removed the #defines for smalloc and friends from the tree234
test mode. These were needed in the old build system, when
tree234-test was built ad-hoc without being linked against malloc.c.
But now tree234-test links against the same utils library as
everything else, and can use the real smalloc - and doing so prevents
another of these warnings when compiling with -flto.
My standard 'abort on failure' wrappers around malloc and friends look
more or less the same in most of my C software. In this case, they
were so much the same that there was even a comment betraying that I
copy-pasted them from Halibut. And nobody has noticed in the whole
lifetime of this code base :-)
Previously, the typedef 'clue' was just 'char', but it was used in the
expectation that it would be signed. So on platforms that default to
unsigned char, such as 32-bit Arm, Palisade would completely fail to
function correctly.
Every time we append to the string 'ret', we check first that there's
enough space, and realloc it larger if it's getting close to full.
Except that I missed one case at the join between the two parts of the
encoding.
(Spotted because apparently on someone's build platform this led to a
compiler warning that 'ret' might be null. I think _that's_ not a
serious worry, but the missing resize was definitely unintentional.)
Most games store a string in 'aux' during new_game_desc() that saves
solve_game() during gameplay from having to reconstruct the solution
from scratch. Galaxies had all the facilities available to do that,
but apparently just forgot to use them.
(Reindents existing code: diff best viewed with whitespace ignored.)
In the very first stage of game generation, we're supposed to pick 1/5
of the grid squares to put tents in, in a uniformly random manner. It
failed to be uniform, because I had the wrong limit in random_upto -
but it was too small rather than too large, so it never overran the
buffer in a way that something like ASan or valgrind would have caught.
This integer parameter appears in all of the game's anim_length,
flash_length and redraw methods, but the documentation only described
it properly in the section for anim_length.
The section for flash_length refers you to anim_length for the
description of all the parameters, but unhelpfully, it did so without
a conveniently clickable (in HTML) cross-reference.
And the section for game_redraw missed out the description of 'int
dir' completely, which is particularly unhelpful since that's the
function most likely to actually need to care about it! (Even if
forward and reversed move animations look different, they _probably_
still have the same duration, so it's more likely that anim_length
would ignore 'dir' and redraw would use it than vice versa.)
They disappeared in commit c0da615a933a667: I removed all the
per-puzzle code that erased the whole game window on first draw, and
accidentally also took out the code that drew the Tents grid.
Each individual grid tile includes its left and top grid lines, so
most of the grid ended up being drawn anyway by draw_tile(). But the
right and bottom borders aren't within any tile, so they weren't.
It relied on reading gamedesc.txt to find a list of puzzle binaries to
run. But gamedesc.txt is now specific to the Windows build (since it
contains Windows executable names), and isn't available in the Unix
cmake build directory.
Fixed by making a simpler gamelist.txt available on all platforms.
A user recently mentioned having found even 'Easy' to be harder than
they'd like. That difficulty level generates puzzles that can be
solved by filling in any square for which the other colour would
immediately generate a run of 3, and spotting rows that have the
correct total number of one colour and filling in all remaining
squares in the other colour.
Initially I thought there wasn't much I could do to make the solution
techniques easier than that. But after a bit of thought, I decided the
second criterion can be weakened a bit. The new 'Trivial' level
replaces it with a special case: when a row or column only has _one_
remaining unfilled square, the remaining square is filled in by
counting.
That version of the rule doesn't require the player to do any counting
in order to _spot_ possible applications of it: you can see at a
glance that a row or column has only one remaining grey square, even
if having seen it you then have to count to work out which colour to
fill it in. So it makes a gentler introduction to the game.
Patch due to Suller Andras, who noticed that the COL_HIGHLIGHT lines
at the top of the left clue bar and the left of the top one (i.e. the
ones next to the green 'done' button when it appears) were missing,
because the 'done' button was occupying a pixel too much space.
This seems like a generally helpful design for game editors in
general: if we're going to have a helper program that can construct an
instance of a game, then one obvious thing you'd want as output from
it would be the descriptive game id, suitable for pasting back into
the playing UI.
So, in the just-re-enabled helper program 'galaxieseditor', I've
rewritten game_text_format so that it generates exactly that. Now you
can place dots until you have a puzzle you like, then use the 'Copy'
menu item to make it into a game id usable in 'galaxies' proper.
This doesn't set a precedent that I'm planning to _write_ editors for
all the other games, or even any of them (right now). For some, it
wouldn't be too hard (especially games where the solution and clues
are the same kind of thing, like default-mode Solo); for others, it
would be a huge UI nightmare.
Most of these aren't especially useful, but if we're going to have
them in the code base at all, we should at least ensure they compile:
bit-rotted conditioned-out code is of no value.
One of the new programs is 'galaxieseditor', which borrows most of the
Galaxies code but changes the UI so that you can create and remove
_dots_ instead of edges, and then run the solver to see whether it can
solve the puzzle you've designed. Unlike the rest, this is a GUI
helper tool, using the 'guiprogram' cmake function introduced in the
previous commit.
The programs are:
- 'combi', a test program for the utility module that generates all
combinations of n things
- 'divvy', a test program for the module that divides a rectangle at
random into equally-sized polyominoes
- 'penrose-test', a test program for the Penrose-tiling generator
used in Loopy, which outputs an SVG of a piece of tiling
- 'penrose-vector', a much smaller test program for the vector
arithmetic subroutines in that code
- 'sort-test', a test of this code base's local array sorting routine
- 'tree234-test', the exhaustive test code that's been in tree234.c
all along.
Not all of them compiled first time. Most of the fixes were the usual
kind of thing: fixing compiler warnings by removing unused
variables/functions, bringing uses of internal APIs up to date. A
notable one was that galaxieseditor's interpret_move() modified the
input game state, which was an error all along and is now detected by
me having made it a const pointer; I had to replace that with an extra
wrinkle in the move-string format, so that now execute_move() makes
the modification.
The one I'm _least_ proud of is squelching a huge number of
format-string warnings in tree234-test by interposing a variadic
function without __attribute__((printf)).
These look like puzzles, in that they link against a frontend and
provide the usual 'struct game', but they don't count as a puzzle for
purposes of shipping, or even having to have descriptions and icons.
There's one of these buried in the code already under an ifdef, which
I'll re-enable in the next commit.
I just found out, in GTK3 under X11, that if you use the 'Copy' menu
item to copy a text representation of the game to the clipboard,
afterwards the puzzle window is no longer redrawn.
That was pretty mysterious, and I still don't _fully_ understand it.
But I think the main point is that when you set the GtkDrawingArea to
be the owner of an X11 selection, that requires it to have an X11
window of its own, where previously it was managing fine as a logical
subrectangle of its containing GtkWindow. And apparently switching
strategies half way through the run is confusing to GTK, and causes
redraws to silently stop working.
The easy workaround is to make fe->window rather than fe->area the
thing that owns GTK selections and receives the followup events. That
must already have an X window, so nothing has to be changed when we
make it a selection owner half way through the run.
In this platform's set_platform_puzzle_target_properties, I referred
to ${EXENAME} twice, which is not one of the function parameters. It
worked anyway, because CMake has dynamic scope, and that variable was
defined - to the right value - within the local-variable scope of the
calling function. But that wasn't at all what I meant to do!
Renamed it to ${TARGET}, which is the actual parameter name we get
passed.
The previous fix worked OK, but it was conceptually wrong. Puzzles
save files are better regarded as binary, not text: the length fields
are measured in bytes, so translating the file into a different
multibyte character encoding would invalidate them.
So it was wrong to fetch a C byte string containing the exactly right
binary data, then translate it into a Javascript string as if decoding
from UTF-8, then retranslate to a representation of a bytewise
encoding via encodeURIComponent, and then label the result as
application/octet-stream.
This probably wouldn't have caused any problems in practice, because I
don't remember any situation in which my save files use characters
outside printable ASCII (plus newline). But it's not actually
forbidden, so a save file might choose to do that some day, so that
UTF-8 decode/reencode hidden in the JS was a latent bug.
Now the URI-encoding is done on the C side, while we still know
exactly what the binary data ought to look like and can be sure we're
translating it byte for byte into the output encoding for the data:
URI. By the time the JS receives a string pointer from get_save_file,
it's already URI-encoded, which _guarantees_ that it's in ASCII and
won't be messed about with by Emscripten's UTF8ToString.
In commit f6434e84964d840 I said I had replaced all uses of
old-Emscripten's Pointer_stringify() function with new-Emscripten's
UTF8ToString(). In fact, I only replaced the ones in emcclib.js, but I
missed one in emccpre.js used in preparing downloadable save files.
Those were therefore broken, with a Javascript undefined-name error.
The previous code had multiple bugs. We had completely left out the
draw_update after drawing each arrow; we omitted the usual
precautionary clip() that constrains each arrow draw to the same
rectangle we just saved in the blitter; we re-computed the coordinates
of the opposite arrow at undraw time, instead of saving the
coordinates we _actually_ used after computing them the first time.
And we restored from the two blitters in the same order we saved them,
instead of reverse order, which was harmless at the time (the drawing
happened after both saves), but is generally bad practice, and needed
to be fixed when the code was rearranged to fix the rest of these
issues.
I noticed these issues in passing, while hunting the diagonal-line bug
fixed in the previous commit. These fixes by themselves would have
prevented any persistent drawing artefact as a result of that bug (the
clip() would have constrained the spurious diagonal line to the region
saved by the blitter, so it would have been undrawn again afterwards);
but it's better to have fixed the root cause as well!
During an interactive drag of an arrow, it's possible to put the mouse
pointer precisely on the pixel the arrow is pointing at. When that
happens, draw_arrow computes the zero-length vector from the pointer
to the target pixel, and tries to normalise it to unit length by
dividing by its length. Of course, this leads to computing 0/0 = NaN.
Depending on the platform, this can cause different effects.
Conversion of a floating-point NaN to an integer is not specified by
IEEE 754; on some platforms (e.g. Arm) it comes out as 0, and on
others (e.g. x86), INT_MIN.
If the conversion delivers INT_MIN, one possible effect is that the
arrow is drawn as an immensely long diagonal line, pointing upwards
and leftwards from the target point. To add to the confusion, that
line would not immediately appear on the display in full, because of
the draw_update system. But further dragging-around of arrows will
gradually reveal it as draw_update rectangles intersect the corrupted
display area.
However, that diagonal line need not show up at all, because once
draw_arrow has accidentally computed a lot of values in the region of
INT_MIN, it then adds them together, causing signed integer overflow
(i.e. undefined behaviour) to confuse matters further! In fact, this
diagonal-line drawing artefact has only been observed on the
WebAssembly front end. The x86 desktop platforms might very plausibly
have done it too, but in fact they didn't (I'm guessing because of
this UB issue, or else some kind of clipping inside the graphics
library), which is how we haven't found this bug until now.
Having found it, however, the fix is simple. If asked to draw an arrow
from a point to itself, take an early return from draw_arrow before
dividing by zero, and don't draw anything at all.
Thanks to Kaz Kylheku for pointing out that commit ff3e762fd007883
didn't do a complete job: I removed the code under '#ifdef _WIN32_WCE'
in windows.c, but missed sections under the same ifdef in puzzles.h
and puzzles.rc, together with an entire header file resource.h that
was only included by code under those ifdefs.
Trivially, no edge of this kind can be part of any legal solution, so
it's a minor UI affordance that doesn't spoil any of the puzzly
thinking to prevent the user accidentally placing them in the first
place.
Suggestion from Larry Hastings, although this is not his patch.
I just introduced the 'first_draw' flag in the midend, which should
force a screen clear whenever we draw a puzzle with a fresh drawstate.
But in fact there were several places where the midend replaces the
drawstate and I hadn't set that flag to true.
In particular, a user just reported that when you press 'n' for a new
game in an existing Magnets window, the new puzzle's clues are drawn,
but any old clues in places where the new puzzle doesn't have one is
not _un_drawn. (Because Magnets has no code to undraw a single clue -
it never needs to!)
Added a set of first_draw wherever we call new_drawstate, which should
make this reliable again.
Negative numbers are used as a sentinel for an absent clue, so we have
to use a type that's guaranteed to have some negative numbers. char is
unsigned on some platforms. So now Mosaic runs apparently correctly on
Raspbian, for example.
I don't know how I've never thought of this before! Pretty much every
game in this collection has to have a mechanism for noticing when
game_redraw is called for the first time on a new drawstate, and if
so, start by covering the whole window with a filled rectangle of the
background colour. This is a pain for implementers, and also awkward
because the drawstate often has to _work out_ its own pixel size (or
else remember it from when its size method was called).
The backends all do that so that the frontends don't have to guarantee
anything about the initial window contents. But that's a silly
tradeoff to begin with (there are way more backends than frontends, so
this _adds_ work rather than saving it), and also, in this code base
there's a standard way to handle things you don't want to have to do
in every backend _or_ every frontend: do them just once in the midend!
So now that rectangle-drawing operation happens in midend_redraw, and
I've been able to remove it from almost every puzzle. (A couple of
puzzles have other approaches: Slant didn't have a rectangle-draw
because it handles even the game borders using its per-tile redraw
function, and Untangle clears the whole window on every redraw
_anyway_ because it would just be too confusing not to.)
In some cases I've also been able to remove the 'started' flag from
the drawstate. But in many cases that has to stay because it also
triggers drawing of static display furniture other than the
background.
This is similar in concept to Minesweeper, in that each clue tells you
the number of things (in this case, just 'black squares') in the
surrounding 3x3 grid section.
But unlike Minesweeper, there's no separation between squares that can
contain clues, and squares that can contain the things you're looking
for - a clue square may or may not itself be coloured black, and if
so, its clue counts itself.
So there's also no hidden information: the clues can all be shown up
front, and the difficulty arises from the game generator choosing
which squares to provide clues for at all.
Contributed by a new author, Didi Kohen. Currently only has one
difficulty level, but harder ones would be possible to add later.
Using a stunt webserver which artificially introduces a 3s delay just
before the last line of the HTML output, I have reproduced a
uwer-reported loading/startup race bug:
Previously the wasm loading was started by the <script> element,
synchronously. If the wasm loading is fast, and finishes before the
HTML loading, the onRuntimeInitialized event may occur before
initPuzzles. But initPuzzles sets up the event handler.
Fix this bug, and introduce a new comment containing an argument for
the correctness of the new approach.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
A grid based on dodecagons with square symmetry. In between dodecagons
there are 4 triangles around 1 square, which resembles a compass rose.
https://en.wikipedia.org/wiki/3-4-3-12_tiling
These came from Visual Studio, and seem to be real problems - we're
casting pointers to 32-bit integers, which surely only works by luck
of address-space allocation.
This set of rules should cover make and ninja on Linux, and all of
nmake, ninja and vcxproj on Windows, so that if someone follows the
README build instructions (by doing 'cmake .' in-tree), it should
generate no debris that .gitignore can't filter out.
How embarrassing. When I updated the Emscripten build to use WASM, a
major reason I bothered to do it at all was that I'd heard that WASM
was capable of reallocating its memory arena larger on the fly. Turns
out that it _can_, but only if you specifically set the option in
Emscripten to allow it.
With this option set, I can finish a 25x25 Galaxies, where previously
the game would crash part way through (and not even a very large part)
with errors about memory growth in the Javascript console.
This allows the icons build to automatically disable itself if Perl
can't be found at all (and print a warning explaining that that's
why). It also means that if Perl exists on the system but is somewhere
other than /usr/bin (where our #! lines expect it), the icons build
can still run.
Now I've had a chance to collect some more data from browser users,
it's got a list of browsers in which we've definitely seen the WASM
puzzle working (so that if _your_ instance of one of those browsers
fails, you should check it for problems at your end, like
configuration details or overzealous web filtering software), and some
thoughts about what to report.
The result is a lot longer than the previous error message, so I've
put the bulk of it in a <details>. That way it won't look so silly
when it initially flashes up while things are loading.
The old one was totally out of date (it mentioned typed arrays and a
specific set of browser versions against which the previous Emscripten
build had been tested).
Also, a couple of users in the last day or two have reported
mysterious failures of the WASM puzzles, which I haven't been able to
reproduce in the same version of the same browser. So something odd is
going on, and this seems like a good place to put suggestions about
what diagnostic information to send.
Various cmake variables that I was informally expecting users to set
on the cmake command line (e.g. cmake -DSTRICT=ON, or cmake
-DPUZZLES_GTK_VERSION=2) are now labelled explicitly with the CACHE
tag, and provided with a documentation string indicating what they're
for.
One effect of this is that GUI-like interfaces to your cmake build
directory, such as ccmake or cmake-gui, will show those variables
explicitly to give you a hint that you might want to change them.
Another is that when you do change them, cmake will recognise that it
needs to redo the rest of its configuration. Previously, if you sat in
an existing cmake build directory and did 'cmake -DSTRICT=ON .'
followed by 'cmake -DSTRICT=OFF .', nothing would happen, even though
you obviously meant it to.
I presume this will improve performance. Also, if I've understood
correctly, WASM-based compiled web code is capable of automatically
growing its memory, which the previous asm.js build of the puzzles
could not do, and occasionally caused people to complain that if they
tried to play a _really big_ game in their browser, the JS would
eventually freeze because the emulated memory ran out.
I've been putting off doing this for ages because my previous
Emscripten build setup was so finicky that I didn't like to meddle
with it. But now that the new cmake system in this source tree makes
things generally easier, and particularly since I've just found out
that the up-to-date Emscripten is available as a Docker image (namely
"emscripten/emsdk"), this seemed like a good moment to give it a try.
The source and build changes required for this update weren't too
onerous. I was half expecting a huge API upheaval, and indeed there
was _some_ change, but very little:
- in the JS initPuzzle function, move the call to Module.callMain()
into Module.onRuntimeInitialized instead of doing it at the top
level, because New Emscripten's .js output likes to load the
accompanying .wasm file asynchronously, so you can't call the WASM
main() until it actually exists.
- in the JS-side library code, replace all uses of Emscripten's
Pointer_stringify() function with the new name UTF8ToString(). (The
new version also has an ASCIIToString(), so I guess the reason for
the name change is that now you get to choose which character set
you meant. I need to use UTF-8, so that the × and ÷ signs in Keen
will work.)
- set EXTRA_EXPORTED_RUNTIME_METHODS=[cwrap,callMain] on the emcc
link command line, otherwise they aren't available for my JS setup
code to call.
- (removed -s ASM_JS=1 from the link options, though I'm not actually
sure it made any difference one way or the other in the new WASM
world)
- be prepared for a set of .wasm files to show up as build products
alongside the .js ones.
- stop building with -DCMAKE_BUILD_TYPE=Release! I'm not sure why
that was needed, but if I leave that flag on my cmake command line,
the output .js file fails to embed my emccpre.js, so the initial
call to initPuzzle() fails from the HTML wrapper page, meaning
nothing at all happens.