===========================
 SFTP, VoIP, serialization
===========================

This month's personal news digest is largely a continuation of last
month's.


SFTP, curl, libssh2, Haskell
============================

The quest for working file download over SFTP keeps going on: the
custom globbing worked, and I added caching for directory listings,
since it seemed simpler than a dedicated thread (especially given that
multiple and dynamic remote servers may be used in the future). But
then noticed memory leakage. Spent some time poking GHC RTS options,
simplifying cache invalidation (making it simply time-based, rather
than renewing on the second hit for the same task), employing the
deepseq and strict-concurrency packages (with the latter leading to
high CPU load and delays; the former doing that as well, if it is used
where it is not really needed), switching to the threaded runtime,
calling cleanup of the curl handles manually, and profiling (only
found that the memory seems to be mostly in ByteString chunks, though
if the leakage is in the bindings, in curl itself, or in the libraries
it uses, it would not have shown up in GHC's profiling); nothing
helped.

Then rewrote it to use libssh2 bindings instead, at least to make it
clear whether the issue is in curl (or its bindings) or somewhere
else, and it kept leaking. Adjusted it to use remote globbing over SSH
instead of plain directory listing and caching; it still leaked. Wrote
a small standalone test program, to remove all the rest, and
eventually even skipped file downloads and directory listing; it kept
leaking. Today discovered that there is a memory leak in the libssh2
bindings (particularly in the (de)initialization functions), fixed it,
and submitted a PR; hopefully it will be merged soon. So apparently it
is simply a coincidence that both the curl bindings (or the library
itself) and the libssh2 bindings leak memory. Speaking of PRs and
coincidences, I have also submitted a typo fix for the curl bindings,
but apparently, just like SSHFS, that project is not actively
maintained. Quite a mess, but there is hope that it will not leak
anymore with today's fix.
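
The time-based invalidation of the directory listing cache can be
sketched roughly as follows. This is an illustration in Python rather
than the Haskell code in question: ``TimedCache``, ``fetch``, and the
``clock`` parameter are names made up for the example, and ``fetch``
stands for whatever function performs the actual remote listing.

```python
import time


class TimedCache:
    """Cache values per key and refetch them once they are too old."""

    def __init__(self, ttl_seconds, fetch, clock=time.monotonic):
        self.ttl = ttl_seconds  # how long an entry stays valid
        self.fetch = fetch      # hypothetical: lists a remote directory
        self.clock = clock      # injectable, so that it can be tested
        self.entries = {}       # key -> (time stored, value)

    def get(self, key):
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        # Missing or expired: fetch again and restart the timer.
        value = self.fetch(key)
        self.entries[key] = (now, value)
        return value
```

An entry is only replaced when it is requested again after expiring,
so a long-running process with many distinct keys would also need to
drop stale entries from time to time.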

Apparently there is still some time to write a futuristic sci-fi novel
about a world in which file transfer is a solved problem: maybe not
just with rather involved setups like this, but even between casual
computer users.


Serialization
=============

The memory leakage made me think of implementing it in C instead,
especially when it was not clear whether the issue is in the GC, the
laziness, the bindings, or something else: all the moving parts do not
help with debugging, while with C leaks are generally easier to search
for with Valgrind, and the libraries are more polished than the
bindings are. I use a basic language-agnostic IPC for those programs,
Unix domain sockets and JSON, but the JSON structures are simply
automatically derived out of Haskell types: planned to specify them,
but that is additional work (and code), and I am not entirely happy
with JSON anyway.
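
For illustration, exchanging JSON over a Unix domain socket might look
like the Python sketch below. The newline-delimited framing and the
helper names (``send_message``, ``receive_message``) are assumptions
made for the example, not a description of the actual programs.

```python
import json
import socket


def send_message(sock, message):
    """Send one JSON value, terminated by a newline."""
    # json.dumps escapes newlines inside strings, so a raw newline
    # can only appear as the message terminator.
    sock.sendall(json.dumps(message).encode("utf-8") + b"\n")


def receive_message(reader):
    """Read one JSON value from a binary file object, None on EOF."""
    line = reader.readline()
    if not line:
        return None
    return json.loads(line)
```

With ``left, right = socket.socketpair()`` and
``reader = right.makefile("rb")``, a message sent through
``send_message(left, ...)`` can be read back with
``receive_message(reader)``; separate programs would use ``bind`` and
``connect`` on a socket path instead.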

Sometimes thinking of switching to XML, though that has its own
awkwardness. S-expressions usually look nice to me, but there is no
standardized version disconnected from lisps. So I decided to take a
stab at specifying a serialization format, in addition to ranting
about those regularly: probably it will not lead to anything useful,
but at least going to try. So far considered what would be a nice and
simple structure at
<https://thunix.net/~defanor/notes/serialisation-formats.xhtml>,
have put together the grammar, and implemented a few parsers for it (C
with flex and Bison, plain C, Haskell with attoparsec, Python with
pyparsing) at <https://codeberg.org/defanor/word-tree>. Basically it
is like S-expressions without quoted strings (so more like XML in
that), representing a tree of strings. Parenthesized to shape the
tree, whitespace-delimited, without primitive types::

    delimiter = " " | "\n"
    restricted-char = "(" | ")" | delimiter | "\"
    tree-or-val = "(" forest ")"
                | delimiter +
                | ("\" restricted-char | any-char - restricted-char) +
    forest = tree-or-val *
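
A minimal recursive-descent parser for this grammar could look like
the following Python sketch. It is not one of the parsers from the
repository: the function names are made up, and dropping delimiter
runs from the result (rather than keeping them as nodes) is an
assumption of the sketch.

```python
DELIMITERS = " \n"
RESTRICTED = "()\\" + DELIMITERS


def parse_word(text, pos):
    """Read a run of plain and backslash-escaped characters."""
    chars = []
    while pos < len(text) and (text[pos] == "\\" or text[pos] not in RESTRICTED):
        if text[pos] == "\\":
            # A backslash must be followed by a restricted character.
            if pos + 1 >= len(text) or text[pos + 1] not in RESTRICTED:
                raise ValueError(f"bad escape at offset {pos}")
            chars.append(text[pos + 1])
            pos += 2
        else:
            chars.append(text[pos])
            pos += 1
    return "".join(chars), pos


def parse_forest(text, pos=0, depth=0):
    """Return (forest, next position); subtrees are nested lists."""
    forest = []
    while pos < len(text):
        ch = text[pos]
        if ch in DELIMITERS:
            pos += 1  # delimiters only separate words in this sketch
        elif ch == "(":
            subtree, pos = parse_forest(text, pos + 1, depth + 1)
            if pos >= len(text) or text[pos] != ")":
                raise ValueError("unclosed parenthesis")
            forest.append(subtree)
            pos += 1
        elif ch == ")":
            if depth == 0:
                raise ValueError(f"unexpected ')' at offset {pos}")
            return forest, pos  # the caller consumes the ")"
        else:
            word, pos = parse_word(text, pos)
            forest.append(word)
    return forest, pos


def parse(text):
    """Parse a whole document into a list of words and subtrees."""
    forest, _ = parse_forest(text)
    return forest
```

Here ``parse("(a b) c")`` gives ``[["a", "b"], "c"]``, and an escaped
delimiter stays inside its word.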

Now thinking about a schema for it. Initially thought of focusing on
lexical analysis, since it has no types anyway. Could use regexps for
literals, or maybe (A)BNF to handle context-free grammars, which would
be nice. But if it is going to be (A)BNF, the overall structure (the
shape of the tree) can be described with it as well, so (A)BNF could
simply be used instead of a separate schema.


Voice conferences with Mumble
=============================

Finally tried a seemingly working and easy-to-set-up voice
conferencing software, Mumble (and Mumla on Android). It uses a custom
container format with encryption (where Ogg or (S)RTP could have been
used, I think, although its custom format seems to fit it better), and
TLS that is required, but without PKIX or TLSA verification, and
without PSK either; generally it seems quite hacky, pragmatic, and
relatively simple. I
heard of it before, but did not try it, since it is not likely to be
usable in the settings where voice conferences are needed, such as
work: people are too reluctant to install special client software, and
probably somebody will have MS Windows or an Apple system, where
software cannot be installed or does not work properly. So chances are
I will not even be able to try it for real conferencing with others,
but nice to know that it is available. Not a bad example of applying
the "worse is better" approach.


Voice control
=============

Some time ago I wondered about trying voice control to execute a few
predefined commands. Tried out CMU Sphinx back then, which seemed to
barely work, and DeepSpeech, which worked much better, though was
awkward to set up, and being a new project, its future was less
clear. Now looked into it again: DeepSpeech appears to be abandoned
(as is Mycroft, by the way), but now there is Whisper, which also
works well, though now it is new and relatively awkward to set
up. Tried CMU Sphinx with "adaptation" this time, and with a
defined grammar restricted to about 40 words: thought that maybe if it
cannot handle arbitrary words, I can live with dictating them using a
phonetic alphabet and digit names instead. But that barely worked:
Sphinx recognizes 6 to 9 words out of about 40, while Whisper
recognizes almost all of them, even without any "adaptations" or
grammar restriction. Though for a single speaker and such a restricted
list of words, I wonder whether some simpler software may work well. I
think rather old software was capable of that decades ago.
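
The dictation idea can be sketched as a small decoder that turns
recognized code words back into characters. The word list here (the
NATO spelling alphabet plus English digit names, 36 words) is an
assumption for the example, not the exact vocabulary used in the
experiment, and ``decode`` is a hypothetical helper.

```python
LETTER_WORDS = (
    "alfa bravo charlie delta echo foxtrot golf hotel india juliett "
    "kilo lima mike november oscar papa quebec romeo sierra tango "
    "uniform victor whiskey xray yankee zulu"
).split()
DIGIT_WORDS = "zero one two three four five six seven eight nine".split()

# Letters map to the first character of their code word,
# digits map to their position in the list.
CODE = {word: word[0] for word in LETTER_WORDS}
CODE.update({word: str(i) for i, word in enumerate(DIGIT_WORDS)})


def decode(transcript):
    """Turn a transcript of code words into the dictated string."""
    chars = []
    for token in transcript.lower().split():
        # Recognizers may add capitalization and punctuation.
        word = token.strip(".,!?")
        if not word:
            continue
        if word not in CODE:
            raise ValueError(f"not in the vocabulary: {word!r}")
        chars.append(CODE[word])
    return "".join(chars)
```

A transcript such as "hotel echo lima lima oscar four two" would then
decode to ``hello42``.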


Miscellany
==========

- Still going through the physics textbook, still slowly and not
  skipping exercises, only reached chapter 5 (while the first 14
  chapters are on mechanics). But rather happy that I did not drop it
  by now. Far from active usage of fun SymPy and LaTeX bits, as I did
  with the electrostatics book, but doing some diagram drawing with
  Inkscape, plotting with matplotlib, spreadsheets with org-mode's
  tables; tools I rarely use otherwise, but they are nice.

- Tried a slightly different cheesecake recipe; I think it is hard to
  go particularly wrong with them: combinations of ricotta,
  mascarpone, possibly yogurt or sour cream, eggs, flour, sugar, and
  flavorings, with cookies and butter for crust, tend to produce
  something nice. As various vegetables, meat, mushrooms, and spices
  do in stews, casseroles, soups. Had a store-bought serving of
  steamed vegetables though, those were not as nice: tough,
  plastic-flavored, no oil or salt. But as long as they are prepared
  sensibly, nice ingredients are not easily messed up. Speaking of
  stews, tried making a beef stew (out of a chuck roll) in a ceramic
  pot (which may qualify as a Dutch oven), employing an oven, in
  addition to a stovetop. Uncertain if this is better than a regular
  pan and a stovetop, but seems like an okay method.

- Frozen broccoli with eggs makes a nice dish, and I plan to try using
  more frozen vegetable bags, to bother less with cutting. Cutting
  vegetables takes too much time, and it is nice to have those in a
  freezer, rather than to shop specifically for a planned dish. Not as
  good as fresh vegetables, perhaps, but beats those store-bought
  prepared ones, or things like bread with cheese, which I have
  sometimes as well.

- Going to finally try pour-over coffee brewing, ordered a V60 dripper
  and filters. I probably have too many coffee brewing devices
  already though.

- Started skipping the evening stretching (balancing) routine and
  meditation (not sure if it does anything, but sitting calmly for a
  few minutes should not harm, and probably I will try to resume it),
  though doing all the other physical exercises daily still. Sometimes
  it feels like there is not enough time for everything, but I also
  notice that the time it takes to do the same parts of the routine
  can vary considerably, depending on whether I hurry and do things as
  quickly as manageable, or procrastinate in front of the computer
  after every small bit.

- Learned that turkey thighs work well for a soup, even though I
  thought I disliked darker chicken or turkey meat: rather flavorful,
  tender, and unlike chicken thighs, without much connective tissue
  and other unpleasant bits around them.


----

:Date: 2024-05-22