Monthly Shaarli

All links of one month in a single page.

May, 2025

MAC Address Vendor Lookup | MAC Address Lookup

Find vendor names from MAC addresses.

About Lojban :: lojban.io

A very nice FAQ for Lojban.

Delimited continuations

Delimited continuations are a control-flow mechanism built on two primitives, prompt and control. They resemble call/cc at a glance, but the captured continuation only extends up to the enclosing prompt rather than covering the whole rest of the program.

For example, the following expression returns 7:

(prompt
  (+ 1 (control k
    (let* ([x (k 1)] [y (k 2)] [z (* x y)])
      (k z)))))

k is the continuation reified as a function. To understand the code above, read k as a black-box function: its argument takes the place of the (control ...) expression, the rest of the (prompt ...) body is evaluated with that value, and the result of the (prompt ...) expression becomes the return value of the call to k. Here k behaves like (lambda (v) (+ 1 v)), so x is 2, y is 3, z is 6, and the final (k z) yields 7. I find it helpful to visualize (prompt ...) as a box and (control k ...) as a hole in that box.

Announcing Clipper: TLS-transparent HTTP debugging for native apps - jade's www site

Clipper is a network debugging tool that intercepts TLS traffic so it can be viewed in Chrome DevTools. What interests me the most is how it decrypts the TLS traffic.

There are several ways to do that that I know of. The first is the SSLKEYLOGFILE environment variable: tools that respect it dump their session keys to the specified file, which can then be picked up by tools like Wireshark. The problem is that many tools don't respect the variable out of the box. The second is to MITM the traffic with a self-signed certificate. That method doesn't work with TLS key pinning, and the proxy layer means the captured traffic doesn't truly reflect the original.
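As a sketch of the first method (unrelated to Clipper's own code): Python's standard ssl module has honored SSLKEYLOGFILE since 3.8, so a request made with a default context dumps the session secrets. The URL and key-log path below are placeholders.

import os, ssl, urllib.request
os.environ["SSLKEYLOGFILE"] = "/tmp/tls-keys.log"  # must be set before the context is created
ctx = ssl.create_default_context()                 # picks up SSLKEYLOGFILE into keylog_filename
urllib.request.urlopen("https://example.com", context=ctx).read()
# /tmp/tls-keys.log now holds the session secrets; point Wireshark's
# TLS "(Pre)-Master-Secret log filename" preference at it to decrypt the capture.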

Clipper instead uses LD_PRELOAD to inject a library that hooks TLS library functions (e.g. OpenSSL's) with Frida and extracts the keys, effectively implementing universal SSLKEYLOGFILE support.

Lossless Video Compression Using Rational Bloom Filters

Observation: a bitstream with a low density of 1s (roughly p < 0.32) can be encoded more compactly by storing the positions of the ones. Those positions go into a set implemented as a Bloom filter, and the theoretically optimal Bloom filter parameters can be calculated to maximize compression. The Bloom filter bitmap plus an auxiliary data structure for recovering from false positives gives lossless compression. To decompress, iterate through every possible position and set the bit to one when the set-membership query (corrected for false positives) says that position holds a one.
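A minimal sketch of the idea in Python (my own illustration, not the author's code; the filter size and hash count below are arbitrary, not the theoretically optimal values):

import hashlib
def _cells(pos, m, k):
    # derive k Bloom-filter cell indices in [0, m) from a bit position
    return [int.from_bytes(hashlib.sha256(f"{i}:{pos}".encode()).digest()[:8], "big") % m
            for i in range(k)]
def compress(bits, m, k):
    bloom = bytearray(m)  # one byte per filter cell, for clarity
    for pos, b in enumerate(bits):
        if b:
            for c in _cells(pos, m, k):
                bloom[c] = 1
    # positions the filter wrongly reports as set are stored explicitly, keeping decoding exact
    false_pos = [pos for pos, b in enumerate(bits)
                 if not b and all(bloom[c] for c in _cells(pos, m, k))]
    return bloom, false_pos, len(bits)
def decompress(bloom, false_pos, n, m, k):
    fp = set(false_pos)
    return [1 if pos not in fp and all(bloom[c] for c in _cells(pos, m, k)) else 0
            for pos in range(n)]
bits = [1 if i in (3, 17, 42, 77) else 0 for i in range(100)]
assert decompress(*compress(bits, 48, 3), 48, 3) == bits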

To compress video, the author encodes the difference between consecutive video frames as a bitstream with a low density of ones, which then gets compressed with the Bloom filter.

Another innovation is the rational Bloom filter. The theoretically optimal number of hash functions k is usually not an integer, and typical implementations just round it. The author instead applies ⌊k⌋ hash functions deterministically and one additional hash function with probability k − ⌊k⌋, where that probabilistic choice is made deterministically from the item's value so that insertion and lookup agree.
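A sketch of that rule (the hash choice and names are mine, not the author's):

import hashlib, math
def num_hashes(pos, k):
    # always apply floor(k) hashes; apply one extra with probability k - floor(k),
    # decided deterministically from the position so encoder and decoder agree
    frac = k - math.floor(k)
    draw = int.from_bytes(hashlib.sha256(f"extra:{pos}".encode()).digest()[:8], "big") / 2**64
    return math.floor(k) + (1 if draw < frac else 0)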

Attention Wasn't All We Needed

Stephen Diehl's summary of the most influential advancements built on top of the attention mechanism. The article comes with detailed descriptions and from-scratch code demonstrating exactly how these techniques work.

Reservoir Sampling

Reservoir sampling is a technique for selecting a fair random sample when you don't know the size of the set you're sampling from.

One application is rate limiting: forwarding at most a fixed number of log entries per time interval to protect against bursts. A naive greedy approach (keep entries until the quota is used up, then drop the rest) biases the sample toward the start of the interval; we want an even sample of the whole interval's logs.

The trick is to keep a fixed array of K slots (K is the per-interval limit). The first K entries fill the slots; for the Nth entry with N > K, replace a slot with the new entry with probability K/N, choosing the evicted slot uniformly at random among the K slots. At the end of the time interval, flush the slots.
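A minimal sketch of that algorithm (this is classic Algorithm R; the fake log-entry stream and K = 5 are just placeholders):

import random
def reservoir_sample(stream, k):
    reservoir = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            reservoir.append(item)                 # first K entries fill the slots
        elif random.random() < k / n:              # keep the Nth entry with probability K/N
            reservoir[random.randrange(k)] = item  # evict a uniformly random slot
    return reservoir
print(reservoir_sample((f"log entry {i}" for i in range(1000)), 5))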

A faster way to copy SQLite databases between computers | Hacker News

TIL the author of SQLite wrote a tool called sqlite3_rsync that copies a database between machines. It works incrementally like rsync while preserving the integrity of the database, so both the source and the replica stay safe to use while the copy is in progress.