An intuitive analogy for memory barriers:
- Two threads running on two CPUs, each with its own L1 cache, are like two people collaboratively editing files on their own computers, both connected to a central file repo.
- On each person's computer is a working copy (L1 cache) of the files (memory locations) that correspond to the central repo (memory, or a shared cache like L2).
- Each person's computer randomly issues `pull` and `push` operations on the files (note that each file is pushed/pulled independently, unlike git). This captures the unpredictable nature of cache flushes and misses, where each cache line operates individually.
- When collaborating on a file, a person should manually `pull` before editing. But that's of no use if the other person doesn't `push` after their write. If both parties cooperate, there will be no inconsistency.
- The `pull` operation is equivalent to a `loadload` fence, captured by acquire semantics. The `push` operation is equivalent to a `storestore` fence, captured by release semantics.
- The `loadstore` fence ensures out-of-order execution doesn't happen around the barrier, so that reads always happen before the barrier and writes always after it. There is no simple version-control analog, but both `loadload` and `storestore` imply the `loadstore` fence.
- The `storeload` fence ensures the writes before the barrier always happen before the reads after the barrier.
In addition to that, some of my own notes:
- Relaxed semantics uses special CPU instructions (e.g. LOCK) to skip the non-shared caches (i.e. L1), which ensures all cores access the same data.
- Acquire-release semantics uses three operations together: `pull`, change, `push`.
- Seqcst semantics, on top of the acquire-release guarantee, ensures a total order of loads and stores. It could be implemented by the CPU waiting for the pipeline to drain and flushing the caches before issuing further instructions. (my speculation)
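A toy sketch of the analogy in code (this simulates the pull/push model above, not real memory-ordering primitives; the repo/worker structure is purely illustrative):

```python
import copy

repo = {"data": 0, "ready": False}           # central repo (shared memory / L2)

class Worker:
    """One collaborator: owns a working copy (L1 cache) of the repo."""
    def __init__(self):
        self.local = copy.deepcopy(repo)

    def pull(self, *files):                  # acquire side of the analogy (loadload)
        for f in files:
            self.local[f] = repo[f]

    def push(self, *files):                  # release side of the analogy (storestore)
        for f in files:
            repo[f] = self.local[f]

producer, consumer = Worker(), Worker()

# Producer: write the data, then publish the flag. Pushing "data" before
# "ready" plays the role of the storestore (release) fence.
producer.local["data"] = 42
producer.push("data")
producer.local["ready"] = True
producer.push("ready")

# Consumer: pull the flag first, and only then pull the data. Pulling "ready"
# before "data" plays the role of the loadload (acquire) fence.
consumer.pull("ready")
if consumer.local["ready"]:
    consumer.pull("data")
    assert consumer.local["data"] == 42      # never observes a stale value
```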
An interesting theory about why musical scales take the values they do.
- Dissonance between two pure tones comes from the inaudible beats in their superposition.
- The base frequency plus the overtones determine the note and the timbre.
- Overtones are not necessarily harmonics; it depends on the instrument.
- Adding up the dissonance curves of all overtone pairs of two notes, we get an overall dissonance curve, whose troughs have the lowest dissonance.
- Doing the same for three notes gives a 3-D dissonance surface.
- The insight is that we should consider not only the dissonance of the base notes but also that of all the overtones.
- It turns out that many scales (relative to the base) and chords (relative to the lowest note) fall in the troughs of this curve.
- Conclusion: the overtones of a culture's instruments influence the musical scales that culture uses.
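A minimal sketch of the overtone-summing idea, assuming a harmonic overtone series with decaying amplitudes and a Plomp-Levelt-style pairwise dissonance model (both assumptions are mine, not from the article):

```python
import numpy as np

def pair_dissonance(f1, f2, a1=1.0, a2=1.0):
    """Approximate dissonance of two pure tones (assumed Plomp-Levelt-style model)."""
    f_lo, f_hi = min(f1, f2), max(f1, f2)
    s = 0.24 / (0.021 * f_lo + 19)            # scales with the critical bandwidth
    x = s * (f_hi - f_lo)
    return a1 * a2 * (np.exp(-3.5 * x) - np.exp(-5.75 * x))

def total_dissonance(base, ratio, n_overtones=6):
    """Sum dissonance over every pair of partials of two notes at a given ratio."""
    partials = [(base * k, 1 / k) for k in range(1, n_overtones + 1)]
    partials += [(base * ratio * k, 1 / k) for k in range(1, n_overtones + 1)]
    total = 0.0
    for i in range(len(partials)):
        for j in range(i + 1, len(partials)):
            (fa, aa), (fb, ab) = partials[i], partials[j]
            total += pair_dissonance(fa, fb, aa, ab)
    return total

# Scan an octave of ratios; the troughs land near simple ratios (3/2, 4/3, ...).
ratios = np.linspace(1.0, 2.0, 500)
curve = [total_dissonance(440.0, r) for r in ratios]
```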
This article describes the basics of the USB protocol, enough to overcome my usual fear of dabbling into USB hacking. The author demonstrates how to use libusb (user-space) to interact with an unsupported USB RGB light.
A nice list of toy programming projects to try.
A nice cheatsheet for GNU Make.
Why would anybody bet on Jesus' return in 2025?
Polymarket pegs the prices of Yes and No shares using the following mechanism: $1 can be minted into 1 Yes share plus 1 No share, and conversely 1 Yes share plus 1 No share can be redeemed for $1. So if the prices of the Yes and No shares don't add up to $1, arbitrageurs will come and profit from the difference.
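For example (prices made up for illustration): if Yes trades at $0.03 and No at $0.99, the pair sells for $1.02, so an arbitrageur mints a pair for $1.00, sells both, and pockets $0.02; if the pair trades below $1.00, they buy both and redeem for $1.00. This keeps Yes + No ≈ $1.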
Yes share holders expect the No share holders to exit later, for example when a more interesting event comes up that they need money to bet on. To exit, the No holders either sell their No shares at a lower price, or buy Yes shares to pair with their No shares and redeem their dollars back. Either way the Yes price rises, and the Yes share holders can sell into that rise and profit.
In conclusion, the Yes hoarders are betting not on Jesus but on whether there will be interesting events that people need money to bet on before the end of 2025.
A video explainer of the Polymarket mechanism: https://www.youtube.com/watch?v=c6HWyACjE2A
Find vendor names from MAC addresses.
Observation: a bitstream with a low density of 1s (p < 0.32) can be encoded more efficiently by the positions of the ones. We store those positions in a set implemented with a Bloom filter, and we can calculate the theoretically optimal Bloom filter parameters to maximize compression. With the Bloom filter bitmap plus a data structure for recovering the information lost to false positives, we achieve lossless compression. To decompress, we iterate through all possible positions and set a position's bit to one if it is a member of the set.
To compress video, the author encodes the difference between video frames as a mostly-zero bitstream, which then gets compressed with the Bloom filter scheme.
Another innovation is the rational Bloom filter: the optimal number of hash functions is usually not an integer, and typical implementations just round it. The author instead applies ⌊k⌋ hash functions deterministically and, with probability k−⌊k⌋, one additional hash. The decision about the extra hash is made deterministically from the value itself, so insertion and lookup agree.
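A minimal sketch of the rational Bloom filter and the position-set encoding (my own reconstruction, not the author's code; the hashing scheme is an assumption):

```python
import hashlib
import math

class RationalBloomFilter:
    """Bloom filter that supports a non-integer number of hash functions k."""
    def __init__(self, m, k):
        self.m = m                       # bitmap size
        self.k_floor = math.floor(k)
        self.k_frac = k - self.k_floor   # probability of applying one extra hash
        self.bits = bytearray(m)

    def _hash(self, item, i):
        digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.m

    def _indices(self, item):
        idx = [self._hash(item, i) for i in range(self.k_floor)]
        # The extra hash is applied based on a deterministic function of the
        # item, so insert and query always make the same choice.
        if self._hash(item, "frac") / self.m < self.k_frac:
            idx.append(self._hash(item, self.k_floor))
        return idx

    def add(self, item):
        for i in self._indices(item):
            self.bits[i] = 1

    def __contains__(self, item):
        return all(self.bits[i] for i in self._indices(item))

# Encode a sparse bitstream by inserting the positions of its 1s.
bitstream = [0] * 1000
for pos in (3, 250, 998):
    bitstream[pos] = 1
bf = RationalBloomFilter(m=64, k=2.7)
for pos, bit in enumerate(bitstream):
    if bit:
        bf.add(pos)
# Decompression: query every position; false positives would be corrected by
# the auxiliary structure mentioned above (omitted here).
recovered = [1 if pos in bf else 0 for pos in range(1000)]
```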
Stephen Diehl's summary of the most influential advancements since the publication of the attention mechanism. The article comes with detailed descriptions and from-scratch code demonstrating how exactly these techniques work.
A very nice FAQ for Lojban.
Delimited continuation is a control flow mechanism involving two primitives, `prompt` and `control`. It resembles `call/cc` at a glance, but the scope is different.
For example, the following expression returns 7:
```scheme
(prompt
  (+ 1 (control k
         (let ([x (k 1)] [y (k 2)] [z (* x y)])
           (k z)))))
```
`k` is a continuation reified as a function. To understand the code above, read `k` as a black-box function: its input replaces the `(control ...)` form, the part of the body of `(prompt ...)` starting from the `(control ...)` gets evaluated, and then the output of `(prompt ...)` is returned as the output of `(k ...)`. I find visualizing `(prompt ...)` as a box and `(control k ...)` as a hole helpful for understanding.
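To trace the example: here `k` behaves like `(lambda (v) (+ 1 v))`, so `(k 1)` is 2, `(k 2)` is 3, `z` is 6, and the final `(k z)` yields 7, which becomes the value of the whole `(prompt ...)` expression.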
Reservoir sampling is a technique for selecting a fair random sample when you don't know the size of the set you're sampling from.
An algorithm like this is applicable to cases like rate limiting, where we send at most a fixed number of logs per time interval to protect against bursts. A naive greedy approach (e.g. keep the first K entries and drop the rest) is biased toward the start of the interval; we want to sample the logs evenly.
The trick is to use a fixed set of K slots of log entries (K corresponds to the per-bucket rate limit). Incoming entries first fill the slots (while N <= K); for each entry after that (N > K), we replace a slot with the new entry with probability K/N, choosing the evicted slot uniformly at random from the K slots. At the end of the time interval, flush the slots.
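A minimal sketch of the algorithm (illustrative, not the article's code):

```python
import random

def reservoir_sample(stream, k):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    reservoir = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            reservoir.append(item)                  # fill the K slots first
        elif random.random() < k / n:               # keep the new item with probability K/n
            reservoir[random.randrange(k)] = item   # evict a uniformly random slot
    return reservoir

# e.g. keep at most 5 log lines per interval, sampled evenly over the interval
logs = (f"log line {i}" for i in range(10_000))
print(reservoir_sample(logs, 5))
```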
Clipper is a network debugging tool that intercepts TLS traffic so it can be viewed in Chrome DevTools. What interests me the most is how it decrypts TLS traffic.
There are several ways to do that that I know of. The first is the `SSLKEYLOGFILE` environment variable; tools that respect it can dump the keys to the specified file, which can then be picked up by tools like Wireshark. The problem is that many tools don't respect the variable out of the box. The second is to MITM the traffic with a self-signed certificate. This method doesn't work with TLS key pinning and doesn't truly reflect the traffic, due to the proxy layer.
Clipper instead uses the trick of LD_PRELOADing a library that uses Frida to hook library functions (e.g. in OpenSSL) to extract the keys, implementing universal `SSLKEYLOGFILE` support.
TIL the author of SQLite wrote a tool called `sqlite3_rsync` that can copy a database across machines. It does so incrementally, like rsync, while preserving the integrity of the database, so both the source and the replica are safe to use while copying.
An encyclopedia of consistency models for concurrent systems.
The algorithm for de-pixelating a pixelated moving window with fixed content:
- extract the frames and find the absolute position of the window in each frame
- assume the color of each pixelated cell is the color of the center pixel of that cell (think of pixelation as a magnifying filter focusing on the center pixel of each cell)
- by tracking the locations of those center pixels and their colors across frames, we fill in more and more pixels with known colors
- for the remaining unfilled pixels, interpolate from the nearest pixels of known color (e.g. a Voronoi fill)
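A rough sketch of the core step in code (my reconstruction; the screen-aligned pixelation grid, cell size, and window-extent check are assumptions):

```python
def recover_content(frames, window_positions, cell=8):
    """
    frames: list of HxWx3 screenshots containing the pixelated window.
    window_positions: (wx, wy) top-left corner of the moving window in each frame.
    Each pixelated cell is assumed to show the color of its center pixel.
    """
    known = {}                                            # content coordinate -> color
    for frame, (wx, wy) in zip(frames, window_positions):
        height, width = len(frame), len(frame[0])
        for cy in range(0, height - cell + 1, cell):      # pixelation grid, screen-aligned
            for cx in range(0, width - cell + 1, cell):
                sy, sx = cy + cell // 2, cx + cell // 2   # screen coords of the cell center
                px, py = sx - wx, sy - wy                 # same point in the window's content coords
                if px >= 0 and py >= 0:                   # inside the window (extent check simplified)
                    known[(px, py)] = frame[sy][sx]
    # Remaining unknown pixels get filled by nearest-known-color (Voronoi) interpolation.
    return known
```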
Analyze the bytes in a SQL database.
A small hash function reconstructs an ultra-blurred image from a 20-bit integer, used for image previews. The integer is bit-decomposed into nine values: six brightness values (0-3) at the points of a 2x3 grid over the image, and three values encoding a background color (oklab). The blurred image is then rendered by drawing six radial gradients at those grid points, fading into transparency, over the solid background.
A non-interactive zero-knowledge proof scheme: the prover commits (shuffle + lock) the graph N times: `s[0..N]`. He then calculates `r = hash(s[0] || s[1] || .. || s[N])` and uses the random bits in `r` to generate `N` pseudorandom numbers `c[0..N]`. For each `i = 0..N`, he uses `c[i]` to select a fixed edge in `s[i]` and reveals the keys `k[i][0]` and `k[i][1]` for the two nodes of that edge. The final message consists of `s[0..N]` and `k[0..N][0..1]`. The verifier is able to use this information to replay the steps and verify the prover's claim.
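A minimal sketch of how the challenges can be derived from the commitments (the hash choice and encoding are my assumptions); since the prover cannot control `r` without redoing all the commitments, no interaction with the verifier is needed:

```python
import hashlib

def derive_challenges(commitments, num_edges):
    """Derive one pseudorandom edge index per round from hash(s[0] || ... || s[N])."""
    r = hashlib.sha256(b"".join(commitments)).digest()
    challenges = []
    for i in range(len(commitments)):
        # expand r into one number per round
        hi = hashlib.sha256(r + i.to_bytes(4, "big")).digest()
        challenges.append(int.from_bytes(hi, "big") % num_edges)
    return challenges
```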
Another thing I learned from the video is how to "lock" messages cryptographically and let the verifier check any one message without revealing the others. First, the prover calculates `h[i] = hash(m[i] || nonce[i])` and sends `h[0..N]` to the verifier. The verifier chooses any integer `t` between `0` and `N` and sends it to the prover. The prover then sends back `(m[t], nonce[t])`, and the verifier checks that `h[t]` was indeed calculated from `m[t]`. Here `h[t]` is the locked message `m[t]` and `nonce[t]` is the key.
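A tiny sketch of that commit-and-reveal primitive (hash choice and nonce size are my assumptions):

```python
import hashlib, os

def commit(message: bytes):
    nonce = os.urandom(16)                             # the "key" that opens the lock
    digest = hashlib.sha256(message + nonce).digest()  # the locked message
    return digest, nonce

def verify(digest: bytes, message: bytes, nonce: bytes) -> bool:
    return hashlib.sha256(message + nonce).digest() == digest

h, n = commit(b"node color: red")
assert verify(h, b"node color: red", n)                # opens only with the right message + nonce
```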
Various IP lists for download - useful for setting up firewall against attackers, malware, p2p, crawlers etc.