I have seen the Future, and it is Not Claude
Unparalleledly deep systems knowledge exclusively licensed to the Win Joe Software Foundation (WSF) under one or more contributor license agreements
Thursday, 23 April 2026
Downsampling from a Data Science Perspective
Scala, Scala, Everywhere
Apache Spark (and its roots in Scala)
Apache Spark is a foundational layer underlying many data platforms.
It is written primarily in Scala, with some Java. Read the source code here.
A good starting point is SparkSession.scala.
One of Spark's "selling points" is "Exploratory Data Analysis (EDA) on petabyte-scale data without having to resort to downsampling" (see detailed post on downsampling).
A petabyte (PB) holds 1000 terabytes (one thousand million million bytes).
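The arithmetic can be checked in a few lines. This is only a sketch using the decimal (SI) definitions; the constant names are made up for illustration. The binary pebibyte (PiB, 2**50 bytes) is a different, slightly larger unit.

```python
# Decimal (SI) units: each step up is a factor of 1000.
BYTES_PER_TB = 10**12   # one terabyte
TB_PER_PB = 1000        # terabytes in one petabyte

bytes_per_pb = TB_PER_PB * BYTES_PER_TB

# "One thousand million million" spelled out as a product.
thousand_million_million = 1000 * 10**6 * 10**6

print(bytes_per_pb == thousand_million_million)  # True
print(f"{bytes_per_pb:,} bytes")                 # 1,000,000,000,000,000 bytes
```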
The Apache Incubator
Wednesday, 22 April 2026
Qwen Series of Models
The Qwen series of models comes from Alibaba Cloud. The Qwen 3.5 models, released in early 2026, have set new records for sub-2B-parameter models. They are much smaller than gpt-oss.
Compile to WASM - The Emscripten Toolchain
WebAssembly Not Automatically Blocked by Browsers
WebAssembly is a type of code designed to run in modern web browsers. It runs alongside JavaScript through the WebAssembly JavaScript APIs, creating an option for performance-critical functionality.
Because WebAssembly increases the browser's attack surface, browsers confine WASM to the browser's sandbox and restrict its access to the system.
One risk is a sandbox escape. Adobe Flash was sandboxed after a series of exploits, and exploits still occurred after sandboxing.
WASM can be served without TLS, HSTS or any other transport-layer security mechanism, and modules delivered that way are susceptible to man-in-the-middle attacks.
Integrity checking is also not guaranteed, as WASM modules need not be signed by the author.
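Since the format carries no author signature, any check has to happen out of band. The sketch below shows one possible approach, assuming the author publishes a SHA-256 digest over a trusted channel. The function names and the "published" digest are hypothetical and not part of any WebAssembly tooling. The sample bytes are the smallest valid module: the `\0asm` magic number followed by version 1.

```python
import hashlib
import hmac

# Smallest valid WebAssembly module: magic "\0asm" plus version 1.
EMPTY_MODULE = b"\x00asm\x01\x00\x00\x00"


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def matches_published_digest(module_bytes: bytes, published_hex: str) -> bool:
    """Compare the module's digest with a digest obtained out of band."""
    return hmac.compare_digest(sha256_hex(module_bytes), published_hex.lower())


# Hypothetical: stands in for a digest the author published elsewhere.
published = sha256_hex(EMPTY_MODULE)

# Simulate tampering in transit by appending one byte.
tampered = EMPTY_MODULE + b"\x00"

print(matches_published_digest(EMPTY_MODULE, published))  # True
print(matches_published_digest(tampered, published))      # False
```

A check like this is only as trustworthy as the channel that delivered the digest. If the digest arrives over the same unprotected connection as the module, it adds nothing.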
Some security-focused browser configurations can block WASM.