Thursday, 30 April 2026

More on the Common Information Model

We have previously mentioned the CIM, or  Common Information Model, in the context of systems management standards. It is effectively an object-oriented schema for classifying objects pertaining to systems management. 

An example schema can be found here (note: there are multiple versions of the schema).

Notes from Microsoft Learn on this topic can be found in Microsoft's WMI SDK notes here.

From an organizational perspective, the most important aspect is the consistent adoption of a sufficiently descriptive data model, rather than the details of the data modelling itself.

Wednesday, 29 April 2026

The Mesh Network

A mesh network is a type of LAN topology where every node connects (in a flat layout) to as many other nodes as possible. Nodes can then work together to route data as efficiently as possible.

The LM Link Feature in LM Studio

LM Link is a way to connect devices on which LM Studio is installed, allowing you to load models on remote devices as if they were local. Chats remain local and the only thing stored on LM Studio's backend servers is your device list. In a way, it's model-connectivity-as-a-service using your own hardware.

LM Link is implemented on top of Tailscale VPN.

Monday, 27 April 2026

WinJoe, What is Delta Format?

Delta format (often called Delta Lake) is an open-source data storage layer originally developed by Databricks. 

It sits on top of Apache Parquet and enhances it with database‑like guarantees and metadata management. 

According to its creators, Databricks, Delta is designed specifically for large-scale data engineering where reliability, consistency, and performance are essential.

Big Data's New Vacation Home - The Lakehouse; Microsoft's Approach

The lakehouse concept combines the capabilities of data lakes (which have scalability qualities) and data warehouses (which have advanced query functionality).

Microsoft Fabric's resources on Data Engineering delve into the concept of the data lakehouse (and Microsoft's SaaS implementation, OneLake, billed as OneDrive for data), with notes on how the lakehouse makes use of Apache Spark.

Friday, 24 April 2026

Troubleshooting WSL2 Memory Hogging

WSL2 hogs memory and doesn't release it even when all consoles are closed. Run wsl --shutdown to free up the memory.

Why does TypeScript feel a bit C-Sharpy?

TypeScript was created in 2012 by Anders Hejlsberg, a Danish software engineer. He previously created C# around the year 2000. He is also known for Turbo Pascal and Delphi, both extraordinary products in their time. Deservedly, he is a Microsoft Technical Fellow (a list of whom appears here).

Types in TypeScript

The basic types are called primitives: 
  • boolean
  • number (which represents integers and floating points)
  • string  
There is also:
  •  bigint (ES2020+) to represent whole numbers larger than 2^53 - 1, and
  •  symbol to create unique identifiers
Starting with ES2015, symbol is a primitive type, whose values are created by calling the Symbol constructor.

Examples:

let sym1 = Symbol();
let sym2 = Symbol("keyname");

Symbols are immutable and unique, which can result in what may initially feel like strange behaviour, but on reflection makes sense.

let sym3 = Symbol("key");
let sym4 = Symbol("key");

sym3 === sym4; // strict equality - false, Symbols are unique even with the same description.
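
Returning to number and bigint from the list above, here is a minimal sketch of why bigint exists. The variable names are made up for illustration, and BigInt() calls are used instead of literals such as 1n so that the snippet does not depend on the compiler's target setting.

```typescript
// Primitive annotations; TypeScript would infer these types anyway.
const done: boolean = false;
const ratio: number = 0.5; // floating point
const count: number = 42; // integers are also `number`
const label: string = "node";

// Above 2^53 - 1, `number` can no longer tell neighbouring integers apart.
const maxSafe: number = Number.MAX_SAFE_INTEGER; // 2^53 - 1
const numberCollides: boolean = maxSafe + 1 === maxSafe + 2;

// bigint keeps whole numbers exact at any size.
const bigSafe: bigint = BigInt(maxSafe);
const bigintCollides: boolean = bigSafe + BigInt(1) === bigSafe + BigInt(2);

console.log(numberCollides); // true: precision is lost
console.log(bigintCollides); // false: the values stay distinct
```

Note that number and bigint cannot be mixed in arithmetic; one side has to be converted explicitly.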

Node Version Manager - Strongly Recommended

The Node version manager, nvm, is strongly recommended to manage your version of Node.js and npm. 

It also allows switching between various versions of Node (Node.js and npm) for testing purposes. 

As per official docs, nvm is designed to be installed per-user and invoked per-shell. It works on "any POSIX compliant shell" - including on Unix, macOS and WSL.

Once you install nvm (by wget'ing the installation shell script and piping it to bash) you can restart WSL and start using nvm.

Some nvm commands to know:

nvm install node   # install latest version

nvm install --lts     # install latest LTS version

nvm use node        # switches to latest version

nvm use <version>    # switch to a specific version

To see all Node versions, do nvm ls.   Node uses semantic versioning, following the pattern MAJOR.MINOR.PATCH.

nvm ls shows the version active in shell in blue, and installed versions in green. Yellow are versions referenced by aliases but not installed.
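
The MAJOR.MINOR.PATCH pattern mentioned above can be sketched in TypeScript as follows. The helper names parseSemVer and compareSemVer are hypothetical (they are not part of nvm or Node), and pre-release tags such as -rc.1 are deliberately not handled.

```typescript
// Hypothetical helpers for illustration; not part of nvm or Node.
type SemVer = { major: number; minor: number; patch: number };

// Parse "v20.11.1" or "20.11.1" into its three numeric parts.
function parseSemVer(version: string): SemVer {
  const match = /^v?(\d+)\.(\d+)\.(\d+)$/.exec(version.trim());
  if (match === null) {
    throw new Error(`Not a MAJOR.MINOR.PATCH version: ${version}`);
  }
  return {
    major: Number(match[1]),
    minor: Number(match[2]),
    patch: Number(match[3]),
  };
}

// Negative if a < b, zero if equal, positive if a > b.
function compareSemVer(a: string, b: string): number {
  const x = parseSemVer(a);
  const y = parseSemVer(b);
  return x.major - y.major || x.minor - y.minor || x.patch - y.patch;
}

const installed = ["v18.19.0", "v20.11.1", "v20.9.0"];
const sorted = [...installed].sort(compareSemVer);
console.log(sorted); // [ 'v18.19.0', 'v20.9.0', 'v20.11.1' ]
```

The parts are compared as numbers because a plain string sort would put v20.11.1 before v20.9.0.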

Installing a Transpiler

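Install the TypeScript compiler in WSL as follows. If Node was installed per-user through nvm, sudo is not needed (and sudo typically cannot see nvm's npm anyway); a system-wide Node installation may still require it.

npm install -g typescript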

The -g option to npm install is short for --global and installs the package into the global npm directory rather than a project's local node_modules folder.

Binaries are then also exposed on your PATH, so you can run tsc conveniently.

Note that you need an up-to-date installation of Node to run TypeScript. Otherwise tsc itself may fail to start, because it relies on modern syntax (e.g. the nullish coalescing operator, ??) that older Node versions cannot parse.

Dawn of the Transpiler

The term "transpiler" (referring to a source-to-source translation tool, or "translating compiler") gained popularity around 2013 with the proliferation of translators from TypeScript and other abstractions (CoffeeScript, Dart) into JavaScript.  

JavaScript at the time was becoming a "universal runtime".

Babel is a popular transpiler. tsc is the official TypeScript compiler and can be installed via npm.

Thursday, 23 April 2026

Downsampling from a Data Science Perspective

This note covers downsampling as used in data science and data processing. It excludes the DSP (digital signal processing) definition of downsampling, which is similar in spirit but differently defined.

Downsampling involves reducing the number of data points in a data set, typically in the over-represented class, so that classes become comparable in size (sometimes referred to as "balancing the data"). This helps machine learning models avoid bias towards a dominant class.

Various approaches to downsampling (e.g. random downsampling) are described in this IBM article.
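
As a rough illustration of random downsampling, the sketch below trims every class to the size of the smallest one. The Row shape, the downsample helper and the labels "ok" and "fraud" are assumptions made up for this example; they are not taken from the IBM article.

```typescript
// Hypothetical row shape and helper; a sketch of random downsampling only.
type Row = { label: string; value: number };

function downsample(rows: Row[], random: () => number = Math.random): Row[] {
  // Group rows by class label.
  const byLabel = new Map<string, Row[]>();
  for (const row of rows) {
    const group = byLabel.get(row.label) ?? [];
    group.push(row);
    byLabel.set(row.label, group);
  }

  // Every class is cut down to the size of the smallest class.
  const groups = Array.from(byLabel.values());
  const target = Math.min(...groups.map((g) => g.length));

  const kept: Row[] = [];
  for (const group of groups) {
    // Fisher-Yates shuffle on a copy, then keep the first `target` rows.
    const shuffled = [...group];
    for (let i = shuffled.length - 1; i > 0; i--) {
      const j = Math.floor(random() * (i + 1));
      [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
    }
    kept.push(...shuffled.slice(0, target));
  }
  return kept;
}

// Eight rows of the dominant class, two of the rare class.
const data: Row[] = [
  ...Array.from({ length: 8 }, (_, i) => ({ label: "ok", value: i })),
  ...Array.from({ length: 2 }, (_, i) => ({ label: "fraud", value: i })),
];
const balanced = downsample(data);
const okCount = balanced.filter((r) => r.label === "ok").length;
const fraudCount = balanced.filter((r) => r.label === "fraud").length;
console.log(okCount, fraudCount); // 2 2
```

The trade-off is visible in the numbers: six of the eight "ok" rows are thrown away to reach balance.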

Scala, Scala, Everywhere

For legacy observations on Scala, check out JVM stuff.  Here we build a fresh relationship with Scala.

Scala is a strongly statically typed language supporting OOP and functional programming.  Strong static typing means it avoids implicit type conversions when calling functions and other scenarios.

A good starting point for learning Scala is scala.dev here.

Apache Spark (and its roots in Scala)

Apache Spark is a foundational layer underlying many data platforms. 

It is written in both Java and Scala. Read the source code here.

A good starting point is SparkSession.scala.

One of Spark's "selling points" is "Exploratory Data Analysis (EDA) on petabyte-scale data without having to resort to downsampling" (see detailed post on downsampling). 

A petabyte (PB) holds 1000 terabytes, or 10^15 bytes (one thousand million million bytes).

The Apache Incubator

The Apache Incubator services projects seeking to enter the almighty Apache Software Foundation. Projects (called "podlings") are "ingested" and become subject to Apache-style governance and operation.

The name Apache was taken from the Apache people, a group of Native American tribes known for their warrior spirit and inexhaustible endurance. It was first used in the context of the cross-platform Apache Web Server (launched in 1995; despite being cross-platform, most instances run on Linux distributions).

Wednesday, 22 April 2026

Qwen Series of Models

The Qwen series of models comes from Alibaba Cloud. The Qwen 3.5 models, released in early 2026, have set new records for sub-2B models. They are much smaller than gpt-oss.

Compile to WASM - The Emscripten Toolchain

Emscripten is an open-source compiler toolchain that targets WebAssembly (Wasm). C/C++ (or any other LLVM-supported language) can be compiled and run on the Web, Node.js or other Wasm runtimes.

WebAssembly Not Automatically Blocked by Browsers

WebAssembly is a type of code designed to run in modern web browsers.  It is designed to run alongside JavaScript using WebAssembly JavaScript APIs - creating an option for performance critical functionality.
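
To make the JavaScript API side concrete, here is a minimal sketch that loads a tiny Wasm module from TypeScript. The byte array is a hand-assembled module exporting a single add function; it is an illustration only, and a real module would normally come from a toolchain and be fetched rather than embedded.

```typescript
// Hand-assembled Wasm binary exporting add(a: i32, b: i32): i32.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" -> function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // local.get 0, local.get 1, i32.add, end
]);

// Synchronous compile and instantiate; acceptable for a module this small.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule, {});

// Exports are untyped at this boundary, so the signature is asserted by hand.
const add = wasmInstance.exports.add as (a: number, b: number) => number;

console.log(add(2, 3)); // 5
```

For larger modules the asynchronous WebAssembly.instantiate call is usually the better fit, since compilation can take noticeable time.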

As WebAssembly increases the browser's attack surface, browsers contain WASM inside the browser's sandbox and restrict its system access.

One risk is code breaking out of the sandbox. Adobe Flash was sandboxed after a series of exploits, and exploits still occurred after sandboxing.

Transmission of WASM does not require TLS, HSTS or any other transport layer security mechanism, which makes it susceptible to man-in-the-middle attacks.

Integrity is also not guaranteed by default, as WASM modules need not be signed by the author.

Some security-focused browser configurations can block WASM.

An Insider Look at CPython: The "Compiler-Interpreter"

A run-of-the-mill Python programmer may not necessarily think about CPython on a day-to-day basis. 

But CPython is an interesting thing to think about.

It is the reference implementation of Python, written in C and Python. C was chosen, in theory, to make portability easier - it's also efficient (so there's no C++ or STL in there).

CPython is both a compiler and an interpreter. Python code is compiled (into bytecode) before being interpreted.  So you can think of it as a "compiler-interpreter".

One (potentially) painful feature of CPython is the Global Interpreter Lock (GIL). There is one GIL per interpreter process, which means that effectively only one thread can run at any one time (more precisely, only one thread can execute Python bytecode at any one time). While this simplifies the implementation, it becomes a bottleneck for CPU-intensive tasks.

Concurrency can be achieved by running multiple Python processes (and therefore multiple interpreter processes) with inter-process communication between them. The Python multiprocessing module aims to make this paradigm simpler to implement. It is, however, not available on mobile platforms or WebAssembly platforms.

Thursday, 16 April 2026

UTM is Urchin Tracking Module

UTM is something you may come across first in URLs. 

UTM refers to Urchin Tracking Module, named after Urchin, the firm Google acquired in 2005 to form the basis for Google Analytics.

  • utm_source is a tracking parameter in a URL that denotes where traffic is coming from
  • utm_source=google indicates traffic came from Google
  • utm_source=email indicates traffic came from an email
Google Analytics alternatives such as Plausible have been built which follow the UTM convention.
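
Reading these parameters off a link takes only a few lines. The sketch below uses the standard URL class; the utmParams helper and the example.com link are made up for illustration.

```typescript
// Collect every utm_* query parameter from a link (hypothetical helper).
function utmParams(link: string): Record<string, string> {
  const found: Record<string, string> = {};
  new URL(link).searchParams.forEach((value, key) => {
    if (key.startsWith("utm_")) {
      found[key] = value;
    }
  });
  return found;
}

const tags = utmParams(
  "https://example.com/pricing?utm_source=email&utm_medium=newsletter&page=2"
);
console.log(tags.utm_source); // email
console.log(Object.keys(tags)); // [ 'utm_source', 'utm_medium' ]
```

Non-UTM parameters such as page are ignored, which is the behaviour an analytics tool would want when attributing traffic.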

Saturday, 4 April 2026

TypeScript in CodePen

Can you do TypeScript in CodePen?  

The question is valid as CodePen reveals three containers, one for HTML, one for CSS and one for JavaScript, in its default interface.

To enable TypeScript input, go to Settings, select JavaScript preprocessor and choose TypeScript. Other available preprocessors are LiveScript, CoffeeScript (billed as a "simple and elegant way" to write JavaScript) and Babel.

Note that CodePen will run TypeScript without type-checking errors blocking execution.

Friday, 3 April 2026

Bun - The JavaScript Runtime used by All (Cool Cats)

Bun is a fast JavaScript runtime. Its website is bun.sh (where the suffix sh denotes a St Helena domain). Bun is built from scratch to "serve the modern JavaScript ecosystem".

A major selling point of Bun is that it starts fast and runs fast. It extends JavaScriptCore, the performance-minded JS engine built for Safari. Fast start times lead to fast apps like the Claude Code CLI.

Bun also boasts "cohesive DX" (developer experience) with a package manager, test runner and bundler all included.

It has been designed as a drop-in replacement for Node.js. Thousands of Node.js and Web APIs, such as fs, path and Buffer, have been implemented in Bun.

Bun's ambition is to run most of the world's server-side JavaScript.

CodePen

CodePen is a web environment to experiment with front-end code.

Thursday, 2 April 2026

Inside the Claude Code CLI

On 31 March 2026, the src directory of Claude Code was leaked. Claude Code is Anthropic's CLI tool that lets you interact with Claude for software engineering tasks from the command line - edit files, search codebases, manage git workflows and more. The leak revealed TypeScript code with a UI written in React and Ink (React for interactive command-line applications). It uses the Bun runtime - a fast JavaScript, TypeScript and JSX toolkit.