Wednesday, 18 March 2026

PowerShell Inspired Installations using iwr

iwr is the built-in alias for Invoke-WebRequest, PowerShell's cmdlet for making HTTP requests. In Windows PowerShell 5.1 it is also aliased as wget and curl, though those two aliases were removed in PowerShell 6+ because they shadowed the real tools of the same names.

npm and pnpm - the differences

npm, the Node package manager, can be incredibly disk-inefficient. pnpm was created to be a "performant npm" (sometimes also glossed as "painless npm").

The difference lies in how each tool stores packages.

npm duplicates node_modules per project, resulting in a huge disk footprint, whereas pnpm keeps each package version once in a global content-addressable store and links it into every project's node_modules, which can yield 70-90% space savings.

node_modules is the directory in a Node.js project that stores third-party libraries and dependencies.

Tuesday, 17 March 2026

TypeScript for Java and C# Programmers

There is a good tutorial here.

An important point to note is that while TypeScript adds static typing to JavaScript, the types are erased at compile time: TypeScript compiles to plain JavaScript and runs on the same runtime.

Recall that with static typing, the type of every variable and expression is checked before the program runs.  This enables errors to be caught at compile-time rather than run-time (in dynamic typing, by contrast, types are enforced only when code executes).
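A minimal sketch of this (the function and values are invented for illustration):

```typescript
// Static typing in action: the annotations let the compiler check every
// call before the program runs.
function area(width: number, height: number): number {
  return width * height;
}

const ok = area(3, 4); // checks out: both arguments are numbers

// area("3", 4);
// ^ rejected at compile time: string is not assignable to number.
// In plain JavaScript the same mistake would only surface (if at all) at run time.
```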

TypeScript is not a "mandatory" OOP language in the way Java or C# are (where the class is the basic unit of code organization and all data and behaviour lives in a class). In JavaScript, and by extension TypeScript, this constraint is absent: functions can live anywhere. Avoiding deep OOP hierarchies where possible tends to be the preferred programming model.

In the same spirit of not mandating classes, static classes are unnecessary in JavaScript, and singletons are also generally avoided: module-level functions and constants serve both roles.
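A small sketch of this style (the names are invented for the example): plain functions and a module-level constant, where Java or C# might reach for a static utility class or a singleton.

```typescript
// No class required: a type alias, a free function, and module-level state.
type Point = { x: number; y: number };

function distance(a: Point, b: Point): number {
  return Math.hypot(a.x - b.x, a.y - b.y);
}

// Where Java might define a singleton or a static field,
// a module-level constant does the job.
const ORIGIN: Point = { x: 0, y: 0 };

const d = distance(ORIGIN, { x: 3, y: 4 }); // 5
```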

Monday, 16 March 2026

LoRA in Real Workflows

LoRA, or low-rank adaptation, is one of many fine-tuning techniques for LLMs. The idea is to freeze the pre-trained weights and inject small, trainable low-rank matrices into selected layers, so that fine-tuning updates only a tiny fraction of the parameters.

Recall that the rank of a matrix A is the dimension of the vector space spanned by its columns. This in turn corresponds to the number of linearly independent columns of A.
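In symbols, following the standard LoRA formulation, the fine-tuned update to a frozen pre-trained weight matrix is constrained to a low-rank product:

```latex
% Frozen weight W, trainable low-rank factors B and A:
%   W \in \mathbb{R}^{d \times k}, \quad
%   B \in \mathbb{R}^{d \times r}, \quad
%   A \in \mathbb{R}^{r \times k}, \quad r \ll \min(d, k)
W' = W + \Delta W = W + BA
% rank(BA) \le r, and the trainable parameter count
% falls from dk to r(d + k).
```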

Books and Resources on AI Engineering

Apart from staying up to date through websites, there are a number of good books on AI Engineering. Here is a recommended reading list.

AI Engineering, Chip Huyen (2025, O'Reilly) - a really good book on building systems on top of LLMs. Chip's GitHub is here.

Hands-On Large Language Models, by Jay Alammar and Maarten Grootendorst (O'Reilly) - uses Python to convey how LLMs operate under the hood. It covers similar ground to AI Engineering and is definitely worth reading; it also has quite a few ready-made text-processing examples which are quite interesting.

Mathematics for Machine Learning, by Deisenroth et al. - not as directly connected to AI Engineering, but good at explaining some of the underlying maths of ML intuitively (if in somewhat long-winded fashion, at least from an engineering perspective).

OpenAI's Open Source Tokeniser

OpenAI has created a Python package called tiktoken which is a BPE tokeniser.

Tokenisation is needed by chat interfaces (and by other applications such as compilers and interpreters) to break text into tokens.

BPE stands for byte-pair encoding. It was described in 1994 by Philip Gage, and a modified version is used in LLMs. The original algorithm is a clever compression technique: it repeatedly replaces the most frequently occurring pair of bytes with a new byte not present in the original data, and uses a lookup table to recreate the original text. A modification extends this technique to tokenisation.
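A toy sketch of Gage's original scheme (illustrative only - this is not tiktoken's actual implementation, which works on UTF-8 bytes with a learned merge ranking):

```typescript
// Byte-pair encoding, compression flavour: repeatedly replace the most
// frequent adjacent pair of symbols with a fresh symbol, recording each
// merge in a lookup table so the original can be recreated.
function bpeCompress(
  text: string,
  maxMerges: number
): [string[], Map<string, [string, string]>] {
  let symbols = [...text]; // start from individual characters
  const table = new Map<string, [string, string]>();
  for (let step = 0; step < maxMerges; step++) {
    // Count adjacent pairs.
    const counts = new Map<string, number>();
    for (let i = 0; i < symbols.length - 1; i++) {
      const pair = symbols[i] + "\u0000" + symbols[i + 1];
      counts.set(pair, (counts.get(pair) ?? 0) + 1);
    }
    // Find the most frequent pair; stop if none repeats.
    let best: string | null = null;
    let bestCount = 1;
    for (const [pair, c] of counts) {
      if (c > bestCount) { best = pair; bestCount = c; }
    }
    if (best === null) break;
    const [a, b] = best.split("\u0000");
    const fresh = String.fromCharCode(0xe000 + step); // symbol not in the data
    table.set(fresh, [a, b]);
    // Replace occurrences left to right, without overlap.
    const out: string[] = [];
    for (let i = 0; i < symbols.length; i++) {
      if (i < symbols.length - 1 && symbols[i] === a && symbols[i + 1] === b) {
        out.push(fresh);
        i++;
      } else {
        out.push(symbols[i]);
      }
    }
    symbols = out;
  }
  return [symbols, table];
}

// Decompression expands each symbol through the lookup table recursively.
function bpeExpand(symbols: string[], table: Map<string, [string, string]>): string {
  const expand = (s: string): string => {
    const pair = table.get(s);
    return pair ? expand(pair[0]) + expand(pair[1]) : s;
  };
  return symbols.map(expand).join("");
}

const original = "aaabdaaabac"; // Gage's classic example string
const [compressed, merges] = bpeCompress(original, 10);
```

The tokenisation variant keeps the merged symbols themselves as the vocabulary, rather than using the table to decompress.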

Sunday, 1 March 2026

Lambda Calculus and System F

The lambda calculus is a theory that treats functions as formulas or expressions. Arithmetic is another example of a language of expressions: in arithmetic you have variables (x, y, z, ...), numbers (1, 2, 3, ...) and operators (+, -, ...). x + y then denotes the output of applying the addition operator to x and y, and this extends to more complicated expressions.

Lambda calculus extends this idea to functions. If we define a function f mapping x to x squared and consider A = f(10), then in the lambda calculus we simply write A = (lambda x. x^2)(10). The expression (lambda x. x^2) stands for the function that maps x to x squared, rather than the statement that x is mapped to x squared.

One advantage of the lambda calculus is that it lets us easily express higher-order functions, i.e. functions with functions as inputs and/or outputs. An example is the map from f to f composed with itself, which takes the function f and applies it to its own output. In lambda notation, f composed with itself is (lambda x. f(f(x))), and the operation that maps f to f composed with itself is (lambda f. lambda x. f(f(x))). You can see this is easy to extend to triple composition, and so on.
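Conveniently, these lambda expressions translate almost verbatim into TypeScript arrow functions (the example functions are invented for illustration):

```typescript
// (lambda f. lambda x. f(f(x))): the higher-order function that maps f
// to f composed with itself.
const twice = <T>(f: (x: T) => T) => (x: T): T => f(f(x));

const square = (x: number) => x * x;
const fourthPower = twice(square); // (x^2)^2 = x^4

// Triple composition is just as easy: (lambda f. lambda x. f(f(f(x)))).
const thrice = <T>(f: (x: T) => T) => (x: T): T => f(f(f(x)));
```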

Technically speaking, the untyped lambda calculus is Turing-complete: it is a universal model of computation that can simulate any Turing machine.

Now, lambda calculus can be typed or untyped; typed is more restrictive - we say it is weaker than the untyped lambda calculus. In the untyped lambda calculus we are flexible about domains and codomains. Among typed calculi, there is the simply-typed lambda calculus, where we specify a concrete type for every expression, and polymorphically-typed calculi, where types may contain type variables - for example, the identity function has type X -> X for every type X, without fixing which X.

System F is a form of polymorphic lambda calculus.

System F formalizes parametric polymorphism in languages. In so doing, it forms a theoretical basis for languages like ML and Haskell. 
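TypeScript's generics give a flavour of this (they are not full System F, but the polymorphic identity function looks much the same):

```typescript
// The System F polymorphic identity, of type "for all X, X -> X",
// written as a generic TypeScript arrow function.
const identity = <X>(x: X): X => x;

const n = identity(42);      // X instantiated at number
const s = identity("hello"); // X instantiated at string
```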

System F was discovered independently by the logician Jean-Yves Girard (1972), working in proof theory, and the computer scientist John C. Reynolds, who held positions at Edinburgh University, Imperial College and Carnegie Mellon.

These ideas stemmed from investigations in the 1930s into what it means for a function to be "computable" - in other words, to have results derivable (in principle) using pencil and paper alone.