- Terraform - multi-cloud, open-source Infrastructure-as-Code tool that works across AWS, Azure, GCP and more
- ARM templates - Azure's native JSON-based IaC format, verbose but powerful. Many companies still have ARM templates in their arsenal, even though its time has come
- Bicep - domain-specific language (DSL) for Azure (where said domain is IaC, or more broadly "declarative deployment of Azure resources") that simplifies ARM templates with cleaner syntax. A good one if you are not hybrid-clouding
Programming is Not Rocket Science, Don't let AI Write Your Code, Fight Back, Learn from ODML
Monday, 1 December 2025
IaC Zoology
Saturday, 29 November 2025
Boosting versus Bagging
- Libraries for Boosting: catboost, XGBoost (eXtreme Gradient Boosting)
- Libraries for Bagging: scikit-learn (BaggingClassifier, BaggingRegressor), imbalanced-learn (scikit-learn extension for imbalanced data), ML-Ensemble
Friday, 28 November 2025
ufunc in numpy - understanding universal functions
statsmodels in Python
statsmodels is a Python package that complements scipy for statistical computation.
The stable version is found here.
statsmodels takes ideas from other libraries and ecosystems; specifically, it uses R-style formulas and pandas DataFrames.
Chances are you are using the library with other libraries too, like numpy.
It can be installed via Conda or pip. Examples:
python -m pip install statsmodels
Thursday, 27 November 2025
Validating DataFrames in pandas
In Advance of Node.js Learning
Learn the rudiments of JavaScript, including asynchronous JavaScript.
Connoisseur's Guide to JavaScript Engines: V8 Rules
Node.js uses the V8 JavaScript engine which powers Google Chrome (and is open sourced by Google).
Other browsers use their own engine, for example Firefox uses SpiderMonkey and Safari uses JavaScriptCore (aka Nitro). Edge was based on Chakra (a Microsoft project that was open-sourced) before being rebuilt with V8 and Chromium.
What JavaScript is Not Allowed to Do in the Browser
JavaScript in the browser is not normally allowed to do too much for security reasons.
Stuff it cannot do includes anything OS related - specifically:
- Cannot read/write arbitrary files
- Cannot access hardware directly
- Cannot control processes
Stuff it can do includes:
- DOM manipulation (HTML/CSS)
- Local/session storage
- IndexedDB (database built into the browser)
- Cookies (with same origin)
- Clipboard (with user permission)
- Geolocation, camera, microphone (with user consent)
Node.js relationship with Electron
Node.js is needed to "scaffold" an Electron project - the Node package manager (npm) is used to download Electron packages, install dependencies and generate starter files.
Electron itself comes bundled with its own Node.js runtime.
This serves as the "back end" of your Electron app.
It manages windows, menus, system events and native OS integration. You can use Node modules directly in this environment e.g. fs, path and http.
The "renderer" process is Chromium.
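In older versions of Electron the renderer could also access Node.js APIs by default; since Electron 5 this is disabled by default for security reasons and has to be switched on explicitly (the nodeIntegration setting). With it enabled you effectively have a "contained" web app with OS access.
The main process (Node.js runtime) communicates with the renderer via IPC.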
This combined architecture of Node.js and Chromium enables applications to be written that run on Windows, macOS and Linux without experience of native UI development.
The Same Origin Policy (SOP) on Modern Web Browsers
The Same Origin Policy (SOP) is a browser-enforced security rule that prevents scripts from one "origin" (PDP -> protocol + domain + port) from accessing resources from another origin.
The SOP prevents cookies, DOM and local storage from being read by malicious cross-site scripts.
The SOP does not just apply to web browsers. For example, Electron apps (desktop apps built with web tech) enforce SOP because they embed Chromium.
The Same Origin Policy is an "isolation model" designed to ensure "secure workflow".
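The "PDP" idea above can be sketched in a few lines of Python. This is only an illustration of how an origin comparison works, not how any browser implements it; the helper names origin and same_origin and the example URLs are made up for the sketch.

```python
from urllib.parse import urlsplit

# Default ports, so "https://example.com" and "https://example.com:443" compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return the (protocol, domain, port) triple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def same_origin(url_a: str, url_b: str) -> bool:
    """True when both URLs share protocol, domain and port."""
    return origin(url_a) == origin(url_b)


print(same_origin("https://example.com/a", "https://example.com:443/b"))  # True
print(same_origin("https://example.com", "http://example.com"))           # False (protocol differs)
print(same_origin("https://example.com", "https://api.example.com"))      # False (domain differs)
```

Note that the path plays no part in the comparison: only protocol, domain and port matter.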
Basics of Selenium
Wednesday, 26 November 2025
git remote add
Tuesday, 25 November 2025
pandas and DataFrames
pandas provides data structures and data analysis tools for Python.
The basic data structures in pandas are:
- Series: one-dimensional (labelled) array holding data of any type e.g. integers, strings
- DataFrame: a two-dimensional data structure, holding a 2d-array or table with rows and columns
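A minimal sketch of the two structures, assuming pandas is installed. The labels and values are made up purely for illustration.

```python
import pandas as pd

# Series: a one-dimensional labelled array (the index holds the labels).
temperatures = pd.Series([21.5, 19.0, 23.2], index=["mon", "tue", "wed"], name="temp_c")

# DataFrame: a two-dimensional table; each column is itself a Series.
df = pd.DataFrame(
    {
        "name": ["a", "b", "c"],
        "score": [1, 2, 3],
    }
)

print(temperatures["tue"])          # label-based lookup on a Series
print(df.shape)                     # (rows, columns)
print(type(df["score"]).__name__)   # a column pulled out of a DataFrame is a Series
```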
Monday, 24 November 2025
Microsoft Launch Fara-7B: A CUA (Computer Use Agent) in SLM Form
And here we have it. Ready for action on Hugging Face, Sir.
Sandboxing and monitoring are recommended. The agent itself is a wrapper around Playwright.
Sunday, 23 November 2025
Hacking Transformers with Hugging Face
The Runtime Formerly Known as TensorFlow Lite
Mastering BERT and DistilBERT
It is worthwhile to study BERT as DistilBERT has the same general architecture as BERT.
IBM's Guide to Small Language Models
sbs_diasymreader.dll
- sbs - side by side (Recall - allows multiple versions of a DLL to sit side-by-side without conflicts)
- dia - Debug Interface Access, reference to SDK used to read debugging symbols (PDB files)
Friday, 21 November 2025
Programming Realities - The Awkward Error Message
The awkward error message can stop you in your tracks. Keep pushing on. Discover. Remediate error. Every error is a massively valuable learning opportunity.
Thursday, 20 November 2025
The Concept of C# Scripting (.csx files)
Compiling C# in VS Code
For this you need the C# Dev Kit extension. It's Roslyn-powered.
The code name Roslyn was first written publicly by engineer Eric Lippert (the code was originally hosted on Microsoft's CodePlex before being moved to GitHub).
Dev Containers Extension in VS Code
What it is, What it does
The Dev Containers ("DC") extension is needed by Semantic Kernel in VS Code. It's worth expanding on its purpose here.
DC allows you to use a Docker container as a full-featured development environment (this is independent of how you deploy the thing).
More details here.
Dev Containers Dependencies
It requires Docker Desktop to be installed, which interacts with WSL2. If you don't have it, don't worry, however. VS Code will prompt you automatically to install it. After installation, you will see a status bar labelled "Setting up Dev Containers" followed by "Connecting to Dev Container (show log)".
Dev Container Configuration
This is located in semantic-kernel\.devcontainer\devcontainer.json. This is similar to launch.json for debugging configurations. More info here.
Git on Windows
Git on Windows is good.
However there are a few options to select before you get this going.
- add git to path (can use from cmd.exe and Powershell etc.)
- use bundled OpenSSH (uses ssh.exe that comes with git) - alternative is to use an ssh.exe you install and add to your path
- which SSL/TLS library should Git use for HTTPS connections? OpenSSL library or native Windows Secure Channel (choose the latter). Here server certificates will be validated by the Windows Certificate Stores. This also allows you to use your company's internal Root CA certificates distributed e.g. via Active Directory Domain Services
- Git Bash to use MinTTY for terminal emulation (better scrollback than cmd.exe)
- Use Git Credential Manager or use none
Do You Understand Fully How This Works
I think you need to. Attribution: One software engineer to another.
Mastering git clone
The Unstable Book for Rust
Rust has The Unstable Book to cover unstable features of Rust. When using an unstable feature of Rust, you need to use a special flag or rather a special attribute: #![feature(...)].
Unstable features in Rust refer to specific capabilities (language or library) as yet unstabilized for general use. You can access these on the nightly compiler (not in stable or beta channels).
Unstable features may be experimental, incomplete or subject to change.
They are in the language as a means of balancing innovation with stability. Developers get access to new features but basically on a trial basis. While the feature is classed as unstable, Rust team can refine the design, fix edge cases or abandon features if problematic.
cargo new hello_cargo
For details on how to use cargo new, type cargo new --help at the command line.
cargo new hello_cargo
creates a new directory for your project, containing a .git subdirectory, a src directory for your source code, and a Cargo.toml file.
TOML (Tom's Obvious Minimal Language) is Cargo's configuration format. (TOML brands itself as "A Config File Format for Humans" and is nice, simple and neat).
It has a [package] section which configures package settings, and a [dependencies] section for any of the crates required for the package to run (crates are Rust packages).
cargo init --help
for help integrating any Rust code developed outside of Cargo.
Elliptic Curves Assert Presence on Linux
source command in Linux shell
The source command in the Linux shell sources (loads) a script into the current shell. Its short form is a single dot (.).
emacsx - An Emacs Launcher for .bashrc
Here is a cool emacs launcher for your .bashrc file if you like reverse video.
# Launch Emacs in full screen, optionally with a file
emacsx() {
    if [ "$#" -eq 0 ]; then
        command emacs -rv --fullscreen
    else
        command emacs -rv --fullscreen "$@"
    fi
}
Cargo for Rustaceans
Wednesday, 19 November 2025
What is the "tensor" in TensorFlow?
Deeper Look at Sequential Model Building in TensorFlow
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])
- The first argument to Dense is the positive integer units, representing the dimensionality of the output space
- The second argument is the activation function to use (if this is missing, no activation is applied which is actually linear activation a(x) = x)
- Essentially these functions work with neurons and transform neural computations into output signals
- ReLU (rectified linear unit) is one of the most widely used activation functions in neural networks (f(x) = max(0,x)).
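To make the units and activation arguments concrete, here is a rough NumPy sketch of what a fully connected layer computes. It is not how Keras implements Dense; the function names relu and dense, the random weights and the 0.01 scaling are assumptions chosen for illustration.

```python
import numpy as np


def relu(x):
    """Rectified linear unit: f(x) = max(0, x), applied element-wise."""
    return np.maximum(0.0, x)


def dense(inputs, weights, bias, activation=None):
    """Forward pass of a fully connected layer: activation(inputs @ weights + bias)."""
    z = inputs @ weights + bias
    return z if activation is None else activation(z)  # None means linear: a(x) = x


rng = np.random.default_rng(0)
x = rng.normal(size=(1, 784))            # one flattened 28x28 image
w = rng.normal(size=(784, 128)) * 0.01   # units=128, so the output has 128 dimensions
b = np.zeros(128)

out = dense(x, w, b, activation=relu)
print(out.shape)                 # (1, 128)
print(bool((out >= 0).all()))    # True: ReLU never outputs negatives
```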
Train your First Neural Network on the MNIST dataset
- Use Keras API as a "portal" into TensorFlow library to build the neural network
- Use "Sequential" model - allows you to add "layers" sequentially
- "Feed" the model the training data (creating the model takes a bit of study/effort)
- Model gives back a vector of "logits" or "log-odds" scores, one per class
- Run softmax to convert these scores to probabilities
- Compile the model - with an optimizer and a loss function, and 'accuracy' as the metric to report
- model.fit
- model.evaluate to see how the model performed (was it a good fit to the data)
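The softmax step in the list above can be sketched in plain NumPy. This is a stand-alone illustration with made-up logits, not the TensorFlow implementation.

```python
import numpy as np


def softmax(logits):
    """Convert a vector of logits into probabilities that sum to 1."""
    shifted = logits - np.max(logits)   # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()


logits = np.array([2.0, 1.0, 0.1])      # illustrative scores for three classes
probs = softmax(logits)

print(probs.sum())             # sums to 1 (up to floating point rounding)
print(int(np.argmax(probs)))   # 0: the largest logit gets the largest probability
```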
I had to bring pip into my Ubuntu Linux VM
Even with Python installed a whole host of packages are needed for pip. These include (non-exhaustively):
- build-essential
- bzip2
- cpp
- cpp-11
- fakeroot
- ...
- zlib1g-dev
Just to mention a few.
build-essential is a Debian-specific package, pulling in a bunch of useful build tools needed to build software from source and create Debian packages.
Tuesday, 18 November 2025
Rustup
Microsoft's Big Bet on Rust
Will two git commits ever have the same id?
This is highly unlikely due to the design of the commit algorithm. Digging into this, the commit id in a SHA-1 repository is a 40-character hexadecimal number produced by a cryptographic hash function, giving 2^(4*40) = 2^160 possible hashes (about 1.46 x 10^48). The chance of two given commits sharing an id is therefore roughly 1 in 1.46 x 10^48. Newer versions of Git can also use SHA-256, which gives 64-character ids.
Metadata that's fed into the cryptographic hash function includes a snapshot of the project tree (folders, files and contents), the id of the parent commit(s), the commit message, author and committer information, and timestamps (time elapsed in seconds since Jan 1, 1970).
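A small sketch of the hashing idea, using only Python's standard library. Git hashes every object as a short header followed by the content; the sketch shows this for a blob (file contents), and commits are hashed the same way with the kind set to "commit". The function name git_object_id is made up for the example.

```python
import hashlib


def git_object_id(kind: str, content: bytes) -> str:
    """SHA-1 id of a Git object: hash of '<kind> <size>' + NUL byte, followed by the content."""
    header = f"{kind} {len(content)}\0".encode("ascii")
    return hashlib.sha1(header + content).hexdigest()


empty_blob = git_object_id("blob", b"")
print(empty_blob)        # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (Git's well-known empty blob id)
print(len(empty_blob))   # 40 hex characters = 160 bits

print(f"{2**160:.3e}")   # 1.462e+48 possible SHA-1 values
```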
TCPL Still Rules the Linux Roost
The programming language C (born in the 1970s, created by Dennis Ritchie as the successor to B, which was created by Ken Thompson) still rules the roost as far as the Linux kernel is concerned, whereby of 37 million lines of code, we have just over 35m LOCs written in C as per analysis from OpenHub.
Linux is an open source, Unix-like operating system. Unix itself was written in C, and prior to that in assembler. The introduction of the programming language C made the code portable to different hardware.
Ken Thompson and Dennis Ritchie and their colleagues at Bell Labs (AT&T) were the co-creators of the Unix Operating System, created in 1969. Douglas McIlroy introduced the Unix philosophy of small, composable tools. Brian Kernighan helped to popularise UNIX and C through co-authoring The C Programming Language with Dennis Ritchie.
git add and git commit
The Mechanics of Commits
You can add a new file to your repo by doing:
git status (this will show untracked files and any changes not yet committed)
git add <mynewfile>
git commit -m "a little comment, if you please" <mynewfile>
Allowing wsl.localhost in list of allowed hosts
When navigating to a directory in WSL from VS Code you may get the message:
The host 'wsl.localhost' was not found in the list of allowed hosts. Do you want to allow it anyway?
You will get the opportunity to hit Allow (because you trust the host, it is after all, your WSL installation) together with the option to flag: "Permanently allow host 'wsl.localhost'".
Accessing Linux Directories in WSL From Explorer
Directories in WSL can be accessed by navigating to your distribution and filesystem after opening \\wsl$ in File Explorer.
Telling git who you are with git config
To get your commits attributed correctly you need to let git know who you are. This means sharing the name and email address you want to use for git. Here is the way:
git config --global user.name "Your Name"
git config --global user.email "youremail@provider.com"
Creating a git repository: git init
What git init does
Run git init in the folder where your code will be based (this assumes you have already created an appropriately named folder and done cd into it).
This creates a .git subdirectory.
What does the .git subdirectory hold?
.git is filled with configuration files and a subfolder for snapshots to be stored. All commits are stored in .git as well.
Name of initial branch
This may change in successive versions of Git, but as of 2.34, the name of the initial branch created by git init is 'master' (other commonly used names include 'main' and 'trunk' and 'development').
Re-running git init
Re-running git init will not do anything. It will simply output "Reinitializing existing Git repository" in /home/YOURLOGIN/YOURPROJECT/.git/.
Re-running git init in the .git subdirectory
Renaming a branch in git
The git branch command can be used to rename a branch, via its -m option (git branch -m <oldname> <newname>).
Linux VM Skills - The ls command
ls -a and ls -A will both list all files including hidden files.
ls -A (capital A) omits the . and .. entries (i.e. the current and parent directories) from the listing.
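The difference can be mimicked from Python, whose os.listdir never returns the . and .. entries. This is a rough analogy rather than a reimplementation of ls; the file names are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Create one hidden file (leading dot) and one ordinary file.
    for name in (".hidden", "visible.txt"):
        open(os.path.join(folder, name), "w").close()

    almost_all = sorted(os.listdir(folder))                            # like ls -A
    everything = [".", ".."] + almost_all                              # like ls -a
    visible_only = [n for n in almost_all if not n.startswith(".")]    # like plain ls

print(almost_all)     # ['.hidden', 'visible.txt']
print(everything)     # ['.', '..', '.hidden', 'visible.txt']
print(visible_only)   # ['visible.txt']
```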
A Head-First Guide to Git
Saturday, 15 November 2025
Canonical Snaps and Approach to Linux Packaging
Concept of Snaps and Why It's Useful
When running wsl in Windows (with Ubuntu) you will eventually come across the concept of snaps.
Snaps are a package management feature that offers an alternative to the usual sudo apt-get (or sudo apt install, the newer front end to the same package system).
Snaps were developed by Canonical for security (via sandboxing, or in Canonical language "confinement") and convenience (an "all-in-one" snap removes the need to download and install individual dependencies).
The security side aims to guarantee safe execution of software by mandating packages abide by the principle of least privilege (this is diluted however by the option of classic confinement).
Concrete Example: Installing emacs editor
Trying to run emacs at the command line, you find it is not installed. You may see:
sudo snap install emacs # version 30.2
- Strict confinement - abide by sandbox rules
- Classic confinement - liberal / "laissez-faire" (but needs explicit user approval on install)
Friday, 14 November 2025
Wayland for WinDevs Who May Not Know It
What is Wayland?
Wayland is intended as a replacement for the X Window system with the aim to make it easier to develop and maintain.
Specifically, Wayland is the protocol used to talk to the display server to render the UI and receive input from the user.
Wayland also refers to a system architecture (more below), which will give you an understanding of how the protocol is used to build a working UI.
Wayland versus X Architecture? Call it a "Simplified X".
In an X setup, X clients talk to an X server and the server talks to a compositor. The comms between server and compositor are bidirectional.
The X Server also talks to the kernel. There is a critical interface called evdev (short for event device) which is the input event interface in the Linux kernel.
In Wayland - the display server and the compositor are rolled into one. The architecture is thus simpler.
What is WSLg?
Google Colab
Tuesday, 11 November 2025
Addressing AI Misuse
Getting Jiggy with gpt-oss-20b (and why open weights matter)
LM Studio Setup
Disk Space in Windows 11
Type Storage Settings in the Search bar. This will lead you into Settings ->System -> Storage.
Monday, 10 November 2025
Python Wheels
Python wheels are pre-built binary packages for Python which make installation via pip faster and more efficient. Learn more here.
Why Python From Windows Store is Flawed
Installing Python from the Windows Store is flawed as everything goes into AppData\Local. This is a local directory associated with the logged in user C:\Users\<Name>\AppData\Local.
This is a way to sidestep a "proper installation" in C:\Program Files which requires administrator privileges. It's a way to overcome potential UAC restrictions.
Once installed - there is no proper way to uninstall. You need to get into the AppData directory (which is hidden in File Explorer). Once opened, you can navigate to Microsoft\WindowsStore to find python and related exe files (e.g. pip.exe). Then do a clean of the registry.
You may also find Python subdirectories in the WindowsStore directory.
While cleaning out registry entries you may come across references to .whl files (or Python Wheels).
Understand the Simple Power of Backpropagation but also the Dangers
So states Lex Fridman in his lecture on Recurrent Neural Networks (from the course on Deep Learning for Self-Driving Cars).
pip install tensorflow
This will install the current stable release for TensorFlow.
Friday, 7 November 2025
Control-Star - The Magic Key Combination in Word
Control-Star - will make hidden characters appear and disappear in Word.
Word AI is Not Intelligent
Wednesday, 5 November 2025
Azure Kubernetes Service - Rationale for Using AKS for .NET Applications
Exciting Thing about AI
Technology Knowledge
Biztalk and Azure Integration Services
Monday, 27 October 2025
Keras Models API
Keras Models API provides three ways to create models.
- Sequential Model - the simple model - consists of layers applied in succession. You can create a Sequential model by passing a list of layers to the constructor of Sequential.
- The alternative, and preferred method for most use cases, is the Functional API, which is more flexible than the keras.Sequential API. It enables the building of graphs of Layers.
- Model subclassing is building from scratch, for out-of-the-box use cases.
All Eyes on Keras - Layer = IO TRANSFORMATION
Thursday, 16 October 2025
AI as an Accelerated Research Tool
Tuesday, 14 October 2025
What is PCI-DSS?
Friday, 10 October 2025
Updating your Printer Driver
Go to Device Manager -> Printers and right click, then click on Update Driver.
Renaming your Printer
It can be good to rename your printer - particularly if the current name is a very complex model number. Simply open the printer in Settings, click on Additional Printer Settings and hit Rename.
Printer Showing Out of Paper But Not
Relationship of WMIC to WBEM
WMI is the Microsoft implementation of Web-based Enterprise Management (WBEM), a set of systems management technologies for use in distributed systems. It is based on common standards like the Common Information Model (CIM).
The WBEM initiative was started in 1996 by BMC Software, Cisco, Compaq, Intel and Microsoft.
The specification for the CIM is maintained by the DMTF (Distributed Management Task Force), an industry standards organization consisting of members and alliance partners collaborating on specifications. It also maintains a repository of its operating policies on issues including IP.
Use wmic to debug printer status in cmd.exe
WMIC is Windows Management Instrumentation Command Line. It is a command-line interface to WMI.
To use it to debug printer status on Windows, try the following in cmd.exe.
wmic printer get name, status
If status is Error this needs investigation.
Why Does Printing in Windows Always Try to Print in Letter Format?
Go to "Printers & Scanners" in System Settings. Select your device and click on "Printing Preferences". Then click on "Advanced" (Alt-V). Switch paper size to a new default size (e.g. A4). Click OK, Apply.
Tuesday, 30 September 2025
The Downlow on Explainable AI
Here's a great compilation of the latest in Explainable AI (XAI). Deep learning models (CNNs, RNNs, LLMs) as well as older models such as Support Vector Machines are covered. Techniques such as SHAP (short for Shapley Additive Explanations) are covered as well.
Monday, 29 September 2025
Open Source Community Communication with Discord
Mermaid Diagramming
Friday, 12 September 2025
dotnet.exe - what it means
dotnet.exe has many uses, but running compiled .NET executables distributed as EXE files is not one of them. One application is running .NET SDK commands.
File format for Jupyter Notebooks
Jupyter notebooks are stored as .ipynb files (which stands for IPython Notebook, the project's earlier name).
Microsoft Semantic Kernel
Microsoft Semantic Kernel is a "lightweight, open-source development kit" to build AI agents and integrate models into C#, Python and Java code.
When you load up SK into a fresh Visual Studio Code (no extensions) it will prompt to install recommended extensions. These will include:
- ESLint - integrates ESLint JavaScript into VS Code (for static analysis)
- Prettier - integrates Prettier, the opinionated code formatter (for JavaScript, TypeScript and other webby stuff)
- Azure Functions - to quickly manage serverless apps directly from VS Code
- vscode-pdf - to display pdf files in VSCode (required to open PDF code maps for .NET and Python)
Cost Effective Deployment of Language Models
Cost effective deployment of language models (explicit financial as well as implicit environmental cost) is partly responsible for triggering the interest in small language models (SLMs) as alternatives for specific applications.
Nvidia Research have a great paper on this entitled "Small Language Models are the Future of Agentic AI" with the recommendation that more routine tasks (non reasoning tasks) move from LLMs to SLMs. Fine tuning these SLMs for specific tasks can also enhance the effectiveness of deployed models.
Thursday, 11 September 2025
Introducing the Mojo Programming Language
Tuesday, 9 September 2025
Where Winforms lives on Github
Can't Resize a Form in Design Mode for Winforms
Check that a control with Dock set to DockStyle.Fill is not blocking the resize. Such a control can intercept clicks meant for the form. You can work around this by temporarily setting Dock to None on the blocking control.
Resource-Aware Design in WinForms - TextBox versus RichTextBox
Monday, 8 September 2025
F7 and Shift-F7 - Key Visual Studio Solution Explorer Shortcuts
Environment Variables in Windows 11
To see your environment variables, type set in the command line. This will show you a bunch of stuff like:
APPDATA=C:\Users\windowsjoe\AppData\Roaming
The setx command is an extension of set which allows you to create or modify environment variables, persisting the result across sessions. It was first integrated into Windows Vista and is a staple for Windows 10 and Windows 11.
To use setx to append to your PATH variable you will want to do something like this:
setx PATH "%PATH%;C:\Your\New\Directory"
For Windows Joe, a special scripts directory holds a lot of useful scripts. Hence, the path is updated to:
setx PATH "%PATH%;C:\users\windowsjoe\scripts"
If successful, you will see the message "SUCCESS: Specified value was saved.".
However, you will not be able to see the results using echo %PATH% until you start a new session.
Sunday, 7 September 2025
Alt-TNP - Package Manager Settings in Visual Studio
Alt-TNO - the Nuget Package Manager Console in Visual Studio
The Jungle of Text Encoding
Dealing with textual data on the Internet is like navigating a jungle.
Without some normalisation, you need to get adept at handling multiple encodings.
System.Text is your partner here.
This holds the Encoding class, which has various useful properties.
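The post is about .NET's System.Text, but the underlying problem is language-neutral. As a rough Python analogue, here is a best-effort decoder that tries a few candidate encodings in turn. The function name decode_best_effort and the order of the candidates are assumptions for the sketch; guessing an encoding this way can produce wrong characters, so treat it as a fallback rather than a replacement for knowing the real encoding.

```python
def decode_best_effort(raw: bytes, encodings=("utf-8", "cp1252", "latin-1")) -> str:
    """Try each candidate encoding in turn and return the first successful decode."""
    for name in encodings:
        try:
            return raw.decode(name)
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte, so this is only reached if it was left out of the list.
    return raw.decode("utf-8", errors="replace")


text = "café"
print(text.encode("utf-8"))     # b'caf\xc3\xa9' (é takes two bytes)
print(text.encode("latin-1"))   # b'caf\xe9'     (é takes one byte)
print(decode_best_effort(text.encode("utf-8")))    # café
print(decode_best_effort(text.encode("latin-1")))  # café
```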
Saturday, 6 September 2025
PascalCase for WinForms Controls - Always
Key Properties of SplitContainer
- Panel1 - the leftmost or topmost panel in a SplitContainer
- Panel2 - the rightmost or bottommost panel in SplitContainer
- FixedPanel - which panel remains fixed sized when container is resized
- Orientation - orientation of the panels
- Panel1Collapsed - whether Panel1 is collapsed
- Panel2Collapsed - whether Panel2 is collapsed
- BorderStyle - container border style
Prompt Maintenance
Shortcuts for WinForms Custom Controls
Changes in .NET 6 to WinForms
Compiler Intrinsics - Basics, Pros and Cons
PROS
MASM Decoded
MASM is the Microsoft Macro Assembler.
It was introduced by Microsoft in the early 1980s to support x86 programming on DOS and Windows Platforms, and competed with IBM and Borland assembly tools. The "macro" in the naming refers to assembly macros - reusable code snippets to simplify complex or repetitive assembly tasks. It is an assembler in the sense it converts assembly language into executable machine code.
Inline assembler lets you embed assembly language in the source code of a higher-level programming language. The Microsoft C++ compiler supports it for x86 targets only; it is not supported for x64 or ARM64 targets.
Options to port inline assembler include: conversion to C++, create separate assembly language source files, or use compiler intrinsics (supported by the Microsoft C++ compiler).
Visual Studio 2022 August 2025 Updates
Friday, 5 September 2025
Python platform module API shows errors - Oh Yeah
The Python platform module shows Windows 10 on Windows 11 machines - this is similar to other APIs which have been made to identify Windows 11 as a more advanced build of Windows 10 to avoid breaking backward compatibility. If the build number is 22000 or greater, you've got Windows 11.
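A minimal sketch of the build-number check. It assumes the version string has the "major.minor.build" shape that platform.version() returns on Windows (for example "10.0.22631"); the function name windows_marketing_name is made up for the example.

```python
WINDOWS_11_FIRST_BUILD = 22000


def windows_marketing_name(version: str) -> str:
    """Map a 'major.minor.build' string, e.g. from platform.version(), to a product name."""
    major, _minor, build = (int(part) for part in version.split(".")[:3])
    if major == 10 and build >= WINDOWS_11_FIRST_BUILD:
        return "Windows 11"
    if major == 10:
        return "Windows 10"
    return f"Windows (NT {major})"


print(windows_marketing_name("10.0.19045"))  # Windows 10
print(windows_marketing_name("10.0.22631"))  # Windows 11
```

On a Windows machine you could feed it the live value with windows_marketing_name(platform.version()) after import platform.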
Sunday, 24 August 2025
Silverlight is Dead, Long Live Silverlight
Silverlight lives on in XAML. Visual Studio supports XAML but has lost the visual XAML editor. It's ok.
Figma Basics and First Impressions
Figma feels a lot "lighter" than Microsoft's historical desktop design tool Expression Blend. Being a SaaS it is also easy to get started without installing software - which is time consuming and taxing on laptops.
It operates on design files, has a central canvas, toolbox and two sidebars. The left sidebar is Navigation and the right sidebar is the Properties bar. Very neat and simple.
Figma for Non Design People
The Concept of Figma
Figma is a SaaS tool - popular with UI/UX designers - for creating prototypes and designs enabling real-time collaboration. Its purpose is to "bridge the gap" between designers and developers.
Adobe XD (Adobe Experience Design), now being discontinued, and Sketch are alternatives - however Figma was designed from the ground up to be cloud-native. This adds convenience as well as practical usefulness e.g. collaborating around live synchronised files.
Microsoft Integration with Figma
Microsoft has built custom Figma plugins to integrate its Fluent Design System into Figma. This creates design consistency e.g. in icons, text strings etc. The Fluent Design System is the design language used across Windows, Office and other products.
Microsoft PowerApps
Microsoft enables app creation from Figma designs using Power Apps, part of the Power Platform series of low code tools.
Figuring out Fluent 2
Monday, 18 August 2025
Sidestep Kernel with RDMA
- Skipping kernel involvement
- Avoid creation of multiple intermediate copies
- Avoid switching between kernel space and user space
- InfiniBand
- RoCE (RDMA over Converged Ethernet)
- iWARP (RDMA over TCP/IP).
Kernel Bypass Networking
Kernel bypass networking is a technique to improve network performance by enabling applications to interact with network hardware directly, instead of going via an abstraction layer (the OS kernel's network stack). All intermediate abstractions are bypassed. There are various specialized libraries and technologies such as DPDK, RDMA and OpenOnload to access Network Interface Cards (NICs) directly.
Monday, 11 August 2025
Built-in Secure VPN for Microsoft Edge
Microsoft Edge now comes with a free built-in VPN to keep your location private, safeguard sensitive data, fill out forms and more. There is an option to automatically use VPN for public wifi. The VPN is allegedly powered by Cloudflare and comes with usage limits.
Friday, 8 August 2025
OpenAI Releases Open Models under Apache 2.0
- gpt-oss-120b
- gpt-oss-20b
AWS S3 (and Auth Methods) for Azure Folks
Amazon S3 is something akin to Azure Blob Storage.
A quick revision of Azure Blob Storage:
Azure Blob Storage is a general-purpose cloud storage option for cloud native workloads.
It supports WORM operating scenarios (Write Once Read Many) aka WORM-compliant storage (which prevents data from being edited or deleted). Role based access control (RBAC) is supported as is authentication with Microsoft Entra (formerly Azure Active Directory).
Amazon S3 (Simple Storage Service) is a similar offering in AWS.
The services are exposed as a web service with a custom HTTP authentication scheme based on a keyed-HMAC (Hash-based Message Authentication Code). A more "high level" authentication option is available via AWS IAM.
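The general keyed-HMAC idea can be shown with Python's standard library. This is not the actual AWS request-signing procedure, which has its own canonical format; the key and the string being signed are hypothetical.

```python
import hashlib
import hmac

secret_key = b"hypothetical-secret-key"             # shared secret, never sent over the wire
message = b"GET\n/my-bucket/report.csv\n20250808"   # made-up string to sign

# Client side: compute the signature and send it along with the request.
signature = hmac.new(secret_key, message, hashlib.sha256).hexdigest()

# Server side: recompute with the same key and compare in constant time.
expected = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
print(hmac.compare_digest(signature, expected))   # True

# Any change to the message produces a different signature.
tampered = hmac.new(secret_key, message + b"x", hashlib.sha256).hexdigest()
print(hmac.compare_digest(signature, tampered))   # False
```

The point of the scheme is that the server can verify who sent the request, and that it was not altered, without the secret ever travelling with it.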
The End of Microsoft Authenticator
Thursday, 10 July 2025
Do you know Windsurf? Oh, sorry, Codeium, actually under the hood!
Windsurf is Codeium. A rebrand, but not a bad one. It states it is the "most powerful AI code editor". Investors include Founders Fund, General Catalyst, Greenoaks and Kleiner Perkins, but that need not trouble Windows Joe. The main thing is this is an invested platform, so developers can invest time in it.
Wednesday, 9 July 2025
Apache Nutch - the Tool that Drives Common Crawl
LLM Training Data
LLMs are trained on large data sets. One such data set is Common Crawl which consists of 250 billion Internet pages with 3-5 billion pages added each month. This is petabytes worth of data (1 petabyte = 10^15 bytes of digital information). The data is stored on Amazon's S3 service allowing direct download or access for Map-Reduce processing in EC2.
Tuesday, 8 July 2025
What is InstructLab?
Friday, 4 July 2025
Latest .NET Version as of July 2025
Latest Supported .NET Version as of July 2025 is .NET 9 (STS)
The latest stable .NET version as of July 2025 is 9.0.6, released on June 10, 2025.
.NET 9 Patch version 9.0.6; Release: STS; End of support: May 2026 (ORD: November 2024)
.NET 8 Patch version 8.0.17; Release: LTS; End of support: November 2026 (ORD: November 2023)
ORD means Original Release Date.
Release Schedule
Major .NET versions are released annually in November. Each release is defined as STS or LTS at the beginning of the release.
Details of Microsoft's Lifecycle Policy are Below
2 Become 1 - Story of .NET Frame** and .NET Core
What is TriG in Semantic Computing?
TriG is an extension of Turtle for representing all the data in RDF graphs in a compact format. It is a W3C recommendation as of February 2014. TriG stands for "triples in graphs".
Any Turtle statement is also a valid statement in TriG.
Thursday, 3 July 2025
Unsloth
Unsloth aims to speed up the expensive process of LLM training. It does this by rewriting different components of the training pipeline including rewriting the gradient calculation. Their motto is "24 hours not 30 days" which is a reference to LLM training time. It also claims to rewrite GPU kernels for efficiency (functions designed to be executed on GPUs).
The Hugging Face Transformers Library and MRM
The Transformers library puts trained open AI models in the hands of Python programmers.
It is maintained by Hugging Face, a hub for "SOTA" AI models.
Hugging Face also maintains Markdown files called Model Cards in each relevant model repo to give you insight into the models.
The concept of Model Cards is explained in this paper. It argues that, for high-impact applications, the Model Card brings critical usage information for deployers to consider. This could be seen as a tool to support Model Risk Management (MRM).
Transformers is available on PyPI and can be installed with pip.
So - what is Preference Alignment in LLM Training?
Preference alignment in LLM training aims to improve an LLM's behavior by forcing it to follow rules and preferences. It could relate to stopping offensive language or some other restriction.
Some approaches to preference alignment are detailed in this blog post from Miguel Mendez. There are a number of known techniques for this - these include:
PPO: Proximal Policy Optimization
DPO: Direct Preference Optimization
ORPO: Odds Ratio Preference Optimization (works without a reference model)
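To make one of these techniques concrete, here is a minimal sketch of the DPO loss for a single preference pair, following the published DPO formulation. The function and argument names are made up for illustration, and the inputs are assumed to be summed log-probabilities of each answer under the trained policy and under a frozen reference model. Real training code works on batches of tensors, not plain floats.

```python
import math


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one (chosen, rejected) pair of answers."""
    # How much more the policy likes each answer than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    return math.log1p(math.exp(-logits))


# Policy identical to the reference: no preference learned yet.
neutral = dpo_loss(-5.0, -5.0, -5.0, -5.0)
# Policy has shifted probability towards the chosen answer.
improved = dpo_loss(-3.0, -7.0, -5.0, -5.0)
print(neutral > improved)  # True
```

The loss falls as the policy raises the chosen answer and lowers the rejected one relative to the reference model, with beta controlling how strongly that gap is rewarded.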
For preference alignment we usually need data labelled as good or bad. Human annotation of such data is often expensive, and in some cases a clear "winner" between contrasting data points is not decidable. With KTO (Kahneman-Tversky Optimization) two answers can both be regarded as good. This arguably is closer to reality.
KTO is detailed more in a blog post from contextual.ai.
The research paper on KTO should be read to understand how to construct the relevant KTO loss function.
SPARQL is the query language for RDF - Know It
SPARQL is THE query language for RDF. Here are some learning resources.
Both universities and commercial firms are involved in the RDF Star WG.
SPARQL uses pattern matching to query an RDF graph and also allows aggregate algebra operations (such as COUNT) to be performed on qualifying nodes.
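The idea of pattern matching plus aggregation can be sketched in a few lines of plain Python. This is a toy, not a SPARQL engine: the triples, the ex: prefix and the match function are all invented for illustration, and only a single triple pattern is supported.

```python
from collections import Counter

# A tiny in-memory "graph" of (subject, predicate, object) triples.
triples = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:knows", "ex:carol"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:alice", "ex:name", '"Alice"'),
]


def match(graph, pattern):
    """Yield variable bindings for one triple pattern.

    Terms starting with '?' are variables; anything else must match exactly.
    """
    for triple in graph:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            yield binding


# Roughly: SELECT ?s (COUNT(?o) AS ?n) WHERE { ?s ex:knows ?o } GROUP BY ?s
counts = Counter(b["?s"] for b in match(triples, ("?s", "ex:knows", "?o")))
print(counts["ex:alice"], counts["ex:bob"])  # 2 1
```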
The Power is INDEED the LLAMA
Llama 4 ("Leading Intelligence") is out.
There is something called the Llama 4 Community License Agreement. This states you can use Llama models in derived products - but you must tell the world you are using Llama and what version it is.
Machine Unlearning
Tuesday, 1 July 2025
Concept of LoRA or Low Rank Adaptation in LLMs
LoRA is an approach to optimizing LLMs by reducing the "size" of the matrix of trainable parameters, as measured by "rank" of the matrix i.e. the number of linearly independent rows or columns.
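A quick back-of-envelope calculation shows why this helps. In the usual LoRA formulation the original d x k weight matrix is frozen and only two thin matrices, B (d x r) and A (r x k), are trained, with their product acting as the update. The sizes below (4096 x 4096, rank 8) are hypothetical values picked for illustration, as are the function names.

```python
def full_params(d: int, k: int) -> int:
    """Trainable parameters when the whole d x k matrix is updated."""
    return d * k


def lora_params(d: int, k: int, r: int) -> int:
    """Trainable parameters for B (d x r) plus A (r x k)."""
    return r * (d + k)


def matmul(b, a):
    """Plain-Python matrix product, enough to show the shape of B @ A."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*a)]
            for row in b]


d, k, r = 4096, 4096, 8  # hypothetical layer size and rank
print(full_params(d, k))                          # 16777216
print(lora_params(d, k, r))                       # 65536
print(full_params(d, k) // lora_params(d, k, r))  # 256

# Tiny demo: a 3 x 1 times 1 x 3 product is a full 3 x 3 update of rank 1.
update = matmul([[1], [2], [3]], [[1, 0, 2]])
print(update)  # [[1, 0, 2], [2, 0, 4], [3, 0, 6]]
```

The update still has the full shape of the original matrix, but its rank can be no higher than r, which is where the saving in trainable parameters comes from.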
What is RIO in RDF? Is it relevant post OpenRDF?
RIO is the "RDF Input/Output" toolkit.
The RIO appellation persists even in the post OpenRDF world.
RIO was part of OpenRDF and is now part of RDF4J. Docs are here.
RIO provides parsers and writers for RDF file formats, and these can be used independently of the rest of the library. An important interface in the toolkit is the RDFHandler, which receives parsed RDF triples. This can be used as a pure listener, or as a reporting tool (being passed to a function that needs to report results back).
It's good to understand RIO both for comprehending legacy messages from OpenRDF and also more recent exceptions from RDF4J.
OpenRDF Usage in Blazegraph
Blazegraph (no longer maintained and unofficially superseded by Amazon Neptune) uses OpenRDF under the hood (rather than the renamed version RDF4J).
This can be seen by forcing an exception in the Blazegraph workbench: type some bad syntax into the Update window (typing "hello world" will do nicely) and the error names the old package:
org.openrdf.rio.RDFParseException
Why does Copilot want to rewrite my RDF when it doesn't know how?
What is Turtle in the World of Semantic Web?
Monday, 30 June 2025
Why are GPUs fundamental to AI? And AI training?
Is Blazegraph now Amazon Neptune?
Apache JENA
The Rationale Behind "Internationalized" Resource Identifiers
What is OpenRDF Better Known as?
RDF hackers will know about OpenRDF, which officially became Eclipse RDF4J in May 2016.
Its tagline touts its power to "create applications that leverage the power of linked data and Semantic Web".
Sesame was another name for what is now known as RDF4J.
Many name changes were also effected in the move to a new governance structure. For example:
org.openrdf.* Java packages moved to org.eclipse.rdf4j.*
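When reading old stack traces it can help to translate legacy class names into their RDF4J equivalents. The sketch below is a naive prefix rewrite based only on the rename stated above. It assumes the rest of the package path is unchanged, which may not hold for every class, and the function name is invented for illustration.

```python
OLD_PREFIX = "org.openrdf."
NEW_PREFIX = "org.eclipse.rdf4j."


def to_rdf4j_name(class_name: str) -> str:
    """Rewrite a legacy OpenRDF class name to its assumed RDF4J name."""
    if class_name.startswith(OLD_PREFIX):
        return NEW_PREFIX + class_name[len(OLD_PREFIX):]
    # Names outside the old namespace are returned untouched.
    return class_name


print(to_rdf4j_name("org.openrdf.rio.RDFParseException"))
# org.eclipse.rdf4j.rio.RDFParseException
```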
In particular the RDF4J project houses the SAIL (Storage and Inference Layer) API for low level transactional access to RDF data. Sail is dubbed the "JDBC of the RDF database world".
What is Dublin Core in Computer Systems?
Dublin Core is a metadata labelling system. It is maintained by the DCMI, or Dublin Core Metadata Initiative.
org.openrdf.query.MalformedQueryException
What is Snake Case in Computer Programming?
Snake case is a way of writing compound words, so that each word is separated by an underscore symbol. An example would be "cavendish_laboratory" or "sainsbury_laboratory".
It is meant to be easier to read than camelCase, _camelCase or PascalCase.
A variant, Kebab Case, uses dashes and, though commonly found in URLs, is also used in other contexts e.g. SPARQL-QUERY.
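Converting between these conventions is a one-liner with a regular expression. The helper names below are made up for illustration, and the approach is deliberately simple: it puts an underscore before every capital letter, so acronyms such as "HTTPServer" would be split letter by letter.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert camelCase or PascalCase to snake_case."""
    # Insert an underscore before each capital that is not the first character.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def to_kebab_case(name: str) -> str:
    """Convert camelCase or PascalCase to kebab-case via snake_case."""
    return to_snake_case(name).replace("_", "-")


print(to_snake_case("cavendishLaboratory"))  # cavendish_laboratory
print(to_snake_case("SainsburyLaboratory"))  # sainsbury_laboratory
print(to_kebab_case("sparqlQuery"))          # sparql-query
```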
Change Font Size in Terminal on Windows 11
Changing font size in the Terminal in Windows 11 is a multistep process. It is an important skill for developers.
1. Press the DOWN ARROW on the menu bar
2. Go to SETTINGS
3. Select the THREE BARS at the top left
4. Select the appropriate profile (COMMAND PROMPT, as opposed to e.g. Windows PowerShell, Azure Cloud Shell).
5. Click Appearance and edit Font Size (default is 12 point, 10 point is better for development tasks).
The Blazegraph Database - 50 Billion Edges Supported
The Blazegraph database is an ultra-high-performance graph database with support for Blueprints and RDF/SPARQL, handling up to 50 billion edges on a single machine.
It powers the Wikidata Query Service.
There is a Quick Start guide that shows you how to start the Blazegraph JAR file from its installed location. It will then greet you with a Welcome Message from SYSTAP.
java -server -Xmx4g -jar blazegraph.jar
What is THE Semantic Web Stack?
The Semantic Web Stack illustrates the architecture of the Semantic Web.
Another weird name for this is Semantic Web Cake or Semantic Web Layer Cake.
It's built from hypertext technologies (such as XML, XML namespaces) and utilizes middle layer technologies like RDF and SPARQL (which is a middle layer RDF query language).
Security layer and UI layer are evolving areas of Semantic Web technology which are not standardised.
Win Joe's Buzzword Alert - What is a SIEM?
Friday, 6 June 2025
The Model Context Protocol (Merci, Anthropique)
What is WebRTC?
WebRTC (Web Real-Time Communication) is an open standard providing real-time communication between web applications using APIs. It was initially released in 2011 and was the work of Justin Uberti and Peter Thatcher.
The official website can be found on the Google for Developers site.
One application it enables is browser-based VoIP telephony, or "web phones", enabling calls to be made and received from within a web browser.
Tuesday, 3 June 2025
Against AI-First
AI First is often not what you want in software. Systems need to start with humans and human intentions, and AI needs to provide seamless support, not be front and center stage. The other thing is that AI written by humans introduces human biases. These may make sense where the creator and end user are the same, but too often programmer biases enter consumer software, which is bad for human-centered software development.
Monday, 26 May 2025
Progressive Rollout (aka "Canary Deployment") in the Cloud
A canary deployment is an old concept with new branding in the age of cloud, and the term is used by Google, AWS and Azure alike. Kubernetes technology is one way to manage canary deployments.
Canary deployments are progressive rollouts where new functionality is released to a subset of specially selected users. The canary deployment therefore runs in parallel with the current production deployment used by your regular users. This gives you more time and space to test the reliability of new features "in the wild".
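One common way to pick the subset is to hash each user ID into a stable bucket and send a fixed percentage of buckets to the canary. The sketch below assumes that approach. The function name and the 10% figure are invented for illustration, and real platforms typically do this at the load balancer or service mesh rather than in application code.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Return True if this user should be routed to the canary release."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Map the hash to a stable bucket between 0 and 99.
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent


users = [f"user-{n}" for n in range(1000)]
canary_users = [u for u in users if in_canary(u, 10)]
# Roughly 10% of users, and always the same ones on every run.
print(len(canary_users))
```

Because the bucket depends only on the user ID, a given user stays on the same side of the split between requests, and raising the percentage only ever adds users to the canary.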
Introducing the Azure SRE Agent
The new Azure SRE agent, announced in May 2025 and demonstrated at MS Build, is designed to make it easier to "sustain production environments". This includes taking the toil out of checking log files and analyzing historical changes, and augmenting this work with LLMs. Incident and infrastructure management is set to be transformed, with the Azure SRE agent able to partner in incident investigation and root cause analysis. An example prompt may be: "visualize HTTP request and 500 errors for last week for my app".
Wednesday, 14 May 2025
Tuesday, 13 May 2025
CNCF
We have discussed CNCF in the context of gRPC.
Other famous hosted projects are Kubernetes, Prometheus and CoreDNS.
In their own words they host "critical components of the global technology infrastructure". They also organize conferences.
wsl for Windows 11
wsl is not installed by default on Windows 11. To install it, just type wsl and tap any key when prompted.
You can then type wsl --version to get version info.
This will tell you the wsl version (e.g. 2.4.13.0), kernel version (e.g. 5.15.167.4-1) and MSRDC version (e.g. 1.2.5716).
The kernel version does not refer to the Windows kernel version but the WSL kernel version.
Kernel releases can be found here. MSRDC version refers to Remote Desktop Client whose versions can be found here (and which can be used to connect with Azure Virtual Desktop).
TCF Vendors
- BeeswaxIO Corporation (a programmatic advertising company now owned by Comcast)
- Magnite, Inc
- Comscore BV
- LiveIntent Inc (acquired by Zeta Global)
- Amazon Advertising
Friday, 9 May 2025
Papers with Code
Papers with Code is a Meta AI initiative that organizes machine learning papers under various themes including Computer Vision, Natural Language Processing, Reasoning, Time Series and Knowledge Representation. Some of these papers are written by corporate researchers contributing to open source.
Thursday, 1 May 2025
Visual Studio Magazine
MSVS is large, complex and changes often enough to warrant its own magazine. Read it well.
Dot net (.NET) MAUI for Dummies
Dot net (.NET) MAUI (Multi-platform App UI) is a cross-platform framework for creating native mobile and desktop applications with C# (and optionally XAML).
The upside is you just have one codebase which can be used to render UI on Windows, Android, iOS, macOS and Samsung Tizen.
For Windows, WinUI 3 is used as the native platform (this means it will work on Windows 10 version 1809 or later, and Windows 11).
.NET MAUI has evolved from multifarious UI technologies.