- Libraries for Boosting: catboost, XGBoost (eXtreme Gradient Boosting)
- Libraries for Bagging: scikit-learn (BaggingClassifier, BaggingRegressor), imbalanced-learn (scikit-learn extension for imbalanced data), ML-Ensemble
Programming is Not Rocket Science, Don't let AI Write Your Code, Fight Back, Learn from ODML
Saturday, 29 November 2025
Boosting versus Bagging
Friday, 28 November 2025
ufunc in numpy - understanding universal functions
statsmodels in Python
statsmodels is a Python package that complements scipy for statistical computation.
The stable version is found here.
statsmodels takes ideas from other libraries and ecosystems; specifically, it uses R-style formulas and pandas DataFrames.
Chances are you are using the library with other libraries too, like numpy.
It can be installed via Conda or pip. For example, with pip:
python -m pip install statsmodels
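To make the R-style formulas and pandas DataFrames mentioned above a little more concrete, here is a minimal sketch of an ordinary least squares fit. The column names x and y and the data values are made up for the example; it assumes pandas and statsmodels are both installed.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: y is exactly 2*x + 1, so the fit should recover those values.
df = pd.DataFrame({"x": [0.0, 1.0, 2.0, 3.0, 4.0],
                   "y": [1.0, 3.0, 5.0, 7.0, 9.0]})

# "y ~ x" is an R-style formula: model y as a linear function of x (intercept included).
results = smf.ols("y ~ x", data=df).fit()

intercept = results.params["Intercept"]
slope = results.params["x"]
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

The fitted parameters come back as a pandas Series, which is one reason the two libraries tend to be used together.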
Thursday, 27 November 2025
Validating DataFrames in pandas
In Advance of Node.js Learning
Learn the basic rudiments of JavaScript. Include asynchronous JavaScript.
Connoisseur's Guide to JavaScript Engines: V8 Rules
Node.js uses the V8 JavaScript engine which powers Google Chrome (and is open sourced by Google).
Other browsers use their own engine, for example Firefox uses SpiderMonkey and Safari uses JavaScriptCore (aka Nitro). Edge was based on Chakra (a Microsoft project that was open-sourced) before being rebuilt with V8 and Chromium.
What JavaScript is Not Allowed to Do in the Browser
JavaScript in the browser is not normally allowed to do too much for security reasons.
Stuff it cannot do includes anything OS related - specifically:
- Cannot read/write arbitrary files
- Cannot access hardware directly
- Cannot control processes
What it can work with includes:
- DOM manipulation (HTML/CSS)
- Local/session storage
- IndexedDB (database built into the browser)
- Cookies (with same origin)
- Clipboard (with user permission)
- Geolocation, camera, microphone (with user consent)
Node.js relationship with Electron
Node.js is needed to "scaffold" an Electron project - the Node package manager (npm) is used to download Electron packages, install dependencies and generate starter files.
Electron itself comes bundled with its own Node.js runtime.
This serves as the "back end" of your Electron app.
It manages windows, menus, system events and native OS integration. You can use Node modules directly in this environment e.g. fs, path and http.
The "renderer" process is Chromium.
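In early versions of Electron the renderer could also access Node.js APIs by default; since Electron 5 this (the nodeIntegration setting) is off by default for security reasons and has to be explicitly enabled. With it enabled you effectively have a "contained" web app with OS access.
The main process (Node.js runtime) communicates with the renderer via IPC.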
This combined architecture of Node.js and Chromium enables applications to be written that run on Windows, macOS and Linux without experience of native UI development.
The Same Origin Policy (SOP) on Modern Web Browsers
The Same Origin Policy (SOP) is a browser-enforced security rule that prevents scripts from one "origin" (PDP -> protocol + domain + port) from accessing resources from another origin.
The SOP prevents cookies, DOM and local storage from being read by malicious cross-site scripts.
The SOP does not just apply to web browsers. For example, Electron apps (desktop apps built with web tech) enforce SOP because they embed Chromium.
The Same Origin Policy is an "isolation model" designed to ensure "secure workflow".
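The "protocol + domain + port" comparison can be sketched in a few lines. This is only an illustration of the idea, not the algorithm a browser actually runs; the helper names origin and same_origin and the example URLs are made up here.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes used in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(url: str) -> tuple:
    """Return the (protocol, domain, port) triple for a URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)

def same_origin(a: str, b: str) -> bool:
    """Two URLs share an origin only if all three parts match."""
    return origin(a) == origin(b)

print(same_origin("https://example.com/a", "https://example.com:443/b"))  # same origin
print(same_origin("https://example.com/", "http://example.com/"))         # protocol differs
print(same_origin("https://example.com/", "https://api.example.com/"))    # domain differs
```

Note that a different path does not change the origin, while a different subdomain does.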
Basics of Selenium
Wednesday, 26 November 2025
git remote add
Tuesday, 25 November 2025
pandas and DataFrames
pandas provides data structures and data analysis tools for Python.
The basic data structures in pandas are:
- Series: one-dimensional (labelled) array holding data of any type e.g. integers, strings
- DataFrame: a two-dimensional data structure, holding a 2d-array or table with rows and columns
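A quick sketch of the two structures, assuming pandas is installed. The labels and values are invented for the example.

```python
import pandas as pd

# Series: a one-dimensional labelled array.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# DataFrame: a two-dimensional table; each column is itself a Series.
df = pd.DataFrame({"name": ["Ada", "Linus"], "year": [1815, 1969]})

print(s["b"])            # value stored under label "b"
print(df.shape)          # (rows, columns)
print(type(df["year"]))  # a column comes back as a Series
```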
Monday, 24 November 2025
Microsoft Launch Fara-7B: A CUA (Computer Use Agent) in SLM Form
And here we have it. Ready for action on Hugging Face, Sir.
Sandboxing and monitoring are recommended. The agent itself is a wrapper around Playwright.
Sunday, 23 November 2025
Hacking Transformers with Hugging Face
The Runtime Formerly Known as TensorFlow Lite
Mastering BERT and DistilBERT
It is worthwhile to study BERT as DistilBERT has the same general architecture as BERT.
IBM's Guide to Small Language Models
sbs_diasymreader.dll
- sbs - side by side (Recall - allows multiple versions of a DLL to sit side-by-side without conflicts)
- dia - Debug Interface Access, reference to SDK used to read debugging symbols (PDB files)
Friday, 21 November 2025
Programming Realities - The Awkward Error Message
The awkward error message can stop you in your tracks. Keep pushing on. Discover. Remediate error. Every error is a massively valuable learning opportunity.
Thursday, 20 November 2025
The Concept of C# Scripting (.csx files)
Compiling C# in VS Code
For this you need the C# Dev Kit extension. It's Roslyn-powered.
The code name Roslyn was first written publicly by engineer Eric Lippert (the code was originally hosted on Microsoft's CodePlex before being moved to GitHub).
Dev Containers Extension in VS Code
What it is, What it does
The Dev Containers ("DC") extension is needed by Semantic Kernel in VS Code. It's worth expanding on its purpose here.
DC allows you to use a Docker container as a full-feature development environment (this is independent of how you deploy the thing).
More details here.
Dev Containers Dependencies
It requires Docker Desktop to be installed, which interacts with WSL2. If you don't have it, don't worry, however. VS Code will prompt you automatically to install it. After installation, you will see a status bar labelled "Setting up Dev Containers" followed by "Connecting to Dev Container (show log)".
Dev Container Configuration
This is located in semantic-kernel\.devcontainer\devcontainer.json. This is similar to launch.json for debugging configurations. More info here.
Git on Windows
Git on Windows is good.
However there are a few options to select before you get this going.
- add git to path (can use from cmd.exe and PowerShell etc.)
- use bundled OpenSSH (uses ssh.exe that comes with git) - alternative is to use an ssh.exe you install and add to your path
- which SSL/TLS library should Git use for HTTPS connections? The OpenSSL library or native Windows Secure Channel (choose the latter). Here server certificates will be validated against the Windows Certificate Stores. This also allows you to use your company's internal Root CA certificates distributed e.g. via Active Directory Domain Services
- Git Bash to use MinTTY for terminal emulation (better scrollback than cmd.exe)
- Use Git Credential Manager or use none
Do You Understand Fully How This Works
I think you need to. Attribution: One software engineer to another.
Mastering git clone
The Unstable Book for Rust
Rust has The Unstable Book to cover unstable features of Rust. When using an unstable feature of Rust, you need to use a special flag or rather a special attribute: #![feature(...)].
Unstable features in Rust refer to specific capabilities (language or library) as yet unstabilized for general use. You can access these on the nightly compiler (not in stable or beta channels).
Unstable features may be experimental, incomplete or subject to change.
They are in the language as a means of balancing innovation with stability. Developers get access to new features but basically on a trial basis. While the feature is classed as unstable, Rust team can refine the design, fix edge cases or abandon features if problematic.
cargo new hello_cargo
For details on how to use cargo new, type cargo new --help at the command line.
cargo new hello_cargo
creates a new directory for your project, a .git subdirectory, a src directory for your source code, and a Cargo.toml file.
TOML (Tom's Obvious Minimal Language) is Cargo's configuration format. (TOML brands itself as "A Config File Format for Humans" and is nice, simple and neat).
It has a [package] section which configures package settings, and a [dependencies] section for any of the crates required for the package to run (crates are Rust packages).
cargo init --help
for help integrating any Rust code developed outside of Cargo.
Elliptic Curves Assert Presence on Linux
source command in Linux shell
The source command in the Linux shell sources (loads) a script into the current shell. Its short form is a single dot (.).
emacsx - An Emacs Launcher for .bashrc
Here is a cool emacs launcher for your .bashrc file if you like reverse video.
# Launch Emacs in full screen, optionally with a file
emacsx() {
    if [ "$#" -eq 0 ]; then
        command emacs -rv --fullscreen
    else
        command emacs -rv --fullscreen "$@"
    fi
}
Cargo for Rustaceans
Wednesday, 19 November 2025
What is the "tensor" in TensorFlow?
Deeper Look at Sequential Model Building in TensorFlow
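import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),  # 28x28 image -> 784 values
    tf.keras.layers.Dense(128, activation='relu'),  # hidden layer
    tf.keras.layers.Dropout(0.2),                   # drop 20% of activations in training
    tf.keras.layers.Dense(10)                       # one logit per class
])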
For the Dense layer:
- The first argument is the positive integer units, representing the dimensionality of the output space
- The second argument is the activation function to use (if this is missing, no activation is applied, which amounts to the linear activation a(x) = x)
- Essentially, an activation function transforms a neuron's weighted sum of inputs into its output signal
- ReLU (rectified linear unit) is one of the most widely used activation functions in neural networks (f(x) = max(0,x)).
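The bullets above can be sketched without TensorFlow at all. This is a hand-rolled toy, not how Keras implements Dense: the function names, the row-per-unit weight layout and the numbers are all choices made for this example.

```python
def relu(x: float) -> float:
    """Rectified linear unit: f(x) = max(0, x)."""
    return max(0.0, x)

def dense(inputs, weights, biases, activation=None):
    """Forward pass of a fully connected layer.

    weights[j] holds the input weights for output unit j, so the number of
    rows in weights plays the role of the 'units' argument.
    With activation=None the layer is linear: a(x) = x.
    """
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(activation(z) if activation else z)
    return outputs

# Toy numbers, chosen by hand: 3 inputs feeding 2 units.
x = [1.0, -2.0, 0.5]
W = [[0.5, 0.25, -1.0],   # unit 0
     [1.0, 0.5, 2.0]]     # unit 1
b = [0.0, 0.5]

print(dense(x, W, b))                   # linear output: [-0.5, 1.5]
print(dense(x, W, b, activation=relu))  # negative value clipped: [0.0, 1.5]
```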
Train your First Neural Network on the MNIST dataset
- Use Keras API as a "portal" into TensorFlow library to build the neural network
- Use "Sequential" model - allows you to add "layers" sequentially
- "Feed" the model the training data (creating the model takes a bit of study/effort)
- Model gives back a vector of "logits" or "log-odds" scores, one per class
- Run softmax to convert these scores to probabilities
- Compile the model with an optimizer and a loss function, and track the 'accuracy' metric
- Call model.fit to train the model on the training data
- model.evaluate to see how the model performed (was it a good fit to the data)
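The logits-to-probabilities step in the list above is worth seeing on its own. Here is a minimal softmax in plain Python; the three logit values are invented (a real MNIST model would return ten, one per digit).

```python
import math

def softmax(logits):
    """Convert raw scores (logits) into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing; the result is unchanged.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three classes.
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)

# The predicted class is the one with the highest probability.
predicted_class = probs.index(max(probs))
print(probs, predicted_class)
```

The ordering of the scores is preserved, so the largest logit always ends up with the largest probability.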
I had to bring pip into my Ubuntu Linux VM
Even with Python installed a whole host of packages are needed for pip. These include (non-exhaustively):
- build-essential
- bzip2
- cpp
- cpp-11
- fakeroot
- ...
- zlib1g-dev
Just to mention a few.
build-essential is a Debian specific package, consisting of a bunch of useful build tools, to build software from source and create Debian packages.
Tuesday, 18 November 2025
Rustup
Microsoft's Big Bet on Rust
Will two git commits ever have the same id?
This is highly unlikely due to the design of the commit algorithm. With SHA-1, the commit id is a 40-character hexadecimal number produced by a cryptographic hash function, giving 2^(4*40) = 2^160 (about 1.46 x 10^48) possible hashes, so the chance of two given commits sharing an id is roughly 1 in 1.46 x 10^48. (Newer versions of Git can use SHA-256 instead, which gives 64-character ids.)
Data fed into the cryptographic hash function includes a reference to the snapshot of the project tree (folders, files and contents), the parent commit id(s), the commit message, author and committer information, and timestamps (seconds elapsed since Jan 1, 1970).
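For the curious, here is a rough sketch of how a SHA-1 object id is computed: Git hashes a small header ("<type> <size>" followed by a NUL byte) plus the object body. The helper name git_object_id is made up, and the commit body below is invented (hypothetical author, email and timestamp), so its id will not match anything in a real repository.

```python
import hashlib

def git_object_id(obj_type: bytes, body: bytes) -> str:
    """SHA-1 id of a Git object: hash of '<type> <size>' + NUL byte + body."""
    header = obj_type + b" " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

# The empty blob has the same well-known id in every SHA-1 repository.
empty_blob = git_object_id(b"blob", b"")
print(empty_blob)

# A made-up commit body. The tree id is the well-known empty tree id;
# the author, email and timestamp are hypothetical.
commit_body = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"author A Developer <dev@example.com> 1763424000 +0000\n"
    b"committer A Developer <dev@example.com> 1763424000 +0000\n"
    b"\n"
    b"a little comment, if you please\n"
)
commit_id = git_object_id(b"commit", commit_body)
print(commit_id, len(commit_id))

# 40 hex characters at 4 bits each gives 2**160 possible ids.
possible_ids = 2 ** (4 * 40)
print(possible_ids)
```

Change a single byte of the body (a later timestamp, say) and the id changes completely, which is why two distinct commits colliding is so improbable.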
TCPL Still Rules the Linux Roost
The programming language C (born in the 1970s and created by Dennis Ritchie as the successor to Ken Thompson's B) still rules the roost as far as the Linux kernel is concerned: of 37 million lines of code, just over 35 million are written in C, as per analysis from OpenHub.
Linux is an open source, Unix-like operating system. Unix itself was written in C, and prior to that in assembler. The introduction of the programming language C made the code portable to different hardware.
Ken Thompson and Dennis Ritchie and their colleagues at Bell Labs (AT&T) were the co-creators of the Unix Operating System, created in 1969. Douglas McIlroy introduced the Unix philosophy of small, composable tools. Brian Kernighan helped to popularise UNIX and C through co-authoring The C Programming Language with Dennis Ritchie.
git add and git commit
The Mechanics of Commits
You can add a new file to your repo by doing:
git status (this will show any untracked files and pending changes)
git add <mynewfile>
git commit -m "a little comment, if you please" <mynewfile>
Allowing wsl.localhost in list of allowed hosts
When navigating to a directory in WSL from VS Code you may get the message:
The host 'wsl.localhost' was not found in the list of allowed hosts. Do you want to allow it anyway?
You will get the opportunity to hit Allow (because you trust the host, it is after all, your WSL installation) together with the option to flag: "Permanently allow host 'wsl.localhost'".
Accessing Linux Directories in WSL From Explorer
Directories in WSL can be accessed by navigating to your distribution and filesystem after opening \\wsl$ in File Explorer.
Telling git who you are with git config
To get your commits attributed correctly you need to let git know who you are. This means sharing the name and email address you want to use for git. Here is the way:
git config --global user.name "Your Name"
git config --global user.email "youremail@provider.com"
Creating a git repository: git init
What git init does
Run git init in the folder where your code will be based (this assumes you have already created an appropriately named folder and done cd into it).
This creates a .git subdirectory.
What does the .git subdirectory hold?
.git is filled with configuration files and a subfolder for snapshots to be stored. All commits are stored in .git as well.
Name of initial branch
This may change in successive versions of Git, but as of 2.34, the name of the initial branch created by git init is 'master' (other commonly used names include 'main' and 'trunk' and 'development').
Re-running git init
Re-running git init is safe: it will not overwrite what is already there. It will simply output "Reinitialized existing Git repository in /home/YOURLOGIN/YOURPROJECT/.git/".
Re-running git init in the .git subdirectory
Renaming a branch in git
The git branch command can be used to rename a branch, via its -m (move) option.
Linux VM Skills - The ls command
ls -a and ls -A will list all files including hidden files.
ls -A (capital A) will omit the . and .. entries (i.e. the current and parent directories).
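If it helps to see the difference spelled out, here is a rough Python emulation of the three listings. The function name ls, its show parameter and the file names are made up for the example, and real ls sorts according to your locale, so ordering may differ.

```python
import os
import tempfile

def ls(path, show="default"):
    """Rough emulation of ls (default), ls -A ("almost_all") and ls -a ("all")."""
    names = sorted(os.listdir(path))  # os.listdir never returns . or ..
    if show == "default":
        return [n for n in names if not n.startswith(".")]
    if show == "almost_all":
        return names
    return [".", ".."] + names

# Build a throwaway directory with one ordinary file and one hidden file.
with tempfile.TemporaryDirectory() as d:
    for name in ("notes.txt", ".bashrc"):
        open(os.path.join(d, name), "w").close()
    plain = ls(d)
    almost_all = ls(d, "almost_all")
    everything = ls(d, "all")

print(plain)       # hidden files left out
print(almost_all)  # hidden files included, no . or ..
print(everything)  # hidden files plus . and ..
```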
A Head-First Guide to Git
Saturday, 15 November 2025
Canonical Snaps and Approach to Linux Packaging
Concept of Snaps and Why It's Useful
When running wsl in Windows (with Ubuntu) you will eventually come across the concept of snaps.
Snaps are a package management feature that offers an alternative to the usual sudo apt-get (or sudo apt install, the newer front end to the same package system).
Snaps were developed by Canonical for security (via sandboxing, or in Canonical language "confinement") and convenience (an "all-in-one" snap removes the need to download and install individual dependencies).
The security side aims to guarantee safe execution of software by mandating packages abide by the principle of least privilege (this is diluted however by the option of classic confinement).
Concrete Example: Installing emacs editor
Trying to run emacs at the command line, you find it is not installed. You may see:
sudo snap install emacs # version 30.2
- Strict confinement - abide by sandbox rules
- Classic confinement - liberal / "laissez-faire" (but needs explicit user approval on install)
Friday, 14 November 2025
Wayland for WinDevs Who May Not Know It
What is Wayland?
Wayland is intended as a replacement for the X Window System, with the aim of being easier to develop and maintain.
Specifically, Wayland is the protocol used to talk to the display server to render the UI and receive input from the user.
Wayland also refers to a system architecture (more below), which will give you an understanding of how the protocol is used to build a working UI.
Wayland versus X Architecture? Call it a "Simplified X".
In an X setup, X clients talk to an X server and the server talks to a compositor. The comms between server and compositor are bidirectional.
The X Server also talks to the kernel. There is a critical interface called evdev (short for event device) which is the input event interface in the Linux kernel.
In Wayland - the display server and the compositor are rolled into one. The architecture is thus simpler.
What is WSLg?
Google Colab
Tuesday, 11 November 2025
Addressing AI Misuse
Getting Jiggy with gpt-oss-20b (and why open weights matter)
LM Studio Setup
Disk Space in Windows 11
Type Storage Settings in the Search bar. This will lead you into Settings -> System -> Storage.
Monday, 10 November 2025
Python Wheels
Python wheels are pre-built binary packages for Python which make installation via pip faster and more efficient. Learn more here.
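A wheel's filename tells pip whether the file suits your interpreter and platform. The sketch below splits a filename into its fields, following the naming convention from the wheel specification ({distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl). The function name and the example filenames are made up for illustration, and the parsing is deliberately simplistic.

```python
def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel filename into its tags.

    Format: {distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file")
    parts = filename[:-len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected number of fields")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "distribution": parts[0],
        "version": parts[1],
        "build": parts[2] if len(parts) == 6 else None,  # optional build tag
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }

# Example filename made up for illustration: a binary wheel for CPython 3.11 on 64-bit Windows.
info = parse_wheel_filename("example_pkg-1.2.3-cp311-cp311-win_amd64.whl")
print(info)

# A pure-Python wheel carries generic tags instead.
pure = parse_wheel_filename("example_pkg-1.2.3-py3-none-any.whl")
print(pure["platform"])
```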
Why Python From Windows Store is Flawed
Installing Python from the Windows Store is flawed as everything goes into AppData\Local. This is a local directory associated with the logged in user C:\Users\<Name>\AppData\Local.
This is a way to sidestep a "proper installation" in C:\Program Files which requires administrator privileges. It's a way to overcome potential UAC restrictions.
Once installed - there is no proper way to uninstall. You need to get into the AppData directory (which is hidden in File Explorer). Once opened, you can navigate to Microsoft\WindowsStore to find python and related exe files (e.g. pip.exe). Then do a clean of the registry.
You may also find Python subdirectories in the WindowsStore directory.
While cleaning out registry entries you may come across references to .whl files (or Python Wheels).
Understand the Simple Power of Backpropagation but also the Dangers
So states Lex Fridman in his lecture on Recurrent Neural Networks (from the course on Deep Learning for Self-Driving Cars).
pip install tensorflow
This will install the current stable release for TensorFlow.
Friday, 7 November 2025
Control-Star - The Magic Key Combination in Word
Control-Star will make hidden characters (formatting marks) appear and disappear in Word.