Saturday, 29 November 2025

Boosting versus Bagging

Boosting and bagging are two classes of machine learning techniques.
Boosting is basically chaining mini-models, each one incrementally correcting the errors of the previous ones. Bagging is using ensemble/averaging techniques, i.e. running models in parallel (each trained on a bootstrap sample of the data) and computing some form of average over their predictions.
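
A minimal sketch of the contrast with scikit-learn (a hedged illustration on synthetic data; GradientBoostingClassifier builds trees sequentially, BaggingClassifier trains them in parallel on bootstrap samples):

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)

boost = GradientBoostingClassifier(random_state=0)  # sequential: each tree corrects the previous ensemble
bag = BaggingClassifier(random_state=0)             # parallel: trees on bootstrap samples, predictions averaged

print("boosting:", cross_val_score(boost, X, y, cv=5).mean())
print("bagging: ", cross_val_score(bag, X, y, cv=5).mean())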

Friday, 28 November 2025

ufunc in numpy - understanding universal functions

A ufunc operates on ndarrays. Here's the lowdown.

ufuncs operate on ndarrays in an element-by-element fashion. Several features are supported, such as array broadcasting and type casting.

Broadcasting is when a smaller array is spread over a bigger one. For example, a single number can be "broadcast" to every element of an array. Analogously, a 1D array can be "broadcast" across the rows and columns of a 2D array.
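
A quick NumPy illustration of both cases:

import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

print(a * 10)                    # scalar broadcast to every element
print(a + np.array([1, 0, 1]))   # 1D array broadcast across each row of the 2D array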

The idea of "broadcasting" predates NumPy and was first implemented in an array-focused scientific programming language called Yorick from the 1990s.

statsmodels in Python

statsmodels is a Python package that complements scipy for statistical computation. 

The stable version is found here.

statsmodels takes ideas from other libraries and ecosystems, specifically it uses R-style formulas, and pandas DataFrames.  

Chances are you are using the library with other libraries too, like numpy.

It can be installed via Conda or pip. Examples:

conda install -c conda-forge statsmodels
python -m pip install statsmodels

Among the tricks statsmodels can perform are: time series analysis, various flavours of regression (OLS, generalized and weighted least squares), as well as PCA.
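
A minimal OLS sketch showing the R-style formula interface with a pandas DataFrame (synthetic data; the column names x and y are just illustrative):

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(size=100)})
df["y"] = 2.0 * df["x"] + rng.normal(size=100)

model = smf.ols("y ~ x", data=df).fit()  # R-style formula, pandas DataFrame
print(model.summary())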

Thursday, 27 November 2025

Validating DataFrames in pandas

You may ingest a time series into a DataFrame in pandas. 

You may then need to access part of that DataFrame using something like df.iloc[0], where the aforementioned command gives the row at position 0.

However, what if the DataFrame has no rows? In that case df.iloc[0] raises an IndexError. There is a predicate df.empty you can use as follows:

if df.empty:
    print("DataFrame is empty")

DataFrames are like spreadsheets or SQL tables. They are the most commonly used data structures in pandas, and like a spreadsheet, columns don't need to be of the same type.

In Advance of Node.js Learning

Learn the rudiments of JavaScript, including asynchronous JavaScript.

Connoisseur's Guide to JavaScript Engines: V8 Rules

Node.js uses the V8 JavaScript engine which powers Google Chrome (and is open sourced by Google).

Other browsers use their own engine, for example Firefox uses SpiderMonkey and Safari uses JavaScriptCore (aka Nitro). Edge was based on Chakra (a Microsoft project that was open-sourced) before being rebuilt with V8 and Chromium.

What JavaScript is Not Allowed to Do in the Browser

JavaScript in the browser is not normally allowed to do too much for security reasons.  

Stuff it cannot do includes anything OS related - specifically:

  • Cannot read/write arbitrary files
  • Cannot access hardware directly
  • Cannot control processes
JavaScript can do certain things through Web APIs:
  • DOM manipulation (HTML/CSS)
  • Local/session storage
  • IndexedDB (database built into the browser)
  • Cookies (with same origin)
  • Clipboard (with user permission)
  • Geolocation, camera, microphone (with user consent)

Node.js relationship with Electron

Node.js is needed to "scaffold" an Electron project - the Node package manager (npm) is used to download Electron packages, install dependencies and generate starter files. 

Electron itself comes bundled with its own Node.js runtime. 

This serves as the "back end" of your Electron app. 

It manages windows, menus, system events and native OS integration. You can use Node modules directly in this environment e.g. fs, path and http.

The "renderer" process is Chromium. 

By default, this can also access Node.js APIs unless explicitly disabled for security reasons. So you effectively have a "contained" web app with OS access.

The main process (Node.js runtime) communicates with the renderer via IPC (inter-process communication).

This combined architecture of Node.js and Chromium enables applications to be written that run on Windows, macOS and Linux without requiring native UI development experience.

The Same Origin Policy (SOP) on Modern Web Browsers

The Same Origin Policy (SOP) is a browser-enforced security rule that prevents scripts from one "origin" (PDP -> protocol + domain + port) from accessing resources from another origin.
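
A toy Python sketch of the origin triple (illustrative only; this is not how browsers implement SOP):

from urllib.parse import urlsplit

def origin(url):
    # Origin = protocol (scheme) + domain (host) + port
    parts = urlsplit(url)
    default = {"http": 80, "https": 443}.get(parts.scheme)
    return (parts.scheme, parts.hostname, parts.port or default)

print(origin("https://example.com/a") == origin("https://example.com:443/b"))  # True: same origin
print(origin("https://example.com/a") == origin("http://example.com/a"))       # False: scheme differs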

The SOP prevents cookies, DOM and local storage from being read by malicious cross-site scripts.

The SOP does not just apply to web browsers. For example, Electron apps (desktop apps built with web tech) enforce SOP because they embed Chromium.

The Same Origin Policy is an "isolation model": it walls off each origin's resources so that one site cannot interfere with another.

Basics of Selenium

Selenium started out in 2004 at ThoughtWorks as a way to automate UI testing for a timesheet application in Python and Plone. Discussions were held on open sourcing this (internal) tool and Selenium was born.

Wednesday, 26 November 2025

Microsoft Edge has an optional WebDriver for Automation

Find out more here.

git remote add

The git remote add command is used to link your local Git repository to a remote repository (e.g. on GitHub, GitLab or BitBucket).

Syntax:
git remote add <name> <url>

name is the name (alias) you are giving to that remote. By convention, the main remote is called origin. Hence you will commonly see:

git remote add origin https://github.com/username/my-project.git

Note that my-project should be pre-created from within GitHub using New Repository.

Tuesday, 25 November 2025

pandas and DataFrames

pandas provides data structures and data analysis tools for Python.

The basic data structures in pandas are:

  • Series: one-dimensional (labelled) array holding data of any type e.g. integers, strings
  • DataFrame: a two-dimensional data structure, holding a 2d-array or table with rows and columns
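
A minimal illustration of both (note the DataFrame columns holding different types):

import pandas as pd

s = pd.Series([10, 20, 30], index=["a", "b", "c"])  # 1D labelled array
df = pd.DataFrame({
    "name": ["Ada", "Grace"],   # string column
    "born": [1815, 1906],       # integer column
})

print(s["b"])      # 20
print(df.dtypes)   # columns need not share a type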

Monday, 24 November 2025

Microsoft Launch Fara-7B: A CUA (Computer Use Agent) in SLM Form

And here we have it. Ready for action on Hugging Face, Sir.

Sandboxing and monitoring are recommended. The agent itself is a wrapper around Playwright.

Sunday, 23 November 2025

Hacking Transformers with Hugging Face

Knowledge here. But in short:

pip install huggingface_hub
pip install --upgrade huggingface_hub

To test the install, you can try the below:

 python -c "import huggingface_hub; print(huggingface_hub.__version__)"
 python3 -c "import huggingface_hub; print(huggingface_hub.__version__)"

pip install transformers tensorflow (if you are using TensorFlow; substitute torch if you are on PyTorch)
pip install transformers tensorflow datasets 

To test, bring up the Python CLI and do:

from transformers import AutoTokenizer
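
A minimal smoke test (distilbert-base-uncased is just an example model id here; any Hub checkpoint works):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
print(tokenizer("Hello, Hugging Face!"))  # input_ids and attention_mask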

The Runtime Formerly Known as TensorFlow Lite

LiteRT is Google's on-device runtime for machine learning, formerly known as TensorFlow Lite. 

You can convert TensorFlow, PyTorch and JAX models to the TFLite format. 

This can be done using AI Edge conversion tools.
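
For the TensorFlow case, a minimal conversion sketch using the tf.lite converter (the one-layer model is just a placeholder):

import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()  # serializes to the FlatBuffers-based format

with open("model.tflite", "wb") as f:
    f.write(tflite_model)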

LiteRT rises to various ODML (On-Device Machine Learning) challenges:

1. Connectivity - ability to execute without an Internet connection
2. Size - reduced model and binary size
3. Privacy/data restrictions - no personal data leaves the device
4. Power consumption - efficient inference and a lack of network connections

Operationally, LiteRT models use an efficient portable format known as FlatBuffers, and the .tflite file extension. (See here for the difference between FlatBuffers and protobuf).


Mastering BERT and DistilBERT

This is BERT.  

Introduced in 2018, it stands for "Bidirectional Encoder Representations from Transformers". 

The "bidirectional" component implies it use context to the left and right of critical words.

Its GLUE score is 80%.

This is DistilBERT, introduced in the context of edge computing.

In the paper, the authors also point out "We have made the trained weights available along with the training code in the Transformers library from HuggingFace".
 
It is worthwhile to study BERT as DistilBERT has the same general architecture as BERT.

IBM's Guide to Small Language Models

IBM have made a guide to SLMs.

Example SLMs listed include:

  • DistilBERT (a "distilled" form of Google's groundbreaking BERT model, hence the name, retaining 97% of BERT's NLU abilities)
  • Gemma
  • GPT-4o mini
  • Granite
  • Llama
  • Ministral
  • Phi

sbs_diasymreader.dll

This DLL is part of the .NET Framework.
  • sbs - side by side (Recall - allows multiple versions of a DLL to sit side-by-side without conflicts)
  • dia - Debug Interface Access, reference to SDK used to read debugging symbols (PDB files)
In Windows 11, it sits in C:\WINDOWS\Microsoft.NET\Framework along with other sbs_xxx DLLs.

Friday, 21 November 2025

Programming Realities - The Awkward Error Message

The awkward error message can stop you in your tracks. Keep pushing on. Discover. Remediate error. Every error is a massively valuable learning opportunity.

Thursday, 20 November 2025

Activation Functions in Neural Networks

The Concept of C# Scripting (.csx files)

The concept of C# scripting bears a resemblance to Jython in Java (more so than JavaScript does to Java).

Scripting commands are not allowed in regular C# programs.

Here's an example:

#r "nuget: Microsoft.SuperAdvancedKernel, 1.23.0"

It can be used in environments like .NET Interactive and Jupyter Notebooks with .NET. It came into being in Visual Studio 2015.

The file needs to be a .csx file.

#r is the reference directive, to reference an assembly or package.

Compiling C# in VS Code

For this you need the C# Dev Kit extension. It's Roslyn-powered. 

The code name Roslyn was first mentioned publicly by engineer Eric Lippert (the code was originally hosted on Microsoft's CodePlex before being moved to GitHub).

Dev Containers Extension in VS Code

What it is, What it does

The Dev Containers ("DC") extension is needed by Semantic Kernel in VS Code. It's worth expanding on its purpose here.

DC allows you to use a Docker container as a full-featured development environment (this is independent of how you deploy the thing).

More details here.

Dev Containers Dependencies

It requires Docker Desktop to be installed, which interacts with WSL2. If you don't have it, don't worry: VS Code will prompt you automatically to install it. After installation, you will see a status bar message "Setting up Dev Containers" followed by "Connecting to Dev Container (show log)".

Dev Container Configuration

This is located in semantic-kernel\.devcontainer\devcontainer.json.  This is similar to launch.json for debugging configurations. More info here.

Git on Windows

Git on Windows is good.

However there are a few options to select before you get this going.

  • add git to path (can use from cmd.exe and Powershell etc.)
  • use bundled OpenSSH (uses ssh.exe that comes with git) - alternative is to use an ssh.exe you install and add to your path
  • which SSL/TLS library should Git use for HTTPS connections? OpenSSL or the native Windows Secure Channel (choose the latter). With Secure Channel, server certificates are validated against the Windows Certificate Store, and you can use your company's internal Root CA certificates distributed e.g. via Active Directory Domain Services
  • Git Bash to use MinTTY for terminal emulation (better scrollback than cmd.exe)
  • Use Git Credential Manager or use none

Do You Understand Fully How This Works

I think you need to. Attribution: one software engineer to another.

Mastering git clone

You will want to use git clone to clone a repository and see its contents. This is similar to an svn checkout, or svn co.

So what does git clone actually do?

git clone clones a repository into a newly created directory, plus a lot more - read and summarise this later. Pay close attention to the notion of "remote-tracking branches".

There is a -l or --local option for git clone for cloning a repository that lives on the local machine (where possible, Git hardlinks the objects rather than copying them).

The Unstable Book for Rust

Rust has The Unstable Book to cover unstable features of Rust. When using an unstable feature of Rust, you need to use a special flag or rather a special attribute:  #![feature(...)].

Unstable features in Rust refer to specific capabilities (language or library) as yet unstabilized for general use.  You can access these on the nightly compiler (not in stable or beta channels).

Unstable features may be experimental, incomplete or subject to change.

They are in the language as a means of balancing innovation with stability. Developers get access to new features, but essentially on a trial basis. While a feature is classed as unstable, the Rust team can refine its design, fix edge cases, or abandon it if problematic.

cargo new hello_cargo

For details on how to use cargo new, type cargo new --help at the command line.

cargo new hello_cargo

creates a new directory for your project, a .git subdirectory, a src directory for your source code, and a Cargo.toml file.

TOML, or Tom's Obvious, Minimal Language, is Cargo's configuration format. (TOML brands itself as "A Config File Format for Humans" and is nice, simple and neat.)

It has a [package] section which configures package settings, and a [dependencies] section for any of the crates required for the package to run (crates are Rust packages).
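
A minimal Cargo.toml for illustration (the rand dependency is hypothetical, included only to show the section):

[package]
name = "hello_cargo"
version = "0.1.0"
edition = "2021"

[dependencies]
rand = "0.8"   # hypothetical crate dependency, for illustration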

cargo init --help

for help integrating any Rust code developed outside of Cargo.

Elliptic Curves Assert Presence on Linux

Windows folk ("WinJoes") may be surprised to see references to elliptic curves appearing in their Linux VMs during day-to-day ops.

These appearances often reference EdDSA (e.g. in messages like "using EDDSA key 0327BE68.....").

EdDSA refers to the Edwards-curve Digital Signature Algorithm (EdDSA) in public key cryptography.

Edwards curves are an artefact of algebraic geometry and are named after the American mathematician Harold Edwards. The specific form of Edwards curves used in the algorithm are known as twisted Edwards curves, where the "twist" comes from a non-unitary coefficient (from some field F) in the curve equation. The twisted Edwards curve equation is an interesting equation; the first question when you see it is how anyone came up with it.
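
For reference, the twisted Edwards curve equation over a field F is:

\[ a x^2 + y^2 = 1 + d x^2 y^2, \qquad a, d \in F \setminus \{0\},\ a \neq d \]

Setting a = 1 recovers the original (untwisted) Edwards curve; Ed25519 uses a = -1.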

source command in Linux shell

The source command in the Linux shell sources (loads) a script into the current shell. Its short form is . (a single dot).

emacsx - An Emacs Launcher for .bashrc

Here is a cool emacs launcher for your .bashrc file if you like reverse video.

# Launch Emacs in full screen, optionally with a file

emacsx() {
  if [ "$#" -eq 0 ]; then
    command emacs -rv --fullscreen
  else
    command emacs -rv --fullscreen "$@"
  fi
}

$# in bash is a special parameter that represents the number of positional parameters passed to a script or function.

Cargo for Rustaceans

A simple Rust program with no dependencies can be compiled with good old rustc. But for Rustaceans who want more - more dependencies, more complexity - cargo is the tool to use. Read hello cargo here.

Wednesday, 19 November 2025

The DevOps HUD - GoLand

DevOps brothers and sisters. Discover GoLand.

What is the "tensor" in TensorFlow?

In TensorFlow, a tensor is a multidimensional array used to represent data in a machine learning model. It generalizes scalars (0D-array), vectors (1D-array) and matrices (2D-array) to higher dimensions.
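
A quick illustration of ranks:

import tensorflow as tf

scalar = tf.constant(3.0)             # rank 0 (0D)
vector = tf.constant([1.0, 2.0])      # rank 1 (1D)
matrix = tf.constant([[1.0], [2.0]])  # rank 2 (2D)

print(scalar.shape, vector.shape, matrix.shape)  # () (2,) (2, 1)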

But don't be fooled. A true mathematical tensor has much more going on behind the scenes: it is a multilinear map with specified transformation rules. 

A TensorFlow tensor is a looser construct than that.

Describing Einstein's general relativity mathematically involves tensors of the mathematical variety.

Deeper Look at Sequential Model Building in TensorFlow

Let's revisit the model building command in TensorFlow in our "hello world" equivalent example.

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])

So Sequential lets you build up a model in layers (see Layer class, or tf.keras.Layer, that inherits from Operation). 

Layers are callable objects. In Python, a callable is any object that can be called using parentheses (optionally with arguments). Read the implementation here (in keras/source/layers/layer.py).

But what does the Flatten method/Layer do? It simply reshapes each 28x28 input image into a flat vector of 784 values, with no parameters to learn.

Dense creates a densely-connected NN layer (the classic fully-connected layer). 
  • The first argument is the positive integer units, representing the dimensionality of the output space
  • The second argument is the activation function to use (if this is missing, no activation is applied which is actually linear activation a(x) = x)
Activation functions in a neural net introduce non-linearity into the network.  
  • Essentially these functions work with neurons and transform neural computations into output signals
  • ReLU (rectified linear unit) is one of the most widely used activation functions in neural networks (f(x) = max(0,x)).

Train your First Neural Network on the MNIST dataset

You can train your first neural network on the MNIST dataset (used for image recognition models). The MNIST example is tantamount to a "hello world" of machine learning programs.

Key features:
  • Use Keras API as a "portal" into TensorFlow library to build the neural network
  • Use "Sequential" model - allows you to add "layers" sequentially
  • "Feed" the model the training data (creating the model takes a bit of study/effort)
  • Model gives back a vector of "logits" or "log-odds" scores, one per class
  • Run softmax to convert these scores to probabilities
  • Compile the model - with an optimizer and a loss function, configure for 'accuracy'
  • model.fit
  • model.evaluate to see how the model performed (was it a good fit to the data)
Doing this example immediately raises a billion questions! Answering these questions will help you in future machine learning projects with TensorFlow. So get your answers now!
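
To make the steps above concrete, here is a minimal end-to-end sketch (it mirrors the standard TF/Keras quickstart pattern; the epoch count is arbitrary):

import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # normalize 0-255 pixel values to 0-1

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10),  # logits, one per digit class
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)      # "feed" the model the training data
model.evaluate(x_test, y_test, verbose=2)  # was it a good fit?

probs = tf.nn.softmax(model(x_test[:1]))   # logits -> probabilities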

Some numbers to remember in this "post game analysis" are 0 to 255 and 28x28.

All About the Data - the MNIST Dataset & (Numpy-friendly) Data Format 

The MNIST dataset consists of 60,000 training images of handwritten digits and 10,000 test images, each a size of 28x28 pixels. Images are grayscale and numbers are 0-9. The data set is vectorized and in numpy format. Each pixel has an encoding of 0 to 255 (typical for grayscale images) where the number represents brightness, 0 is black and 255 is white.

The Data Set Loading Process (Involves Normalization)

So MNIST is one of the built-in datasets in Keras. 

The first step is to normalize the data by dividing each pixel value (in the training and testing data sets) by PIXEL_MAX=255, which creates a value between 0 and 1 (inclusive) and converts the integer values into floating point values.

Model.fit - In depth

How does this work from a function-calling perspective?

How do I see how good this model is visually?

This requires some additional programming.
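
One approach: model.fit returns a History object whose history dict can be plotted with matplotlib. A minimal sketch, assuming a compiled model and the MNIST arrays from the example above:

import matplotlib.pyplot as plt

history = model.fit(x_train, y_train, epochs=5, validation_split=0.1)

plt.plot(history.history["accuracy"], label="train accuracy")
plt.plot(history.history["val_accuracy"], label="validation accuracy")
plt.xlabel("epoch")
plt.ylabel("accuracy")
plt.legend()
plt.show()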

I had to bring pip into my Ubuntu Linux VM

sudo apt install python3-pip

Even with Python installed, a whole host of packages are needed for pip. These include (non-exhaustively):

  • build-essential
  • bzip2
  • cpp
  • cpp-11
  • fakeroot
  • ...
  • zlib1g-dev

Just to mention a few.

build-essential is a Debian specific package, consisting of a bunch of useful build tools, to build software from source and create Debian packages.

Tuesday, 18 November 2025

Rustup

What rustup Is and What It Does for Rust

Rust is installed and managed with the rustup tool. There are some concepts underlying how the tool works, including its role as a toolchain multiplexer.

rustup's Role as a Toolchain Multiplexer

A toolchain represents a full installation of a Rust compiler (rustc) and related tools (like cargo).  Rustup enables switching between multiple versions of these tools which can be selected dynamically.
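
For example (standard rustup commands):

rustup toolchain install nightly   # add a second toolchain
rustup default stable              # set the global default
rustup override set nightly        # use nightly in the current directory only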

Analogies in the Wider Programming Pantheon

rustup is similar to pyenv.

Microsoft's Big Bet on Rust

Microsoft is writing more code in Rust (a memory safe "cousin" of C++).  It is also investing time in developing the language ecosystem in its role as a founding member of the Rust Foundation.

This is due to 1) performance 2) memory safety and 3) developer mindshare.

Mark Russinovich has called for halting new code in C and C++ and using Rust 

"for those scenarios where a non-GC language is required".

Linus has also confirmed Version 6.1 of the Linux kernel will use Rust.

Rust was created by Graydon Hoare in 2006 while working at Mozilla as a personal side project to solve memory safety and concurrency issues in systems languages like C and C++.

Other companies that have embraced Rust include Cloudflare who coded up Pingora in Rust to overcome limitations in nginx.

Will two git commits ever have the same id?

This is highly unlikely due to the design of the commit algorithm. Digging into this, the commit id is a 40-character hexadecimal number produced by a cryptographic hash function (SHA-1, or SHA-256 in newer versions), giving 2^(4*40) = 2^160 possible hashes; the chance of a given pair of commits colliding is about 1 in 1.46 x 10^48.

Metadata that's fed into the cryptographic hash function includes the snapshot of the project tree (folders, files and contents), the commit message, the parent commit id(s), author information and a timestamp (time elapsed in seconds since Jan 1, 1970).
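
A toy sketch of the hashing scheme in Python (Git hashes an object header plus the content; shown here for a blob, the simplest object type - the same idea applies to commit objects):

import hashlib

def git_object_id(obj_type: str, content: bytes) -> str:
    # Git hashes "<type> <size>\0" followed by the object's content
    header = f"{obj_type} {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()

print(git_object_id("blob", b"hello\n"))  # should match `git hash-object` for the same content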

TCPL Still Rules the Linux Roost

The programming language C (born in the 1970s, created by Dennis Ritchie, who also created its predecessor, B) still rules the roost as far as the Linux kernel is concerned: of its roughly 37 million lines of code, just over 35 million are written in C, per analysis from OpenHub.

Linux is an open-source Unix-like system. Unix was written in C, and prior to that in assembler. The introduction of the programming language C made the code portable to different hardware.

Ken Thompson and Dennis Ritchie and their colleagues at Bell Labs (AT&T) were the co-creators of the Unix Operating System, created in 1969.   Douglas McIlroy introduced the Unix philosophy of small, composable tools. Brian Kernighan helped to popularise UNIX and C through co-authoring The C Programming Language with Dennis Ritchie.

git add and git commit

The Mechanics of Commits 

You can add a new file to your repo by doing:

git add <mynewfile>
git status (this will show changes staged for commit)
git commit -m "a little comment, if you please" <mynewfile>

Once you commit, you should get a message like:

[master (root-commit) strange_hexadecimal_code] comment from -m command line flag
 "1 file changed, X insertions(+)" 
create mode 100644 filename.xx

 If you didn't do a git add, it will say "nothing to commit, working tree clean".
The 100644 is Unix style permissions - 100 means regular file, 644 means read/write for owner and read-only for group and others.

Committing to Git is a Two-Step Process

Stating the obvious here, but as stated above, you need to git add before you git commit for new files.

The Semantics of Commits

When we commit, Git invokes an algorithm to create a "commit object", a binary object which it stores inside the .git folder.  A unique identifier is used to "stamp" the commit. Part of the "stamp" is shown in the git commit response message - and is computed from a bunch of metadata including the committer's name, time of commit, commit message and information on the change.

The Index and the Object Database

When you do a git add, a copy is made of the file and put into the index. The index is like a "waiting room" where we put in our "candidate" objects until we are ready to commit them.  git commit takes the objects in the waiting room and puts them into the object database.

Allowing wsl.localhost in list of allowed hosts

When navigating to a directory in WSL from VS Code you may get the message:

The host 'wsl.localhost' was not found in the list of allowed hosts. Do you want to allow it anyway?

You will get the opportunity to hit Allow (because you trust the host, it is after all, your WSL installation) together with the option to flag: "Permanently allow host 'wsl.localhost'".

Accessing Linux Directories in WSL From Explorer

Directories in WSL can be accessed by navigating to your distribution and filesystem after opening \\wsl$ in File Explorer.

Telling git who you are with git config

To get your commits attributed correctly you need to let Git know who you are. This means sharing the name and email address you want to use for Git. Here is the way:

git config --global user.name "Your Name"
git config --global user.email "youremail@provider.com"

Then you can type:

git config -l

to see if Git remembered your changes. The user settings will remain even if you then delete the project (project-specific settings will disappear).

Creating a git repository: git init

What git init does

Run git init in the folder where your code will be based (this assumes you have already created an appropriately named folder and have cd'd into it).

This creates a .git subdirectory.

What does the .git subdirectory hold?

.git is filled with configuration files and a subfolder for snapshots to be stored. All commits are stored in .git as well.

Name of initial branch

This may change in successive versions of Git, but as of 2.34, the name of the initial branch created by git init is 'master' (other commonly used names include 'main' and 'trunk' and 'development').

Re-running git init

Re-running git init will not do anything. It will simply output "Reinitialized existing Git repository in /home/YOURLOGIN/YOURPROJECT/.git/".

Re-running git init in the .git subdirectory

This is weird and will create a new .git subdirectory inside the .git subdirectory, so if you do this, you need to clean up your .git. There is no valid use case for doing this.

Renaming a branch in git

The git branch command with the -m flag can be used to rename a branch: git branch -m <new-name>.

Linux VM Skills - The ls command

ls -a and ls -A will both list hidden files.

ls -A (capital A) omits the . and .. entries (i.e. the current and parent directories) from the listing.

A Head-First Guide to Git

A head-first guide to Git is available on O'Reilly. Head First Git also has an accompanying website.

Saturday, 15 November 2025

Canonical Snaps and Approach to Linux Packaging

The Concept of Snaps and Why It's Useful

When running wsl in Windows (with Ubuntu) you will eventually come across the concept of snaps. 

Snaps are a package management feature that offer an alternative to the usual sudo apt-get (or sudo apt install, which is a wrapper over apt-get).

Snaps were developed by Canonical for security (via sandboxing, or in Canonical language "confinement") and convenience (an "all-in-one" snap removes the need to download and install individual dependencies).

The security side aims to guarantee safe execution of software by mandating packages abide by the principle of least privilege (this is diluted however by the option of classic confinement).

Concrete Example: Installing the emacs Editor

Trying to run emacs at the command line, you find it is not installed. You may see:

Command 'emacs' not found, but can be installed with
sudo snap install emacs # version 30.2

This is achieved by placing the package in a sandbox with snapd mediating all access to host system resources. 
The snap's confinement level controls the degree of isolation from the user's system.
  • Strict confinement - abide by sandbox rules
  • Classic confinement - liberal / "laissez-faire" (but needs explicit user approval on install)

Searching for Pre-Created Snaps using Canonical's Search Engine

There is a search engine for Snaps on Canonical's website. Canonical are calling it the "app store" for Linux.

Friday, 14 November 2025

Wayland for WinDevs Who May Not Know It

What is Wayland?

Wayland is intended as a replacement for the X Window system, with the aim of being easier to develop and maintain.

Specifically, Wayland is the protocol used to talk to the display server to render the UI and receive input from the user.  

Wayland also refers to a system architecture (more below), which will give you an understanding of how the protocol is used to build a working UI.

Wayland versus X Architecture?  Call it a "Simplified X".

In an X setup, X clients talk to an X server and the server talks to a compositor. The comms between server and compositor are bidirectional.

The X Server also talks to the kernel. There is a critical interface called evdev (short for event device) which is the input event interface in the Linux kernel. 

In Wayland - the display server and the compositor are rolled into one. The architecture is thus simpler.

What is WSLg?

WSLg is the Windows Subsystem for Linux GUI, which enables running Linux GUI applications (X11 and Wayland) on Windows.

Google Colab

Google Colaboratory ("Colab") is a hosted Jupyter notebook which includes free access to GPUs and TPUs. It is for machine learning, data science and education.

For cool datasets to explore ML with, check out Google Dataset Search.

There is an interesting Colab workbook by Ashwin Rao on the SVB crisis.

Colab supports a large number of constantly upgraded Python packages including kagglehub (to use Kaggle resources) and narwhals (dataframe library).

Tuesday, 11 November 2025

Addressing AI Misuse

OpenAI has a Preparedness Framework aimed at addressing AI misuse.

The domain of cybersecurity features prominently here, since AI can be used to enhance security, but equally make it easier to scale up cyberattacks.

Getting Jiggy with gpt-oss-20b (and why open weights matter)

gpt-oss-20b is an open weight language model. The so-called "open weights" are the released parameter values the model learned during (pre-)training.

The model is a significant 12GB download.

LM Studio Setup

LM Studio has two setup options:
1. For anyone who uses the computer
2. Only for the currently logged in user
Both have advantages, but if you are using LM Studio to build custom LLMs tailored to you, you may want option 2, despite the (potential) convenience of option 1, where LM Studio is "universally" available to all users of the machine.

Disk Space in Windows 11

Type Storage Settings in the Search bar. This will lead you into Settings -> System -> Storage.

Monday, 10 November 2025

Python Wheels

Python wheels are pre-built binary packages for Python which make installation via pip faster and more efficient. Learn more here.

Why Python From Windows Store is Flawed

Installing Python from the Windows Store is flawed as everything goes into AppData\Local.  This is a local directory associated with the logged in user C:\Users\<Name>\AppData\Local.

This sidesteps a "proper installation" in C:\Program Files, which requires administrator privileges; it's a way to overcome potential UAC restrictions.

Once installed - there is no proper way to uninstall. You need to get into the AppData directory (which is hidden in File Explorer). Once opened, you can navigate to Microsoft\WindowsStore to find python and related exe files (e.g. pip.exe).  Then do a clean of the registry.

You may also find Python subdirectories in the WindowsStore directory.

While cleaning out registry entries you may come across references to .whl files (or Python Wheels).

Understand the Simple Power of Backpropagation but also the Dangers

So states Lex Fridman in his lecture on Recurrent Neural Networks (from the course on Deep Learning for Self-Driving Cars).

pip install tensorflow

This will install the current stable release for TensorFlow.

Friday, 7 November 2025

Control-Star - The Magic Key Combination in Word

Control-Star (Ctrl+Shift+8) will make hidden characters appear and disappear in Word.

Word AI is Not Intelligent

Macros are better. Even with feedback, Word AI is not good. It cannot be classified as "intelligent".

Wednesday, 5 November 2025

Azure Kubernetes Service - Rationale for Using AKS for .NET Applications

Why use the Azure Kubernetes Service (AKS) for your .NET applications (rather than just deploy them to virtual machines)?

First, there are scalability benefits, which are ideal for microservices.

Kubernetes has "canned processes" for scaling.

Kubernetes has what's called Horizontal Pod Autoscaling, which means that when load increases, the system deploys more pods. It doesn't change the dynamics of the pods themselves, i.e. it doesn't allocate more memory or CPU to them; it just makes more pods.

Exciting Thing about AI

The exciting thing about AI is accelerated learning - and there is no field where the impact is more immediate than computer programming. But the edge is the human learning that is accelerated - not so much the automatic code generation and prototype generation, but the ability to come to quick conclusions around technology choices, implementation choices and optimised solution building.

Technology Knowledge

Technology knowledge always needs refreshing.

Software is changing all the time.
Hardware is advancing - making new things possible (new software).
Models of software hosting are changing as well (centralised models - decentralised - centralised in cloud).

Patterns repeat themselves, but with variations.

That's why you need to know the past and the present. Keeping up with Technology change is a full time job and that's why you need people in a team, always updating themselves.

Technology is a knowledge business.

BizTalk and Azure Integration Services

BizTalk was pitched as an application server and an application integration server.

It is designed to be an integrator of different software systems, and automate business processes (you may have a business process that touches on multiple IT systems - so this solution makes perfect sense - deploy a middleware to manage all the interaction).

BizTalk has/had a bunch of adapters including file adapters, FTP, HTTP, SOAP, SMTP, WCF, MSMQ, MQSeries (now IBM MQ) etc.

It's actually a brilliant product name because it captures the concept of what it does so well.

The replacement for BizTalk is Azure Integration Services.

It is interesting to consider the relationship between "AIS" (as defined above in the Azure context) and "RPA" (or "Robotic Process Automation"). The latter focuses on streamlining human-system interactions, and the former focuses on system-to-system interactions.