Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Page Not Found

Page not found. Your pixels are in another canvas.

TIM-360M: Transformer Inference Model

A 360M-parameter small language model, trained end-to-end on consumer-grade hardware.

Jupyter notebook markdown generator

Posts

Blog Post number 4

Published: August 14, 2015

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

Published: August 14, 2014

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

Published: August 14, 2013

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

Published: August 14, 2012

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

patents

Method of Utility Usage Analysis

Published: July 14, 2026

A method and apparatus for analysing utility consumption at a utility supply location is described. The method comprises the steps of: receiving utility consumption data corresponding to utility consumption at the utility supply location over a time period to be analysed; generating a recurring consumption model indicative of repeating consumption patterns in the utility consumption data; identifying divergences between the utility consumption data and the recurring consumption model; computing a diagnostic measure indicative of irregular consumption based on the identified divergences; and outputting the diagnostic measure. The diagnostic measure may be used to identify flexibility or irregularities in consumption and/or to control supply of the utility. The utility may be e.g. electricity, gas or water.
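As a rough illustration of the steps in this abstract, the sketch below builds a recurring consumption model, measures divergences from it, and reduces them to a single diagnostic value. The per-slot median profile, the normalisation by total consumption, and the function names are all assumptions made for illustration; the patent abstract does not specify them.

```python
from statistics import median

def recurring_profile(readings, period):
    """Median reading for each slot of a repeating cycle (e.g. 24 hourly slots)."""
    if len(readings) < period:
        raise ValueError("need at least one full cycle of readings")
    slots = [[] for _ in range(period)]
    for i, value in enumerate(readings):
        slots[i % period].append(value)
    return [median(slot) for slot in slots]

def diagnostic_measure(readings, period):
    """Share of total consumption that diverges from the recurring profile.

    Assumed definition for illustration only: sum of absolute divergences
    divided by total consumption.
    """
    profile = recurring_profile(readings, period)
    divergences = [abs(v - profile[i % period]) for i, v in enumerate(readings)]
    total = sum(readings)
    return sum(divergences) / total if total else 0.0

# Five identical cycles, then the same data with one irregular spike.
regular = [1.0, 2.0, 3.0, 4.0] * 5
irregular = regular.copy()
irregular[6] = 9.0

print(diagnostic_measure(regular, period=4))
print(diagnostic_measure(irregular, period=4))
```

A perfectly repeating series scores zero, and the score rises as more consumption falls outside the repeating pattern.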

Determining Operating State from Complex Sensor Data

Published: July 14, 2026

A method of detecting an operating state of a process, system or machine based on sensor signals from a plurality of sensors is disclosed. The method comprises receiving sensor data, the sensor data based on sensor signals from the plurality of sensors and providing the sensor data as input to a neural network. The neural network comprises an encoder sub-network arranged to receive the sensor data as input and to generate a context vector based on the sensor data; and a decoder sub-network arranged to receive the context vector as input and to regenerate sensor data corresponding to at least a subset of the sensors based on the context vector. The method comprises comparing the context vector to at least one context vector classification; detecting an operating state in dependence on the comparison; and outputting a notification indicating the detected operating state.
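The comparison step can be pictured with a small sketch. It assumes the encoder has already produced a context vector, and that each context vector classification is represented by a single reference vector compared by Euclidean distance. Both are illustrative assumptions, as are the labels and function names; the abstract does not say how the comparison is made.

```python
import math

def nearest_state(context_vector, state_references):
    """Return the label whose reference vector is closest to the context vector."""
    return min(
        state_references,
        key=lambda label: math.dist(context_vector, state_references[label]),
    )

def notify(state):
    """Stand-in for the notification step: format a message for the detected state."""
    return f"Detected operating state: {state}"

# Hypothetical reference vectors for two operating states.
references = {"normal": [0.0, 0.0], "surge": [5.0, 5.0]}

# Hypothetical context vector, as if produced by the encoder sub-network.
context = [4.2, 5.1]

print(notify(nearest_state(context, references)))
```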

Method and system for detecting anomalies in energy consumption

Published: July 14, 2026

A method of detecting conditions indicative of energy meter tampering, meter faults or energy loss (e.g. due to energy theft) is disclosed. The method includes receiving energy consumption data from an energy meter indicating consumption of energy at a location served by the energy meter. Event data is also received from the meter comprising one or more events generated by the energy meter. The consumption data is analysed to detect a predetermined consumption condition. The event data is analysed to detect a predetermined event or event pattern in the event data. An alert condition is generated in response to detecting both the consumption condition and the event or event pattern.
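The key idea is that an alert is raised only when the consumption condition and the event condition are both detected. A minimal sketch of that conjunction follows; the specific consumption condition (a drop below half of the preceding week's mean), the event code `COVER_OPEN`, and all names are hypothetical choices for illustration.

```python
from dataclasses import dataclass

@dataclass
class MeterEvent:
    day: int
    code: str

def consumption_condition(daily_kwh, drop_ratio=0.5, baseline_days=7):
    """True if the latest day is below drop_ratio times the mean of the preceding baseline window."""
    if len(daily_kwh) <= baseline_days:
        return False
    baseline = daily_kwh[-baseline_days - 1:-1]
    return daily_kwh[-1] < drop_ratio * (sum(baseline) / len(baseline))

def event_condition(events, watched_codes=frozenset({"COVER_OPEN"})):
    """True if any meter event carries one of the watched (hypothetical) codes."""
    return any(event.code in watched_codes for event in events)

def alert(daily_kwh, events):
    """Alert only when both the consumption condition and the event condition hold."""
    return consumption_condition(daily_kwh) and event_condition(events)

usage = [10.0] * 7 + [3.0]
print(alert(usage, [MeterEvent(day=7, code="COVER_OPEN")]))
print(alert(usage, []))
```

Requiring both signals is what separates this from a simple consumption threshold: a drop in usage alone, or a meter event alone, does not trigger the alert.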

portfolio

publications

Recurrent Auto-Encoder Model for Large-Scale Industrial Sensor Signal Analysis

Published in 19th International Conference on Engineering Applications of Neural Networks (EANN 2018), 2018

A recurrent auto-encoder model summarises sequential data through an encoder structure into a fixed-length vector and then reconstructs the original sequence through the decoder structure. The summarised vector can be used to represent time series features. In this paper, we propose relaxing the dimensionality of the decoder output so that it performs partial reconstruction. The fixed-length vector therefore represents features in the selected dimensions only. In addition, we propose using a rolling fixed-window approach to generate training samples from unbounded time series data. The change of time series features over time can be summarised as a smooth trajectory path. The fixed-length vectors are further analysed using additional visualisation and unsupervised clustering techniques. The proposed method can be applied in large-scale industrial processes for sensor signal analysis, where clusters of the vector representations can reflect the operating states of the industrial system.

Recommended citation: Wong T., Luo Z. (2018) Recurrent Auto-Encoder Model for Large-Scale Industrial Sensor Signal Analysis. In: Pimenidis E., Jayne C. (eds) Engineering Applications of Neural Networks. EANN 2018. Communications in Computer and Information Science, vol 893. Springer, Cham https://link.springer.com/chapter/10.1007/978-3-319-98204-5_17
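The rolling fixed-window sampling and the partial-reconstruction target described in the abstract can be sketched as follows. This is a plain-Python illustration under assumed conventions (each time step is a list of sensor values; `window`, `step` and `dims` are hypothetical parameter names), not the code used in the paper.

```python
def rolling_windows(series, window, step=1):
    """Yield fixed-length windows of consecutive time steps from a multivariate series."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    for start in range(0, len(series) - window + 1, step):
        yield series[start:start + window]

def select_dimensions(sample, dims):
    """Keep only the chosen sensor dimensions, as a partial-reconstruction target."""
    return [[row[d] for d in dims] for row in sample]

# Six time steps of three hypothetical sensors.
series = [[t, 10 * t, 100 * t] for t in range(6)]

# Encoder inputs: every window of four consecutive time steps.
samples = list(rolling_windows(series, window=4, step=1))

# Decoder targets: the same windows restricted to sensors 0 and 2.
targets = [select_dimensions(sample, dims=[0, 2]) for sample in samples]

print(len(samples), targets[0][3])
```

Because consecutive windows overlap heavily, the vectors encoded from them change gradually, which is consistent with the smooth trajectory the abstract describes.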

Mobile Coverage Analysis using Crowdsourced Geolocation Data

Published in arXiv pre-print (cs.AI), 2025

Effective assessment of mobile network coverage and the precise identification of service weak spots are paramount for network operators striving to enhance user Quality of Experience (QoE). This paper presents a novel framework for mobile coverage and weak spot analysis utilising crowdsourced QoE data. The core of our methodology involves coverage analysis at the individual cell (antenna) level, subsequently aggregated to the site level, using empirical geolocation data. A key contribution of this research is the application of the One-Class Support Vector Machine (OC-SVM) algorithm for calculating mobile network coverage. This approach models the decision hyperplane as the effective coverage contour, facilitating robust calculation of coverage areas for individual cells and entire sites. The same methodology is extended to analyse crowdsourced service loss reports, thereby identifying and quantifying geographically localised weak spots. Our findings demonstrate the efficacy of this novel framework in accurately mapping mobile coverage and, crucially, in highlighting granular areas of signal deficiency, particularly within complex urban environments.

Recommended citation: Wong, T., Freeman, T., & Feehily, J. (2025). Mobile Coverage Analysis using Crowdsourced Data. arXiv [Cs.AI]. Retrieved from http://arxiv.org/abs/2510.13459 https://arxiv.org/abs/2510.13459

talks

Multi-seasonal Time Series Modelling using Recurrent Neural Nets

Published: September 14, 2016

The ability to foresee what’s about to happen is crucial to the success of energy companies like Centrica. For instance, predicting the number of boiler breakdowns on any given day allows us to ensure that a sufficient number of gas engineers are staffed.

Parallelised Time Series Spike Detection using R on the Hadoop Platform

Published: July 04, 2017

Smart meters record a continuous stream of electricity consumption for each and every supply point across the United Kingdom. Energy suppliers are interested in understanding customers’ consumption patterns in order to provide a better service for them. FlexiScore (F) is a new concept which British Gas has developed. It is a single numeric value ranging between 0 and 1 which quantifies the amount of flexible energy load for each electricity supply point. A high F value suggests the presence of erratic spikes, while a low F value indicates prolonged consistency and non-spiky behaviour. The algorithm has been productionised on the Hadoop platform (on premise) using Microsoft R Server 8.0 as a fully scalable analytics framework. The large-scale distributed process contains an array of Markov Chain Monte Carlo (MCMC) processes for missing data permutation. A layer of Fourier transformation has been applied to create a seasonal time series model. Afterwards, simple heuristics are applied to isolate erratic consumption spikes. The F score is then computed as output alongside other descriptive statistics.

Analysing High-Frequency Industrial Component Failure using Text Mining Techniques

Published: September 13, 2017

Centrica plc is an energy service company and its Exploration and Production (E&P) division currently operates several gas production assets across the world. A large-scale production asset usually contains thousands of components which require regular inspection and maintenance. Understanding the pattern of component failure is key to managing large-scale assets successfully.

Text Mining for Preventative Maintenance

Published: December 15, 2017

Large-scale industrial processes normally comprise thousands of individual components which are vulnerable to breakdown. Maintenance of these components is key to reducing unplanned outages. The repair log dataset contains unstructured, free-format text descriptions detailing the issues. We applied text mining algorithms to this dataset and turned it into an analysable format. A combination of techniques was used, including the tf-idf scheme and an n-gram approach. Groups of vulnerable components can be visualised as a graph network.
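To make the tf-idf and n-gram combination concrete, here is a minimal pure-Python sketch. It uses one common tf-idf variant (relative term frequency times log(N / document frequency)) over unigrams and bigrams; the repair-log lines are invented examples, and nothing here is taken from the talk's actual pipeline.

```python
import math
from collections import Counter

def ngrams(text, n_max=2):
    """Lower-cased word n-grams of length 1..n_max, joined by spaces."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]

def tfidf(docs, n_max=2):
    """Return one {term: weight} dict per document."""
    term_lists = [ngrams(doc, n_max) for doc in docs]
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for terms in term_lists for term in set(terms))
    n_docs = len(docs)
    weights = []
    for terms in term_lists:
        counts = Counter(terms)
        weights.append({
            term: (count / len(terms)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Invented free-text repair log entries.
logs = ["pump seal leaking", "valve seal leaking", "pump bearing noisy"]
weights = tfidf(logs)

# The highest-weighted term of the first entry is the bigram unique to it.
print(max(weights[0], key=weights[0].get))
```

Bigrams such as "pump seal" tie a failure mode to a component, which single words cannot do on their own.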

Deep Neural Network Training and Applications

Published: January 29, 2018

Deep learning models can be used to extract representations for multidimensional time series data. We have used a sensor dataset collected from an industrial-scale compressor unit to illustrate this problem. Real-valued sensor signals were treated as multidimensional time series and fed through a recurrent auto-encoder model. The extracted representations can be projected to a low-dimensional space and reflect the temporal behaviour of the underlying time series. Specific signals can be isolated for detailed analysis using partial reconstruction of the original input.

Generalised Additive Model for Gas Boiler Breakdown Demand Prediction

Published: May 15, 2018

At British Gas, we operate a service and repair business with more than 6,000 qualified engineers ready to serve customers who are urgently in need across the country. Predicting demand accurately ahead of time allows us to schedule the workforce optimally. Additional workforce can be scheduled if demand is forecast to increase substantially. We have developed a prototype demand forecasting procedure which uses a mixture of machine learning techniques. The key component uses a Generalised Additive Model (GAM) to estimate the number of incoming work requests. It takes into account the non-linear effects of multiple predictor variables. The models were trained at patch level in order to capture local behaviour. Planning operators can then use the model output to fine-tune workforce assignment at the local level to meet changing demand.

Modelling Field Operation Capacity using Generalised Additive Model and Random Forest

Published: July 11, 2018

In any customer-facing business, accurately predicting demand ahead of time is of paramount importance. Workforce capacity can be flexibly scheduled at local area level accordingly. In this way, we can ensure that there is sufficient workforce to meet volatile demand. In this case study, we focus on the gas boiler repair field operation in the UK. We have developed a prototype capacity forecasting procedure which uses a mixture of machine learning techniques to achieve its goal. Firstly, it uses a Generalised Additive Model approach to estimate the number of incoming work requests. It takes into account the non-linear effects of multiple predictor variables. The next stage uses a large random forest to estimate the expected number of appointments for each work request by feeding in various ordinal and categorical inputs. At this stage, the training set is considerably large and does not fully fit in memory. In light of this, the random forest model was trained in chunks and in parallel to enhance computational performance. Once all previous steps have been completed, probabilistic input such as the ECMWF Ensemble weather forecast is used to give a view of all predicted scenarios.

Signal Analysis using Deep Learning

Published: December 19, 2018

Deep learning models can be used to extract representations for multidimensional time series data. We have used a sensor dataset collected from a large-scale industrial facility to illustrate this problem. Real-valued sensor signals were treated as multidimensional time series and fed through a recurrent auto-encoder model. The extracted representations can be projected to a low-dimensional space and reflect the temporal behaviour of the underlying time series. In this way, the change of time series features over time can be summarised as a smooth trajectory path. The fixed-length vectors are further analysed using additional visualisation and unsupervised clustering techniques.

Large-Scale Time Series Forecasting in Apache Spark

Published: September 11, 2019

Accurately forecasting power demand is important for securing energy supply. Time series forecasting methods and other machine learning algorithms can be used to create energy forecasts. We have developed a forecasting framework based on a multi-model approach at customer account level. The framework uses a wide range of algorithms (e.g. GLM, ElasticNet, Seasonal ARIMA-X, Decision Tree, Random Forest and Gradient Boosting Machine). Models are pre-trained on an AWS EMR cluster using Spark/SparklyR. The process is run at massively parallel scale (>3000 vCores). Once model training has completed, the model objects are persisted on AWS S3 so that they can be reused at a later date. To trigger a forecast, the deployment pipeline loads the pre-trained model object from S3 and creates a forecast based on the prevailing inputs. The output is stored as partitioned parquet files on S3, which can be converted into a table view through AWS Athena.

Using Linear Programming for Route Planning and Job Scheduling

Published: September 05, 2024

Efficiently managing travel and job scheduling for multiple workers across various locations presents a significant operational challenge. We use a Linear Programming (LP) model to optimise route planning and job allocation among multiple workers, aiming to minimise travel time and adhere to individual working hours constraints. Utilising variables such as travel costs, job durations, and resource capacities, we construct a framework that accommodates each worker’s starting location and contractual obligations. This approach not only enhances operational efficiency but also contributes to the broader field of operations research by providing a scalable solution for multi-location, multi-personnel scheduling problems.

teaching

Business Analytics in R

Workshop, Various Locations, 2018

R is an open source language for statistical programming. It is widely used among statisticians, academic researchers, and business analysts across the world. This is a three-day workshop for business analysts with little or no experience in the R language. The course focuses on several topics.

Renewable Energy Systems

Guest Lecture, Royal Holloway, University of London, 2020

This guest lecture covers the basics of Transmission and Distribution, and the Balancing Mechanism (BM) from a supplier perspective. It also covers demand estimation methodology, especially the Non-Half-Hourly (NHH) settlement process and how advances in metering technologies bring changes to this area. Lastly, the lecture briefly discusses emerging trends in the energy industry.

tutorials

Linear Programming for Optimal Resources Allocation

Published: July 17, 2020

Using a Linear Programming (LP) solver to allocate resources to geographical regions. This generalised example shows how to dynamically allocate engineers to local areas, while satisfying business constraints and minimising travel cost (e.g. travel time, fuel expenditure, motor insurance, etc.).
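One way to write such an allocation problem down is sketched below. The symbols are assumptions introduced for illustration and are not taken from the tutorial itself: E is the set of engineers, A the set of local areas, c_{ij} the travel cost of sending engineer i to area j, d_j the number of engineers area j requires, and x_{ij} the share of engineer i's time allocated to area j.

```latex
\begin{aligned}
\min_{x} \quad & \sum_{i \in E} \sum_{j \in A} c_{ij}\, x_{ij} \\
\text{s.t.} \quad & \sum_{j \in A} x_{ij} \le 1 && \forall i \in E \\
& \sum_{i \in E} x_{ij} \ge d_j && \forall j \in A \\
& 0 \le x_{ij} \le 1 && \forall i \in E,\ j \in A
\end{aligned}
```

The first constraint stops any engineer being allocated more than once over, the second stands in for the business constraints by ensuring each area's requirement is met, and the objective picks the cheapest allocation that satisfies both.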

Using Linear Programming for Route Planning and Job Scheduling

Published: February 16, 2022

Efficiently managing travel and job scheduling for multiple engineers across various locations presents a significant operational challenge. We use a Linear Programming (LP) model designed to optimise route planning and job allocation among engineering teams, aiming to minimise travel time and adhere to individual working hours constraints. Utilising variables such as travel costs, job durations, and resource capacities, we construct a mathematical framework that accommodates each engineer’s starting location and contractual obligations. Experimental results, visualised through Gantt charts and geographical plotting, demonstrate the model’s efficacy in reducing total travel time while ensuring equitable workload distribution. This approach not only enhances operational efficiency but also contributes to the broader field of operations research by providing a scalable solution for multi-location, multi-personnel scheduling problems.

Textual Inversion for Stable Diffusion

Published: February 24, 2023

This tutorial provides a comprehensive guide on using Textual Inversion with the Stable Diffusion model to create personalized embeddings. It covers the significance of preparing diverse and high-quality training data, the process of creating and training an embedding, and the intricacies of generating images that reflect the trained concept accurately. The author shares practical insights into overcoming challenges such as data preparation, the training process, and adjusting the weight of embeddings to achieve desired results. This resource is valuable for anyone looking to tailor generative models to recognize and generate images of specific objects, faces, or styles.

Geospatial Optimization

Published: June 17, 2025

This tutorial provides a Linear Programming (LP) approach to determine where stationary sensors should be placed to maximize coverage in a defined area while respecting constraints on sensor count and overlapping. The optimization problem assigns a binary variable to each potential location/configuration pairing. The objective minimizes the number of sensors while discouraging redundant overlap. Constraints ensure required coverage, limit overlap, and allow no more than one sensor per spot.
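A possible formulation matching this description is sketched below. All symbols are illustrative assumptions rather than the tutorial's own notation: L is the set of candidate locations, C the set of sensor configurations, P the set of points that must be covered, a_{plc} equals 1 when a sensor at location l in configuration c covers point p, o_p counts redundant coverage at point p, O_max caps that redundancy, and lambda weights the overlap penalty.

```latex
\begin{aligned}
\min_{x,\,o} \quad & \sum_{l \in L} \sum_{c \in C} x_{lc} \;+\; \lambda \sum_{p \in P} o_p \\
\text{s.t.} \quad & \sum_{l \in L} \sum_{c \in C} a_{plc}\, x_{lc} \ge 1 && \forall p \in P \\
& \sum_{l \in L} \sum_{c \in C} a_{plc}\, x_{lc} - 1 \le o_p && \forall p \in P \\
& 0 \le o_p \le O_{\max} && \forall p \in P \\
& \sum_{c \in C} x_{lc} \le 1 && \forall l \in L \\
& x_{lc} \in \{0, 1\} && \forall l \in L,\ c \in C
\end{aligned}
```

Because the x_{lc} variables are binary, a solver would treat this as an integer (0-1) linear programme; relaxing them to the interval [0, 1] gives an ordinary LP whose solution may need rounding.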

Timothy
