Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Pages

Posts

Blog Post number 4

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

patents

Method of Utility Usage Analysis

Published:

A method and apparatus for analysing utility consumption at a utility supply location is described. The method comprises the steps of: receiving utility consumption data corresponding to utility consumption at the utility supply location over a time period to be analysed; generating a recurring consumption model indicative of repeating consumption patterns in the utility consumption data; identifying divergences between the utility consumption data and the recurring consumption model; computing a diagnostic measure indicative of irregular consumption based on the identified divergences; and outputting the diagnostic measure. The diagnostic measure may be used to identify flexibility or irregularities in consumption and/or to control supply of the utility. The utility may be e.g. electricity, gas or water.
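
As an illustration only, the steps in this abstract could be sketched roughly as follows. The median-per-slot profile, the normalisation by total consumption, and all function names are assumptions made for this sketch; the abstract does not specify how the recurring model or the diagnostic measure is actually computed.

```python
from statistics import median


def recurring_profile(readings, period):
    """Median consumption at each position within the repeating period (assumed model)."""
    return [median(readings[i::period]) for i in range(period)]


def divergences(readings, profile):
    """Difference between each reading and the profile value for its slot."""
    period = len(profile)
    return [r - profile[i % period] for i, r in enumerate(readings)]


def diagnostic_measure(readings, period):
    """Total absolute divergence relative to total consumption (0 means perfectly regular)."""
    total = sum(readings)
    if total == 0:
        return 0.0
    profile = recurring_profile(readings, period)
    return sum(abs(d) for d in divergences(readings, profile)) / total


# Hypothetical data: a pattern repeating every 3 readings, then the same with one spike.
regular = [1.0, 2.0, 3.0] * 4
spiky = regular.copy()
spiky[4] = 8.0

regular_score = diagnostic_measure(regular, period=3)
spiky_score = diagnostic_measure(spiky, period=3)
print(regular_score, spiky_score)
```

A higher value suggests more consumption that the repeating pattern does not explain.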

Determining Operating State from Complex Sensor Data

Published:

A method of detecting an operating state of a process, system or machine based on sensor signals from a plurality of sensors is disclosed. The method comprises receiving sensor data, the sensor data based on sensor signals from the plurality of sensors and providing the sensor data as input to a neural network. The neural network comprises an encoder sub-network arranged to receive the sensor data as input and to generate a context vector based on the sensor data; and a decoder sub-network arranged to receive the context vector as input and to regenerate sensor data corresponding to at least a subset of the sensors based on the context vector. The method comprises comparing the context vector to at least one context vector classification; detecting an operating state in dependence on the comparison; and outputting a notification indicating the detected operating state.
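
The comparison step could, for example, be a nearest-centroid lookup. The sketch below assumes that each known operating state is represented by a single centroid vector and that Euclidean distance is the comparison; neither is stated in the abstract, and the labels and numbers are hypothetical.

```python
import math


def detect_state(context_vector, centroids):
    """Return the label of the closest centroid and the distance to it."""
    label = min(centroids, key=lambda k: math.dist(context_vector, centroids[k]))
    return label, math.dist(context_vector, centroids[label])


# Hypothetical 2-dimensional context vectors for two operating states.
centroids = {"normal": [0.0, 0.0], "startup": [3.0, 4.0]}

state, distance = detect_state([2.5, 3.5], centroids)
print(f"Detected operating state: {state} (distance {distance:.3f})")
```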

Method and system for detecting anomalies in energy consumption

Published:

A method of detecting conditions indicative of energy meter tampering, meter faults or energy loss (e.g. due to energy theft) is disclosed. The method includes receiving energy consumption data from an energy meter indicating consumption of energy at a location served by the energy meter. Event data is also received from the meter comprising one or more events generated by the energy meter. The consumption data is analysed to detect a predetermined consumption condition. The event data is analysed to detect a predetermined event or event pattern in the event data. An alert condition is generated in response to detecting both the consumption condition and the event or event pattern.
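
The key point of the abstract is that an alert needs both signals at once. A minimal sketch of that logic, assuming a simple "recent consumption dropped sharply" rule and made-up event codes (the thresholds and the codes `COVER_OPEN` and `MAGNETIC_FIELD` are hypothetical, not taken from the source):

```python
from dataclasses import dataclass


@dataclass
class MeterEvent:
    day: int
    code: str


def consumption_condition(daily_kwh, drop_ratio=0.5, window=3):
    """True if the mean of the last `window` days is below drop_ratio times the earlier mean."""
    if len(daily_kwh) <= window:
        return False
    earlier = daily_kwh[:-window]
    recent = daily_kwh[-window:]
    baseline = sum(earlier) / len(earlier)
    return baseline > 0 and sum(recent) / window < drop_ratio * baseline


def event_condition(events, suspicious_codes=frozenset({"COVER_OPEN", "MAGNETIC_FIELD"})):
    """True if any meter event carries one of the (hypothetical) suspicious codes."""
    return any(event.code in suspicious_codes for event in events)


def alert(daily_kwh, events):
    """Raise an alert only when both the consumption and the event condition hold."""
    return consumption_condition(daily_kwh) and event_condition(events)


dropped = [10, 10, 10, 10, 3, 3, 3]
steady = [10, 10, 10, 10, 10, 10, 10]
tamper = [MeterEvent(day=4, code="COVER_OPEN")]

print(alert(dropped, tamper), alert(dropped, []), alert(steady, tamper))
```

Requiring both conditions reduces alerts from a consumption drop alone (for example, an empty property) or from an isolated meter event.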

portfolio

publications

Recurrent Auto-Encoder Model for Large-Scale Industrial Sensor Signal Analysis

Published in 19th International Conference on Engineering Applications of Neural Networks (EANN 2018), 2018

A recurrent auto-encoder model summarises sequential data through an encoder structure into a fixed-length vector and then reconstructs the original sequence through the decoder structure. The summarised vector can be used to represent time series features. In this paper, we propose relaxing the dimensionality of the decoder output so that it performs partial reconstruction. The fixed-length vector therefore represents features in the selected dimensions only. In addition, we propose using a rolling fixed-window approach to generate training samples from unbounded time series data. The change of time series features over time can be summarised as a smooth trajectory path. The fixed-length vectors are further analysed using additional visualisation and unsupervised clustering techniques. The proposed method can be applied in large-scale industrial processes for sensor signal analysis, where clusters of the vector representations can reflect the operating states of the industrial system.
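
The sample-generation idea can be sketched without any deep learning library. The snippet below assumes the series is held as a list of time steps, each a list of sensor values; the function names, the `stride` parameter and the toy data are assumptions for illustration, not the paper's code.

```python
def rolling_windows(series, window, stride=1):
    """Yield consecutive fixed-length windows from a sequence of time steps."""
    for start in range(0, len(series) - window + 1, stride):
        yield series[start:start + window]


def select_dimensions(window, dims):
    """Keep only the chosen sensor dimensions at each time step.

    In a partial-reconstruction setup this would be the decoder target,
    while the full window is the encoder input.
    """
    return [[step[d] for d in dims] for step in window]


# Hypothetical series: 5 time steps, 3 sensors.
series = [[t, 10 * t, 100 * t] for t in range(5)]

samples = [
    (w, select_dimensions(w, dims=[0, 2]))
    for w in rolling_windows(series, window=3)
]
print(len(samples))
print(samples[0][1])
```

Because consecutive windows overlap heavily when the stride is small, their encoded vectors change gradually, which is consistent with the smooth trajectory described above.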

Recommended citation: Wong T., Luo Z. (2018) Recurrent Auto-Encoder Model for Large-Scale Industrial Sensor Signal Analysis. In: Pimenidis E., Jayne C. (eds) Engineering Applications of Neural Networks. EANN 2018. Communications in Computer and Information Science, vol 893. Springer, Cham https://link.springer.com/chapter/10.1007/978-3-319-98204-5_17

talks

Multi-seasonal Time Series Modelling using Recurrent Neural Nets

Published:

The ability to foresee what’s about to happen is crucial to the success of energy companies like Centrica. For instance, predicting the number of boiler breakdowns on any given day allows us to ensure that a sufficient number of gas engineers are staffed.

Parallelised Time Series Spike Detection using R on the Hadoop Platform

Published:

Smart meters record a continuous stream of electricity consumption for each and every supply point across the United Kingdom. Energy suppliers are interested in understanding customers’ consumption patterns in order to provide a better service for them. FlexiScore (F) is a new concept which British Gas has developed. It is a single numeric value ranging between 0 and 1 which quantifies the amount of flexible energy load for each electricity supply point. A high F value suggests the presence of erratic spikes, while a low F value indicates prolonged consistency and non-spiky behaviour. The algorithm has been productionised on the Hadoop platform (on premise) using Microsoft R Server 8.0 as a fully scalable analytics framework. The large-scale distributed process contains an array of Markov Chain Monte Carlo (MCMC) routines for missing data imputation. A layer of Fourier transformation has been applied to create a seasonal time series model. Afterwards, simple heuristics are applied to isolate erratic consumption spikes. The F score is then computed as output alongside other descriptive statistics.

Analysing High-Frequency Industrial Component Failure using Text Mining Techniques

Published:

Centrica plc is an energy service company and its Exploration and Production (E&P) division currently operates several gas production assets across the world. A large-scale production asset usually contains thousands of components which require regular inspection and maintenance. Understanding the pattern of component failure is key to managing large-scale assets successfully.

Text Mining for Preventative Maintenance

Published:

Large-scale industrial processes normally comprise thousands of individual components which are vulnerable to breakdown. Maintenance of these components is key to reducing unplanned outages. The repair log dataset contains unstructured, free-format text descriptions detailing the issues. We applied text mining algorithms to this dataset and turned it into an analysable format. A combination of techniques was used, including the tf-idf scheme and the n-grams approach. Groups of vulnerable components can be visualised as a graph network.
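
To make the tf-idf and n-grams step concrete, here is a small standard-library sketch. The whitespace tokeniser, the unsmoothed tf-idf formula and the three repair-log lines are all assumptions for illustration; the talk does not state which variant or tooling was used.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Lower-cased word n-grams of length 1..n_max, joined with spaces."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(documents, n_max=2):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    term_lists = [ngrams(doc, n_max) for doc in documents]
    doc_freq = Counter(term for terms in term_lists for term in set(terms))
    n_docs = len(documents)
    weights = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical repair-log entries.
logs = ["pump seal leaking", "valve seal leaking", "pump motor overheating"]
weights = tfidf(logs)

for term, weight in sorted(weights[0].items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

Terms that are specific to one log entry, such as the bigram "pump seal", receive a higher weight than terms shared across entries.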

Deep Neural Network Training and Applications

Published:

Deep learning models can be used to extract representations for multidimensional time series data. We have used a sensor dataset collected from an industrial-scale compressor unit to illustrate this problem. Real-valued sensor signals were treated as multidimensional time series and fed through a recurrent auto-encoder model. The extracted representations can be projected to a low-dimensional space and reflect the temporal behaviour of the underlying time series. Specific signals can be isolated for detailed analysis using partial reconstruction of the original input.

Generalised Additive Model for Gas Boiler Breakdown Demand Prediction

Published:

At British Gas, we operate a service and repair business with more than 6,000 qualified engineers ready to serve customers who are urgently in need across the country. Predicting demand accurately ahead of time allows us to schedule the workforce optimally. Additional workforce can be scheduled in case demand is forecast to increase substantially. We have developed a prototype demand forecasting procedure which uses a mixture of machine learning techniques. The key component uses a Generalised Additive Model (GAM) to estimate the number of incoming work requests. It takes into account the non-linear effects of multiple predictor variables. The models were trained at patch level in order to capture local behaviour. Planning operators can then use the model output to fine-tune workforce assignment at the local level to meet changing demand.

Modelling Field Operation Capacity using Generalised Additive Model and Random Forest

Published:

In any customer-facing business, accurately predicting demand ahead of time is of paramount importance. Workforce capacity can then be flexibly scheduled at local area level, ensuring that there is sufficient workforce to meet volatile demand. In this case study, we focus on the gas boiler repair field operation in the UK. We have developed a prototype capacity forecasting procedure which uses a mixture of machine learning techniques to achieve its goal. Firstly, it uses a Generalised Additive Model approach to estimate the number of incoming work requests. It takes into account the non-linear effects of multiple predictor variables. The next stage uses a large random forest to estimate the expected number of appointments for each work request by feeding in various ordinal and categorical inputs. At this stage, the training set is considerably large and does not fully fit in memory. In light of this, the random forest model was trained in chunks and in parallel to enhance computational performance. Once all previous steps have been completed, probabilistic input such as the ECMWF Ensemble weather forecast is used to give a view of all predicted scenarios.

Signal Analysis using Deep Learning

Published:

Deep learning models can be used to extract representations for multidimensional time series data. We have used a sensor dataset collected from a large-scale industrial facility to illustrate this problem. Real-valued sensor signals were treated as multidimensional time series and fed through a recurrent auto-encoder model. The extracted representations can be projected to a low-dimensional space and reflect the temporal behaviour of the underlying time series. In this way, the change of time series features over time can be summarised as a smooth trajectory path. The fixed-length vectors are further analysed using additional visualisation and unsupervised clustering techniques.

Large-Scale Time Series Forecasting in Apache Spark

Published:

Accurately forecasting power demand is important for securing energy supply. Time series forecasting methods and other machine learning algorithms can be used to create energy forecasts. We have developed a forecasting framework based on a multi-model approach at customer account level. The framework uses a wide range of algorithms (e.g. GLM, ElasticNet, Seasonal ARIMA-X, Decision Tree, Random Forest and Gradient Boosting Machine). Models are pre-trained on an AWS EMR cluster using Spark/SparklyR. The process is run at massively parallel scale (>3000 vCores). Once model training has completed, the model objects are persisted on AWS S3 so that they can be reused at a later date. To trigger a forecast, the deployment pipeline loads the pre-trained model object from S3 and creates a forecast based on the prevailing inputs. The output is stored as partitioned Parquet files on S3, which can be exposed as a table view through AWS Athena.

teaching

Business Analytics in R

Workshop, British Gas, 2018

R is an open source language for statistical programming. It is widely used among statisticians, academic researchers, and business analysts across the world. This is a three-day workshop for business analysts with little or no experience in the R language. The course focuses on several topics.

EE3020 Renewable Energy Systems

Guest Lecture, Royal Holloway, University of London, 2020

This guest lecture covers the basics of Transmission and Distribution, and the Balancing Mechanism (BM) from a supplier perspective. It also covers demand estimation methodology, especially the Non-Half-Hourly (NHH) settlement process and how advances in metering technologies bring changes to this area. Lastly, the lecture briefly discusses emerging trends in the energy industry.

tutorials

Linear Programming for Optimal Resources Allocation

Published:

Using a Linear Programming (LP) solver to allocate resources to geographical regions. This generalised example shows how to dynamically allocate engineers to local areas, while satisfying business constraints and minimising travel cost (e.g. travel time, fuel expenditure, motor insurance, etc.).
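
One common way to write such an allocation problem is as a transportation-style LP. The formulation below is a generic sketch, not necessarily the one used in the tutorial. All symbols are assumptions: $x_{ij}$ is the number of engineers based at location $i$ who are allocated to local area $j$, $c_{ij}$ is the travel cost per engineer, $s_i$ is the number of engineers available at $i$, and $d_j$ is the number of engineers required in area $j$.

```latex
\begin{aligned}
\min_{x} \quad & \sum_{i} \sum_{j} c_{ij}\, x_{ij} \\
\text{subject to} \quad
& \sum_{j} x_{ij} \le s_i \quad \text{for every base location } i, \\
& \sum_{i} x_{ij} \ge d_j \quad \text{for every local area } j, \\
& x_{ij} \ge 0 \quad \text{for all } i, j.
\end{aligned}
```

Additional business constraints would be added as further linear inequalities on $x_{ij}$.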

Using Linear Programming for Route Planning and Job Scheduling

Published:

Efficiently managing travel and job scheduling for multiple engineers across various locations presents a significant operational challenge. We use a Linear Programming (LP) model designed to optimise route planning and job allocation among engineering teams, aiming to minimise travel time and adhere to individual working hours constraints. Utilising variables such as travel costs, job durations, and resource capacities, we construct a mathematical framework that accommodates each engineer’s starting location and contractual obligations. Experimental results, visualised through Gantt charts and geographical plotting, demonstrate the model’s efficacy in reducing total travel time while ensuring equitable workload distribution. This approach not only enhances operational efficiency but also contributes to the broader field of operations research by providing a scalable solution for multi-location, multi-personnel scheduling problems.

Textual Inversion for Stable Diffusion

Published:

This tutorial provides a comprehensive guide on using Textual Inversion with the Stable Diffusion model to create personalized embeddings. It covers the significance of preparing diverse and high-quality training data, the process of creating and training an embedding, and the intricacies of generating images that reflect the trained concept accurately. The author shares practical insights into overcoming challenges such as data preparation, the training process, and adjusting the weight of embeddings to achieve desired results. This resource is valuable for anyone looking to tailor generative models to recognize and generate images of specific objects, faces, or styles.