Title Machine Learning Methods for Efficient Data Reduction and Reconstruction in the concept of Internet of Things
Title (croatian) Metode strojnog učenja za učinkovito smanjenje količine podataka i njihovu rekonstrukciju u konceptu Interneta stvari
Author Jelena Čulić Gambiroža
Mentor Mario Čagalj (mentor)
Committee member Maja Štula (committee chair)
Committee member Darko Huljenić (committee member)
Committee member Damir Krstinić (committee member)
Committee member Ivo Stančić (committee member)
Committee member Maja Braović (committee member)
Granter University of Split Faculty of Electrical Engineering, Mechanical Engineering and Naval Architecture Split
Defense date and country 2023-11-17, Croatia
Scientific / art field, discipline and subdiscipline TECHNICAL SCIENCES Computing Data Processing
Universal decimal classification (UDC) 004 - Computer science and technology. Computing. Data processing
Abstract The Internet of Things (IoT) has grown in recent years, and the number of IoT devices is increasing rapidly. Consequently, the amount of collected and stored data is also increasing, which leads to Big Data and its related challenges. Although an individual IoT sensor consumes a relatively small amount of energy, sensors are mostly battery powered and numerous, which limits their lifetime and places a heavy load on backend systems. The main goal of the research presented here is to identify places in the IoT network where data velocity and volume can be reduced while preserving the value and variety of the data. Three challenges were tackled. The first two address the reduction of sensor online time and its effect on prolonging battery life. The third mitigates the risks introduced by reducing the amount of data circulating through the IoT network and simplifies the configuration process when new sensors are added to the network. The thesis is structured accordingly. In the first part of the research, we propose a dynamic monitoring frequency (DMF) algorithm that aims to collect data only when sensor readings change by more than a tolerable value between consecutive readings. Thus, a sensor is turned on only when the change in the monitored phenomenon exceeds a predefined threshold. Two approaches are analysed, namely statistical and machine learning. Compared to a baseline algorithm that collects data with a static monitoring frequency, DMF shows notable performance, resulting in either up to ∼70% fewer missed readings or up to ∼40% less collected data. The second part of the research focuses on low-cost sensors, which need to preheat for several minutes before they can reliably measure the monitored value. Instead of waiting for a sensor to heat up, we analyse the transient, i.e., the data trend that the sensor collects while heating up.
It is shown that a long short-term memory (LSTM) neural network can learn, and later predict, the actual value from a part of the transient. This way, instead of being constantly online or fully preheating, the sensor needs to be turned on for only 20 seconds and can then sleep for 120 seconds. While retaining high accuracy, our approach decreases energy consumption by up to 85% compared to a system where sensors are constantly online, and by more than 50% compared to a system where a sensor collects actual values instead of a part of the transient.
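The 20-second on / 120-second sleep schedule can be related to the quoted saving of up to 85% with a rough back-of-the-envelope check. The Python sketch below is only an illustration under two assumptions that are not stated in the abstract: energy use is proportional to the time the sensor is powered on, and consumption while sleeping is negligible. The function names are hypothetical, and the thesis may compute the saving differently.

```python
def on_time_fraction(on_seconds, sleep_seconds):
    """Fraction of each on/sleep cycle during which the sensor is powered on."""
    cycle = on_seconds + sleep_seconds
    if on_seconds < 0 or sleep_seconds < 0 or cycle <= 0:
        raise ValueError("durations must be non-negative and the cycle non-empty")
    return on_seconds / cycle


def on_time_saving(on_seconds, sleep_seconds):
    """Reduction in powered-on time relative to a sensor that is always on.

    Assumption: energy scales with on-time and sleep draws no power.
    """
    return 1.0 - on_time_fraction(on_seconds, sleep_seconds)


if __name__ == "__main__":
    # Schedule from the abstract: 20 s on, then 120 s asleep.
    print(f"{on_time_saving(20, 120):.1%}")  # 85.7%
```

Under these assumptions the sensor is on for 20 of every 140 seconds, which is in line with the reported reduction of up to 85%. The comparison with a fully preheated sensor is not reproduced here because the abstract gives the preheating time only as "several minutes".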
In the third part of the research, different approaches to signal type classification, which can be used to recognise the type of signal being read from an IoT sensor, are evaluated and compared. Machine learning methods are used to model signals represented as raw time series data. Three classification approaches are considered, namely one-class, two-class and multi-class. According to the evaluation results, the most accurate approach, a multi-class random forest, correctly classifies unknown signals in ∼75% of cases based on only 20 consecutive sensor readings. Moreover, the multi-class random forest identifies the two most probable classes of a monitored signal with an accuracy of 95%.
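The difference between the ∼75% figure and the 95% figure is the difference between top-1 and top-2 accuracy. The Python sketch below illustrates how such metrics could be computed from per-class probabilities over windows of 20 consecutive readings. It is not the thesis implementation: the classifier itself is omitted, and the signal type names, probabilities, and function names are made up for illustration.

```python
def sliding_windows(readings, size=20):
    """Split a series into overlapping windows of `size` consecutive readings."""
    return [readings[i:i + size] for i in range(len(readings) - size + 1)]


def top_k_labels(class_probabilities, k=2):
    """Return the k labels with the highest predicted probability."""
    ranked = sorted(class_probabilities, key=class_probabilities.get, reverse=True)
    return ranked[:k]


def top_k_accuracy(predictions, true_labels, k=2):
    """Share of samples whose true label is among the k most probable classes."""
    if not true_labels or len(predictions) != len(true_labels):
        raise ValueError("predictions and true_labels must be non-empty and aligned")
    hits = sum(label in top_k_labels(probs, k)
               for probs, label in zip(predictions, true_labels))
    return hits / len(true_labels)


if __name__ == "__main__":
    # Hypothetical per-window class probabilities, e.g. from a multi-class model.
    predictions = [
        {"temperature": 0.60, "humidity": 0.30, "co2": 0.10},
        {"temperature": 0.45, "humidity": 0.40, "co2": 0.15},
        {"temperature": 0.20, "humidity": 0.50, "co2": 0.30},
        {"temperature": 0.10, "humidity": 0.20, "co2": 0.70},
    ]
    true_labels = ["temperature", "humidity", "temperature", "co2"]

    print(len(sliding_windows(list(range(25)))))           # 6
    print(top_k_accuracy(predictions, true_labels, k=1))  # 0.5
    print(top_k_accuracy(predictions, true_labels, k=2))  # 0.75
```

Reporting the two most probable classes trades some specificity for reliability, which may be sufficient when a person confirms the sensor type while adding a new sensor to the network.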
Abstract (croatian) Koncept Interneta stvari (engl. Internet of Things - IoT) u posljednje vrijeme sve se više širi, a broj IoT uređaja sve se više povećava. Posljedično, povećava se i količina prikupljenih i pohranjenih podataka, što dovodi do izazova prepoznatih unutar koncepta Velikih podataka (engl. Big data concept). Pojedinačni IoT senzori troše relativno malu količinu energije, no napajaju se uglavnom baterijama koje imaju ograničen vijek trajanja. Osnovni cilj ovog istraživanja je pronaći područja u IoT mreži na kojima se može smanjiti brzina prikupljanja i volumen podataka, pritom uzimajući u obzir raznolikost podataka i njihovu vrijednost. U istraživanju su adresirana tri izazova. Prvi i drugi izazov bave se smanjenjem vremena rada senzora što rezultira produživanjem životnog vijeka baterije. Treći izazov smanjuje rizike koji proizlaze iz smanjenja količine podataka koji se šalju kroz IoT mrežu, te također pojednostavljuje konfiguracijski proces prilikom dodavanja novog IoT senzora u mrežu. Disertacija je strukturirana u skladu s 3 navedena izazova. Kroz prvi dio istraživanja, predložen je algoritam s dinamičkom frekvencijom/periodom prikupljanja podataka (DMF) koji prikuplja podatke samo kada se očekuje značajna promjena između uzastopnih očitanja. Stoga se senzor uključuje samo kada se očekuje da će se promatrana vrijednost promijeniti više od toleriranog raspona. Pritom se analiziraju i uspoređuju statistički pristup i pristup strojnog učenja. U usporedbi s algoritmom koji prikuplja podatke sa statičkom frekvencijom/periodom prikupljanja, DMF algoritam pokazuje značajno poboljšanje te rezultira s do ∼ 70% manje nedetektiranih promjena ili s do ∼ 40% manje prikupljenih podataka. Drugi dio istraživanja bavi se niskobudžetnim senzorima kojima je potrebno prethodno zagrijavanje prije nego postignu stabilne radne uvjete.
Na primjeru senzora plina, umjesto čekanja da senzor postigne stabilne radne uvjete, analizira se početni dio tranzijenta, odnosno trend podataka koje senzor prikuplja dok se zagrijava. Pokazano je da je primjenom Long Short-Term Memory (LSTM) neuralne mreže moguće predvidjeti stvarnu razinu plina iz početnog dijela tranzijenta. Na taj način, umjesto da se senzor u potpunosti zagrije, dovoljno ga je uključiti na 20 sekundi i zatim ga staviti u stanje mirovanja 120 sekundi. S visokom preciznošću ovakav pristup smanjuje potrošnju energije na strani senzora do 85%, u usporedbi sa sustavima u kojima je senzor konstantno upaljen, odnosno više od 50% u usporedbi sa sustavima kada se senzor u potpunosti zagrijava. Kroz treći dio istraživanja uspoređuju se različiti pristupi klasifikacije tipa signala koji se mogu koristiti za prepoznavanje tipa signala kojeg prikuplja pojedini senzor. Pritom se koriste metode strojnog učenja. Analizirana su tri pristupa klasifikacije signala, odnosno prepoznavanje tipa signala na temelju jedne klase, na temelju dvije klase i na temelju više klasa. Rezultati pokazuju da najtočniji pristup, klasifikacija na temelju više klasa korištenjem random forest algoritma, može ispravno klasificirati nepoznati signal u ∼ 75% slučajeva na temelju samo 20 uzastopnih očitanja. Nadalje, isti model može pronaći kojim dvjema klasama signal najvjerojatnije pripada s točnošću od 95%.
Keywords
Internet of Things
IoT
Sensors
Machine learning
Random forest
LSTM
Time series
Signal pattern recognition
Energy efficiency
Keywords (croatian)
Internet stvari
IoT
senzori
strojno učenje
random forest
LSTM
vremenski niz podataka
prepoznavanje uzorka signala
energetska učinkovitost
Language English
URN:NBN urn:nbn:hr:179:106445
Study programme Title: Electrical Engineering and Information Technology; Study programme type: university; Study level: postgraduate; Academic / professional title: doktor/doktorica znanosti, područje tehničkih znanosti, polje elektrotehnika (Doctor of Science, area of technical sciences, field of electrical engineering)
Type of resource Text
File origin Born digital
Access conditions Open access
Created on 2023-12-14 10:50:34