Publication Database of Lublin University of Technology Staff

Status:
Authors:	Karczmarek Paweł, Kiersztyn Adam, Pedrycz Witold, Czerwiński Dariusz
Year of publication:	2021
Document version:	Print \| Electronic
Language:	English
Volume:	106
Article number:	107354
Pages:	1 - 10
Impact Factor:	8.263
Web of Science® Times Cited:	25
Scopus® Citations:	33
Databases:	Web of Science \| Scopus \| Google Scholar
Result of statutory research:	NO
Funding:	Funded by the National Science Centre, Poland under CHIST-ERA programme (Grant no. 2018/28/Z/ST6/00563).
Conference material:	NO
OA publication:	NO
Abstracts:	English
	The problem of finding anomalies (outliers) in databases is one of the most important issues in modern data analysis. One of the reasons is the occurrence of this issue in almost every type of database, including numerical, categorical, time, mixed, or graphic data. There are currently many methods often dedicated to specific data analysis. Finally, this topic is extremely interesting per se, as a research problem that intrigues researchers. One of the classic methods of data analysis dedicated to finding the anomalies in the data is Isolation Forest. However, this method, with a few exceptions, has not been modified from the time of its first publication, and, in particular, it has not yet appeared in combination with the typical fuzzy methods used for grouping such as Fuzzy C-Means (FCM) clustering. In this study, we thoroughly analyze this approach, as well as several related ones. We examine the possibilities of this technique and analyze it in detail for characteristics of data (database size, number of attributes, records, their type, etc.). It is worth noting that FCM allows to obtain membership grades of elements forming Isolation Forest nodes to clusters on the basis of which these nodes are built. Hence, at the stage of calculating the anomaly scores, this information is effectively used, in particular to express how much a given element may belong to a group of similar elements, which can be inferred from the characteristics of the cluster in which it lies. In this study, we propose a set of methods enhancing the Isolation Forest on a basis of Fuzzy C-Means. The results of numerical experiments carried using 27 various datasets and reported in this paper lead us to the conclusion that FCM can play a pivotal role in an enhancement of Isolation Forest approach and raises up the values of particular measures of effectiveness of the anomaly detection methods.

Fuzzy C-Means-based Isolation Forest

Journal article