Publication Database of the Staff of Lublin University of Technology

Status:
Authors:	Gałka Łukasz, Karczmarek Paweł
Year of publication:	2024
Document version:	Print \| Electronic
Language:	English
Volume:	151
Article number:	110395
Pages:	1 - 18
Impact Factor:	7.6
Web of Science® Times Cited:	4
Scopus® Citations:	7
Databases:	Web of Science \| Scopus \| Google Scholar
Result of statutory research	NO
Funding:	This work has been supported by the internal grants FD-20/IT-3/047 and FD-20/IT-3/004.
Conference material:	NO
Open-access publication:	NO
Abstracts:	English
	Modern data mining techniques have gained importance in recent years. In particular, anomaly detection algorithms, applied in key sectors of information technology, have been growing in popularity. One efficient and fast algorithm is Isolation Forest. The method consists of two separate stages: forest formation and evaluation of elements. The first stage forms a forest of isolation trees. Each tree is built in the same manner from drawn samples and random divisions of data attributes. In this study, an innovative deterministic attribute selection method is proposed, while the split value remains random. New ideas based on imbalance, clustering, and a dispersion of values through non-linear transformation of elements are introduced and thoroughly analyzed. These novel anomaly detection approaches are applied to 25 real datasets, as well as our own artificially generated databases. The Area Under the ROC Curve and the Area Under the PR Curve are used as measures of outlier classification quality. The results of the numerical experiment demonstrate the high efficiency and competitive evaluation speed of the proposals in comparison to other Isolation Forest-based approaches, as well as several other popular techniques.
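
The two stages named in the abstract (forest formation, then evaluation of elements) can be illustrated with a minimal sketch of the baseline Isolation Forest, in which both the attribute and the split value are drawn at random. This is not the deterministic attribute selection proposed in the article, whose details are not given in this record. All function names, parameter defaults, and the dictionary-based tree layout are assumptions made for illustration, and the input is assumed to hold at least two points.

```python
import math
import random


def average_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # H(n-1) approximated with Euler's constant
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Stage 1 (one tree): split on a random attribute at a random value."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    attr = rng.randrange(len(points[0]))  # baseline choice: attribute drawn at random
    low = min(p[attr] for p in points)
    high = max(p[attr] for p in points)
    if low == high:  # nothing left to separate on this attribute
        return {"size": len(points)}
    split = rng.uniform(low, high)
    left = [p for p in points if p[attr] < split]
    right = [p for p in points if p[attr] >= split]
    return {
        "attr": attr,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def build_forest(data, n_trees=100, sample_size=64, seed=0):
    """Stage 1: build each tree from its own random subsample of the data."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def path_length(point, node):
    """Depth at which the point reaches a leaf, plus a correction for unsplit leaves."""
    depth = 0
    while "attr" in node:
        node = node["left"] if point[node["attr"]] < node["split"] else node["right"]
        depth += 1
    return depth + average_path_length(node["size"])


def anomaly_score(point, trees, sample_size):
    """Stage 2: shorter average paths give scores closer to 1 (more anomalous)."""
    mean_depth = sum(path_length(point, tree) for tree in trees) / len(trees)
    return 2.0 ** (-mean_depth / average_path_length(sample_size))
```

A caller would pass a list of equal-length numeric tuples to `build_forest` and then score individual points with `anomaly_score`. A deterministic attribute selection rule would replace only the `rng.randrange` line in `build_tree`; the random split value would stay as it is.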


Deterministic attribute selection for isolation forest

Journal article