Publication Database of Lublin University of Technology (Politechnika Lubelska) Staff

Status:
Authors:	Rachwał Albert, Karczmarek Paweł, Rachwał Alicja, Stęgierski Rafał
Year of publication:	2024
Document version:	Electronic
Language:	English
Volume:	12
Pages:	101797-101813
Impact Factor:	3.4
Web of Science® Times Cited:	2
Scopus® Citations:	2
Databases:	Web of Science | Scopus
Result of statutory research:	NO
Conference material:	NO
OA publication:	YES
License:
Mode of access:	Publisher's website
Text version:	Final published version
Time of release:	At the time of publication
OA publication date:	22 July 2024
Abstracts:	English
	Recognizing anomalies is an extremely important process in data analysis, aimed at identifying patterns in data that deviate from known norms or typical standards. These anomalies are often indicative of significant, and sometimes critical, issues such as fraud, network intrusions, and system failures. Traditional anomaly detection algorithms primarily focus on the attributes of individual observations within a dataset, typically establishing a 'normal' profile and flagging deviations from this profile as anomalies. This paper introduces an enhancement to the Isolation Forest algorithm, a well-known method for anomaly detection valued for its effectiveness and efficiency, especially on large datasets. The Isolation Forest algorithm operates by randomly partitioning the data space and constructing binary trees, where the anomaly score of a data point is determined by the length of the path from the root of a tree to the node that isolates the point, enabling the detection of outliers in a completely unsupervised manner. The methodology presented in the paper is based on repeatedly building Isolation Forest models on datasets from which individual attributes are excluded. In our research, we used the SHAP (SHapley Additive exPlanations) method, which comes from game theory and is used to determine the impact of individual features on the result of the model. When training the Isolation Forest on the full dataset, the SHAP method is used to obtain the coefficients of influence of model attributes on the prediction result. Both negative and positive influences are considered significant when computing the anomaly score. A weighted average of the results from all sub-models is then calculated, with the weights derived from the SHAP values. The comparative analysis of evaluation metrics revealed a substantial enhancement attributed to the implemented methodology. The metrics used for evaluation improved in most cases by 3.5 to 6 percentage points. One of the metrics improved by 12 percent. The obtained results demonstrate that this integrated approach not only enhances the prediction accuracy of the Isolation Forest algorithm but also offers a more interpretable understanding of the data. This advancement in anomaly detection methodology promises significant implications for various fields where quick and accurate detection of outliers is paramount.
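
The combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact weighting formula, so the code assumes that the sub-model trained without attribute j is weighted by the normalised mean absolute SHAP value of attribute j (absolute values so that negative and positive influences count equally). The function names `shap_weights` and `combine_scores` and all input numbers are hypothetical; in practice the SHAP values and sub-model scores would come from a SHAP explainer and from Isolation Forest models trained on the reduced datasets.

```python
from typing import List, Sequence


def shap_weights(shap_values: Sequence[Sequence[float]]) -> List[float]:
    """Turn per-sample, per-attribute SHAP values into one weight per attribute.

    Assumption (not stated in the abstract): the weight of attribute j is its
    mean absolute SHAP value, normalised so that all weights sum to 1.
    """
    n_samples = len(shap_values)
    n_attrs = len(shap_values[0])
    mean_abs = [
        sum(abs(row[j]) for row in shap_values) / n_samples
        for j in range(n_attrs)
    ]
    total = sum(mean_abs)
    if total == 0:
        # No attribute has any influence: fall back to equal weights.
        return [1.0 / n_attrs] * n_attrs
    return [m / total for m in mean_abs]


def combine_scores(
    sub_scores: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Weighted average of sub-model anomaly scores.

    sub_scores[j][i] is the score of sample i from the sub-model that was
    trained with attribute j excluded (assumed pairing of weights to models).
    """
    n_samples = len(sub_scores[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, sub_scores))
        for i in range(n_samples)
    ]


# Hypothetical inputs: 2 samples, 3 attributes.
example_shap = [[0.2, -0.1, 0.1], [-0.4, 0.1, 0.1]]
example_sub_scores = [[0.5, 0.9], [0.4, 0.7], [0.6, 0.8]]

weights = shap_weights(example_shap)
combined = combine_scores(example_sub_scores, weights)
print(weights, combined)
```

Because the weights are normalised, the combined score stays on the same scale as the individual sub-model scores, which keeps any threshold chosen for a single model roughly comparable.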

Isolation Forest With Exclusion of Attributes Based on Shapley Index

Journal article