Effectiveness of Modern Models Belonging to the YOLO and Vision Transformer Architectures in Dangerous Items Detection
Journal article
MNiSW
100
2024 list
| Status: | |
| Authors: | Omiotek Zbigniew |
| Disciplines: | |
| Year of publication: | 2025 |
| Document version: | Electronic |
| Language: | English |
| Issue: | 17 |
| Volume: | 14 |
| Article number: | 3540 |
| Pages: | 1 - 26 |
| Impact Factor: | 2.6 |
| Web of Science® Times Cited: | 0 |
| Scopus® Citations: | 0 |
| Databases: | Web of Science | Scopus |
| Result of statutory research | No |
| Research data URL | LINK |
| Conference proceedings: | No |
| Open access publication: | Yes |
| License: | |
| Access method: | Publisher's website |
| Text version: | Final published version |
| Time of release: | At the time of publication |
| OA publication date: | 5 September 2025 |
| Abstracts: | English |
| The effectiveness of recently developed tools for detecting dangerous items is overestimated because the datasets used to build the models are of low quality. The main drawbacks of these datasets are the unrepresentative range of conditions in which the items are presented, the limited number of item classes, and the small number of instances per class. To fill this gap, a comprehensive dataset was built for detecting the items most commonly used in acts that violate public security. The dataset includes items such as a machete, knife, baseball bat, rifle, and gun, presented in varying quality and under different environmental conditions. Because of these characteristics, the dataset yields more reliable results that better reflect the effectiveness of item detection in real-world conditions. The dataset was used to build and compare modern item detection models belonging to the YOLO and Vision Transformer (ViT) architectures. A comprehensive analysis of the results, taking into account both accuracy and performance, showed that the YOLOv11m model achieved the best results: Recall = 88.2%, Precision = 89.6%, mAP@50 = 91.8%, mAP@50–95 = 73.7%, and inference time = 1.9 ms. These test results make it possible to recommend this model for public security monitoring systems aimed at detecting potentially dangerous items. |
