Pedestrian detection in crowded scenes is widely used in computer vision. However, it still faces two difficulties: 1) eliminating repeated predictions (multiple predictions corresponding to the same object); 2) false and missed detections caused by the high occlusion rate of the scene and the small visible area of the detected pedestrians. This paper presents a detection framework based on DETR (detection transformer) to address these problems; the model is called AD-DETR (asymmetrical relation detection transformer). We find that the symmetry in a DETR framework causes synchronous prediction updates and duplicate predictions. Therefore, we propose an asymmetric relationship fusion mechanism in which each query asymmetrically fuses the relative relationships of surrounding predictions and thereby learns to eliminate duplicate predictions. We then propose a decoupled cross-attention head that allows the model to learn to restrict its range of attention, focusing more on visible regions and on regions that contribute more to confidence. This reduces the noise introduced by occluded objects and thus lowers the false detection rate. In addition, within the proposed asymmetric relations module, we establish a way to encode the relative relation between sets of attention points, which improves on the baseline. Without additional annotations, and combined with deformable DETR using a ResNet-50 (Res50) backbone, our method achieves an average precision of 92.6%, an MR$^{-2}$ of 40.0% and a Jaccard index of 84.4% on the challenging CrowdHuman dataset. Our method exceeds previous methods such as Iter-E2EDet (progressive end-to-end object detection) and MIP (one proposal, multiple predictions). Experiments show that our method significantly improves the performance of query-based models in crowded scenes and is highly robust in such scenes.
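To make the duplicate-prediction problem concrete, the following is a minimal sketch that flags pairs of predicted boxes with a high box-level intersection over union (IoU). It is an illustration only and is not the AD-DETR mechanism described in the abstract: the function names `box_iou` and `overlapping_pairs`, the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are all assumptions made for this example.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2). Hypothetical helper."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def overlapping_pairs(boxes, threshold=0.5):
    """Return index pairs (i, j), i < j, whose IoU is at least `threshold`.

    The threshold value is an assumption chosen for illustration.
    """
    return [
        (i, j)
        for i in range(len(boxes))
        for j in range(i + 1, len(boxes))
        if box_iou(boxes[i], boxes[j]) >= threshold
    ]


# Two nearly coincident boxes and one distant box (made-up coordinates).
boxes = [(0, 0, 10, 20), (1, 0, 11, 20), (30, 0, 40, 20)]
pairs = overlapping_pairs(boxes)
```

In a crowded scene a high-IoU pair may be either a duplicate of one pedestrian or two distinct, heavily occluded pedestrians, so an overlap test like this one cannot separate the two cases by itself. That ambiguity is the motivation for learning duplicate removal from the relations between predictions, as the abstract proposes.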
DOI: http://dx.doi.org/10.3934/mbe.2023633