Recently, the National Institute for Occupational Safety and Health (NIOSH) released an updated version of the NIOSH Industry and Occupation Computerized Coding System (NIOCCS), which uses supervised machine learning to assign industry and occupational codes based on provided free-text information. However, no efforts have been made to externally verify the quality of assigned industry and job titles when the algorithm is provided with inputs of varying quality. This study sought to evaluate whether the NIOCCS algorithm was sufficiently robust with low-quality inputs and how variable input quality could affect subsequent estimated job exposures in a large job-exposure matrix for noise (NoiseJEM). Using free-text industry and job descriptions from >700,000 noise measurements in the NoiseJEM, three files were created and input into NIOCCS: (1) N1, "raw" industries and job titles; (2) N2, "refined" industries and "raw" job titles; and (3) N3, "refined" industries and job titles. Standardized industry and occupation codes were output by NIOCCS. Descriptive statistics of performance metrics (e.g., misclassification/discordance of occupation codes) were evaluated for each input relative to the original NoiseJEM dataset (N0). Across major Standardized Occupational Classifications (SOC), total discordance rates for N1, N2, and N3 compared to N0 were 53.6%, 42.3%, and 5.0%, respectively. The impact of discordance on the major SOC group varied and included both over- and under-estimates of average noise exposure compared to N0. N2 had the most accurate noise exposure estimates (i.e., smallest bias) across major SOC groups compared to N1 and N3. Further refinement of job titles in N3 showed little improvement. Some variation in classification efficacy was seen over time, particularly prior to 1985. Machine learning algorithms can systematically and consistently classify data but are highly dependent on the quality and amount of input data.
The greatest benefit for an end-user may come from cleaning industry information before applying this method for job classification. Our results highlight the need for standardized classification methods that remain constant over time.
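
As an illustration of the two performance metrics named in the abstract, the following is a minimal sketch of how a discordance rate and a per-group bias could be computed. It is not the authors' code: the function names, the toy SOC codes, and the noise values are all hypothetical, and it assumes that "average noise exposure" is a simple arithmetic mean of the measurements assigned to each major SOC group.

```python
from collections import defaultdict


def discordance_rate(reference, assigned):
    """Share of records whose assigned code differs from the reference code."""
    if len(reference) != len(assigned):
        raise ValueError("inputs must be the same length")
    if not reference:
        raise ValueError("inputs must be non-empty")
    mismatches = sum(1 for r, a in zip(reference, assigned) if r != a)
    return mismatches / len(reference)


def mean_by_group(groups, values):
    """Arithmetic mean of values within each group label."""
    totals = defaultdict(lambda: [0.0, 0])
    for g, v in zip(groups, values):
        totals[g][0] += v
        totals[g][1] += 1
    return {g: s / n for g, (s, n) in totals.items()}


def bias_by_group(reference, assigned, values):
    """Difference in group mean (assigned minus reference) for shared groups."""
    ref_means = mean_by_group(reference, values)
    new_means = mean_by_group(assigned, values)
    return {g: new_means[g] - ref_means[g] for g in ref_means if g in new_means}


# Hypothetical toy data: major SOC group per measurement in the reference
# dataset (n0) and after recoding (n1), with made-up noise levels.
n0 = ["47", "47", "51", "51", "53"]
n1 = ["47", "51", "51", "53", "53"]
noise = [90.0, 88.0, 85.0, 83.0, 80.0]

rate = discordance_rate(n0, n1)
bias = bias_by_group(n0, n1, noise)
print(f"discordance rate: {rate:.1%}")
print(bias)
```

A positive bias for a group means the recoded data overestimate that group's average exposure relative to the reference, and a negative bias means an underestimate.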
DOI: http://dx.doi.org/10.1080/15459624.2022.2076860