A Study of Risk in Construction Safety Accident Reports
黄亚春
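Abstract (translated from the Chinese original)
Safety management is of pivotal importance in the construction industry and is one of the core goals of project management. Construction safety accident reports, as records and summaries of accidents, contain rich information about casualties at the accident site. However, previous research on construction safety risk has seldom drawn risk management lessons from accident cases. There is therefore an urgent need to develop comprehensive analysis methods that yield a reliable analysis tool for case text and uncover how different risk factors affect safety accidents.
To this end, drawing on the theory and techniques of natural language processing, this study first applies several data visualization tools to analyze the collected accident report text from multiple perspectives. It then applies currently popular machine learning classification models to classify accident causes from the text and compares the performance of the different models. Building on the classification, the unstructured text descriptions are converted into logically organized structured text, TF-IDF is used to extract the keywords most important to the accident text, and association rules are finally applied to analyze in depth the latent risk factors related to these keywords, revealing how different risk factors affect construction safety accidents.
The CNN model, which this study found to give the best classification performance, can help accelerate the development of automatic text classification in the construction industry and provides a reliable tool and framework for construction safety risk analysis. The relationships identified among different project characteristics offer safety guidance for subsequent construction work and help make safety management in the construction industry more people-oriented and fine-grained.
Keywords: safety accident report; risk analysis; natural language processing; text classification; association rules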
Abstract
As one of the core goals of construction project management, safety management is vital to the construction industry. Text reports of construction accidents contain rich information about the circumstances of each accident. However, previous research on construction risk has seldom drawn on accident case reports to summarize safety management experience. It is therefore necessary to develop comprehensive analysis methods that provide a reliable tool for analyzing report text and uncover the influence of different risk factors on safety accidents.
Drawing on the theory and techniques of Natural Language Processing, this thesis first uses data visualization tools to present the collected report text from several perspectives. Machine learning models are then applied to classify accident causes, and their classification results are compared to evaluate the models. Next, the unstructured text descriptions of accidents are converted into logically organized structured text. TF-IDF is applied to extract the keywords most important to the accident text, and an association rule algorithm then mines these keywords to reveal hidden relations and the influence of risk factors on safety accidents.
The CNN model, which achieved the best classification performance on accident reports in this thesis, can promote the development of automatic text classification in the construction industry and provides a reliable tool and framework for safety risk analysis. The relations among construction factors mined by the association rule algorithm offer guidance for subsequent construction work and help make safety management in the construction industry more people-oriented and fine-grained.
Key Words: Safety Accident Report; Risk Analysis; Natural Language Processing; Text Classification; Association Rule
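The keyword extraction and association rule steps summarized in the abstract can be illustrated with a minimal sketch. It is not the implementation used in the thesis. It assumes the reports are already tokenized, uses a plain TF-IDF definition (term frequency times the log of inverse document frequency), and mines only one-to-one rules. The function names `tfidf_keywords` and `pair_rules`, the thresholds, and the toy `reports` data are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms of each tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    keywords = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keywords.append(set(ranked[:top_k]))
    return keywords


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Mine one-to-one rules (lhs, rhs, support, confidence) from item sets."""
    n = len(transactions)
    item_count = Counter(item for t in transactions for item in t)
    pair_count = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # share of transactions containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Hypothetical, already-tokenized accident descriptions (illustration only).
reports = [
    ["worker", "fell", "scaffold", "no", "harness"],
    ["worker", "fell", "roof", "no", "harness"],
    ["worker", "struck", "crane", "load"],
    ["worker", "fell", "ladder", "harness", "missing"],
]

keywords = tfidf_keywords(reports, top_k=3)
rules = pair_rules(keywords, min_support=0.5, min_confidence=0.6)
```

Each report's keyword set is treated as one transaction, so a rule such as "A implies B" means that reports whose keywords include A tend to include B as well. A term that appears in every report, such as "worker" here, receives a TF-IDF score of zero and therefore never crowds out more distinctive terms.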