Advantages and challenges of applying generative artificial intelligence large models to science popularization for cervical cancer prevention and control
Submitted: 2024-01-24  Revised: 2024-07-05
DOI:
Keywords:  cervical cancer  science popularization  large model  artificial intelligence (AI)  practice
Funding:
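Author  Affiliation  Postal code
杨晰  Department of Gynecology, Cancer Hospital, Chinese Academy of Medical Sciences  100021
黄曼妮  Department of Gynecology, Cancer Hospital, Chinese Academy of Medical Sciences  100021
安菊生  Department of Gynecology, Cancer Hospital, Chinese Academy of Medical Sciences  100021
袁光文  Department of Gynecology, Cancer Hospital, Chinese Academy of Medical Sciences  100021
吕讷男  Department of Gynecology, Cancer Hospital, Chinese Academy of Medical Sciences  100021
李宁  Department of Gynecology, Cancer Hospital, Chinese Academy of Medical Sciences  100021
李斌  Department of Gynecology, Cancer Hospital, Chinese Academy of Medical Sciences  100021
吴令英*  Department of Gynecology, Cancer Hospital, Chinese Academy of Medical Sciences  100021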
Abstract:
      Objective: To evaluate the advantages and potential problems of applying generative artificial intelligence (AI) large models to science popularization on cervical cancer. Methods: Generative Chinese-language AI large models that had been approved for public release were selected. Common science popularization questions about cervical cancer were put to each model through human-machine dialogue to generate popular science texts. Well-known science popularization experts in the field of cervical cancer scored the generated content single-blind on five dimensions (scientific accuracy, logical clarity, practical value, reference basis, and stance and values). Statistical analysis was performed with SPSS version 22. Pairwise differences in model scores were analyzed with paired t-tests, and P<0.05 was considered statistically significant. Special cases noted in the experts' remarks were discussed separately. The texts were also checked on the China National Knowledge Infrastructure (CNKI) academic misconduct detection platform to measure their duplication rate and identify their content sources. Results: The scores of the three models on the five dimensions were, respectively: Model 1: 16.14 ± 0.72, 18.71 ± 0.31, 17.00 ± 0.60, 10.86 ± 2.58, 19.00 ± 0.33, total score 81.71 ± 3.85. Model 2: 16.57 ± 0.46, 17.43 ± 0.70, 17.00 ± 0.60, 10.86 ± 2.58, 18.57 ± 0.70, total score 80.43 ± 3.00. Model 3: 16.29 ± 0.41, 17.86 ± 0.61, 17.14 ± 0.74, 11.43 ± 2.75, 18.86 ± 0.61, total score 81.57 ± 3.92. Pairwise comparisons between models showed no statistically significant differences. The overall mean scores of the five dimensions, from highest to lowest, were stance and values (18.86 ± 0.61), logical clarity (17.86 ± 0.61), practical value (17.14 ± 0.74), scientific accuracy (16.29 ± 0.41), and reference basis (11.43 ± 2.75). The experts raised several concerns: the generated text may vary with changes in question wording, repeated questioning, or the time of questioning; some knowledge points had not been updated; and the generated texts did not provide references. In the duplication check, the total text copy ratios of the three models were 38.6%, 44.9%, and 38.9%, respectively. The texts drew mainly on publicly available internet material, with very little content from professional journals or monographs. Conclusion: Content generated by generative AI large models in response to common cervical cancer questions has some reference value. No seriously misleading content or commercial bias was found, but the reference sources are unclear. More research is needed to determine the models' practical value, and medical professionals need to step up online science popularization to keep the online information ecosystem accurate.
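The abstract reports that model scores were compared pairwise with paired t-tests in SPSS version 22. For readers without SPSS, the sketch below shows roughly how one such comparison could be computed using only the Python standard library. It is an illustration, not the authors' procedure. The function names `t_pdf` and `paired_t_test` and the score lists are assumptions invented for this example, and the scores are not study data. The two-sided p-value is obtained by numerically integrating the Student's t density with Simpson's rule.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def paired_t_test(scores_a, scores_b, steps=2000):
    """Return (t, df, two-sided p) for two paired samples of equal length."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two paired samples of equal length >= 2")
    # The test works on the per-pair differences.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("differences are constant; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    df = n - 1
    # Integrate the density from 0 to |t| with Simpson's rule (steps must be even).
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (total * h / 3)  # P(-|t| < T < |t|)
    p_value = max(0.0, 1.0 - central_area)
    return t, df, p_value


# Hypothetical total scores given by the same raters to two models (NOT study data).
model_a = [82, 79, 85, 80, 84, 78, 83]
model_b = [80, 81, 83, 79, 82, 80, 81]

t, df, p = paired_t_test(model_a, model_b)
print(f"t = {t:.3f}, df = {df}, p = {p:.3f}")
print("significant at 0.05" if p < 0.05 else "not significant at 0.05")
```

With these made-up numbers the paired differences are 2, -2, 2, 1, 2, -2, 2, which gives t = 1.0 on 6 degrees of freedom and a p-value of about 0.36, so this pair would be reported as not significantly different at the 0.05 threshold. In practice a statistics package would normally be used instead of hand-rolled integration.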