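On 30 May 2024, the Singapore government released the Model AI Governance Framework for Generative AI (the "Generative AI Framework"). The framework builds on the policy ideas highlighted in the discussion paper on generative AI jointly published by Singapore's Infocomm Media Development Authority (IMDA), Aicadium (a company that provides AI solutions), and the AI Verify Foundation. It also draws on insights and discussions from major jurisdictions, international organisations, the research community, and leading AI organisations, and reflects emerging principles, concerns, and technological developments in generative AI.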

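The framework proposes examining the development of generative AI holistically along nine dimensions:

  • Accountability: put the right incentives in place for the different players across the AI system development life cycle to be responsible to end-users.

  • Data: ensure data quality and handle potentially contentious training data pragmatically, because data is core to model development.

  • Trusted Development and Deployment: improve transparency around baseline safety and hygiene measures, based on industry best practices in development, evaluation, and disclosure.

  • Incident Reporting: implement an incident management system for timely notification, remediation, and continuous improvement, because no AI system is foolproof.

  • Testing and Assurance: provide external validation and added trust through third-party testing, and develop common AI testing standards to ensure consistency.

  • Security: address new threat vectors that arise through generative AI models.

  • Content Provenance: transparency about where content comes from gives end-users a useful signal.

  • Safety and Alignment R&D: accelerate R&D through global cooperation among AI safety institutes to improve the alignment of models with human intentions and values.

  • AI for Public Good: responsible AI includes harnessing AI for the public good by democratising access, improving public sector adoption, upskilling workers, and developing AI systems sustainably.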

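The framework also emphasises the central role of data in generative AI:

  • Importance of data quality: Data is a core element in developing generative AI models and directly affects the quality of model output. The quality of input data is therefore critical to producing high-quality, trustworthy output. The framework recommends ensuring data quality through the use of trusted data sources, which improves model accuracy and reliability and helps prevent bias and misinformation caused by data problems.

  • Use of personal data and copyrighted content: Using personal data and copyright material to train generative AI can be contentious. The framework states that businesses need clear guidance on using such data lawfully and compliantly. Specifically, the data should be handled pragmatically, so that its use is lawful and does not infringe personal privacy or intellectual property rights. This approach supports model development while protecting the rights of data subjects.

  • Allocation of data responsibility: Developing and deploying generative AI involves multiple layers of the tech stack, so the allocation of responsibility for data may not be immediately clear. The framework stresses the need to incentivise every party along the AI development chain to be responsible to end-users and to ensure data is used lawfully and compliantly. This allocation covers not only data collection and processing but also data storage and sharing.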

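To achieve effective data governance for generative AI, the framework sets out a series of specific measures:

  • Choice of data sources: use high-quality, trusted data sources, and avoid personal data and copyright material whose use may be contentious.

  • Data transparency: make data use transparent so that users understand where data comes from and how it is used.

  • Data protection mechanisms: establish sound data protection mechanisms to prevent data leaks and misuse and to protect user privacy and intellectual property rights.

  • Data quality assessment: assess data quality regularly to ensure that model training data is accurate and reliable.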
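The measures above can be made concrete with a periodic check over training records. The following is a minimal Python sketch, not something the framework prescribes: the `Record` fields, the `TRUSTED_SOURCES` allow-list, and the `assess` function are hypothetical names introduced for illustration, and the personal-data flag is assumed to be set by an upstream scanner.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical allow-list; the framework does not name specific sources.
TRUSTED_SOURCES = {"internal-curated", "licensed-news"}


@dataclass
class Record:
    text: str
    source: Optional[str]            # where the record came from, if known
    has_personal_data: bool = False  # assumed to be set by an upstream scanner


def assess(records):
    """Return simple counts that a periodic data-quality review could track."""
    report = {"total": len(records), "empty": 0, "duplicate": 0,
              "untrusted_source": 0, "personal_data": 0}
    seen = set()
    for r in records:
        # Normalise whitespace and case so near-identical texts count as duplicates.
        normalized = " ".join(r.text.split()).lower()
        if not normalized:
            report["empty"] += 1
        elif normalized in seen:
            report["duplicate"] += 1
        else:
            seen.add(normalized)
        # Records with an unknown or unlisted source are flagged for review.
        if r.source not in TRUSTED_SOURCES:
            report["untrusted_source"] += 1
        if r.has_personal_data:
            report["personal_data"] += 1
    return report
```

A report like this only surfaces candidates for review; deciding whether a flagged record may be used remains a legal and policy judgment.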

The Executive Summary of the framework document follows.

Generative AI has captured the world’s imagination. While it holds significant transformative potential, it also comes with risks. Building a trusted ecosystem is therefore critical — it helps people embrace AI with confidence, gives maximal space for innovation, and serves as a core foundation to harnessing AI for the Public Good. AI, as a whole, is a technology that has been developing over the years. Prior development and deployment is sometimes termed traditional AI.

To lay the groundwork to promote the responsible use of traditional AI, Singapore released the first version of the Model AI Governance Framework in 2019, and updated it subsequently in 2020. The recent advent of generative AI has reinforced some of the same AI risks (e.g., bias, misuse, lack of explainability), and introduced new ones (e.g., hallucination, copyright infringement, value alignment). These concerns were highlighted in our earlier Discussion Paper on Generative AI: Implications for Trust and Governance, issued in June 2023. The discussions and feedback have been instructive. Existing governance frameworks need to be reviewed to foster a broader trusted ecosystem. A careful balance needs to be struck between protecting users and driving innovation. There have also been various international discussions pulling in the related and pertinent topics of accountability, copyright and misinformation, among others. These issues are interconnected and need to be viewed in a practical and holistic manner. No single intervention will be a silver bullet.

This Model AI Governance Framework for Generative AI therefore seeks to set forth a systematic and balanced approach to address generative AI concerns while continuing to facilitate innovation. It requires all key stakeholders, including policymakers, industry, the research community and the broader public, to collectively do their part. There are nine dimensions which the Framework proposes to be looked at in totality, to foster a trusted ecosystem.

a) Accountability — Accountability is a key consideration to incentivise players along the AI development chain to be responsible to end-users. In doing so, we recognise that generative AI, like most software development, involves multiple layers in the tech stack, and hence the allocation of responsibility may not be immediately clear. While generative AI development has unique characteristics, useful parallels can still be drawn with today’s cloud and software development stacks, and initial practical steps can be taken.

b) Data — Data is a core element of model development. It significantly impacts the quality of the model output. Hence, what is fed to the model is important and there is a need to ensure data quality, such as through the use of trusted data sources. In cases where the use of data for model training is potentially contentious, such as personal data and copyright material, it is also important to give business clarity, ensure fair treatment, and to do so in a pragmatic way.

c) Trusted Development and Deployment — Model development, and the application deployment on top of it, are at the core of AI-driven innovation. Notwithstanding the limited visibility that end-users may have, meaningful transparency around the baseline safety and hygiene measures undertaken is key. This involves industry adopting best practices in development, evaluation, and thereafter “food label”-type transparency and disclosure. This can enhance broader awareness and safety over time.

d) Incident Reporting — Even with the most robust development processes and safeguards, no software we use today is completely foolproof. The same applies to AI. Incident reporting is an established practice, and allows for timely notification and remediation. Establishing structures and processes to enable incident monitoring and reporting is therefore key. This also supports continuous improvement of AI systems.

e) Testing and Assurance — For a trusted ecosystem, third-party testing and assurance plays a complementary role. We do this today in many domains, such as finance and healthcare, to enable independent verification. Although AI testing is an emerging field, it is valuable for companies to adopt third-party testing and assurance to demonstrate trust with their end-users. It is also important to develop common standards around AI testing to ensure quality and consistency.

f) Security — Generative AI introduces the potential for new threat vectors against the models themselves. This goes beyond security risks inherent in any software stack. While this is a nascent area, existing frameworks for information security need to be adapted and new testing tools developed to address these risks.

g) Content Provenance — AI-generated content, because of the ease with which it can be created, can exacerbate misinformation. Transparency about where and how content is generated enables end-users to determine how to consume online content in an informed manner. Governments are looking to technical solutions like digital watermarking and cryptographic provenance. These technologies need to be used in the right context.

h) Safety and Alignment Research & Development (R&D) — The state-of-the-science today for model safety does not fully cover all risks. Accelerated investment in R&D is required to improve model alignment with human intention and values. Global cooperation among AI safety R&D institutes will be critical to optimise limited resources for maximum impact, and keep pace with commercially driven growth in model capabilities.

i) AI for Public Good — Responsible AI goes beyond risk mitigation. It is also about uplifting and empowering our people and businesses to thrive in an AI-enabled future. Democratising AI access, improving public sector AI adoption, upskilling workers and developing AI systems sustainably will support efforts to steer AI towards the Public Good.

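Item g) above mentions cryptographic provenance. As a rough illustration of the idea, the sketch below binds a piece of content to a statement of its origin and lets a verifier detect tampering. It is a simplification under stated assumptions: the function names `make_manifest` and `verify_manifest` and the manifest layout are hypothetical, and a shared-key HMAC from the Python standard library stands in for the public-key signatures that real provenance schemes would normally use.

```python
import hashlib
import hmac
import json


def make_manifest(content: bytes, generator: str, key: bytes) -> dict:
    """Sign a hash of the content together with a claim about its origin."""
    claims = {"sha256": hashlib.sha256(content).hexdigest(),
              "generator": generator}
    # Serialise deterministically so signer and verifier hash the same bytes.
    payload = json.dumps(claims, sort_keys=True).encode("utf-8")
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"claims": claims, "signature": signature}


def verify_manifest(content: bytes, manifest: dict, key: bytes) -> bool:
    """Check that the manifest is authentic and matches the content."""
    claims = manifest["claims"]
    payload = json.dumps(claims, sort_keys=True).encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(expected, manifest["signature"]):
        return False
    return claims["sha256"] == hashlib.sha256(content).hexdigest()
```

Verification fails if either the content or the origin claim is altered after signing. With a shared key, anyone able to verify can also sign, which is one reason deployed schemes would favour asymmetric signatures.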

References:

1. https://aiverifyfoundation.sg/wp-content/uploads/2024/05/Model-AI-Governance-Framework-for-Generative-AI-May-2024-1-1.pdf

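Notice: This article is from 数据信任与治理 (Data Trust and Governance), and copyright belongs to the author. The content represents only the author's independent views and not the position of 安全内参; it is reposted to share information. For infringement concerns, contact anquanneican@163.com.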