
Commit c457d64

Merge pull request #3 from chrisn/chrisn-typos
Fix a few minor typos
2 parents 0593c8a + f970dfb commit c457d64

File tree

1 file changed: +6 -6 lines changed

index.html

Lines changed: 6 additions & 6 deletions
@@ -117,7 +117,7 @@ <h2>Executive Summary</h2>
 <li><a href="#b3">exposing model-backed Web APIs</a>,</li>
 <li><a href="#b4">personal data stores</a> to reduce risk of private data exposure,</li>
 <li><a href="#b5">strengthening credentials and identity mechanisms</a> in light of new impersonation risks,</li>
-<li>an <a href="#b6">evalualation framework for the environmental impact of Web standards</a>,</li>
+<li>an <a href="#b6">evaluation framework for the environmental impact of Web standards</a>,</li>
 <li>a <a href="#b8">framework to manage interoperability based on model inference</a>, including for non-deterministic models.</li>
 </ul>
 <p>We are <a href="https://github.com/w3c/ai-web-impact/issues">seeking input</a> from the community on proposals that could help make progress on these topics, and on other topics that this document has failed to identify.</p>
@@ -144,7 +144,7 @@ <h3>Terminology</h3>
 The term "Artificial Intelligence" covers a very broad spectrum of algorithms, techniques and technologies. [[ISO/IEC-22989]] defines <dfn>Artificial Intelligence</dfn> as "research and development of mechanisms and applications of [=AI systems=]", with <dfn data-lt="Artificial Intelligence system">AI system</dfn>s being "an engineered system that generates outputs such as content, forecasts, recommendations or decisions for a given set of human-defined objectives". At the time of the writing of this document in early 2024, the gist of the Web ecosystem conversation on Artificial Intelligence is mostly about systems based on <dfn>Machine Learning</dfn> ("process of optimizing model parameters through computational techniques, such that the model's behavior reflects the data or experience") and its software manifestation, <dfn data-lt="model">Machine Learning model</dfn>s ("mathematical construct that generates an inference or prediction based on input data or information").
 </p>
 <p>
-While we acknowledge the much broader meaning of Artificial Intelligence and its intersection with a number of other Web- and W3C-related activities (e.g., the Semantic Web and Linked Data), this document willfully focuses only on the current conversation around the impact that these [=Machine Learning models=] are bringing to the Web. We further acknowledge that this document has been developed during, and is partially a response to, a cycle of inflated expectations and investments in that space.That situation underlines the need for a framework to structure the conversation.
+While we acknowledge the much broader meaning of Artificial Intelligence and its intersection with a number of other Web- and W3C-related activities (e.g., the Semantic Web and Linked Data), this document willfully focuses only on the current conversation around the impact that these [=Machine Learning models=] are bringing to the Web. We further acknowledge that this document has been developed during, and is partially a response to, a cycle of inflated expectations and investments in that space. That situation underlines the need for a framework to structure the conversation.
 </p>
 <p>
 Because of this focus on [=Machine Learning=], this document analyzes AI impact through the two main phases needed to operate [=Machine Learning models=]: <dfn>training</dfn> ("process to determine or to improve the parameters of a Machine Learning model, based on a Machine Learning algorithm, by using training data") and <dfn data-lt="run|running">inference</dfn> (the actual usage of these models to produce their expected outcomes), which we also casually refer as running a model.
@@ -319,7 +319,7 @@ <h3>Balancing content creators incentives and consumers rights</h3>
 The controversy that has emerged from that situation is being debated (and in some cases, arbitrated) through the lens of copyright law.
 </p>
 <p>
-It is not our place to determine if and how various copyright legislation bears on that particular usage. Beyond legal considerations, the copyright system creates a (relatively) shared understanding between creators and consumers that, by default, content cannot be redistributed, remixed, adapted or built upon without creators consent. This shared understanding made it possible for a lot of content to be openly distributed on the Web. It also allowed creators to consider a variety of monetization options (subscription, pay per view, advertising) for their content grounded on the assumption that consumers will always reach their pages.
+It is not our place to determine if and how various copyright legislation bears on that particular usage. Beyond legal considerations, the copyright system creates a (relatively) shared understanding between creators and consumers that, by default, content cannot be redistributed, remixed, adapted or built upon without creators' consent. This shared understanding made it possible for a lot of content to be openly distributed on the Web. It also allowed creators to consider a variety of monetization options (subscription, pay per view, advertising) for their content grounded on the assumption that consumers will always reach their pages.
 </p>
 <p>
 A number of [=AI systems=] combine (1) automated large-scale consumption of Web content, and (2) production at scale of content, in ways that do not recognize or otherwise compensate content it was trained from.
@@ -354,7 +354,7 @@ <h4>Comparison with search engines</h4>
 Over time, in addition to the links to sites matching the user's query, search engines have integrated more ways to surface content directly from the target Web sites: either through rich snippets (typically made possible by the use of schema.org metadata) or through embedded preview (e.g., what the <a href="https://amp.dev/">AMP project</a> enabled). These changes were frequently accompanied by sometimes challenging discussions around the balance between bringing additional visibility to crawled content and reducing the incentive from end-users to visit the source website (e.g., because they may have already received sufficient information from the search results page).
 </p>
 <p>
-In a certain number of cases, [=AI systems=] are used as an alternative or complement to what users would traditionally have used a search engine for (and indeed, are increasingly integrated into search engines interfaces). So it seems useful to explore to what extent the lessons learned from the evolutionary process balancing the needs from search engines and from content creators can inform the discussion on crawlers used to train [=Machine Learning models=].
+In a certain number of cases, [=AI systems=] are used as an alternative or complement to what users would traditionally have used a search engine for (and indeed, are increasingly integrated into search engine interfaces). So it seems useful to explore to what extent the lessons learned from the evolutionary process balancing the needs from search engines and from content creators can inform the discussion on crawlers used to train [=Machine Learning models=].
 </p>
 <p>
 In making that comparison, it's also important to note significant differences:
@@ -363,7 +363,7 @@ <h4>Comparison with search engines</h4>
 
 <li>The implicit contract that content creators expect from search engines crawlers –i.e., that they will bring exposure to their content– does not have a systematic equivalent for content integrated into [=AI systems=]; while some such systems are gaining the ability to point back to the source of their training data used in a given [=inference=], this is hardly a widespread feature of these systems, nor is it obvious it could be applied systematically (e.g., would linking back to sources for a generated image even make sense?)
 
-<li><code>robots.txt</code> directives allow to give specific rules to specific crawlers based on their user agent; while this has been practically manageable when dealing with (for better or for worse) few well-known search engine crawlers, expecting content creators to maintain potential allow- and block-list of the rapidly expanding number of crawlers deployed to retrieve training data seems unlikely to achieve sustainable results
+<li><code>robots.txt</code> directives allow specific rules to be given to specific crawlers based on their user agent; while this has been practically manageable when dealing with (for better or for worse) few well-known search engine crawlers, expecting content creators to maintain potential allow- and block-lists of the rapidly expanding number of crawlers deployed to retrieve training data seems unlikely to achieve sustainable results
 </li>
 </ul>
 <p>
@@ -397,7 +397,7 @@ <h2 id=interop>Impact of [=AI systems=] on interoperability</h2>
 </p>
 <div id=b8 class=advisement>
 <p>
-As discussed above, [=Machine Learning models=] are already finding their ways in standardized Web APIs. These creates two challenges to our interoperability goals:
+As discussed above, [=Machine Learning models=] are already finding their way into standardized Web APIs. These creates two challenges to our interoperability goals:
 </p>
 <ul>
 
0 commit comments
