Merge pull request #20 from w3c/xfq-patch-1
Typo fix
dontcallmedom authored Mar 21, 2024
2 parents a487db5 + 7f74d0c commit 7dae4e2
Showing 1 changed file with 1 addition and 1 deletion.
index.html: 2 changes (1 addition, 1 deletion)
@@ -370,7 +370,7 @@ <h4>Comparison with search engines</h4>
 </p>
 <ul>

-<li>The implicit contract that content creators expect from search engines crawlers –i.e., that they will bring exposure to their content– does not have a systematic equivalent for content integrated into [=AI systems=]; while some such systems are gaining the ability to point back to the source of their training data used in a given [=inference=], this is hardly a widespread feature of these systems, nor is it obvious it could be applied systematically (e.g., would linking back to sources for a generated image even make sense?); even if it could, fewer sources would likely be exposed than in a typical search engine results page, and the incentives for the user to follow the links would likely be sustantially lower.
+<li>The implicit contract that content creators expect from search engines crawlers –i.e., that they will bring exposure to their content– does not have a systematic equivalent for content integrated into [=AI systems=]; while some such systems are gaining the ability to point back to the source of their training data used in a given [=inference=], this is hardly a widespread feature of these systems, nor is it obvious it could be applied systematically (e.g., would linking back to sources for a generated image even make sense?); even if it could, fewer sources would likely be exposed than in a typical search engine results page, and the incentives for the user to follow the links would likely be substantially lower.

 <li><code>robots.txt</code> directives allow specific rules to be given to specific crawlers based on their user agent; while this has been practically manageable when dealing with (for better or for worse) few well-known search engine crawlers, expecting content creators to maintain potential allow- and block-lists of the rapidly expanding number of crawlers deployed to retrieve training data seems unlikely to achieve sustainable results.
 </li>
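
For context on the second list item in the changed section: robots.txt scopes its rules to crawlers through User-agent groups, so welcoming search crawlers while blocking a growing population of training-data crawlers means naming each one individually. A minimal sketch of such a file follows; ExampleAIBot is a hypothetical user agent used purely for illustration, and only Googlebot names a real crawler.

    # Well-known search crawler: allowed, in exchange for exposure
    User-agent: Googlebot
    Allow: /

    # Hypothetical training-data crawler, blocked by name
    User-agent: ExampleAIBot
    Disallow: /

    # Every newly deployed crawler needs another group like the one
    # above; anything unlisted falls through to this default
    User-agent: *
    Allow: /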
