diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses.md new file mode 100644 index 00000000..0eb3460d --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses.md @@ -0,0 +1,69 @@ +# Module: contrastive_losses + + + + + View source +on GitHub + +Contrastive losses. + +Users of TF-GNN can use these layers by importing them next to the core library: + +```python +import tensorflow_gnn as tfgnn +from tensorflow_gnn.models import contrastive_losses +``` + +## Classes + +[`class AllSvdMetrics`](./contrastive_losses/AllSvdMetrics.md): Computes +multiple metrics for representations using one SVD call. + +[`class BarlowTwinsTask`](./contrastive_losses/BarlowTwinsTask.md): A Barlow +Twins (BT) Task. + +[`class ContrastiveLossTask`](./contrastive_losses/ContrastiveLossTask.md): Base +class for unsupervised contrastive representation learning tasks. + +[`class CorruptionSpec`](./contrastive_losses/CorruptionSpec.md): Class for +defining corruption specification. + +[`class Corruptor`](./contrastive_losses/Corruptor.md): Base class for graph +corruptor. + +[`class DeepGraphInfomaxLogits`](./contrastive_losses/DeepGraphInfomaxLogits.md): +Computes clean and corrupted logits for Deep Graph Infomax (DGI). + +[`class DeepGraphInfomaxTask`](./contrastive_losses/DeepGraphInfomaxTask.md): A +Deep Graph Infomax (DGI) Task. + +[`class DropoutFeatures`](./contrastive_losses/DropoutFeatures.md): Base class +for graph corruptor. + +[`class ShuffleFeaturesGlobally`](./contrastive_losses/ShuffleFeaturesGlobally.md): +A corruptor that shuffles features. + +[`class TripletEmbeddingSquaredDistances`](./contrastive_losses/TripletEmbeddingSquaredDistances.md): +Computes embeddings distance between positive and negative pairs. + +[`class TripletLossTask`](./contrastive_losses/TripletLossTask.md): The triplet +loss task. 
+ +[`class VicRegTask`](./contrastive_losses/VicRegTask.md): A VICReg Task. + +## Functions + +[`coherence(...)`](./contrastive_losses/coherence.md): Coherence metric +implementation. + +[`numerical_rank(...)`](./contrastive_losses/numerical_rank.md): Numerical rank +implementation. + +[`pseudo_condition_number(...)`](./contrastive_losses/pseudo_condition_number.md): +Pseudo-condition number metric implementation. + +[`rankme(...)`](./contrastive_losses/rankme.md): RankMe metric implementation. + +[`self_clustering(...)`](./contrastive_losses/self_clustering.md): +Self-clustering metric implementation. diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/AllSvdMetrics.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/AllSvdMetrics.md new file mode 100644 index 00000000..5d7964d3 --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/AllSvdMetrics.md @@ -0,0 +1,202 @@ +# contrastive_losses.AllSvdMetrics + + + + + View source +on GitHub + +Computes multiple metrics for representations using one SVD call. + + + + + +Refer to https://arxiv.org/abs/2305.16562 for more details. + + + + + + + + + + + + + + + + + +
+fns + +a mapping from a metric name to a Callable that accepts +representations as well as the result of their SVD decomposition. +Currently only singular values are passed. +
+y_pred_transform_fn + +a function to extract clean representations +from model predictions. By default, no transformation is applied. +
+name + +Name for the metric class, used for Keras bookkeeping. +
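The point of this class is that the SVD is computed once and its singular values are shared by every metric. A minimal numpy sketch of that idea — the names, signatures, and metric formulas below are illustrative, not the library API:

```python
import numpy as np

def all_svd_metrics(representations, fns):
    # Compute the SVD once; every metric function reuses the singular values.
    sigma = np.linalg.svd(representations, compute_uv=False)
    return {name: fn(sigma) for name, fn in fns.items()}

fns = {
    # Numerical-rank estimate: trace(X^T X) / largest eigenvalue of X^T X.
    "numerical_rank": lambda s: float(np.sum(s**2) / np.max(s) ** 2),
    # Effective rank via the entropy of the normalized singular values.
    "rankme": lambda s: float(np.exp(-np.sum((s / s.sum()) * np.log(s / s.sum())))),
}
metrics = all_svd_metrics(np.eye(3), fns)
```

A rank-deficient input would lower both values; the sketch omits the epsilon regularization that the real metric implementations apply.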
+ +## Methods + +

merge_state

+ + + +Merges the state from one or more metrics. + +This method can be used by distributed systems to merge the state computed by +different metric instances. Typically the state will be stored in the form of +the metric's weights. For example, a tf.keras.metrics.Mean metric contains a +list of two weight values: a total and a count. If there were two instances of a +tf.keras.metrics.Accuracy that each independently aggregated partial state for +an overall accuracy calculation, these two metrics' states could be combined as +follows: + +``` +>>> m1 = tf.keras.metrics.Accuracy() +>>> _ = m1.update_state([[1], [2]], [[0], [2]]) +``` + +``` +>>> m2 = tf.keras.metrics.Accuracy() +>>> _ = m2.update_state([[3], [4]], [[3], [4]]) +``` + +``` +>>> m2.merge_state([m1]) +>>> m2.result().numpy() +0.75 +``` + +
Args
+metrics + +an iterable of metrics. The metrics must have compatible +state. +
+ + + + + + + + + + + +
Raises
+ValueError + +If the provided iterable does not contain metrics matching +the metric's required specifications. +
+ +

reset_state

+ +View +source + + + +Resets all of the metric state variables. + +This function is called between epochs/steps, when a metric is evaluated during +training. + +

result

+ +View +source + + + +Computes and returns the scalar metric value tensor or a dict of scalars. + +Result computation is an idempotent operation that simply calculates the metric +value using the state variables. + + + + + + + + + + +
Returns
+A scalar tensor, or a dictionary of scalar tensors. +
+ +

update_state

+ +View +source + + + +Accumulates statistics for the metric. + +Note: This function is executed as a graph function in graph mode. This means: +a) Operations on the same resource are executed in textual order. This should +make it easier to do things like add the updated value of a variable to another, +for example. b) You don't need to worry about collecting the update ops to +execute. All update ops added to the graph by this function will be executed. As +a result, code should generally work the same way with graph or eager execution. + + + + + + + + + + + + +
Args
*args + +
+**kwargs + +A mini-batch of inputs to the Metric. +
diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/BarlowTwinsTask.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/BarlowTwinsTask.md new file mode 100644 index 00000000..8ef94ef3 --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/BarlowTwinsTask.md @@ -0,0 +1,168 @@ +# contrastive_losses.BarlowTwinsTask + + + + + View source +on GitHub + +A Barlow Twins (BT) Task. + +Inherits From: +[`ContrastiveLossTask`](../contrastive_losses/ContrastiveLossTask.md) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+node_set_name + +Name of the node set for readout. +
+feature_name + +Feature name for readout. +
+representations_layer_name + +Layer name for uncorrupted representations. +
+corruptor + +Corruptor instance for creating negative samples. If not +specified, we use ShuffleFeaturesGlobally by default. +
+projector_units + +Sequence of layer sizes for projector network. +Projectors prevent dimensional collapse, but can hinder training for +easy corruptions. For more details, see +https://arxiv.org/abs/2304.12210. +
+seed + +Random seed for the default corruptor (ShuffleFeaturesGlobally). +
+ +## Methods + +

losses

+ +View +source + + + +Returns arbitrary task specific losses. + +

make_contrastive_layer

+ +View +source + + + +Returns the layer contrasting clean outputs with the corrupted ones. + +

metrics

+ +View +source + + + +Returns arbitrary task specific metrics. + +

predict

+ +View +source + + + +Apply a readout head for use with various contrastive losses. + + + + + + + + + + + +
Args
+*args + +A tuple of (clean, corrupted) tfgnn.GraphTensors. +
+ + + + + + + + + + +
Returns
+The logits for some contrastive loss as produced by the implementing +subclass. +
+ +

preprocess

+ +View +source + + + +Applies a `Corruptor` and returns empty pseudo-labels. diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/ContrastiveLossTask.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/ContrastiveLossTask.md new file mode 100644 index 00000000..f25e9ec3 --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/ContrastiveLossTask.md @@ -0,0 +1,186 @@ +# contrastive_losses.ContrastiveLossTask + + + + + View source +on GitHub + +Base class for unsupervised contrastive representation learning tasks. + + + + + +The process is separated into preprocessing and contrastive parts, with the +focus on reusability of individual components. The `preprocess` produces input +GraphTensors to be used with the `predict` as well as labels for the task. The +default `predict` method implementation expects a pair of positive and negative +GraphTensors. There are multiple ways proposed in the literature to learn +representations based on the activations - we achieve that by using custom +losses. + +Any subclass must implement `make_contrastive_layer` method, which produces the +final prediction outputs. + +If the loss involves labels for each example, subclasses should leverage +`losses` and `metrics` methods to specify task's losses. When the loss only +involves model outputs, `make_contrastive_layer` should output both positive and +perturb examples, and the `losses` should use pseudolabels. + +Any model-specific preprocessing should be implemented in the `preprocess`. + + + + + + + + + + + + + + + + + + + + + + + + + + +
+node_set_name + +Name of the node set for readout. +
+feature_name + +Feature name for readout. +
+representations_layer_name + +Layer name for uncorrupted representations. +
+corruptor + +Corruptor instance for creating negative samples. If not +specified, we use ShuffleFeaturesGlobally by default. +
+projector_units + +Sequence of layer sizes for projector network. +Projectors prevent dimensional collapse, but can hinder training for +easy corruptions. For more details, see +https://arxiv.org/abs/2304.12210. +
+seed + +Random seed for the default corruptor (ShuffleFeaturesGlobally). +
+ +## Methods + +

losses

+ + + +Returns arbitrary task specific losses. + +

make_contrastive_layer

+ +View +source + + + +Returns the layer contrasting clean outputs with the corrupted ones. + +

metrics

+ +View +source + + + +Returns arbitrary task specific metrics. + +

predict

+ +View +source + + + +Apply a readout head for use with various contrastive losses. + + + + + + + + + + + +
Args
+*args + +A tuple of (clean, corrupted) tfgnn.GraphTensors. +
+ + + + + + + + + + +
Returns
+The logits for some contrastive loss as produced by the implementing +subclass. +
+ +

preprocess

+ +View +source + + + +Applies a `Corruptor` and returns empty pseudo-labels. diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/CorruptionSpec.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/CorruptionSpec.md new file mode 100644 index 00000000..ead30912 --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/CorruptionSpec.md @@ -0,0 +1,87 @@ +# contrastive_losses.CorruptionSpec + + + + + View source +on GitHub + +Class for defining corruption specification. + + + + + +This has three fields for specifying the corruption behavior of node-, edge-, +and context sets. + +A value of the key "*" is a wildcard value that is used for either all features +or all node/edge sets. + +#### Some example usages: + +Want: corrupt everything with parameter 1.0. Solution: either set default to 1.0 +or set all corruption specs to `{"*": 1.}`. + +Want: corrupt all context features with parameter 1.0 except for "feat", which +should not be corrupted. Solution: set `context_corruption` to `{"feat": 0., +"*": 1.}` + + + + + + + + + + + + + + + + + +
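The wildcard behavior described above amounts to a per-feature lookup with a `"*"` fallback. A plain-Python sketch of that resolution — `resolve` is a hypothetical helper, not part of the API:

```python
def resolve(spec, feature_name):
    # An exact feature match takes precedence; otherwise fall back
    # to the "*" wildcard entry.
    return spec.get(feature_name, spec.get("*"))

# Corrupt all context features with parameter 1.0, except "feat",
# which should not be corrupted.
context_corruption = {"feat": 0.0, "*": 1.0}
```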
+node_set_corruption + +Dataclass field +
+edge_set_corruption + +Dataclass field +
+context_corruption + +Dataclass field +
+ +## Methods + +

with_default

+ +View +source + + + +

__eq__

+ + + +Return self==value. diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/Corruptor.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/Corruptor.md new file mode 100644 index 00000000..89a71bb3 --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/Corruptor.md @@ -0,0 +1,58 @@ +# contrastive_losses.Corruptor + + + + + View source +on GitHub + +Base class for graph corruptor. + + + + + + + + + + + + + + + + + + + + + + + +
+corruption_spec + +A spec for corruption application. +
+corruption_fn + +Corruption function. +
+default + +Global application default of the corruptor. This is only used +when corruption_spec is None. +
+**kwargs + +Additional keyword arguments. +
diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/DeepGraphInfomaxLogits.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/DeepGraphInfomaxLogits.md new file mode 100644 index 00000000..60a512a3 --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/DeepGraphInfomaxLogits.md @@ -0,0 +1,17 @@ +# contrastive_losses.DeepGraphInfomaxLogits + + + + + View source +on GitHub + +Computes clean and corrupted logits for Deep Graph Infomax (DGI). + + + + diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/DeepGraphInfomaxTask.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/DeepGraphInfomaxTask.md new file mode 100644 index 00000000..1b35ea21 --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/DeepGraphInfomaxTask.md @@ -0,0 +1,165 @@ +# contrastive_losses.DeepGraphInfomaxTask + + + + + View source +on GitHub + +A Deep Graph Infomax (DGI) Task. + +Inherits From: +[`ContrastiveLossTask`](../contrastive_losses/ContrastiveLossTask.md) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+node_set_name + +Name of the node set for readout. +
+feature_name + +Feature name for readout. +
+representations_layer_name + +Layer name for uncorrupted representations. +
+corruptor + +Corruptor instance for creating negative samples. If not +specified, we use ShuffleFeaturesGlobally by default. +
+projector_units + +Sequence of layer sizes for projector network. +Projectors prevent dimensional collapse, but can hinder training for +easy corruptions. For more details, see +https://arxiv.org/abs/2304.12210. +
+seed + +Random seed for the default corruptor (ShuffleFeaturesGlobally). +
+ +## Methods + +

losses

+ +View +source + + + +Returns arbitrary task specific losses. + +

make_contrastive_layer

+ +View +source + + + +Returns the layer contrasting clean outputs with the corrupted ones. + +

metrics

+ +View +source + + + +Returns arbitrary task specific metrics. + +

predict

+ +View +source + + + +Apply a readout head for use with various contrastive losses. + + + + + + + + + + + +
Args
+*args + +A tuple of (clean, corrupted) tfgnn.GraphTensors. +
+ + + + + + + + + + +
Returns
+The logits for some contrastive loss as produced by the implementing +subclass. +
+ +

preprocess

+ +View +source + + + +Creates labels--i.e., (positive, negative)--for Deep Graph Infomax. diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/DropoutFeatures.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/DropoutFeatures.md new file mode 100644 index 00000000..161ecffb --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/DropoutFeatures.md @@ -0,0 +1,56 @@ +# contrastive_losses.DropoutFeatures + + + + + View source +on GitHub + +Base class for graph corruptor. + +Inherits From: [`Corruptor`](../contrastive_losses/Corruptor.md) + + + + + + + + + + + + + + + + + + + + + + + +
+corruption_spec + +A spec for corruption application. +
+corruption_fn + +Corruption function. +
+default + +Global application default of the corruptor. This is only used +when corruption_spec is None. +
+**kwargs + +Additional keyword arguments. +
diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/ShuffleFeaturesGlobally.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/ShuffleFeaturesGlobally.md new file mode 100644 index 00000000..88c73fa3 --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/ShuffleFeaturesGlobally.md @@ -0,0 +1,60 @@ +# contrastive_losses.ShuffleFeaturesGlobally + + + + + View source +on GitHub + +A corruptor that shuffles features. + +Inherits From: [`Corruptor`](../contrastive_losses/Corruptor.md) + + + + + +NOTE: this function does not currently support TPUs. Consider using other +corruptor functions if executing on TPUs. See b/269249455 for reference. + + + + + + + + + + + + + + + + + + + + +
+corruption_spec + +A spec for corruption application. +
+corruption_fn + +Corruption function. +
+default + +Global application default of the corruptor. This is only used +when corruption_spec is None. +
+**kwargs + +Additional keyword arguments. +
diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/TripletEmbeddingSquaredDistances.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/TripletEmbeddingSquaredDistances.md new file mode 100644 index 00000000..7af68145 --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/TripletEmbeddingSquaredDistances.md @@ -0,0 +1,17 @@ +# contrastive_losses.TripletEmbeddingSquaredDistances + + + + + View source +on GitHub + +Computes embeddings distance between positive and negative pairs. + + + + diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/TripletLossTask.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/TripletLossTask.md new file mode 100644 index 00000000..82e5af85 --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/TripletLossTask.md @@ -0,0 +1,200 @@ +# contrastive_losses.TripletLossTask + + + + + View source +on GitHub + +The triplet loss task. + +Inherits From: +[`ContrastiveLossTask`](../contrastive_losses/ContrastiveLossTask.md) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+node_set_name + +Name of the node set for readout. +
+feature_name + +Feature name for readout. +
+representations_layer_name + +Layer name for uncorrupted representations. +
+corruptor + +Corruptor instance for creating negative samples. If not +specified, we use ShuffleFeaturesGlobally by default. +
+projector_units + +Sequence of layer sizes for projector network. +Projectors prevent dimensional collapse, but can hinder training for +easy corruptions. For more details, see +https://arxiv.org/abs/2304.12210. +
+seed + +Random seed for the default corruptor (ShuffleFeaturesGlobally). +
+ +## Methods + +

losses

+ +View +source + + + +Returns arbitrary task specific losses. + +

make_contrastive_layer

+ +View +source + + + +Returns the layer contrasting clean outputs with the corrupted ones. + +

metrics

+ +View +source + + + +Returns arbitrary task specific metrics. + +

predict

+ +View +source + + + +Apply a readout head for use with triplet contrastive loss. + + + + + + + + + + + +
Args
+*args + +A tuple of (anchor, positive_sample, negative_sample) +tfgnn.GraphTensors. +
+ + + + + + + + + + +
Returns
+The positive and negative distance embeddings for triplet loss as produced +by the implementing subclass. +
+ +

preprocess

+ +View +source + + + +Creates unused pseudo-labels. + +The input tensor should have the anchor and positive sample stacked along the +first dimension for each feature within each node set. The corruptor is applied +on the positive sample. + + + + + + + + + + + +
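As a shape-level sketch of the layout this method expects (array sizes are arbitrary; real inputs are `tfgnn.GraphTensor` features):

```python
import numpy as np

num_nodes, dim = 8, 16
anchor = np.zeros((num_nodes, dim))
positive = np.ones((num_nodes, dim))
# Anchor and positive sample stacked along the first dimension,
# for each feature within each node set.
stacked = np.concatenate([anchor, positive], axis=0)
```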
Args
+inputs + +The anchor and positive sample stacked along the first axis. +
+ + + + + + + + + + +
Returns
+Sequence of three graph tensors (anchor, positive_sample, +corrupted_sample) and unused pseudo-labels. +
diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/VicRegTask.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/VicRegTask.md new file mode 100644 index 00000000..cf466b46 --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/VicRegTask.md @@ -0,0 +1,169 @@ +# contrastive_losses.VicRegTask + + + + + View source +on GitHub + +A VICReg Task. + +Inherits From: +[`ContrastiveLossTask`](../contrastive_losses/ContrastiveLossTask.md) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+node_set_name + +Name of the node set for readout. +
+feature_name + +Feature name for readout. +
+representations_layer_name + +Layer name for uncorrupted representations. +
+corruptor + +Corruptor instance for creating negative samples. If not +specified, we use ShuffleFeaturesGlobally by default. +
+projector_units + +Sequence of layer sizes for projector network. +Projectors prevent dimensional collapse, but can hinder training for +easy corruptions. For more details, see +https://arxiv.org/abs/2304.12210. +
+seed + +Random seed for the default corruptor (ShuffleFeaturesGlobally). +
+ +## Methods + +

losses

+ +View +source + + + +Returns arbitrary task specific losses. + +

make_contrastive_layer

+ +View +source + + + +Returns the layer contrasting clean outputs with the corrupted ones. + +

metrics

+ +View +source + + + +Returns arbitrary task specific metrics. + +

predict

+ +View +source + + + +Apply a readout head for use with various contrastive losses. + + + + + + + + + + + +
Args
+*args + +A tuple of (clean, corrupted) tfgnn.GraphTensors. +
+ + + + + + + + + + +
Returns
+The logits for some contrastive loss as produced by the implementing +subclass. +
+ +

preprocess

+ +View +source + + + +Applies a `Corruptor` and returns empty pseudo-labels. diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/all_symbols.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/all_symbols.md new file mode 100644 index 00000000..72b37858 --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/all_symbols.md @@ -0,0 +1,24 @@ +# All symbols in TensorFlow GNN Models: fContrastiveLosses + + + +## Primary symbols + +* contrastive_losses +* contrastive_losses.AllSvdMetrics +* contrastive_losses.BarlowTwinsTask +* contrastive_losses.ContrastiveLossTask +* contrastive_losses.CorruptionSpec +* contrastive_losses.Corruptor +* contrastive_losses.DeepGraphInfomaxLogits +* contrastive_losses.DeepGraphInfomaxTask +* contrastive_losses.DropoutFeatures +* contrastive_losses.ShuffleFeaturesGlobally +* contrastive_losses.TripletEmbeddingSquaredDistances +* contrastive_losses.TripletLossTask +* contrastive_losses.VicRegTask +* contrastive_losses.coherence +* contrastive_losses.numerical_rank +* contrastive_losses.pseudo_condition_number +* contrastive_losses.rankme +* contrastive_losses.self_clustering diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/coherence.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/coherence.md new file mode 100644 index 00000000..66ffc367 --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/coherence.md @@ -0,0 +1,69 @@ +# contrastive_losses.coherence + + + + + View source +on GitHub + +Coherence metric implementation. + + + + + +Coherence measures how easy it is to construct a linear classifier on top of +data without knowing downstream labels. Refer to +https://arxiv.org/abs/2305.16562 for more details. + + + + + + + + + + + + + + + + + +
+representations + +Input representations, a rank-2 tensor. +
+sigma + +Unused. +
+u + +An optional tensor with left singular vectors of representations. If not +present, computes a SVD of representations. +
+ + + + + + + + + + +
+Metric value as scalar tf.Tensor. +
diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/numerical_rank.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/numerical_rank.md new file mode 100644 index 00000000..567971ef --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/numerical_rank.md @@ -0,0 +1,70 @@ +# contrastive_losses.numerical_rank + + + + + View source +on GitHub + +Numerical rank implementation. + + + + + +Computes a metric that estimates the numerical column rank of a matrix. Rank is +estimated as a matrix trace divided by the largest eigenvalue. When our matrix +is a covariance matrix, we can compute both the trace and the largest eigenvalue +efficiently via SVD as the largest singular value squared. + + + + + + + + + + + + + + + + + +
+representations + +Input representations. We expect rank 2 input. +
+sigma + +An optional tensor with singular values of representations. If not +present, computes SVD (singular values only) of representations. +
+u + +Unused. +
+ + + + + + + + + + +
+Metric value as scalar tf.Tensor. +
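The formula above needs only the singular values: trace(XᵀX) equals the sum of squared singular values, and the largest eigenvalue of XᵀX equals the largest singular value squared. A numpy sketch of that computation (not the library implementation):

```python
import numpy as np

def numerical_rank_sketch(representations):
    sigma = np.linalg.svd(representations, compute_uv=False)
    # trace(X^T X) = sum(sigma^2); top eigenvalue of X^T X = max(sigma)^2.
    return float(np.sum(sigma**2) / np.max(sigma) ** 2)

x = np.diag([3.0, 3.0, 0.0])  # exactly rank 2
```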
diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/pseudo_condition_number.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/pseudo_condition_number.md new file mode 100644 index 00000000..1ece1671 --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/pseudo_condition_number.md @@ -0,0 +1,69 @@ +# contrastive_losses.pseudo_condition_number + + + + + View source +on GitHub + +Pseudo-condition number metric implementation. + + + + + +Computes a metric that measures the decay rate of the singular values. NOTE: Can +be unstable in practice, when using small batch sizes, leading to numerical +instabilities. + + + + + + + + + + + + + + + + + +
+representations + +Input representations. We expect rank 2 input. +
+sigma + +An optional tensor with singular values of representations. If not +present, computes SVD (singular values only) of representations. +
+u + +Unused. +
+ + + + + + + + + + +
+Metric value as scalar tf.Tensor. +
diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/rankme.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/rankme.md new file mode 100644 index 00000000..90223259 --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/rankme.md @@ -0,0 +1,77 @@ +# contrastive_losses.rankme + + + + + View source +on GitHub + +RankMe metric implementation. + + + + + +Computes a metric that measures the decay rate of the singular values. For the +paper, see https://arxiv.org/abs/2210.02885. + + + + + + + + + + + + + + + + + + + + +
+representations + +Input representations as rank-2 tensor. +
+sigma + +An optional tensor with singular values of representations. If not +present, computes SVD (singular values only) of representations. +
+u + +Unused. +
+epsilon + +Epsilon for numerical stability. +
+ + + + + + + + + + +
+Metric value as scalar tf.Tensor. +
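Per the RankMe paper, the singular values are normalized into a distribution and the metric is the exponential of that distribution's entropy. A numpy sketch assuming this formulation (the library's exact epsilon handling may differ):

```python
import numpy as np

def rankme_sketch(representations, epsilon=1e-12):
    sigma = np.linalg.svd(representations, compute_uv=False)
    p = sigma / np.sum(sigma) + epsilon  # normalized singular values
    # Effective rank: exp of the entropy of the distribution p.
    return float(np.exp(-np.sum(p * np.log(p))))
```

For a matrix with four equal singular values, the entropy is log 4 and the effective rank is 4.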
diff --git a/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/self_clustering.md b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/self_clustering.md new file mode 100644 index 00000000..474189d5 --- /dev/null +++ b/tensorflow_gnn/docs/api_docs/python/models/contrastive_losses/self_clustering.md @@ -0,0 +1,63 @@ +# contrastive_losses.self_clustering + + + + + View source +on GitHub + +Self-clustering metric implementation. + + + + + +Computes a metric that measures how well distributed representations are, if +projected on the unit sphere. If `subtract_mean` is True, we additionally remove +the mean from representations. The metric has a range of (-0.5, 1]. It achieves +its maximum of 1 if representations collapse to a single point, and it is +approximately 0 if representations are distributed randomly on the sphere. In +theory, it can achieve negative values if the points are maximally equiangular, +although this is very rare in practice. Refer to +https://arxiv.org/abs/2305.16562 for more details. + + + + + + + + + + + + + + +
+representations + +Input representations. +
+subtract_mean + +Whether to subtract the mean from representations. +
+ + + + + + + + + + +
+Metric value as scalar tf.Tensor. +
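A hedged numpy reconstruction consistent with the properties stated above; the exact normalization used by the library is an assumption here. It compares the sum of squared pairwise cosine similarities against its expected value for points spread uniformly on the sphere, giving exactly 1 when all representations collapse to a single point and roughly 0 for uniformly spread ones:

```python
import numpy as np

def self_clustering_sketch(representations, subtract_mean=False):
    x = np.asarray(representations, dtype=float)
    if subtract_mean:
        x = x - x.mean(axis=0)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # project onto unit sphere
    n, d = x.shape
    # Sum of squared pairwise cosine similarities, versus its expected
    # value for points distributed uniformly on the sphere.
    actual = float(np.sum((x @ x.T) ** 2))
    expected = n + n * (n - 1) / d
    return (actual - expected) / (n * n - expected)

collapsed = np.tile([1.0, 0.0, 0.0], (4, 1))  # all points identical
```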
diff --git a/tensorflow_gnn/docs/api_docs/python/models/gat_v2.md b/tensorflow_gnn/docs/api_docs/python/models/gat_v2.md index 6a8cf186..2292e9c9 100644 --- a/tensorflow_gnn/docs/api_docs/python/models/gat_v2.md +++ b/tensorflow_gnn/docs/api_docs/python/models/gat_v2.md @@ -1,17 +1,10 @@ # Module: gat_v2 -[TOC] - - - - + + View source +on GitHub Graph Attention Networks v2. diff --git a/tensorflow_gnn/docs/api_docs/python/models/gat_v2/GATv2Conv.md b/tensorflow_gnn/docs/api_docs/python/models/gat_v2/GATv2Conv.md index 677ce5e8..361de99b 100644 --- a/tensorflow_gnn/docs/api_docs/python/models/gat_v2/GATv2Conv.md +++ b/tensorflow_gnn/docs/api_docs/python/models/gat_v2/GATv2Conv.md @@ -1,17 +1,10 @@ # gat_v2.GATv2Conv -[TOC] - - - - + + View source +on GitHub The multi-head attention from Graph Attention Networks v2 (GATv2). @@ -86,77 +79,81 @@ example, if the input features have shape `[num_nodes, 2, 4, 1]`, then it will perform an identical computation on each of the `num_nodes * 2 * 4` input values. +This layer can be restored from config by `tf.keras.models.load_model()` when +saved as part of a Keras model using `save_format="tf"`. + +
-`num_heads` +num_heads The number of attention heads.
-`per_head_channels` +per_head_channels The number of channels for each attention head. This means: - if `heads_merge_type == "concat"`, then final output size will be: - `per_head_channels * num_heads`. - if `heads_merge_type == "mean"`, then final output size will be: - `per_head_channels`. + if heads_merge_type == "concat", then final output size will be: + per_head_channels * num_heads. + if heads_merge_type == "mean", then final output size will be: + per_head_channels.
-`receiver_tag` +receiver_tag -one of `tfgnn.SOURCE`, `tfgnn.TARGET` or `tfgnn.CONTEXT`. +one of tfgnn.SOURCE, tfgnn.TARGET or tfgnn.CONTEXT. The results of attention are aggregated for this graph piece. -If set to `tfgnn.SOURCE` or `tfgnn.TARGET`, the layer can be called for +If set to tfgnn.SOURCE or tfgnn.TARGET, the layer can be called for an edge set and will aggregate results at the specified endpoint of the edges. -If set to `tfgnn.CONTEXT`, the layer can be called for an edge set or +If set to tfgnn.CONTEXT, the layer can be called for an edge set or node set. If left unset for init, the tag must be passed at call time.
-`receiver_feature` +receiver_feature -Can be set to override `tfgnn.HIDDEN_STATE` for use as +Can be set to override tfgnn.HIDDEN_STATE for use as the receiver's input feature to attention. (The attention key is derived from this input.)
-`sender_node_feature` +sender_node_feature -Can be set to override `tfgnn.HIDDEN_STATE` for use as +Can be set to override tfgnn.HIDDEN_STATE for use as the input feature from sender nodes to attention. -IMPORTANT: Must be set to `None` for use with `receiver_tag=tfgnn.CONTEXT` +IMPORTANT: Must be set to None for use with receiver_tag=tfgnn.CONTEXT on an edge set, or for pooling from edges without sender node states.
-`sender_edge_feature` +sender_edge_feature Can be set to a feature name of the edge set to select -it as an input feature. By default, this set to `None`, which disables +it as an input feature. By default, this set to None, which disables this input. -IMPORTANT: Must be set for use with `receiver_tag=tfgnn.CONTEXT` +IMPORTANT: Must be set for use with receiver_tag=tfgnn.CONTEXT on an edge set.
-`use_bias` +use_bias If true, a bias term is added to the transformations of query and @@ -164,7 +161,7 @@ value inputs.
-`edge_dropout` +edge_dropout Can be set to a dropout rate for edge dropout. (When pooling @@ -173,29 +170,29 @@ is dropped out.)
-`attention_activation` +attention_activation The nonlinearity used on the transformed inputs before multiplying with the trained weights of the attention layer. This can be specified as a Keras layer, a tf.keras.activations.* -function, or a string understood by `tf.keras.layers.Activation()`. +function, or a string understood by tf.keras.layers.Activation(). Defaults to "leaky_relu", which in turn defaults to a negative slope -of `alpha=0.2`. +of alpha=0.2.
-`heads_merge_type` +heads_merge_type The merge operation for combining output from -all `num_heads` attention heads. By default, output of heads will be +all num_heads attention heads. By default, output of heads will be concatenated. However, GAT paper (Velickovic et al, Eq 6) recommends *only for output layer* to do mean across attention heads, which is acheivable -by setting to `"mean"`. +by setting to "mean".
-`activation` +activation The nonlinearity applied to the final result of attention, @@ -203,17 +200,17 @@ specified in the same ways as attention_activation.
-`kernel_initializer` +kernel_initializer -Can be set to a `kernel_initializer` as understood -by `tf.keras.layers.Dense` etc. -An `Initializer` object gets cloned before use to ensure a fresh seed, -if not set explicitly. For more, see `tfgnn.keras.clone_initializer()`. +Can be set to a kernel_initializer as understood +by tf.keras.layers.Dense etc. +An Initializer object gets cloned before use to ensure a fresh seed, +if not set explicitly. For more, see tfgnn.keras.clone_initializer().
-`kernel_regularizer` +kernel_regularizer If given, will be used to regularize all layer kernels. @@ -222,56 +219,58 @@ If given, will be used to regularize all layer kernels.
+ - + + +
`receiver_tag` one of -`tfgnn.SOURCE`, `tfgnn.TARGET` or `tfgnn.CONTEXT`. The results are aggregated -for this graph piece. If set to `tfgnn.SOURCE` or `tfgnn.TARGET`, the layer can -be called for an edge set and will aggregate results at the specified endpoint -of the edges. If set to `tfgnn.CONTEXT`, the layer can be called for an edge set -or a node set and will aggregate results for context (per graph component). If -left unset for init, the tag must be passed at call time.
-`receiver_feature` The name of the -feature that is read from the receiver graph piece and passed as -convolve(receiver_input=...).
-`sender_node_feature` The name of the -feature that is read from the sender nodes, if any, and passed as -convolve(sender_node_input=...). NOTICE this must be `None` for use with -`receiver_tag=tfgnn.CONTEXT` on an edge set, or for pooling from edges without -sender node states.
-`sender_edge_feature` The name of the -feature that is read from the sender edges, if any, and passed as -convolve(sender_edge_input=...). NOTICE this must not be `None` for use with -`receiver_tag=tfgnn.CONTEXT` on an edge set.
-`extra_receiver_ops` A str-keyed -dictionary of Python callables that are wrapped to bind some arguments and then -passed on to `convolve()`. Sample usage: `extra_receiver_ops={"softmax": -tfgnn.softmax}`. The values passed in this dict must be callable as follows, -with two positional arguments: +
receiver_tag one of +tfgnn.SOURCE, tfgnn.TARGET or +tfgnn.CONTEXT. The results are aggregated for this graph piece. If +set to tfgnn.SOURCE or tfgnn.TARGET, the layer can be +called for an edge set and will aggregate results at the specified endpoint of +the edges. If set to tfgnn.CONTEXT, the layer can be called for an +edge set or a node set and will aggregate results for context (per graph +component). If left unset for init, the tag must be passed at call time.
receiver_feature The name of the feature that is read from the receiver graph piece and +passed as convolve(receiver_input=...).
+sender_node_feature The +name of the feature that is read from the sender nodes, if any, and passed as +convolve(sender_node_input=...). NOTICE this must be None for use +with receiver_tag=tfgnn.CONTEXT on an edge set, or for pooling from +edges without sender node states.
+sender_edge_feature The +name of the feature that is read from the sender edges, if any, and passed as +convolve(sender_edge_input=...). NOTICE this must not be None for +use with receiver_tag=tfgnn.CONTEXT on an edge set.
extra_receiver_ops A +str-keyed dictionary of Python callables that are wrapped to bind some arguments +and then passed on to convolve(). Sample usage: +extra_receiver_ops={"softmax": tfgnn.softmax}. The values passed in +this dict must be callable as follows, with two positional arguments: ```python f(graph, receiver_tag, node_set_name=..., feature_value=..., ...) f(graph, receiver_tag, edge_set_name=..., feature_value=..., ...) ``` -The wrapped callables seen by `convolve()` can be called like +The wrapped callables seen by convolve() can be called like ```python wrapped_f(feature_value, ...) ``` -The first three arguments of `f` are set to the input GraphTensor of -the layer and the tag/name pair required by `tfgnn.broadcast()` and -`tfgnn.pool()` to move values between the receiver and the messages that +The first three arguments of f are set to the input GraphTensor of +the layer and the tag/name pair required by tfgnn.broadcast() and +tfgnn.pool() to move values between the receiver and the messages that are computed inside the convolution. The sole positional argument of -`wrapped_f()` is passed to `f()` as `feature_value=`, and any keyword +wrapped_f() is passed to f() as feature_value=, and any keyword arguments are forwarded.
-`**kwargs` +**kwargs Forwarded to the base class tf.keras.layers.Layer. @@ -280,30 +279,31 @@ Forwarded to the base class tf.keras.layers.Layer.
+
-`takes_receiver_input` +takes_receiver_input -If `False`, all calls to convolve() will get `receiver_input=None`. +If False, all calls to convolve() will get receiver_input=None.
-`takes_sender_edge_input` +takes_sender_edge_input -If `False`, all calls to convolve() will get `sender_edge_input=None`. +If False, all calls to convolve() will get sender_edge_input=None.
-`takes_sender_node_input` +takes_sender_node_input -If `False`, all calls to convolve() will get `sender_node_input=None`. +If False, all calls to convolve() will get sender_node_input=None.
@@ -312,7 +312,7 @@ If `False`, all calls to convolve() will get `sender_node_input=None`.
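For illustration, the argument binding that the docs above describe for `extra_receiver_ops` can be sketched in plain Python. This is a minimal sketch of the wrapping pattern only; the helper names `fake_softmax` and `bind_receiver_args` are hypothetical stand-ins, not part of the TF-GNN API:

```python
# Hypothetical stand-in for an op like tfgnn.softmax. A real op would
# broadcast/pool over the GraphTensor; here we just return the arguments
# so the effect of the wrapper is visible.
def fake_softmax(graph, receiver_tag, *, edge_set_name=None, feature_value=None):
    return (graph, receiver_tag, edge_set_name, feature_value)

def bind_receiver_args(f, graph, receiver_tag, edge_set_name):
    # The base class binds the graph and the tag/name pair, so that
    # convolve() only needs to supply the feature value (plus any kwargs).
    def wrapped(feature_value, **kwargs):
        return f(graph, receiver_tag, edge_set_name=edge_set_name,
                 feature_value=feature_value, **kwargs)
    return wrapped

wrapped = bind_receiver_args(
    fake_softmax, graph="graph", receiver_tag="TARGET", edge_set_name="edges")
# Inside convolve(), the wrapped op is called with just the feature value:
result = wrapped([0.1, 0.9])
```

This mirrors the contract stated above: the first three arguments of `f` are pre-bound, and the sole positional argument of the wrapped callable arrives at `f` as `feature_value=`.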

convolve

-View +View source