@virajbshah
Contributor

 * Add fields to the `proto` specification to store context.
 * Add members to the Gematria `BasicBlock` data structure to store
   context and update methods on it and its Python binding accordingly.
 * Bonus: Remove dangling TODO.
 * Update the graph builder and its Python bindings to add context
   instructions to basic block graphs and to store a context node mask
   for later use by models.
 * Add tests for the new graph builder functionality.
 * Update seq2seq models to only use non-context nodes as input to the
   readout network.
 * Add tests for training seq2seq models with context.
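
The readout change described above can be illustrated with a minimal sketch. It is not the Gematria implementation: the function name `select_non_context_nodes`, the list-based node embeddings, and the boolean mask layout (True marks a context node) are all assumptions made for illustration only.

```python
from typing import List, Sequence


def select_non_context_nodes(
    node_embeddings: Sequence[Sequence[float]],
    context_node_mask: Sequence[bool],
) -> List[List[float]]:
    """Keeps only the embeddings of nodes that are not part of the context.

    Hypothetical helper: assumes one mask entry per node, where True marks
    a node that belongs to a context instruction.
    """
    if len(node_embeddings) != len(context_node_mask):
        raise ValueError("Expected exactly one mask entry per node.")
    return [
        list(embedding)
        for embedding, is_context in zip(node_embeddings, context_node_mask)
        if not is_context
    ]


# Four nodes: the first and last belong to the surrounding context.
embeddings = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]
mask = [True, False, False, True]
readout_input = select_non_context_nodes(embeddings, mask)
```

In this sketch only the two middle nodes would reach the readout network, so context instructions can inform the graph without contributing their own predictions.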
Collaborator

@ondrasej ondrasej left a comment

LGTM overall. I'll approve once the parent ones are ready (merged).

- basic_block_was_added = self._batch_graph_builder.add_basic_block(block)
+ # Add context to the basic block graph only for seq2seq models.
+ basic_block_was_added = self._batch_graph_builder.add_basic_block(
+     block, add_context=self.use_deltas
Collaborator

No action needed in this PR: I think we should either completely disable the models where `use_deltas` is False, or pass the context mask to the update functions somehow. Otherwise, these models would have a hard time knowing which nodes belong to the context.

But given the model precision, I'd go with just disabling the model without deltas :)
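
The first option (disabling models without deltas) could look roughly like the sketch below. The function `check_model_is_supported` is hypothetical and does not exist in the code base; only the `use_deltas` flag comes from the diff above.

```python
def check_model_is_supported(use_deltas: bool) -> None:
    """Rejects model configurations that do not use deltas.

    Hypothetical guard: a model without deltas never receives the context
    node mask, so it cannot tell context nodes from basic block nodes.
    """
    if not use_deltas:
        raise ValueError(
            "Models with use_deltas=False are disabled: they cannot "
            "distinguish context nodes from basic block nodes."
        )


# A seq2seq configuration passes the check without raising.
check_model_is_supported(use_deltas=True)
```

A guard like this would presumably run once at model construction time, so an unsupported configuration fails early rather than training on graphs it cannot interpret.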

@virajbshah virajbshah changed the base branch from main to experiment/preceding-following-context June 9, 2025 13:34