Improve formatting of dumped schema #436

Open. derekprior wants to merge 1 commit into base: main

@derekprior (Contributor) commented Jan 3, 2025

Over time, our dumped schema formatting has become less pretty than I would like. Way back in 2016, Caleb and I talked about adopting a SQL formatter for Scenic, but we never found a gem we liked and never took it on ourselves. Enter: niceql (https://github.com/alekseyl/niceql).

niceql has zero dependencies of its own, is implemented in a single (large) file, and has over 1.6 million downloads. It has had no releases since November 2022, but I think it's safe to consider it complete rather than abandoned. Its required Ruby version is permissive (>= 2.5), so unless there are breaking changes to Ruby basics, I don't think it will be a problem.

I've tried it on a few basic views of my own and it produces compact, readable SQL. It's maybe not how I would have formatted things by hand, but it's better than the format we're getting from Postgres. The most common formatter for Postgres is pgFormatter but as best I can tell there is no Ruby implementation for it today.

In addition to formatting the SQL, this change also adds a newline between create_view statements and any indexes associated with the view.

Before these changes:

  create_view "searches", materialized: true, sql_definition: <<-SQL
      SELECT posts.content,
      posts.user_id
     FROM posts
  UNION
   SELECT posts.title AS content,
      posts.user_id
     FROM posts
  UNION
   SELECT comments.content,
      comments.user_id
     FROM comments;
  SQL
  add_index "searches", ["content"], name: "index_searches_on_content"
  add_index "searches", ["user_id"], name: "index_searches_on_user_id"

After these changes:

  create_view "searches", materialized: true, sql_definition: <<-SQL
    SELECT posts.content, posts.user_id
    FROM posts UNION
      SELECT posts.title AS content, posts.user_id
      FROM posts UNION
        SELECT comments.content, comments.user_id
        FROM comments
  SQL

  add_index "searches", ["content"], name: "index_searches_on_content"
  add_index "searches", ["user_id"], name: "index_searches_on_user_id"
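
For illustration, the blank line between a view and its indexes could come from dumper logic along these lines. This is a minimal sketch, not the code in this PR: `View`, `Index`, and `dump_view` are hypothetical stand-ins rather than Scenic's actual classes or methods, and the SQL is assumed to arrive already formatted.

```ruby
require "stringio"

# Hypothetical stand-ins for the objects a schema dumper works with;
# these are NOT Scenic's real classes.
View = Struct.new(:name, :definition, :materialized, :indexes, keyword_init: true)
Index = Struct.new(:columns, :name, keyword_init: true)

# Writes a create_view statement followed by its add_index statements,
# separated by a single blank line when the view has any indexes.
def dump_view(view, stream)
  options = view.materialized ? ", materialized: true" : ""
  stream.puts %(  create_view #{view.name.inspect}#{options}, sql_definition: <<-SQL)
  view.definition.each_line { |line| stream.puts "    #{line.chomp}" }
  stream.puts "  SQL"

  # The blank separator is only emitted when indexes follow.
  stream.puts unless view.indexes.empty?

  view.indexes.each do |index|
    stream.puts %(  add_index #{view.name.inspect}, #{index.columns.inspect}, name: #{index.name.inspect})
  end
  stream
end
```

Calling `dump_view(view, StringIO.new).string` returns the dumped text, so the separator can be checked without touching a real `schema.rb`.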

@derekprior
Contributor Author

I'm not sure we should take a dependency for this, but I do like the output better. What do you think @calebhearth?

I figure our release with tsort is already going to change folks' schema.rb quite a bit, so we might as well clean this up while we are at it, if we can.

@derekprior
Contributor Author

This is the diff generated when I tried this out on Mastodon. I think it's better but I'm not sure it's worth it. It does some odd things. pgFormatter is still what we really want.

The newline before indexes is a keeper, and I think I can probably make some minor improvements by futzing with the current indentation.

Thoughts?
