Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancement Proposal: Automated Update of data_last_update in dataset Table #164

Open
gisn8 opened this issue Mar 27, 2024 · 0 comments
Open

Comments

@gisn8
Copy link

gisn8 commented Mar 27, 2024

Another processing tool to create a PG function to update the data_last_update values in the dataset table if one is not already in place, and set the update triggers for it on all the tables in the dataset table. Some code for concept.

Query to check if function exists

function_search_query = """
SELECT EXISTS (
  SELECT 1
  FROM pg_proc p
  JOIN pg_namespace n ON p.pronamespace = n.oid
  WHERE n.nspname = 'pgmetadata' AND p.proname = 'update_data_last_update'
);
"""

If not,

create_function_query = """
CREATE OR REPLACE FUNCTION pgmetadata.update_data_last_update()
RETURNS TRIGGER AS $$
BEGIN
    -- Update the last modification timestamp in the metadata table
    UPDATE pgmetadata.dataset
    SET data_last_update = CURRENT_TIMESTAMP
    WHERE schema_name = TG_TABLE_SCHEMA AND table_name = TG_TABLE_NAME;
    -- Return the appropriate row based on the operation type
    IF TG_OP = 'INSERT' OR TG_OP = 'UPDATE' THEN
        RETURN NEW;
    ELSIF TG_OP = 'DELETE' THEN
        RETURN OLD;
    END IF;
    -- This point should not be reached
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;
"""

Then it could comb through the dataset table to pull out schema and table names to query for existing triggers and create them where they don't yet exist. We would need to ignore orphaned metadata.

trigger_sql_template = """
DO $$
BEGIN
    IF NOT EXISTS (
        SELECT 1
        FROM pg_trigger
        WHERE tgname = 'update_data_last_update_for_{schema}_{table}'
    ) THEN
        CREATE TRIGGER update_data_last_update_for_{schema}_{table}
        AFTER INSERT OR UPDATE OR DELETE
        ON {schema}.{table}
        FOR EACH ROW
        EXECUTE FUNCTION pgmetadata.update_data_last_update();
    END IF;
END;
$$ LANGUAGE plpgsql;
"""

It'd be a significant step to keep metadata up to date and would help end-users track down stale data easier.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant