
[Feature] Performance improvements and add airforce, army, navy, NLM agencies -- Production #937

Merged · 9 commits · Jan 3, 2025
3 changes: 0 additions & 3 deletions .github/workflows/ci.yml
@@ -81,7 +81,6 @@ jobs:
CF_ORGANIZATION_NAME: ${{ vars.CF_ORGANIZATION_NAME }}
CF_SPACE_NAME: ${{ vars.CF_SPACE_NAME_DEV }}
DB_SERVICE_NAME: ${{ vars.DB_SERVICE_NAME_DEV }}
MESSAGE_QUEUE_DATABASE_NAME: ${{ vars.MESSAGE_QUEUE_DATABASE_NAME }}
MESSAGE_QUEUE_NAME: ${{ vars.MESSAGE_QUEUE_NAME }}
NEW_RELIC_APP_NAME: ${{ vars.NEW_RELIC_APP_NAME_DEV }}
PROXY_FQDN: ${{ vars.PROXY_FQDN_DEV }}
@@ -111,7 +110,6 @@ jobs:
CF_ORGANIZATION_NAME: ${{ vars.CF_ORGANIZATION_NAME }}
CF_SPACE_NAME: ${{ vars.CF_SPACE_NAME_STG }}
DB_SERVICE_NAME: ${{ vars.DB_SERVICE_NAME_STG }}
MESSAGE_QUEUE_DATABASE_NAME: ${{ vars.MESSAGE_QUEUE_DATABASE_NAME }}
MESSAGE_QUEUE_NAME: ${{ vars.MESSAGE_QUEUE_NAME }}
NEW_RELIC_APP_NAME: ${{ vars.NEW_RELIC_APP_NAME_STG }}
PROXY_FQDN: ${{ vars.PROXY_FQDN_STG }}
@@ -141,7 +139,6 @@ jobs:
CF_ORGANIZATION_NAME: ${{ vars.CF_ORGANIZATION_NAME }}
CF_SPACE_NAME: ${{ vars.CF_SPACE_NAME_PRD }}
DB_SERVICE_NAME: ${{ vars.DB_SERVICE_NAME_PRD }}
MESSAGE_QUEUE_DATABASE_NAME: ${{ vars.MESSAGE_QUEUE_DATABASE_NAME }}
MESSAGE_QUEUE_NAME: ${{ vars.MESSAGE_QUEUE_NAME }}
NEW_RELIC_APP_NAME: ${{ vars.NEW_RELIC_APP_NAME_PRD }}
PROXY_FQDN: ${{ vars.PROXY_FQDN_PRD }}
4 changes: 0 additions & 4 deletions .github/workflows/deploy.yml
@@ -27,9 +27,6 @@ on:
DB_SERVICE_NAME:
required: true
type: string
MESSAGE_QUEUE_DATABASE_NAME:
required: true
type: string
MESSAGE_QUEUE_NAME:
required: true
type: string
@@ -72,7 +69,6 @@ env:
CF_SPACE_NAME: ${{ inputs.CF_SPACE_NAME }}
DB_SERVICE_NAME: ${{ inputs.DB_SERVICE_NAME }}
GA4_CREDS: ${{ secrets.GA4_CREDS }}
MESSAGE_QUEUE_DATABASE_NAME: ${{ inputs.MESSAGE_QUEUE_DATABASE_NAME }}
MESSAGE_QUEUE_NAME: ${{ inputs.MESSAGE_QUEUE_NAME }}
NEW_RELIC_APP_NAME: ${{ inputs.NEW_RELIC_APP_NAME }}
NEW_RELIC_LICENSE_KEY: ${{ secrets.NEW_RELIC_LICENSE_KEY }}
1 change: 0 additions & 1 deletion .github/workflows/manual_deploy_to_dev.yml
@@ -15,7 +15,6 @@ jobs:
CF_ORGANIZATION_NAME: ${{ vars.CF_ORGANIZATION_NAME }}
CF_SPACE_NAME: ${{ vars.CF_SPACE_NAME_DEV }}
DB_SERVICE_NAME: ${{ vars.DB_SERVICE_NAME_DEV }}
MESSAGE_QUEUE_DATABASE_NAME: ${{ vars.MESSAGE_QUEUE_DATABASE_NAME }}
MESSAGE_QUEUE_NAME: ${{ vars.MESSAGE_QUEUE_NAME }}
NEW_RELIC_APP_NAME: ${{ vars.NEW_RELIC_APP_NAME_DEV }}
PROXY_FQDN: ${{ vars.PROXY_FQDN_DEV }}
20 changes: 20 additions & 0 deletions deploy/agencies.json
@@ -24,11 +24,21 @@
"agencyName": "agriculture",
"awsBucketPath": "data/agriculture"
},
{
"analyticsReportIds": "470235781",
"agencyName": "air-force",
"awsBucketPath": "data/air-force"
},
{
"analyticsReportIds": "395213963",
"agencyName": "american-battle-monuments-commission",
"awsBucketPath": "data/american-battle-monuments-commission"
},
{
"analyticsReportIds": "470089273",
"agencyName": "army",
"awsBucketPath": "data/army"
},
{
"analyticsReportIds": "395253935",
"agencyName": "commerce",
@@ -264,6 +274,11 @@
"agencyName": "national-labor-relations-board",
"awsBucketPath": "data/national-labor-relations-board"
},
{
"analyticsReportIds": "470944610",
"agencyName": "national-library-medicine",
"awsBucketPath": "data/national-library-medicine"
},
{
"analyticsReportIds": "425930242",
"agencyName": "national-mediation-board",
@@ -294,6 +309,11 @@
"agencyName": "national-transportation-safety-board",
"awsBucketPath": "data/national-transportation-safety-board"
},
{
"analyticsReportIds": "470101393",
"agencyName": "navy",
"awsBucketPath": "data/navy"
},
{
"analyticsReportIds": "395460734",
"agencyName": "nuclear-regulatory-commission",
2 changes: 1 addition & 1 deletion deploy/api.sh
@@ -3,4 +3,4 @@
export ANALYTICS_REPORTS_PATH=reports/api.json
export ANALYTICS_SCRIPT_NAME=api.sh

$ANALYTICS_ROOT_PATH/bin/analytics-publisher --debug --write-to-database --output /tmp --agenciesFile=$ANALYTICS_ROOT_PATH/deploy/agencies.json
$ANALYTICS_ROOT_PATH/bin/analytics-publisher --debug --write-to-database --agenciesFile=$ANALYTICS_ROOT_PATH/deploy/agencies.json
48 changes: 29 additions & 19 deletions deploy/cron.js
@@ -61,18 +61,18 @@ const daily_run = () => {
runScriptWithLogName(`${scriptRootPath}/daily.sh`, "daily.sh");
};

const hourly_run = () => {
/*const hourly_run = () => {
runScriptWithLogName(`${scriptRootPath}/hourly.sh`, "hourly.sh");
};
};*/

const realtime_run = () => {
runScriptWithLogName(`${scriptRootPath}/realtime.sh`, "realtime.sh");
};

/**
Daily reports run every morning at 10 AM UTC.
This calculates the offset between now and then for the next scheduled run.
*/
* Daily and API reports run every morning at 10 AM UTC.
* This calculates the offset between now and then for the next scheduled run.
*/
const calculateNextDailyRunTimeOffset = () => {
const currentTime = new Date();
const nextRunTime = new Date(
@@ -85,26 +85,36 @@ const calculateNextDailyRunTimeOffset = () => {
};

/**
* All scripts run immediately upon application start (with a 10 second delay
* All scripts run immediately upon application start (with a 60 second delay
* between each so that they don't all run at once), then run again at intervals
* going forward.
*/
setTimeout(realtime_run, 1000 * 10);
setTimeout(hourly_run, 1000 * 20);
setTimeout(daily_run, 1000 * 30);
setTimeout(api_run, 1000 * 40);
// setTimeout(hourly_run, 1000 * 70); No hourly reports exist at this time.
setTimeout(daily_run, 1000 * 70);
setTimeout(api_run, 1000 * 130);

// daily
// Daily and API recurring script run setup.
// Runs at 10 AM UTC, then every 24 hours afterwards
setTimeout(() => {
daily_run();
setInterval(daily_run, 1000 * 60 * 60 * 24);
// API
api_run();
setInterval(api_run, 1000 * 60 * 60 * 24);
// Offset the daily script run by 30 seconds so that it never runs in parallel
// with the realtime script in order to save memory/CPU.
setTimeout(() => {
daily_run();
setInterval(daily_run, 1000 * 60 * 60 * 24);
}, 1000 * 30);

// setTimeout(hourly_run, 1000 * 60);

// Offset the API script run by 90 seconds so that it never runs in parallel
// with the daily or realtime scripts in order to save memory/CPU.
setTimeout(() => {
api_run();
setInterval(api_run, 1000 * 60 * 60 * 24);
}, 1000 * 90);
}, calculateNextDailyRunTimeOffset());
// hourly
setInterval(hourly_run, 1000 * 60 * 60);
// realtime. Runs every 15 minutes.
// Google updates realtime reports every 30 minutes, so there is some overlap.
// hourly (no hourly reports exist at this time).
// setInterval(hourly_run, 1000 * 60 * 60);

// Realtime recurring script run setup. Runs every 15 minutes.
setInterval(realtime_run, 1000 * 60 * 15);
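The `calculateNextDailyRunTimeOffset` helper in the diff computes the delay until the next 10 AM UTC run. A standalone sketch of that calculation (the function name here is illustrative, not the repository's):

```javascript
// Standalone sketch of the offset calculation used by the cron setup:
// given the current time, return the number of milliseconds until the
// next occurrence of a fixed UTC hour (10 AM UTC for daily/API reports).
function msUntilNextUtcHour(now, hourUtc) {
  const next = new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate(), hourUtc),
  );
  if (next <= now) {
    // Today's run time has already passed; schedule for tomorrow.
    next.setUTCDate(next.getUTCDate() + 1);
  }
  return next.getTime() - now.getTime();
}

// Example: at 09:00 UTC, the next 10 AM UTC run is one hour away.
// msUntilNextUtcHour(new Date(Date.UTC(2025, 0, 3, 9)), 10) === 3600000
```

The additional 30-second and 90-second `setTimeout` offsets layered on top of this delay in the diff serve the stated goal of keeping the daily, API, and realtime scripts from ever running in parallel.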
39 changes: 26 additions & 13 deletions index.js
@@ -1,10 +1,12 @@
const { AsyncLocalStorage } = require("node:async_hooks");
const knex = require("knex");
const PgBoss = require("pg-boss");
const util = require("util");
const AppConfig = require("./src/app_config");
const ReportProcessingContext = require("./src/report_processing_context");
const Logger = require("./src/logger");
const Processor = require("./src/processor");
const PgBossKnexAdapter = require("./src/pg_boss_knex_adapter");

/**
 * Gets an array of JSON report objects from the application config, then runs
@@ -80,7 +82,7 @@ async function _processReport(appConfig, context, reportConfig, processor) {
await processor.processChain(context);
logger.info("Processing complete");
} catch (e) {
logger.error("Encountered an error");
logger.error("Encountered an error during report processing");
logger.error(util.inspect(e));
}
});
@@ -121,8 +123,8 @@ async function runQueuePublish(options = {}) {
agencyName: appConfig.agencyLogName,
scriptName: appConfig.scriptName,
});
const queueClient = await _initQueueClient(appConfig, appLogger);
const queue = "analytics-reporter-job-queue";
const knexInstance = await knex(appConfig.knexConfig);
const queueClient = await _initQueueClient(knexInstance, appLogger);

for (const agency of agencies) {
for (const reportConfig of reportConfigs) {
@@ -134,7 +136,7 @@
});
try {
let jobId = await queueClient.send(
queue,
appConfig.messageQueueName,
_createQueueMessage(
options,
agency,
@@ -151,13 +153,17 @@
);
if (jobId) {
reportLogger.info(
`Created job in queue: ${queue} with job ID: ${jobId}`,
`Created job in queue: ${appConfig.messageQueueName} with job ID: ${jobId}`,
);
} else {
reportLogger.info(`Found a duplicate job in queue: ${queue}`);
reportLogger.info(
`Found a duplicate job in queue: ${appConfig.messageQueueName}`,
);
}
} catch (e) {
reportLogger.error(`Error sending to queue: ${queue}`);
reportLogger.error(
`Error sending to queue: ${appConfig.messageQueueName}`,
);
reportLogger.error(util.inspect(e));
}
}
@@ -169,6 +175,9 @@
} catch (e) {
appLogger.error("Error stopping queue client");
appLogger.error(util.inspect(e));
} finally {
appLogger.debug(`Destroying database connection pool`);
knexInstance.destroy();
}
}

@@ -189,10 +198,10 @@ function _initAgencies(agencies_file) {
return Array.isArray(agencies) ? agencies : legacyAgencies;
}

async function _initQueueClient(appConfig, logger) {
async function _initQueueClient(knexInstance, logger) {
let queueClient;
try {
queueClient = new PgBoss(appConfig.messageQueueDatabaseConnection);
queueClient = new PgBoss({ db: new PgBossKnexAdapter(knexInstance) });
await queueClient.start();
logger.debug("Starting queue client");
} catch (e) {
@@ -230,15 +239,19 @@ function _messagePriority(reportConfig) {
async function runQueueConsume() {
const appConfig = new AppConfig();
const appLogger = Logger.initialize();
const queueClient = await _initQueueClient(appConfig, appLogger);
const queue = "analytics-reporter-job-queue";
const knexInstance = await knex(appConfig.knexConfig);
const queueClient = await _initQueueClient(knexInstance, appLogger);

try {
const context = new ReportProcessingContext(new AsyncLocalStorage());
const processor = Processor.buildAnalyticsProcessor(appConfig, appLogger);
const processor = Processor.buildAnalyticsProcessor(
appConfig,
appLogger,
knexInstance,
);

await queueClient.work(
queue,
appConfig.messageQueueName,
{ newJobCheckIntervalSeconds: 1 },
async (message) => {
appLogger.info("Queue message received");
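The queue client now receives a knex-backed adapter rather than a separate database connection string, so pg-boss shares the application's connection pool. pg-boss accepts a custom `db` object exposing an `executeSql` method; below is a hedged sketch of what such an adapter could look like. The placeholder-translation approach and all names are illustrative assumptions, not the repository's actual `PgBossKnexAdapter`:

```javascript
// Translate pg-style $1, $2… placeholders (which pg-boss emits) into
// knex named bindings. Returned shape: { text, bindings }.
function toKnexBindings(sql, parameters = []) {
  const text = sql.replace(/\$(\d+)/g, (_, n) => `:p${n}`);
  const bindings = Object.fromEntries(
    parameters.map((value, i) => [`p${i + 1}`, value]),
  );
  return { text, bindings };
}

// Sketch of a knex-backed pg-boss db adapter. pg-boss calls
// executeSql(sql, parameters) and expects a promise of query results;
// the adapter delegates to the injected knex instance's raw() method,
// so queue queries run on the shared application pool.
class PgBossKnexAdapter {
  constructor(knexInstance) {
    this.knex = knexInstance; // injected; no connection of its own
  }

  executeSql(sql, parameters) {
    const { text, bindings } = toKnexBindings(sql, parameters);
    return this.knex.raw(text, bindings);
  }
}
```

Wired up as in the diff, `new PgBoss({ db: new PgBossKnexAdapter(knexInstance) })` replaces the old `new PgBoss(connectionString)` form, which is also why `MESSAGE_QUEUE_DATABASE_NAME` disappears from the workflows and manifests.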
12 changes: 12 additions & 0 deletions knexfile.js
@@ -8,6 +8,10 @@ module.exports = {
password: process.env.POSTGRES_PASSWORD || "123abc",
port: 5432,
},
pool: {
min: 2,
max: 10,
},
},
test: {
client: "postgresql",
@@ -18,6 +22,10 @@ module.exports = {
password: process.env.POSTGRES_PASSWORD || "123abc",
port: 5432,
},
pool: {
min: 2,
max: 10,
},
migrations: {
tableName: "knex_migrations",
},
@@ -31,5 +39,9 @@
password: process.env.POSTGRES_PASSWORD,
ssl: true,
},
pool: {
min: 2,
max: 10,
},
},
};
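The knexfile change gives every environment the same explicit pool bounds, which matters now that the publisher, consumer, and pg-boss all draw from one knex pool. A minimal config fragment showing the shape (connection details here are illustrative):

```javascript
// knexfile-style config fragment with explicit pool bounds.
// min keeps a couple of warm connections ready between runs; max caps
// concurrent Postgres connections so the shared pool stays within the
// database service's connection limits.
module.exports = {
  development: {
    client: "postgresql",
    connection: {
      host: "localhost",
      port: 5432,
    },
    pool: {
      min: 2,
      max: 10,
    },
  },
};
```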
1 change: 0 additions & 1 deletion manifest.consumer.yml
@@ -21,7 +21,6 @@ applications:
ANALYTICS_REPORT_EMAIL: ${ANALYTICS_REPORT_EMAIL}
AWS_CACHE_TIME: '0'
GOOGLE_APPLICATION_CREDENTIALS: /home/vcap/app/${ANALYTICS_KEY_FILE_NAME}
MESSAGE_QUEUE_DATABASE_NAME: ${MESSAGE_QUEUE_DATABASE_NAME}
MESSAGE_QUEUE_NAME: ${MESSAGE_QUEUE_NAME}
NEW_RELIC_APP_NAME: ${NEW_RELIC_APP_NAME}
NEW_RELIC_LICENSE_KEY: ${NEW_RELIC_LICENSE_KEY}
1 change: 0 additions & 1 deletion manifest.publisher.yml
@@ -21,7 +21,6 @@ applications:
# The default path for reports (used for gov-wide reports)
AWS_BUCKET_PATH: data/live
AWS_CACHE_TIME: '0'
MESSAGE_QUEUE_DATABASE_NAME: ${MESSAGE_QUEUE_DATABASE_NAME}
MESSAGE_QUEUE_NAME: ${MESSAGE_QUEUE_NAME}
NEW_RELIC_APP_NAME: ${NEW_RELIC_APP_NAME}
NEW_RELIC_LICENSE_KEY: ${NEW_RELIC_LICENSE_KEY}
25 changes: 14 additions & 11 deletions package-lock.json
