On log messages

On Batch writing

All the following log messages are generated in the class

org.intermine.sql.writebatch.Batch

e.g.

Enqueued 7784310 of 7784310 byte batch.
Batch flusher has spent 25339 ms waiting for the database (duty cycle 20%) (current duty cycle 18%)
Waited 340 ms for batch flusher

Digested reading

The less Waited logs you get the better. Possibly influenced by temp_buffer and checkpoint_segment (tbc)
Idem for the batch flusher; a lower duty cycle should indicate a healthier state (?).
Do not worry about the enqueued.

Enqueued

e.g.

Enqueued 10000038 of 10000038 byte batch.

Message is triggered if the current batch size is greater than a fixed (and nearly random?) number, in our case 5 millions. Note that the maximum batch size is also hard coded and is 100 millions in our case.

It does not seems to be very significant, apart from signalling the presence of large batches.

It could be more interesting if the 2 figures in the message actually differ (tables too big?).

Batch flusher

e.g.

Batch flusher has spent 7057 ms waiting for the database (duty cycle 27%) (current duty cycle 27%)

#####duty cycle Calculated as (see code)

 int totalDutyCycle = (int) (((100 * totalSpent + ((end - flusherStart) / 2)) / (end - flusherStart)));

which translates to

  D = (100 * S + T/2)/T
where
  D = duty cycle [1-100]
  S = time spent flushing (ms)
  T = total time (cumulative for one process, i.e. one source)

Not sure it is a meaningful number.

It seems to indicate that a low duty cycle is desirable, since it would consist in a time spent flushing which is a small portion of the processing time.

The log message is triggered by any flush job since the previous message of the same type.

It tends to happen only when there is intensive 'computation' (e.g. features) in the integration part (parsing of source and writing to the item database), or when data loading (from the item database to the production database).

current duty cycle

Same as above, but the calculation is done considering the time elapsed since the last log message (and the additional time spent flushing).

Waited

e.g.

Waited 244 ms for batch flusher

Message is triggered if the list of flush jobs takes more than 100 ms to be cleared.

Summary logs

org.intermine.dataloader.IntegrationWriterDataTrackingImpl - Time spent: Equivalent object queries: 10788, Create object: 42535, Compute priorities: 4524, Copy fields: 326121, Store object: 978880, Data tracker write: 122592, recursing: 272048 org.intermine.dataloader.IntegrationWriterDataTrackingImpl - There were no duplicate objects org.intermine.dataloader.ObjectStoreDataLoader - Finished dataloading 3546358 objects at 98559 objects per minute (2158911 ms total) for source modencode-lieb-metadata org.intermine.dataloader.ObjectStoreDataLoader - Time spent: Reading: 441871, Writing: 1226033, Committing: 491007

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

On log messages

On Batch writing

Digested reading

Enqueued

Batch flusher

current duty cycle

Waited

Summary logs

Clone this wiki locally