This repository has been archived by the owner on Feb 20, 2024. It is now read-only.

handle fully deduplicated dataset
adonm committed Oct 24, 2022
1 parent c35f1b9 commit 685db75
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions siem_query_utils/api.py
@@ -549,6 +549,9 @@ def upload_loganalytics(rows: list, log_type: str):
         if digest not in existing_hashes:
             item[digest_column] = digest  # only add digest for new rows
     rows = [item for item in rows if digest_column in item.keys()]
+    if len(rows) == 0:
+        logger.info("Nothing to upload")
+        return
     rowsize = len(json.dumps(rows[0]).encode("utf8"))
     chunkSize = int(20 * 1024 * 1024 / rowsize)  # 20MB max size
     chunks = [rows[x : x + chunkSize] for x in range(0, len(rows), chunkSize)]
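
The guard matters because `rows[0]` raises `IndexError` on an empty list, which is what happens when every incoming row already has its digest in `existing_hashes`. The following is a minimal standalone sketch of that flow, not the code from `siem_query_utils/api.py`: the function name `plan_upload`, the default `digest_column` value, and the SHA-256-over-sorted-JSON hashing scheme are all assumptions made for illustration, since the diff does not show how `digest` is computed.

```python
import hashlib
import json
import logging

logger = logging.getLogger(__name__)

MAX_CHUNK_BYTES = 20 * 1024 * 1024  # 20MB cap, mirroring the constant in the diff


def plan_upload(rows, existing_hashes, digest_column="_row_sha256"):
    """Hypothetical helper: return chunks of new rows, or [] if none are new."""
    new_rows = []
    for item in rows:
        # Assumed hashing scheme; the real digest calculation is outside the hunk.
        payload = json.dumps(item, sort_keys=True).encode("utf8")
        digest = hashlib.sha256(payload).hexdigest()
        if digest not in existing_hashes:
            new_rows.append({**item, digest_column: digest})

    # Early return for a fully deduplicated dataset: without it,
    # new_rows[0] below would raise IndexError.
    if not new_rows:
        logger.info("Nothing to upload")
        return []

    # Estimate rows per chunk from the first row's serialized size.
    rowsize = len(json.dumps(new_rows[0]).encode("utf8"))
    chunk_size = max(1, MAX_CHUNK_BYTES // rowsize)
    return [new_rows[i:i + chunk_size] for i in range(0, len(new_rows), chunk_size)]
```

Unlike the original, this sketch returns an empty list rather than `None` so callers can iterate over the result unconditionally, and it clamps the chunk size to at least 1 in case a single row exceeds the size cap.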