Skip to content

redpencilio/virus-scanner-service

Repository files navigation

virus-scanner-service

Microservice to perform a virus scan on uploaded files.

This service listens for delta notifications about new files and scans those for viruses.

Project Status: WIP – Initial development is in progress

Getting started

Prerequisites:

Adding the service to your stack

Add the following snippet to your docker-compose.yml to include this service in your project.

version: '3.4'

services:
  virus-scanner:
    image: redpencilio/virus-scanner-service:0.1.0
    links:
      - database:database
    environment:
      VIRUS_SCANNER_CLAMD_USER: "clamav" # See README before changing to "root".
    volumes:
      - ./data/files:/share
      - type: volume
        source: virus-scanner-signatures
        target: /var/lib/clamav

volumes:
  virus-scanner-signatures:

Note on VIRUS_SCANNER_CLAMD_USER (default: "clamav"): The ClamAV authors do not recommend running clamd as root for safety reasons because ClamAV scans untrusted files that may be malware.

However, file-service currently saves its files with access permission for root only, which makes them inaccessible for ClamAV running as clamav.

Consider the security implications for your situation before letting clamd run as root by setting:

  VIRUS_SCANNER_CLAMD_USER: "root"

Add rules to dispatcher.ex to dispatch requests to this service. E.g.

  match "/virus-scanner/*path", %{ layer: :services } do
    Proxy.forward conn, path, "http://virus-scanner/"
  end

TODO: Change match /virus-scanner to post /malware-analyses like for mu-cl-resources described further below? Conflict on get: hello from virus-scanner vs mu-cl-resources

The host virus-scanner in the forward URL reflects the name of the file service in the docker-compose.yml file.

Update the authorization configuration config/authorization/config.ex to make sure the user has appropriate read/write access on the resource type stix:MalwareAnalysis. E.g.

    ...
    constraint: %ResourceConstraint{
      resource_types: [
        "http://docs.oasis-open.org/cti/ns/stix#MalwareAnalysis",
        ...
      ]
    }
    ...

Add delta-notifier to your stack as described in the delta-notifier documentation. Then configure delta-notifier to send relevant deltas to virus-scanner by adding the following snippet to config/delta/rules.js:

export default [
  {
    match: {
      predicate: {
        type: 'uri',
        value: 'http://www.semanticdesktop.org/ontologies/2007/01/19/nie#dataSource'
      },
    },
    callback: {
      url: 'http://virus-scanner/delta',
      method: 'POST',
    },
    options: {
      resourceFormat: 'v0.0.1',
      gracePeriod: 1000,
      ignoreFromSelf: true,
    }
  },
  // Other delta listeners
]

Run docker-compose up and the service should be reachable through the dispatcher, for example at http://localhost/virus-scanner/ .

How-to guides

How to configure malware-analysis resources in mu-cl-resources

Ensure that you have a file resource configured as described in the file-service documentation.

If you want to model the malware-analyses of the virus-scanner service in the domain of your mu-cl-resources service, add the following snippet to your resource configuration.

If you use the Lisp configuration format add the following to your domain.lisp:

(define-resource malware-analysis ()
  :class (s-prefix "stix:MalwareAnalysis")
  :properties `((:analysis-started :datetime ,(s-prefix "stix:analysis_started"))
                (:analysis-ended :datetime ,(s-prefix "stix:analysis_ended"))
                (:result :string ,(s-prefix "stix:result")))
  :has-one `((file :via ,(s-prefix "stix:sample_ref")
                   :as "sample-ref"))
  :resource-base (s-url "http://data.gift/virus-scanner/analysis/id/")
  :features `(include-uri)
  :on-path "malware-analyses")

(define-resource file ()
  ;; ...
  :has-many `((malware-analysis :via ,(s-prefix "stix:sample_ref")
                           :inverse t
                           :as "malware-analyses"))
  ;; ...

And configure this prefix in your repository.lisp:

(add-prefix "stix" "http://docs.oasis-open.org/cti/ns/stix#")

If you use the JSON configuration format add the following to your domain.json: TODO

Next, add the following rule to ./config/dispatcher/dispatcher.ex.

  define_accept_types [
    json: [ "application/vnd.api+json" ],
  ]

  ...

  get "/malware-analyses/*path", %{ accept: [ :json ], layer: :services } do
    Proxy.forward conn, path, "http://resource/malware-analyses/"
  end

Finally, restart the services to pick up the configuration changes:

docker-compose restart resource dispatcher

How to upload a file using a curl command

The following assumes mu-dispatcher is running on localhost:80.

Download an EICAR test file and upload it to file-service:

curl -O https://secure.eicar.org/eicar.com.txt
curl -i -X POST -H "Content-Type: multipart/form-data" -F "file=@eicar.com.txt" http://localhost/files

The virus-scanner-service will receive a delta notification of the upload, scan the file and write the results to the database.

To request a scan manually:

curl -i -X POST -H "Content-Type: application/json" -d '{"file":"http://mu.semte.ch/services/file-service/files/6543bc046ea4f3000e00000c"}' http://localhost/virus-scanner/scan

The virus-scanner-service will scan the file and add the new results to the database. Earlier results for the same file are left untouched.

How to check if a file is clean

To check if a file is clean, create a query that sorts the malware-analyses for that file by analysis-started, take the most recent one, check that analysis-ended is filled in with a recent-enough date and that the result is strict-equal to "benign". This should prevent erroneously assuming that a file is clean in corner cases such as when the malware-analysis is missing, the result is missing or unknown, or an earlier result was benign but the most recent analysis failed.

TODO: Example query.

Reference

Model

Ontologies and prefixes

Prefix URI
stix http://docs.oasis-open.org/cti/ns/stix#
nfo http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#
nie http://www.semanticdesktop.org/ontologies/2007/01/19/nie#

Malware analyses

Description

https://docs.oasis-open.org/cti/stix/v2.1/stix-v2.1.html https://github.com/oasis-open/tac-ontology/

Class

stix:MalwareAnalysis

Properties
Name Predicate Range Definition
analysis-started stix:analysis_started xsd:dateTime Datetime of scan start
analysis-ended stix:analysis_ended xsd:dateTime Datetime of scan end
result stix:result xsd:string The result: malicious, suspicious, benign or unknown
sample-ref stix:sample_ref nfo:FileDataObject The file that was scanned
TODOresult-name stix:result_name xsd:string Details of the result, e.g. names of detected malware

TODO: We could add result-name, but clamscan returns an array of strings, because more than 1 malware may be found in a file. We could write a JSON-string of that array to result-name, if that is not a problem.

Configuration

Environment variables

The following enviroment variables can be configured:

  • LOG_INCOMING_DELTA (default: "false"): Log the delta message as received from the delta-notifier to the console.
  • LOG_INCOMING_SCAN_REQUESTS (default: "false"): Log the requests received by endpoint /scan.
  • VIRUS_SCANNER_CLAMD_USER (default: "clamav"): User to run the ClamAV daemon clamd as. See note on running clamd as root in the Getting started-section above.
  • The environment variables recognized by mu-javascript-template.

REST API

Notes:

  • In various error cases (e.g. no physical file IRI found, file not found on disk, errors from clamscan), virus-scanner will create a malware-analysis resource for the requested file IRI with result unknown. Error details are logged. Those are not uncaught errors that would lead to a 500 Server Error response.
  • Storing a malware-analysis in the database may fail with only a logged error (e.g. if 202 Accepted was already returned), or even silently (e.g. file IRI not in any graph at time of storing result).
  • How to check if a file is clean describes how to ensure a file is clean considering such corner cases.

POST /delta

Accepts requests like those created by delta-notifier.

Response
202 Accepted

Delta contains logical file IRI insertions that will be scanned.

The results will be logged and stored in the database.

204 No Content

Delta contains no logical file IRI insertions. No results are stored.

500 Server error

Uncaught error.

POST /scan

Scan a file. Accepts a multipart/form-data with a file parameter containing a logical file IRI.

Response
201 Created

The newly created malware-analysis in the response body:

{
  "data": {
    "type": "malware-analyses",
    "id": "3a2cafd0-8f8a-11ee-a732-97ed1ab0131d",
    "attributes": {
      "uri": "http://data.gift/virus-scanner/analysis/id/3a2cafd0-8f8a-11ee-a732-97ed1ab0131d",
      "analysis-started": "2023-11-30T14:10:33.855Z",
      "analysis-ended": "2023-11-30T14:10:33.930Z",
      "result": "malicious",
      "sample-ref": "http://mu.semte.ch/services/file-service/files/65684a368d76fe0010000000"
    }
  }
}

In case result is unknown or malicious, further details can be found in the virus-scanner log.

400 Bad Request

file not a non-empty String

422 Unprocessable Entity

file is a physical file IRI, should be a logical file IRI.

500 Server error

Uncaught error.

Development

For a more detailed look in how to develop a microservice based on the mu-javascript-template, we would recommend reading "Developing with the template".

Developing in the mu.semte.ch stack

Paste the following snippet in your docker-compose.override.yml, replacing ../virus-scanner-service/ with an absolute or relative path pointing to your local sources:

version: '3.4'

services:
  virus-scanner:
    ports:
      - "8893:80"
      - "9229:9229"
    environment:
      NODE_ENV: "development"
      LOG_INCOMING_DELTA: "true"
      LOG_INCOMING_SCAN_REQUESTS: "true"
    volumes:
      - ../virus-scanner-service/:/app/