diff --git a/docs/source/advanced/index.rst b/docs/source/advanced/index.rst
index e0162aa..317ecf5 100644
--- a/docs/source/advanced/index.rst
+++ b/docs/source/advanced/index.rst
@@ -11,3 +11,4 @@ Advanced
     /advanced/filesystems
     /advanced/plugins
     /advanced/record_descriptors
+    /advanced/records
diff --git a/docs/source/advanced/records.rst b/docs/source/advanced/records.rst
new file mode 100644
index 0000000..dd3abfd
--- /dev/null
+++ b/docs/source/advanced/records.rst
@@ -0,0 +1,76 @@
+Working with Records
+====================
+
+This section describes how to work with records. Records are the fundamental data structure in ``dissect`` and are
+used to describe forensic evidence and artefacts in a structured way. They can be read from and written to various
+sources and formats.
+
+Writing Records with rdump
+--------------------------
+
+The easiest way to write records is by using the :doc:`/tools/rdump` tool. It allows you to read records from any
+source, filter them, and write them to a new destination.
+
+The output format is determined by the ``-w/--writer`` argument. You can specify a filename, and ``rdump`` will
+automatically detect the desired output format and compression based on the file extension.
+
+For example, to write records to a gzip-compressed file, you can use:
+
+.. code-block:: console
+
+    $ rdump -w output.rec.gz
+
+This will write the records to ``output.rec.gz`` in the default record stream format, compressed with gzip.
+
+Writing Records with Python
+---------------------------
+
+For more advanced use cases, you can use the :class:`flow.record.RecordWriter` class in your own Python scripts. This
+gives you full control over how and where records are written.
+
+The ``RecordWriter`` is best used as a context manager. It takes a URI as its main argument, which specifies the
+adapter and any options to use.
+
+Here's an example of writing records to a JSON file:
+
+.. code-block:: python
+
+    from flow.record import RecordDescriptor, RecordWriter
+
+    # define our descriptor
+    MyRecord = RecordDescriptor("my/record", [
+        ("net.ipaddress", "ip"),
+        ("string", "description"),
+    ])
+
+    # define some records
+    records = [
+        MyRecord("1.1.1.1", "cloudflare dns"),
+        MyRecord("8.8.8.8", "google dns"),
+    ]
+
+    # write the records to disk
+    with RecordWriter("jsonfile://output.json?indent=2") as writer:
+        for record in records:
+            writer.write(record)
+
+Adapters
+--------
+
+The ``RecordWriter`` uses adapters to write to different formats. The adapter is selected based on the scheme of the
+URI passed to the ``RecordWriter``.
+
+Some common adapters include:
+
+* ``file``: The default record stream format.
+* ``csvfile``: For writing CSV files.
+* ``jsonfile``: For writing JSON or JSONL files.
+* ``line``: For writing to the console in a human-readable format.
+
+You can get a full list of available adapters by running ``rdump --list-adapters``.
+
+Some adapters require extra dependencies, which are listed in that output as well.
+
+.. seealso::
+
+    For more information about the ``flow.record`` library, please refer to the :doc:`/projects/flow.record/index` page.
diff --git a/docs/source/tools/rdump.rst b/docs/source/tools/rdump.rst
index 1b818ed..c50763d 100644
--- a/docs/source/tools/rdump.rst
+++ b/docs/source/tools/rdump.rst
@@ -120,7 +120,42 @@ For example, we can output just the hostname, name and image path of a Windows s
 Writing records
 ---------------
 
-Something about writing records, e.g. auto detection of filename for compression.
+``rdump`` can write records to a file, which can be used as input for ``rdump`` at a later stage. To write
+records to a file, use the ``-w`` or ``--writer`` argument.
+
+The output format and compression type are automatically detected based on the filename extension. For example, to
+write to a gzip-compressed file, simply use the ``.gz`` extension in your output file.
+
+.. code-block:: console
+
+    $ rdump services.rec -w services.rec.gz
+
+This will read the records from ``services.rec`` and write them to a new gzip-compressed file named ``services.rec.gz``.
+Other supported compression formats are ``.bz2``, ``.zst`` (zstandard), and ``.lz4``.
+
+If you want to write the records to a file without any compression, just use a filename without a compression
+extension.
+
+.. code-block:: console
+
+    $ rdump services.rec -w services.rec.out
+
+When the ``-w`` argument is omitted, ``rdump`` prints the string representation of the records to standard output, which is useful for piping to tools like ``grep`` or ``less``.
+
+To pipe records into another ``rdump`` process, use ``-w -``, which writes the records in binary stream format to standard output.
+
+For example, you can chain ``rdump`` with common Linux command-line tools to analyze records. In this example, we extract image paths from a record source, sort them, count occurrences, and display the top 5 most common paths:
+
+.. code-block:: console
+
+    $ rdump services.rec -w - | rdump -f "{imagepath}" | sort | uniq -c | sort -rn | head -n 5
+    [reading from stdin]
+    104 %SystemRoot%\system32\svchost.exe
+     71 %SystemRoot%\System32\svchost.exe
+     28 %systemroot%\system32\svchost.exe
+     12 None
+      3 %SystemRoot%\system32\lsass.exe
+
 
 Usage
 -----