Concatenator

This presentation introduces the features of the Concatenator and shows how to configure it.

The challenge

I want to merge several fields of an event into a single target field.

from this:

[64]:
document = {
    'data_stream': {
        'dataset': 'windows',
        'namespace': 'devopslab',
        'type': 'logs'
        },
    '_op_type': 'create'
    }

to this:

[65]:
expected = {
    'data_stream': {
        'dataset': 'windows',
        'namespace': 'devopslab',
        'type': 'logs'
        },
    '_op_type': 'create',
    '_index': 'logs-windows-devopslab'
    }
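
The transformation above amounts to looking up three dotted field paths and joining their values with a separator. The following is a minimal plain-Python sketch of that idea, not Logprep's implementation: the helpers `get_dotted` and `concatenate` are hypothetical names, and skipping missing fields is an assumption made only for this illustration.

```python
from copy import deepcopy
from functools import reduce


def get_dotted(event, dotted_path):
    """Hypothetical helper: resolve 'a.b.c' in a nested dict, or return None."""
    def step(current, key):
        return current.get(key) if isinstance(current, dict) else None
    return reduce(step, dotted_path.split("."), event)


def concatenate(event, source_fields, target_field, separator):
    """Hypothetical helper: join source values into a top-level target field."""
    values = [get_dotted(event, field) for field in source_fields]
    # Assumption for this sketch: fields that are missing are skipped.
    present = [str(value) for value in values if value is not None]
    result = deepcopy(event)  # leave the input event untouched
    result[target_field] = separator.join(present)
    return result


sample = {
    "data_stream": {"dataset": "windows", "namespace": "devopslab", "type": "logs"},
    "_op_type": "create",
}
merged = concatenate(
    sample,
    ["data_stream.type", "data_stream.dataset", "data_stream.namespace"],
    "_index",
    "-",
)
print(merged["_index"])  # logs-windows-devopslab
```

The order of `source_fields` determines the order of the parts in the result, which is why `type` comes first even though it is the last key in the input dictionary.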

Create rule and processor

create the rule:

[66]:
import sys
sys.path.append("../../../../../")
import tempfile
from pathlib import Path

rule_yaml = """---
filter: "data_stream"
concatenator:
  source_fields:
    - data_stream.type
    - data_stream.dataset
    - data_stream.namespace
  target_field: _index
  separator: "-"
  overwrite_target: false
  delete_source_fields: false
"""

rule_path = Path(tempfile.gettempdir()) / "concatenator"
rule_path.mkdir(exist_ok=True)
rule_file = rule_path / "data-stream.yml"
rule_file.write_text(rule_yaml)
[66]:
230
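
The two boolean options at the end of the rule are named for what they presumably control: whether an existing target field may be replaced, and whether the source fields are removed after concatenation. The sketch below shows one plausible reading of these flags in plain Python. The function `apply_rule`, its return value, and its handling of an existing target are assumptions for illustration, not Logprep's code, and it only supports a top-level target field.

```python
def apply_rule(event, rule):
    """Hypothetical, simplified reading of a concatenator rule; mutates event."""
    target = rule["target_field"]
    # Assumed behavior: an existing target is kept unless overwriting is allowed.
    if target in event and not rule["overwrite_target"]:
        return False

    values = []
    for path in rule["source_fields"]:
        node = event
        for key in path.split("."):
            node = node.get(key) if isinstance(node, dict) else None
        if node is not None:
            values.append(str(node))
    event[target] = rule["separator"].join(values)

    if rule["delete_source_fields"]:
        for path in rule["source_fields"]:
            *parents, leaf = path.split(".")
            node = event
            for key in parents:
                node = node.get(key) if isinstance(node, dict) else None
            if isinstance(node, dict):
                node.pop(leaf, None)
    return True


rule = {
    "source_fields": ["data_stream.type", "data_stream.dataset", "data_stream.namespace"],
    "target_field": "_index",
    "separator": "-",
    "overwrite_target": False,
    "delete_source_fields": False,
}

event = {
    "data_stream": {"dataset": "windows", "namespace": "devopslab", "type": "logs"},
    "_index": "already-set",
}
kept = apply_rule(event, rule)  # target exists and overwriting is off
replaced = apply_rule(event, {**rule, "overwrite_target": True, "delete_source_fields": True})
print(kept, replaced, event)
```

With both flags set to `false`, as in the rule above, the event from the challenge gains `_index` and keeps its `data_stream` fields, which matches the expected document.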

create the processor config:

[67]:
processor_config = {
    "myconcatenator":{
        "type": "concatenator",
        "rules": [str(rule_path), "/dev"],
        }
    }

create the processor with the factory:

[68]:
from logprep.factory import Factory

# Factory.create builds the processor from the single-entry config dict
concatenator = Factory.create(processor_config)
concatenator
[68]:
concatenator

Process event

[69]:
from copy import deepcopy

# process() changes the event in place, so work on a copy of the original
mydocument = deepcopy(document)

print(f"before: {mydocument}")
concatenator.process(mydocument)
print(f"after: {mydocument}")
print(mydocument == expected)
before: {'data_stream': {'dataset': 'windows', 'namespace': 'devopslab', 'type': 'logs'}, '_op_type': 'create'}
after: {'data_stream': {'dataset': 'windows', 'namespace': 'devopslab', 'type': 'logs'}, '_op_type': 'create', '_index': 'logs-windows-devopslab'}
True