Issues #29 and #30 #31
base: master
Conversation
I changed to the new plugin API.
p.s.
Thanks for the update @evilezh. The maintainers of Snap are unfamiliar with Kafka, so this PR will take much longer to review than we'd like. I might use this as an excuse to dig into it. We'll keep you posted - happy new year!
I've launched the plugin with your changes and I see that some information is lost.

Output from the current kafka publisher:

Output from the kafka publisher with your changes:
That is the general idea of the conversion: drop unused fields and keep only the actual data. The format is optional, depending on your requirements. If you need the full output, you can use the old method. If the path and the data are enough for you, then you are welcome to use the tree-like output. In general that saves you around 90% of traffic and storage on Kafka.
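For illustration, a minimal sketch of what such a path-plus-data ("tree") document might look like - the `_tags` and `_timestamp` keys follow the examples later in this thread, but the exact shape is an assumption, not taken from the plugin code:

```json
{
  "_tags": {
    "plugin_running_on": "host-1"
  },
  "_timestamp": 1485772518024965428,
  "intel": {
    "psutil": {
      "load": {
        "load1": 0.15,
        "load15": 0.5
      }
    }
  }
}
```

Each metric's namespace becomes a nested path, and only the value survives at the leaf - which is where the claimed traffic savings come from.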
I think all data like unit and version should be kept for future filtering or debugging (to know which version of the plugin was used in case the value is incorrect), so the output should be like:

```json
{
  "_tags": {
    "plugin_running_on": "kujawka-Z97X-UD7-TH"
  },
  "_timestamp": 1485772518024965428,
  "intel": {
    "psutil": {
      "load": {
        "load1": {
          "value": 0.15,
          "version": 0,
          "unit": "Load/1M",
          "last_advertised_time": "2017-01-30T11:02:29.93389848+01:00"
        },
        "load15": {
          "value": 0.5,
          "version": 0,
          "unit": "Load/15M",
          "last_advertised_time": "2017-01-30T11:02:29.933898807+01:00"
        }
      }
    }
  }
}
```

Anyway, I don't think that changing the data format is a good way to achieve lower bandwidth. The correct way to optimize data transfer is to compress the data. It seems that compression is supported both by Kafka and by sarama (Kafka's client library for Go). You can find a simple compression example here:

I see that in snap-plugin-publisher-kafka there is nil provided as the config parameter for NewSyncProducer, so the config's Compression parameter is set to 0, which corresponds to CompressionNone (https://godoc.org/github.com/Shopify/sarama#CompressionCodec). Maybe it's worth playing with the compression parameter instead of refactoring the whole output format?
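For reference, a minimal sketch of enabling compression in sarama instead of passing nil as the config - the broker address and topic are placeholders, while the Compression and Return.Successes fields are part of sarama's public producer config:

```go
package main

import (
	"log"

	"github.com/Shopify/sarama"
)

func main() {
	// Passing nil to NewSyncProducer leaves Compression at CompressionNone.
	// Build an explicit config and pick a codec instead.
	cfg := sarama.NewConfig()
	cfg.Producer.Compression = sarama.CompressionSnappy
	cfg.Producer.Return.Successes = true // required by SyncProducer

	producer, err := sarama.NewSyncProducer([]string{"localhost:9092"}, cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer producer.Close()

	// Messages sent through this producer are now compressed on the wire.
	_, _, err = producer.SendMessage(&sarama.ProducerMessage{
		Topic: "snap-metrics",
		Value: sarama.StringEncoder(`{"value":0.15}`),
	})
	if err != nil {
		log.Fatal(err)
	}
}
```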
There is nothing to argue about ... whoever wants the output in the previous format can keep it; I added a new output format which throws out all the unnecessary data that in real life maybe 0.01% of users need. This is not only about compression, but also about the data representation for further processing.
@evilezh maybe it can be resolved like in the elasticsearch publisher. I've added a configuration option which lets the user choose which fields from the metric to publish. The default is publishing the whole metric structure, but you can choose, for example, only Namespace and Data, and only those parts are published to the database.
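Sketched as a task-manifest snippet, that approach might look like the following - the publish_fields option name is purely illustrative (check the elasticsearch publisher's README for the real parameter):

```yaml
publish:
  -
    plugin_name: "elasticsearch"
    config:
      # hypothetical option name: select which metric fields to publish,
      # defaulting to the whole metric structure when omitted
      publish_fields: "Namespace|Data"
```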
As for elasticsearch ... the output should be like:

```json
{
  "timestamp": 12123123,
  "data1": 123,
  "data2": 321
}
```

not like:

```json
{
  "data1": {
    "value": 123,
    "timestamp": 123123123
  },
  "data2": {
    "value": 321,
    "timestamp": 123123123
  }
}
```

I did some research on the plugins I use; all fields other than value are usually filled with the same repeated value. And about compression on the wire: yes, it reduces the amount of data to transfer, but it is still a huge amount to store and a huge amount to parse. And I would still need post-processing, as I did before, to cut it all out. It makes much more sense to cut all the unnecessary data out before even sending it to Kafka; otherwise the pipeline is server -> kafka -> post-processing -> kafka -> many readers. And you can always add some other format for the kafka output.
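A minimal Go sketch of the flat shape described above, assuming a simplified Metric placeholder rather than the actual snap metric type:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Metric is a simplified placeholder for a snap metric:
// a namespace path, a timestamp, and a value.
type Metric struct {
	Namespace []string
	Timestamp int64
	Data      interface{}
}

// flatten builds one document per batch: a single timestamp plus one
// "namespace -> value" pair per metric, dropping the repeated fields.
func flatten(ms []Metric) map[string]interface{} {
	doc := make(map[string]interface{})
	for _, m := range ms {
		doc["timestamp"] = m.Timestamp
		doc[strings.Join(m.Namespace, ".")] = m.Data
	}
	return doc
}

func main() {
	b, _ := json.Marshal(flatten([]Metric{
		{[]string{"data1"}, 12123123, 123},
		{[]string{"data2"}, 12123123, 321},
	}))
	fmt.Println(string(b)) // {"data1":123,"data2":321,"timestamp":12123123}
}
```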
This is an approximate idea for #29 and #30.
#29 I tested - it works. #30 I will test a bit later; I need to roll out and change some stuff before I can say it works correctly.
It is kind of the first/second piece of code I wrote in Go, so you are welcome to modify it and make it nicer.
Sorry, no tests, no documentation changes.
Here is what I added to the configuration:
key: "plugin_running_on"
output_type: "tree"
Both are optional; if not specified, the plugin defaults to the previous behavior.
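For context, these options might sit in a task manifest roughly like this - the topic and brokers names are assumptions about the existing kafka publisher config; only key and output_type come from this PR:

```yaml
publish:
  -
    plugin_name: "kafka"
    config:
      topic: "snap-metrics"      # assumed existing option
      brokers: "localhost:9092"  # assumed existing option
      key: "plugin_running_on"   # new in this PR, optional
      output_type: "tree"        # new in this PR, optional; omit for old behavior
```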