SOS Quickstart
Download here from GitHub
The following will build sos and numsos into the directory /home/XXX/BuildSos
1. If you are in a modules environment, load the module for python-2.7
2. Building sos:
cd into the top-level sos checkout directory, then run:
./autogen.sh
mkdir build
cd build
../configure --prefix=/home/XXX/BuildSos --enable-python
make
make install
3. Building numsos:
cd into the top-level numsos checkout directory, then run:
./autogen.sh
mkdir build
cd build
../configure --prefix=/home/XXX/BuildSos --with-sos=/home/XXX/BuildSos
make
make install
4. Directory Structure
The build installs the sosdb and numsos Python modules under /home/XXX/BuildSos/lib/python2.7/site-packages. The sosdb module includes the DataSet class along with the Array and Sos modules, which are written in C for efficiency. The numsos module includes the DataSource, DataSink, Stack, and Transform classes.
5. Setting your environment variables
export PATH=/home/XXX/BuildSos/bin:$PATH
export PYTHONPATH=/home/XXX/BuildSos/lib/python2.7/site-packages:$PYTHONPATH
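When the tools are launched from a Python script rather than an interactive shell, the same two variables can be prepared programmatically. The following is a minimal sketch, not part of sos or numsos; the function name `sos_environment` and its parameters are hypothetical, and the prefix is the placeholder used above.

```python
import os


def sos_environment(prefix, python_version="2.7", base_env=None):
    """Return a copy of base_env with the SOS bin and site-packages dirs prepended."""
    env = dict(os.environ if base_env is None else base_env)
    bin_dir = os.path.join(prefix, "bin")
    site_dir = os.path.join(prefix, "lib", "python" + python_version, "site-packages")

    def prepend(new_dir, existing):
        # Avoid a dangling separator when the variable was previously unset.
        return new_dir + os.pathsep + existing if existing else new_dir

    env["PATH"] = prepend(bin_dir, env.get("PATH", ""))
    env["PYTHONPATH"] = prepend(site_dir, env.get("PYTHONPATH", ""))
    return env


# Example with an explicit starting environment so the result is predictable.
example_env = sos_environment("/home/XXX/BuildSos", base_env={"PATH": "/usr/bin"})
print(example_env["PATH"])
print(example_env["PYTHONPATH"])
```

The returned dictionary can be passed as the `env` argument of `subprocess` calls, which leaves the calling shell's environment unchanged.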
The CSV import example below uses the following files:

| File | Use With | Description |
|------|----------|-------------|
| meminfo_E5-2698.schema.json | sos-schema --add | A schema definition file |
| meminfo_E5-2698.map.json | sos-import-csv --map | A file that tells the import tool which CSV columns go to which schema attributes |
| meminfo_E5-2698.1000 | sos-import-csv --csv | 1000 lines of CSV data |
These files can be obtained from a clone of the wiki under the directory: files/meminfoCSV2SOS
> more meminfo_E5-2698.schema.json
{
  "name" : "meminfo_E5-2698",
  "attrs" : [
    { "name" : "timestamp", "type" : "timestamp", "index" : {} },
    { "name" : "component_id", "type" : "uint64", "index" : {} },
    { "name" : "job_id", "type" : "uint64", "index" : {} },
    { "name" : "app_id", "type" : "uint64" },
    { "name" : "MemTotal", "type" : "uint64" },
    { "name" : "MemFree", "type" : "uint64" },
    ...
    { "name" : "DirectMap2M", "type" : "uint64" },
    { "name" : "DirectMap1G", "type" : "uint64" },
    { "name" : "comp_time", "type" : "join", "join_attrs" : [ "component_id", "timestamp" ], "index" : {} },
    { "name" : "job_comp_time", "type" : "join", "join_attrs" : [ "job_id", "component_id", "timestamp" ], "index" : {} },
    { "name" : "job_time_comp", "type" : "join", "join_attrs" : [ "job_id", "timestamp", "component_id" ], "index" : {} }
  ]
}

> more meminfo_E5-2698.map.json
[
  { "target" : "timestamp", "source" : { "column" : 0 } },
  { "target" : "component_id", "source" : { "column" : 3 } },
  { "target" : "job_id", "source" : { "column" : 4 } },
  { "target" : "app_id", "source" : { "column" : 5 } },
  { "target" : "MemTotal", "source" : { "column" : 6 } },
  { "target" : "MemFree", "source" : { "column" : 7 } },
  ...
  { "target" : "DirectMap2M", "source" : { "column" : 47 } },
  { "target" : "DirectMap1G", "source" : { "column" : 48 } }
]

> more meminfo_E5-2698.1000
1518803953.003055,3055,nid00012,12,5078835....1957888,134217728
1518803953.003319,3319,nid00013,13,5078835....1957888,134217728
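Because the schema and map files are edited by hand, it can help to cross-check them before importing. The sketch below is an illustration only and is not one of the SOS tools; the function `check_schema_and_map` is hypothetical, and the inline JSON is an abbreviated copy of the files shown above. It checks two things: every name in a `join_attrs` list is a defined attribute, and every map `target` exists in the schema.

```python
import json


def check_schema_and_map(schema, column_map):
    """Return a list of problem descriptions; an empty list means the files agree."""
    attr_names = set(attr["name"] for attr in schema["attrs"])
    problems = []
    # Every component of a join attribute must itself be a schema attribute.
    for attr in schema["attrs"]:
        for part in attr.get("join_attrs", []):
            if part not in attr_names:
                problems.append("join '{0}' refers to unknown attribute '{1}'".format(
                    attr["name"], part))
    # Every map target must name a schema attribute.
    for entry in column_map:
        if entry["target"] not in attr_names:
            problems.append("map target '{0}' is not in the schema".format(entry["target"]))
    return problems


# Abbreviated copies of the two files; real use would json.load() them from disk.
schema = json.loads("""
{ "name" : "meminfo_E5-2698",
  "attrs" : [
    { "name" : "timestamp", "type" : "timestamp", "index" : {} },
    { "name" : "component_id", "type" : "uint64", "index" : {} },
    { "name" : "job_id", "type" : "uint64", "index" : {} },
    { "name" : "comp_time", "type" : "join",
      "join_attrs" : [ "component_id", "timestamp" ], "index" : {} }
  ] }
""")
column_map = json.loads("""
[ { "target" : "timestamp", "source" : { "column" : 0 } },
  { "target" : "component_id", "source" : { "column" : 3 } },
  { "target" : "job_id", "source" : { "column" : 4 } } ]
""")

problems = check_schema_and_map(schema, column_map)
print(problems)

# A deliberately wrong map entry (hypothetical name) is reported.
bad_map = column_map + [{"target": "no_such_attr", "source": {"column": 9}}]
bad_problems = check_schema_and_map(schema, bad_map)
print(bad_problems)
```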
1. Create a container if you don't already have one:
> sos-db --path /dir/my-container --create
2. Create the schema in the container:
> sos-schema --path /dir/my-container --add meminfo_E5-2698.schema.json
3. Query the schema to see what's in it:
a. Using sos-schema:
> sos-schema --path /dir/my-container --query meminfo_E5-2698 --verbose
meminfo_E5-2698
Id   Type             Indexed      Name
---- ---------------- ------------ --------------------------------
   0 TIMESTAMP        True         timestamp
   1 UINT64           True         component_id
   2 UINT64           True         job_id
   3 UINT64                        app_id
   4 UINT64                        MemTotal
   5 UINT64                        MemFree
...
  45 UINT64                        DirectMap2M
  46 UINT64                        DirectMap1G
  47 JOIN             True         comp_time [component_id+timestamp]
  48 JOIN             True         job_comp_time [job_id+component_id+timestamp]
  49 JOIN             True         job_time_comp [job_id+timestamp+component_id]
b. OR using sos_cmd:
> sos_cmd -C /dir/my-container -l
schema :
    name      : meminfo_E5-2698
    schema_sz : 4904
    obj_sz    : 384
    id        : 129
    -attribute : timestamp
        type    : TIMESTAMP
        idx     : 0
        indexed : 1
        offset  : 8
    -attribute : component_id
        type    : UINT64
        idx     : 1
        indexed : 1
        offset  : 16
    -attribute : job_id
        type    : UINT64
        idx     : 2
        indexed : 1
        offset  : 24
    ...
    -attribute : DirectMap2M
        type    : UINT64
        idx     : 45
        indexed : 0
        offset  : 368
    -attribute : DirectMap1G
        type    : UINT64
        idx     : 46
        indexed : 0
        offset  : 376
    -attribute : comp_time
        type    : JOIN
        idx     : 47
        indexed : 1
        offset  : 384
    -attribute : job_comp_time
        type    : JOIN
        idx     : 48
        indexed : 1
        offset  : 384
    -attribute : job_time_comp
        type    : JOIN
        idx     : 49
        indexed : 1
        offset  : 384
Note that there is no data yet in the container (using sos_cmd):
> sos_cmd -C /dir/my-container -q -S meminfo_E5-2698 -X comp_time
timestamp component_id job_id ... comp_time job_comp_time job_time_comp
-------------------------------- ------------------ ... --------------------------------
Records 0/0.
4. Import the CSV data into the container:
> sos-import-csv --path /dir/my-container --schema meminfo_E5-2698 --map meminfo_E5-2698.map.json --csv meminfo_E5-2698.1000
Importing from CSV file meminfo_E5-2698.1000 into /dir/my-container using map meminfo_E5-2698.map.json
Created 1000 records
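To picture what the map file asks the import tool to do, the following sketch applies the first three map entries to the visible leading fields of the two sample CSV lines. It is a conceptual illustration, not the implementation of sos-import-csv; the helper `apply_map` is hypothetical, and values are left as strings, whereas a real import would presumably convert them to the schema types.

```python
import csv

# The first three entries of meminfo_E5-2698.map.json.
column_map = [
    {"target": "timestamp", "source": {"column": 0}},
    {"target": "component_id", "source": {"column": 3}},
    {"target": "job_id", "source": {"column": 4}},
]


def apply_map(row, column_map):
    """Pick the mapped CSV columns out of one row, keyed by schema attribute."""
    return dict((entry["target"], row[entry["source"]["column"]]) for entry in column_map)


# Only the leading fields shown in the sample file; the rest is elided there.
csv_text = (
    "1518803953.003055,3055,nid00012,12,5078835\n"
    "1518803953.003319,3319,nid00013,13,5078835\n"
)

records = [apply_map(row, column_map) for row in csv.reader(csv_text.splitlines())]
for record in records:
    print(record)
```

Columns 1 and 2 of the CSV have no map entry, so they do not appear in the resulting records.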
5. You can monitor the progress from another window like this:
> sos-monitor --path /dir/my-container --schema meminfo_E5-2698
It will take less than a second for 1000 lines, but you can see progress during larger file loads.
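The container creation, schema creation, and import steps above can also be driven from a script. The sketch below only assembles the same command lines shown in steps 1, 2, and 4, using no flags beyond those; the function names and the `dry_run` switch are hypothetical. With `dry_run=True` nothing is executed and the commands are just printed.

```python
import subprocess


def quickstart_commands(container, schema_name, schema_file, map_file, csv_file):
    """Build the argument lists for steps 1, 2, and 4 of this quickstart."""
    return [
        ["sos-db", "--path", container, "--create"],
        ["sos-schema", "--path", container, "--add", schema_file],
        ["sos-import-csv", "--path", container, "--schema", schema_name,
         "--map", map_file, "--csv", csv_file],
    ]


def run_all(commands, dry_run=True):
    """Print each command, and execute it unless dry_run is set."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check_call raises CalledProcessError if a tool exits non-zero.
            subprocess.check_call(argv)


commands = quickstart_commands(
    "/dir/my-container",
    "meminfo_E5-2698",
    "meminfo_E5-2698.schema.json",
    "meminfo_E5-2698.map.json",
    "meminfo_E5-2698.1000",
)
run_all(commands, dry_run=True)
```

Passing argument lists rather than a single shell string avoids quoting problems if a path contains spaces.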
6. Query for the data in a container:
a. Query all the data, using comp_time as an index, which determines the output order:
> sos_cmd -C /dir/my-container -q -S meminfo_E5-2698 -X comp_time
timestamp component_id job_id ... DirectMap1G comp_time job_comp_time job_time_comp
-------------------------------- ------------------ ------------------ ... --------------------------------
1518803953.003055 12 5078835 ... 1957888 134217728 05:00:0C:00:00:00:00:00:00:00:0 05:00:33:7F:4D:00:00:00:00:00:0 05:00:33:7F:4D:00:00:00:00:00:0
1518803954.002904 12 5078835 ... 1957888 134217728 05:00:0C:00:00:00:00:00:00:00:0 05:00:33:7F:4D:00:00:00:00:00:0 05:00:33:7F:4D:00:00:00:00:00:0
...
1518803961.002805 179 0 0 ... 1957888 134217728 05:00:B3:00:00:00:00:00:00:00:0 05:00:00:00:00:00:00:00:00:00:0 05:00:00:00:00:00:00:00:00:00:0
1518803962.002661 179 0 0 ... 1957888 134217728 05:00:B3:00:00:00:00:00:00:00:0 05:00:00:00:00:00:00:00:00:00:0 05:00:00:00:00:00:00:00:00:00:0
-------------------------------- ------------------ ... --------------------------------
Records 1000/1000.
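The effect of the index choice on output order can be sketched in plain Python. This assumes that a join index orders records by its join_attrs in the listed sequence, which is consistent with the comp_time output above (component 12 before component 179, each in time order). The code does not use sosdb; `order_by_index` is a hypothetical helper, and the rows are four records taken from the output shown.

```python
# join_attrs of the three join indices, copied from the schema file.
JOIN_INDEXES = {
    "comp_time": ["component_id", "timestamp"],
    "job_comp_time": ["job_id", "component_id", "timestamp"],
    "job_time_comp": ["job_id", "timestamp", "component_id"],
}

# Four records from the query output, deliberately listed out of order.
rows = [
    {"timestamp": 1518803961.002805, "component_id": 179, "job_id": 0},
    {"timestamp": 1518803954.002904, "component_id": 12, "job_id": 5078835},
    {"timestamp": 1518803962.002661, "component_id": 179, "job_id": 0},
    {"timestamp": 1518803953.003055, "component_id": 12, "job_id": 5078835},
]


def order_by_index(rows, index_name):
    """Sort rows by the tuple of attributes that make up the named join index."""
    attrs = JOIN_INDEXES[index_name]
    return sorted(rows, key=lambda row: tuple(row[attr] for attr in attrs))


by_comp_time = order_by_index(rows, "comp_time")
by_job_time_comp = order_by_index(rows, "job_time_comp")

for row in by_comp_time:
    print(row["component_id"], row["timestamp"])
```

Under the same assumption, ordering by job_time_comp would list the job_id 0 records (component 179) before those of job 5078835.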
b. Query only for certain variables (also using an index):
> sos_cmd -C /dir/my-container -q -S meminfo_E5-2698 -X comp_time -f table -V timestamp -V component_id -V Active
timestamp                        component_id       Active
-------------------------------- ------------------ ------------------
1518803953.003055                12                 82672
1518803954.002904                12                 82672
1518803955.002760                12                 82672
...
1518803960.001899                179                209712
1518803961.002805                179                209712
1518803962.002661                179                209712
-------------------------------- ------------------ ------------------
Records 1000/1000.
c. Querying with a filter:
> sos_cmd -C /dir/my-container -q -S meminfo_E5-2698 -X comp_time -f table -V timestamp -V component_id -V Active -F "timestamp:gt:1518803957"
timestamp                        component_id       Active
-------------------------------- ------------------ ------------------
1518803957.003462                12                 82672
1518803958.003315                12                 82672
1518803959.001410                12                 82672
1518803960.002299                12                 82672
1518803961.002159                12                 82672
...
1518803957.003083                179                209712
1518803958.002909                179                209712
1518803959.001032                179                209712
1518803960.001899                179                209712
1518803961.002805                179                209712
1518803962.002661                179                209712
-------------------------------- ------------------ ------------------
Records 600/600.
d. Querying with multiple filters:
> sos_cmd -C /dir/my-container -q -S meminfo_E5-2698 -X comp_time -f table -V timestamp -V component_id -V Active -F "timestamp:gt:1518803960" -F "component_id:gt:177"
timestamp                        component_id       Active
-------------------------------- ------------------ ------------------
1518803960.002343                178                682756
1518803961.002104                178                682756
1518803962.001890                178                682756
1518803960.001899                179                209712
1518803961.002805                179                209712
1518803962.002661                179                209712
-------------------------------- ------------------ ------------------
Records 6/6.
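The way several -F conditions combine can be illustrated with a small Python sketch. It assumes that multiple filters are ANDed together, which matches the 6 records returned above, and it handles only the `gt` comparison used in this quickstart. This is not how sos_cmd is implemented; `parse_filter` and `apply_filters` are hypothetical names, and the rows are a handful of records copied from the outputs above.

```python
import operator

# Only "gt" appears in this quickstart; other comparison names would be assumptions.
COMPARATORS = {"gt": operator.gt}


def parse_filter(text):
    """Split 'attribute:comparison:value' into its three parts."""
    attr, comparison, value = text.split(":", 2)
    return attr, COMPARATORS[comparison], float(value)


def apply_filters(rows, filter_texts):
    """Keep the rows that satisfy every filter (logical AND)."""
    parsed = [parse_filter(text) for text in filter_texts]
    return [
        row for row in rows
        if all(compare(row[attr], value) for attr, compare, value in parsed)
    ]


rows = [
    {"timestamp": 1518803960.002299, "component_id": 12, "Active": 82672},
    {"timestamp": 1518803961.002159, "component_id": 12, "Active": 82672},
    {"timestamp": 1518803960.002343, "component_id": 178, "Active": 682756},
    {"timestamp": 1518803961.002104, "component_id": 178, "Active": 682756},
    {"timestamp": 1518803962.001890, "component_id": 178, "Active": 682756},
    {"timestamp": 1518803959.001032, "component_id": 179, "Active": 209712},
    {"timestamp": 1518803960.001899, "component_id": 179, "Active": 209712},
    {"timestamp": 1518803961.002805, "component_id": 179, "Active": 209712},
    {"timestamp": 1518803962.002661, "component_id": 179, "Active": 209712},
]

selected = apply_filters(rows, ["timestamp:gt:1518803960", "component_id:gt:177"])
for row in selected:
    print(row["timestamp"], row["component_id"], row["Active"])
```

The component 12 rows fail the component_id condition and the 1518803959 row fails the timestamp condition, leaving the same six records as the sos_cmd query.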