Missing attribute 'fullpath' for evaluating boolean expression #125

Open
thiell opened this issue Aug 5, 2022 · 2 comments
Comments

@thiell
Contributor

thiell commented Aug 5, 2022

We've been investigating what could cause the following errors for existing files when using post_sched_match = auto_update; (the default):

2022/08/05 10:31:05 [30914/3] Policy | Missing attribute 'fullpath' for evaluating boolean expression on [0x2000520e0:0x2a62:0x0]
2022/08/05 10:31:05 [30914/3] Policy | [0x2000520e0:0x2a62:0x0]: attribute is missing for checking ignore_fileclass rule
2022/08/05 10:31:05 [30914/3] checkdv | Warning: cannot determine if entry  is whitelisted: skipping it.

I believe others have also reported this on the mailing list.

When running the policy, we have a few occurrences per minute, so it's not insignificant.
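To put a number on "a few occurrences per minute", a small script along these lines can count the errors per minute in a log file. This is only a sketch: the regular expression is inferred from the log excerpts in this issue, and `count_missing_per_minute` is a hypothetical helper name, not part of robinhood.

```python
import re
from collections import Counter

# Timestamp prefix, attribute name and FID of a "Missing attribute" line.
# The layout is inferred from the excerpts in this issue only.
MISSING_RE = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}):\d{2} \[\d+/\d+\] Policy \| "
    r"Missing attribute '(\w+)' for evaluating boolean expression on "
    r"\[(0x[0-9a-fA-F]+:0x[0-9a-fA-F]+:0x[0-9a-fA-F]+)\]"
)


def count_missing_per_minute(lines):
    """Return a Counter mapping 'YYYY/MM/DD HH:MM' to the number of errors."""
    counts = Counter()
    for line in lines:
        m = MISSING_RE.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts


sample = [
    "2022/08/05 10:31:05 [30914/3] Policy | Missing attribute 'fullpath' for evaluating boolean expression on [0x2000520e0:0x2a62:0x0]",
    "2022/08/05 10:31:05 [30914/3] Policy | [0x2000520e0:0x2a62:0x0]: attribute is missing for checking ignore_fileclass rule",
    "2022/08/05 09:37:08 [28176/26] Policy | Missing attribute 'fullpath' for evaluating boolean expression on [0x2000523b3:0x1a:0x0]",
]
counts = count_missing_per_minute(sample)
for minute, n in sorted(counts.items()):
    print(minute, n)
```

Only the first line of each three-line error group is counted, so one affected entry contributes one hit.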

We checked several of the affected FIDs, and each time the path resolves correctly in the DB, as for this one:

MariaDB [robinhood_fir]> SELECT this_path(parent_id,name) FROM ENTRIES LEFT JOIN NAMES ON ENTRIES.id=NAMES.id WHERE ENTRIES.id='0x2000520e0:0x2a62:0x0';
+---------------------------------------------------------------------------------------------------+
| this_path(parent_id,name)                                                                         |
+---------------------------------------------------------------------------------------------------+
| 0x200000007:0x1:0x0/users/oparedes/Orca_Calc/Geom_opt/Arun_Kummar/A2/CL20_GeomOpt.proc6.orho0.tmp |
+---------------------------------------------------------------------------------------------------+
1 row in set (0.001 sec)
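The same check can be scripted for a batch of FIDs. The sketch below only builds the query text and its parameter list; it does not connect to the database. `build_path_query` is a hypothetical helper, and the `%s` placeholders assume a Python MySQL/MariaDB driver that uses the "format" parameter style.

```python
def build_path_query(fids):
    """Build the this_path() lookup used above for several FIDs at once."""
    if not fids:
        raise ValueError("at least one FID is required")
    # One placeholder per FID; values are passed separately to the driver.
    placeholders = ", ".join(["%s"] * len(fids))
    sql = (
        "SELECT ENTRIES.id, this_path(parent_id,name) "
        "FROM ENTRIES LEFT JOIN NAMES ON ENTRIES.id=NAMES.id "
        f"WHERE ENTRIES.id IN ({placeholders})"
    )
    return sql, list(fids)


sql, params = build_path_query(
    ["0x2000520e0:0x2a62:0x0", "0x2000523b3:0x1a:0x0"]
)
print(sql)
print(params)
```

The returned pair would typically be passed to a cursor's `execute(sql, params)` call.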

However, the full logs show that the path is not resolved when the policy runs (example with another FID, 0x2000523b3:0x1a:0x0):

2022/08/05 09:37:08 [28176/21] ListMgr | SQL query: SELECT id FROM ENTRIES WHERE id='0x2000523b3:0x1a:0x0'
2022/08/05 09:37:08 [28176/11] ListMgr | SQL query: COMMIT
2022/08/05 09:37:08 [28176/21] checkdv | requests: OK + in flight = 14
2022/08/05 09:37:08 [28176/26] checkdv | Checking if entry  matches policy rules (mode=auto_update_attrs)
2022/08/05 09:37:08 [28176/26] checkdv | Updating info about [0x2000523b3:0x1a:0x0]
2022/08/05 09:37:08 [28176/26] checkdv | Updating POSIX info of [0x2000523b3:0x1a:0x0]
2022/08/05 09:37:08 [28176/21] ListMgr | SQL query: SELECT size,last_mod,type,checkdv_lstchk,checkdv_out,parent_id,name,path_update,this_path(parent_id,name) FROM ENTRIES LEFT JOIN NAMES ON ENTRIES.id=NAMES.id WHERE ENTRIES.id='0x2000523b3:0x1b:0x0'
2022/08/05 09:37:08 [28176/26] Policy | Missing attribute 'fullpath' for evaluating boolean expression on [0x2000523b3:0x1a:0x0]
2022/08/05 09:37:08 [28176/26] Policy | [0x2000523b3:0x1a:0x0]: attribute is missing for checking ignore_fileclass rule
2022/08/05 09:37:08 [28176/26] checkdv | Warning: cannot determine if entry  is whitelisted: skipping it.
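Because several threads write to the same log, it helps to split the excerpt by the thread number in the `[pid/thread]` prefix. The sketch below does that for a subset of the lines above; the prefix layout is inferred from this excerpt, and `group_by_thread` is a hypothetical helper.

```python
import re
from collections import defaultdict

# "date time [pid/thread] message" as it appears in the excerpt (inferred).
PREFIX_RE = re.compile(r"^\S+ \S+ \[(\d+)/(\d+)\] (.*)$")


def group_by_thread(lines):
    """Map thread number -> list of messages, preserving log order."""
    threads = defaultdict(list)
    for line in lines:
        m = PREFIX_RE.match(line)
        if m:
            threads[int(m.group(2))].append(m.group(3))
    return threads


sample = [
    "2022/08/05 09:37:08 [28176/21] ListMgr | SQL query: SELECT id FROM ENTRIES WHERE id='0x2000523b3:0x1a:0x0'",
    "2022/08/05 09:37:08 [28176/11] ListMgr | SQL query: COMMIT",
    "2022/08/05 09:37:08 [28176/26] checkdv | Updating info about [0x2000523b3:0x1a:0x0]",
    "2022/08/05 09:37:08 [28176/26] checkdv | Updating POSIX info of [0x2000523b3:0x1a:0x0]",
    "2022/08/05 09:37:08 [28176/21] ListMgr | SQL query: SELECT size,last_mod,type,checkdv_lstchk,checkdv_out,parent_id,name,path_update,this_path(parent_id,name) FROM ENTRIES LEFT JOIN NAMES ON ENTRIES.id=NAMES.id WHERE ENTRIES.id='0x2000523b3:0x1b:0x0'",
    "2022/08/05 09:37:08 [28176/26] Policy | Missing attribute 'fullpath' for evaluating boolean expression on [0x2000523b3:0x1a:0x0]",
]
threads = group_by_thread(sample)
for msg in threads[26]:
    print(msg)
```

Read this way, thread 26 logs "Updating POSIX info" for the entry but no "Updating path info" line before the error, and the `this_path()` query issued by thread 21 is for a different FID (0x2000523b3:0x1b:0x0). Both observations come from this short excerpt only, so they may not hold for the full log.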

Setting post_sched_match = force_update; appears to make the error go away. That is our current workaround, but I'm worried it will slow down the policy (testing now...).

I suspect an issue in src/policies/policy_run.c:check_entry() (around lines 2270-2288), where the path is updated:

    /* get fullpath or name, if they are needed to apply the policy */
    if (need_update(check_method, updt_mask.std &
                        (ATTR_MASK_fullpath | ATTR_MASK_name))) {
        DisplayLog(LVL_FULL, tag(policy), "Updating path info of "DFID,
                   PFID(&p_item->entry_id));
        switch (path_check_update(&p_item->entry_id, stat_path, new_attr_set,
                                  updt_mask)) {
        case PCR_UPDATED:
            updated = true;
            break;

        case PCR_NO_CHANGE:
            break;

        case PCR_ORPHAN:
            /* no path to access it, handle it as if it had been moved */
            return AS_MOVED;
        }
    }

Any ideas? :) Thanks

@tl-cea
Member

tl-cea commented Aug 10, 2022

To help reproduce the issue, could you show what your policy looks like? In particular, in which clause is the path matched?
Also, what command do you execute for the policy run?
Thanks

@thiell
Contributor Author

thiell commented Sep 8, 2022

Apologies for the delay. The policy basically looks like either this:

define_policy checkdv {
    status_manager = checker;
    scope { type == file }
    default_lru_sort_attr = none;
    # 'output' stands for previous value in DB
    # 7862400 = 90 days + 1 day grace
    default_action = cmd("/usr/sbin/rbh_checkdv /fir '{creation_time}' '{output}' 7862400 '{fid}'");
}

checkdv_rules {
    # ignore system files
    ignore_fileclass = system;

    rule default {
        condition { (checkdv.last_check == 0 or checkdv.output == "") and creation > 2d and
                    (ost_index == 0 or
                     ost_index == 1 or
                     ost_index == 2 or
                     ost_index == 3 or
                     ost_index == 4 or
                     ost_index == 5 or
                     ost_index == 6 or
                     ost_index == 7 or
                     ost_index == 8 or
                     ost_index == 9 or
                     ost_index == 10 or
                     ost_index == 11) }
    }
}

checkdv_parameters {
    db_result_size_max = 262144;
    queue_size = 65536;
    nb_threads = 8;
    reschedule_delay_ms = 0;
    recheck_ignored_entries = no;
    report_interval = 1min;
    pre_sched_match = none;
    post_sched_match = force_update;
}

...or like that:

define_policy checkdv {
    status_manager = checker;
    scope { type == file }
    default_lru_sort_attr = creation; #oldest first
    # 'output' stands for previous value in DB
    # 7862400 = 90 days + 1 day grace
    default_action = cmd("/usr/sbin/rbh_checkdv /fir '{creation_time}' '{output}' 7862400 '{fid}'");
}

checkdv_rules {
    # ignore system files
    ignore_fileclass = system;

    rule default {
        condition { checkdv.output != "" and creation > 90d and checkdv.last_check > 2d and
                    (ost_index == 0 or
                     ost_index == 1 or
                     ost_index == 2 or
                     ost_index == 3 or
                     ost_index == 4 or
                     ost_index == 5 or
                     ost_index == 6 or
                     ost_index == 7 or
                     ost_index == 8 or
                     ost_index == 9 or
                     ost_index == 10 or
                     ost_index == 11) }
    }
}

checkdv_parameters {
    db_result_size_max = 1000000000;  #no pagination? if not big enough we might never reach some entries
    queue_size = 131072;
    nb_threads = 8;
    reschedule_delay_ms = 0;
    recheck_ignored_entries = no;
    report_interval = 1min;
    pre_sched_match = none;
    post_sched_match = force_update;
}

So I believe fullpath is only used as part of this fileclass:

FileClass system {
    definition { tree == "/fir/.*" }
}
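To illustrate why this fileclass needs fullpath, here is a rough model of the check, not robinhood's actual code. It assumes shell-style glob matching for tree (approximated with Python's `fnmatchcase`) and that tree matches a directory or anything below it. `matches_tree` and the example paths are hypothetical.

```python
from fnmatch import fnmatchcase


def matches_tree(fullpath, pattern):
    """Return True/False, or None when the path attribute is unavailable."""
    if fullpath is None:
        # Cannot decide without a path: mirrors the "attribute is missing" case.
        return None
    # Assumption: 'tree' matches the directory itself or anything below it.
    return fnmatchcase(fullpath, pattern) or fnmatchcase(fullpath, pattern + "/*")


SYSTEM_TREE = "/fir/.*"  # pattern from the 'system' fileclass above

print(matches_tree("/fir/.lustre/fid", SYSTEM_TREE))
print(matches_tree("/fir/users/oparedes/Orca_Calc", SYSTEM_TREE))
print(matches_tree(None, SYSTEM_TREE))
```

The third case is the relevant one here: with no fullpath there is no way to say whether the entry is in the ignored class, which would fit the "cannot determine if entry is whitelisted: skipping it" warning.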

To be honest, we have made quite a few changes since I opened this ticket, so these differ slightly from the original policy. I will try to test some more when possible. In any case, it seems to work fine with post_sched_match = force_update;.
