Skip to content

Comments

feat: add role-based authorization for guest and read-only users#381

Closed
Matovidlo wants to merge 11 commits intomainfrom
role-based-authorization-guest-read
Closed

feat: add role-based authorization for guest and read-only users#381
Matovidlo wants to merge 11 commits intomainfrom
role-based-authorization-guest-read

Conversation

@Matovidlo
Copy link
Contributor

@Matovidlo Matovidlo commented Feb 2, 2026

Changes:

  • Add role detection helpers (is_guest_role, is_read_only_role, requires_read_only_access)
  • Filter tools list to show only read-only tools for guest/read roles
  • Block write tool execution with clear error messages
  • Add comprehensive test suite (35 tests) for role-based authorization
  • Update README with role-based access control documentation

Guest and read-only users now have access to 15 read-only tools only: get_jobs, get_buckets, get_tables, get_project_info, docs_query, query_data, search, find_component_id, get_data_apps, get_components, get_configs, get_config_examples, get_flows, get_flow_examples, get_flow_schema

Description

Implements read-only tool access restrictions for guest and read-only roles to enable safe in-platform chat access with limited permissions.

Linear: AI-2442

Change Type

  • Major (breaking changes, significant new features)
  • Minor (new features, enhancements, backward compatible)
  • Patch (bug fixes, small improvements, no new features)

Summary

Implements read-only tool access restrictions for guest and read-only roles to enable safe in-platform chat access with limited permissions. We need to restrict roles from using write tools, because API returns 403 HTTP status code for them.

Testing

  • Tested with Cursor AI desktop (Streamable-HTTP transports)

Optional testing

  • Tested with Cursor AI desktop (all transports)
  • Tested with claude.ai web and canary-orion MCP (SSE and Streamable-HTTP)
  • Tested with In Platform Agent on canary-orion
  • Tested with RO chat on canary-orion

Checklist

  • Self-review completed
  • Unit tests added/updated (if applicable)
  • Integration tests added/updated (if applicable)
  • Project version bumped according to the change type (if applicable)
  • Documentation updated (if applicable)

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements role-based authorization to restrict guest and read-only users to tools annotated as read-only (readOnlyHint=True), preventing write tool execution and reducing 403 errors for those roles.

Changes:

  • Added role detection helpers and read-only tool detection to ToolsFilteringMiddleware, plus filtering/blocking logic for guest/read roles.
  • Expanded unit test coverage with a dedicated role-based authorization test suite and added role cases to existing server tests.
  • Updated README with role-based access control documentation.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
src/keboola_mcp_server/mcp.py Adds role helpers, read-only tool detection, and enforces read-only access for guest/read roles in tool listing and execution.
tests/test_role_based_authorization.py New comprehensive test suite validating role detection, tool list filtering, and tool call blocking behavior.
tests/test_server.py Extends existing server tests to validate tool availability for guest and read roles.
README.md Documents the new role-based access control behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@vita-stejskal vita-stejskal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bump the project version.

Matovidlo added a commit that referenced this pull request Feb 3, 2026
- Create utils.py with shared is_read_only_tool() function
- Remove duplicate _is_read_only_tool() from authorization.py and mcp.py
- Remove unnecessary wrapper functions (is_guest_role, is_read_only_role, requires_read_only_access)
- Use direct inline checks: if token_role in ['guest', 'readonly']
- Fix case sensitivity bug: readOnly -> readonly (to match .lower() conversion)
- Change INFO to DEBUG logging for frequent operations

Addresses PR #381 review feedback on code simplification and deduplication.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Matovidlo added a commit that referenced this pull request Feb 3, 2026
- Reorder role descriptions in logical progression (most restrictive first)
- Use consistent terminology ("read-only tools" instead of "query-only operations")
- Add clear explanations of what each role can/cannot do
- Clarify that admin role allows Scheduler API access

Addresses PR #381 review feedback on documentation clarity.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Matovidlo added a commit that referenced this pull request Feb 3, 2026
- Add test_on_call_tool_role_blocking to test_mcp.py (mirrors source structure)
- Remove redundant test_role_based_authorization.py (wrong structure)
- Test file structure now properly mirrors source code structure

Integration tests in test_server.py already provide comprehensive
role-based authorization coverage with real server instances.

Addresses PR #381 review feedback on test structure and quality.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Matovidlo added a commit that referenced this pull request Feb 3, 2026
- Create utils.py with shared is_read_only_tool() function
- Remove duplicate _is_read_only_tool() from authorization.py and mcp.py
- Remove unnecessary wrapper functions (is_guest_role, is_read_only_role, requires_read_only_access)
- Use direct inline checks: if token_role in ['guest', 'readonly']
- Fix case sensitivity bug: readOnly -> readonly (to match .lower() conversion)
- Change INFO to DEBUG logging for frequent operations

Addresses PR #381 review feedback on code simplification and deduplication.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Matovidlo added a commit that referenced this pull request Feb 3, 2026
- Reorder role descriptions in logical progression (most restrictive first)
- Use consistent terminology ("read-only tools" instead of "query-only operations")
- Add clear explanations of what each role can/cannot do
- Clarify that admin role allows Scheduler API access

Addresses PR #381 review feedback on documentation clarity.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Matovidlo added a commit that referenced this pull request Feb 3, 2026
- Add test_on_call_tool_role_blocking to test_mcp.py (mirrors source structure)
- Remove redundant test_role_based_authorization.py (wrong structure)
- Test file structure now properly mirrors source code structure

Integration tests in test_server.py already provide comprehensive
role-based authorization coverage with real server instances.

Addresses PR #381 review feedback on test structure and quality.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@Matovidlo Matovidlo force-pushed the role-based-authorization-guest-read branch 3 times, most recently from fbe66ba to c24cf99 Compare February 4, 2026 07:24
Matovidlo and others added 10 commits February 4, 2026 10:55
Implements read-only tool access restrictions for guest and read-only roles
to enable safe in-platform chat access with limited permissions.

Changes:
- Add role detection helpers (is_guest_role, is_read_only_role, requires_read_only_access)
- Filter tools list to show only read-only tools for guest/read roles
- Block write tool execution with clear error messages
- Add comprehensive test suite (35 tests) for role-based authorization
- Update README with role-based access control documentation

Guest and read-only users now have access to 15 read-only tools only:
get_jobs, get_buckets, get_tables, get_project_info, docs_query, query_data,
search, find_component_id, get_data_apps, get_components, get_configs,
get_config_examples, get_flows, get_flow_examples, get_flow_schema

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Create utils.py with shared is_read_only_tool() function
- Remove duplicate _is_read_only_tool() from authorization.py and mcp.py
- Remove unnecessary wrapper functions (is_guest_role, is_read_only_role, requires_read_only_access)
- Use direct inline checks: if token_role in ['guest', 'readonly']
- Fix case sensitivity bug: readOnly -> readonly (to match .lower() conversion)
- Change INFO to DEBUG logging for frequent operations

Addresses PR #381 review feedback on code simplification and deduplication.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Reorder role descriptions in logical progression (most restrictive first)
- Use consistent terminology ("read-only tools" instead of "query-only operations")
- Add clear explanations of what each role can/cannot do
- Clarify that admin role allows Scheduler API access

Addresses PR #381 review feedback on documentation clarity.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add test_on_call_tool_role_blocking to test_mcp.py (mirrors source structure)
- Remove redundant test_role_based_authorization.py (wrong structure)
- Test file structure now properly mirrors source code structure

Integration tests in test_server.py already provide comprehensive
role-based authorization coverage with real server instances.

Addresses PR #381 review feedback on test structure and quality.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Minor version bump for role-based authorization feature addition.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Establishes comprehensive project guidelines in .claude/CLAUDE.md covering code simplicity, test structure, logging standards, and quality checks. Fixes flake8 violations (unused exception variables, fixture naming) and black formatting in test files to meet quality standards.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Integration tests were hardcoded to US region URLs (connection.keboola.com,
queue.keboola.com, ai.keboola.com), causing 27 test failures when run
against other Keboola regions like EU GCP or EU AWS. This prevented
developers from running integration tests locally on non-US environments.

Changes:
- Add region-aware fixtures (hostname_suffix, connection_url, queue_url,
  ai_url, links_manager) to integtests/conftest.py
- Replace hardcoded URLs in test assertions with dynamic fixtures
- Use ProjectLinksManager for URL generation (same abstraction used in
  source code)
- Fix flow conftest to properly delete configurations (double-delete for
  permanent removal)

Tests now work on any Keboola region without code modification.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@Matovidlo Matovidlo force-pushed the role-based-authorization-guest-read branch from c24cf99 to 32ced0b Compare February 4, 2026 09:58
@Matovidlo Matovidlo marked this pull request as ready for review February 5, 2026 21:10
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 15 out of 17 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 173 to +181
LOG.exception(f'Failed to trigger tool event for "{func.__name__}" tool: {e}')
raise
# Only swallow 403 Forbidden errors (expected for guest/read-only roles)
# Re-raise other errors as they indicate genuine problems
if isinstance(e, httpx.HTTPStatusError) and e.response.status_code == 403:
# Expected for restricted roles (guest, read-only) - don't fail the tool call
pass
else:
# Unexpected error - re-raise to alert about the problem
raise
Copy link

Copilot AI Feb 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logging logic logs all event trigger failures at ERROR level (line 173) before checking if it's an expected 403 error. This will create noisy error logs for every tool call by guest/read-only users, even though 403 responses are expected and intentionally handled.

Consider moving the exception logging inside the else block (after line 179) so that only unexpected errors are logged at ERROR level. For 403 errors, you could use LOG.debug() to indicate they were silently handled as expected.

Copilot uses AI. Check for mistakes.
if 'tool_result' in locals() and hasattr(tool_result, 'structured_content'):
try:
configuration_id = tool_result.structured_content.get('configuration_id')
except Exception:
Copy link

Copilot AI Feb 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'except' clause does nothing but pass and there is no explanatory comment.

Copilot uses AI. Check for mistakes.
Comment on lines +201 to +202
except Exception:
pass
Copy link

Copilot AI Feb 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
except Exception:
pass
except Exception as parse_error:
LOG.debug(
"Failed to extract configuration_id from tool_result during error handling: %s",
parse_error,
)

Copilot uses AI. Check for mistakes.
Comment on lines +268 to +269
except Exception:
pass
Copy link

Copilot AI Feb 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
except Exception:
pass
except Exception as extraction_error:
LOG.warning(
"Failed to extract configuration_id from tool_result during error handling: %s",
extraction_error,
)

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@vita-stejskal vita-stejskal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revert your changes in the integration tests.

configuration_id=created_config.configuration_id,
configuration_name=test_name,
),
links_manager.get_config_dashboard_link(component_id=component_id, component_name=None),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revert the changes that introduced the links_manager fixture. It's use degrades the tests and essentially results in the links not being tested at all. The tools use the LinksManager to produce the links and now the tests use the same LinksManager to check that the links in the tool output are correct. This tests nothing really. If there were a bug in the LinksManager causing it to produce no links at all for example, the test would happily pass and indicate no problem.



@pytest.fixture(scope='session')
def _ensure_clean_flows(storage_api_token: str, storage_api_url: str) -> Generator[None, Any, None]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that we should do this. The fact that there are existing flows in the testing project means that the project is being used by another tests that just happen to run at the same time.

def _ensure_clean_flows(storage_api_token: str, storage_api_url: str) -> Generator[None, Any, None]:
"""
Ensure the project has no flows before and after flow tests.
This prevents leftover flows from failed tests affecting new test runs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there really are leftover objects (not only flows, but also buckets, tables, etc) in the testing project we should improve the way how we detect that another tests are using the testing project and how the integtests claim that the testing project belongs to them.

Currently this is done by checking that the testing project is empty. But this does not work reliably if the tests run fails in some unexpected way and does not clean up the testing project. The "locking" of the testing project should be redesigned in the way that allows detecting stale locks and recovering from the situation automatically.

There are ways how to achieve this, but it is not something that should be squeezed to a totally unrelated PR. And this fixture does not solve the problem anyway. Although it may help to remedy some of those situations.


except Exception:
# If tool creation fails but returned a configuration_id, try to extract it
if 'tool_result' in locals() and hasattr(tool_result, 'structured_content'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty awkward and I don't think that it makes the components creation any more reliable. This and similar changes in other functions here seem to try to work around the testing project "locking" problem that I described below. I'd strongly recommend to revert these changes and fix the locking of the testing project properly in a separate task/PR.

pytest.fail(f'Expecting empty Keboola project {project_id}, but found {len(current_buckets)} buckets')
LOG.warning(
f'Found {len(current_buckets)} buckets in project {project_id} (expected empty). '
f'Cleaning up from previous interrupted test run...'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is totally wrong. The buckets are not leftover from a previous run. They are the testing artifacts created by another set of the integration tests running at the same time. Please revert this change.


# Role-based filtering: read-only access for guest and read roles
if token_role in ['guest', 'readonly']:
tools = [t for t in tools if is_read_only_tool(t)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is too restricted for the guest role, I think. IIRC the conclusion from the Slack discussions was to only filter out the flow tools for the quest role.


This helper is used by both ToolsFilteringMiddleware and ToolAuthorizationMiddleware
to determine which tools are read-only. Tools with readOnlyHint=True are safe for
guest and read-only users to access.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This paragraph does not belong here. The docstring of a function should say what the function is or what it does and not how or where it is used.

Plus it's amusing when a one-liner function comes with ten lines of docstring.

# Create mock tool with read-only annotation
tool = _tool(tool_name)
tool.annotations = MagicMock()
tool.annotations.readOnlyHint = tool_read_only
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _tool() function could do the annotations mocking too.

.idea/
.vscode/
.vscode
.claude/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct? You've added .claude/CLAUDE.md in the PR and at the same time you git-ignore .claude/ folder.

- **Guest**: Read-only access limited to tools marked as read-only (no modifying operations)
- **Read**: Similar to guest, users with read role can only access tools marked as read-only (no modifying operations)
- **Other non-admin roles**: Standard write access with some administrative tools restricted (e.g., `modify_flow` for scheduling)
- **Admin**: Broad tool access for administrative operations, with specific tools restricted (e.g., `update_flow` is blocked, use `modify_flow` for scheduling)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not mention explicitly the modify_flow versus update_flow tools. It is an implementation detail that there are two tools that only differ in the schedulers support. From the users' perspective both are just the same tool for changing flows.

@vita-stejskal
Copy link
Contributor

Superseded by #397.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants