Feature store lakeformation#5599

Draft
BassemHalim wants to merge 10 commits into aws:master from BassemHalim:feature-store-lakeformation

BassemHalim (Contributor) commented Mar 4, 2026

Description

This PR adds Lake Formation integration to SageMaker Feature Store, enabling customers to govern access to their offline store data through AWS Lake Formation instead of relying solely on IAM policies.

It replaces most of the manual setup described in this blog post: https://aws.amazon.com/blogs/machine-learning/control-access-to-amazon-sagemaker-feature-store-offline-using-aws-lake-formation/

New Features

LakeFormationConfig class

  • Configuration object for Lake Formation settings
  • Attributes: enabled, use_service_linked_role, registration_role_arn, show_s3_policy
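
For readers who want a picture of the configuration object, here is a minimal sketch built only from the four attribute names listed above. The class name `LakeFormationConfigSketch`, the defaults, and the `validate` rule are assumptions for illustration, not the PR's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LakeFormationConfigSketch:
    """Hypothetical stand-in for LakeFormationConfig; all defaults are assumed."""

    enabled: bool = False
    use_service_linked_role: bool = True
    registration_role_arn: Optional[str] = None
    show_s3_policy: bool = False

    def validate(self) -> None:
        # Assumed rule: if the service-linked role is not used, a custom
        # registration role must be supplied instead.
        if (
            self.enabled
            and not self.use_service_linked_role
            and not self.registration_role_arn
        ):
            raise ValueError(
                "registration_role_arn is required when "
                "use_service_linked_role is False"
            )
```

Whether the real class validates at construction time or only when `enable_lake_formation()` runs is not stated in this description.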

FeatureGroup.create() - new lake_formation_config parameter

  • Enables Lake Formation governance at Feature Group creation time
  • Automatically waits for Feature Group to reach "Created" status before configuring Lake Formation
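
The wait step above could look roughly like the polling loop below. This is a sketch under assumptions: the helper name, the polling interval, and the timeout are hypothetical, and the `describe` callable is injected so the loop does not depend on a particular client. The status strings `Created` and `CreateFailed` are the values returned by the SageMaker `DescribeFeatureGroup` API.

```python
import time
from typing import Callable, Dict


def wait_until_created(
    describe: Callable[[], Dict],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
) -> None:
    """Poll until the Feature Group reports "Created" (hypothetical helper)."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = describe()["FeatureGroupStatus"]
        if status == "Created":
            return
        if status == "CreateFailed":
            # Stop early: Lake Formation cannot be configured on a failed group.
            raise RuntimeError("Feature Group creation failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Feature Group still in status {status!r}")
        time.sleep(poll_seconds)
```

With boto3, `describe` could be something like `lambda: sagemaker_client.describe_feature_group(FeatureGroupName=name)`.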

FeatureGroup.enable_lake_formation() method

  • Enables Lake Formation on existing Feature Groups
  • Three-phase setup:
    1. Registers S3 location with Lake Formation
    2. Grants permissions to the offline store role on the Glue table
    3. Revokes the IAMAllowedPrincipals group's permissions on the Glue table
  • Fail-fast behavior with clear error reporting at each phase
  • Optional show_s3_policy parameter prints recommended S3 deny policy
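
The three phases map onto three Lake Formation API calls. The sketch below shows one way to sequence them with fail-fast error reporting; it is not the PR's code. The function names, the argument list, and the permission set granted in phase 2 are assumptions. `register_resource`, `grant_permissions`, and `revoke_permissions` are existing boto3 Lake Formation client methods, and the client is passed in rather than created here.

```python
from typing import Any, Callable, Optional


def _run_phase(name: str, call: Callable[..., Any], **kwargs: Any) -> None:
    # Fail fast: wrap any error so the failing phase is named in the message.
    try:
        call(**kwargs)
    except Exception as exc:
        raise RuntimeError(
            f"Lake Formation setup failed during phase '{name}': {exc}"
        ) from exc


def enable_lake_formation_sketch(
    lf_client: Any,
    s3_arn: str,
    database: str,
    table: str,
    offline_store_role_arn: str,
    registration_role_arn: Optional[str] = None,
) -> None:
    """Hypothetical three-phase setup; lf_client is a boto3 'lakeformation' client."""
    table_resource = {"Table": {"DatabaseName": database, "Name": table}}

    # Phase 1: register the offline store S3 location.
    register_kwargs = {"ResourceArn": s3_arn}
    if registration_role_arn:
        register_kwargs["RoleArn"] = registration_role_arn
    else:
        register_kwargs["UseServiceLinkedRole"] = True
    _run_phase("register S3 location", lf_client.register_resource, **register_kwargs)

    # Phase 2: grant the offline store role access to the Glue table.
    # The permission list is an assumption, not taken from the PR.
    _run_phase(
        "grant table permissions",
        lf_client.grant_permissions,
        Principal={"DataLakePrincipalIdentifier": offline_store_role_arn},
        Resource=table_resource,
        Permissions=["SELECT", "INSERT", "DELETE", "DESCRIBE", "ALTER"],
    )

    # Phase 3: remove the default IAM-based access from the table.
    _run_phase(
        "revoke IAMAllowedPrincipals",
        lf_client.revoke_permissions,
        Principal={"DataLakePrincipalIdentifier": "IAM_ALLOWED_PRINCIPALS"},
        Resource=table_resource,
        Permissions=["ALL"],
    )
```

Because each phase raises immediately, a failure in phase 2 leaves the IAMAllowedPrincipals grant in place, so existing IAM-based access keeps working until the setup is retried.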

Usage

Enable at creation:

from sagemaker.mlops.feature_store import FeatureGroup, LakeFormationConfig

lf_config = LakeFormationConfig()
lf_config.enabled = True

fg = FeatureGroup.create(
    feature_group_name="my-feature-group",
    # ... other params ...
    lake_formation_config=lf_config,
)

Enable on existing Feature Group:

fg = FeatureGroup.get("my-feature-group")
fg.enable_lake_formation(show_s3_policy=True)

Testing

  • Unit tests: comprehensive coverage of all new methods and validation logic
  • Integration tests: end-to-end tests for both enablement workflows (at creation and on an existing Feature Group) and for negative scenarios

Notes

  • S3 deny policy is provided as a recommendation (not applied automatically) to avoid breaking existing workflows
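
To make the recommendation concrete, here is a hedged sketch of how such a deny policy might be generated from the offline store S3 URI. The statement shape, the action list, the `aws` partition, and both function names are assumptions; the policy this PR prints may differ.

```python
from typing import Dict, Iterable
from urllib.parse import urlparse


def s3_uri_to_arn(s3_uri: str) -> str:
    """Convert s3://bucket/prefix to an S3 ARN (assumes the 'aws' partition)."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {s3_uri!r}")
    key = parsed.path.lstrip("/")
    bucket_arn = f"arn:aws:s3:::{parsed.netloc}"
    return f"{bucket_arn}/{key}" if key else bucket_arn


def build_deny_policy(s3_uri: str, allowed_role_arns: Iterable[str]) -> Dict:
    """Hypothetical bucket policy denying direct object access to other principals."""
    prefix_arn = s3_uri_to_arn(s3_uri).rstrip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyDirectOfflineStoreAccess",
                "Effect": "Deny",
                "Principal": "*",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"{prefix_arn}/*",
                # Only principals NOT in this list are denied.
                "Condition": {
                    "StringNotLike": {"aws:PrincipalArn": list(allowed_role_arns)}
                },
            }
        ],
    }
```

The allow-list would need to include every role that still reads or writes the prefix directly, such as the Lake Formation registration role and the offline store role. Leaving one out would break it, which is presumably why the policy is printed rather than applied.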

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

adishaa and others added 10 commits January 16, 2026 07:00
- Add LakeFormationConfig class to configure Lake Formation governance on offline stores
- Implement FeatureGroup subclass with Lake Formation integration capabilities
- Add helper methods for S3 URI/ARN conversion and Lake Formation role management
- Add S3 deny policy generation for Lake Formation access control
- Implement Lake Formation resource registration and S3 bucket policy setup
- Add integration tests for Lake Formation feature store workflows
- Add unit tests for Lake Formation configuration and policy generation
- Update feature_store module exports to include FeatureGroup and LakeFormationConfig
- Update API documentation to include Feature Store section in sagemaker_mlops.rst
- Enable fine-grained access control for feature store offline stores using AWS Lake Formation

3 participants