Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: refactor and add components integration tests #3607

Merged
merged 3 commits into from
Sep 2, 2024

Conversation

nicoloboschi
Copy link
Contributor

This PR moves and add some tests.
The final goal is to have the following features in CI:

  1. Run both unit and integration tests that don't require external credentials on every PR
  2. Run both unit and integration tests that required external credentials with the scheduled job

In order to achieve this, we use the api_key_required pytest markers to mark tests that require an external api key (e.g. openai api key)

In this PR I've added 3 different baselines for new kind of tests, all under integration:

  1. components tests: test a single component, asserting the output with some given inputs. This will run as single-component flow, enabling the whole engine
  2. flows tests: test a flow built from current version of the components, in a programmatic way (thanks to the recent work of @ogabrielluiz ). This also verifies that with some inputs the outputs is always the same for a given flow.
  3. backward tests: download starter projects (and others in the future) from older versions of langflow and run them with the current engine. This will help us in catching incompatible changes

The expectation is that now on every PR modifying a component input or output must cover the change with one or more of these tests.

It's important to note that some integration tests do not require any api keys at all and they are run as part of PR validation.

Copy link
Contributor

Pull Request Validation Report

This comment is automatically generated by Conventional PR

Whitelist Report

Whitelist Active Result
Pull request is a draft and should be ignored
Pull request is made by a whitelisted user and should be ignored
Pull request is submitted by a bot and should be ignored
Pull request is submitted by administrators and should be ignored

Result

Pull request matches with one (or more) enabled whitelist criteria. Pull request validation is skipped.

Last Modified at 28 Aug 24 22:16 UTC

Copy link

This pull request is automatically being deployed by Amplify Hosting (learn more).

Access this pull request here: https://pr-3607.dmtpw4p5recq1.amplifyapp.com

@nicoloboschi nicoloboschi force-pushed the int-tests branch 2 times, most recently from 9661365 to b16bd7f Compare August 28, 2024 22:52
.github/workflows/python_test.yml Show resolved Hide resolved
Makefile Outdated
@@ -148,9 +148,20 @@ else
$(args)
endif

unit_tests_api_keys: ## run unit tests only with api key tests
Copy link
Collaborator

@jordanrfrazier jordanrfrazier Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should classify any tests that require API keys as unit tests.

Then for integration tests, we also try the following naming pattern. Opinions?

integration_tests: ## Run all integration tests
integration_tests_api_key_required_only: ## Run only integration tests that require api keys
integration_tests_no_api_key_required_only: ## .. 

for _component in component._components:
self.add_component(_component._id, _component)
self.vertex_map[component_id] = vertex
return component_id

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know the graph code well enough, but this seems like a big change. Can you explain what the bug was here and what this change fixes?

for predecessor in self.predecessor_map[neighbor]:
if predecessor not in queue and predecessor not in visited:
queue.append(predecessor)

current_layer += 1 # Next layer
logger.debug(f"before refine {str(layers)}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prob remove some of these debugs or make them more descriptive if you want to keep

@@ -0,0 +1,34 @@
import os.path

# we need to import tmpdir
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this leftover?


@pytest.mark.asyncio
@pytest.mark.api_key_required
async def test_1_0_15_basic_prompting():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very cool, I like this compromise of testing actual json and not storing a full folder of flows in our repo

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool!

)
component.build_vector_store()
print(results)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove? assert results not error?

not check_env_vars("ASTRA_DB_APPLICATION_TOKEN", "ASTRA_DB_API_ENDPOINT"),
reason="missing astra env vars",
)
@pytest.mark.parametrize("astra_fixture", [SEARCH_COLLECTION], indirect=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious if/how the tests work without these parameters

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assume you just haven't done these changes / will have someone do them as a follow up

src/backend/tests/integration/utils.py Outdated Show resolved Hide resolved
if _id in self.vertex_map:
return
def add_component(self, component: "Component", component_id: Optional[str] = None) -> str:
component_id = component_id or str(component.name + "-" + str(uuid.uuid4()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the user does not set an Id, the Component creates one automatically.

config |= {"_id": f"{self.__class__.__name__}-{nanoid.generate(size=5)}"}

If you still think this should happen here as well, try using the same logic with nanoid

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aligned to the existing logic, thanks


@pytest.mark.asyncio
@pytest.mark.api_key_required
async def test_1_0_15_basic_prompting():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool!

@nicoloboschi nicoloboschi marked this pull request as ready for review August 29, 2024 22:48
@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Aug 29, 2024
Copy link
Contributor

@ogabrielluiz ogabrielluiz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Aug 30, 2024
- uses: actions/checkout@v4
with:
ref: ${{ inputs.branch || github.ref }}
- name: Setup Node.js
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe integration tests need node

@@ -61,6 +61,7 @@ def __init__(self, **kwargs):
self._output_logs = {}
config = config or {}
if "_id" not in config:
print("generating id" + self.__class__.__name__)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

@@ -1430,6 +1434,8 @@ async def _execute_tasks(self, tasks: list[asyncio.Task], lock: asyncio.Lock) ->
# This could usually happen with input vertices like ChatInput
self.run_manager.remove_vertex_from_runnables(v.id)

logger.debug(f"Vertex {v.id}, result: {v._built_result}, object: {v._built_object}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

not check_env_vars("ASTRA_DB_APPLICATION_TOKEN", "ASTRA_DB_API_ENDPOINT"),
reason="missing env vars",
)
@pytest.mark.api_key_required
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the split of api_key_required vs. no_api_key_required enough? (I think for now, yes). In the future, it would be nice if CI automatically ran specific integration tests if it detected changes to specific components (for people with access to API Keys, i.e. us).

clazz=TextToData, inputs={"text_data": ["test1", "test2"]}, output_name="data"
),
"embedding": ComponentInputHandle(
clazz=OpenAIEmbeddingsComponent,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would err on using MockEmbeddings here, just as a best practice pattern. No need to introduce new dependencies or integrations if not necessary or explicitly testing them.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this intentional - do you have plans to restore this in a follow up?

@@ -427,7 +427,6 @@ def test_build_vertex_invalid_vertex_id(client, added_flow_with_prompt_and_histo
assert response.status_code == 500


@pytest.mark.api_key_required
def test_successful_run_no_payload(client, simple_api_test, created_api_key):
headers = {"x-api-key": created_api_key.api_key}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this test require API keys? Maybe this was the one you were referring to as unit test that required an API Key, which I understand is a bit ambiguous now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, it requires a langflow api key, not a credential api key

Copy link
Collaborator

@jordanrfrazier jordanrfrazier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please take a look at previous comments and see what you'd like to update before merging, thanks

@nicoloboschi nicoloboschi merged commit 96872f3 into langflow-ai:main Sep 2, 2024
28 of 29 checks passed
carlosrcoelho pushed a commit that referenced this pull request Sep 2, 2024
* improve inegration tests

* add fixes

* [autofix.ci] apply automated fixes

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
diogocabral pushed a commit to headlinevc/langflow that referenced this pull request Nov 26, 2024
* improve inegration tests

* add fixes

* [autofix.ci] apply automated fixes

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ignore-for-release lgtm This PR has been approved by a maintainer size:XL This PR changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants