Commit 6166f23

Merge pull request #8 from MichaelC1999/Return-Func
Return filter func and csv type
2 parents d445092 + fd76d77 commit 6166f23

2 files changed: 24 additions and 5 deletions

README.md

Lines changed: 6 additions & 0 deletions
@@ -56,10 +56,16 @@ With the default inputs, this function outputs Pandas Dataframes after streaming
   - Defaults to False
 - initial_snapshot
   - Boolean value, defaults to False
+- return_first_result_function
+  - Custom filtering function that accepts parsed data passed as an argument and returns as either True or False
+  - Gets called when return_first_result is True and a block has applicable events/txs
+  - If function resolves to True, the polling function returns the data from the block
+  - If function resolves to False, the polling function continues iteration
 - return_type
   - Specifies the type of value to return
   - Passing "df" returns the data in a pandas DataFrame
   - Passing "dict" returns in the format {"data": [], "module_name": String, "data_block": int, error: str | None}
+  - Passing "csv" returns in the format {"data": String(CSV), "module_name": String, "data_block": int, error: str | None}

 The result here is the default `SubstreamOutput` object, you can access both the `data` and `snapshots` dataframes by doing:
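A minimal sketch of a callback that fits the `return_first_result_function` description above. It assumes the function receives the parsed rows of one block as a list of dicts (the `substreams/substream.py` hunk adds a `block` key to each row). The field name `amount` and the threshold are hypothetical and only illustrate the shape of such a filter.

```python
from typing import Any, Dict, List


def has_large_transfer(parsed: List[Dict[str, Any]]) -> bool:
    """Return True when any row in the block exceeds a threshold.

    `amount` and the 1_000 threshold are hypothetical; substitute a field
    that your module's output actually contains.
    """
    # any() returns a real bool. That matters because the diff compares the
    # callback result with `is True`, so a merely truthy value such as 1
    # would be treated as "keep polling".
    return any(float(row.get("amount", 0)) > 1_000 for row in parsed)


rows = [{"amount": "250", "block": 100}, {"amount": "5000", "block": 100}]
print(has_large_transfer(rows))
```

Such a function would be passed to `poll` as `return_first_result_function=has_large_transfer` together with `return_first_result=True`; without the latter, the callback is never consulted.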

substreams/substream.py

Lines changed: 18 additions & 5 deletions
@@ -166,12 +166,13 @@ def poll(
         start_block: int,
         end_block: int,
         return_first_result: bool = False,
+        return_first_result_function: Optional[callable] = None,
         initial_snapshot: bool = False,
         return_type: str = "df"
     ):

         return_dict_interface = {"data": [], "module_name": output_module, "data_block": str(start_block), "error": None}
-        valid_return_types = ["dict", "df"]
+        valid_return_types = ["dict", "df", "csv"]
         results = []
         raw_results = defaultdict(lambda: {"data": list(), "snapshots": list()})

@@ -221,16 +222,25 @@ def poll(
                     if len(parsed) > 0:
                         parsed = [dict(item, **{'block':data["clock"]["number"]}) for item in parsed]
                         if return_first_result is True:
-                            break
+                            if callable(return_first_result_function):
+                                func_result = return_first_result_function(parsed)
+                                if func_result is True:
+                                    break
+                                else:
+                                    continue
+                            else:
+                                break
                     elif int(return_dict_interface["data_block"]) + 1 == end_block:
                         results = return_dict_interface

             if return_first_result is True and parsed:
-                return_dict_interface["data"] = parsed
                 if return_type == "dict":
-                    results = return_dict_interface
+                    return_dict_interface["data"] = parsed
                 if return_type == "df":
-                    results = pd.DataFrame(parsed)
+                    return_dict_interface["data"] = pd.DataFrame(parsed)
+                if return_type == "csv":
+                    return_dict_interface["data"] = pd.DataFrame(parsed).to_csv(index=False)
+                results = return_dict_interface
             if return_first_result is False and raw_results:
                 result = SubstreamOutput(module_name=output_module)
                 data_dict: dict = raw_results.get(output_module)
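The early-exit logic in this hunk can be sketched in isolation. The function below is a simplified stand-in, not the library's code: `first_matching_block` and its `blocks` argument (an in-memory list of per-block parsed rows) are hypothetical, whereas the real method iterates over a stream.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


def first_matching_block(
    blocks: List[List[Row]],
    predicate: Optional[Callable[[List[Row]], bool]] = None,
) -> List[Row]:
    """Return the rows of the first block accepted by `predicate`.

    Hypothetical stand-in for the first-result branch of poll().
    """
    for parsed in blocks:
        if not parsed:
            continue  # nothing applicable in this block
        if callable(predicate):
            # Identity check, as in the diff: only the literal True stops.
            if predicate(parsed) is True:
                return parsed
            continue
        return parsed  # no predicate: the first non-empty block wins
    return []


blocks = [[], [{"value": 1, "block": 2}], [{"value": 9, "block": 3}]]
print(first_matching_block(blocks))
print(first_matching_block(blocks, lambda rows: rows[0]["value"] > 5))
```

The sketch returns an empty list when no block is accepted. From the lines shown, the diff appears to behave differently in that case: the post-loop check `if return_first_result is True and parsed:` only tests that `parsed` is non-empty, so a final block that the callback rejected may still be returned.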
@@ -242,6 +252,9 @@ def poll(
                 if return_type == "dict":
                     return_dict_interface["data"] = results.to_dict()
                     results = return_dict_interface
+                if return_type == "csv":
+                    return_dict_interface["data"] = results.to_csv(index=False)
+                    results = return_dict_interface
         except Exception as err:
             error_to_pass = err
             if isinstance(err, Exception):
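With `return_type="csv"`, the `data` field holds the CSV text that `to_csv(index=False)` produces: a header row, then one line per record, with no index column. The sketch below shows one way a caller might read that payload back using only the standard library. The sample dict, its column names, and the module name `map_transfers` are invented for illustration.

```python
import csv
import io
from typing import Any, Dict, List

# Hypothetical return value shaped like the "csv" format in the README
# hunk; the column names and values are invented for illustration.
result: Dict[str, Any] = {
    "data": "amount,block\n250,100\n5000,100\n",
    "module_name": "map_transfers",
    "data_block": 100,
    "error": None,
}


def rows_from_csv_result(res: Dict[str, Any]) -> List[Dict[str, str]]:
    """Parse the CSV string under "data" into a list of row dicts."""
    if res.get("error") is not None:
        raise RuntimeError(f"poll reported an error: {res['error']}")
    # DictReader takes the first line as the header; all values are strings.
    return list(csv.DictReader(io.StringIO(res["data"])))


rows = rows_from_csv_result(result)
print(rows[1]["amount"])
```

Every value comes back as a string, so numeric columns need an explicit conversion on the caller's side.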
