feat: Kickoff Transformation implementationtransformation code base #5181
Signed-off-by: HaoXuAI <[email protected]>
Reference comment: #5130 (comment)
One unstable unit test seems to be related to HF, @franciscojavierarceo. A rerun produced a new error:
Got it, I can make a patch for that. 👍
No rush, take your time. This PR is ready for review now :)
@@ -678,9 +687,6 @@ def _construct_random_input(
) -> dict[str, Union[list[Any], Any]]:
    rand_dict_value: dict[ValueType, Union[list[Any], Any]] = {
        ValueType.BYTES: [str.encode("hello world")],
        ValueType.PDF_BYTES: [
This is what's making the unit tests fail. It is used in infer_features to validate the function schema and types.
Ah, I think I accidentally deleted it when resolving a merge conflict. Let me add it back.
on_demand_feature_view_obj = OnDemandFeatureView(
    name=name if name is not None else user_function.__name__,
    sources=sources,
    schema=schema,
    feature_transformation=transformation,
Shouldn't we keep this to be backwards compatible?
It is backwards compatible. I put the feature_transformation extraction logic inside the OnDemandFeatureView initialization, since the decorator doesn't pass in the feature_transformation param.
So users can do either:
@on_demand_feature_view(...)
def udf(...): ...
or
odfv = OnDemandFeatureView(feature_transformation=Transformation(...))
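To make the two entry points concrete, here is a minimal sketch of the pattern described in this comment. It uses simplified stand-in classes, not Feast's real ones: the `udf` and `mode` keyword arguments, the constructor signature, and the helper names are assumptions made for illustration only.

```python
from typing import Callable, Optional


class Transformation:
    """Stand-in for a transformation object; not Feast's real class."""

    def __init__(self, udf: Callable, mode: str = "python"):
        self.udf = udf
        self.mode = mode


class OnDemandFeatureView:
    """Stand-in view that accepts either a Transformation or a bare udf."""

    def __init__(
        self,
        name: str,
        feature_transformation: Optional[Transformation] = None,
        udf: Optional[Callable] = None,
        mode: str = "python",
    ):
        if feature_transformation is None:
            if udf is None:
                raise ValueError("Provide either feature_transformation or udf")
            # Decorator path: the Transformation is built inside the constructor.
            feature_transformation = Transformation(udf=udf, mode=mode)
        self.name = name
        self.feature_transformation = feature_transformation


def on_demand_feature_view(name: Optional[str] = None, mode: str = "python"):
    """Stand-in decorator: passes only the udf, never feature_transformation."""

    def decorator(user_function: Callable) -> OnDemandFeatureView:
        return OnDemandFeatureView(
            name=name if name is not None else user_function.__name__,
            udf=user_function,
            mode=mode,
        )

    return decorator


# Path 1: decorator usage, unchanged for existing users.
@on_demand_feature_view()
def double(inputs: dict) -> dict:
    return {"x2": [v * 2 for v in inputs["x"]]}


# Path 2: explicit construction with a Transformation object.
explicit = OnDemandFeatureView(
    name="double_explicit",
    feature_transformation=Transformation(udf=double.feature_transformation.udf),
)
```

Both paths end with a populated `feature_transformation` attribute, which is the sense in which the change stays backwards compatible in this sketch.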
from feast.transformation.mode import TransformationMode


class Transformation(ABC):
nit: I'd add a docstring here. ChatGPT should be a nice friend.
That's a good point.
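For reference, a documented abstract base class along these lines might look like the sketch below. Only the names `Transformation` and `TransformationMode` come from the diff; the enum members, the constructor arguments, the `transform` method, and the `PythonTransformation` subclass are hypothetical and are not taken from the PR.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Any, Callable


class TransformationMode(Enum):
    # Hypothetical members, for illustration only.
    PYTHON = "python"
    PANDAS = "pandas"


class Transformation(ABC):
    """Base class for a user-defined feature transformation.

    A transformation wraps a user-defined function (udf) together with the
    mode that determines which engine runs it.

    Attributes:
        mode: The execution mode of the transformation.
        udf: The user-defined function applied to the input data.
    """

    def __init__(self, mode: TransformationMode, udf: Callable[[Any], Any]):
        self.mode = mode
        self.udf = udf

    @abstractmethod
    def transform(self, inputs: Any) -> Any:
        """Apply the wrapped udf to `inputs` and return the result."""


class PythonTransformation(Transformation):
    """Hypothetical concrete subclass that calls the udf directly."""

    def __init__(self, udf: Callable[[Any], Any]):
        super().__init__(TransformationMode.PYTHON, udf)

    def transform(self, inputs: Any) -> Any:
        return self.udf(inputs)
```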
    owner: Optional[str] = "",
):
    def mainify(obj):
        # Needed to allow dill to properly serialize the udf. Otherwise, clients will need to have a file with the same
Hmm, we encountered some serialization issues with dill in the past. @Rostifar do you remember what they were?
I have a dill issue as well. I created a new issue for it: #5182
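For context on the `mainify` helper in the diff above: as far as I understand, dill serializes a function that lives in an importable module by reference, so the loading side must be able to import that module, while objects whose module is `__main__` are serialized by value. The sketch below shows the rebinding trick in isolation. The helper body is an assumption (the diff snippet is truncated), and the module name is made up.

```python
def mainify(obj) -> None:
    """Rebind obj's module to "__main__" (assumed behaviour of the helper)."""
    if obj.__module__ != "__main__":
        obj.__module__ = "__main__"


def udf(x):
    return x + 1


# Simulate a udf that was defined in some other module (hypothetical name).
udf.__module__ = "feature_repo.example"

mainify(udf)
# The function still behaves the same; only its recorded module changed.
```

This only changes the metadata that the serializer looks at; it does not address the serialization problems tracked in the linked issue.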
inferred_value = feature_value[0]
if singleton and isinstance(inferred_value, list):
I just added this FYI as it's an edge case I didn't consider before, so please feel free to add it back in!
Strange, I didn't modify this. Not sure why it snuck in.
This looks great! Some small nits, plus some notes about fixing stuff I recently added, which will fix the unit tests. Otherwise it looks awesome. 👏🚀🤠
Signed-off-by: HaoXuAI <[email protected]>
Good catch, thanks!
@franciscojavierarceo finally passed all tests!
Awesome work @HaoXuAI 🚀🚀🚀
Probably we can update the docs in a follow up PR before the next release?
# [0.48.0](v0.47.0...v0.48.0) (2025-04-07)

### Bug Fixes

* Enhance integration logos display and styling in the UI ([#5221](#5221)) ([5799257](5799257))
* Fix space typo in push.md docs ([#5184](#5184)) ([81677b2](81677b2))
* Fixed integration tests for qdrant and milvus ([#5224](#5224)) ([d6b080d](d6b080d))
* Formatting trino ([760ec0e](760ec0e))
* Multiple fixes in retrieval of online documents ([#5168](#5168)) ([66ddd3e](66ddd3e))
* Operator route creation for Feast UI in OpenShift ([e3946b4](e3946b4))
* Remove entity_rows parameter from retrieve_online_documents_v2 call ([#5225](#5225)) ([2a2e304](2a2e304))
* Styling ([#5222](#5222)) ([34c393c](34c393c))
* typo in the chart ([bd3448b](bd3448b))
* Update milvus-quickstart and feature_store.yaml with correct Milvus Config ([#5200](#5200)) ([306acca](306acca))
* Update Qdrant online store paths in repo_config.py ([#5207](#5207)) ([ab35b0b](ab35b0b)), closes [#5206](#5206)
* Update the doc ([#5194](#5194)) ([726464e](726464e))
* Updated the operator-rabc example to test RBAC from a Kubernete pod ([#5147](#5147)) ([d23a1a5](d23a1a5))

### Features

* add `real`(float32) type for trino offline store ([#4749](#4749)) ([0947f96](0947f96))
* Add async DynamoDB timeout and retry configuration ([#5178](#5178)) ([2f3bcf5](2f3bcf5))
* Add CronJob capability to the Operator (feast apply & materialize-incremental) ([#5217](#5217)) ([285c0dc](285c0dc))
* Add RAG tutorial and Use Cases documentation ([#5226](#5226)) ([99f4004](99f4004))
* Added CLI for features, get historical and online features ([#5197](#5197)) ([4ab9f74](4ab9f74))
* Added export support in feast UI ([#5198](#5198)) ([b079553](b079553))
* Added global registry search support in Feast UI ([#5195](#5195)) ([f09ea49](f09ea49))
* Added UI for Features list ([#5192](#5192)) ([cc7fd47](cc7fd47))
* Adding blog on RAG with Milvus ([#5161](#5161)) ([b9e2e6c](b9e2e6c))
* Adding Docling RAG demo ([#5109](#5109)) ([569404b](569404b))
* Allow transformations on writes to output list of entities ([#5209](#5209)) ([955521a](955521a))
* Cache get_any_feature_view results ([#5175](#5175)) ([924b8a3](924b8a3))
* Clickhouse offline store ([#4725](#4725)) ([86794c2](86794c2))
* Enable keyword search for Milvus ([#5199](#5199)) ([ac44967](ac44967))
* Enable transformations on PDFs ([#5172](#5172)) ([3674971](3674971))
* Enable users to use Entity Query as CTE during historical retrieval ([#5202](#5202)) ([fe69eaf](fe69eaf))
* helm support more deployment config ([d575372](d575372))
* Improved CLI file structuring ([#5201](#5201)) ([972ed34](972ed34))
* Kickoff Transformation implementationtransformation code base ([#5181](#5181)) ([0083303](0083303))
* Make keep-alive timeout configurable for async DynamoDB connections ([#5167](#5167)) ([7f3e528](7f3e528))
* Operator mounts the odh-trusted-ca-bundle configmap when deployed on RHOAI or ODH ([d4d7b0d](d4d7b0d))
* Spark Transformation ([#5185](#5185)) ([be3d85c](be3d85c))
…east-dev#5181)

* transformation code base
  Signed-off-by: HaoXuAI <[email protected]>
* add back master change to resovle unit test error
  Signed-off-by: HaoXuAI <[email protected]>
* add back master change to resovle unit test error
  Signed-off-by: HaoXuAI <[email protected]>
* fix linthing
  Signed-off-by: HaoXuAI <[email protected]>
* fix linthing
  Signed-off-by: HaoXuAI <[email protected]>
* add back master change to resovle unit test error
  Signed-off-by: HaoXuAI <[email protected]>

---------

Signed-off-by: HaoXuAI <[email protected]>
Signed-off-by: Jacob Weinhold <[email protected]>
What this PR does / why we need it:
Created a Transformation interface. It still works with the current pandas_transformation, python_transformation, etc.
The next step is to refactor the BatchMaterializationEngine so that it works for both Materialization and Transformation.
Which issue(s) this PR fixes:
#4584
#4277 (comment)
#4696
Misc