Skip to content

Conversation

@pyalex
Copy link
Collaborator

@pyalex pyalex commented Apr 26, 2022

Signed-off-by: pyalex moskalenko.alexey@gmail.com

What this PR does / why we need it:

This is mostly refactoring PR. This refactoring is needed before integrating Python API for feature logging in offline store.
Improvements:

  1. Splitting LoggingService (single instance) with Logger (one per feature service) - allows to cache logger (per feature service) instead of creating new one on each request.
  2. Decoupling schema generation from logger - improves testing.
  3. Decoupling buffer operations - improves testing.
  4. Simplifying sink logic - there will be one prod and one test sink in foreseeable future, it's better to keep things simple - improves testing.

Which issue(s) this PR fixes:

Fixes #

@codecov-commenter
Copy link

codecov-commenter commented Apr 26, 2022

Codecov Report

Merging #2614 (dd563fb) into master (f3ff812) will decrease coverage by 0.04%.
The diff coverage is 60.00%.

@@            Coverage Diff             @@
##           master    #2614      +/-   ##
==========================================
- Coverage   81.64%   81.60%   -0.05%     
==========================================
  Files         161      161              
  Lines       13304    13307       +3     
==========================================
- Hits        10862    10859       -3     
- Misses       2442     2448       +6     
Flag Coverage Δ
integrationtests 72.56% <60.00%> (-0.34%) ⬇️
unittests 59.39% <40.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
sdk/python/feast/feature_logging.py 77.77% <60.00%> (-0.39%) ⬇️
sdk/python/tests/utils/online_read_write_test.py 93.54% <0.00%> (-6.46%) ⬇️
.../integration/online_store/test_online_retrieval.py 96.84% <0.00%> (-3.16%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f3ff812...dd563fb. Read the comment docs.

@pyalex pyalex requested a review from kevjumba April 26, 2022 23:44
Comment on lines 253 to 254
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we no longer need to clean up the generated log file?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it's written to temp directory, which will be cleaned up by test framework

@pyalex pyalex force-pushed the decouple-go-logging branch from 4ec8ccd to f15b622 Compare April 27, 2022 00:04
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a comment that is no longer needed as we have tests now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove comment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove comment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove comment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

Comment on lines 184 to 186
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these not supported types?

Copy link
Collaborator Author

@pyalex pyalex Apr 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there're Timestamp_ms and Date32 types that are not convertible to proto. We write Timestamp_s to parquet but somehow it's being read by Go parquet reader as Timestamp_ms. Reasons are unclear yet. Date32 is used system column log_date. We can skip it for now in tests.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file log sink?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should also have error handling for writing batch.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot really handle this error. But added printing

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add some comments here that specify that differentiates flushing from writing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comments to LogSink interface

Comment on lines 17 to 23
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there should be comments explaining all of these fields

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whats a PartitionInterval for?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not used right now. Deleted

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the default sample rate be 0 for perf?

Copy link
Collaborator Author

@pyalex pyalex Apr 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default behaviour is omission of LoggingConfig in feature service. In this case no logging happening. But if you added logging config then default sample_rate is 100%.

@feast-ci-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kevjumba, pyalex

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Signed-off-by: pyalex <moskalenko.alexey@gmail.com>
@pyalex pyalex force-pushed the decouple-go-logging branch from 6f52c8f to c7446f5 Compare April 27, 2022 18:34
Signed-off-by: pyalex <moskalenko.alexey@gmail.com>
@kevjumba
Copy link
Collaborator

/lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants