FeatureSets are delivered to Ingestion Job through Kafka #792

Merged
feast-ci-bot merged 2 commits into feast-dev:master from pyalex:specs-in-kafka
Jun 17, 2020


@pyalex pyalex commented Jun 11, 2020

What this PR does / why we need it:

SpecService and IngestionJob now communicate through Kafka topics, which makes job restarts on FeatureSet changes unnecessary. A job is now restarted only when the store's subscription configuration changes.

Communication Flow:

  1. SpecService.applyFeatureSet publishes the FeatureSetSpec to specs-topic and sets the FeatureSet status to Pending.
  2. IngestionJob reads from this topic (the full history of changes plus recent updates) and builds a materialized view of the specs in memory.
  3. IngestionJob sends an acknowledgment back to SpecService via specs-ack-topic.
  4. SpecService collects acknowledgments from all related jobs (see FeatureSetJobStatus); once all running jobs have acknowledged, the FeatureSet status is changed to Ready.
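
The four steps above can be illustrated with a small in-memory simulation. This is only a sketch and not the code in this PR: plain Python lists stand in for specs-topic and specs-ack-topic, and the class layout, method names (`apply_feature_set`, `on_ack`, `poll`, `deliver_acks`), the per-FeatureSet version counter, and the rule of ignoring acknowledgments for stale versions are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SpecService:
    """Hypothetical stand-in for SpecService; not Feast's actual API."""
    specs_topic: list            # stands in for specs-topic (append-only log)
    jobs_by_feature_set: dict    # feature set name -> ids of running jobs consuming it
    status: dict = field(default_factory=dict)
    version: dict = field(default_factory=dict)
    acks: dict = field(default_factory=lambda: defaultdict(set))

    def apply_feature_set(self, name, spec):
        # Step 1: publish the spec and mark the feature set as pending.
        new_version = self.version.get(name, 0) + 1
        self.version[name] = new_version
        self.acks[name] = set()  # acks for older versions no longer count
        self.specs_topic.append((name, new_version, spec))
        self.status[name] = "PENDING"

    def on_ack(self, job_id, name, acked_version):
        # Step 4: ignore stale acks (assumption), then flip to READY once
        # every related running job has acknowledged the current version.
        if acked_version != self.version.get(name):
            return
        self.acks[name].add(job_id)
        if self.acks[name] >= set(self.jobs_by_feature_set.get(name, ())):
            self.status[name] = "READY"


@dataclass
class IngestionJob:
    """Hypothetical stand-in for an ingestion job."""
    job_id: str
    offset: int = 0                            # next log position to read
    view: dict = field(default_factory=dict)   # name -> (version, spec)

    def poll(self, specs_topic, ack_topic):
        # Step 2: the first poll replays the whole history; later polls
        # only see new records. Later records overwrite earlier ones.
        for name, version, spec in specs_topic[self.offset:]:
            self.view[name] = (version, spec)
            # Step 3: acknowledge each spec the job has applied.
            ack_topic.append((self.job_id, name, version))
        self.offset = len(specs_topic)


def deliver_acks(ack_topic, service):
    """Drain the ack topic into the service (stands in for a consumer loop)."""
    while ack_topic:
        service.on_ack(*ack_topic.pop(0))


if __name__ == "__main__":
    specs_topic, ack_topic = [], []
    service = SpecService(specs_topic, {"driver_stats": ["job-a", "job-b"]})
    job_a, job_b = IngestionJob("job-a"), IngestionJob("job-b")

    service.apply_feature_set("driver_stats", {"features": ["trips"]})
    print(service.status["driver_stats"])   # PENDING

    job_a.poll(specs_topic, ack_topic)
    deliver_acks(ack_topic, service)
    print(service.status["driver_stats"])   # PENDING (job-b has not acked)

    job_b.poll(specs_topic, ack_topic)
    deliver_acks(ack_topic, service)
    print(service.status["driver_stats"])   # READY
```

Because each job rebuilds its view by replaying the log from the start, a restarted job or a newly started one ends up with the same specs as the others without SpecService having to push anything to it directly.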

Which issue(s) this PR fixes:

Fixes #761

Does this PR introduce a user-facing change?:
No


@pyalex pyalex changed the title [WIP] FeatureSets are delivered to Ingestion Job through Kafka FeatureSets are delivered to Ingestion Job through Kafka Jun 12, 2020
@pyalex pyalex added the kind/feature New feature or request label Jun 12, 2020

pyalex commented Jun 12, 2020

/test test-end-to-end-batch-dataflow

Member

I just want to flag that even though I agree with the approach we are taking with Spring and Kafka, I do consider it technical debt that we are taking on. Ideally the life cycle of jobs and the updates of feature sets to those jobs would be fully encapsulated in the job management layer, especially if we ever want to separate job management from core.

Collaborator Author

I would be more than happy to move this communication responsibility to JobService, but right now JobService is dependent on SpecService, and I need publishing to Kafka to be a synchronous part of applyFeatureSet. So yeah, currently it's tech debt.


pyalex commented Jun 16, 2020

/test test-end-to-end-batch-dataflow

…eatureSet version

switch to spring-kafka (configs)

specService send message to kafka & expect ack & update status accordingly

jobs runner to send source & specs config (source + ack)

ingestion job to read specs from kafka and send ack

return featureSets in ingestionJob

generate uniq topic name for each test run

prevent listJobs from failing when job failed on start

pyalex commented Jun 16, 2020

/test test-end-to-end-batch-dataflow


woop commented Jun 17, 2020

/lgtm

@feast-ci-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pyalex, woop


Development

Successfully merging this pull request may close these issues.

Job Coordination Improvement Proposal

3 participants