Expected Behavior
Ingestion jobs run successfully and the BigQuery table (feast.customer_project_customer_transactions_v1) gets populated with data.
Current Behavior
>>> client.ingest("customer_transactions", customer_features)
Waiting for feature set to be ready for ingestion...
0%| | 0/155 [04:48<?, ?rows/s]
Ingestion complete!
Ingestion statistics:
Success: 0/155
Removing temporary file(s)...
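For reference, the customer_features frame passed to client.ingest above follows the basic example. The column names match the feature set definition in the Core logs further down; the datetime column and the synthetic values are my assumption of what the 0.4.x SDK expects, not the exact example data:

```python
# Sketch of the frame ingested above; column names come from the feature set
# (customer_id, total_transactions, daily_transactions), values are made up.
from datetime import datetime, timezone

import pandas as pd


def make_customer_features(n_rows: int) -> pd.DataFrame:
    """Build a frame shaped like the basic example's customer_features."""
    now = datetime.now(timezone.utc)
    return pd.DataFrame({
        "datetime": [now] * n_rows,                       # event timestamp column
        "customer_id": list(range(n_rows)),               # entity, INT64
        "total_transactions": [i * 10 for i in range(n_rows)],   # INT64 feature
        "daily_transactions": [i * 1.5 for i in range(n_rows)],  # DOUBLE feature
    })


customer_features = make_customer_features(155)  # matches the 0/155 row count above
```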
The BQ table is empty:
SELECT COUNT(*) FROM `feast.customer_project_customer_transactions_v1`;
returns 0.
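I also checked the count programmatically to rule out the SQL console. A minimal sketch, assuming google-cloud-bigquery is installed and application-default credentials are set up (project/table names are the ones from this issue):

```python
# Check the row count of the Feast-managed BigQuery table directly.
PROJECT = "my-project"
TABLE = "feast.customer_project_customer_transactions_v1"


def count_query(project: str, table: str) -> str:
    """Build the COUNT(*) query for a fully-qualified table."""
    return f"SELECT COUNT(*) AS n FROM `{project}.{table}`"


def table_row_count(project: str, table: str) -> int:
    # Third-party import deferred so the helpers above stay importable
    # without the google-cloud-bigquery package.
    from google.cloud import bigquery

    client = bigquery.Client(project=project)
    rows = client.query(count_query(project, table)).result()
    return next(iter(rows)).n


if __name__ == "__main__":
    print(table_row_count(PROJECT, TABLE))  # 0 for me; 155 would mean success
```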
Steps to reproduce
Follow latest GKE setup docs with the following differences:
- Use my-project as project ID
- Use image 0.4.2
- Use NodePort to expose services to the local client
- In the basic example, use $FEAST_CORE_URL and $FEAST_BATCH_SERVING_URL
- Open up NodePorts 32090, 32091, 32092 in the GCP firewall:
gcloud compute firewall-rules create feast-core-port --allow tcp:32090
gcloud compute firewall-rules create feast-online-port --allow tcp:32091
gcloud compute firewall-rules create feast-batch-port --allow tcp:32092
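For the env-var step, this is roughly how my local client resolves the NodePort endpoints before calling the SDK. The Client kwargs are my understanding of the 0.4.x Python SDK, and FEAST_NODE_IP is a name I made up for the GKE node's external IP:

```python
# Resolve Feast Core / batch serving endpoints exposed via NodePort.
import os


def service_url(env_var: str, default_port: int, node_ip: str) -> str:
    """Prefer the exported env var; otherwise fall back to node_ip:port."""
    return os.environ.get(env_var, f"{node_ip}:{default_port}")


if __name__ == "__main__":
    node_ip = os.environ["FEAST_NODE_IP"]  # hypothetical: external IP of a GKE node

    # NodePorts match the values file below: core 32090, batch serving 32092.
    core_url = service_url("FEAST_CORE_URL", 32090, node_ip)
    batch_url = service_url("FEAST_BATCH_SERVING_URL", 32092, node_ip)

    import feast  # third-party; kwargs below are my reading of the 0.4.x SDK

    client = feast.Client(core_url=core_url, serving_url=batch_url)
    print(client.list_feature_sets())
```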
my-feast-values.yaml ends up looking like:
feast-core:
  enabled: true
  image:
    tag: "0.4.2"
  jvmOptions:
    - -Xms1024m
    - -Xmx1024m
  resources:
    requests:
      cpu: 1000m
      memory: 1024Mi
  service:
    type: NodePort
    grpc:
      nodePort: 32090
  gcpServiceAccount:
    useExistingSecret: true

feast-serving-online:
  enabled: true
  redis:
    enabled: true
  image:
    tag: "0.4.2"
  jvmOptions:
    - -Xms1024m
    - -Xmx1024m
  resources:
    requests:
      cpu: 500m
      memory: 1024Mi
  service:
    type: NodePort
    grpc:
      nodePort: 32091
  store.yaml:
    name: redis
    type: REDIS
    redis_config:
      port: 6379
    subscriptions:
      - name: "*"
        project: "*"
        version: "*"

feast-serving-batch:
  enabled: true
  redis:
    enabled: false
  image:
    tag: "0.4.2"
  jvmOptions:
    - -Xms1024m
    - -Xmx1024m
  resources:
    requests:
      cpu: 500m
      memory: 1024Mi
  service:
    type: NodePort
    grpc:
      nodePort: 32092
  gcpServiceAccount:
    useExistingSecret: true
  application.yaml:
    feast:
      jobs:
        staging-location: gs://my-project_feast_bucket/serving/batch
        store-type: REDIS
        store-options:
          host: localhost
          port: 6379
  store.yaml:
    name: bigquery
    type: BIGQUERY
    bigquery_config:
      project_id: my-project
      dataset_id: feast
    subscriptions:
      - name: "*"
        project: "*"
        version: "*"
Specifications
- Version: 0.4.2, latest master Python SDK (2a33f7b)
- Platform: GKE, installing from a local Mac OS X machine
- Subsystem: Python 3.7.6, helm v2.16.1, kubectl v1.17.0
Possible Solution
Default Kafka configs might need adjusting? Or maybe it's related to the NodePort config?
kubectl logs for feast-feast-core appear to show that the Kafka ingestion jobs were successfully created and submitted to the DirectRunner:
22:57:52 [pool-5-thread-1] INFO feast.ingestion.ImportJob - Starting import job with settings:
Current Settings:
appName: DirectRunnerJobManager
blockOnRun: false
enforceEncodability: true
enforceImmutability: true
featureSetJson: [{
"name": "customer_transactions",
"version": 1,
"entities": [{
"name": "customer_id",
"valueType": "INT64"
}],
"features": [{
"name": "total_transactions",
"valueType": "INT64"
}, {
"name": "daily_transactions",
"valueType": "DOUBLE"
}],
"maxAge": "86400s",
"source": {
"type": "KAFKA",
"kafkaSourceConfig": {
"bootstrapServers": "feast-kafka:9092",
"topic": "feast"
}
},
"project": "customer_project"
}]
gcsPerformanceMetrics: false
optionsId: 0
project:
runner: class org.apache.beam.runners.direct.DirectRunner
stableUniqueNames: WARNING
storeJson: [{
"name": "bigquery",
"type": "BIGQUERY",
"subscriptions": [{
"name": "*",
"version": "*",
"project": "*"
}],
"bigqueryConfig": {
"projectId": "my-project",
"datasetId": "feast"
}
}]
22:57:54 [pool-5-thread-1] INFO feast.ingestion.utils.StoreUtil - Writing to existing BigQuery table 'my-project:feast.customer_project_customer_transactions_v1'
22:57:54 [pool-6-thread-1] INFO org.apache.beam.sdk.io.kafka.KafkaUnboundedSource - Partitions assigned to split 0 (total 1): feast-0
2020-01-09 22:57:54.984 AUDIT feast-feast-core-dc485b44d-qg75w --- [pool-6-thread-1] f.c.l.AuditLogger : {action=STATUS_CHANGE, detail=Job submitted to runner DirectRunner with ext id kafka-to-redis1578610671583., id=kafka-to-redis1578610671583, resource=JOB, timestamp=Thu Jan 09 22:57:54 UTC 2020}
22:57:55 [pool-5-thread-1] INFO org.apache.beam.sdk.io.kafka.KafkaUnboundedSource - Partitions assigned to split 0 (total 1): feast-0
2020-01-09 22:57:55.373 AUDIT feast-feast-core-dc485b44d-qg75w --- [pool-5-thread-1] f.c.l.AuditLogger : {action=STATUS_CHANGE, detail=Job submitted to runner DirectRunner with ext id kafka-to-bigquery1578610671583., id=kafka-to-bigquery1578610671583, resource=JOB, timestamp=Thu Jan 09 22:57:55 UTC 2020}
22:57:55 [pool-2-thread-1] INFO feast.core.service.JobCoordinatorService - Updating feature set status
22:57:58 [direct-runner-worker] INFO org.apache.beam.sdk.io.kafka.KafkaUnboundedSource - Reader-0: reading from feast-0 starting at offset 0
22:57:58 [direct-runner-worker] INFO org.apache.beam.sdk.io.kafka.KafkaUnboundedSource - Reader-0: reading from feast-0 starting at offset 0
And a feast topic was successfully created according to the kafka-config pod:
Waiting for Zookeeper...
Waiting for Kafka...
Applying runtime configuration using confluentinc/cp-kafka:5.0.1
Created topic "feast".
Configs for topic 'feast' are
But I'm seeing topic-related errors in the feast-kafka-0 logs:
[2020-01-09 22:57:57,799] INFO [Log partition=__consumer_offsets-2, dir=/opt/kafka/data/logs] Truncating to 0 has no effect as the largest offset in the log is -1 (kafka.log.Log)
[2020-01-09 22:57:57,804] ERROR [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Error for partition __consumer_offsets-8 at offset 0 (kafka.server.ReplicaFetcherThread)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition.
[2020-01-09 22:57:57,804] ERROR [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Error for partition __consumer_offsets-35 at offset 0 (kafka.server.ReplicaFetcherThread)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition.
[2020-01-09 22:57:57,805] ERROR [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Error for partition __consumer_offsets-41 at offset 0 (kafka.server.ReplicaFetcherThread)
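To check whether the ingested rows ever actually land on the feast topic, I've been using a throwaway consumer along these lines. This assumes kafka-python is installed and that feast-kafka is reachable on localhost:9092 (e.g. via kubectl port-forward feast-kafka-0 9092); the bootstrap address is my setup, not from the guide:

```python
# Drain the "feast" topic from the earliest offset and count what arrives.
from typing import Iterable


def count_messages(messages: Iterable) -> int:
    """Count whatever a (time-limited) consumer iterator yields."""
    return sum(1 for _ in messages)


if __name__ == "__main__":
    from kafka import KafkaConsumer  # third-party: kafka-python

    consumer = KafkaConsumer(
        "feast",                          # topic created by the kafka-config pod
        bootstrap_servers="localhost:9092",  # assumes a local port-forward
        auto_offset_reset="earliest",        # read from the beginning
        consumer_timeout_ms=5000,            # stop iterating after 5s of silence
    )
    # 155 here would mean the SDK published the rows and the sink is at fault.
    print(count_messages(consumer))
```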
Happy to share any other relevant logs. I'm admittedly not familiar with Kafka, so I could be off here. Just trying to get the GKE feast guide + basic example working end to end. Once it works, happy to put up a PR to update the guide for 0.4.X.