
feast apply fails for Spark Offline Store once the registry has been created #2407

@aurobindoc

Description


Expected Behavior

`feast apply` should complete without raising an exception.

Current Behavior

`feast apply` throws an exception: `ValueError: Could not identify the source type being added.`

Error Log:

```
/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/scipy/sparse/sputils.py:16: DeprecationWarning: `np.typeDict` is a deprecated alias for `np.sctypeDict`.
  supported_dtypes = [np.typeDict[x] for x in supported_dtypes]
/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/scipy/fftpack/__init__.py:103: DeprecationWarning: The module numpy.dual is deprecated.  Instead of using dual, use the functions directly from numpy or scipy.
  from numpy.dual import register_func
/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/feast/infra/offline_stores/contrib/spark_offline_store/spark_source.py:61: RuntimeWarning: The spark data source API is an experimental feature in alpha development. This API is unstable and it could and most probably will be changed in the future.
  RuntimeWarning,
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2022-03-15 08:09:23,416 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2022-03-15 08:09:48,173 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
2022-03-15 08:10:47,117 WARN util.package: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
Traceback (most recent call last):
  File "/grid/1/cremo/venvs/feast-spark/bin/feast", line 10, in <module>
    sys.exit(cli())
  File "/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/feast/cli.py", line 439, in apply_total_command
    apply_total(repo_config, repo, skip_source_validation)
  File "/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/feast/usage.py", line 269, in wrapper
    return func(*args, **kwargs)
  File "/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/feast/repo_operations.py", line 251, in apply_total
    store, project, registry, repo, skip_source_validation
  File "/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/feast/repo_operations.py", line 210, in apply_total_with_repo_instance
    registry_diff, infra_diff, new_infra = store._plan(repo)
  File "/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/feast/usage.py", line 280, in wrapper
    raise exc.with_traceback(traceback)
  File "/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/feast/usage.py", line 269, in wrapper
    return func(*args, **kwargs)
  File "/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/feast/feature_store.py", line 543, in _plan
    self._registry, self.project, desired_repo_contents
  File "/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/feast/diff/registry_diff.py", line 215, in diff_between
    registry, current_project, desired_repo_contents
  File "/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/feast/diff/registry_diff.py", line 172, in extract_objects_for_keep_delete_update_add
    ] = FeastObjectType.get_objects_from_registry(registry, current_project)
  File "/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/feast/registry.py", line 84, in get_objects_from_registry
    FeastObjectType.DATA_SOURCE: registry.list_data_sources(project=project),
  File "/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/feast/registry.py", line 302, in list_data_sources
    data_sources.append(DataSource.from_proto(data_source_proto))
  File "/grid/1/cremo/venvs/feast-spark/lib/python3.7/site-packages/feast/data_source.py", line 252, in from_proto
    raise ValueError("Could not identify the source type being added.")
ValueError: Could not identify the source type being added.
```

## Steps to reproduce
For the Spark Offline Store:
- Create the registry by running `feast apply`
- Run `feast apply` again to update the registry, with or without any change in `example.py`
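For reference, a minimal `feature_store.yaml` along these lines sets up the contrib Spark offline store (the exact keys may differ by version; treat this as an illustrative sketch, not a verified config):

```yaml
project: example_project
registry: data/registry.db
provider: local
offline_store:
    type: spark
    spark_conf:
        spark.master: "local[*]"
```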

### Specifications

- Version: 0.19.3
- Platform: linux
- Subsystem: 

## Possible Solution

```python
class DataSource(ABC):
    @staticmethod
    def from_proto(data_source: DataSourceProto) -> Any:
        # We should check the data source here: if it is of SparkSource
        # type, delegate to SparkSource.from_proto().
```
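A minimal sketch of that idea, assuming the registry proto carries a fully qualified class path for custom sources (the `data_source_class_type` field name is an assumption here, and `data_source_from_proto` is a hypothetical stand-in for `DataSource.from_proto`, not the actual Feast implementation):

```python
import importlib


def data_source_from_proto(data_source_proto):
    """Resolve a data source proto to its concrete DataSource subclass.

    Hypothetical sketch: if the proto records the concrete source class
    as a dotted path (e.g. "...spark_source.SparkSource"), import that
    class dynamically and delegate deserialization to it, so contrib
    sources like SparkSource round-trip through the registry.
    """
    class_path = getattr(data_source_proto, "data_source_class_type", "")
    if class_path:
        module_name, class_name = class_path.rsplit(".", 1)
        cls = getattr(importlib.import_module(module_name), class_name)
        return cls.from_proto(data_source_proto)
    # Built-in source types would be handled here; otherwise fail as today.
    raise ValueError("Could not identify the source type being added.")
```

The key design point is that the dispatch happens on data recorded in the registry at `feast apply` time, so the second `feast apply` can reconstruct a custom source without hard-coding every contrib type in `DataSource.from_proto`.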
