vespa-engine/vespa

Support feeding from Spark

Open

#9,158 opened on Apr 23, 2019

6 comments · 0 reactions · 0 assignees · Java · 4,948 stars · 561 forks
Labels: batch import, enhancement, good first issue

Description

Today, Vespa's Hadoop integration tools support feeding and querying Vespa from Hadoop and Pig. The Pig feeder is a thin wrapper around the Vespa HTTP client.

We should support feeding directly from Spark as well, so that Spark pipelines do not have to write to HDFS and then run a separate Pig job for the actual feeding. Like the Pig feeder, this could be implemented as a thin wrapper around the HTTP client.
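To make the idea concrete, here is a rough sketch of what a per-partition feeder could look like from PySpark. It is not the Vespa HTTP client referred to above: it swaps in plain HTTP POSTs to Vespa's `/document/v1` API using only the Python standard library. The function names, the `id_field` convention, and the choice to leave the ID out of the document fields are all assumptions made for illustration.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterable, Tuple


def document_url(endpoint: str, namespace: str, doctype: str, doc_id: str) -> str:
    """Build a /document/v1 URL for a single document (ID is percent-encoded)."""
    quoted = urllib.parse.quote(str(doc_id), safe="")
    return f"{endpoint.rstrip('/')}/document/v1/{namespace}/{doctype}/docid/{quoted}"


def http_post(url: str, body: bytes) -> int:
    """POST a JSON body and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises on 4xx/5xx; report the code instead of failing the task.
        return error.code


def feed_partition(
    rows: Iterable[Dict[str, object]],
    endpoint: str,
    namespace: str,
    doctype: str,
    id_field: str = "id",  # assumption: each row carries its document ID here
    post: Callable[[str, bytes], int] = http_post,
) -> Tuple[int, int]:
    """Feed every row of one partition; return (succeeded, failed) counts."""
    succeeded = failed = 0
    for row in rows:
        fields = dict(row)
        doc_id = fields.pop(id_field)  # assumption: the ID is not itself a field
        body = json.dumps({"fields": fields}).encode("utf-8")
        status = post(document_url(endpoint, namespace, doctype, doc_id), body)
        if 200 <= status < 300:
            succeeded += 1
        else:
            failed += 1
    return succeeded, failed
```

In a Spark job this could be wired up with something like `df.rdd.foreachPartition(lambda rows: feed_partition((r.asDict() for r in rows), endpoint, namespace, doctype))`, where `endpoint`, `namespace`, and `doctype` are placeholders for the target Vespa application. A real implementation would presumably also want retries, throttling, and a way to surface the failure counts, which this sketch leaves out.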
