apache/beam

DataflowRunner should canonicalize gcpTempLocation/stagingLocation/etc formats.

Open

#18199 opened on Jun 3, 2022

Labels: P3, dataflow, good first issue, runners, tests

Description

Cloud Dataflow has a history of small bugs caused by code paths that assume a trailing forward slash is present (or absent) in these location fields. Because Beam's integration tests use a single set of pipeline options, they are likely to exercise only one of the two cases.

We should add a dedicated DataflowRunner integration test that covers both the trailing-slash and no-trailing-slash forms.

Actually, we should probably canonicalize the URLs so that we only ever produce one form (either always with or always without the trailing slash).
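
A minimal sketch of what such canonicalization could look like, assuming the canonical form is "no trailing slash". The class `LocationCanonicalizer` and its `canonicalize` method are hypothetical names used for illustration only. They are not part of Beam or the DataflowRunner, and the choice of canonical form is an assumption rather than a decision made in this issue.

```java
/**
 * Hypothetical helper (not part of Beam): normalizes a location string such as
 * gcpTempLocation or stagingLocation so that it never ends with a slash.
 */
final class LocationCanonicalizer {
    private LocationCanonicalizer() {}

    /**
     * Strips trailing '/' characters from the given location.
     * Assumption: the canonical form has no trailing slash.
     * The scheme separator ("gs://") and a bare root path ("/") are left intact.
     */
    static String canonicalize(String location) {
        if (location == null || location.isEmpty()) {
            return location;
        }
        // Never strip into the "scheme://" prefix; for scheme-less paths,
        // keep at least one character so that "/" stays "/".
        int schemeIdx = location.indexOf("://");
        int floor = (schemeIdx >= 0) ? schemeIdx + 3 : 1;

        int end = location.length();
        while (end > floor && location.charAt(end - 1) == '/') {
            end--;
        }
        return location.substring(0, end);
    }
}
```

With a helper like this applied when the options are read, `gs://bucket/tmp` and `gs://bucket/tmp/` would both reach the rest of the runner as `gs://bucket/tmp`, so downstream code would only ever see one form.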

Imported from Jira BEAM-1194. Original Jira may contain additional context. Reported by: dhalperi.
