apache/iceberg

`remove_orphan_files` scopes `file_list_view` with raw string prefix matching

Open

#16493 opened on May 20, 2026

Labels: bug, good first issue

Description

This issue was reported to the private Apache Iceberg security mailing list. The submitter is being kept anonymous because the report was sent to a private list. After review, the issue is not considered a serious vulnerability that needs to be kept private, so it is being filed publicly here for tracking and resolution.

Note: this submission was generated by AI. Please review its claims and source references carefully before acting on them.

Summary

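The Spark remove_orphan_files action scopes the entries of file_list_view by raw string prefix matching against the table location. Because the match does not stop at a path separator, sibling paths that merely share the prefix fall inside orphan cleanup.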

Affected Maven coordinates

  • versioned integration artifacts: org.apache.iceberg:iceberg-spark-3.4_*, org.apache.iceberg:iceberg-spark-3.5_*, org.apache.iceberg:iceberg-spark-4.0_2.13, org.apache.iceberg:iceberg-spark-4.1_2.13

Attacker prerequisites

  • control over a path, prefix, batch, or file list that sits adjacent to a legitimately scoped prefix
  • ability to trigger the credential-selection or cleanup path with that crafted input

Impact

  • A table location like s3://bucket/table also matches sibling prefixes such as s3://bucket/table-backup/....
  • If the caller supplies a crafted file_list_view, files outside the intended directory can be treated as in-scope and deleted.
  • This weakens scope checks even when the explicit location override is not used.
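
The prefix behavior described above can be illustrated with a minimal, standalone Java sketch. It is not the code from DeleteOrphanFilesSparkAction: the class name PrefixScopeSketch, the methods rawPrefixMatch and scopedMatch, and the file name data/f.parquet are hypothetical and exist only for illustration. The second method shows one possible boundary-aware check, assuming "/" is the path separator; it is a sketch of the idea, not a proposed patch.

```java
// Hypothetical illustration only; none of these names exist in Iceberg.
final class PrefixScopeSketch {
  private PrefixScopeSketch() {}

  // Raw string prefix check of the kind the report describes.
  static boolean rawPrefixMatch(String location, String file) {
    return file.startsWith(location);
  }

  // Boundary-aware check: the location must be followed by a path separator,
  // so "table" no longer matches "table-backup".
  static boolean scopedMatch(String location, String file) {
    String prefix = location.endsWith("/") ? location : location + "/";
    return file.startsWith(prefix);
  }

  public static void main(String[] args) {
    String location = "s3://bucket/table";
    String inside = "s3://bucket/table/data/f.parquet";
    String sibling = "s3://bucket/table-backup/data/f.parquet";

    System.out.println(rawPrefixMatch(location, inside));  // true
    System.out.println(rawPrefixMatch(location, sibling)); // true (sibling treated as in scope)
    System.out.println(scopedMatch(location, inside));     // true
    System.out.println(scopedMatch(location, sibling));    // false
  }
}
```

With the raw check, the sibling file under s3://bucket/table-backup/ is treated as in scope. With the separator-terminated prefix, only files under s3://bucket/table/ match.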

Proof status

Source review only. The issue is visible directly from source.

Key source references

  • org.apache.iceberg.spark.actions.DeleteOrphanFilesSparkAction
