[adapters] Upgrade delta and iceberg dependencies. #5682

Merged
ryzhyk merged 1 commit into main from bump-delta-rs
Feb 24, 2026

Contributor

@ryzhyk ryzhyk commented Feb 24, 2026

Upgrade deltalake and related dependencies:

  • delta-rs 0.26.2 -> 0.30.2
  • iceberg 0.5.1 -> 0.8.0
  • arrow 55 -> 57
  • datafusion 47 -> 51

The new version of delta-rs is based on delta-kernel and has some features
missing in 0.26, such as support for v2 checkpoints and deletion vectors. The
latter will require additional work on the connector to support deletion
vectors in follow and CDC modes, but they should work out of the box in the
snapshot mode.

Overall we are hoping this version will be more reliable and performant.

Describe Manual Test Plan

Checklist

  • Unit tests added/updated
  • Integration tests added/updated
  • Documentation updated
  • Changelog updated

Breaking Changes?

Mark if you think the answer is yes for any of these components:

Describe Incompatible Changes

@ryzhyk ryzhyk added the connectors Issues related to the adapters/connectors crate label Feb 24, 2026
@ryzhyk ryzhyk force-pushed the bump-delta-rs branch 2 times, most recently from ac2c03a to 2ba39e5 February 24, 2026 01:37
  // https://docs.rs/parquet/50.0.0/src/parquet/record/api.rs.html#858
  // the right way is probably to use serde_arrow for deserialization and serialization
- timestamp_format: TimestampFormat::String("%Y-%m-%d %H:%M:%S %:z"), // 2023-11-04 15:33:47 +00:00
+ timestamp_format: TimestampFormat::String("%Y-%m-%d %H:%M:%S.%f %:z"), // 2023-11-04 15:33:47.123 +00:00
Contributor

is this a breaking change?

Contributor Author

@ryzhyk ryzhyk Feb 24, 2026

Seems safe. It looks like the parquet crate now produces this format when converting records to JSON. In any case, this is only used in testing.

  tokio = { workspace = true, features = ["sync", "macros", "fs", "rt"] }
  utoipa = { workspace = true }
- chrono = { workspace = true, features = ["rkyv-64", "serde"] }
+ chrono = { workspace = true, features = ["serde"] }
Contributor

🚀

@ryzhyk ryzhyk force-pushed the bump-delta-rs branch 2 times, most recently from 80f9428 to 71d9fdb February 24, 2026 07:25
@ryzhyk ryzhyk marked this pull request as ready for review February 24, 2026 07:25
@ryzhyk ryzhyk enabled auto-merge February 24, 2026 07:32
@ryzhyk ryzhyk added this pull request to the merge queue Feb 24, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to a conflict with the base branch Feb 24, 2026
Collaborator

@mythical-fred mythical-fred left a comment

Two blockers: a correctness bug in the Glue catalog path and a commit message that doesn't describe most of what's in the commit.

Commit message quality: The single commit in this PR is titled `Disable chrono::rkyv feature.` with a body that only explains the chrono/datafusion bug fix. But the commit contains the full delta-rs 0.26.2→0.30.2 / iceberg 0.5.1→0.8.0 / arrow 55→57 / datafusion 47→51 upgrade plus the adapter code rewrites, none of which appear in the message. Anyone running `git log` later will have no idea this commit contains a major dependency upgrade. Feldera has linear history on main; the commit message is the only window into why something changed. Please either split this into two commits (upgrade + chrono fix) or update the message to describe both changes.


let mut props = self.config.fileio_config.clone();
if let Some(endpoint) = self.config.glue_catalog_config.endpoint.as_ref() {
props.insert(GLUE_CATALOG_PROP_CATALOG_ID.to_string(), endpoint.clone());
Collaborator

Copy-paste bug: `endpoint` is being stored under `GLUE_CATALOG_PROP_CATALOG_ID` instead of `GLUE_CATALOG_PROP_URI`. Consequences:

  • If both `id` and `endpoint` are configured, the endpoint value silently overwrites the catalog ID in `props` (the ID is lost).
  • When only `endpoint` is set, it ends up under the wrong key and the Glue SDK ignores it: the custom endpoint URL is silently discarded.

`GLUE_CATALOG_PROP_URI` is exported by `iceberg-catalog-glue` (it's `pub const GLUE_CATALOG_PROP_URI: &str = "uri"` in `catalog.rs`). Import it alongside `GLUE_CATALOG_PROP_CATALOG_ID` and use it here:

if let Some(endpoint) = self.config.glue_catalog_config.endpoint.as_ref() {
    props.insert(GLUE_CATALOG_PROP_URI.to_string(), endpoint.clone());
}
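
To make the two consequences concrete, here is a minimal standalone sketch of the overwrite. It does not use the real adapter types: `build_props_buggy` and `build_props_fixed` are hypothetical helpers, the constants are local stand-ins for the ones exported by `iceberg-catalog-glue`, the `"catalog_id"` value is an assumption (only `"uri"` is quoted above), and the sketch assumes the catalog ID is inserted before the endpoint.

```rust
use std::collections::HashMap;

// Local stand-ins for the constants exported by iceberg-catalog-glue.
// "uri" is the value quoted in the review; "catalog_id" is an assumption.
const GLUE_CATALOG_PROP_URI: &str = "uri";
const GLUE_CATALOG_PROP_CATALOG_ID: &str = "catalog_id";

/// Hypothetical helper that mirrors the copy-paste bug:
/// the endpoint is stored under the catalog-ID key.
fn build_props_buggy(id: Option<&str>, endpoint: Option<&str>) -> HashMap<String, String> {
    let mut props = HashMap::new();
    if let Some(id) = id {
        props.insert(GLUE_CATALOG_PROP_CATALOG_ID.to_string(), id.to_string());
    }
    if let Some(endpoint) = endpoint {
        // Wrong key: this replaces the catalog ID inserted above.
        props.insert(GLUE_CATALOG_PROP_CATALOG_ID.to_string(), endpoint.to_string());
    }
    props
}

/// Hypothetical helper with the suggested fix:
/// the endpoint gets its own key, so both values survive.
fn build_props_fixed(id: Option<&str>, endpoint: Option<&str>) -> HashMap<String, String> {
    let mut props = HashMap::new();
    if let Some(id) = id {
        props.insert(GLUE_CATALOG_PROP_CATALOG_ID.to_string(), id.to_string());
    }
    if let Some(endpoint) = endpoint {
        props.insert(GLUE_CATALOG_PROP_URI.to_string(), endpoint.to_string());
    }
    props
}
```

With both values set, the buggy variant ends up with a single entry whose value is the endpoint URL, while the fixed variant keeps two separate entries.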

Upgrade deltalake and related dependencies:

    delta-rs 0.26.2 -> 0.30.2
    iceberg 0.5.1 -> 0.8.0
    arrow 55 -> 57
    datafusion 47 -> 51

The new version of delta-rs is based on delta-kernel and has some features
missing in 0.26, such as support for v2 checkpoints and deletion vectors. The
latter will require additional work on the connector to support deletion
vectors in follow and CDC modes, but they should work out of the box in the
snapshot mode.

Overall we are hoping this version will be more reliable and performant.

As part of the upgrade, I disabled the `chrono::rkyv` feature.

The latest version of datafusion resurfaced this bug
apache/datafusion#14862.

The bug is triggered by the chrono crate when it is compiled with the rkyv
feature enabled. It turns out that we no longer need this feature, except in
the DBSP tutorial. This commit removes the feature and modifies the tutorial to
use an integer that represents the number of days since epoch instead of
chrono::NaiveDate.

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
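
For readers unfamiliar with the days-since-epoch representation mentioned in the commit message, the following is a rough sketch of how such an integer maps to and from a calendar date without `chrono`. It is not the tutorial's code: the function names are hypothetical, and the arithmetic is the well-known civil-calendar conversion by Howard Hinnant for the proleptic Gregorian calendar, with day 0 assumed to be 1970-01-01.

```rust
/// Convert a proleptic Gregorian date to days since 1970-01-01.
/// Hypothetical helper; `m` is 1..=12 and `d` is 1..=31.
fn days_from_civil(y: i64, m: u32, d: u32) -> i64 {
    // Treat January and February as months 11 and 12 of the previous year,
    // so the leap day falls at the end of the shifted year.
    let y = if m <= 2 { y - 1 } else { y };
    let era = y.div_euclid(400); // 400-year cycle index
    let yoe = y.rem_euclid(400); // year of era, 0..=399
    let mp = (i64::from(m) + 9) % 12; // March = 0, ..., February = 11
    let doy = (153 * mp + 2) / 5 + i64::from(d) - 1; // day of shifted year
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

/// Convert days since 1970-01-01 back to (year, month, day).
fn civil_from_days(days: i64) -> (i64, u32, u32) {
    let z = days + 719_468; // shift the origin to 0000-03-01
    let era = z.div_euclid(146_097);
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    let mp = (5 * doy + 2) / 153; // March = 0, ..., February = 11
    let d = (doy - (153 * mp + 2) / 5 + 1) as u32;
    let m = (if mp < 10 { mp + 3 } else { mp - 9 }) as u32;
    // Undo the year shift for January and February.
    let y = yoe + era * 400 + i64::from(m <= 2);
    (y, m, d)
}
```

For example, `days_from_civil(2023, 11, 4)` evaluates to `19665`, and `civil_from_days(19665)` returns `(2023, 11, 4)`. A plain integer like this needs no `rkyv` support from `chrono`, at the cost of doing the calendar conversion by hand when a human-readable date is needed.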
@ryzhyk ryzhyk changed the title from "[adapters] Upgrade delta and iceberg depndencies." to "[adapters] Upgrade delta and iceberg dependencies." Feb 24, 2026
@ryzhyk ryzhyk requested a review from mythical-fred February 24, 2026 16:07
Contributor Author

ryzhyk commented Feb 24, 2026

@mythical-fred , re-review

@ryzhyk ryzhyk added this pull request to the merge queue Feb 24, 2026
Merged via the queue into main with commit ad1f962 Feb 24, 2026
1 check passed
@ryzhyk ryzhyk deleted the bump-delta-rs branch February 24, 2026 19:10
Collaborator

@mythical-fred mythical-fred left a comment

Both blockers addressed:

  • Glue catalog endpoint now correctly inserted under `GLUE_CATALOG_PROP_URI`; catalog ID under `GLUE_CATALOG_PROP_CATALOG_ID`. Both constants imported.
  • Commit message now properly describes the full upgrade scope.

LGTM.
