Add snippets for JDBC using Managed I/O #10239

Open
shunping wants to merge 7 commits into GoogleCloudPlatform:main from shunping:jdbc_examples

@shunping shunping commented Mar 3, 2026

Description

Add code snippets for reading and writing to PostgreSQL database using the Dataflow managed I/O transform.

Also update Beam version in these code snippets from 2.67.0 to 2.70.0.
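
As a rough illustration of what the snippets described above configure, the following sketch assembles a configuration map of the kind a managed PostgreSQL read would take. It uses only the Java standard library. The class name, the helper methods, and the sample values are hypothetical and not taken from this PR, and the key names (`jdbc_url`, `username`, `password`, `location`) are assumptions based on the Beam Managed I/O documentation; check them against the Beam version in use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; it is not part of this PR. */
public class PostgresManagedConfigSketch {

  /** Builds a PostgreSQL JDBC URL of the form jdbc:postgresql://host:port/database. */
  public static String jdbcUrl(String host, int port, String database) {
    return "jdbc:postgresql://" + host + ":" + port + "/" + database;
  }

  /**
   * Assembles the configuration map for a managed PostgreSQL read.
   * The key names are assumptions; verify them against the Managed I/O docs.
   */
  public static Map<String, Object> readConfig(
      String jdbcUrl, String username, String password, String table) {
    Map<String, Object> config = new LinkedHashMap<>();
    config.put("jdbc_url", jdbcUrl);
    config.put("username", username);
    config.put("password", password);
    config.put("location", table); // Assumed key for the table to read from.
    return config;
  }

  public static void main(String[] args) {
    // Placeholder connection values; a real pipeline would take these from its options.
    Map<String, Object> config =
        readConfig(jdbcUrl("localhost", 5432, "postgres"), "user", "secret", "example_table");
    // In a Beam pipeline, a map like this would be handed to the managed transform
    // (assumed to be Managed.read(Managed.POSTGRES).withConfig(config)).
    System.out.println(config.keySet());
  }
}
```

Keeping the configuration in a plain map makes it easy to populate from pipeline options such as the JDBC URL and password options discussed in the review comments.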

Part of the work in internal bug: 427973384

Checklist

  • I have followed Sample Format Guide
  • pom.xml parent set to latest shared-configuration
  • Appropriate changes to README are included in PR
  • These samples need a new API enabled in testing projects to pass (let us know which ones)
  • These samples need a new/updated env vars in testing projects set to pass (let us know which ones)
  • Tests pass: mvn clean verify required
  • Lint passes: mvn -P lint checkstyle:check required
  • Static Analysis: mvn -P lint clean compile pmd:cpd-check spotbugs:check advisory only
  • This sample adds a new sample directory, and I updated the CODEOWNERS file with the codeowners for this sample
  • This sample adds a new Product API, and I updated the Blunderbuss issue/PR auto-assigner with the codeowners for this sample
  • Please merge this PR for me once it is approved

@shunping shunping requested review from a team and yoshi-approver as code owners March 3, 2026 04:37

snippet-bot bot commented Mar 3, 2026

Here is the summary of changes.

You are about to add 2 region tags.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.

product-auto-label bot added the samples and api: dataflow labels Mar 3, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds code snippets demonstrating how to use Dataflow's Managed I/O transform to interact with a PostgreSQL database. It includes examples for both reading from and writing to a PostgreSQL database, along with corresponding integration tests to ensure the functionality works as expected. The changes also involve updating the project's pom.xml file to include the necessary dependencies.

Highlights

  • Managed I/O Integration: This PR introduces code snippets for reading from and writing to a PostgreSQL database using Dataflow's Managed I/O transform.
  • New Code Samples: It adds two new code samples: PostgresRead.java for reading data and PostgresWrite.java for writing data to a PostgreSQL database.
  • Integration Tests: The PR includes integration tests (PostgresReadIT.java and PostgresWriteIT.java) that use Testcontainers to run a containerized PostgreSQL instance for testing.
  • Dependency Updates: The pom.xml file is updated to add dependencies for JDBC, PostgreSQL, and Testcontainers, and to bump the Apache Beam version.

Activity

  • New code snippets for reading and writing to PostgreSQL using Dataflow managed I/O transform were added.
  • Integration tests were added to verify the functionality of the new code snippets.
  • The pom.xml file was updated to include necessary dependencies and update the Apache Beam version.

Author

shunping commented Mar 3, 2026

r: @VeronicaWasson @chamikaramj

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds new code snippets for reading from and writing to a PostgreSQL database using Dataflow's Managed I/O. The changes include updating the pom.xml with necessary dependencies and adding new Java classes for the pipelines and their integration tests. My review focuses on improving dependency management in pom.xml and enhancing code readability in the new Java classes. Overall, the changes are well-structured and the new snippets are a valuable addition.

@chamikaramj chamikaramj left a comment

Thanks. LGTM.

For my knowledge, how/when are the ITs introduced here executed?

Author

shunping commented Mar 3, 2026

> Thanks. LGTM.
>
> For my knowledge, how/when are the ITs introduced here executed?

Regarding how, I run mvn clean verify under dataflow/snippets. That runs all the ITs under that folder.

As for when, I am not sure. I only saw two Kokoro presubmit jobs running after I pushed my PR. Beyond that, I think @VeronicaWasson can explain in more detail.

@tarun-google
Contributor

LGTM. I reviewed the Iceberg-related modifications.

@tarun-google
Contributor

I do not think there is a GitHub Action that runs the ITs periodically in this repo, so issues are only noticed by people committing new changes.

@shuesc1 shuesc1 left a comment

Several nits, otherwise LGTM


void setPassword(String value);

@Description("Path to write the output file")

Add period at end to match other strings. Also is this a local path, GCS, other? Might be worth being explicit about that unless it's clear / widely known.

public class PostgresRead {

public interface Options extends PipelineOptions {
@Description("The jdbc url of PostgreSQL database to read from.")

"The jdbc url" -> "The JDBC URL"

// Build the pipeline.
var pipeline = Pipeline.create(options);
pipeline
// Read data from Postgres database via managed io.

Options, depending on which makes most sense:
Read data from a Postgres database using Managed I/O.
Read data from a Postgres database using managed IO.

.build());

public interface Options extends PipelineOptions {
@Description("The jdbc url of PostgreSQL database to write to.")

JDBC URL

// Create data to write to Postgres.
.apply(Create.of(ROWS))
.setRowSchema(INPUT_SCHEMA)
// Write rows to Postgres database via managed io.

same comment as other file

public void setUp() throws Exception {
postgres.start();

// Initialize the database with table and data

add period at end


@Test
public void testPostgresRead() throws IOException {
// Execute the Beam pipeline

period at end


@Test
public void testPostgresWrite() throws Exception {
// Execute the Beam pipeline

add period at end

public void setUp() throws Exception {
postgres.start();

// Pre-create the table so the Managed I/O can find it.

"Pre-create" sounds a little off to me. I think just "Create" is fine to convey the same meaning.

Labels

  • api: dataflow (Issues related to the Dataflow API.)
  • samples (Issues that are directly related to samples.)

4 participants