Add snippets for JDBC using Managed I/O #10239
shunping wants to merge 7 commits into GoogleCloudPlatform:main
Here is the summary of changes. You are about to add 2 region tags.
This comment is generated by snippet-bot.
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request adds code snippets demonstrating how to use Dataflow's Managed I/O transform to interact with a PostgreSQL database. It includes examples for both reading from and writing to a PostgreSQL database, along with corresponding integration tests to ensure the functionality works as expected. The changes also involve updating the project's …
Code Review
This pull request adds new code snippets for reading from and writing to a PostgreSQL database using Dataflow's Managed I/O. The changes include updating the pom.xml with necessary dependencies and adding new Java classes for the pipelines and their integration tests. My review focuses on improving dependency management in pom.xml and enhancing code readability in the new Java classes. Overall, the changes are well-structured and the new snippets are a valuable addition.
dataflow/snippets/src/main/java/com/example/dataflow/PostgresRead.java
dataflow/snippets/src/test/java/com/example/dataflow/PostgresReadIT.java
dataflow/snippets/src/test/java/com/example/dataflow/PostgresWriteIT.java
chamikaramj left a comment
Thanks. LGTM.
For my knowledge, how and when are the ITs introduced here executed?
Regarding how, I run … For when, I am not sure. I only see that two Kokoro presubmit jobs ran after I pushed my PR. Other than that, I think @VeronicaWasson can explain in more detail.
LGTM. I reviewed the Iceberg-relevant modifications.
I do not think there is a GitHub Action that runs the ITs periodically in this repo, so the people committing new changes are the ones who notice issues.
void setPassword(String value);

@Description("Path to write the output file")
Add a period at the end to match the other strings. Also, is this a local path, a GCS path, or something else? It might be worth being explicit about that unless it's clear or widely known.
public class PostgresRead {

  public interface Options extends PipelineOptions {
    @Description("The jdbc url of PostgreSQL database to read from.")

    // Build the pipeline.
    var pipeline = Pipeline.create(options);
    pipeline
        // Read data from Postgres database via managed io.
Options, depending on which makes most sense:
Read data from a Postgres database using Managed I/O.
Read data from a Postgres database using managed IO.
        .build());

  public interface Options extends PipelineOptions {
    @Description("The jdbc url of PostgreSQL database to write to.")

        // Create data to write to Postgres.
        .apply(Create.of(ROWS))
        .setRowSchema(INPUT_SCHEMA)
        // Write rows to Postgres database via managed io.
  public void setUp() throws Exception {
    postgres.start();

    // Initialize the database with table and data

  @Test
  public void testPostgresRead() throws IOException {
    // Execute the Beam pipeline

  @Test
  public void testPostgresWrite() throws Exception {
    // Execute the Beam pipeline

  public void setUp() throws Exception {
    postgres.start();

    // Pre-create the table so the Managed I/O can find it.
"Pre-create" sounds a little off to me. I think just "Create" is fine to convey the same meaning.
Description
Add code snippets for reading from and writing to a PostgreSQL database using the Dataflow Managed I/O transform.
Also update Beam version in these code snippets from 2.67.0 to 2.70.0.
Part of the work in internal bug: 427973384
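
For readers unfamiliar with Managed I/O, the snippets described above are driven by a plain configuration map that is handed to the managed transform. The following is only a rough sketch of how such a map might be assembled, not the code from this PR: the class `ManagedPostgresConfig`, the method `readConfig`, and the key names `jdbc_url`, `username`, `password`, and `location` are assumptions made for illustration and should be checked against the Beam Managed I/O documentation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of this PR) that assembles a Managed I/O style config map. */
public class ManagedPostgresConfig {

  /**
   * Builds a configuration map for reading a PostgreSQL table.
   * The key names below are assumptions; verify them against the Managed I/O docs.
   */
  static Map<String, Object> readConfig(
      String jdbcUrl, String username, String password, String table) {
    if (jdbcUrl == null || !jdbcUrl.startsWith("jdbc:postgresql:")) {
      throw new IllegalArgumentException("Expected a jdbc:postgresql: URL, got: " + jdbcUrl);
    }
    // LinkedHashMap keeps insertion order, which makes the config easy to inspect.
    Map<String, Object> config = new LinkedHashMap<>();
    config.put("jdbc_url", jdbcUrl);   // assumed key name
    config.put("username", username); // assumed key name
    config.put("password", password); // assumed key name
    config.put("location", table);    // assumed key name for the table to read
    return config;
  }

  public static void main(String[] args) {
    Map<String, Object> config =
        readConfig("jdbc:postgresql://localhost:5432/postgres", "user", "secret", "example_table");
    // Print only the keys so the password never reaches the console.
    System.out.println(config.keySet());
    // With Beam on the classpath, a map like this would typically be passed to the
    // managed transform, e.g. Managed.read(...).withConfig(config).
  }
}
```

Keeping the configuration in one small method makes it easier to reuse the same connection settings in both the read and write pipelines and in their integration tests.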
Checklist
- pom.xml parent set to latest shared-configuration
- mvn clean verify (required)
- mvn -P lint checkstyle:check (required)
- mvn -P lint clean compile pmd:cpd-check spotbugs:check (advisory only)