Bring back CLAUDE.md: revert commit 0fe3d88c by Karakatiza666 · Pull Request #5625 · feldera/feldera

Karakatiza666 · 2026-02-13T09:08:55Z

PR #5600 deleted the root composite CLAUDE.md file. Via a script that "unpacks" it, this file covers most of the repository with a knowledge base in the form of per-directory CLAUDE.md files.

The commit message (there was not even a PR description) describes the CLAUDE.md as excessive, but it is in fact necessary to use Claude Code effectively.
If the composite CLAUDE.md somehow breaks some workflow that cannot use the file-splitting script (e.g. Claude GitHub Reviews?), let's discuss and address the issue properly.

mihaibudiu · 2026-02-13T16:58:57Z

I will only approve this PR if you commit to keep this file up to date when the rest of the code changes.

gz · 2026-02-13T17:18:03Z

A single big CLAUDE.md file like this does not seem useful; in fact, it looks actively harmful: the tool even displays a warning about the file being too large. It is also unlikely that most people working with Claude will use your provided scripts to split the files.

I tried the approach of splitting it up too, and I did not notice any difference in the quality of the content it produces. FWIW it seems to look at README.md files too, so maybe it's more helpful to write good and useful README files and tell Claude to look at @docs.feldera.com, @openapi.json and README files.

I suggest we just put all higher-level CLAUDE.md files in gitignore, and everyone can have whatever CLAUDE.md files they want in higher-level directories. You can also provide it as a commit that can be applied to any branch someone works in if you like, but I really don't see value in having this committed to main. I think this is snake oil at this point, without any objective measure of whether it helps with anything or not.

kfollesdal · 2026-02-13T21:21:38Z

I have been using coding agents while working on the NATS connector and exploring the Feldera code base. When I pull a fresh Feldera main, I delete the CLAUDE.md file from the Feldera repo. It is too big: it makes prompts expensive, and the agent gets unfocused with too much info. But that is just my experience.

Anthropic recommends keeping it short and focused: https://code.claude.com/docs/en/best-practices#write-an-effective-claude-md

But you can also make CLAUDE.md files in subfolders that the agent will read if it visits the folder.

For some tasks and domain knowledge, skills can be interesting. I am playing with a NATS output connector made with an output connector skill that the agent created before implementing the connector.

mythical-fred

The concern that PR #5600 should have gone through a discussion rather than a bare commit is fair. But the solution here goes too far in the other direction.

A 10,387-line CLAUDE.md is operationally counterproductive:

Claude Code reads CLAUDE.md on every session start. A file this size consumes a large fraction of the context window with content that is:

- General architecture documentation Claude can derive from reading the source
- Highly likely to drift from reality as the codebase evolves, with no mechanism to catch it
- Not targeted: every session gets all 10K lines regardless of what the developer is actually working on

The right model is per-directory CLAUDE.md files, which is what the unpack script was trying to achieve. Those files load only when Claude Code is working in that directory. The composite-single-file approach defeats that benefit — you end up with one giant file instead of small, focused, per-directory ones.

What would actually help:

- A concise root CLAUDE.md (< 50 lines) covering: repo overview, how to build, how to run tests, key architectural decisions worth knowing upfront
- Per-directory files for complex subsystems, with subsystem-specific guidance
- Both should be maintained manually, not auto-generated from a composite
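One way to keep context files within the sizes suggested above is a small check that walks the repository and flags any CLAUDE.md that has grown past a limit. The following is a minimal sketch, not something that exists in the repository: the function name is hypothetical, the root limit of 50 comes from the "< 50 lines" suggestion, and the per-directory limit of 200 is an arbitrary assumption.

```python
import os

ROOT_LIMIT = 50     # root file should have fewer than 50 lines (from the suggestion above)
SUBDIR_LIMIT = 200  # hypothetical limit for per-directory files (an assumption)


def find_oversized_context_files(repo_root, root_limit=ROOT_LIMIT, subdir_limit=SUBDIR_LIMIT):
    """Return sorted (relative_path, line_count, limit) tuples for every
    CLAUDE.md whose line count is not below its limit."""
    oversized = []
    root_abs = os.path.abspath(repo_root)
    for dirpath, dirnames, filenames in os.walk(repo_root):
        # Do not descend into version-control metadata.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        if "CLAUDE.md" not in filenames:
            continue
        path = os.path.join(dirpath, "CLAUDE.md")
        with open(path, encoding="utf-8") as f:
            line_count = sum(1 for _ in f)
        # The root file gets the strict limit, everything else the looser one.
        is_root = os.path.abspath(dirpath) == root_abs
        limit = root_limit if is_root else subdir_limit
        if line_count >= limit:
            rel = os.path.relpath(path, repo_root).replace(os.sep, "/")
            oversized.append((rel, line_count, limit))
    return sorted(oversized)


if __name__ == "__main__":
    for rel, count, limit in find_oversized_context_files("."):
        print(f"{rel}: {count} lines (should be under {limit})")
```

Run from the repository root, this prints one line per file that exceeds its limit and nothing when every file is within bounds. It only measures size; it says nothing about whether the content is accurate or useful.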

If the concern is that someone reverted via commit instead of PR — yes, that should be raised with them. But the fix isn't to restore 10K lines of auto-generated documentation.

gz · 2026-02-22T04:35:55Z

FWIW, I just stumbled upon something interesting on LinkedIn (for once). Here is some pretty strong evidence that this PR is "doing the wrong thing": https://arxiv.org/pdf/2602.11988

from the paper:

"Across multiple coding agents and LLMs, we find that context files tend to reduce task success rates compared to providing no repository context, while also increasing inference cost by over 20%. Behaviorally, both LLM-generated and developer-provided context files encourage broader exploration (e.g., more thorough testing and file traversal), and coding agents tend to respect their instructions. Ultimately, we conclude that unnecessary requirements from context files make tasks harder, and human-written context files should describe only minimal requirements"
Look at Figure 3.

Karakatiza666 · 2026-02-25T11:34:46Z

After discussion with Gerd I implemented the following solution:
The detailed per-directory Claude context files are maintained in a separate branch, claude-context.

Whoever needs them can run scripts/claude.sh to pull these files as unstaged changes without affecting existing local staged and unstaged changes (CLAUDE.md would be overwritten).
A notice about this is added to the "bare" CLAUDE.md, so Claude Code will offer to do this when it reads the root CLAUDE.md.
scripts/claude.js remains in this branch to avoid pulling it from claude-context.
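To illustrate the idea of pulling context files from a branch as unstaged changes, here is a minimal sketch in Python. It is not the contents of scripts/claude.sh, which are not shown in this conversation. The function names are hypothetical, and the approach (list the branch tree with `git ls-tree`, then write each CLAUDE.md blob into the working tree with `git show`) is just one way to get that effect. Because it only writes to the working tree and never touches the index, existing staged changes are left alone, while any local CLAUDE.md at the same path is overwritten.

```python
import os
import posixpath
import subprocess

CONTEXT_BRANCH = "claude-context"  # branch name taken from the comment above


def select_context_files(tree_listing):
    """Pick the CLAUDE.md paths out of `git ls-tree -r --name-only` output."""
    return [
        line
        for line in tree_listing.splitlines()
        if line and posixpath.basename(line) == "CLAUDE.md"
    ]


def pull_context_files(repo_root=".", branch=CONTEXT_BRANCH):
    """Copy every CLAUDE.md from `branch` into the working tree.

    Assumes `repo_root` is the top-level directory of the repository and that
    `branch` exists locally. Only the working tree is written; the index is
    not modified, so the files show up as unstaged or untracked changes.
    """
    listing = subprocess.run(
        ["git", "ls-tree", "-r", "--name-only", branch],
        cwd=repo_root, check=True, capture_output=True, text=True,
    ).stdout
    paths = select_context_files(listing)
    for path in paths:
        # `git show <branch>:<path>` prints the file contents at that revision.
        blob = subprocess.run(
            ["git", "show", f"{branch}:{path}"],
            cwd=repo_root, check=True, capture_output=True,
        ).stdout
        target = os.path.join(repo_root, *path.split("/"))
        os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
        with open(target, "wb") as f:
            f.write(blob)
    return paths
```

Calling `pull_context_files()` from the repository root returns the list of paths it wrote. Paths with unusual characters, which `git ls-tree` quotes by default, are not handled in this sketch.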

mythical-fred

This is the right fix. CLAUDE.md on main is now 11 lines of actionable guidance — exactly what it should be. The composite context lives opt-in on claude-context, accessible via scripts/claude.sh for those who want it.

The separation is clean: main stays lean, heavy context is not forced on CI or anyone who doesn't ask for it. Gerd's paper reference is apt — large context files are empirically harmful, not just wasteful.

Karakatiza666 · 2026-02-25T18:21:34Z

> I will only approve this PR if you commit to keep this file up to date when the rest of the code changes.

Based on my experience, I do not believe it is necessary, or even especially important, to keep this file up to date with all the latest changes. Nor do I have the expertise to update context files across the entire codebase. I could set up a way to have an LLM update the context files, but this is not what these context files are designed for.

I insist that a practical and meaningful approach to maintaining context files is lazy, iterative updates: whenever you interact with Claude, make use of the context files, and notice a discrepancy within a certain domain (e.g. SQL compiler sources, or Python integration tests), you update the context file. Occasionally (once every month or two) you would go over the context file for a sub-directory you're knowledgeable in and revise it.

If you are the one who introduces a change but you don't use Claude actively and the context files are "hidden out of sight" (in a composite file, a separate branch, etc.), I don't expect these files to be updated in sync with code changes; the "lazy" approach addresses that.

Signed-off-by: Karakatiza666 <bulakh.96@gmail.com>

snkas · 2026-02-26T11:45:34Z

Doesn't this added line:

At the start of every conversation, offer the user to run `scripts/claude.sh` to pull in shared LLM context files as unstaged changes. These should not be committed outside the `claude-context` branch.

... mess with the review bot?

Karakatiza666 requested a review from gz February 13, 2026 09:08

mythical-fred suggested changes Feb 21, 2026

Karakatiza666 force-pushed the resuscitate-claude branch from 2ee4944 to 9f99aab Compare February 25, 2026 11:29

Karakatiza666 requested a review from mythical-fred February 25, 2026 11:35

mythical-fred approved these changes Feb 25, 2026

gz approved these changes Feb 25, 2026

Karakatiza666 force-pushed the resuscitate-claude branch from 9f99aab to 42b0a37 Compare February 25, 2026 21:38

Move Claude context-related files to a separate branch

2f376c0

Signed-off-by: Karakatiza666 <bulakh.96@gmail.com>

Karakatiza666 force-pushed the resuscitate-claude branch from 42b0a37 to 2f376c0 Compare February 25, 2026 21:39

Karakatiza666 enabled auto-merge February 25, 2026 21:39

Karakatiza666 added this pull request to the merge queue Feb 25, 2026

Merged via the queue into main with commit 7f5986d Feb 26, 2026
1 check passed

Karakatiza666 deleted the resuscitate-claude branch February 26, 2026 11:48

Karakatiza666 merged 1 commit into main from resuscitate-claude

6 participants