
Bring back CLAUDE.md: revert commit 0fe3d88c #5625

Merged
Karakatiza666 merged 1 commit into main from resuscitate-claude
Feb 26, 2026

Conversation

@Karakatiza666
Contributor

PR #5600 deleted the root composite CLAUDE.md file. Via a script that "unpacks" it into per-directory CLAUDE.md files, that file covers most of the repository with a knowledge base.

The commit message (there was not even a PR description) describes the CLAUDE.md as excessive, but it is in fact necessary to use Claude Code effectively.
If the composite CLAUDE.md breaks some workflow that cannot use the file-splitting script (e.g. Claude GitHub Reviews?), let's discuss and address the issue properly.

@Karakatiza666 Karakatiza666 requested a review from gz February 13, 2026 09:08
@mihaibudiu
Contributor

I will only approve this PR if you commit to keep this file up to date when the rest of the code changes.

@gz
Contributor

gz commented Feb 13, 2026

A single big CLAUDE.md file like this does not seem useful. In fact, it looks like it's actively harmful: the tool even displays a warning about the file being too large. It is also unlikely that most people working with Claude will use your provided scripts to split the files.

I tried the approach of splitting it up too, and I did not notice any difference in the quality of the content it produces. FWIW it seems to look at README.md files too, so maybe it's more helpful to write good and useful README files and tell Claude to look at @docs.feldera.com, @openapi.json and README files.

I suggest we just put all higher-level CLAUDE.md files in .gitignore, and everyone can have whatever CLAUDE.md files in higher-level directories they want. You can also keep it as a commit that can be applied to any branch someone works in if you like, but I really don't see value in having this committed to main. I think this is snake oil at this point, without any objective measure of whether it helps with anything or not.

@kfollesdal
Contributor

kfollesdal commented Feb 13, 2026

I have been using coding agents when working on the NATS connector and exploring the Feldera code base. When pulling a fresh Feldera main, I delete the CLAUDE.md file from the Feldera repo. It is too big: it makes prompts expensive, and the agent gets unfocused with too much info. But that's just my experience.

Anthropic recommends keeping it short and focused: https://code.claude.com/docs/en/best-practices#write-an-effective-claude-md

But you can also make CLAUDE.md files in subfolders that the agent will read if it visits the folder.

And for some tasks and domain knowledge, Skills can be interesting. I am playing with a NATS output connector made with an output connector skill that the agent created before implementing the connector.

Collaborator

@mythical-fred mythical-fred left a comment

The concern that PR #5600 should have gone through a discussion rather than a bare commit is fair. But the solution here goes too far in the other direction.

A 10,387-line CLAUDE.md is operationally counterproductive:

Claude Code reads CLAUDE.md on every session start. A file this size consumes a large fraction of the context window with content that is:

  • General architecture documentation Claude can derive from reading the source
  • Highly likely to drift from reality as the codebase evolves, with no mechanism to catch it
  • Not targeted — every session gets all 10K lines regardless of what the developer is actually working on

The right model is per-directory CLAUDE.md files, which is what the unpack script was trying to achieve. Those files load only when Claude Code is working in that directory. The composite-single-file approach defeats that benefit — you end up with one giant file instead of small, focused, per-directory ones.

What would actually help:

  • A concise root (< 50 lines) covering: repo overview, how to build, how to run tests, key architectural decisions worth knowing upfront
  • Per-directory files for complex subsystems (e.g., , ) with subsystem-specific guidance
  • Both should be maintained manually, not auto-generated from a composite

If the concern is that someone reverted via commit instead of PR — yes, that should be raised with them. But the fix isn't to restore 10K lines of auto-generated documentation.
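The size guidance above (a root file under 50 lines, with small per-directory files) could be checked mechanically. The following is only a minimal sketch: the function names `context_file_sizes` and `oversized_root` are hypothetical, nothing like it exists in this PR, and the 50-line threshold is simply the figure suggested in the review comment.

```python
from pathlib import Path


def context_file_sizes(repo_root: Path) -> dict[str, int]:
    """Map each CLAUDE.md path (relative to repo_root) to its line count."""
    sizes = {}
    # rglob also matches a CLAUDE.md sitting directly in repo_root.
    for path in sorted(repo_root.rglob("CLAUDE.md")):
        text = path.read_text(encoding="utf-8")
        sizes[path.relative_to(repo_root).as_posix()] = len(text.splitlines())
    return sizes


def oversized_root(repo_root: Path, max_root_lines: int = 50) -> bool:
    """True when the root CLAUDE.md has max_root_lines lines or more.

    The limit is an assumption taken from the "< 50 lines" suggestion;
    a missing root file counts as zero lines.
    """
    return context_file_sizes(repo_root).get("CLAUDE.md", 0) >= max_root_lines
```

Run against a checkout, `context_file_sizes(Path("."))` would list every context file with its length, which makes it easy to spot a file that has grown into the thousands of lines.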

@gz
Contributor

gz commented Feb 22, 2026

FWIW I just stumbled upon something interesting on LinkedIn (for once). Here is some pretty strong evidence that this PR is "doing the wrong thing": https://arxiv.org/pdf/2602.11988

from the paper:

  • "Across multiple coding agents and LLMs, we find that context files tend to reduce task success rates compared to providing no repository context, while also increasing inference cost by over 20%. Behaviorally, both LLM-generated and developer-provided context files encourage broader exploration (e.g., more thorough testing and file traversal), and coding agents tend to respect their instructions. Ultimately, we conclude that unnecessary requirements from context files make tasks harder, and human-written context files should describe only minimal requirements"
  • Look at Figure 3

@Karakatiza666
Contributor Author

After discussion with Gerd I implemented the following solution:
The detailed per-directory Claude context files are maintained in a separate branch, claude-context.

Whoever needs them can run scripts/claude.sh to pull these files as unstaged changes without affecting existing local staged and unstaged changes (CLAUDE.md would be overwritten).
A notice about this is added to the "bare" CLAUDE.md, so Claude Code will offer to do this when it reads the root CLAUDE.md.
scripts/claude.js remains in this branch to avoid pulling it from claude-context.
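The contents of scripts/claude.sh are not shown in this thread, so the following is only a sketch of one way files could be pulled from another branch as unstaged changes. It assumes `git restore` with `--source` and `--worktree`, which writes the selected paths into the working tree without touching the index. The function names and the `:(glob)**/CLAUDE.md` pathspec are illustrative assumptions, not the script's actual behavior.

```python
import subprocess


def restore_argv(branch: str = "claude-context",
                 pathspec: str = ":(glob)**/CLAUDE.md") -> list[str]:
    """Build a `git restore` call that copies files from `branch` into the
    working tree only, leaving the index (staged changes) untouched."""
    return ["git", "restore", f"--source={branch}", "--worktree", "--", pathspec]


def pull_context(repo_dir: str = ".") -> None:
    """Run the command in repo_dir.

    Local edits to the matched paths are overwritten, which matches the
    note above that CLAUDE.md would be overwritten.
    """
    subprocess.run(restore_argv(), cwd=repo_dir, check=True)
```

Because only the working tree is written, the pulled files show up as unstaged or untracked changes and other staged work is left alone.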

Collaborator

@mythical-fred mythical-fred left a comment

This is the right fix. CLAUDE.md on main is now 11 lines of actionable guidance — exactly what it should be. The composite context lives opt-in on claude-context, accessible via scripts/claude.sh for those who want it.

The separation is clean: main stays lean, heavy context is not forced on CI or anyone who doesn't ask for it. Gerd's paper reference is apt — large context files are empirically harmful, not just wasteful.

@Karakatiza666
Contributor Author

> I will only approve this PR if you commit to keep this file up to date when the rest of the code changes.

Based on my experience, I do not believe it is necessary or extremely important to keep this file up to date with all the latest changes. Nor do I have the expertise to update context files across the entire codebase. I could set up a way to have an LLM update the context files, but this is not what these context files are designed for.

I insist that a practical and meaningful approach to maintaining context files is lazy, iterative updates: whenever you interact with Claude, make use of context files, and notice a discrepancy within a certain domain (e.g. SQL compiler sources or Python integration tests), you update the context file. Occasionally (once every month or two) you would look over the context file for a sub-directory you're knowledgeable in and revise it.

If you are the one who introduces a change but you don't use Claude actively, and context files are "hidden out of sight" (in a composite file, a separate branch, etc.), I don't expect these files to be updated in sync with code changes; the "lazy" approach addresses that.

Signed-off-by: Karakatiza666 <bulakh.96@gmail.com>
@Karakatiza666 Karakatiza666 added this pull request to the merge queue Feb 25, 2026
@snkas
Contributor

snkas commented Feb 26, 2026

Doesn't this added line:

At the start of every conversation, offer the user to run `scripts/claude.sh` to pull in shared LLM context files as unstaged changes. These should not be committed outside the `claude-context` branch.

... mess with the review bot?

Merged via the queue into main with commit 7f5986d Feb 26, 2026
1 check passed
@Karakatiza666 Karakatiza666 deleted the resuscitate-claude branch February 26, 2026 11:48
6 participants