Add streaming to profiling.sampling #145464

@maurycy


Feature or enhancement

Proposal:

Right now, profiling.sampling has roughly two modes: live with the TUI, and snapshot-at-the-end (binary may be an exception, but see the note below). Nothing streams the data continuously as it arrives. Streaming would be ideal for long-running headless profiling.

This one is less defined than #145411, so there are more open questions:

  • Should it stream raw or aggregate data? What should be the window?
  • What should be the format? Unfortunately, from what I checked the current binary format is not really well-suited for streaming, as it saves the dictionaries only on finalize.
  • What should be the transport layers for streaming?
  • What should be the types of messages?
  • What should be the configuration flags?
  • How should backpressure be handled? Should it drop the oldest? All the oldest?
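To illustrate the format question: a streaming format has to send each dictionary entry before its first use, rather than saving the dictionaries on finalize. Below is a minimal sketch of that idea, using the str_def, frame_def and raw message names proposed further down. The class name, the JSON field names (`id`, `value`, `file`, `line`, `func`, `tid`, `frames`) and the `(filename, lineno, funcname)` frame shape are all assumptions made for illustration, not an existing profiling.sampling API.

```python
import json


class DefinitionInterner:
    """Emit str_def/frame_def messages the first time a value is seen."""

    def __init__(self, emit):
        self.emit = emit      # callable that receives one NDJSON line
        self.string_ids = {}  # text -> id
        self.frame_ids = {}   # (filename, lineno, funcname) -> id

    def _send(self, message):
        self.emit(json.dumps(message))

    def string_id(self, text):
        if text not in self.string_ids:
            self.string_ids[text] = len(self.string_ids)
            self._send({"type": "str_def", "id": self.string_ids[text],
                        "value": text})
        return self.string_ids[text]

    def frame_id(self, filename, lineno, funcname):
        key = (filename, lineno, funcname)
        if key not in self.frame_ids:
            # Define the strings first so a consumer can resolve the frame.
            file_id = self.string_id(filename)
            func_id = self.string_id(funcname)
            self.frame_ids[key] = len(self.frame_ids)
            self._send({"type": "frame_def", "id": self.frame_ids[key],
                        "file": file_id, "line": lineno, "func": func_id})
        return self.frame_ids[key]

    def raw_sample(self, thread_id, stack):
        # Hypothetical raw message: one sampled stack as a list of frame ids.
        frames = [self.frame_id(*frame) for frame in stack]
        self._send({"type": "raw", "tid": thread_id, "frames": frames})


lines = []
encoder = DefinitionInterner(lines.append)
stack = [("app.py", 10, "main"), ("app.py", 20, "work")]
encoder.raw_sample(1, stack)
encoder.raw_sample(1, stack)  # second sample needs no new definitions
```

With this ordering a consumer can decode every line as soon as it arrives, at the cost of re-sending definitions to consumers that attach late (which the sketch does not handle).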

My hunch is:

  • support both raw and aggregate and assume that aggregate is just a different message type.
  • start with something as simple as NDJSON
  • I have mixed feelings about the transport layer. Stdout sounds great on paper, but it's a mixed-use channel right now, and there's the question of slower consumers blocking the profiler. Maybe for debugging? A Unix socket is good but not perfectly portable.
  • as for backpressure, I would just drop the oldest by default, but only the raw, agg and heartbeat messages.
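A rough sketch of the drop-oldest idea, assuming a simple in-memory bounded buffer that sits between the sampler and the transport. The class name, the `dropped` field and the decision to report drops through a loss message are assumptions for illustration. The sketch also assumes that non-droppable messages are always kept, even if the buffer briefly exceeds its capacity.

```python
import json
from collections import deque

# Message types that may be dropped under pressure, per the hunch above.
DROPPABLE = {"raw", "agg", "heartbeat"}


class DropOldestBuffer:
    """Bounded buffer that evicts the oldest droppable message when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.messages = deque()
        self.dropped = 0  # messages dropped since the last loss report

    def _evict_oldest_droppable(self):
        for index, queued in enumerate(self.messages):
            if queued["type"] in DROPPABLE:
                del self.messages[index]
                self.dropped += 1
                return True
        return False

    def push(self, message):
        if len(self.messages) >= self.capacity:
            if not self._evict_oldest_droppable():
                # Buffer holds only control messages: drop the newcomer if
                # it is droppable, otherwise keep it and exceed capacity.
                if message["type"] in DROPPABLE:
                    self.dropped += 1
                    return
        self.messages.append(message)

    def drain(self):
        """Return pending messages as NDJSON lines, reporting loss first."""
        lines = []
        if self.dropped:
            lines.append(json.dumps({"type": "loss", "dropped": self.dropped}))
            self.dropped = 0
        while self.messages:
            lines.append(json.dumps(self.messages.popleft()))
        return lines


buffer = DropOldestBuffer(capacity=2)
buffer.push({"type": "meta", "pid": 1})
buffer.push({"type": "raw", "seq": 1})
buffer.push({"type": "raw", "seq": 2})  # full: evicts the raw message seq=1
ndjson_lines = buffer.drain()
```

Evicting by scanning for the oldest droppable entry is linear in the buffer size; a real implementation would likely keep droppable and control messages in separate queues.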

For starters:

  • --stream unix:/tmp/tachyon-stream.sock or --stream file:/tmp/tachyon-stream.sock (later: --stream tcp:127.0.0.1:1234)
  • --stream-types meta|str_def|frame_def|raw|agg:5s|loss|heartbeat|end|error (comma separated; note the agg:5s here; it could be extended to heartbeat:1s etc. in the future)
  • --stream-format ndjson
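As a sketch of how these values could be parsed, assuming the `scheme:address` and `name[:interval]` shapes shown above. None of these flags exist yet; the function names and the accepted interval suffixes (`s` and `ms`) are hypothetical.

```python
KNOWN_TYPES = {"meta", "str_def", "frame_def", "raw", "agg",
               "loss", "heartbeat", "end", "error"}


def parse_stream_target(spec):
    """Split 'unix:/path', 'file:/path' or 'tcp:host:port' into parts."""
    scheme, sep, address = spec.partition(":")
    if not sep or not address:
        raise ValueError(f"expected scheme:address, got {spec!r}")
    if scheme not in {"unix", "file", "tcp"}:
        raise ValueError(f"unsupported stream scheme: {scheme!r}")
    if scheme == "tcp":
        host, sep, port = address.rpartition(":")
        if not sep or not host:
            raise ValueError(f"expected tcp:host:port, got {spec!r}")
        return scheme, (host, int(port))
    return scheme, address


def parse_interval(text):
    """Convert '5s' or '500ms' to seconds (assumed suffixes)."""
    if text.endswith("ms"):
        return float(text[:-2]) / 1000.0
    if text.endswith("s"):
        return float(text[:-1])
    raise ValueError(f"unsupported interval: {text!r}")


def parse_stream_types(spec):
    """Map each requested type to its interval in seconds, or None."""
    result = {}
    for item in spec.split(","):
        name, sep, interval = item.strip().partition(":")
        if name not in KNOWN_TYPES:
            raise ValueError(f"unknown stream type: {name!r}")
        result[name] = parse_interval(interval) if sep else None
    return result


target = parse_stream_target("unix:/tmp/tachyon-stream.sock")
types = parse_stream_types("meta,raw,agg:5s")
```

Keeping the interval attached to the type name means heartbeat:1s would need no extra flag later.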

Then, in the next phases we could think of --stream-drop-policy etc.

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

No response

Labels: stdlib, topic-profiling, type-feature