When a bot (e.g. an AI assistant) streams a reply, the entire message text is replaced on every new chunk. That causes:
Unnecessary work – the full text so far is rebuilt and re-rendered on each update.
Heavier payloads – the same prefix is sent repeatedly instead of only the new fragment.
Worse scaling – for long streams, CPU and bandwidth grow with total length instead of chunk size.
So streaming works, but it doesn’t scale well and does more work than needed.
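The scaling problem above can be made concrete with a small sketch. Assuming a reply streamed in N chunks of C characters each, full-text replacement resends the whole prefix on every update, so total traffic grows quadratically with the number of chunks, while delta updates grow only linearly:

```python
# Sketch: cumulative bytes sent while streaming a reply in N chunks
# of size C, comparing full-text replacement against delta updates.
# The function names and parameters are illustrative, not real API.

def full_replace_bytes(num_chunks: int, chunk_size: int) -> int:
    # Each update resends the entire text so far: C, 2C, 3C, ...
    return sum(i * chunk_size for i in range(1, num_chunks + 1))

def delta_bytes(num_chunks: int, chunk_size: int) -> int:
    # Each update sends only the new fragment.
    return num_chunks * chunk_size

if __name__ == "__main__":
    n, c = 100, 50  # e.g. 100 chunks of 50 characters
    print(full_replace_bytes(n, c))  # 252500 — quadratic growth
    print(delta_bytes(n, c))         # 5000 — linear growth
```

The same asymmetry applies to CPU on the client: re-rendering the full text on each update is proportional to total length, while appending a fragment is proportional to chunk size.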
Support partial/delta updates for streamed messages:
Server/API: when sending streaming chunks, send only the new fragment (the delta) for each update, not the full text so far. Optionally support an "append" or "patch" semantic (e.g. "append this segment to the message") so the client doesn't have to reconstruct the full text from many deltas if it doesn't want to.
Client (Telegram Desktop): when applying a streaming update, append or patch the new content onto the existing message text instead of replacing the whole text. Preserve existing formatting/entities where possible, and merge or append entity ranges for the new segment.
Result: each update is small (only the new chunk), and the client does incremental updates instead of full replace + full re-render, which improves performance and scalability for long streamed replies.
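The client-side append could look roughly like the sketch below. This is a hypothetical model, not TDLib or Telegram Desktop code: `Entity`, `Message`, and `apply_delta` are invented names, and entity offsets are treated as plain character indices for simplicity (Telegram's real entity offsets are counted in UTF-16 code units). The key idea is that entity ranges arriving with a fragment are relative to that fragment, so they are shifted by the current text length before being merged:

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    # A formatting range (e.g. bold) over the message text,
    # loosely modeled on Telegram's MessageEntity.
    offset: int
    length: int
    kind: str

@dataclass
class Message:
    text: str = ""
    entities: list[Entity] = field(default_factory=list)

def apply_delta(msg: Message, fragment: str,
                fragment_entities: list[Entity]) -> None:
    """Append a streamed fragment instead of replacing the whole text.

    Entity offsets in fragment_entities are relative to the fragment,
    so they are shifted by the current text length before merging.
    """
    base = len(msg.text)
    msg.text += fragment  # O(chunk size), not O(total length)
    for e in fragment_entities:
        msg.entities.append(Entity(base + e.offset, e.length, e.kind))

msg = Message()
apply_delta(msg, "Hello, ", [])
apply_delta(msg, "world", [Entity(0, 5, "bold")])
print(msg.text)                  # Hello, world
print(msg.entities[0].offset)    # 7 — shifted past the existing text
```

A real implementation would also need to handle a "bold" (or other) entity that spans a chunk boundary, e.g. by extending the last entity when the new fragment continues the same formatting run.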
Alternatives I've considered
Keep full-text replacement – Easiest but doesn’t fix the performance and payload issues; only acceptable as a short-term default.
If a delta mechanism already exists on the server side, that lowers the load on Telegram's own infrastructure, but the cost of full-text replacement still falls on clients and the third-party developers building on the platform.