Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

We do both:

We compress tool outputs at each step, so the cache isn't broken during the run. Once we hit the 85% context-window limit, we preemptively trigger a summarization step and load that when the context-window fills up.

 help



> we preemptively trigger a summarization step and load that when the context-window fills up.

How does this differ from auto compact? Also, how do you prove that yours is better than using auto compact?


For auto-compact, we do essentially the same Anthropic does, but at 85% filled context window. Then, when the window is 100% filled, we pull this precompaction + append accumulated 15%. This allows to run compaction instantly



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: