> Can't you simply ask codex in another tab to just do a code review? You are li...

sdevonoes · 2026-06-05T07:42:51 1780645371

> You are likely to get better results if you do not use the same model for review that wrote the code

There’s no evidence of this. I guess you are anthropomorphising models (i.e., it’s good that - different human reviews your code)

embedding-shape · 2026-06-05T09:22:56 1780651376

Yeah, one model over another seems to matter less, they respond differently to the same prompts, so if anything, I'd use multiple prompts over choosing one model over another.

However, using two models to generate two reviews easily beats doing one model and one review, as some models seem to "care" more about certain things, but you'll just miss different things if you change the model rather than add more.

tylermarques · 2026-06-05T16:42:38 1780677758

There is some evidence.[1] The best reviewer is a different model with fresh context, worst is same model with same context.

1. https://arxiv.org/pdf/2603.04582

dominotw · 2026-06-05T14:20:50 1780669250

well they are different. human or not. so it makes sense to get it reviewing by "something" different that one that wrote code.

krzyk · 2026-06-05T05:56:58 1780639018

Results also depend on the prompt. You get different results if you ask to review the PR and focus on particular file than if you don't make it focus.

Or if you make it "be a security engineer" with particular focus points.

Or make it a grammar nazi, it will find way more typos than without such focus.

Of course all of those "focuses" needs to be in a separate context (agent/subagent) to make it work.

Art9681 · 2026-06-05T02:55:33 1780628133

I would suggest that you reverse those roles. gpt-5.5 as the implementer and Opus as the reviewer.

hombre_fatal · 2026-06-05T04:05:07 1780632307

They find different things, and there's no reason to use one model for review. You want to review it until there's nothing left to be unearth.

And if you put the review effort into polishing an impl plan, then it doesn't matter which model implements it either.

pluralmonad · 2026-06-05T03:26:26 1780629986

How come? I find Opus to have better taste and GPT to have more rigor.