I assume that this is dependent on app, and it's quite possible that your approach is best in some cases.
In my case I started with something somewhat like Playwright, and claude had a habit of interacting with the app more directly than a user would be able to and so not spotting problems because of it. Forcing it to interact by pressing keys rather than delving into the dom or executing random javascript helped. In particular I wanted to be able to chat with it as it tried things interactively. This is more to help with manual tests or exploratory testing rather than classic automated testing.
My current app is a desktop app, so playwright isn't as applicable.
In general, it is very easy to detect that radio signals are present.
A better comparison is with radio signals for which a method of spread-spectrum modulation has been used, chosen such as to have a bandwidth so wide that the averaged signal falls below the thermal noise level.
Such radio signals will also not be detectable without special detectors.
WiFi and Bluetooth use spread-spectrum modulation methods but they have relatively low bandwidths, so they can be easily distinguished from thermal noise. Much wider bandwidths are required to prevent detection.