The music fingerprinting on my Android phone works in airplane mode, so it would be possible with modifications. Also, it's likely that Shazam is sending a "hash" of the audio rather than an audio stream in most cases.
Search that document for 'hashing' (Ctrl-F). That step reduces the audio to a sparse collection of key points, one for each of four frequency ranges per time segment. I would assume that everything up to that step is done on the phone and only the key points are sent to the server.
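To make the "key points" idea concrete, here is a rough sketch of that kind of reduction: split the audio into time segments, take a spectrum of each, and keep only the strongest frequency bin in each of four ranges. This is not Shazam's actual code. The band edges, the frame size, and the function names (`dft_magnitudes`, `key_points`, `fingerprint`) are all made up for illustration, and the naive DFT is only there to keep the example dependency-free.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of the first half of a naive DFT (slow, but fine for a demo)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2)
    ]

# Hypothetical band edges in DFT bin indices (lo inclusive, hi exclusive).
# Four ranges, matching the description above; the real edges are unknown here.
BANDS = [(1, 4), (4, 8), (8, 16), (16, 32)]

def key_points(samples, frame_size=64, bands=BANDS):
    """Return one (frame_index, band_index, peak_bin) per band per time segment."""
    points = []
    for i in range(len(samples) // frame_size):
        frame = samples[i * frame_size:(i + 1) * frame_size]
        mags = dft_magnitudes(frame)
        for b, (lo, hi) in enumerate(bands):
            # Keep only the strongest bin in this frequency range.
            peak = max(range(lo, hi), key=lambda k: mags[k])
            points.append((i, b, peak))
    return points

def fingerprint(points):
    """Group peak bins by frame: one small tuple of integers per time segment."""
    frames = {}
    for i, _, peak in points:
        frames.setdefault(i, []).append(peak)
    return [tuple(frames[i]) for i in sorted(frames)]

if __name__ == "__main__":
    # Synthetic test tone with one sinusoid inside each band.
    tone = [
        sum(math.sin(2 * math.pi * k * t / 64) for k in (2, 5, 10, 20))
        for t in range(128)
    ]
    print(fingerprint(key_points(tone)))
```

In this sketch each 64-sample segment collapses to four small integers, which is why sending only the key points would be much cheaper than streaming the audio itself.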