My friends and I have used Linphone and PhonerLite for several years, for speech rather than music, and of course with the same program on both sides.
The current PhonerLite with Opus-18, G.722 WB or Speex WB (the quality of all three is about the same) transmits the sound "S" as something like "Sh", close to "F" ("Ph"), and the sound "Z" ("Th") as roughly "Zh". PhonerLite half a year ago did not have this problem with Opus, and the current Linphone does not have it either. We tested these "S" and "Z" ("Th") sounds with PhonerLite now and earlier, with different hardware and Internet providers on both sides, so it is not a hardware, Internet or SIP provider problem.
You wrote: "Opus codec automatically choses right data rate and quality needed for the provided signal" - but then why did the PhonerLite versions from half a year ago in fact choose Opus at 48 (50) for PhonerLite-to-PhonerLite speech, while now it is in fact 18? Of course, I remember that half a year ago PhonerLite offered only the Opus-18 option, but in practice the sound quality was that of 48 (50), like Linphone.
So with the current PhonerLite's selection of codecs, speech is distorted because they cut off the high frequencies, whereas with Opus-48 (50) quality it is like talking with someone sitting right next to you.
http://cdni.wired.co.uk/1920x1280/o_r/opuscodec.jpg For real music quality you need fullband stereo Opus at 128 kbps and higher; for real speech quality, mono fullband Opus at 40-128 kbps, and Opus-48 (50) is enough. Skype also uses about that.
Opus-18 is wideband (WB) quality, close to Speex WB or G.722 WB; it is not super-wideband.
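To make the point about trimmed high frequencies concrete, here is a minimal Python sketch. The bandwidth table follows the audio bandwidths listed for the Opus modes in RFC 6716. The function name `modes_preserving` and the 10 kHz test frequency are only my illustration: I assume 10 kHz as a rough stand-in for the upper part of an "S" sound, it is not a measured value, and the sketch says nothing about which mode PhonerLite actually selects.

```python
# Audio bandwidth in kHz of each Opus bandwidth mode (per RFC 6716).
OPUS_AUDIO_BANDWIDTH_KHZ = {
    "NB": 4,    # narrowband
    "MB": 6,    # medium-band
    "WB": 8,    # wideband
    "SWB": 12,  # super-wideband
    "FB": 20,   # fullband
}


def modes_preserving(freq_khz):
    """Return the Opus modes whose audio bandwidth reaches freq_khz."""
    return [mode for mode, bw in OPUS_AUDIO_BANDWIDTH_KHZ.items()
            if bw >= freq_khz]


# Assumption: 10 kHz is an illustrative frequency for high "S" energy.
print(modes_preserving(10))  # ['SWB', 'FB']
```

In this sketch a component at 10 kHz survives only in super-wideband and fullband, which would fit what we hear: wideband codecs lose the top of the "S" and it drifts toward "Sh".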