As discussed at w3c/a11y-tracking#252 (comment), add an informative note making implementers aware of the need to ensure that presented superscript/subscript text is exposed as such to assistive technology.
For example, something like:
For implementations that use the CSS vertical-align property when mapping tts:fontVariant into HTML, use of vertical-align alone is insufficient for assistive technology to identify that the glyphs concerned are superscript or subscript.
or:
Presentation implementations should mark up their output such that assistive technology can identify that the glyphs concerned are superscript or subscript. For example, use of the CSS vertical-align property alone could be insufficient.
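To illustrate the difference, here is a minimal sketch of one possible mapping, assuming an implementation that emits HTML strings and chooses the semantic `<sup>`/`<sub>` elements, which HTML accessibility mappings expose as superscript/subscript, rather than a styled `<span>`. The function names and the token table are hypothetical, only the `super` and `sub` tokens of `tts:fontVariant` are handled, and a real implementation might convey the semantics some other way.

```python
from html import escape

# Hypothetical table: tts:fontVariant tokens that carry super/subscript meaning,
# mapped to the HTML element that conveys the same meaning semantically.
SEMANTIC_ELEMENT = {"super": "sup", "sub": "sub"}


def render_css_only(text: str, font_variant: str = "normal") -> str:
    """Visual-only mapping: the shift is expressed purely through CSS,
    so nothing in the markup says 'this is a superscript/subscript'."""
    for token in font_variant.split():
        if token in SEMANTIC_ELEMENT:
            return f'<span style="vertical-align: {token}">{escape(text)}</span>'
    return f"<span>{escape(text)}</span>"


def render_semantic(text: str, font_variant: str = "normal") -> str:
    """Semantic mapping: use <sup>/<sub> so the markup itself identifies
    the glyphs as superscript or subscript."""
    for token in font_variant.split():
        element = SEMANTIC_ELEMENT.get(token)
        if element is not None:
            return f"<{element}>{escape(text)}</{element}>"
    return f"<span>{escape(text)}</span>"


if __name__ == "__main__":
    print(render_css_only("2", "super"))
    print(render_semantic("2", "super"))
```

Both functions produce output that looks similar when rendered, but only the second carries the superscript/subscript meaning in the markup itself.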