vqa: use 1/sample_rate as the audio stream time base