When more than one person speaks, distinguish the speakers by giving each one's captions a distinct color, a distinct position on screen, or both.
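
One way this guideline might be applied is sketched below, assuming the captions are delivered as WebVTT (the guideline itself does not name a format). The sketch tags each cue with a voice span (`<v Name>`), colors each voice through a `STYLE` block, and places each speaker with the `position` and `align` cue settings. The speaker names, colors, positions, and the helper names `Cue`, `SPEAKER_STYLES`, `format_timestamp`, and `build_webvtt` are all hypothetical and chosen only for illustration. Player support for `STYLE` blocks varies, so treat the color part as something to verify in the target player.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float   # cue start time, in seconds
    end: float     # cue end time, in seconds
    speaker: str   # must be a key of the styles mapping
    text: str      # caption text for this cue


# Hypothetical per-speaker styling: one color and one horizontal position each.
SPEAKER_STYLES = {
    "Anna": {"color": "yellow", "position": "10%", "align": "start"},
    "Ben": {"color": "cyan", "position": "90%", "align": "end"},
}


def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def escape_text(text: str) -> str:
    """Escape the characters that are special inside WebVTT cue text."""
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def build_webvtt(cues, styles) -> str:
    """Build a WebVTT document with per-speaker color and position."""
    lines = ["WEBVTT", ""]

    # STYLE block: one ::cue rule per speaker sets that voice's color.
    lines.append("STYLE")
    for speaker, style in styles.items():
        lines.append(f'::cue(v[voice="{speaker}"]) {{ color: {style["color"]}; }}')
    lines.append("")

    # Cues: cue settings after the timing place each speaker on screen,
    # and the voice span ties the text to the matching ::cue rule.
    for cue in cues:
        style = styles[cue.speaker]
        timing = f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}"
        settings = f'position:{style["position"]} align:{style["align"]}'
        lines.append(f"{timing} {settings}")
        lines.append(f"<v {cue.speaker}>{escape_text(cue.text)}</v>")
        lines.append("")

    return "\n".join(lines)


demo_cues = [
    Cue(1.0, 3.5, "Anna", "Hello, Ben."),
    Cue(3.5, 6.0, "Ben", "Hi, Anna. Ready to start?"),
]
demo_vtt = build_webvtt(demo_cues, SPEAKER_STYLES)
```

Keeping the styling in a single mapping keyed by speaker means each speaker gets the same color and position in every cue, which is what lets a viewer learn the association.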