Rep Task "Interpreted Speech" corpus

The Rep Task “Interpreted speech” corpus consists of recordings collected with several elicitation protocols:

  • semi-spontaneous conversations between two speakers with a free topic;
  • the same conversations “replicated” by other speakers, in two versions:
    • speakers read aloud the conversation transcript;
    • speakers memorize and then interpret the transcript.

The Rep Task “Interpreted speech” corpus was compiled to study prosodic phenomena that fulfil a semantic-pragmatic or expressive function (e.g. focus marking, emphasis) and to evaluate how the type of speech (spontaneous, read, interpreted) influences their realization.

Click below to listen to samples of each type of speech:

spontaneous    extract 1    extract 2

reading    extract 1    extract 2

interpretation    extract 1    extract 2

Content of the corpus:

  • two texts
  • for each text:
    • one spontaneous version
    • two read versions
    • two interpreted versions
         | Spontaneous   | Reading    | Interpretation
  Text A | Spontaneous A | Reading A1 | Interpretation A1
         |               | Reading A2 | Interpretation A2
  Text B | Spontaneous B | Reading B1 | Interpretation B1
         |               | Reading B2 | Interpretation B2

Note on Spontaneous A: this recording is taken from the CID corpus (Corpus of Interactional Data). It can be accessed on the ORTOLANG database (registration required). Go to: Table des matières > Détails and choose the file <AG_YM-15-30.wav>. The sample selected for the Rep Task “Interpreted speech” corpus runs from 4:55 to 7:35.
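For readers who download the full CID file and want to isolate the Spontaneous A sample, the 4:55 to 7:35 excerpt can be cut with a few lines of Python. This is only a minimal sketch using the standard `wave` module; the helper names `to_seconds` and `extract_segment` and the output file name are assumptions made for illustration, not part of the corpus distribution.

```python
import wave


def to_seconds(timestamp):
    """Convert an 'm:ss' timestamp such as '4:55' to a number of seconds."""
    minutes, seconds = timestamp.split(":")
    return int(minutes) * 60 + int(seconds)


def extract_segment(src_path, dst_path, start_s, end_s):
    """Copy the audio between start_s and end_s (in seconds) to a new WAV file.

    Returns the duration of the extracted segment in seconds.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        # Clamp both boundaries to the length of the source file.
        start_frame = min(int(start_s * rate), total)
        end_frame = min(int(end_s * rate), total)
        src.setpos(start_frame)
        frames = src.readframes(max(end_frame - start_frame, 0))

    with wave.open(dst_path, "wb") as dst:
        # Same channels, sample width and rate as the source;
        # the frame count in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)

    return max(end_frame - start_frame, 0) / rate


# Example call (file names are placeholders; the source file must be
# downloaded from ORTOLANG first):
# extract_segment("AG_YM-15-30.wav", "spontaneous_A.wav",
#                 to_seconds("4:55"), to_seconds("7:35"))
```

The boundaries are converted to frame indices from the file's own sampling rate, so the same call should work whatever rate the downloaded file uses.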

The corpus also contains “line runs”, in which the speakers memorize the text and deliver it neutrally, without expressiveness.

Also downloadable:

Download the entire corpus (27 MB)

Information about the recordings (number of words and syllables, speakers, recording conditions):

 

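  Recording         | Words | Syllables | Speaker                                      | Material                                           | Location
  Spontaneous A     |   533 |       691 | male, 30-40, Marseille, researcher           | head-mounted microphone                            | quiet room with minimal background noise
  Spontaneous B     |   606 |       842 | female, 30, Paris, PhD student               | H2 Zoom                                            | quiet room with minimal background noise
  Reading A1        |   526 |       693 | male, 26, Paris, amateur actor               | Rode NT1-A and Roland Quad-Capture audio interface | soundproof room
  Interpretation A1 |   521 |       685 | male, 26, Paris, amateur actor               | Rode NT1-A and Roland Quad-Capture audio interface | soundproof room
  Reading A2        |   523 |       690 | male, 36, Paris, semi-professional actor     | Rode NT1-A and Roland Quad-Capture audio interface | soundproof room
  Interpretation A2 |   524 |       687 | male, 36, Paris, semi-professional actor     | Rode NT1-A and Roland Quad-Capture audio interface | soundproof room
  Reading B1        |   577 |       840 | female, 24, Paris, acting student            | Rode NT1-A and Roland Quad-Capture audio interface | soundproof room
  Interpretation B1 |   582 |       856 | female, 24, Paris, acting student            | Rode NT1-A and Roland Quad-Capture audio interface | soundproof room
  Reading B2        |   620 |       904 | female, 26, Paris, professional actress      | Rode NT1-A and Roland Quad-Capture audio interface | soundproof room
  Interpretation B2 |   631 |       912 | female, 26, Paris, professional actress      | Rode NT1-A and Roland Quad-Capture audio interface | soundproof room
  Total             |  5643 |      7800 |                                              |                                                    |

Format: wav, 44.1 kHz, 16 bits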

References:

Godement-Berline, R. (submitted), “Using a replication task to study prosodic highlighting”, in Proceedings of Speech Prosody 2016, 30 May-3 June 2016, Boston, USA.

Godement-Berline, R. (to appear), La focalisation prosodique dans la parole interprétée en français. Doctoral thesis. Université Paris Diderot.

Funding: Labex EFL, axis 7: Experimental Methods and Epistemology

Thanks to: Loïc Liégeois, Johan Ferguth, Xingjia Rachel Shen