In this study, we investigated how individual differences in working memory (WM) capacity and the timing of audiovisual information presentation affect listening comprehension of L2 texts. In the experiment, an advanced class of Chinese learners of Japanese was divided into four groups according to their verbal and visuospatial WM capacities. Three texts were used as listening materials. Each text was presented auditorily together with visual information (a graph), which was shown seven seconds before the audio, after it, or simultaneously with it. The learners completed true/false judgment and free recall tests after listening to each text. We found that for learners with high verbal WM capacity, visuospatial WM capacity and the timing of audiovisual presentation affected comprehension and memory of the text. For example, in the true/false judgment test, which requires overall comprehension of the text, higher visuospatial WM capacity led to a higher rate of correct responses; in the free recall test, which requires memory for details, learners with high visuospatial WM capacity had a higher correct recall rate when the visual material was presented ahead of the audio text than when it was presented simultaneously. Those with low visuospatial WM capacity had a higher correct recall rate when the visual information was presented simultaneously with the audio text than when it was presented later. For learners with low verbal WM capacity, the results of the two tests did not differ significantly. These results indicate that the effectiveness of the timing of audiovisual presentation depends on the listener’s WM capacity.