

Scientific Data volume 9, Article number: 431 (2022)

The growing interdisciplinary research field of psycholinguistics is in constant need of new and up-to-date tools which will allow researchers to answer complex questions, but also expand on languages other than English, which dominates the field. One such tool is the picture dataset, which provides naming norms for everyday objects. However, existing databases tend to be small in terms of the number of items they include, and have also been normed in a limited number of languages, despite the recent boom in multilingualism research. In this paper we present the Multilingual Picture (Multipic) database, containing naming norms and familiarity scores for 500 coloured pictures, in thirty-two languages or language varieties from around the world. The data was validated with standard methods that have been used for existing picture datasets. This is the first dataset to provide naming norms, and translation equivalents, for such a variety of languages; as such, it will be of particular value to psycholinguists and other interested researchers. The dataset has been made freely available.

Research on the wide and multidisciplinary area of language (e.g., perception, production, processing, acquisition, learning, disorders, and multilingualism, among others) frequently uses pictures of objects as stimuli for different paradigms such as naming or classification tasks. Importantly, experimenters need to have access to normative data on diverse properties of the pictures (e.g., naming agreement, familiarity, or complexity) to be able to compare and generalise their results across studies. Crucially, in a world in which multilingualism is the norm (it has been estimated that more than half of the world's population speaks two or more languages [1, 2]), it is essential for researchers to be able to access such normative information of experimental items for different languages. Snodgrass and Vanderwart [3] created the first normalised picture dataset for the American English language, which has been adapted to other languages in order to conduct cross-linguistic research (e.g., British English [4]; Chinese [5]; Croatian [6]; Dutch [7]; French [8]; Argentinian Spanish [9]; Italian [10]; Japanese [11]; Spanish [12]). However, these datasets involve black and white line-drawings, which have been shown to generate weaker recognition than coloured pictures [13, 14]. Considering these findings, researchers have developed coloured image datasets, also in different languages (e.g., English [15]; French [13]; Italian [16]; Russian [17]; Modern Greek [18]; Turkish [19]; Spanish [20]).

Despite all these efforts to develop standardised and open datasets of pictures and their properties in different languages, there are still some limitations. First, these datasets typically only include around 300 images (except for English [15] and Canadian French [15]), which greatly restricts experimental designs. Second, these datasets were created independently of one another, and hence, they were normalised using different protocols.

