Discourse op Dinsdag, November 17

November 14, 2009

Date & time: November 17; 15:30-17:00

Location: Utrecht University, Janskerkhof 13, Room 0.06

Fridolin Wild

Knowledge Media Institute, Milton Keynes, UK

The Geometry of Learning

Abstract

Latent Semantic Analysis (LSA) is a mathematical technique for computationally modeling the meaning of words and larger units of text. LSA applies Singular Value Decomposition (SVD) to a term-by-document matrix that holds, for every word in the corpus, its frequency count in each document or passage. After the SVD, typically truncated to the largest singular values, the meaning of a word is represented as a vector in a multidimensional semantic space. This makes it possible to compare word meanings, for instance by computing the cosine between two word vectors.
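The pipeline described above (count matrix, SVD, cosine between word vectors) can be sketched in a few lines of Python. This is a minimal illustration and not the speaker's R package: the vocabulary, the counts, and the function names `lsa_term_vectors` and `top_eigenpair` are all invented for this example. To stay within the standard library, the leading singular vectors are computed by power iteration with deflation on the matrix A·Aᵀ, as a stand-in for a library SVD routine.

```python
import math

# Toy term-by-document count matrix (rows = terms, columns = documents).
# The vocabulary and counts are invented for illustration only.
TERMS = ["doctor", "physician", "hospital", "guitar", "music"]
COUNTS = [
    [1, 0, 0, 0],  # doctor:    only in document 1
    [0, 1, 0, 0],  # physician: only in document 2
    [1, 1, 0, 0],  # hospital:  documents 1 and 2
    [0, 0, 1, 1],  # guitar:    documents 3 and 4
    [0, 0, 1, 1],  # music:     documents 3 and 4
]


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_eigenpair(matrix, iterations=500):
    """Power iteration: dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    n = len(matrix)
    vec = [1.0] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(x * x for x in nxt))
        if length == 0.0:  # nothing left to find
            break
        vec = [x / length for x in nxt]
    image = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
    value = sum(a * b for a, b in zip(vec, image))  # Rayleigh quotient
    return value, vec


def lsa_term_vectors(counts, k):
    """Return one k-dimensional vector per term (the rows of U_k * Sigma_k)."""
    n = len(counts)
    # Gram matrix A * A^T: its eigenvectors are the left singular vectors of A,
    # and its eigenvalues are the squared singular values.
    gram = [[float(sum(a * b for a, b in zip(counts[i], counts[j])))
             for j in range(n)] for i in range(n)]
    vectors = [[] for _ in range(n)]
    for _ in range(k):
        value, vec = top_eigenpair(gram)
        sigma = math.sqrt(max(value, 0.0))
        for i in range(n):
            vectors[i].append(sigma * vec[i])
        # Deflation: subtract the component just found so that the next
        # round of power iteration converges to the next singular vector.
        gram = [[gram[i][j] - value * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return vectors


raw = dict(zip(TERMS, COUNTS))
reduced = dict(zip(TERMS, lsa_term_vectors(COUNTS, k=2)))
print("raw counts:  doctor vs physician", round(cosine(raw["doctor"], raw["physician"]), 3))
print("2-dim space: doctor vs physician", round(cosine(reduced["doctor"], reduced["physician"]), 3))
```

In this toy corpus "doctor" and "physician" never occur in the same document, so their raw count vectors are orthogonal. Both co-occur with "hospital", however, and in the two-dimensional space they end up pointing in essentially the same direction, while remaining unrelated to "guitar". That shared-context effect is what the abstract means by comparing word meanings in the semantic space.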

LSA has been used successfully in a wide variety of language-related applications, from automatic grading of student essays to predicting click trails in website navigation. In Coh-Metrix (Graesser et al. 2004), a computational tool that produces indices of the linguistic and discourse representations of a text, LSA serves as a measure of text cohesion, on the assumption that cohesion increases as a function of higher cosine scores between adjacent sentences.
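The idea of scoring cohesion by the cosine between adjacent sentences can be sketched as follows. This is not the Coh-Metrix implementation: the two-dimensional word vectors are hypothetical stand-ins for a real LSA space, and representing a sentence as the sum of its word vectors is an assumption made for this example, as is the function name `adjacent_cohesion`.

```python
import math

# Hypothetical 2-dimensional word vectors standing in for a real LSA space.
WORD_VECTORS = {
    "doctor": (0.1, 0.9),
    "nurse": (0.2, 0.8),
    "hospital": (0.1, 1.0),
    "guitar": (0.9, 0.1),
    "music": (1.0, 0.2),
}


def sentence_vector(sentence):
    """Sum the vectors of the known words in a sentence (unknown words are skipped)."""
    total = [0.0, 0.0]
    for word in sentence.lower().split():
        vec = WORD_VECTORS.get(word.strip(".,;:!?"))
        if vec is not None:
            total = [t + x for t, x in zip(total, vec)]
    return total


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def adjacent_cohesion(sentences):
    """Mean cosine between each sentence and the one that follows it."""
    vecs = [sentence_vector(s) for s in sentences]
    pairs = list(zip(vecs, vecs[1:]))
    if not pairs:
        return 0.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


on_topic = ["The doctor called the nurse.", "The nurse left the hospital."]
off_topic = ["The doctor called the nurse.", "The guitar made music."]
print("on-topic cohesion: ", round(adjacent_cohesion(on_topic), 3))
print("off-topic cohesion:", round(adjacent_cohesion(off_topic), 3))
```

With these made-up vectors, the text that stays on one topic scores higher than the one that switches topic between sentences, which is the behaviour the cohesion measure relies on.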

Besides being interesting as a technique for building programs that need to deal with semantics, LSA is also interesting as a model of human cognition. LSA can match human performance on word association tasks and vocabulary tests. In this talk, Fridolin Wild will focus on LSA as a tool for modeling language acquisition. The talk first frames the area by sketching the key concepts of learning, information, and competence acquisition and by outlining its presuppositions, and then introduces meaningful interaction analysis (MIA). MIA is a means of inspecting learning with the support of language analysis that is geometrical in nature; it fuses latent semantic analysis (LSA) with network analysis (NA/SNA). LSA, NA/SNA, and MIA are illustrated with several examples.

On Wednesday morning, November 18, Fridolin Wild will give a tutorial in which he will demonstrate the R package he developed for LSA. For more information, please contact Rogier Kraf (r.kraf@uu.nl).

The Discourse op Dinsdag discussion group is intended for researchers working on discourse from a language use perspective, and offers a platform to discuss their work (in progress). For more information check our website http://www.let.uu.nl/vici.


PhD Defense: Rasmus Steinkrauss, Groningen

November 4, 2009

Name: Rasmus Steinkrauss
Dissertation title: Frequency and Function in WH Question Acquisition: A Usage-Based Case Study of German L1 Acquisition
Defense date: November 27, 2009

Summary of the dissertation
English
My dissertation studies the early stages of first language acquisition. In particular, it studies the impact of the input, which is the language a child hears, on what a child says. Simply put, usage-based linguistics assumes that the more often a child hears something, the earlier and more often the child is going to say it. This applies not only to single words but also to word combinations (‘constructions’) such as “where are…”. In my dissertation, I show that this is not always the case.

Using a very dense database of a boy learning German, I follow the boy’s early language development, especially the development of WH-questions, closely for one year (age 2-3) and show that not only input frequency, but also the function of linguistic constructions and the previous linguistic knowledge of the child play a role.
For example, even if a construction is very frequent in the input, the child might not use it because the construction does not serve any useful function for the child, because the child already knows a construction fulfilling the same function, or because the child does not know a more basic construction that the new construction builds on. This shows that input frequency interacts with other factors in language acquisition.

Moreover, the dissertation shows that the acquisition of German differs from that of English because of the different composition of the input, and that the size of the corpus has a clear impact on the analysis of input frequency.

Dutch summary (translated into English)
This dissertation examines the influence of the ‘input’, that is, what a child hears, on the child's development of the mother tongue in the very first stages of language acquisition. Within usage-based linguistics, it is generally assumed that the input exerts a strong influence on what a child learns to say. Simply put, the assumption is that if a child hears something often, the child will also use it often and early. This holds both for single words and for whole combinations of words (‘constructions’) such as the Dutch ‘wat is dit voor…’ (‘what kind of … is this’).

My dissertation shows that this is not always the case. With the help of an exceptionally extensive corpus, the early language development of a German child, specifically the development of WH-questions, is followed closely over a period of one year (age 2-3). The study shows that not only input frequency, but also the function of constructions and the child's prior knowledge influence what he says. For example, the child does not use some constructions even though they occur frequently in the input. This is because they serve no useful function for him, because he already uses another construction with the same function, or because he does not yet know another, more basic construction. This shows that input frequency interacts with other factors.

In addition, the dissertation shows that, because the input is composed differently, language development in German proceeds differently than in English, and that the size of the corpus used affects the results.


Symposium Frequency and Function Groningen

November 4, 2009

The Role of Frequency and Function in Language Development
Symposium, November 25, 2009
Rijksuniversiteit Groningen
13.30 – 17.30
Senaatszaal, Academie Building, Broerstraat 5, Groningen

Frequency of occurrence is central to the process of learning: the frequency with which a person experiences an event or executes an action plays a significant role in building a mental representation of that event or action. This is an important reason why usage-based approaches to language in particular have considered the frequency with which linguistic structures are used to be a central factor in the development of language. By now, many frequency-related effects have been revealed, such as the interplay between frequency and grammaticality judgements. As a result, frequency is increasingly being adopted as an explanatory factor in other linguistic approaches as well.

At the same time, it seems clear that the communicative or semantic function of a linguistic structure plays a major part in language development as well, and that it is related to frequency. For example, some individual speakers might not use a linguistic structure in spite of a high overall frequency of occurrence in the speech community because the structure does not meet these speakers’ communicative purposes. This in turn can feed back into these speakers’ representation of the structure and lower the overall frequency of that structure. Another example is the number of linguistic forms used in a language to carve up a given functional domain (e.g., colors or spatial relations). This ratio between linguistic form and semantic function affects the amount of variation in a language in that domain. Again, this influences the frequency of occurrence, and possibly also the opacity of form-function mappings and hence the ease with which linguistic forms can be learnt. The symposium seeks to look at the two factors of frequency and function and illuminate their role.

Six invited speakers will present their work: Heike Behrens, Jack Hoeksema, Angeliek van Hout, Elena Lieven, Rasmus Steinkrauss and Rosie van Veen. Dan Slobin will moderate the final discussion.
The symposium is supported by the School of Behavioral and Cognitive Neurosciences Groningen (BCN).