Language Technologies for Lifelong Learning (LTfLL) 2008-212578

Project Deliverable Report

Deliverable nr: D5.2
Work Package 5, Task: Learning Support and Feedback
Date of delivery: Contractual: 01-11-2009; Actual: 15-12-2009
Version: 2.0 Final
Type of deliverable: Report
Security (distribution level): Public
Authors (Partner): Stefan Trausan-Matu, Philippe Dessus, Traian Rebedea, Sonia Mandin, Emmanuelle Villiot-Leclercq, Mihai Dascalu, Alexandru Gartner, Costin Chiru, Dan Banica, Dan Mihaila, Benoît Lemaire, Virginie Zampa, Eugène Graziani
Contact Person: Stefan Trausan-Matu
WP/Task responsible: Stefan Trausan-Matu
EC Project Officer: Ms. M. Csap

Abstract
This report presents Version 1 of the support and feedback services (delivering recommendations based on interaction analysis and on students' textual production) that can be integrated within an e-learning environment. Further steps toward the implementation of Version 2 of these services and their future integration with all the LTfLL services are also suggested.

Keywords List
Individual and Collaborative Knowledge Building, Social Network Analysis, Feedback, Written Synthesis, Latent Semantic Analysis, Bakhtin

Acknowledgements
We wish to thank Dale Gerdemann, Bernhard Hoisl and Kiril Simov for valuable comments on an earlier version of this report.

LTfLL Project Coordination at: Open University of the Netherlands, Valkenburgerweg 177, 6419 AT Heerlen, The Netherlands. Tel: +31 45 576 2624 – Fax: +31 45 576 2800

Table of Contents
Executive Summary
1. Introduction
   1.1. Previous Work on WP5
   1.2. Educational Theory
   1.3. Design/Scenarios
2. Implementation of Version 1 of the Services
   2.1. Overall Presentation
   2.2. Technical Description of T5.1 Service
   2.3. Technical Description of T5.2 Service
3. Integration and Validation of Services
   3.1. WP2 Integration
   3.2. WP3 Threads
   3.3. Collaboration with WP4 and WP6
   3.4. WP7 Validation Plans
4. Conclusions: Tools and Resources for Second Cycle of LTfLL
5. Appendices
   Appendix 1 – Description of our Services as Fostering Self-regulated Learning
   Appendix 2 – The Extended Pattern Language
   Appendix 3 – Identification of Lexical Chains
   Appendix 4 – Details on the Evaluation of Interactions
6. References

Executive Summary

According to the DoW, this report presents Version 1 of the support and feedback services (delivering recommendations based on interaction analysis and on students' textual production) that can be integrated within an e-learning environment. It therefore contains details about the design and implementation of the services. For each of the two services of WP5, four issues were considered: challenges, methods, results and conclusions. Details of how the services will be integrated with the other services of LTfLL, as well as a brief introduction to how the services will be validated, are also provided. This report also attempts to answer the following important questions:
— How is it possible to implement the ideas of polyphony and dialogism introduced by Bakhtin (1981, 1984) and recognized as a paradigm for Computer Supported Collaborative Learning (Koschmann, 1999)?
— Is it possible to provide a tool that measures the contribution and degree of collaboration of each participant in a chat?
— How is it possible to foster individual and collective knowledge-building processes with computer-based artifacts? As van Aalst (2009, p. 262) puts it: "Knowledge construction involves a range of cognitive processes, including the use of explanation-seeking questions and problems, interpreting and evaluating new information, sharing, critiquing, and testing ideas at different levels […] and efforts to rise above current levels of explanation, including summarization, synthesis, and the creation of new concepts".
— How can self-regulated learning processes in Personalized Learning Environments be promoted?
— What kind of Instructional Design prescriptions can be used to predict and support the whole of students' learning workflow using the LTfLL services?

The remainder of this report is organized as follows. The first part is introductory and sets up the theoretical background of our research. The second part describes Version 1 of the WP5 services in a fourfold argumentation (challenges, methods, results and conclusions). The third and last part sheds some light on the integration and validation of the WP5 services.

1. Introduction

In a lifelong learning context (van Merriënboer, Kirschner, Paas, Sloep, & Caniëls, 2009), learners have to perform many writing-based activities that are seldom assessed by teachers or tutors, because such assessment is time-consuming. One of the goals of the LTfLL (Language Technologies for Lifelong Learning) project is to develop a set of services that help learners manage communication through texts (that they read or write, either as essays, posts in forums or utterances in chats) in order to learn, either in a formal or informal way. This management is mainly based on the semantic and pragmatic levels and includes text retrieval, learner positioning, essay or chat assessment, etc.

The outline of this Deliverable is as follows. First, we describe the theoretical background upon which our work is based. Second, we describe, from a technical perspective, the services that we have devised. Third, we elaborate some paths toward integration and briefly discuss validation.

1.1. Previous Work on WP5

Since writing is one of the most important ways to obtain information and to communicate with each other, we first investigated the relations between the two forms of written communication, essay writing and chatting, and their possible effects on learning. As unifying theories we referred to Stahl's (2006) knowledge-building cycles, as well as to Bakhtin's dialogism.
These theories enable us to show that in both cases the learner is engaged in a two-cycle process which individually and collectively generates knowledge from beliefs and utterances, and leads to a more elaborated discourse that is in turn re-injected into the process as cultural and cognitive artifacts. Since lifelong learners have limited access to teachers and tutors, the services we are designing can help them get feedback on their written productions. In turn, this feedback, together with that of peers, can be compared to the learners' own self-assessment, and this comparison is at the core of the learning process (Ross, 2006). To sum up, our goal is to provide cognitively grounded feedback to learners on their written productions (either individual or collective) so that they can build knowledge from the texts they have read and from discussions with peers.

1.2. Educational Theory

This deliverable refines some points that aim at integrating, in the future, the two main tasks of WP5, as well as some of the other WPs. One of our main claims, which also aims at unifying our research efforts, is that dialogical meaning building should drive the student experience in a distance learning environment. As Wegerif (2006) mentions, such an environment is a dialogical space in which all the stakeholders' activities are located. This space is defined by the spatial, temporal and social characteristics in which learning and teaching take place (Code & Zaparyniuk, 2009). This idea is supported by a variety of evidence showing that argumentation (both as input, read texts, and as output, written texts) leads to a more profound understanding than monological or narrative ways of expression.
The learners immersed in a learning space can direct their attention in two ways: a cognitive attention, directed toward the learning material (e.g., textbooks), and a social attention, displayed to others and interpreted from others (Jones, 2005). Our research purpose is to devise tools that can support these two forms of attention. The services developed in WP5 play the role of mediators between the two ways of understanding: individual and collective (Stahl, 2006, p. 210 et sq.). The activities of teaching and learning are seen as a joint activity (Lorino, 2009) in which each author (producer) has multiple co-authors. When students have to write an essay, they necessarily take into account the view of their teacher, who implicitly becomes a co-author. Conversely, every teacher produces texts (e.g., task sheets) that are tightly related to the students: in writing the text, the teacher foresees the kinds of problems, reactions or questions that the students may have, so the document is "co-authored" too. This joint activity is not circumscribed to formal instructional tasks, but also relates to the modes of coordination and the roles the stakeholders have and manage within the educational situation. For this joint activity to work and be successful, a further step, a semiotic mediation by signs and tools, is necessary. Signs or words are not the mere representation of something, but a mediation between people (Lorino, 2009). As Wegerif (2006, p.
144) put it: "Any sign taken to be a mediation between self and other, a word or a facial expression, must pre-suppose the prior opening of a space of dialogue (an opening of a difference between voices) within which such a sign can be taken to mean something." Keeping all the previous points in mind, we can now claim that writing (essays, notes, utterances) in a distance learning platform implies participation in curricular conversations (Newell, 2006; Sfard, 1998) and in a community of practice in a curricular domain (Wenger, 1998). All these different reasons let us envisage the possibility of devising computer-based environments that foster students' knowledge building, either individually or collaboratively. There are very few computer-based environments that aim at helping students to work from these two viewpoints in the same environment (Moreno, 2009). However, we are aware that other forms of writing (e.g., answers to more formal questions for learning, see Deliverable 4.2; search queries, see Deliverable 6.2) are also to be taken into account in our LTfLL project. As already expressed in the DoW (section B.1.1) and in the previous Deliverable (see Deliverable 5.1, section 1.1, p. 5), the goal of our consortium is to design and implement services in Personal Learning Environments (PLEs) that support lifelong learning. PLEs are mostly designed and implemented to support two lines of core activities for learning:
– self-regulated learning, or SRL (Puustinen & Pulkkinen, 2001);
– summary writing and, by extension, multiple-source writing (Thiede & Anderson, 2003).
We have argued that Stahl's (2006) model is of great interest for integrating both individual and collective knowledge building in the same framework. Nonetheless, this model does not account well for the individual learner's activity, carried out mainly through writing, and does not mention tasks and the way to process them on the individual side.
Moreover, the exact role artifacts play in the model is rather vague. We thus additionally have to choose an SRL model compatible with Stahl's; SRL is an activity not often supported in current PLEs (Vovides, Sanchez-Alonso, Mitropoulou, & Nickmans, 2007). These authors proposed an SRL model especially dedicated to PLEs, which allows students to engage in the following activity: "The key to the success in their design was to have students experience these strategies in their own learning, explicitly compare their own performance with that of the model, and take action to revise ineffective learning approaches." (Lin, 2001, p. 26)

Figure 1 — Metacognitive approach to design e-learning (Vovides et al., 2007, p. 68).

Let us provide a short description of this model. First, students work at the object level, preparing their activity according to the ongoing task. They can also apply some cognitive strategies for these activities to be performed. Second, students perform a first rough assessment of their production (its adequacy, its relation to their knowledge, etc.). Third, a reflection at the meta level allows them to compare this assessment with the object level, a comparison often supported by artifacts (computer-based services, prompts, etc.), so as to compare their perceived level of learning with that proposed by the artifacts. Eventually, students can make some adaptations to their work, which in turn fuels a possible update of the object level and can be re-enacted in a further loop. The previous Deliverable (D 5.1) was mainly focused on chat and summary writing, i.e., the shortening of isolated pieces of text (either utterances or even summaries) related to a course domain.
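One pass through this metacognitive loop can be sketched in a few lines of code. This is an illustrative simplification, not project code: the function name and the numeric "confidence" scale are hypothetical, standing in for the learner's monitoring judgement and the artifact's assessment.

```python
def metacognitive_loop(self_assessment: float, artifact_feedback: float,
                       threshold: float = 0.2) -> str:
    """Compare the learner's own judgement of learning (meta level) with the
    assessment offered by an artifact (object level) and decide on an action."""
    gap = self_assessment - artifact_feedback
    if abs(gap) <= threshold:
        return "keep"       # perceived and assessed levels of learning agree
    if gap > 0:
        return "revise"     # overconfident: rework the production
    return "extend"         # underconfident: build further on what is sound

# A learner rates their synthesis 0.9 while the service scores it 0.5,
# so the loop recommends a revision that re-enters the object level.
print(metacognitive_loop(0.9, 0.5))  # revise
```

The returned action then feeds the next iteration of the loop, mirroring how an adaptation updates the object level in the model above.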
Although reading and writing can be examined as separate processes, many academic tasks can be considered hybrid tasks, as students are asked to demonstrate their understanding of source texts they read by composing a new, abridged text (Spivey, 1997). Such hybrid tasks can range from summary writing of single texts to discourse synthesis from multiple source texts. Discourse synthesis is a cognitively demanding activity requiring students to transform knowledge (Bereiter & Scardamalia, 1987) rather than simply reproduce information from a single source. It entails writing a new text and constructing meaning by using three key operations: selecting information according to the writer's goals, connecting this information to achieve a cohesive text, and organizing it according to the intended goals (Segev-Miller, 2004; Spivey, 1997). While it is possible to keep the same textual pattern when summarising a single text, synthesizing from multiple texts requires students to decide on a new text structure (macrostructure) to use in their written text in order to integrate their own ideas about the text contents (Brassart, 1993; Flower et al., 1990; Segev-Miller, 2004).

1.3. Design/Scenarios

The functionality of the service developed by T5.1 may be understood from the following scenario, which describes the experience of a hypothetical learner, Ulysses. In the Natural Language Processing (NLP) course, students use a forum and a chat system to collaborate with classmates. Moreover, the evaluation of these activities constitutes an important part of each student's final grade. The teacher, Dr. Smith, starts discussion topics on the forum after each course session. The tutors have to moderate and solve possible conflicts by offering explanations. In addition, the teacher gives topics to be discussed in small groups using the chat system.
As preparation for a chat, the tutors group students into small teams of 4-7 participants, each student being assigned a topic to study and then defend in debates. Ulysses reads the most interesting materials about his topic in order to understand the subject in detail. During the discussions, his peers present other points of view, debate and inter-animate, all of which improves his own and the others' understanding of the domain. After concluding a chat session, Ulysses can launch several web widgets from the Chat & Forum Analysis and Feedback System (C&F-AFS), which provides graphical and textual feedback and preliminary scores, both for him and for the group as a whole. As he knows, the tutors also use C&F-AFS to gain better insight when writing detailed feedback and assigning grades. When Ulysses uses C&F-AFS for a forum, it provides him with threads and/or posts that are related to a concept, recommends peer learners who have a good coverage of particular topics, and offers preliminary feedback about self-reflection activities. In turn, Ulysses can use Pensum, the T5.2 service. He launches it as a web service. He selects the NLP course domain and starts to express, in a dedicated notepad, the main questions, problems and notions he has or wants to tackle in this course. Then he begins to write a synthesis of the most important ideas of the course as he understood it. Whenever Ulysses is uncertain whether he has grasped the most important notions of a text, he asks Pensum for support. The system gives Ulysses feedback on his written synthesis, e.g., the relevance of its sentences or its inter-sentence coherence. Ulysses is in control of his own learning process: he requests feedback whenever he wants, can update his notepad according to the main points he has understood, and can go further in writing the same synthesis or one related to another topic.

2. Implementation of Version 1 of the Services

2.1. Overall Presentation

Every lifelong learner performs a wide range of learning activities based on the use of language (retrieving pieces of text, reading, taking notes, discussing with peers or other stakeholders). In order to study and support these activities, different research domains are currently drawn upon (psychology of writing, distance learning, instructional psychology, natural language processing). In the preceding D 5.1, a state of the art presented these activities from two viewpoints, writing (mostly summaries) and chatting, together with their relations with learning. In this report we want to go further into some new domains. This deliverable is the occasion to pinpoint the relations between individual and collaborative knowledge building and the following points:
– learners are better able to build knowledge from argumentative writing tasks than from other kinds of discourse (e.g., narrative) (Wiley & Voss, 1999), especially when the learner's opinion is asked for;
– learners are able to manage their self-understanding of the course;
– learners are able to write out syntheses of the course that capture their ongoing understanding of it.
The remainder of this section is devoted to documenting the implementation of each of the WP5 services in a fourfold argumentation: the challenges at stake, the methods carried out in the implementation of the services, the main results we obtained and the conclusions we draw.

2.2. Technical Description of T5.1 Service

Challenges

Educational institutions have largely embraced the use of the Internet, web technologies and their collaborative environments to supplement standard learning practices. Learners' interactions show their (individual and group) knowledge of the course materials as well as their capacity to apply this knowledge when solving (practical) problems.
However, what happens in these interactions is now generally beyond the control of the teachers, who focus only on the results of the collaboration processes. Greater involvement, in order to assess individual contributions, to moderate, or to provide relevant feedback on the quality of the web interactions regarding both content and collaboration, is time-consuming and demands a high cognitive load. The development of the service for T5.1 is challenging because it is based on a new vision of supporting learning: using Natural Language Processing (NLP) tools for analyzing dialogic knowledge creation in chat conversations and forum discussions. It is theoretically challenging from several points of view. As mentioned in D 5.1, the service is based on Bakhtin's ideas that dialog is present in any text, that the echoes of many voices exist in any word, and that they weave together in polyphony and inter-animation (Bakhtin, 1981, 1984). However, even if many (e.g., Koschmann, 1999, or Stahl, 2006) consider Bakhtin's theories a paradigm for Computer-Supported Collaborative Learning (CSCL), some important details remain to be elaborated (e.g., what could be considered a voice, and how to develop a computational system from these theories). From another perspective, NLP is known to be hard and is often considered unreliable in real settings. Furthermore, interaction analysis and pragmatics are among the most difficult issues for NLP. Moreover, it is not clear whether complex metrics of collaboration can be computed. Another challenging theoretical point is the degree to which using CSCL may enhance knowledge building in lifelong learning. The validation activities for T5.1 were carefully designed to analyze this challenge. In addition, there are technological challenges, because NLP has hardly been used until now for analyzing multi-user chat conversations.
Moreover, discourse analysis in chats, and in any text in general, has not previously been based on a multithreaded, inter-animation perspective, as is the case for us.

Methods

One main idea behind our approach is to encourage, in lifelong learning, the usage of conversations, dialogue, debates and inter-animation as premises for understanding, studying and creative thinking. The achievement of these goals may be supported by tools that analyze chats and forums and provide insight and feedback. The implemented analysis method integrates results from NLP (content and discourse analysis), Social Network Analysis (SNA) and, as a novel idea (Trausan-Matu & Rebedea, 2009), the identification of polyphonic threading in chats. The results are textual and graphical feedback and evaluations of the contributions of the participants.

Architecture of T5.1 Service

The goals of the T5.1 service are to provide feedback and recommendations, and to propose grades for learners that participate in an assignment in a chat conversation or a discussion forum. Although the parameters taken into consideration are slightly different for the chat and forum cases, the main steps are identical, as described below. The T5.1 service is implemented by the Chat & Forum Analysis and Feedback System (C&F-AFS). First, the data is processed by an NLP pipe (spelling correction, stemmer, tokenizer, POS tagger and parser). In the semantic sub-layer, concepts are searched for in a collection of key concepts and their inter-relations for the subject, provided by the teacher. These concepts may also be obtained from ontologies, either provided by experts or automatically extracted from various sources (e.g., using Wikipedia and Wiktionary, an alternative that will be investigated for Version 2). Synonyms are obtained from the lexical database WordNet (http://wordnet.princeton.edu) for English; for other languages, e.g.
Romanian, other lexical resources will be used in Version 2 of the T5.1 service, such as dexonline.ro or particular wordnets, e.g., BalkaNet or EuroWordNet, if available. Advanced NLP and discourse analysis techniques are used for the identification of speech acts, rhetorical schemas, lexical chains and co-references, in order to find interactions between the participants. Discourse analysis techniques are further used for identifying adjacency pairs, other implicit links, discussion threads, argumentation and transactivity. The polyphony sub-layer uses the interactions and advanced discourse structures to look for inter-animation, convergence and divergence. In addition, for the computation of several metrics regarding participation in the community of learners, Social Network Analysis is used, which takes into account the social graph induced by the participants and the interactions that have been discovered. The results of the previous sub-layers are combined to offer textual and graphical feedback and grade suggestions for each participant in an assignment. The above steps are performed in the four successive layers of the architecture of the system depicted in Figure 2 (note that the upper layers use the information computed by the lower layers, as presented in the next paragraph).

Figure 2 — Layers of T5.1 service

Modules in T5.1 Service

The modules of the T5.1 service may be grouped into several functional components around its main goal, the contribution analyzer, which is a module that provides textual feedback and visualization of the interaction, and proposes grades for the participants. The input of the service is the interaction (chat or forum) log. For the content analyzer, the NLP pipe is needed for pre-processing the log text.
The inter-animation analyzer processes the threads in the chat, which are built upon the explicit and implicit links in the conversation; the latter are provided by a specialized module that uses the results of the NLP pipe plus discourse analysis (see Figure 3). In Figure 5, the main blocks from Figure 3 are broken down into the specific functional modules designed for each component; only the most important interactions between them are shown, to avoid overcomplicating the figure. The majority of the modules are already implemented in the first version; the few remaining ones (collocation determination, rhetorical schema identification) will be included in the second version of the service. More details about these components are provided in the next sections.

Figure 3 — Main blocks of C&F-AFS

The Format of the Input Data

The chat environment used in the experimentation is the one used in the NSF Virtual Math Teams (VMT) project (Stahl, 2009), which offers a whiteboard and referencing facilities. This environment is also available as the open source system ConcertChat (Holmer, Kienle, & Wessner, 2006; http://sourceforge.net/projects/concertchat/). It allows the user to explicitly reference a previous utterance or an object on the whiteboard. This facility is extremely important in chat conversations with more than two participants because it allows several discussion threads to exist in parallel, a feature that cannot be achieved in face-to-face conversations, ordinary chats or phone conversations. An XML schema was designed for encoding chat conversations and discussion forums. In Figure 4, an example fragment of such a chat is presented. Each utterance has a unique identifier ('genid') and the existing explicit references ('ref') to previous utterances.
In addition to annotating the elements of a chat, the schema also includes data generated by the system, as will be presented later. In forums, an additional 'thread' XML element is added. The input data may be in different formats besides the above XML schema. A preprocessing module transforms these formats into an XML document that respects this schema. The supported formats are saved chats from Yahoo Messenger in text format, VMT (ConcertChat) HTML format and older VMT formats.

<?xml version="1.0" encoding="UTF-8"?>
<Dialog time="2005-01-11 09:26:11" description="this is an assignment for the NLP course"
        file="chat_input_1.xml" id="Social networks13_6_200610_57_10"
        language="en|fr|ro" name="chat-12-A" subject="about pragmatics" team="12">
  <Participants>
    <Person nickname="Alex" realname="Bibi Ionescu"/>
    <Person nickname="vvalcea" realname=""/>
    <Person nickname="Adrian"/>
  </Participants>
  <Topics>
    <Itemset description="NLP - pragmatics">
      <Item>speech act</Item>
      <Item description="cnf. Grice's theory">implicature</Item>
    </Itemset>
    <Itemset>……………….</Itemset>
  </Topics>
  <Body>
    <Turn nickname="Alex">
      <Utterance genid="1" ref="0" time="2005-01-11 09:26:03">hello all</Utterance>
    </Turn>
    <Turn nickname="Adrian">
      <Utterance genid="2" ref="0" time="2005-01-11 09:27:18">hi</Utterance>
    </Turn>
    <Turn nickname="vvalcea">
      <Utterance genid="3" ref="1" time="2005-01-11 09:29:29">Hello Alex</Utterance>
    </Turn>
    ……………………………………..
  </Body>
</Dialog>

Figure 4 — XML encoding of chats

The NLP Pipe

The NLP pipe takes as input the chat log or the text of the discussion forum, in the above XML format, and has the following component modules:
– Spelling correction, which tries to correct the spelling errors in the text;
– Tokenizer, which splits the text into textual units (these are not always simple words);
– Named Entity Recognizer, which identifies and classifies names of persons, places, brands, companies, etc. It uses a gazetteer which has to be loaded with specific names for the considered teaching domain;
– Stemmer (Lemmatizer), which stems each word to identify words from the same word family;
– POS tagger and parser, which tags each word with its POS and constructs dependencies between the words;
– NP-chunker, which is used to structure the noun phrases in the text.
The modules of the NLP pipe are those provided by the Stanford NLP software (http://nlp.stanford.edu/software), with the exception of the spellchecker (implemented with Jazzy, see http://jazzy.sourceforge.net/ and http://www.ibm.com/developerworks/java/library/j-jazzy/). Two alternative NLP pipes are under experimentation, integrating modules from GATE (http://gate.ac.uk) and LingPipe (http://alias-i.com/lingpipe/). Version 1 of the T5.1 service includes only modules for the English language. However, the design and implementation allow other languages to be considered as well; for these new languages, of course, the modules have to be replaced with language-specific ones.

Figure 5 — The main components of the system. Components that are part of the same module have the same background colors (for example, the NLP pipe is colored in light blue).
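The chaining of the pipe's modules can be illustrated with a minimal, self-contained sketch. The real service uses the Stanford NLP tools and Jazzy; the naive regex tokenizer and suffix-stripping stemmer below are hypothetical stand-ins that only show how each stage consumes the previous stage's output.

```python
import re

def tokenize(text):
    # split into word tokens; the real tokenizer also handles multi-word units
    return re.findall(r"[a-zA-Z']+", text.lower())

def stem(token):
    # crude suffix stripping, standing in for a real stemmer/lemmatizer
    for suffix in ("ing", "ers", "er", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def nlp_pipe(utterance):
    # each stage consumes the previous stage's output, as in Figure 2
    return [stem(t) for t in tokenize(utterance)]

print(nlp_pipe("The learners discussed chats and forums"))
# ['the', 'learn', 'discuss', 'chat', 'and', 'forum']
```

In the actual system the later stages (NER, POS tagging, parsing, NP-chunking) extend this chain in the same consume-and-enrich fashion.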
Cue-Phrase Identification

Because important parts of the processing in C&F-AFS are based on patterns identified by cue phrases, a module called "PatternSearch" was implemented for searching, in the log of a chat or a forum, for occurrences that match expressions specified by the user. In addition to simple regular expression search, the module allows considering not only words, but also their synonyms, stems and parts of speech (POS). Another novel facility is the consideration of utterances as a search unit: for example, specifying that a word should be searched for in the previous n utterances, or that two expressions should be in two distinct utterances. For example, the expression <S "convergence"> #[*] cube searches for pairs of utterances that have a synonym of "convergence" in the first utterance and "cube" in the second. One result from a particular chat is the pair of utterances 1103 and 1107:

1103 # 1107. overlap # cube
[that would stil have to acount for the overlap that way] # [an idea: Each cube is assigned to 3 edges. Then add the edges on the diagonalish face.]

The search is made at utterance level. The program checks the utterances one by one and, if there is a match between a part of an utterance and the searched expression, both the utterance and the specific text that matched are indicated. PatternSearch is used in several other modules: cue-phrase identification, implicit link identification and adjacency pair identification. A complete description of the module is presented in Appendix 2.

Content Analysis

The content analysis identifies the main concepts of the chat or forum using the NLP pipe, cue phrases and graph algorithms. It also identifies speech acts and argumentation types of utterances (as in Toulmin's theory: Warrant, Concession, Rebuttal and Qualifier (Toulmin, 1958)).
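The utterance-level search idea behind PatternSearch (two expressions matched in two distinct utterances, with synonym expansion) can be sketched as follows. This is a toy re-implementation, not the module itself: the real pattern language also supports stems, POS constraints and full regular expressions, and the synonym table here is a hypothetical stand-in for a WordNet lookup.

```python
# hypothetical synonym table standing in for WordNet
SYNONYMS = {"convergence": {"convergence", "overlap", "confluence"}}

def search_pair(utterances, first_word, second_word, window=5):
    """Find pairs (i, j) where utterance i contains a synonym of first_word
    and a later utterance j (within `window` utterances) contains second_word.
    `utterances` is a list of (id, text) pairs."""
    hits = []
    for i, (uid_i, text_i) in enumerate(utterances):
        words_i = set(text_i.lower().split())
        if words_i & SYNONYMS.get(first_word, {first_word}):
            for uid_j, text_j in utterances[i + 1 : i + 1 + window]:
                if second_word in text_j.lower().split():
                    hits.append((uid_i, uid_j))
    return hits

# the pair of utterances from the example above (raw chat text, typos included)
chat = [
    (1103, "that would stil have to acount for the overlap that way"),
    (1105, "maybe"),
    (1107, "an idea: each cube is assigned to 3 edges"),
]
print(search_pair(chat, "convergence", "cube"))  # [(1103, 1107)]
```

The sketch mirrors the documented query <S "convergence"> #[*] cube: "overlap" matches as a synonym of "convergence" in utterance 1103, and "cube" matches in utterance 1107.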
The first step in finding the chat subjects is to strip the text of irrelevant words (stop-words), text emoticons (like “:)” or “:P”), special abbreviations used while chatting (e.g., “brb,” “np” and “thx”) and other words considered irrelevant at this stage. The next step is the tokenization of the chat text. Recurrent tokens and their synonyms are considered as candidate concepts in the analysis. Synonyms are retrieved from the WordNet lexical ontology (http://wordnet.princeton.edu). If a concept is not found in WordNet, possible misspellings are searched for; if one is found, the synonyms of the suggested word are retrieved. The last stage in identifying the chat topics consists of a unification of the candidate concepts discovered in the chat. This is done by using the synonym list of every concept: if a concept in the chat appears in the list of synonyms of another concept, then the two concepts’ synonym lists are joined (in Version 2 of the service, some of the processing done in WP4 will also be considered for concept identification). At this point, the frequency of the resulting concept is the sum of the frequencies of the two unified concepts. This process continues until there are no more concepts to be unified. The list of resulting concepts is then taken as the list of topics for the chat conversation, ordered by frequency. In addition to the above method for determining the chat topics, an alternative technique infers them through a surface analysis of the conversation. Observing that new topics are generally introduced into a conversation using standard expressions such as “let’s talk about email” or “what about wikis,” we have constructed a simple and efficient method for deducing the topics in a conversation by searching for patterns containing specific cue phrases.
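The unification step above can be sketched directly; the data structure (concept name mapped to a frequency and a synonym set) and the function name are illustrative assumptions.

```python
def unify_concepts(concepts):
    """Repeatedly merge concepts that appear in each other's synonym lists,
    joining synonym sets and summing frequencies, until a fixed point.
    `concepts` maps a concept name to (frequency, set of synonyms)."""
    merged = True
    while merged:
        merged = False
        names = list(concepts)
        for a in names:
            for b in names:
                if a == b or a not in concepts or b not in concepts:
                    continue
                freq_a, syn_a = concepts[a]
                freq_b, syn_b = concepts[b]
                if b in syn_a or a in syn_b:
                    # Fold b into a: join synonym lists, add frequencies.
                    concepts[a] = (freq_a + freq_b, syn_a | syn_b)
                    del concepts[b]
                    merged = True
    # Resulting concepts, ordered by frequency, are the chat topics.
    return sorted(concepts.items(), key=lambda kv: -kv[1][0])

topics = unify_concepts({
    "mail":  (4, {"mail", "post", "email"}),
    "email": (3, {"email", "e-mail"}),
    "wiki":  (5, {"wiki"}),
})
```

Here "email" is listed among the synonyms of "mail", so the two are unified into a single topic of frequency 7.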
The topics of the chat may also be detected starting from the connected components of the interaction graph constructed from the explicit and implicit links described in the next section. Speech acts were introduced by Austin and then elaborated by Searle and others (Jurafsky & Martin, 2009). They are classifications of utterances according to the action they fulfill. The list of speech acts considered in the T 5.1 system is derived from DAMSL (http://www.cs.rochester.edu/research/cisd/resources/damsl/RevisedManual/): Statement, Info Request, Declarative Question, Wh-Question, Action Directive, Conventional, Agreement, Accept, Reject, Partial Accept, Partial Reject, Maybe, Understanding, Answer, Thanks, Sorry, Opinion, Greeting, Noise, Continuation, Exclamation.

Implicit Links Identification

In addition to explicit links, stated in chats by the referencing facility of the VMT environment and in forums by the reply link, implicit links are also identified. The advanced NLP and basic discourse analysis sub-layer uses the results of the previous two sub-layers to identify various types of implicit links:
- Repetitions (of ordinary words or Named Entities);
- Lexical chains, which identify relations among the words in the same post or utterance or in different ones, by using semantic similarities (the semantic sub-layer);
- Adjacency pairs (Jurafsky & Martin, 2009) – pairs of specific speech acts – answers to a single question within a limited window of time (in which the echo of the “voice” of the question remains), greeting-greeting, etc.;
- Co-references.
Implicit links, with the exception of lexical chains (Appendix 3 contains more details about their detection) and co-references (detected with the BART system, see http://bart-coref.org/), are detected using the cue-phrase identification system (PatternSearch) and LSA.
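The question-answer case of adjacency-pair detection can be sketched as a scan over speech-act-tagged utterances within a time window; the tuple layout, speech-act labels chosen and the 60-second window are illustrative assumptions, not the service's actual parameters.

```python
def adjacency_pairs(utterances, max_gap=60.0):
    """Pair each question with answers from other participants that follow
    within a time window, a simplified stand-in for adjacency-pair detection.
    Each utterance is (id, speaker, timestamp_seconds, speech_act)."""
    pairs = []
    for i, (qid, q_speaker, q_time, q_act) in enumerate(utterances):
        if q_act not in ("Info Request", "Wh-Question"):
            continue
        for aid, a_speaker, a_time, a_act in utterances[i + 1:]:
            if a_time - q_time > max_gap:
                break  # the "echo" of the question's voice has faded
            if a_act == "Answer" and a_speaker != q_speaker:
                pairs.append((qid, aid))
    return pairs

log = [
    (1, "ana", 0.0, "Wh-Question"),
    (2, "dan", 10.0, "Answer"),
    (3, "ana", 90.0, "Statement"),
]
qa = adjacency_pairs(log)  # -> [(1, 2)]
```

Each detected pair then becomes an implicit link (an edge) in the interaction graph.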
For version 1 of the services, LSA has been considered as an alternative to computing semantic similarities using a domain ontology.

Words, Key Concepts, Voices, Threads

In addition to existing approaches to analyzing chats, which are mainly based on analyzing pairs of utterances (Dong, 2006; Joshi & Rosé, 2007), we use a thread-based analysis, starting from Mikhail Bakhtin’s multivocality (heteroglossia), polyphony and inter-animation ideas (Bakhtin 1981, 1984). The polyphony-based theoretical framework founded on Bakhtin’s theories is centered around the idea of a co-presence of multiple voices, which may be considered as particular positions taken by one or more persons when they emit an utterance that has explicit or implicit links to, or influences on, the other voices. In the implementation of our analysis tool, we start from the key concepts and associated features1 that have to be discussed. Each participant is assigned a position to support, which corresponds to a key concept. Implicitly, that corresponds to a voice emitting that concept and its associated features. We identify additional voices in the conversation by detecting recurrent themes and new concepts. Therefore, a first, simple perspective is a word-based approach to voices: we consider that a repeated (non-stop) word becomes a voice. The number of repetitions and some additional factors (e.g., presence in some specific patterns) are used to compute the strength of that voice (word). This perspective is also consonant with Vygotsky’s (1978) ideas that words are socially constructed artifacts, or tools for group knowledge construction. An example of an artifact in a CSCL chat for solving a geometry problem is the phrase “60/90/60”, the degrees of the angles of a triangle, which is used many times by the participants of a VMT chat.
We use voices to keep track of the position that each participant has to support, in order to identify divergences and conjunctions. This position is, as mentioned above, an implicit voice. For a given small period of time, the last utterances are echo-like voices; for example, answers may be associated to questions within a given time window. Voices influence each other through explicit or implicit links. In this perspective, voices correspond to threads. A thread may be a reasoning or argumentation chain (Toulmin, 1958), a chain of rhetorical schemas, a chain of co-references, a lexical chain, or even only a chain of repeated words, in the sense of Tannen (1989). The identification of argumentation chains, rhetorical schemas or co-references in texts and conversations is a very difficult task for Natural Language Processing. Chains of repeated words, however, are very easy to detect, the sole problem being the elimination of irrelevant repeated words like stop-words. Lexical chains can also be detected very easily, but their construction is more difficult and the results are greatly influenced by the choice of semantic similarity measures.
1 In the second version, a domain ontology will also be used.

Polyphony, Inter-animation and Collaboration

In polyphony, the most advanced kind of musical composition, a number of melodic lines (or “voices,” in an extended, non-acoustical perspective) jointly construct a harmonious musical piece, generating variations on one or several themes. Dissonances should be resolved, even if several themes (melodies) or theme variations are played simultaneously, and even if the voices sometimes situate themselves in opposing positions. Voices in polyphonic music have two dimensions: the sequential threading of utterances or words, and the transversal one implicitly generated by the coincidence of multiple voices (Trausan-Matu and Rebedea, 2009).
In addition, another dichotomy, the unity-difference (or centrifugal-centripetal; Bakhtin, 1981) opposition, may also be observed. Bakhtin (1981) considers that multiple voices are also present in texts, and sometimes they inter-animate, constituting a polyphonic framework (Bakhtin, 1984). Extrapolating this idea, we observe that inter-animation of voices following polyphonic patterns can be identified in dialogs in general and in chats in particular. A polyphonic collaboration involves several participants who play several themes and their variations in a game of sequential succession and differing positions. The existence of different voices introduces “dissonances”: unsound, rickety stories or solutions. Wegerif advocates the use of a dialogic framework for teaching thinking skills by stressing inter-animation: “meaning-making requires the inter-animation of more than one perspective” (R. Wegerif, 2006). He proposes that “questions like ‘What do you think?’ and ‘Why do you think that?’ in the right place can have a profound effect on learning” (Rupert Wegerif, 2007). However, he does not develop the polyphonic feature of inter-animation. Multivocality means that several voices are permanently in competition. Each utterance is filled with “overtones” of other utterances. A first problem is to detect these overtones. To this end, we can start from implicit or explicit links. From these links a graph is constructed connecting utterances and, in some cases, words. In this graph, threads may be identified as sequences of implicit or explicit links that constitute voices. Simple examples of threads are repetitions of words or lexical chains. The same utterance may, of course, be included in several threads. There are always several voices that interact, for example, the writer, the potential reader, and the echoes of the voices present in each word.
Moreover, from this multivocality perspective, texts become meaning-generation mechanisms, facilitating understanding and creative thought, as Lotman stated (Wertsch, 1991; Dysthe, 1996). A consequence is that in education, "the interaction of oral and written discourse increased dialogicality and multivoicesness and therefore provided more chances for students to learn than did talking or writing alone" (Dysthe, 1996). The dialogic and multivoiced nature of any utterance, even a written one, may be a unifying factor for the integration of the modules in the language-centered LTfLL project. Therefore, starting from the ideas of T5.1, an integrated framework may be provided for analysing all the textual learning activities, such as searching documents, reading, writing summaries or forum posts and chatting.

Contribution Analyser

The evaluation of the contributions of each learner considers several features, such as the coverage of the expected concepts, readability measures (see Appendix 4), and the degree to which the contributions have influenced the conversation or contributed to the inter-animation. In terms of our polyphonic model, we evaluate to what degree learners have emitted sound and strong utterances that influenced the following discussion, or, in other words, to what degree their utterances became strong voices, by generating new and long threads. The automatic analysis also considers the inter-animation patterns among threads in the chat. It uses several criteria, such as the presence in the chat of questions, agreement, disagreement or explicit and implicit referencing. In addition, the strength of a voice (of an utterance) depends on the strength of the utterances that refer to it: if an utterance is referenced by other utterances that are considered important, that utterance also becomes important.
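This recursive definition (an utterance is strong if strong utterances refer to it) can be sketched as a simple fixed-point iteration over the reference links; the damping factor, base strengths and function name are illustrative assumptions, not the formula the service actually uses.

```python
def utterance_strength(n, links, base=None, damping=0.85, iters=50):
    """Iteratively propagate importance along reference links:
    an utterance referenced by strong utterances becomes strong itself.
    `links[j]` lists the utterances that utterance j refers to."""
    base = base or [1.0] * n
    strength = base[:]
    for _ in range(iters):
        new = base[:]
        for j, targets in links.items():
            for t in targets:
                # Each referrer passes a share of its strength to its targets.
                new[t] += damping * strength[j] / len(targets)
        strength = new
    return strength

# Utterance 0 is referenced by utterances 1 and 2, so it ends up strongest.
s = utterance_strength(3, {1: [0], 2: [0]})
```

This is the same propagation principle as PageRank, applied to the utterance reference graph instead of the web graph.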
By using this method of computing importance, the utterances that have started an important conversation within the chat, as well as those that began new topics or marked the passage between topics, are more easily emphasized. If the explicit relationships are always used and as many of the implicit ones as possible are correctly determined, then this method of calculating the contribution of a participant can be considered successful (Trausan-Matu, Rebedea, Dragan, & Alexandru, 2007). The above ideas of polyphonic assessment are combined with the results of two additional sets of evaluations (more details are provided in Appendix 4):
1. From the utterance assessment perspective, the following 3 steps are performed after the NLP pipe:
1.1. Evaluate each utterance individually, taking into consideration the following features:
- the effective length of the initial utterance;
- the words that remain after eliminating stop words, spell-checking and stemming, and their numbers of occurrences;
- the level at which the current utterance is situated in the overall thread;
- the correlation/similarity with the overall chat;
- the correlation/similarity with a predefined set of topics of discussion.
1.2. Augment the importance (marks) of utterances in the middle of threads using a Gaussian distribution;
1.3. Determine the final grade for each utterance in the current thread using only explicit links, thus obtaining a relative mark for each utterance in the corresponding thread.
2. From the participants’ assessment perspective, evaluation is made at 2 different levels: 2.1.
At a surface level, consisting of the following:
- Readability of all utterances regarded as a document;
- Proxes derived from Page’s Essay Grading techniques (see Appendix 4), covering the following aspects: fluency, spelling, diction and utterance structure;
- Social-network metrics computed on the matrix of interchanged utterances, based entirely on explicit links: Degree (In-degree, Out-degree), Centrality (eigen-centrality, closeness, graph centrality) and User Ranking based on the well-known Google PageRank algorithm.
2.2. At a semantic level, using semantic similarity based on LSA and social network analysis on a matrix obtained from the previously assessed utterances and their corresponding marks. For instance, an edge in the network is now weighted not by the number of utterances exchanged (as in the surface-level analysis), but by the sum of the marks determined for each utterance. The main difference between the two levels is that at the first level the actual involvement of each participant is assessed (gregariousness, openness to the community), while at the second level the actual knowledge is assessed; therefore, the participant’s competency in the given domain is evaluated.

Results

C&F-AFS supports the analysis of collaboration among learners: it produces different kinds of information about chat and forum discussions, both quantitative and qualitative, such as various metrics or statistics, and content analysis data, e.g., the coverage of the key concepts related to executing a task, the understanding of the course topics, or the inter-threaded structure of the discussion. In addition, C&F-AFS provides feedback, both directive and facilitative (see Deliverable 5.1), telling the learners what was good or wrong in their interactions and facilitating their understanding.
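The surface-level social-network metrics described above (degree and a PageRank-style user ranking over the matrix of interchanged utterances) can be sketched in plain Python; the example matrix, participant names and damping value are toy assumptions.

```python
def sna_metrics(names, exchanges, damping=0.85, iters=100):
    """Compute in-degree, out-degree and a PageRank-style ranking from a
    matrix of interchanged utterances: exchanges[i][j] counts (or weighs)
    utterances sent by participant i to participant j."""
    n = len(names)
    out_deg = [sum(row) for row in exchanges]
    in_deg = [sum(exchanges[i][j] for i in range(n)) for j in range(n)]
    rank = [1.0 / n] * n
    for _ in range(iters):
        new = [(1 - damping) / n] * n
        for i in range(n):
            if out_deg[i] == 0:
                continue
            for j in range(n):
                new[j] += damping * rank[i] * exchanges[i][j] / out_deg[i]
        rank = new
    return {"in": dict(zip(names, in_deg)),
            "out": dict(zip(names, out_deg)),
            "rank": dict(zip(names, rank))}

m = sna_metrics(["ana", "dan", "eva"],
                [[0, 3, 1],   # ana addressed dan 3 times, eva once
                 [2, 0, 0],
                 [1, 0, 0]])
```

For the semantic level, the same computation applies with edge weights replaced by the summed utterance marks instead of utterance counts.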
It provides data about the involvement of each learner, generates a preliminary assessment and visualizes the interactions and the social participation. Finally, the system identifies and visually highlights the most important chat utterances or forum posts (those that express different opinions, missing topics/concepts, misleading posts, misconceptions or wrong relations between concepts). The results of the contribution analyzer are annotated in the XML file of the chat or forum. The annotations are about utterances:

<UtteranceFeedback genid="53">
  <Grade type="overall">8.15</Grade>
  <SpeechAct>Continuation</SpeechAct>
  <SpeechAct>Info Request</SpeechAct>
  <SpeechAct>Statement</SpeechAct>
  <Argumentation>Claim</Argumentation>
</UtteranceFeedback>

and about participants:

<GeneralGrade nickname="AlexI">
  <Grade type="diction">20.07</Grade>
  <Grade type="spelling">13.16</Grade>
  <Grade type="fluency">25.63</Grade>
  <Grade type="pageRanking">20.44</Grade>
  <Grade type="utteranceStructure">22.89</Grade>
  <Grade type="nbrWordsProc">27.37</Grade>
  <Grade type="nbrDiffWordsProc">24.98</Grade>
  <Grade type="nbrUtterancesProc">25.06</Grade>
  <Grade type="nbrUtterancesProc">23.19</Grade>
  <Grade type="meanUtteranceWords">13.21</Grade>
  <Grade type="correctWordsProc">13.16</Grade>
  <Grade type="flesch">55.18</Grade>
  <Grade type="kincaid">8.23</Grade>
  <Grade type="fog">10.35</Grade>
  <Grade type="inDegree">25.51</Grade>
  <Grade type="outDegree">23.19</Grade>
  <Grade type="rank">22.09</Grade>
  <Grade type="eigen">100.01</Grade>
  <Grade type="closeness">16.79</Grade>
  <Grade type="centrality">17.46</Grade>
</GeneralGrade>

These values are used for generating textual feedback, which includes, besides the above numerical values:
- the list of the most important (used, discussed) concepts in a chat/forum;
- the coverage of the important concepts specified by the tutor;
- the most important utterances of each participant (the ones with the largest scores – the score for an utterance uses a complex formula that takes into account the concepts used, dialog acts, the links between utterances and SNA factors);
- a score for each participant in the conversation;
- the areas of the conversation with important collaboration (inter-animation, argumentation, convergence and divergence);
- a grade for the collaboration in the whole conversation;
- other indicators and statistics that will be added as the service/system develops.
As graphical feedback, the service provides an interactive visualization and analysis of the conversation graph, with filtering enabled. The graphical representation of chats was designed to facilitate an analysis based on Bakhtin’s polyphony theory and to permit the best visualization of the conversation. For each participant in the chat there is a separate horizontal line in the representation, and each utterance is placed on the line corresponding to the issuer of that utterance, taking into account its position in the original chat file, using the timeline as a horizontal axis (see Figure 6). Each utterance is represented as a rectangular node whose horizontal length is proportional to the textual length of the utterance. The distance between two different utterances is proportional to the time between the utterances (Trausan-Matu et al., 2007). The interface is implemented via widgets (see section 3.1).
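The layout rules just described (one horizontal line per participant, the timeline as horizontal axis, rectangle width proportional to the utterance's textual length) can be sketched as a coordinate computation; the pixel scales and the function name are illustrative assumptions.

```python
def layout_chat(utterances, px_per_second=2.0, px_per_char=1.5, row_height=40):
    """Compute drawing coordinates for the chat visualization.
    Each utterance is (speaker, timestamp_seconds, text)."""
    speakers = []
    boxes = []
    for speaker, t, text in utterances:
        if speaker not in speakers:
            speakers.append(speaker)  # one horizontal line per participant
        boxes.append({
            "speaker": speaker,
            "x": t * px_per_second,              # timeline as horizontal axis
            "y": speakers.index(speaker) * row_height,
            "width": len(text) * px_per_char,    # proportional to text length
        })
    return boxes

boxes = layout_chat([("ana", 0.0, "hello"), ("dan", 5.0, "hi ana")])
```

Explicit and implicit links can then be drawn as lines between the resulting rectangles.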
Figure 6 — Chat visualization widget

The explicit and implicit references between utterances are depicted using connecting lines of different colors. The user may explore the visualization in several ways. First of all, there are several options for changing the colors of the threads or other features of the diagram, such as zooming or scaling. Clicking on the rectangle of an utterance shows the threads in which it is included (Figure 7).

Figure 7 — Visualization of a conversation thread

Threads of repeated words and patterns may be visualized with the option “special threads”, each thread being represented in a different color (Figure 8). This visualization allows seeing the inter-animation among the topics supposed to be discussed (the “voices” and their polyphonic weaving). For example, in a chat with good collaboration the threads often pass from one participant to another.

Figure 8 — Visualization of specific threads, indicated by the user

Conclusion for T5.1

The T5.1 service is aimed at supporting learners in chats and discussion forums. It provides textual and graphical feedback, both directive and facilitative, at several levels: utterance (post), participant and conversation. The service uses Natural Language Processing, Social Network Analysis and polyphony-based analysis to generate quantitative and qualitative data. It also offers a graphical visualization and exploration of the interactions, which allows a quick overview of the inter-animation and, therefore, of the collaboration in chats and forums.

2.3. Technical Description of T 5.2 Service

Challenges for T 5.2 Service

As expressed in section 1.2, the aim of the T 5.2 service is to give learners assessments of their written productions, and thus to make the teachers’ assessment process smoother.
The validation of the T 5.2 Showcase was the first step to test and validate some features of the service. The T 5.2 Showcase used a reading and writing cycle to assist students undertaking text revisions. The reading loop uses LSA to identify texts on selected topics. During the writing loop, students put their understanding of the topics in writing, and LSA is used to provide feedback on how well the student has understood the text. This service got quite good evaluations from its users, though it would benefit from refinement. Its validation (see D 7.2 Appendix 7, pp. 78 et sq. for more details) highlighted the following points:
– Students not only have to summarize separate pieces of text, but also to write syntheses from multiple sources after reading them (e.g., different parts of a course or articles).
– A key benefit of the service is that it supports learners in a familiar situation: reading texts and then summarizing them. They frequently read texts from the web to understand and refine concepts and then make notes.
– Teaching managers stated that the system is ready to be used for teaching, an advantage being that the system only provides texts relevant to the course, whereas the internet can provide large amounts of only slightly relevant material.
– The system would benefit from a more adaptive approach to identifying texts, based on how well the learner performed in the writing cycle.
– Transparency in how the system came to its conclusions is needed, so that learners can identify the specific areas that need more attention.
These previous achievements led us to implement Version 1 of the T 5.2 service with the following main improvements. First, the prototype allows text summarization not only from single texts (e.g., an article summary) but also from a set of documents (e.g., a synthesis).
We propose to ask students to synthesize their courses, an activity which promotes and assesses their own understanding (Palincsar & Brown, 1984). Feedback is provided on the learning and on the synthesis of courses, a common task in several teaching domains (Kirkpatrick & Klein, 2009). We propose several kinds of feedback that help students understand both the texts they read and the pieces of text they write as a synthesis. This feedback allows students to focus on the information they need to understand in the source texts. Moreover, it directly indicates the parts of the source texts lacking in the synthesis and the parts of the synthesis to revise. The Solution Scenario we designed (see D 3.2, section 3.3.2, pp. 25 et sq.) describes the final version (v. 2) with two kinds of feedback and support. The first kind of feedback is a “reflexive feedback” (Butler & Winne, 1995; Hattie & Timperley, 2007) that fosters self-directed learning and knowledge-building processes. Its main goal is to help students formulate questions about the course documents before reading them and starting the synthesis activity. It relies on an inquiry activity in which students formulate a focus question and reflect on their prior knowledge and ideas on the topic. Formulating learning questions about what they will read helps students collect relevant information in the texts and organize it in the synthesis to be written. The second kind of feedback is an “object feedback” on the pieces of text produced by the student. It aims at supporting students in their synthesis task and focuses on the semantic content of the synthesis. This feedback is delivered in two ways: first, as immediate, computer-based feedback, available as many times as necessary; second, as delayed feedback, given by teachers and tutors via the system.
To help tutors or teachers deliver adequate feedback on students’ syntheses, a list of recurrent understanding and writing problems based on the categorization of Thibaudeau (2000) is provided. This categorization of problems is built according to a design-patterns approach (Hübscher & Frizell, 2002). The visualization of these pieces of feedback (texts, graphics, etc.) and the conditions and modalities of their display rely on the model of Dufresne et al. (2003), which precisely defines the specifications of feedback to be displayed in an online educational context. The challenge we face is twofold: devise and implement cognitive models of written assessment, and use these models to build a comprehensive set of feedback to foster students’ knowledge building. We now describe the methods used in the implementation of Version 1 of the T 5.2 service, which tests the effect of some of the feedback presented above.

Methods carried out for version 1

In this first version of the T 5.2 service, we focus only on the immediate feedback based on LSA-based computational cognitive models. LSA has been argued to be useful for modeling human semantic memory (Landauer & Dumais, 1997). This model can be used to simulate the understanding of texts and to analyze summaries of short explanatory texts. For example, Foltz, Kintsch and Landauer (1998) showed that LSA can be used to measure the coherence of a text and that the understanding of a text depends on this coherence. This capability of LSA has also been used in other systems like Apex (Lemaire & Dessus, 2001), which uses the same measure of coherence to give feedback, or Summary Street, which sends feedback about the relevance and the redundancy of sentences (E. Kintsch et al., 2000; Wade-Stein & Kintsch, 2004). Several types of immediate feedback have been implemented in Version 1, concerning the coherence of the synthesis, the relevance of its sentences, its completeness, and the generation of an outline of the synthesis.
– The coherence assessment model (Foltz et al., 1998) was used (two consecutive sentences are coherent if their semantic proximity is above a threshold). This model can assess whether two given sentences (in the same paragraph) are coherent. A coherence gap usually appears between the last sentence of a paragraph and the first sentence of the next paragraph; we nevertheless chose to indicate a lack of coherence in that case, for two reasons: first, a lack of coherence may indicate a bad transition between paragraphs; second, students may insert paragraph breaks in the wrong places.
– The relevance assessment model reuses Summary Street’s model (a relevant sentence is one that is similar to at least one sentence of the source texts).
– For the completeness assessment model (does the synthesis cover all the topics of the course texts?), we propose two alternatives: first, a measure with respect to the course text topics (a topic is a keyword representing the gist of a text), i.e., the semantic proximity between the synthesis and the block of sentences linked to the topic in the course texts. A sentence of a course text is in this block if the semantic proximity between the sentence and the topic is high enough. If the semantic proximity between the synthesis and this block of sentences is high, the topic is covered; if not, the student is prompted that the topic is not covered. Second, a measure based on the semantic relation between the sentences of the course text and the sentences of the synthesis (a sentence of the course text is indicated as not summarized in the synthesis if there is not enough semantic proximity between this sentence and each sentence of the synthesis).
– The outline generation model, which prompts the student with a picture (e.g., a map or a diagram) representing the course topics as they appear in the synthesis.
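The coherence check and the second alternative of the completeness check both reduce to cosine similarity over LSA sentence vectors, which can be sketched as follows; the tiny 3-dimensional vectors, thresholds and function names are toy assumptions (a real LSA space has a few hundred dimensions and empirically tuned thresholds).

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two (LSA) sentence vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def coherence_gaps(sentence_vectors, threshold=0.2):
    """Flag consecutive sentence pairs whose semantic proximity is below
    the threshold (Foltz et al.-style coherence assessment)."""
    return [i for i in range(len(sentence_vectors) - 1)
            if cosine(sentence_vectors[i], sentence_vectors[i + 1]) < threshold]

def uncovered_sentences(course_vecs, synthesis_vecs, threshold=0.4):
    """Return indices of course-text sentences to which no synthesis
    sentence is semantically close (second completeness model)."""
    return [i for i, c in enumerate(course_vecs)
            if all(cosine(c, s) < threshold for s in synthesis_vecs)]

# Toy 3-dimensional "LSA" vectors.
synthesis = [(1.0, 0.2, 0.0), (0.9, 0.3, 0.1), (0.0, 0.0, 1.0)]
gaps = coherence_gaps(synthesis)        # gap between sentences 2 and 3
course = [(1.0, 0.1, 0.0), (0.0, 1.0, 0.0)]
missing = uncovered_sentences(course, synthesis)
```

Flagged indices are then turned into underlines and tooltip messages in the interface.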
We implemented the non-topic-based feedback in version 1 (coherence, relevance, and the second alternative of the completeness model), and worked in parallel on the detection of topics. Since we have two types of feedback based on topics (synthesis completeness, outline), we have to propose and test models that generate such topics (keywords). Several methods for extracting keywords were tested. The first one considers that good keywords are words semantically related to the general meaning of the text. The method therefore consists of computing the LSA vector of the text by means of a classical sum of the vectors of its words. All the words of the corpus are then considered in turn, in order to find those whose vectors are closest to the text vector. The closest ones are taken as keywords because they are semantically close to the meaning of the text. It is worth noting that this method can produce keywords from the corpus that are not present in the analyzed text, although this is quite unlikely. In order to assess the relevance of this method, we asked 30 participants (2nd-year Master students) to provide 5 keywords for each of 6 texts. These French texts were between 570 and 2,532 words long and were about the internet and networks. We compared human and model keywords and unfortunately found that the correlations between the cosines of the automatically extracted keywords and the frequencies of the human keywords were very low. Although the extracted keywords seem relevant, they were quite different from those provided by the participants. We believe two facts may explain this finding: first of all, participants tend to select domain-specific keywords, whereas the method often outputs general words (such as content, technique, users, etc.); secondly, participants mostly provide nouns, whereas the method may produce verbs or adjectives, since any word of the corpus may be chosen as a keyword, provided that it is close enough to the text vector.
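This first extraction method (sum the word vectors, then rank corpus words by proximity to the text vector) can be sketched as follows; the 2-dimensional toy vectors and the function name are illustrative assumptions.

```python
from math import sqrt

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def extract_keywords(text_words, corpus_vectors, k=3):
    """Sum the vectors of the text's words to get a text vector, then return
    the k corpus words whose vectors are closest to it. Note that keywords
    may come from the corpus even if they are absent from the text."""
    dims = len(next(iter(corpus_vectors.values())))
    text_vec = [0.0] * dims
    for w in text_words:
        for d, x in enumerate(corpus_vectors.get(w, [0.0] * dims)):
            text_vec[d] += x
    ranked = sorted(corpus_vectors,
                    key=lambda w: cosine(corpus_vectors[w], text_vec),
                    reverse=True)
    return ranked[:k]

corpus = {
    "network": (1.0, 0.1), "router": (0.9, 0.2),
    "cake": (0.0, 1.0), "packet": (0.95, 0.05),
}
kws = extract_keywords(["network", "packet"], corpus, k=2)
```

The observed weakness follows directly from the ranking criterion: nothing constrains the output to domain-specific nouns, only to words close to the overall text vector.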
We will further investigate this problem, in particular by filtering keywords based on their specificity and their POS tags. We also tried another method, which is promising but much more complicated. This method is based on the integration of a cognitive model of text comprehension and a model of word meaning: it combines the Construction-Integration model (W. Kintsch, 1998) and LSA. An implementation is presented in Lemaire, Denhière, Bellissens and Jhean-Larose (2006). Each sentence of the text is successively considered by the model, which first retrieves semantic neighbors of each word and then keeps only those relevant with respect to the general meaning of the portion of the text analyzed so far. In other words, neighbors that are close to a word but not to the text vector are ruled out. For example, if the sentence is “how planes fly”, the model may retrieve the words “bird, wing, airplane” as neighbors of “fly”, but “bird” will be removed when compared to the vector of the sentence. At the end, each word and sentence is given an activation value, which is proportional to its contribution to the general construction of the text meaning. This model has been shown to account well for the comprehension of very well controlled material. However, when applied to raw texts, it did not provide satisfactory results: only very general words are highly activated. We need to investigate that issue further. Let us now describe the way the feedback, based on the cognitive models using LSA described in the previous section, can be displayed to students. We chose to design the feedback (see Figure 11, part 2) in only one format: the sentence with a detected problem is underlined and a tooltip warns the student of the problem.
For example, in Figure 9, there are two detected problems in the underlined sentence: a coherence problem (the two contiguous sentences are semantically far from each other) and a pertinence problem (the sentence does not appear to be very important). We believe this format is not cognitively demanding.

Figure 9 — Feedback on a given sentence of the synthesis, prompted in a tooltip. "Sentence 2: — lacks coherence: this sentence and the previous one [and the following one] are semantically far from each other. — lacks pertinence: this sentence appears to be not so important."

The service also delivers feedback about the completeness of the synthesis. This feedback is not currently related to synthesis topics (because further validation tests are necessary). However, for each sentence of the course text, the feedback indicates whether it is semantically linked to a sentence of the synthesis (see Figure 10).

Figure 10 — The tooltip delivers feedback on the extent to which a sentence of the synthesis covers parts of the source text: "No sentence of your synthesis is related to this one."

No reflexive feedback is implemented in Version 1 yet. We propose the use of a notepad in which learners can write questions and notes about their learning process (Hadwin & Winne, 2001). The notepad is divided into four panes according to the types of utterances Scardamalia and Bereiter (1996) identified in their analysis of CSCL chat discussions:
– issues addressed in the read texts;
– ideas from the read texts to report and summarize in a synthesis;
– ideas from the participants' individual knowledge;
– encountered problems, questions and points to detail.
This notepad allows students to monitor their learning and the writing of the synthesis. It also allows the collection of data to formalize and test a future computer-based reflexive feedback to be implemented in Version 2.
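The coherence and pertinence diagnoses illustrated in Figure 9 can be sketched as follows (a minimal numpy illustration of the detection side only, not the tooltip interface; sentence vectors would come from the LSA space, and the thresholds are placeholders, not those of the actual service):

```python
import numpy as np

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b) / denom if denom else 0.0

def diagnose(sentence_vecs, coh_thresh=0.2, pert_thresh=0.2):
    """Flag sentences that are semantically far from the previous
    sentence (coherence) or from the whole text (pertinence)."""
    text_vec = np.sum(sentence_vecs, axis=0)
    problems = {}
    for i, v in enumerate(sentence_vecs):
        msgs = []
        if i > 0 and cosine(v, sentence_vecs[i - 1]) < coh_thresh:
            msgs.append("lacks coherence: far from the previous sentence")
        if cosine(v, text_vec) < pert_thresh:
            msgs.append("lacks pertinence: not central to the text")
        if msgs:
            problems[i] = msgs
    return problems

# Toy sentence vectors: the third sentence drifts away from the others
vecs = [np.array([1.0, 0.0]), np.array([0.9, 0.1]), np.array([0.0, 1.0])]
print(diagnose(vecs))
```

Each flagged sentence index would then be underlined in the interface, with the messages shown in the tooltip.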
This service delivers feedback to students about the quality of their course synthesis, on a semantic basis. The students log on and select a course domain to synthesize. They can then freely perform the following activities: read course texts, search for additional texts, write a synthesis, ask for feedback on the synthesis or, lastly, fill in their notepad with possible research questions, ideas to put in the synthesis, and their difficulties or questions. The search for additional texts according to the texts already read and understood, and the effect of these readings, has already been tested with the Showcase (see Deliverable 3.2). During the next Version 1 validation process we will not test this part of the feedback, even though it will later be integrated in Version 2.

Results: Architecture of T 5.2 Service

Layers
The service is organized in four layers: client, service, application logic and storage.
Client layer. Clients (students) use a web interface that lets them read course texts and synthesize them. The interface code is in HTML/Javascript.
Service layer. This layer links the interface with LSA and/or the database. It relies on AJAX: depending on the user's actions, parameters are sent to the server with the XMLHttpRequest object and processed by a PHP script, which in turn invokes the C programs (for using LSA) or issues MySQL queries.
Application logic layer. C programs are invoked to run LSA with the passed parameters, or to retrieve data directly from files. The LSA application returns a result file containing the required semantic proximities. The service layer transforms this file to communicate the data to the client layer. We use a 24 million-word corpus gathered from newspaper articles, which allows the transfer of experimental results to domains other than the one tested.
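New documents (e.g., a student's synthesis) have to be compared within this precomputed semantic space without rebuilding it. This projection, known as "fold-in", can be sketched with numpy as follows (a toy term-document matrix stands in for the real corpus; the actual service uses the Bellcore and R implementations):

```python
import numpy as np

# Precomputed semantic space: SVD of a tiny term-document matrix
# (rows = terms, columns = documents; the counts are illustrative).
A = np.array([[2., 0., 1.],
              [1., 1., 0.],
              [0., 2., 1.]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)

def fold_in(doc_term_counts, U, s, k=2):
    """Project a new document into the existing k-dimensional space
    instead of recomputing the whole SVD:
        d_hat = Sigma_k^{-1} U_k^T d
    """
    return np.diag(1.0 / s[:k]) @ U[:, :k].T @ doc_term_counts

new_doc = np.array([1., 0., 2.])   # term counts of an unseen document
print(fold_in(new_doc, U, s))
```

Folding in a document that was already in the corpus reproduces its original coordinates, which is a useful sanity check for this kind of projection.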
We are using a version with the original Bellcore application (which is maintained by WP 5.2 staff), and a version using the R programming language (http://www.r-project.org/) is under development in collaboration with WP 2. Our goal is to implement each of the initial feedback functions in turn as web services using R-LSA.
Storage layer. LSA needs a semantic space to function. We computed it from a text corpus, which depends on the user's knowledge level and the studied domain. Since computing a semantic space is time-consuming, we compute the semantic spaces in advance. When LSA has to make comparisons involving a new document (in addition to the corpus), we do not compute a new semantic space but use specific LSA functions (tplus and syn, i.e., the "fold-in" technique). Users' data and the texts are stored in a MySQL database. For the integrated version of the services, we plan to replace the user/password authentication process with login through an openID account.

Interface Description
Currently, the pilot is not yet turned into a widget but is a web interface. After being authenticated, the student selects a course domain; the main page is then displayed (see Figure 11). The main page is split into two areas (parts 1 and 3). At the top, the student can select and read a text (course text or understood additional text) (part 3). At the bottom, the student can write a synthesis (part 1). On the right, one button gives access to a search engine for additional texts (part 4, deactivated in the test of Version 1), another button asks for feedback (part 2), and a third displays the notepad (part 5).

Figure 11 — Flow diagram of version 1 of T 5.2 service.

During the next weeks (as part of D 2.3), we will transform the current web interface into a web widget.
The implementation of a 5-part widget is planned:
– synthesis,
– feedback,
– management of course reading and additional texts understanding,
– additional texts search engine,
– notepad.

Data Conceptual Description
The first version of the system needs a database to store the course and additional texts, as well as users' data (name, password, synthesis, read texts). We represent the relations between data in the Merise formalism below. The Data Conceptual Model represents the data in a formal representation, while the relational tables describe the tables used in the MySQL database.

Data Conceptual Model
Figure 12 – Data Conceptual Model of T 5.2 Version 1 Service.

Relational Tables
USERS = { username ; password }
In the current version of the service, this table is used to log users in. We will adapt it to store open identifiers (openID authentication) when widgetizing the service.
READTEXT = { read_id ; addtext_id ; username ; read_understood }
This table records which additional texts the user has read and whether the user understood them.
ADDTEXT = { addtext_id ; addtext_title ; addtext_author }
This table stores the title and author of each additional text.
LINESADDTEXT = { linesaddtext_id ; addtext_id ; linesaddtext_nb ; linesaddtext_sentence ; linesaddtext_endparag ; linesaddtext_bold ; linesaddtext_italic ; linesaddtext_underline ; linesaddtext_color ; linesaddtext_highlight }
This table stores the content of additional texts.
HOLD2 = { hold2_id ; addtext_id ; domain_id }
This table links additional texts to course domains.
This relation is specified by teachers.
DOMAIN = { domain_id ; domain_title }
This table stores the titles of course domains.
SYNTHESIS = { synthesis_id ; notepad_section1 ; notepad_section2 ; notepad_section3 ; notepad_section4 }
This table stores the users' syntheses and the content of the notepad.
LINESSYNTHESIS = { linessynthesis_id ; synthesis_id ; linessynthesis_nb ; linessynthesis_sentence ; linessynthesis_endparag ; linessynthesis_bold ; linessynthesis_italic ; linessynthesis_underline ; linessynthesis_color ; linessynthesis_highlight }
This table stores the content of syntheses.
WORK = { work_id ; username ; domain_id ; synthesis_id ; work_begin ; work_last }
This table records which synthesis is written by which user about a course domain.
COURSE = { course_id ; domain_id ; course_title }
This table stores the title of a course and the linked domain.
LINESCOURSE = { linescourse_id ; course_id ; linescourse_nb ; linescourse_sentence ; linescourse_endparag ; linescourse_bold ; linescourse_italic ; linescourse_underline ; linescourse_color ; linescourse_highlight }
This table stores the content of course texts.

Conclusion for T5.2
Version 1 of the T 5.2 service presented above allows learners to be assessed during their written production through immediate computer-based feedback on the coherence, relevance and completeness of the written synthesis. This feedback engages learners in a reflexive process concerning the way they build knowledge, and relies on cognitive models currently under validation. After validation of this service, we plan to propose a more integrated version, which will handle cognitively and pedagogically sound computational models for assessing students' written production.

3. Integration and Validation of Services

3.1 WP2 Integration
The entire system is based on a client-server architecture.
The server side is implemented as web services that generate support and feedback information. The client side of the system consumes these web services, processes the data it receives and presents the useful information in a user-friendly interface. In order to smooth the integration between the e-learning environment and the user interface of the provided services, the latter is designed to be rendered as a set of widgets inside the environment.
Server Side. The server side of the system is implemented using Java technology, while its interface to the rest of the world is implemented as web services. These are built on top of the Apache Axis2 web services framework (version 1.5 - http://ws.apache.org/axis2/), which in turn runs in a Java servlet container, in our case Apache Tomcat (version 6.0). The more lightweight RESTful web services offered by the framework are employed in the system. Although the entire communication relies on the HTTP protocol, the Axis2 framework allows for a lot of flexibility, especially concerning the output formats of the implemented web services. Since Axis2 comes with several implementations of message formatters, we can, for example, use both the default Simple Object Access Protocol (SOAP) message format and the Javascript Object Notation (JSON) format at the same time, without any code or configuration changes. The format of the response is resolved based on the content type of the HTTP request.
Client Side. The user interface of the system uses the default web building blocks, HTML and Javascript, together with AJAX, to call web services whose outputs are rendered using widgets. The widget system allows for easy integration in any learning environment, while the AJAX technology is used for getting and presenting the feedback information inside the widget without any page refresh.
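The content-type-driven choice of response format described for the server side can be illustrated with a simplified sketch (in the real system Axis2's pluggable message formatters handle this via configuration, not hand-written code; the function and envelope below are illustrative):

```python
import json
from xml.sax.saxutils import escape

def format_response(payload: dict, content_type: str) -> str:
    """Serialize the same payload as JSON or as a minimal XML envelope,
    depending on the content type of the request (simplified sketch)."""
    if "json" in content_type:
        return json.dumps(payload)
    # Minimal XML envelope standing in for a SOAP-style response
    body = "".join(f"<{k}>{escape(str(v))}</{k}>" for k, v in payload.items())
    return f"<response>{body}</response>"

print(format_response({"feedback": "lacks coherence"}, "application/json"))
```

The point is that the service logic produces one payload, and the serialization is selected per request, which is what lets SOAP and JSON clients coexist without code changes.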
The widget system is based on the Wookie framework (http://cwiki.apache.org/WOOKIE/). Wookie is an open-source implementation of the W3C Widgets Candidate Recommendation (http://www.w3.org/TR/widgets/), which allows small web applications to be embedded into web pages and therefore standardizes the packaging formats. Wookie is an Incubator project at the Apache Software Foundation. Using the Wookie framework, the client side of the system actually consists of two components:
1. The Widget Container is the web application that can include widgets in its pages, in our case the learning environment. It takes care of the authentication process and can set properties for the widgets that run inside it. In order to communicate with the Widget Engine, the web application needs some specific code, a plugin.
2. The Widget Engine is the core of the Wookie framework and the widget repository. It provides functionality for adding, editing, removing and, in general, managing the widgets that it hosts. The widget engine is implemented as Java Servlets and runs in the Apache Tomcat container.
The widget itself uses AJAX to consume the provided web services. Here, the JSON message format that the web services can generate comes in very handy, since it is very easy to transform this type of response into Javascript objects that can then be further processed. The Yahoo User Interface library (http://developer.yahoo.com/yui/) that we are using assists us both in managing the AJAX requests and in processing the web service responses, and it integrates smoothly with Wookie. By default, AJAX calls towards external services (external servers) are not permitted by the user's browser, since this would constitute a violation of the Same Origin Policy (http://en.wikipedia.org/w/index.php?title=Same_origin_policy&oldid=314530069).
Therefore, all these requests need to be forwarded through a proxy server, a role played here by the Widget Engine. Wookie provides this functionality by creating, for every web service URL, another URL that actually points to the Widget Engine server, while the original domain name is passed as a parameter. Concerning WP 5.2, we plan to transform the system into several widgets to be used in a learning environment together with services from other WPs. The first version of the system is developed using AJAX to exchange data between the client and the server. This form makes the transformation into widgets smoother, but we have also divided the system into different learning parts (see Figure 13) and maintain a common look and feel. Since WP 5 shares some of the same needs as other WPs (such as user authentication or a text management system), we plan to adapt our source code using the source of other widgets (WP 4.1 for course management or WP 4.2 for versioning). This will make the data of the services more exchangeable between WPs.

Figure 13 — Architecture of widgets.

3.2 WP3 threads
The aim of this section is to explore a bit further the way to integrate WP 5 services in two main instructional settings: informal and formal learning.

Informal Learning Scenarios
In informal learning settings, students can freely explore the working possibilities of WP 5 services, without being directed by explicit pedagogical scenarios. They design their learning environment from scratch and use some widgets taken from our services to facilitate their workflow. Table 1 represents all the possible students' paths across the different LTfLL services. Some of their features are noteworthy: first, the third to fifth services require texts (e.g., course notes, essays, summaries, syntheses) as input, in order to diagnose the student's position or understanding.
Second, the first two services focus on the assessment of students' pre-knowledge, so they are likely to be used at the beginning of learning. A sample workflow of a student across our services could be as depicted in Figure 14. Let us present Maria Smith (see the "additional integration report", pp. 20–21). She is enrolled in several degree courses at her college to get a qualification in the newest trends in IT (e.g., Web 2.0). At the beginning of a week, Maria decides to start learning one of the topics of the course: "LSA-based systems for learning". She connects to the LTfLL web platform, reflects on the work to come and arranges the set of widgets she plans to use accordingly:
– a common PDF reader/annotator widget, for reading and annotating course documents;
– several widgets for analyzing chat and forum sessions (WP 5.1) performed with a chat system or in a discussion forum, for getting feedback about her own performance and that of the teams she participated in;
– a common word-processing window, for writing the summaries and syntheses related to her understanding of the course content;
– a learning material searcher widget (WP 6.1), because she is not very sure of being able to understand all the notions of this difficult course (so her peers said);
– a level "determinator" widget (WP 4.1), because she quickly has to grasp some core notions of the course topic, and this widget is very useful for indicating the adequate level (or section, chapter) at which to start a course;
– an overall understanding assessment widget (WP 5.2), because she quickly has to figure out whether she adequately understood the first pieces of the course.
This first widget selection is rather time-consuming. Since her teacher and the tutors can be informed of her choice, Maria thinks this time is a good investment, because each widget smartly interacts with the others in order to help her learn. Table 1 below depicts the following use path.
She first uses the level determinator (4.1 Service) to assess her initial knowledge, and types some keywords to find parts of courses of interest. She then uses the PDF reader & annotator widget and takes some notes that could later compose her course synthesis. These notes, in turn, are entered as input in the learning material searcher widget (6.1 Service), which gives some information on the possible course level at which to actually start the course. She then sets up, in her web browser, the four following widgets:
– the PDF reader/annotator, for reading further course content;
– a word processor, for writing the synthesis of the course, gathering the main pieces of the course she understood;
– a chat analysis widget window;
– the overall understanding assessment widget (5.2 Service), for asking for intermediate feedback on the written syntheses.

Table 1 – Possible Paths for a Student's Workflow using our Services. Legend: S: Student; X: likely succession; XX: most likely succession. Read: when students are using the 6.1 service (first row), they are most likely to use the 6.2, 4.1 and 5.2 ones (resp. 2nd, 3rd and 5th columns).
1 Learning Material Searcher (6.1): X XX XX XX
2 Assess S Pre-Knowledge & Connect to other people (6.2): XX X X XX
3 S Level Determination. Text as input (4.1): XX X X XX XX
4 S Conceptual Understanding. Text as input (4.2): XX X XX X XX X
5 S Overall Individual Understanding. Text as input (5.2): XX XX XX X XX
6 Collective Knowledge Building. Discussion as input (5.1): XX XX XX X

Figure 14 – A likely student's workflow across LTfLL services.

Formal Learning Scenarios
In more formal learning settings, students are immersed in specific learning scenarios designed by teachers. The WP 5.2 task is focused on summary and synthesis writing, which are core activities in distance learning.
This activity is not only involved in very common situations, like note-taking during courses, but is also at hand in more sophisticated learning activities. Bonk and Dennen (2003, p. 340) listed four main kinds of pedagogical activities in distance learning. It is noteworthy that most of the activities they describe involve summarizing and chatting at one or more steps of their process.
– Motivational and ice-breaking activities, allowing participants to introduce themselves to each other. These activities are mainly performed through chat or forum.
– Creative-thinking activities, like brainstorming, role-play, topical discussions, web-based explorations and readings, performed through chat, forum, and also web-based searches.
– Critical-thinking activities, like electronic polling, the Delphi technique, summary writing, case analyses, web resources evaluation, virtual debates. As with the previous activities, these involve the use of chat, forum, and word processor (summary writing).
– Collaborative learning activities, like structured controversy, expert panels, problem-solving activities, publishing work. These latter activities need a strong group organization and the use of combined pieces of software like chats, collaborative writing processors, etc.

3.3 Collaboration with WP 4 and WP 6
As expressed in the integration report, we plan to strengthen our collaboration efforts with WP 4 and WP 6. The way a document can be annotated to support learning will be jointly determined (together with WP 4, 5 and 6, since all of them use or plan to use the formulation of questions for learning). We plan to devise a set of annotation templates that fit the most common pedagogical or learning intents (e.g., "Introduction, related work, our ideas, evaluation, conclusions", or "New information, New idea, I need to understand, Further Explanation, My theory").
These templates can also depend on the subject domain taught (see http://kbtn.cite.hku.hk and Scardamalia, Bereiter and Lamon, 1994) and will be integrated in our notepad (T 5.2). Task T 5.1 could benefit in its analysis from some of the modules developed for positioning the learner in WP 4, for example topic detection and the suffix array algorithm for the identification of repeating phrases. Regarding WP 6, in Version 2 we intend to develop a module that uses ontologies in conversation processing. From another perspective, WP 5 could also use annotated chats and forums for peer search.

3.4 WP7 Validation Plans
The validation goals for the T 5.1 service are to investigate the extent to which:
– learners get useful feedback immediately after they finish a chat discussion and just-in-time for forums;
– C&F-AFS offers a graphical visualization that improves the understanding of the conversations;
– the time needed to provide final feedback and grading is reduced;
– it increases the quality of the feedback resulting from analyzing collaborative chat sessions and discussion forums;
– it is easier to maintain consistency of feedback between different tutors;
– the system offers formative feedback to adapt and improve the course by harvesting the large volume of data produced by the learners;
– collaboration mediated by C&F-AFS improves the learning outcomes of the learners.
The feedback service for chat conversations will be validated during the Human-Computer Interaction (HCI) course, running for 14 weeks in the first semester (ending in February 2010) at the Computer Science Department, "Politehnica" University of Bucharest. The validation will involve the following participants:
– at least 8 undergraduate students, year 4 (senior year);
– 4 tutors / teaching assistants for the HCI course;
– 1 professor for the HCI course.
The forum feedback service will be validated at the University of Manchester, using a forum from the medical domain on the topic "professional behaviour for medical students". The participants are: 8 students, 1 student-facilitator and 2 program managers. The validation procedure concerning the T 5.2 service is under way. It consists of proposing Pensum to 60 first-year students engaged in a distance learning course provided by CNED (Centre National d'Enseignement à Distance, a French open university), through a WebCT platform. These students have to perform a case study; their task is to read a set of papers carefully and to collectively write a synthesis. In order to understand the content of the papers, each student is invited to freely use Pensum as an external and individual help. Qualitative and quantitative analyses, along the same lines as those performed for the Showcase validation, will be carried out. In parallel, we plan to design and test more cognitive models of topic extraction.

4. Conclusions: Tools and Resources for Second Cycle of LTfLL
We presented in this report two lines of services that focus on written production (chat or forum conversations and course syntheses) and ways to assess them. Although these two lines appear to be separate (the first being social-centered and the second essay-centered), they nonetheless share many assumptions:
– written productions can be viewed as voices that populate the classroom, among others those of the teacher, tutors, handbook authors, peers, etc.
The inter-animation between these voices can be uncovered with NLP techniques, revealing their relations;
– self-regulated learning—an important characteristic of lifelong learning, since students have irregular contact with their tutors and teachers—can be considered as a loop in which artifacts and peers help students make explicit what knowledge is built and how this knowledge is built;
– ways to highlight the importance of some parts of the text produced by the students (i.e., utterances, paragraphs) are crucial to foster students' understanding and knowledge building, since they direct their attention to the most important pieces of text;
– providing students with computer-based artifacts that analyse and display the main features of their written production helps them build knowledge.
The next round of research will be dedicated to the following points. First, to design and manage experiments for validating the services in educational settings. Second, to refine some of these experiments to explore two main research paths: social psychology-based hypotheses (e.g., to what extent our services can help students not to feel isolated) and cognitive ones (e.g., to what extent reference to the self and dialogicity help students understand the content of courses). Third, to explore ways to operationalize Bakhtin's theory for lifelong learning, by providing a comprehensive guide accounting for the core notions of this theory (e.g., "dialogicity" of the interactions in an environment to assess the quality of learning/teaching, the utterance and its boundaries, inter-animation of voices with cohesion-based measures, etc.). Finally, to provide graphically oriented interfaces that give students a comprehensive perspective on the knowledge built (like that of O'Rourke & Calvo, 2009).
Taking these points into account would allow us to design and implement Social and Knowledge Building software (Code & Zaparyniuk, 2009), whose purpose fully suits our own research goals and can be summarized as providing students with:
– shared spaces representing collective contributions;
– ways to link and reference ideas and their development;
– ways to represent higher-order organizations of ideas;
– ways for the same idea to be worked on in various contexts;
– different kinds of feedback systems to enhance self- and group-monitoring;
– opportunistic linking of persons and groups;
– ways for different user groups to customize the environment.

5. Appendices

Appendix 1 — Description of our Services as Fostering Self-regulated Learning
Fostering self-regulated learning for lifelong learning is one of our main goals in this project (see also Deliverable 4.2). Our previous state of the art on feedback (see Deliverable 5.1, section 2.3) underlined the need to promote alternative ways to deliver feedback on writing (Lindblom-Ylänne, Pihlajamäki, & Kotkas, 2006). In doing so, we plan to blend two forms of feedback—verification and elaboration—to improve its effectiveness (Kulhavy & Stock, 1989) and to support students in their knowledge building. We can now elaborate on how our services support personal and collaborative learning (more information in the "Additional integration report", pp. 8 et sq.). First of all, each of our services can foster learning in a self-regulated way. Vovides et al.'s (2007) loop represents the tiers in which students are involved during learning. For T 5.1 this loop is presented in Figure 15. The student discusses in a chat or forum. She evaluates whether her utterances fit her goals.
The T 5.1 service then provides feedback that the student can compare with her own image of what she uttered.

Figure 15 — Self-regulated learning loop for students using T 5.1 service.

Let us show such a loop also for the T 5.2 service. Individually, each student can read and write texts concerning a given topic (see Figure 16). At any moment (the "stop and think" strategy described by Vovides et al., 2007), the student can write a specific piece of text on a subject he or she wants to understand (or reuse an already-made one, like course notes) and considers important. This is the object level, and this piece of text can be close to a synthesis. The second step is a first kind of assessment, either by the learner or a peer, in which the student assesses the purpose and the quality of the text (related to the course). Third, the student can be prompted with some information on the relevance of the topics mentioned, the coherence of the synthesis, and its completeness. Then the 'meta' level starts, in which the student is asked to monitor his or her written production (e.g., the synthesis), that is, to compare the feedback of the service with his or her own assessment. In turn, the initial text can be modified in light of the last two steps for a new round.
Figure 16 — Self-regulated learning loop for students using 5.2 service.

Appendix 2 – The extended pattern language

Elementary Expressions
The elementary expressions that may be used in the search are:
word – matches any occurrence of 'word'
"text" – matches any occurrence of 'text'
* – matches any string of 0 or more characters
<D "stem"> – matches any word which has the stem 'stem'
<S "word"> – matches any synonym of "word"
<ref> – matches any word that also appeared in the last 10 utterances
<ref[x]> – matches any word that also appeared in the last x utterances
<NN>, <NNS>, <VB>, etc. – any word that was annotated by the POS-tagger with that label
The tags can be combined in expressions like <S ref> (synonym of a word that appeared in the last 10 utterances), <D ref> (has the same stem as a word that appeared in the last 10 utterances), <S D "word"> (synonym of something that has the stem of "word" or, equivalently, is declined from it). However, not every combination of tags is computable. For example, it cannot be determined for a word "X" whether it is <D NN> (declined from a noun) or not. That is because this version of the program only uses the information from the POS-tagger about the words in the chat, and when saying "X" is declined from "Y", we can no longer tell whether "Y" is a noun or not, because "Y" is not a word from the chat. This is not the only example of an incomputable expression that can be obtained by combining the tags. The program reports such combinations when they occur.

Operators
Three operators are provided in "PatternSearch" for constructing more complex expressions: concatenation, AND and OR.
– Concatenation: Expression1 Expression2. By joining two expressions, a new expression is obtained that matches a text if there exist text1 and text2 such that text1 matches Expression1, text2 matches Expression2 and text = text1 + whitespace + text2.
– AND operator: Expression1 & Expression2 matches a text if both Expression1 and Expression2 match that text.
– OR operator: Expression1 | Expression2 matches a text if Expression1 matches the text or Expression2 matches the text.
Normal parentheses may be used for grouping expressions. Expressions are evaluated from left to right by default, except for parentheses, which are evaluated first. For example, Expression1 Expression2 & Expression3 is equivalent to (Expression1 Expression2) & Expression3 but different from Expression1 (Expression2 & Expression3).

Composite expressions
Composite expressions can be used for finding links between utterances. Their syntax is:
Expression1 # Expression2
Just one "#" can appear in a composite expression. Two consecutive utterances, utterance1 and utterance2, match Expression1 # Expression2 if utterance1 contains a text that matches Expression1 and utterance2 contains a text that matches Expression2. The following forms are also accepted:
Expression1 #[k] Expression2 – utterance1 and utterance2 must be at distance k.
Expression1 #[*] Expression2 – utterance1 and utterance2 can be at any distance (at most 10).
An example of a composite expression:
(What do you think) | (What is your opinion)#I*<S "agree">

Variables
The program allows the definition of variables. If a line from the input contains the assignment string ":=", then the line is interpreted as a variable definition. The name of a variable must begin with the character '$'.
So the syntax is: $variable := Expression

Filters

Sometimes an expression has several matching occurrences. We can specify that only the longest (or shortest) one over the whole chat should be selected: Expression @ max (or Expression @ min). A single utterance may also contain multiple matches of an expression, so it can be useful to select only the longest (or shortest) match from each utterance: Expression @ max_r (or Expression @ min_r).

Appendix 3 – Identification of Lexical Chains

Algorithm

As shown in Carthy & Stokes (2001), a lexical chain is a collection of semantically related words spread throughout a text. In order to determine the lexical chains of a text, we first need to know the semantic distances between the words found in it. The process evaluates each word in the text and places it in the lexical chain where it fits best; if the word does not fit in any of the existing chains, a new chain is created and the word is placed in it. To evaluate how well a word fits a lexical chain, we establish a relationship between the chain and the word: a word can be added to a chain if most of the words in that chain are semantically close to it. The algorithm we used for this task receives the text along with two parameters, threshold1 and threshold2.
Threshold1 represents the maximum semantic distance at which two words are still considered semantically connected, while threshold2 represents the minimum fraction of the words in a chain that must be semantically connected to a given word in order to place that word in the chain:

Lexical_chains(text, threshold1, threshold2)
  FOR EVERY word w in the text
    FOR EVERY existing chain c
      IF c contains w THEN continue with the next word from the text
    FOR EVERY existing chain c
      FOR EVERY word w1 in chain c
        IF sem_dist(w1, w) <= threshold1 THEN increment no_sem_related(c, w)
      IF no_sem_related(c, w) / no_words(c) >= threshold2 THEN
        introduce w in chain c
        stop trying the remaining chains and continue with the next word from the text
      ELSE continue with the next chain
    IF w has not been placed in any chain THEN
      create a new chain and introduce w in it
End.

In the above algorithm, sem_dist(a, b) denotes the semantic distance between a and b, no_sem_related(c, w) represents the number of words from chain c that are semantically related to the word w, and no_words(c) represents the total number of words in chain c. We used two metrics for the semantic distances: shortest path length and Jiang-Conrath; further information about them is given in the Semantic Distances subsection. For the shortest path length, the thresholds used in practice were 90% for threshold2 (the fraction of words in the chain that must be semantically connected to the given word) and 3 for threshold1 (the maximum semantic distance between two related words). The value of 90% for threshold2 was also kept for the Jiang-Conrath metric, but in that case threshold1 had to be split into two limits (inferior and superior), since this metric can also take negative values.
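The pseudocode above can be sketched in Python as follows. The distance function is left abstract (any of the distances from the Semantic Distances subsection could be plugged in); the toy distance and the helper names are illustrative, not part of the original system.

```python
from typing import Callable, List

def lexical_chains(words: List[str],
                   sem_dist: Callable[[str, str], float],
                   threshold1: float,
                   threshold2: float) -> List[List[str]]:
    """Greedy chaining: add a word to the first chain in which at least
    a threshold2 fraction of the members lie within threshold1 of it."""
    chains: List[List[str]] = []
    for w in words:
        if any(w in c for c in chains):        # word already placed in a chain
            continue
        for c in chains:
            related = sum(1 for w1 in c if sem_dist(w1, w) <= threshold1)
            if related / len(c) >= threshold2:
                c.append(w)
                break                           # stop trying the remaining chains
        else:
            chains.append([w])                  # no chain fits: create a new one
    return chains

# Toy distance: 0 inside a hand-made "topic", 10 across topics (illustrative).
topic = {"dog": 0, "cat": 0, "pet": 0, "car": 1, "road": 1}
dist = lambda a, b: 0 if topic[a] == topic[b] else 10
print(lexical_chains(["dog", "car", "cat", "road", "pet"], dist, 3, 0.9))
# [['dog', 'cat', 'pet'], ['car', 'road']]
```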
Therefore, instead of testing sem_dist(w1, w) <= threshold1, we tested threshold1a <= sem_dist(w1, w) <= threshold1b, where threshold1a was -15 and threshold1b was 0.5. The maximum complexity of the above algorithm is card(text) * card(text dictionary) * no_lexical_chains * complexity(sem_dist), where card(text) is the number of words in the initial text, card(text dictionary) is the number of distinct words in the text, no_lexical_chains is the total number of lexical chains obtained, and complexity(sem_dist) is the maximum complexity needed to determine the semantic distance between two words. Creating the lexical chains for a chat containing 104 replies and 156 Wordnet concepts took around 350 seconds.

Semantic Distances

In the previous subsection we discussed lexical chains and the need to know the semantic distances between words in order to build the chains. Many metrics compute such distances, most of them using Wordnet; several possibilities are presented in (Hirst & Budanitsky, 2001). Out of these, we decided to use two different distances: the shortest path length described by Rada (1989) and the Jiang-Conrath distance (Jiang & Conrath, 1997). The first approach is based only on the semantic relations found in Wordnet, and was used in order to evaluate how appropriate these relations are for the task:

dist(c1, c2) = min(number of edges on a path from c1 to c2)

For the second approach we considered Jiang-Conrath, because Hirst and Budanitsky's (2001) analysis of several semantic distances showed that this measure gave the best results:

dist(c1, c2) = 2 * log p(lso(c1, c2)) - (log p(c1) + log p(c2))

In the above formula, lso(c1, c2) is the lowest superordinate of c1 and c2 (the deepest node of Wordnet that subsumes both concepts).
Besides the Wordnet relations, this formula also considers the frequencies of the concepts (denoted by p(c)) in a corpus. We considered the Web as a huge corpus, and extracted the frequency of a concept as the number of documents containing it divided by the total number of documents on the Web written in that language. For English, the total number of documents was taken to be the number of documents containing the word "the", since there is no document written in English that does not contain "the". These values were determined by interrogating a search engine (Google). In order to avoid repeated queries for the same concept, we built a hash table of the values already determined, so that further queries are no longer needed for future occurrences of that concept; every time a new concept is found and its frequency is computed, the value is added to the hash table. The complexity of computing the semantic distance between two concepts using the shortest path length is O(n*m), where n and m are the numbers of senses of the two concepts; this computation takes around 600 milliseconds per pair of concepts. For the Jiang-Conrath distance the complexity is O(N), where N is the length of the shortest path between the two concepts; in this case, determining the semantic distance takes around 1900 milliseconds. This approach needs more time because it involves two more time-consuming actions: determining the lowest superordinate of the two concepts and interacting with the search engine. Our tests showed that the Wordnet ontology alone is not sufficient for evaluating the distance between two concepts, especially domain-specific ones.
To correct this limitation, in the future we will use the domain ontology to augment the relations existing in Wordnet. We also plan to test whether Wikipedia-based similarity can improve the semantic distances between concepts. Regarding the lexical chains, if the results are not satisfactory we plan a new method of introducing words into chains: before deciding where to introduce a new word, the mean distance between that word and all the words in every lexical chain could be computed. The new word would then be introduced into the chain with the smallest mean semantic distance to it, but only if this distance is below a threshold; otherwise, a new chain is created and the word is placed in it.

Appendix 4 – Details on the Evaluation of Interactions

Measures

First, surface metrics are computed over all the utterances of a participant in order to determine factors like fluency, spelling, diction or utterance structure (Page & Paulus, 1968). All these factors are combined into a mark for each participant, without any lexical or semantic analysis of what is actually being discussed. At the same level, reading-ease measures are computed. The next step is grammatical and morphological analysis based on spell-checking, stemming, tokenization and part-of-speech tagging. Finally, a semantic evaluation is performed using Latent Semantic Analysis (LSA; Landauer et al., 1998). To assess how on-topic each utterance is, a set of keywords predefined for all the corpus chats is taken into consideration.
Moreover, at both the surface and the semantic level, metrics specific to social networks are applied in order to properly assess participants' involvement and their similarity with the overall chat and with the predefined topics of the discussion. In order to perform a detailed surface analysis, two categories of factors are taken into consideration at the lexical level: Page's essay-grading proxes and readability. Page's idea was that computers could be used to automatically evaluate and grade student essays as effectively as any human teacher, using only simple measures – statistically and easily detectable attributes (Wresch, 1993). In order to perform a statistical analysis, Page correlated two concepts: proxes (computer approximations of interest) with human trins (intrinsic variables – human measures used for evaluation). The overall results were remarkable: a correlation of 0.71 using only simple measures, which proved that computer programs can predict grades quite reliably – at least, the grades given by the computer correlated with the human judges as well as the humans correlated with each other. Starting from Page's metrics for automatically grading essays, and using Slotnick's method (http://www.readabilityformulas.com/flesch-reading-ease-readability-formula.php) to group them according to their intrinsic values, the following factors were identified in order to evaluate each participant at the surface level:

Table 2 — Categories taken into consideration and corresponding proxes
1. Fluency: number of total characters, number of total words, number of different words, mean number of characters per utterance, number of utterances, number of sentences (which differs from the number of utterances, because an utterance can contain multiple sentences)
2. Spelling: misspelled words; in order to obtain a positive measure (the greater the percentage, the better), the percentage of correctly written words is used
3. Diction: mean and standard deviation of word length
4. Utterance structure: number of utterances, mean utterance length in words, mean utterance length in characters

All the above proxes determine the average consistency of utterances. Although simple, these factors play an important role in discovering the most important person in a chat, in other words in measuring his or her activity. In addition, quantity also matters in the analysis of each participant's utterances. Each factor has the same weight within its quality category, and the overall grade is obtained as the arithmetic mean over all predefined values. All factors, except misspelled words, are converted into percentages in order to scale them and obtain a relative mark for all participants. The second factor taken into account is readability, which can be defined as the reading ease of a particular text, especially as it results from one's writing style. This factor is very important because extensive research in the field shows that easy-reading text (in our case, chats and utterances) has a great impact on comprehension, retention, reading speed and reading persistence. Because readability involves the interaction between a participant and the collaborative environment, several features on the reader's side are essential: prior knowledge, personal skills and traits (for example, intelligence), interest, and motivation. In the chats currently evaluated, the first factor (prior knowledge) can be considered approximately the same for all students, because they all come from the same educational environment and share a common background. The remaining features, on the other hand, vary greatly from one student to another, and the last two are strongly reflected in the students' involvement in the chat.
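The surface proxes of Table 2 reduce to simple counts; a minimal sketch (the function name is illustrative, and the category weighting and percentage scaling are omitted):

```python
def surface_proxes(utterances):
    """Simple counts behind the fluency/diction/structure proxes of Table 2."""
    words = [w for u in utterances for w in u.split()]
    n_utt = len(utterances)
    return {
        "total_characters": sum(len(u) for u in utterances),
        "total_words": len(words),
        "different_words": len({w.lower() for w in words}),
        "mean_chars_per_utterance": sum(len(u) for u in utterances) / n_utt,
        "mean_utterance_length_words": len(words) / n_utt,
        "mean_word_length": sum(len(w) for w in words) / len(words),
    }

print(surface_proxes(["hello there", "I think chains help"]))
```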
Therefore, two key aspects must be taken into consideration: involvement and competency, both evaluated from the social-network point of view and with a semantic approach that is detailed further in this report. Readability is usually judged unconsciously, based on insight into the other chat participants, but for its evaluation a readability formula is used: a formula calibrated against a more labor-intensive readability survey, which matches the overall text with the expected reading level of the audience. Such formulas estimate the reading skill required to read the utterances in a chat and evaluate the overall complexity of the words used, thereby providing a means of targeting an audience. Three formulas were computed:

1. The Flesch Reading Ease Readability Formula (http://www.readabilityformulas.com/flesch-grade-level-readability-formula.php) is one of the oldest and most accurate readability formulas, providing a simple approach to assessing the grade level of a chat participant and the difficulty of reading the current text. The score rates all the utterances of a user on a 100-point scale: the higher the score, the easier the text is to read (though not necessarily to understand). A score of 60 to 70 is considered optimal.
2. The Gunning Fog Index (or FOG) Readability Formula (http://www.readabilityformulas.com/gunning-fog-readability-formula.php) is based on Robert Gunning's observation that newspapers and business documents were full of "fog" and unnecessary complexity. The index indicates the number of years of formal education a reader of average intelligence would need to understand the text on a first reading. A drawback of the Fog Index is that not all multi-syllabic words are difficult; for computational reasons, however, all words of more than two syllables are considered complex.
3. The Flesch Grade Level Readability Formula rates utterances on the U.S. grade-school level.
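The three formulas can be sketched as follows, using their standard published coefficients; syllable counts are approximated here by counting vowel groups, which is only a rough estimate of what a real syllable counter would produce.

```python
import re

def count_syllables(word: str) -> int:
    # crude approximation: count vowel groups
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text: str) -> dict:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    complex_words = sum(1 for w in words if count_syllables(w) > 2)
    w, s = len(words), sentences
    return {
        # Flesch Reading Ease: 100-point scale, higher = easier to read
        "flesch_reading_ease": 206.835 - 1.015 * (w / s) - 84.6 * (syllables / w),
        # Gunning Fog: years of formal education needed on a first reading
        "gunning_fog": 0.4 * ((w / s) + 100 * (complex_words / w)),
        # Flesch-Kincaid Grade Level: U.S. grade-school level
        "flesch_grade_level": 0.39 * (w / s) + 11.8 * (syllables / w) - 15.59,
    }

scores = readability("The cat sat on the mat. It was happy.")
print(scores)
```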
A score of 8.0, for example, means that the document can be understood by an eighth grader. This score makes it easier to judge the readability level of various texts in order to assign them to students. A document whose score is between 7.0 and 8.0 is considered optimal, since it will be highly readable.

For each given chat, the system computes all three formulas and provides the user with detailed information for each participant. Relative correlations between these factors and the manual annotation grades are also computed, in order to evaluate their relevance to the overall grading process.

Social Networks Analysis

In addition to the quantity and quality measures computed from the utterances, social factors are also taken into account in our approach. A graph is generated from the chat transcript based on the utterances exchanged by the participants: the nodes are the participants in the collaborative environment, and the ties are generated from the explicit links between utterances (obtained from the explicit referencing facility of the chat environment used; Holmer, Kienle & Wessner, 2006). From the social-network point of view, various metrics are computed in order to determine the most competitive participant in the chat: degree (in-degree, out-degree), centrality (closeness centrality, graph centrality, eigenvalues) and a user ranking similar to the well-known Google PageRank algorithm (Ridings & Shishigin, 2002). These metrics are applied on:
- the effective number of interchanged utterances between participants, providing a quantitative approach;
- the sum of utterance marks based on a semantic evaluation of each utterance (the evaluation process is discussed later); based on the results obtained for each utterance, a new graph is built on which all the social metrics are applied.
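The degree metrics over the interchange matrix can be sketched as follows (a minimal sketch: the matrix is invented, and the centrality and ranking measures are omitted):

```python
# A[i][j] = number of utterances participant i addressed to participant j
# through explicit references (illustrative 3-participant chat).
A = [[0, 4, 1],
     [2, 0, 3],
     [1, 1, 0]]

outdegree = [sum(row) for row in A]        # utterances sent by each participant
indegree = [sum(col) for col in zip(*A)]   # utterances received by each participant

# Scale each factor to a percentage of the overall activity, as described above.
total = sum(outdegree)
outdegree_pct = [100 * d / total for d in outdegree]

print(outdegree, indegree)   # [5, 5, 2] [3, 5, 4]
```

The same computation would be repeated on the second matrix, whose entries are sums of empiric utterance marks instead of utterance counts.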
This provides the basis for a qualitative evaluation of the chat. All the metrics used in the social network analysis are relative, in the sense that they provide markings relevant only in comparison with the other participants in the same chat, not with those from other chats. This is the main reason why all factors are scaled across all the participants, giving each participant a weighted percentage of the overall performance of all participants.

LSA and the Corresponding Learning Process

Latent Semantic Analysis is a technique based on the vector-space model (Manning & Schütze, 1999). It is used for analyzing the relationships between a set of documents and the terms they contain, by projecting both into sets of concepts related to those documents. Our system learns from the words of a chat corpus. The first step of the learning process, after spell-checking, is the elimination of stop words (very frequent and irrelevant words like "the", "a", "an", "to", etc.) from each utterance. The next step is POS tagging; verbs are additionally stemmed in order to reduce the number of distinct forms identified in the chats. All other words are left in their identified forms, with their POS tags attached, because the same word with a different POS tag has a different contextual sense and therefore different semantic neighbors (Landauer et al., 1998). Once the term-document matrix is populated, Tf-Idf (term frequency - inverse document frequency; Lemaire, 2008) is computed. The final steps are the singular value decomposition (SVD) and the projection of the matrix in order to reduce its dimensionality. According to Wiemer-Hastings & Zipitria (2001), the empirically optimal value for k is 300, a value on which multiple sources agree and which is used in the current experiments. Another important aspect of the LSA learning process is segmentation, the process of dividing the chats into meaningful units.
In the current implementation, the chat is divided by participant, because of the assumed unity and cohesion between utterances from the same participant. These documents are afterwards divided into segments using fixed non-overlapping windows. LSA evaluates the proximity between two words by the cosine measure:

Sim(word1, word2) = Σ_{i=1..k} word1,i · word2,i / (sqrt(Σ_{i=1..k} word1,i²) · sqrt(Σ_{i=1..k} word2,i²))   (1)

Similarities between utterances, and between an utterance and the entire document, are used to assess the importance of each utterance compared with the entire chat or with a predefined set of keywords treated as a new document:

Vector(utterance) = Σ_i (1 + log(no_occurrences(word_i))) · vector(word_i)   (2)

Sim(utterance1, utterance2) = Sim(Vector(utterance1), Vector(utterance2))   (3)

The Utterance and Participants' Evaluation Process

The Utterance Marking Process

The first aspect to be taken care of is building the graph of utterances, which highlights the correlations between utterances on the basis of explicit references. In order to evaluate each utterance, after the morphological and lexical analysis is finished, three steps are processed:

1. Evaluate each utterance individually, taking into consideration the following features:
- the effective length of the initial utterance;
- all keywords remaining after stop-word elimination, spell-checking and stemming, and their numbers of occurrences;
- the level at which the current utterance is situated in the overall thread;
- the correlation/similarity with the overall chat;
- the correlation/similarity with a predefined set of topics of discussion.
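Equations (1)-(3) can be sketched as follows; the two-dimensional "LSA space" is a toy stand-in (in the real service the vectors come from the SVD-reduced, Tf-Idf-weighted term-document matrix with k = 300, which is not reproduced here).

```python
import math

def cosine(v1, v2):
    """Equation (1): cosine similarity between two k-dimensional vectors."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norms = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(b * b for b in v2))
    return dot / norms

def utterance_vector(word_counts, lsa_vectors):
    """Equation (2): sum of (1 + log n_occurrences) * LSA vector per word."""
    k = len(next(iter(lsa_vectors.values())))
    vec = [0.0] * k
    for word, n in word_counts.items():
        weight = 1 + math.log(n)
        vec = [v + weight * c for v, c in zip(vec, lsa_vectors[word])]
    return vec

# Toy 2-dimensional "LSA space" (illustrative values only).
lsa = {"chat": [1.0, 0.2], "forum": [0.9, 0.3], "apple": [0.1, 1.0]}
u1 = utterance_vector({"chat": 2}, lsa)
u2 = utterance_vector({"forum": 1}, lsa)
print(round(cosine(u1, u2), 3))   # Equation (3): similarity of the two utterances
```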
Furthermore, this mark combines a quantitative approach (the length of the utterance, starting from the assumption that a piece of information is more valuable if transmitted in multiple linked messages and expressed in more words that are meaningful in context, not merely meant to impress) with a qualitative one (the use of LSA and keywords). In the process of evaluating each utterance, the semantic value is estimated through the similarity between the terms used in the current utterance (those remaining after the preliminary processing) and the whole document, respectively a list of predefined topics of discussion. The formulas used for evaluating each utterance are:

mark_empiric = (9/10 · Σ_{word ∈ remaining} mark(word) + 1/10 · length(initial_utterance)) · emphasis   (4)

mark(word) = length(word) · (1 + log(no_occurrences))   (5)

emphasis = (1 + log(level)) · (1 + log(branching_factor)) · Sim(utterance, whole_document) · Sim(utterance, predefined_keywords)   (6)

2. Emphasize utterance marks. Each thread has a global maximum, around which all utterance marks are increased according to a Gaussian distribution:

p(x) = 1/(σ · sqrt(2π)) · e^(-(x-μ)²/(2σ²)),   (7)

where:

σ = (max(id_utter_thread) - min(id_utter_thread)) / 2;   (8)

μ = id_utterance_with_highest_mark.   (9)

Each utterance mark is therefore multiplied by a factor of 1 + p(current_utterance).

3. Determine the final grade for each utterance in the current thread. Based upon the empiric mark, the final mark of each utterance in its thread is obtained as:

mark_final = mark_final(prev_utter) + coefficient · mark_empiric,   (10)

where the coefficient is determined by the type of the current utterance and of the one to which it is tied.
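Steps 2 and 3 can be sketched as follows. This is a minimal sketch over toy empiric marks: Equations (4)-(6) are assumed already applied, and the speech-act coefficient matrix is replaced by a single illustrative constant.

```python
import math

def gaussian_emphasis(marks):
    """Equations (7)-(9): boost marks around the thread's highest-marked utterance."""
    mu = max(range(len(marks)), key=lambda i: marks[i])    # (9)
    sigma = (len(marks) - 1) / 2 or 1.0                    # (8), utterance ids 0..n-1
    def p(x):                                              # (7)
        return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
    return [m * (1 + p(i)) for i, m in enumerate(marks)]   # factor 1 + p(x)

def final_marks(empiric, coefficient=0.8):
    """Equation (10): the thread grade accumulated utterance by utterance;
    a negative coefficient would lower the thread's grade."""
    final, prev = [], 0.0
    for m in empiric:
        prev = prev + coefficient * m
        final.append(prev)
    return final

boosted = gaussian_emphasis([1.0, 3.0, 2.0])
print(final_marks(boosted))
```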
For the coefficient determination, the identification of speech acts plays an important role: verbs, punctuation signs and certain keywords are inspected. Starting from a set of predefined types of speech acts, the coefficients are obtained from a predefined matrix. These predefined values were determined by analyzing and estimating the impact of the current utterance considering only the previous one in the thread (similar to a Markov process). The grade of a discussion thread may be raised or lowered by each utterance; therefore, depending on the type of an utterance and the identified speech acts, the final mark may take a positive or a negative value.

Participant Grading

The in-degree, out-degree, closeness and graph centrality, eigenvalue and rank factors are applied both to the matrix of numbers of interchanged utterances between participants and to the matrix that uses the empiric mark of each utterance instead of the default value of 1. In the second approach quality, not quantity, is what matters (an element [i, j] equals the sum of mark_empiric over the utterances from participant i to participant j), providing a deeper analysis of chats that combines the social-network approach with a semantic utterance evaluation. Each of the analysis factors (applied to both matrices) is converted to a percentage (the current grade divided by the sum of all grades for that factor, except for eigenvector centrality, where the conversion is made by multiplying the absolute value of the corresponding eigenvalue by 100). The final grade takes into consideration all these factors (including those from the surface analysis) and their corresponding weights:

final_grade_i = Σ_k weight_k · percentage_k,i,   (11)

where k ranges over the factors used in the final evaluation of participant i, and the weight of each factor is read from a configuration file.
After all the measures are computed, the Pearson correlation of each factor with the grades given by human evaluators is determined, providing a means to assess each factor's importance and relevance relative to the manual grades taken as reference. General information about the chat – for example, the overall grade correlation and the absolute and relative correctness – is also determined and displayed by the system.

Optimizing each Metric's Grade

The goal of the designed algorithm is to determine the optimal weights for each factor so as to obtain the highest correlation with the manual annotator grades. A series of constraints had to be applied:
- minimal/maximal values for each weight – for example, a minimum of 2% so that at least a small part of each factor is taken into consideration, and a maximum of 40% so that all factors get a chance and the solution does not degenerate into 0% for all factors except the one with the best overall correlation at 100%;
- the sum of all weights must be 100%;
- the mean correlation over all chats in the corpus must be maximized.

The system has two components:
1. A neural network based on a perceptron, used to obtain fast solutions as inputs for the genetic algorithm. The main advantages of using this single-neuron network are:
- the capacity to learn and adapt from examples;
- fast convergence;
- numerical stability;
- search in the weight space for an optimal solution;
- the duality and correlation between inputs and weights.
2. A genetic algorithm used for fine-tuning the solutions given by the neural network, while respecting the predefined constraints.

The solution for determining the optimal weights combines the two approaches in order to benefit from both: numerically stable solutions from the neural network and the flexibility of the genetic algorithm in adjusting these partial solutions.
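A much simplified stand-in for the optimizer described above: the perceptron and the genetic algorithm are replaced by a constrained random search, and the factor values and manual grades are invented. Only the constraint handling (bounds plus a 100% sum) and the Pearson objective follow the report.

```python
import random

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def constrain(w, lo=0.02, hi=0.40):
    """Clip each weight to [2%, 40%], then renormalize so they sum to 100%."""
    w = [min(max(x, lo), hi) for x in w]
    s = sum(w)
    return [x / s for x in w]

def grade(weights, factor_values):
    # final_grade_i = sum_k weight_k * percentage_{k,i}  (Equation 11)
    return sum(w * v for w, v in zip(weights, factor_values))

# Illustrative data: per-participant factor values and manual grades.
factors = [[3, 1, 6], [5, 2, 4], [2, 7, 1], [4, 4, 3]]
manual = [7.0, 9.0, 4.0, 8.0]

random.seed(0)
best_w, best_r = None, -1.0
for _ in range(2000):                         # random search stand-in for the GA
    w = constrain([random.random() for _ in range(3)])
    r = pearson([grade(w, f) for f in factors], manual)
    if r > best_r:
        best_w, best_r = w, r
```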
System Evaluation

The initial running configuration used by the system was: 10% for Page's grading, 5% for the social network factors applied to the number of interchanged utterances, and 10% for the semantic social network factors applied to utterance marks. The overall results obtained with these weights are:
- relative correctness: 77.44%;
- absolute correctness: 70.07%;
- correlation: 0.514.

Relative and absolute correctness represent relative/absolute distances in a one-dimensional space between the annotator's grade and the grade obtained automatically by the system. Finally, the overall results (arithmetic means of each of the 3 individual measures determined per chat) are also displayed. The results after multiple runs of the weight optimization system (all with 4 concurrent populations) show that the most importance in manual evaluation is given to the following factors:

Table 2. Results after multiple runs of the weight optimization system, restricted to factors with a corresponding percentage ≥ 10%
- 20-25%: Page's grading methods, i.e. surface-analysis factors only;
- 10-15%: in-degree, from the social network's point of view, applied to the number of interchanged utterances;
- 30-40%: out-degree, also determined by the number of outgoing utterances – in a way, a measure of a participant's gregariousness;
- ≈ 10%: semantic graph centrality – the only higher-importance measure that relies on utterance marks.

All the remaining factors are weighted below 5% and therefore do not have a high importance in the final grading process. The overall results after correlation optimization are:
- relative correctness: ≈ 46.83%;
- absolute correctness: ≈ 45.70%;
- correlation: ≈ 0.594.

The spikes in each population's average fitness are caused by newly inserted individuals or by population reinitialization.
After the first 10 iterations important improvements can be observed, whereas after 30 generations the optimum chromosomes of each population stagnate; only population reinitializations and chromosome interchanges provide minor improvements to the current solution. Our results entail the following conclusions:
- the human grading process uses a predominantly quantitative approach;
- uncorrelated evaluations and the different styles/principles used by different human annotators are the main causes of the lowered overall correlation and correctness;
- the improvement in correlation came at the expense of absolute/relative correctness;
- the genetic algorithm can be considered to converge after 30 generations.

Conclusions and Future Improvements

The results obtained allow us to conclude that the evaluation of a participant's overall contribution in a chat environment is achievable. We also strongly believe that further tuning of the weights, better LSA learning and an increased number of social network factors (including factors applied to the entire network) will improve the performance and reliability of the results. Moreover, a subjective factor is present in manual evaluation and influences the overall correctness.
In the future, the following improvements are in sight:
- social experiments and the evaluation of their actual impact: different scenarios (with or without the actual presence of a professor) will be conducted and their impact assessed;
- identifying artifacts, voices and different perspectives on the same concept using LSA;
- including common texts (for example, newspaper articles) in order to obtain an overall view of all the words appearing in a text (not only those specific to a domain);
- better segmentation, by including concepts from Tristan Miller's theory of cohesion (Miller, 2004) and by taking the semantic links between utterances into consideration;
- obtaining a larger social network by merging multiple chats: an overall evaluation on the entire corpus, and the possibility of searching for a participant based on his or her evaluated competence on a predefined topic.

6. References

Bakhtin, M. (1981). The Dialogic Imagination: Four Essays. University of Texas Press.
Bakhtin, M. (1984). Problems of Dostoevsky's Poetics (C. Emerson, Trans.). Minneapolis, MN: University of Minnesota Press.
Bereiter, C., & Scardamalia, M. (1987). The psychology of written composition. Hillsdale: Erlbaum.
Bonk, C. J., & Dennen, V. (2003). Frameworks for research, design, benchmarks, training and pedagogy in web-based distance education. In M. G. Moore & W. G. Anderson (Eds.), Handbook of distance education (pp. 331-348). Mahwah: Erlbaum.
Brassart, D. G. (1993). Remarques sur un exercice de lecture-écriture : la note de synthèse ou synthèse de documents. Pratiques, 79, 95-113.
Butler, D. L., & Winne, P. H. (1995). Feedback and self-regulated learning: A theoretical synthesis. Review of Educational Research, 65(3), 245-281.
Carthy, J., & Stokes, N. (2001). Lexical chains for topic detection and tracking. Dublin: University College.
Code, J. R., & Zaparyniuk, N. E. (2009).
The emergence of agency in online social networks. In S. Hatzipanagos & S. Warburton (Eds.), Handbook of research on social software and developing community ontologies (pp. 102-118). Hershey: Information Science Reference.
Dong, A. (2006). Concept formation as knowledge accumulation: A computational linguistics study. Artif. Intell. Eng. Des. Anal. Manuf., 20(1), 35-53.
Dufresne, A., Basque, J., Paquette, G., Leonard, M., Lundgren-Cayrol, K., & Prom Tep, S. (2003). Vers un modèle générique d'assistance aux acteurs du téléapprentissage. STICEF, 1, 57-88.
Dysthe, O. (1996). The multivoiced classroom: Interactions of writing and classroom discourse. Written Communication, 13(3), 385-425.
Fernandez, S., Velazquez, P., & Mandin, S. (2008). Les systèmes de résumé automatique sont-ils vraiment des mauvais élèves ? 9es Journées internationales d'Analyse Statistique des Données Textuelles (JADT'08), Lyon, France.
Flower, L., Stein, V., Ackerman, J., Kantz, M. J., McCormick, K., & Peck, W. C. (1990). Reading-to-Write: Exploring a Cognitive and Social Process. New York: Oxford University Press.
Foltz, P. W., Kintsch, W., & Landauer, T. K. (1998). The measurement of textual coherence with Latent Semantic Analysis. Discourse Processes, 25(2-3), 285-307.
Hadwin, A. F., & Winne, P. H. (2001). CoNoteS2: A software tool for promoting self-regulation. Educational Research and Evaluation, 7(2-3), 313-334.
Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77(1), 81-112.
Hirst, G., & Budanitsky, A. (2001). Lexical chains and semantic distance. Conf. Eurolan-2001.
Holmer, T., Kienle, A., & Wessner, M. (2006). Explicit referencing in learning chats: Needs and acceptance. In W. Nejdl & K. Tochtermann (Eds.), Innovative Approaches for Learning and Knowledge Sharing. First Conference on Technology Enhanced Learning (pp. 170-184). Berlin: Springer.
Hübscher, R., & Frizell, S. (2002). Aligning theory and web-based instructional design practice with design patterns.
In G. Richards (Ed.), Proceedings of the World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education (pp. 298-304). Chesapeake: AACE.
Jiang, J. J., & Conrath, D. W. (1997). Semantic similarity based on corpus statistics and lexical taxonomy. Proc. International Conference on Research in Computational Linguistics (ROCLING X), Taiwan.
Jones, R. H. (2005). Sites of engagement as sites of attention: time, space and culture in electronic discourse. In S. Norris & R. H. Jones (Eds.), Discourse in action: Introducing mediated discourse analysis (pp. 141-154). London: Routledge.
Joshi, M., & Rosé, C. P. (2007). Using Transactivity in Conversation Summarization in Educational Dialog. Proc. of the SLaTE Workshop on Speech and Language Technology in Education.
Jurafsky, D., & Martin, J. H. (2009). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition (2nd ed.). Pearson Prentice Hall.
Kintsch, E., Steinhart, D., Stahl, G., LSA Research Group, Matthews, C., & Lamb, R. (2000). Developing summarization skills through the use of LSA-based feedback. Interactive Learning Environments, 8(2), 87-109.
Kintsch, W. (1998). Comprehension: A Paradigm for Cognition. Cambridge: Cambridge University Press.
Kirkpatrick, L. C., & Klein, P. D. (2009). Planning text structure as a way to improve students' writing from sources in the compare-contrast genre. Learning and Instruction, 19, 309-321.
Koschmann, T. (1999). Toward a dialogic theory of learning: Bakhtin's contribution to learning in settings of collaboration. Paper presented at the Computer Supported Collaborative Learning conference (CSCL '99), Palo Alto, CA (Proceedings pp. 308-313). Retrieved from http://kn.cilt.org/cscl99/A38/A38.htm.
Kulhavy, R. W., & Stock, W. (1989). Feedback in written instruction: The place of response certitude.
Educational Psychology Review, 1(4), 279-308.
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of acquisition, induction and representation of knowledge. Psychological Review, 104(2), 211-240.
Landauer, T. K., Foltz, P. W., & Laham, D. (1998). An introduction to Latent Semantic Analysis. Discourse Processes, 25(2-3), 259-284.
Lemaire, B. (2008). Limites de la lemmatisation pour l'extraction de significations. 9es Journées internationales d'Analyse Statistique des Données Textuelles (JADT'08), Lyon, France.
Lemaire, B., Denhière, G., Bellissens, C., & Jhean-Larose, S. (2006). A computational model for simulating text comprehension. Behavior Research Methods, Instruments, & Computers, 38(4), 628-637.
Lemaire, B., & Dessus, P. (2001). A system to assess the semantic content of student essays. Journal of Educational Computing Research, 24(3), 305-320.
Lin, X. (2001). Designing metacognitive activities. Educational Technology Research & Development, 49(2), 23-40.
Lindblom-Ylänne, S., Pihlajamäki, H., & Kotkas, T. (2006). Self-, peer- and teacher-assessment of student essays. Active Learning in Higher Education, 7(1), 51-62.
Lorino, P. (2009). Concevoir l'activité collective conjointe : l'enquête dialogique. Étude de cas sur la sécurité dans l'industrie du bâtiment. Activités, 6(1), 87-110.
Manning, C., & Schütze, H. (1999). Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.
Miller, T. (2004). Latent Semantic Analysis and the construction of coherent extracts. In N. Nicolov, K. Botcheva, G. Angelova, & R. Mitkov (Eds.), Recent Advances in Natural Language Processing III (pp. 277-286). Amsterdam/Philadelphia: John Benjamins.
Moreno, R. (2009). Constructing knowledge with an agent-based instructional program: A comparison of cooperative and individual meaning making. Learning and Instruction, 19, 433-444.
Newell, G. E. (2006). Writing to learn. In C. A. MacArthur, S. Graham & J.
Fitzgerald (Eds.), Handbook of writing research (pp. 235-247). New York: Guilford Press.
O'Rourke, S. T., & Calvo, R. A. (2009). Analysing semantic flow in academic writing. In V. Dimitrova, R. Mizoguchi, B. du Boulay & A. Graesser (Eds.), Artificial Intelligence in Education. Building learning systems that care: From knowledge representation to affective modelling (AIED2009) (pp. 173-180). Amsterdam: IOS Press.
Page, E. B., & Paulus, D. H. (1968). Analysis of essays by computer: Predicting overall quality. U.S. Department of Health, Education and Welfare.
Palincsar, A. S., & Brown, A. L. (1984). Reciprocal teaching of comprehension-fostering and comprehension-monitoring activities. Cognition and Instruction, 1(2), 117-175.
Puustinen, M., & Pulkkinen, L. (2001). Models of self-regulated learning: A review. Scandinavian Journal of Educational Research, 45(3), 269-286.
Rada, R., et al. (1989). Development and application of a metric on semantic nets. IEEE Transactions on Systems, Man, and Cybernetics, 19(1), 17-30.
Ross, J. A. (2006). The reliability, validity, and utility of self-assessment. Practical Assessment, Research & Evaluation, 11(10), 1-13.
Scardamalia, M., & Bereiter, C. (1996). Adaptation and understanding: A case for new cultures of schooling. In S. Vosniadou, E. de Corte, R. Glaser & H. Mandl (Eds.), International Perspectives on the Design of Technology-Supported Learning Environments (pp. 149-163). Mahwah: Erlbaum.
Scardamalia, M., Bereiter, C., & Lamon, M. (1994). The CSILE project: Trying to bring the classroom into World 3. In K. McGilly (Ed.), Classroom Lessons: Integrating Cognitive Theory (pp. 201-228). Cambridge: MIT Press.
Segev-Miller, R. (2004). Writing from sources: The effect of explicit instruction on college students' processes and products. L1-Educational Studies in Language and Literature, 4(1), 5-33.
Sfard, A. (1998).
On two metaphors for learning and the dangers of choosing just one. Educational Researcher, 27(2), 4-13.
Spivey, N. (1997). The constructivist metaphor: Reading, writing, and the making of meaning. New York: Academic Press.
Stahl, G. (2006). Group cognition: Computer support for building collaborative knowledge. Cambridge: MIT Press.
Stahl, G. (Ed.) (2009). Studying Virtual Math Teams. Boston, MA: Springer US.
Tannen, D. (1989). Talking Voices: Repetition, Dialogue, and Imagery in Conversational Discourse. Cambridge: Cambridge University Press.
Thibaudeau, V. (2000). 88 clefs pour identifier dans un texte un problème de logique ou d'expression de la pensée. Québec: Université Laval.
Thiede, K. W., & Anderson, M. C. M. (2003). Summarizing can improve metacomprehension accuracy. Contemporary Educational Psychology, 28, 129-160.
Tochtermann, K. (Ed.) (2006). Innovative Approaches for Learning and Knowledge Sharing. Proc. First European Conference on Technology-Enhanced Learning, EC-TEL. Berlin: Springer (LNCS 4227).
Toulmin, S. (1958). The Uses of Argument. Cambridge: Cambridge University Press.
Trausan-Matu, S., Rebedea, T., Dragan, A., & Alexandru, C. (2007). Visualisation of Learners' Contributions in Chat Conversations. In J. Fong & P. Wang (Eds.), Blended Learning (pp. 217-226). Pearson-Prentice Hall.
Trausan-Matu, S., & Rebedea, T. (2009). Polyphonic Inter-Animation of Voices in VMT. In G. Stahl (Ed.), Studying Virtual Math Teams (pp. 451-473). Boston, MA: Springer US. Available from: http://www.ischool.drexel.edu/faculty/gerry/vmt/book/24.pdf.
van Aalst, J. (2009). Distinguishing knowledge-sharing, knowledge-construction, and knowledge-creation discourses. Computer-Supported Collaborative Learning, 4, 259-287.
Vovides, Y., Sanchez-Alonso, S., Mitropoulou, V., & Nickmans, G. (2007). The use of e-learning course management systems to support learning strategies and to improve self-regulated learning. Educational Research Review, 2(1), 64-74.
Vygotsky, L. (1978).
Mind in society. Cambridge, MA: Harvard University Press.
Wade-Stein, D., & Kintsch, E. (2004). Summary Street: Interactive Computer Support for Writing. Cognition and Instruction, 22(3), 333-362.
Wegerif, R. (2006). A dialogic understanding of the relationship between CSCL and teaching thinking skills. Computer-Supported Collaborative Learning, 1(1), 143-157.
Wegerif, R. (2007). Dialogic, Education and Technology: Expanding the Space of Learning. New York, NY: Kluwer-Springer.
Wenger, E. (1998). Communities of practice: Learning, meaning and identity. Cambridge: Cambridge University Press.
Wiemer-Hastings, P., & Zipitria, I. (2001). Rules for syntax, vectors for semantics. Proceedings of the 23rd Annual Conference of the Cognitive Science Society.
Wiley, J., & Voss, J. F. (1999). Constructing arguments from multiple sources: Tasks that promote understanding and not just memory for text. Journal of Educational Psychology, 91(2), 301-311.
Wresch, W. (1993). The Imminence of Grading Essays by Computer – 25 Years Later. Computers and Composition, 10(2), 45-58. Retrieved from http://computersandcomposition.osu.edu/archives/v10/10_2_html/10_2_5_Wresch.html.