Papers by Claudio Giuliano

SocialLink: knowledge transfer between social media and linked open data

This dataset contains canonical citations (DOIs) for the SocialLink dataset (15th May 2017 release), alignment data and code, and entity data in .csv and .json formats. SocialLink is a publicly available Linked Open Data dataset that matches social media accounts on Twitter to the corresponding entities in multiple language chapters of DBpedia. By effectively bridging the Twitter social media world and the Linked Open Data cloud, SocialLink enables knowledge transfer between the two: on the one hand, it supports Semantic Web practitioners in better harvesting the vast amounts of valuable, up-to-date information available on Twitter; on the other hand, it permits Social Media researchers to leverage DBpedia data when processing the noisy, semi-structured data of Twitter. The SocialLink dataset is created by the SocialLink Pipeline, which aligns 271,000 DBpedia persons and organisations to their Twitter profiles via data acquisition, candidate acquisition, and candidate selection phases.
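
The abstract does not detail how the pipeline scores candidates; the following is a minimal sketch of the candidate-selection idea, assuming a simple name-similarity heuristic. The `name` key, the threshold, and the helper functions are illustrative assumptions, not the actual SocialLink implementation.

```python
from difflib import SequenceMatcher

def name_similarity(a, b):
    """Crude string similarity between a DBpedia label and a profile display name."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def select_candidate(entity_label, candidate_profiles, threshold=0.8):
    """Pick the candidate Twitter profile whose display name best matches the
    DBpedia entity label; return None if no candidate is similar enough.
    Illustrative stand-in for the candidate-selection phase, not the real model."""
    scored = sorted(
        ((name_similarity(entity_label, p["name"]), p) for p in candidate_profiles),
        key=lambda pair: pair[0],
        reverse=True,
    )
    if scored and scored[0][0] >= threshold:
        return scored[0][1]
    return None

candidates = [{"name": "Barack Obama"}, {"name": "Obama Foundation"}]
print(select_candidate("Barack Obama", candidates))
```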

Early Development of a Virtual Coach for Healthy Coping Interventions in Type 2 Diabetes Mellitus: Validation Study (Preprint)

Background: Mobile health solutions aimed at monitoring tasks among people with diabetes mellitus (DM) have been broadly applied. However, virtual coaches (VCs), whether or not embedded in mobile health solutions, are considered valuable means of improving patients’ health-related quality of life and ensuring adherence to self-care recommendations in diabetes management. Despite the growing need for effective healthy-coping digital interventions to support patients’ self-care and self-management, designing psychological digital interventions that are acceptable, usable, and engaging for the target users still represents the main challenge, especially from a psychosocial perspective. Objective: This study primarily aims to test VC interventions based on psychoeducational and counseling approaches to support and promote healthy coping behaviors in adults with DM. As a preliminary study, university students participated and played the role of standardized patients (SPs) with the aim of ...

This paper summarizes FBK-irst's participation in the lexical substitution task of the SemEval competition. We submitted two different systems, both exploiting synonym lists extracted from dictionaries. For each word to be substituted, the systems rank the associated synonym list according to a similarity metric based on Latent Semantic Analysis and to the occurrences in the Web 1T 5-gram corpus, respectively. In particular, the latter system achieves state-of-the-art performance, largely surpassing the baseline proposed by the organizers.
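
A minimal sketch of the n-gram ranking idea described above: each candidate substitute is plugged into the target context and ranked by corpus frequency. The `ngram_count` dictionary stands in for lookups against the Web 1T 5-gram corpus; function and variable names are illustrative.

```python
def rank_substitutes(left_context, right_context, synonyms, ngram_count):
    """Rank candidate substitutes by the corpus frequency of the context with
    each candidate plugged in. `ngram_count` maps a phrase to its count and is
    a stand-in for the Web 1T 5-gram corpus used by the actual system."""
    scored = [(ngram_count.get(f"{left_context} {s} {right_context}".strip(), 0), s)
              for s in synonyms]
    return [s for _count, s in sorted(scored, reverse=True)]

counts = {"a brilliant idea": 2500, "a shiny idea": 3}
print(rank_substitutes("a", "idea", ["brilliant", "shiny"], counts))
# -> ['brilliant', 'shiny']
```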

Requirements Analysis of Cross-media Knowledge Acquisition, Deliverable D 8.1

User profiling has existed in social media since their inception and has supported most of their business model. Even if users do not actively share information about themselves on social media (so-called passive users), they can still be profiled based on their location and on whom they follow. In this paper, we present a system that leverages the linking of followed (popular) Twitter users to DBpedia, and the information contained therein, to help users conceal their digital footprint. Specifically, our approach helps a passive Twitter user stay private by proposing a list of additional profiles to follow that would confuse the social media’s inference pipeline and prevent it from inferring useful information about that passive user and their interests.
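
A toy sketch of the underlying idea, assuming the inference pipeline profiles users by the topic distribution of the accounts they follow: decoy accounts are proposed from topics that are rare in the current profile, making that distribution less informative. The data structures and topic labels are hypothetical and do not reproduce the paper's actual method.

```python
from collections import Counter

def propose_decoys(followed_topics, candidate_accounts, k=5):
    """Suggest accounts whose topic is rare (or absent) among the accounts the
    user already follows. `followed_topics` is a list of topic labels for the
    followed set; `candidate_accounts` maps a handle to its topic. Illustrative only."""
    topic_counts = Counter(followed_topics)
    ranked = sorted(candidate_accounts.items(), key=lambda kv: topic_counts[kv[1]])
    return [handle for handle, _topic in ranked[:k]]

followed = ["politics", "politics", "soccer"]
candidates = {"@astro_pic": "astronomy", "@chef_daily": "cooking", "@midfield": "soccer"}
print(propose_decoys(followed, candidates, k=2))
```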

A Virtual Coach (Motibot) for Supporting Healthy Coping Strategies Among Adults With Diabetes: Proof-of-Concept Study

JMIR Human Factors, 2022

Background: Motivation is a core component of diabetes self-management because it allows adults with diabetes mellitus (DM) to adhere to clinical recommendations. In this context, virtual coaches (VCs) have assumed a central role in supporting and treating common barriers related to adherence. However, most of them are mainly focused on medical and physical purposes, such as monitoring blood glucose levels or following a healthy diet. Objective: This proof-of-concept study aims to evaluate the preliminary efficacy of a VC intervention for psychosocial support before and after the intervention and at follow-up. The intent of this VC is to motivate adults with type 1 DM and type 2 DM to adopt and cultivate healthy coping strategies to reduce symptoms of depression, anxiety, perceived stress, and diabetes-related emotional distress, while also improving their well-being. Methods: A total of 13 Italian adults with DM (18-51 years) interacted with a VC, called Motibot (motivational bot...

MicroNeel: Combining NLP Tools to Perform Named Entity Detection and Linking on Microposts

EVALITA. Evaluation of NLP and Speech Tools for Italian, 2016

In this paper we present the MicroNeel system for Named Entity Recognition and Entity Linking on Italian microposts, which participated in the NEEL-IT task at EVALITA 2016. MicroNeel combines The Wiki Machine and Tint, two standard NLP tools, with comprehensive tweet preprocessing, the Twitter-DBpedia alignments from the Social Media Toolkit resource, and rule-based or supervised merging of the produced annotations.
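
A minimal sketch of a rule-based merging step, assuming a simple "longest span wins" heuristic when annotations produced by the two tools overlap. This is a generic illustration, not the exact MicroNeel merging rules.

```python
def merge_annotations(anns_a, anns_b):
    """Merge two lists of (start, end, label) annotations over the same tweet,
    keeping the longer span when two annotations overlap. Generic heuristic
    for illustration, not MicroNeel's actual rule set."""
    merged = []
    for ann in sorted(anns_a + anns_b, key=lambda a: a[1] - a[0], reverse=True):
        start, end, _label = ann
        overlaps = any(not (end <= s or start >= e) for s, e, _ in merged)
        if not overlaps:
            merged.append(ann)
    return sorted(merged)

a = [(0, 5, "Person")]
b = [(0, 11, "Organization"), (15, 20, "Location")]
print(merge_annotations(a, b))  # the longer 0-11 span wins over the 0-5 one
```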

Lecture Notes in Computer Science, 2017

We present SocialLink, a publicly available Linked Open Data dataset that matches social media accounts on Twitter to the corresponding entities in multiple language chapters of DBpedia. By effectively bridging the Twitter social media world and the Linked Open Data cloud, SocialLink enables knowledge transfer between the two: on the one hand, it supports Semantic Web practitioners in better harvesting the vast amounts of valuable, up-to-date information available on Twitter; on the other hand, it permits Social Media researchers to leverage DBpedia data when processing the noisy, semi-structured data of Twitter. SocialLink is automatically updated with periodic releases, and the code, along with the gold standard dataset used for its training, is made available as an open source project.

Linking knowledge bases to social media profiles

Proceedings of the Symposium on Applied Computing, 2017

Social media have become an invaluable source of data for a wide variety of tasks. Unfortunately, this data is hard to gather and process due to the low number of machine-readable attributes, API limitations, and noisiness. In this paper we propose a system that aligns knowledge base entries of people and organisations to the corresponding social media profiles. The motivation is twofold: (i) on the one hand, we facilitate the processing of social media data by allowing the import of rich entity descriptions from knowledge bases; (ii) on the other hand, we enable automatic enrichment of a knowledge base with additional data from social media. We used this system to create a resource of 893,446 alignments between DBpedia entities and Twitter profiles. This resource effectively connects Twitter to the Linked Open Data cloud.
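
A hedged sketch of how such an alignment resource might be consumed, assuming it is distributed as a CSV file; the `dbpedia_uri` and `twitter_handle` column names are hypothetical, not the published schema.

```python
import csv

def load_alignments(path):
    """Load DBpedia-to-Twitter alignments from a CSV file into a dict keyed by
    Twitter handle. Column names are illustrative assumptions."""
    alignments = {}
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            alignments[row["twitter_handle"].lower()] = row["dbpedia_uri"]
    return alignments

def enrich_author(handle, alignments):
    """Return the KB entity aligned to a tweet author, if any."""
    return alignments.get(handle.lower())

# Usage (assuming a local dump): enrich_author("BarackObama", load_alignments("alignments.csv"))
```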

Internal Deliverable INT5.3. First Prototype of IE Technologies

N-ary Relation Extraction for Joint T-Box and A-Box Knowledge Base Augmentation

The Web has evolved into a huge mine of knowledge carved in different forms, the predominant one still being the free-text document. This motivates the need for Intelligent Web-reading Agents: hypothetically, they would skim through disparate Web source corpora and generate meaningful structured assertions to fuel Knowledge Bases (KBs). Ultimately, comprehensive KBs, like Wikidata and DBpedia, play a fundamental role in coping with the issue of information overload. In line with this vision, this paper presents the FACT EXTRACTOR, a complete Natural Language Processing (NLP) pipeline which reads an input textual corpus and produces machine-readable statements. Each statement is supplied with a confidence score and undergoes a disambiguation step via entity linking, thus allowing the assignment of KB-compliant URIs. The system implements four research contributions: it (1) executes N-ary relation extraction by applying the Frame Semantics linguistic theory, as opposed to binary techniques ...

N-ary relation extraction for simultaneous T-Box and A-Box knowledge base augmentation

Semantic Web

The Web has evolved into a huge mine of knowledge carved in different forms, the predominant one still being the free-text document. This motivates the need for intelligent Web-reading agents: hypothetically, they would skim through disparate Web source corpora and generate meaningful structured assertions to fuel knowledge bases (KBs). Ultimately, comprehensive KBs, like Wikidata and DBpedia, play a fundamental role in coping with the issue of information overload. In line with this vision, this paper presents the FACT EXTRACTOR, a complete natural language processing (NLP) pipeline which reads an input textual corpus and produces machine-readable statements. Each statement is supplied with a confidence score and undergoes a disambiguation step via entity linking, thus allowing the assignment of KB-compliant URIs. The system implements four research contributions: it (1) executes n-ary relation extraction by applying the frame semantics linguistic theory, as opposed to binary techniques; it (2) simultaneously populates both the T-Box and the A-Box of the target KB; it (3) relies on a single NLP layer, namely part-of-speech tagging; it (4) enables a completely supervised yet reasonably priced machine learning environment through a crowdsourcing strategy. We assess our approach by setting the target KB to DBpedia and by considering a use case of 52,000 Italian Wikipedia soccer player articles. Out of those, we yield a dataset of more than 213,000 triples with an estimated 81.27% F1. We corroborate the evaluation via (i) a performance comparison with a baseline system, as well as (ii) an analysis of the T-Box and A-Box augmentation capabilities. The outcomes are incorporated into the Italian DBpedia chapter, can be queried through its SPARQL endpoint, and/or downloaded as standalone data dumps. The codebase is released as free software and is publicly available in the DBpedia association repository.
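
A minimal sketch of how an n-ary frame instance could be reified into KB triples, in the spirit of the approach described above: the frame becomes a node typed by the frame name and linked to each of its frame elements. URIs, property names, and the example frame are illustrative placeholders, not the paper's actual vocabulary.

```python
def frame_to_triples(frame_name, frame_elements, frame_id="frame1"):
    """Reify an n-ary frame instance as a node connected to each frame element.
    Prefixes and property names are illustrative placeholders."""
    subject = f"ex:{frame_id}"
    triples = [(subject, "rdf:type", f"ex:{frame_name}")]
    for element, value in frame_elements.items():
        triples.append((subject, f"ex:{element}", value))
    return triples

# "Buffon debuted for Parma in 1995" expressed as a hypothetical Debut frame:
print(frame_to_triples("Debut", {
    "player": "dbr:Gianluigi_Buffon",
    "team": "dbr:Parma_Calcio_1913",
    "year": '"1995"',
}))
```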

Recent literature on text tagging reported successful results by applying Maximum Entropy (ME) models. In general, ME taggers rely on carefully selected binary features, which try to capture discriminant information from the training data. This paper introduces a standard setting of binary features, inspired by the literature on named-entity recognition and text chunking, and derives corresponding real-valued features based on smoothed log-probabilities. The resulting ME models have orders of magnitude fewer parameters. Effective use of training data to estimate features and parameters is achieved by integrating a leaving-one-out method into the standard ME training algorithm. Experimental results on two tagging tasks show statistically significant performance gains after augmenting standard binary-feature models with real-valued features.
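
A small sketch of a real-valued feature derived from a smoothed log-probability, replacing a family of binary word/tag indicator features. Add-alpha smoothing is used here purely for illustration; the paper estimates these probabilities with a leaving-one-out method.

```python
import math
from collections import Counter

def log_prob_feature(word, tag, word_tag_counts, word_counts, tag_set, alpha=1.0):
    """Smoothed log P(tag | word), usable as a single real-valued feature in
    place of many binary word/tag indicators. Add-alpha smoothing is an
    illustrative choice, not the paper's leaving-one-out estimate."""
    num = word_tag_counts[(word, tag)] + alpha
    den = word_counts[word] + alpha * len(tag_set)
    return math.log(num / den)

word_tag = Counter({("bank", "NOUN"): 90, ("bank", "VERB"): 10})
words = Counter({"bank": 100})
print(log_prob_feature("bank", "NOUN", word_tag, words, {"NOUN", "VERB"}))
```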

SocialLink: exploiting graph embeddings to link DBpedia entities to Twitter profiles

Progress in Artificial Intelligence

FBK-IRST: Semantic relation extraction using Cyc

Proceedings of the 5th International Workshop on Semantic Evaluation, 2010

We present an approach for semantic relation extraction between nominals that combines semantic information with shallow syntactic processing. We propose to use the ResearchCyc knowledge base as a source of semantic information about nominals. Each source of information is represented by a specific kernel function. The experiments were carried out using support vector machines as a classifier. The system achieves an overall F1 of 77.62% on the "Multi-Way Classification of Semantic Relations Between Pairs of Nominals" task at SemEval-2010.
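
A minimal sketch of combining two kernels by summation and training an SVM on the precomputed Gram matrix, which is the general mechanism behind representing each information source as a kernel function. scikit-learn and the toy matrices are illustrative choices, not taken from the paper.

```python
import numpy as np
from sklearn.svm import SVC

def combined_kernel(K_syntactic, K_semantic):
    """Sum of two kernel matrices; the result is a valid kernel if each summand is."""
    return K_syntactic + K_semantic

# Toy precomputed kernel matrices over 4 training instances (illustrative values).
K_syn = np.array([[2.0, 1.0, 0.0, 0.0],
                  [1.0, 2.0, 0.0, 0.0],
                  [0.0, 0.0, 2.0, 1.0],
                  [0.0, 0.0, 1.0, 2.0]])
K_sem = np.eye(4)
y = [0, 0, 1, 1]

clf = SVC(kernel="precomputed")
clf.fit(combined_kernel(K_syn, K_sem), y)
print(clf.predict(combined_kernel(K_syn, K_sem)))  # predictions on the training data
```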

A tool-box for lexicographers

Proceedings of the Tenth EURALEX International Congress, EURALEX 2002, Copenhagen, Denmark, August 13–17, 2002, Vol. 1, pp. 113–117 (ISBN 87-90708-09-1), 2003

This paper describes an on-going annotation effort which aims at adding a manual annotation layer connecting an existing annotated corpus, such as the English ACE-2005 Corpus, to Wikipedia. The annotation layer is intended for evaluating the accuracy of linking to Wikipedia in the framework of a coreference resolution system.

A Tool-Box for Lexicographers

Information extraction activities at ITC-irst

Un Tool-Box per la Lessicografia Ladina (A Tool-Box for Ladin Lexicography)