Recent years have been marked by a transition from monolithic, local information systems to global ones built on knowledge infrastructures such as the Semantic Web. Data made available through the Semantic Web could be used directly by numerous applications, such as search engines, applications that filter information based on the content of web pages, information extraction applications, and systems that summarize multiple web pages. The Semantic Web creates new prospects for electronic commerce and knowledge-based systems.
Ontologies constitute the infrastructure of such systems in the Semantic Web, and for this reason numerous efforts to develop related technologies are funded by the IST programme of the European Union. The research community today views ontologies not only as the basic technology for effective knowledge management, but also as a technology that contributes to transforming the WWW into the Semantic Web. However, the problems associated with creating suitable ontologies, coordinating them (when more than one ontology exists for the same domain), and maintaining them effectively impede the spread of the Semantic Web and the applications it supports. The success of the Semantic Web and its related technologies depends heavily on the fast, low-cost development and coordination of multiple ontologies, as well as on their effective exploitation by various applications. To date, these activities have been costly and time-consuming, and have had to be carried out by experts in each domain. The importance of ontologies calls for new techniques in the field of ontology engineering, as well as for the innovative exploitation of ontologies by various intelligent applications.
The OntoSum project aims to develop new methods and techniques for:
- Learning new ontologies from existing databases (Ontology Learning from Databases)
- Coordinating existing ontologies in the same domain (Ontology Coordination), with emphasis on their mapping and merging
- Producing summaries with the use of ontologies (Ontology-based Summarization)
These objectives will be pursued through three PhDs, one in each of the above research areas.
- Ontology Learning - The main objective of this PhD is to investigate new methods and techniques for the automatic learning of ontologies from databases. Methods in this category of learning use information from the database schemas as well as from the records they contain. Currently, ontologies are created from databases manually, and the work focuses on identifying correspondences between the elements of the relational database schema and the ontology. The objective of this research is to automate the creation of ontologies from databases, taking into account both the database model and the stored records. Machine learning techniques will be used to create the ontology, in order to reduce creation time and human intervention throughout the process. In the first phase of the doctorate, the theoretical framework for learning ontologies from databases will be established by studying existing methods, identifying the open research issues precisely, and focusing on some of them. To address these issues, new methods and algorithms will be proposed, and a prototype system for learning ontologies from databases will be developed. An evaluation framework for the methods developed will also be defined, taking into account existing metrics and evaluation procedures used in the creation of ontologies from texts.
- Ontology Coordination - The objective of this doctorate is to investigate new methods for the automatic coordination of ontologies. Existing methodologies that support ontology coordination use only part of the available knowledge, and the human factor is taken into account only at the final stage of the coordination process (that is, at merging). The proposed research will supplement and extend existing research results on the following topics: the exploitation of texts and databases related to a domain to support the coordination of ontologies in that domain; the use of alternative algorithms for automatic word sense disambiguation, in combination with general-purpose dictionaries and text corpora; techniques for exploiting taxonomic and non-taxonomic relations; the development of heuristic algorithms and constraint satisfaction methods for a better and more effective exploitation and combination of the above techniques; and the development of prototype systems, made available via the World Wide Web, to support the new methodologies.
- Production of Summaries - The objective of this doctorate is to investigate methods for producing summaries from multiple documents (texts and databases) with the use of ontologies. Creating a summary from multiple documents requires identifying the various structural units of information in these documents, which will be used to produce the content of the summary. To date, however, the structural units of information and the ways in which they influence automatic summarization have not been studied extensively. The use of ontologies has also not been investigated sufficiently, despite the fact that summaries are produced about specific events and ontologies are a means of representing knowledge about these events (i.e. which important entities are involved in an event, what the characteristics of these entities are, and what role these entities play in the particular event). In the first phase of the doctorate, the existing ways of representing structural units of information will be studied. The use of ontologies within this methodology will be investigated for representing not only the hierarchy of concepts (taxonomic relations) but also the characteristics of concepts and the non-taxonomic relations between them. The above will constitute the theoretical framework of the thesis. Based on this framework, new ways of representing structural units will be proposed. New methods and techniques will be developed for automatic summarization that exploit this new representation. In the next phase of the doctorate, a prototype summarization system will be developed. An evaluation framework for the methods developed will also be defined, taking into account existing metrics and evaluation procedures used in the production of summaries.
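To make the Ontology Learning item above more concrete, the following is a minimal sketch of the kind of schema-to-ontology correspondence that is currently established by hand: each table becomes a class, each plain column a datatype property, and each foreign key an object property. It is only an illustrative rule-based baseline, not the project's method, which is intended to use machine learning and the stored records as well. The function name `schema_to_ontology`, the output layout, and the example `author` and `book` tables are all hypothetical, and SQLite is assumed purely for convenience.

```python
import sqlite3


def schema_to_ontology(conn):
    """Derive a naive ontology skeleton from a relational schema.

    Hypothetical baseline rules (assumptions, not the project's method):
      table       -> class
      column      -> datatype property (domain = table, range = SQL type)
      foreign key -> object property   (domain = table, range = referenced table)
    """
    classes, datatype_props, object_props = [], [], []

    # List user tables, skipping SQLite's internal bookkeeping tables.
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name")]

    for table in tables:
        classes.append(table)

        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        fks = {row[3]: row[2]
               for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')}

        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        for row in conn.execute(f'PRAGMA table_info("{table}")'):
            column, sql_type = row[1], row[2]
            if column in fks:
                object_props.append((table, column, fks[column]))
            else:
                datatype_props.append((table, column, sql_type))

    return {
        "classes": classes,
        "datatype_properties": datatype_props,
        "object_properties": object_props,
    }


# Usage on a small, made-up schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES author(id)
    );
""")
ontology = schema_to_ontology(conn)
print(ontology["classes"])
print(ontology["object_properties"])
```

A mapping of this kind looks only at the schema. Deciding, for example, whether two tables describe the same concept or whether a column's values form a hierarchy would require looking at the records themselves, which is where the learning techniques described above would come in.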