Databases Laboratory

Euroterm - "Extending the EuroWordNet with Public Sector Terminology EDC 2214"

Funding:

European Commission

Duration:

1/1/2001 - 30/6/2002

Implementing Organizations:

Databases Laboratory, Department of Computer Engineering and Informatics, Patras University, Greece (coordinator)
CentER Applied Research, Tilburg University, Netherlands
Department of Software and Computing Systems, University of Alicante, Spain

Description:

The main objective of the Euroterm project was to enrich the EuroWordNet multilingual semantic network with domain-specific terminology for a set of European languages (Greek, Dutch and Spanish). EuroWordNet is a lexical database representing semantic relations among basic concepts for West European languages, which are connected through a so-called Inter-Lingual-Index (ILI). The ILI is an unstructured collection of English concepts whose sole purpose is to provide an efficient mapping across languages. Euroterm combined multilingual domain-specific terminology into a common lexical database by means of a Terminology Alignment System, in order to expand EuroWordNet, and consequently the Inter-Lingual-Index, with concepts restricted to the environment domain. As an application, Euroterm was incorporated into a search engine to support query expansion and multilingual information retrieval.
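
The query expansion idea can be illustrated with a minimal Python sketch. It assumes that every term is linked to a single ILI record and that translations are found by looking up the other terms attached to the same record. The lexicon entries, the ILI identifiers and the function names below are hypothetical, invented for this example; they are not taken from Euroterm or from its search engine.

```python
from collections import defaultdict

# Hypothetical toy lexicon: (language, term) -> ILI record identifier.
# Entries and identifiers are illustrative only, not Euroterm data.
TERM_TO_ILI = {
    ("en", "pollution"): "ili-001",
    ("el", "ρύπανση"): "ili-001",
    ("nl", "vervuiling"): "ili-001",
    ("es", "contaminación"): "ili-001",
    ("en", "waste"): "ili-002",
    ("el", "απόβλητα"): "ili-002",
    ("nl", "afval"): "ili-002",
    ("es", "residuos"): "ili-002",
}


def build_reverse_index(term_to_ili):
    """Map each ILI record to the set of (language, term) pairs linked to it."""
    index = defaultdict(set)
    for (lang, term), ili in term_to_ili.items():
        index[ili].add((lang, term))
    return index


ILI_TO_TERMS = build_reverse_index(TERM_TO_ILI)


def expand_query(terms, source_lang, target_langs):
    """Return the query terms plus their equivalents in the target languages."""
    expanded = []
    for term in terms:
        expanded.append((source_lang, term))
        ili = TERM_TO_ILI.get((source_lang, term))
        if ili is None:
            continue  # term not in the toy lexicon: keep it unexpanded
        # Sort for a deterministic order, then keep only requested languages.
        for lang, other in sorted(ILI_TO_TERMS[ili]):
            if lang in target_langs and (lang, other) not in expanded:
                expanded.append((lang, other))
    return expanded
```

Calling `expand_query(["ρύπανση"], "el", {"nl", "es"})` adds the Spanish and Dutch terms that share the ILI record of the Greek term. A term missing from the lexicon is passed through unchanged, so the expanded query never loses anything the user typed.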

Website:

http://www.ceid.upatras.gr/Euroterm

BalkaNet - "Design and Development of a Multilingual Balkan Wordnet"

Funding:

European Commission

Duration:

1/9/2001 - 30/8/2004

Implementing Organizations:

Project Consortium

Databases Laboratory, Department of Computer Engineering and Informatics, Patras University, Greece (coordinator)
Research Academic Computer Technology Institute, Patras, Greece
University Alexandru Ioan Cuza (UAIC), Romania
Academia Romana - Centrul Pentru Cercetari Avansate in Invatarea Automata (RACAI), Romania
Bulgarian Academy of Science - Institute of Bulgarian Language (IBL-DCMB), Bulgaria
Sabanci University, Turkey
Masaryk University Brno - Faculty of Informatics (FIMU), Czech Republic
Memodata, France
University of Plovdiv (PU), Bulgaria
University of Athens (UOA), Greece

Subcontractors

MATF University, Serbia
Otenet S.A, Greece
CentER Applied Research, Netherlands

Description:

The main objective of the BalkaNet project is the construction of a multilingual semantic network for the Balkan languages. The network will contain the concepts of all participating languages, inter-linked via an Inter-Lingual-Index (ILI) by means of pre-defined lexical and semantic relations. The Inter-Lingual-Index provides a repository of English concepts through which the monolingual Balkan concepts are linked to each other, thus enabling navigation within and across the different monolingual wordnets. The main motivation for the project was the success of similar projects, namely Princeton WordNet and EuroWordNet. BalkaNet is being developed along the same lines as those projects, in that its main lexical unit is a synonym set of terms (synset). Each synset contains terms of all languages that share the same semantic meaning. Terms within the same synset are linked by language-internal lexical relations (e.g. synonymy), whereas synsets are linked to each other via pre-defined semantic relations (e.g. hyponymy, meronymy etc.). In addition, monolingual synsets are linked to an ILI record via the near-synonym relation, which makes it possible to trace the terms of other Balkan languages that have the same semantic properties as the term in question.
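
The structure described above can be sketched as a small Python data model. This is only an illustration under stated assumptions: each synset is taken to be monolingual and linked to exactly one ILI record, and only the hypernym relation is modeled. The class names, field names, identifiers and example entries are all hypothetical; they do not reflect the BalkaNet file format or the VisDic editor.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """A monolingual synset: synonymous terms plus its links."""
    synset_id: str
    language: str
    literals: list                     # synonymous terms in one language
    ili_id: str                        # ILI record this synset maps to
    hypernym_ids: list = field(default_factory=list)  # more general synsets


class MultilingualWordnet:
    """Toy container indexing synsets by id and by ILI record."""

    def __init__(self):
        self.by_id = {}
        self.by_ili = {}

    def add(self, synset):
        self.by_id[synset.synset_id] = synset
        self.by_ili.setdefault(synset.ili_id, []).append(synset)

    def equivalents(self, synset_id, target_language):
        """Terms in target_language linked to the same ILI record."""
        ili_id = self.by_id[synset_id].ili_id
        return [term
                for s in self.by_ili[ili_id] if s.language == target_language
                for term in s.literals]

    def hypernym_chain(self, synset_id):
        """Follow the first hypernym link upward within one language."""
        chain, seen = [], {synset_id}
        current = self.by_id[synset_id]
        while current.hypernym_ids:
            parent_id = current.hypernym_ids[0]
            if parent_id in seen or parent_id not in self.by_id:
                break  # stop on cycles or dangling links
            seen.add(parent_id)
            current = self.by_id[parent_id]
            chain.append(current.synset_id)
        return chain


# Hypothetical entries, invented for this example.
wn = MultilingualWordnet()
wn.add(Synset("el-animal", "el", ["ζώο"], "ili-animal"))
wn.add(Synset("el-dog", "el", ["σκύλος", "σκυλί"], "ili-dog", ["el-animal"]))
wn.add(Synset("tr-dog", "tr", ["köpek"], "ili-dog"))
wn.add(Synset("ro-dog", "ro", ["câine"], "ili-dog"))
```

With these entries, `wn.equivalents("el-dog", "tr")` follows the shared ILI record from the Greek synset to the Turkish one, while `wn.hypernym_chain("el-dog")` stays inside the Greek wordnet and climbs its semantic relations. Keeping the two lookups separate mirrors the distinction between cross-language links through the ILI and language-internal relations.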
Each monolingual wordnet is being developed using the lexical resources available to each partner, such as explanatory dictionaries, corpora, bilingual lexica, thesauri and glossaries. Terms extracted from these resources are processed and inserted into each wordnet via the VisDic editor. These terms will form the core concepts of each language's wordnet. Once the main set of terms has been selected and inserted into the wordnet, other terms are linked to it via lexical relations, which are established through close linguistic processing of the various lexical resources.
Alongside the VisDic editor, a Wordnet Management System is currently under development, whose main purpose is to give the consortium and the end users of the wordnet a visualization of the project's results. The Wordnet Management System offers a series of services that enable any user to obtain the information they need from the network. These services apply both to each monolingual wordnet and to the common multilingual lexical resources. Once completed, the Wordnet Management System will be distributed to the EC and any other interested party, and can serve as a valuable infrastructure for various kinds of NLP applications.
Finally, once the monolingual wordnets are fully developed and interlinked, the project's application will take place. It concerns the incorporation of BalkaNet's infrastructure and content into an IR system for conceptual indexing tasks. In particular, BalkaNet is going to be used by a search engine during indexing, in an attempt to make a preliminary categorization of the URLs fetched by the engine's crawler and indexed.

Website:

http://www.dblab.upatras.gr/Balkanet

ΑΙΘΡΑ - "Study and Development of New Techniques for Hiding Speech in Digital Media"

Funding:

Public Administration

Duration:

1/1/2004 - 31/12/2004

Implementing Organization:

Databases Laboratory, Department of Computer Engineering and Informatics, Patras University

Description:

The project aims to develop techniques for hiding speech information in digital data such as images and text documents. Digital images and documents are exchanged daily via e-mail, storage media (floppy disks, CD/DVD-ROMs, etc.), the Internet, latest-generation mobile phones, personal digital assistants (PDAs), and more. The users of these technologies number in the millions, and such media account for a substantial share of the information circulating on the Internet. Consequently, using them for this purpose does not arouse particular suspicion. It is therefore desirable to develop new hiding techniques for these media or to adapt existing ones. The project also builds on the already successful application of hiding text and image information in files of the corresponding types.
In this project, the study and development of steganography techniques for speech signals was completed, initially with image files as the carrier. Particular attention was then given to studying speech files, the volume they occupy as signals, and the encoding techniques they use.
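
The project's own hiding techniques are not described here. As a rough illustration of the general idea only, the Python sketch below uses least-significant-bit (LSB) substitution, a common textbook method chosen for this example and not the project's method. It assumes the carrier image is available as a flat sequence of 8-bit pixel values and the speech as already-encoded bytes; the function names and the 4-byte length header are hypothetical choices.

```python
def embed(cover: bytes, payload: bytes) -> bytearray:
    """Hide payload in the least significant bit of each cover byte."""
    header = len(payload).to_bytes(4, "big")  # assumed 32-bit length prefix
    data = header + payload
    needed = len(data) * 8                    # one cover byte per hidden bit
    if needed > len(cover):
        raise ValueError("cover is too small for this payload")
    stego = bytearray(cover)
    for i in range(needed):
        bit = (data[i // 8] >> (7 - i % 8)) & 1   # most significant bit first
        stego[i] = (stego[i] & 0xFE) | bit        # overwrite only the LSB
    return stego


def extract(stego: bytes) -> bytes:
    """Recover the payload written by embed()."""
    def read_bytes(start, count):
        out = bytearray()
        for b in range(count):
            value = 0
            for k in range(8):
                value = (value << 1) | (stego[start + b * 8 + k] & 1)
            out.append(value)
        return bytes(out)

    length = int.from_bytes(read_bytes(0, 4), "big")
    return read_bytes(32, length)
```

Because one carrier byte holds one hidden bit, a payload of n bytes needs at least 8 × (n + 4) carrier bytes. This is one reason the volume of a speech signal and its encoding matter: a more compact encoding fits a longer utterance into the same image. Each carrier value changes by at most 1, which keeps the visual difference small.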