Semantic Classes of Czech Verbs

Recent Advances in Intelligent Information Systems ISBN 978-83-60434-59-8, pages 207–217

Dana Hlaváčková, Maria Khokhlova, and Karel Pala
Centre for Natural Language Processing, Masaryk University, Brno, Czech Republic

Abstract

In this paper we present semantic classes of Czech verbs obtained from the lexical database VerbaLex, which has recently been built at the Centre for NLP, Faculty of Informatics, Masaryk University. Similar lexical databases are mentioned in the paper, particularly VerbNet for English and Vallex for Czech. At the moment VerbaLex contains 82 semantic classes covering 10,482 Czech verb lemmata and 19,556 verb valency frames. We discuss the criteria for establishing semantic classes: the most important one is grouping verbs according to their senses. The second one exploits relations between semantic classes of Czech verbs and the semantic roles and subcategorization features used in VerbaLex valency frames. We also touch on the issue of the ontology that could be used to describe the meanings of the verbs in the semantic classes. The semantic classification of Czech verbs can be extended to other languages via the Interlingual Index (ILI) existing in WordNets, and it can be used in various NLP applications (machine translation, syntactic analysis, semantic search, information extraction and others).

Keywords: WordNet, VerbaLex, semantic classes, verbs

1 Introduction

Previous research in the area of syntactic and semantic analysis in Czech linguistics has concerned sentence patterns and also semantic classes of Czech verbs. Daneš (1987) suggested a general classification of Czech verbs containing the following main semantic classes:

1. verbs denoting motion
2. verbs expressing manipulation (handling)
3. speech-act verbs, verbs of knowing, thinking and perception (verbs of propositional attitudes)
4. verbs denoting changes
5. verbs denoting processes, actions and events

This classification proposal was offered as a basis for a wider framework in which the semantic classification of Czech verbs could be developed. For each class Daneš’s proposal gives basic examples illustrating what verbs the class may include. However, the individual classes as such were not worked out more completely because the necessary larger empirical data were not at hand at that time (Czech corpora did not exist in 1987).

Interest in the semantic description of verbs as well as in their classification has grown stronger recently and has led to projects in which lexicons of Czech valency frames have been built, particularly Vallex, containing approx. 6,000 verbs (Žabokrtský, 2005), and VerbaLex (Hlaváčková and Horák, 2006), comprising approx. 10,500 verbs.

The need for a suitable semantic classification of Czech verbs has been felt for a long time with regard to natural language processing (NLP) applications. When building resources like Vallex and VerbaLex (and others for other languages), the issue of semantic verb classes arises naturally. The necessary condition is to have data that are large enough for such a task. We consider the lexical database VerbaLex with approx. 10,500 verb lemmata a sufficient resource for this purpose.

In our view the need for high-quality resources in the NLP field is sometimes underestimated in the belief that they can be at least partly replaced by statistical techniques. We are convinced that resources like VerbaLex are indispensable for successful NLP applications such as disambiguation, which has to include syntactic and semantic analysis. There is also a challenge to exploit verb semantic classes within applications related to creating metadata descriptions for the Semantic Web, semantic search and information extraction in general. Last but not least, investigation of the semantic classes of Czech verbs is also driven by the ambition to understand the meanings of Czech verbs better.

1.1 Lexical database VerbaLex

The data on which the semantic classification of Czech verbs has been built are contained in the Verb Valency Lexicon VerbaLex (Hlaváčková, Horák and Kadlec, 2006). The current version of VerbaLex contains 6,360 synsets, 21,193 verb senses, 10,482 verb lemmata and 19,556 valency frames.
The valency database is available in txt, xml, pdf and html formats (http://nlp.fi.muni.cz/verbalex/htmlDEMO). The basic item in VerbaLex is a verb synset consisting of individual literals (verb lemmata). The verbs are associated with their (complex) valency frames. The complex valency frames (CVF) with marked verb position contain the morphosyntactic (cases) and semantic information (deep valencies) related to verb arguments. Types of verbal complementation (nouns, adjectives, adverbs, infinitive constructions or subordinate clauses) are precisely distinguished in the verb frame notation. The type of valency relation for each constituent element is also marked as obligatory (“obl”) or optional (“opt”). A CVF also contains a simple example of the verb’s usage in a sentence.

Semantic information about the verb arguments is represented by two-level semantic roles in CVFs. The first level contains the general semantic roles, which are based on the 1stOrderEntity and 2ndOrderEntity items from the EuroWordNet Top Ontology. Their list is closed and contains approx. 30 semantic tags such as SUBS – Substance, ENT – Entity, PART – Part, COM – Communication, ACT – Action, INS – Instrument, GROUP – Group, etc.

The items of the second level can be characterized as subcategorization features capturing the fact that verbs prefer arguments of a certain semantic class. To label them we use specific literals (lexical units) taken from the set of WordNet Base Concepts with their respective sense numbers. This allows us to see the semantic structure of the analyzed sentences through their respective valency frames. The nodes that we traverse when going down the H/H (hypero/hyponymy) tree in WordNet at the same time form a sequence of semantic features which characterize the meaning of the lexical unit fitting into a particular valency frame. These sequences can be interpreted as quite detailed subcategorization features. The list of 2nd-level semantic roles is open; the current version contains about 1,000 WordNet literals from Princeton WordNet.

An example of the CVF for the verb pít/drink:

who_nom*AGENT(human:1|animal:1) what_acc*SUBS(beverage:1)

VerbaLex also contains additional information about the verbs:

1. definition of verb meanings for each synset;
2. verb ability or inability to create passive form;
3. number of meaning for homonymous verbs;
4. semantic class a verb belongs to;
5. aspect (perfective, imperfective, biaspectual);
6. types of verb use (primary, figurative, idiomatic);
7. types of reflexivity for reflexive verbs.
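The two-level structure of a CVF slot can be made concrete with a small parser. The following is only a minimal sketch: it assumes that every slot follows the shape of the single pít/drink example above (pronoun, case, first-level role, then second-level literals with sense numbers separated by "|"). The names `Slot`, `SLOT_RE` and `parse_cvf` are hypothetical and are not part of VerbaLex, whose full notation may well be richer than this.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple

# Assumed slot shape, modelled only on the one example in the text:
#   <pronoun>_<case>*<ROLE>(<literal>:<sense>|<literal>:<sense>...)
SLOT_RE = re.compile(
    r"(?P<pronoun>[a-z]+)_(?P<case>[a-z]+)\*(?P<role>[A-Z]+)\((?P<features>[^)]+)\)"
)


@dataclass
class Slot:
    pronoun: str                      # e.g. "who"
    case: str                         # morphosyntactic case, e.g. "nom"
    role: str                         # first-level role, e.g. "AGENT"
    features: List[Tuple[str, int]]   # second-level (literal, sense number) pairs


def parse_cvf(frame: str) -> List[Slot]:
    """Split a frame string into its argument slots."""
    slots = []
    for match in SLOT_RE.finditer(frame):
        features = []
        for item in match.group("features").split("|"):
            literal, sense = item.rsplit(":", 1)
            features.append((literal, int(sense)))
        slots.append(
            Slot(match.group("pronoun"), match.group("case"),
                 match.group("role"), features)
        )
    return slots


slots = parse_cvf("who_nom*AGENT(human:1|animal:1) what_acc*SUBS(beverage:1)")
for slot in slots:
    print(slot)
```

Keeping the first-level role and the second-level literals in separate fields mirrors the distinction drawn in the text between the closed list of general roles and the open list of subcategorization features.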

As we have already hinted, within VerbaLex we work with verb semantic classes that were originally adopted from Levin’s list of English verb classes (48 classes) (Levin, 1993) and from the list of Martha Palmer’s VerbNet project (Palmer et al., 2006) (5,319 verb lemmata), which has more fine-grained sets of verbs (82 classes in total, with 395 subclasses). These verb classes have been translated and adapted for the Czech language. The Czech classes were enriched with synonyms, aspect counterparts and prefixed verbs. Presently, our list contains 82 main semantic verb classes, 258 subclasses and 6,393 Czech verb lemmata. In building the semantic classes we prefer semantic criteria to the diathesis alternations used by Levin because of the differences between Czech and English. As a result we have reduced the number of subclasses and obtained verb classes that are semantically more consistent than Levin’s. It also has to be added that Levin’s classes do not by any means cover all of the major verbs, e.g. entries for specialize, specify, spell, spend, spoil, etc. are missing. Verbs that take sentential complements have also been deliberately excluded. Thus for Czech Levin’s classes served at best as an inspiration.

2 The Criteria for Establishing Verb Semantic Classes

As we said above, Levin’s classes determined by means of diathesis alternations served as a sort of starting point, but we were aware of the problems indicated above: her classes are based practically on introspection only and no corpus data were used, so the alternations do not provide consistent classes (see also Hanks and Pustejovsky, 2005). It should be kept in mind that when dealing with tasks of a semantic nature one can hardly avoid a considerable degree of arbitrariness (caused by introspection). Also, according to Hanks and Pustejovsky (2005), a word’s meaning is difficult to evaluate with precision; it is a matter of introspection and conjecture. We are aware of these difficulties, but in our view it still makes sense to try to process verb meanings manually. Thus our verb classes basically rely on two criteria:

1. Human understanding of the verb senses and their groupings as we find them in Czech explanatory dictionaries and Czech WordNet. This work has been done manually, and when it was difficult to decide what class a verb belongs to we regularly consulted Czech corpus data and word sketches (Kilgarriff et al., 2004). In this respect we took into account the verb’s behaviour in context, which is observable and verifiable. We also added some new senses found for the individual verbs in VerbaLex when working on their valency frames.
2. The semantic roles and subcategorization features appearing in the valency frames of the particular verbs. This is based on the natural assumption (see below) that verbs belonging to one semantic class usually combine with the same or similar roles or subcategorization features in their valency frames. For some classes this criterion works reliably: in (Němcová, 2008), 18 complex semantic roles (second level) have been processed, yielding altogether 186 semantic classes that exploit the complex semantic roles. This preliminary result calls for further analysis and a comparison with the 82 main VerbaLex classes.
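The second criterion amounts to indexing verbs by the complex roles found in their frames. The sketch below illustrates the idea under loud assumptions: the `FRAME_SLOTS` records are invented toy data, not taken from VerbaLex, and `group_by_complex_role` is a hypothetical helper name. It shows only the grouping step, not how the resulting candidate classes were evaluated in the cited thesis.

```python
from collections import defaultdict

# Invented toy records: (verb, first-level role, second-level literal).
# Real VerbaLex frames would have to be parsed into this shape first.
FRAME_SLOTS = [
    ("drink", "SUBS", "beverage:1"),
    ("pour", "SUBS", "beverage:1"),
    ("ferment", "SUBS", "beverage:1"),
    ("wink", "PART", "eye:1"),
    ("rub", "PART", "eye:1"),
]


def group_by_complex_role(slots):
    """Map each (role, literal) pair to the sorted list of verbs that use it."""
    groups = defaultdict(set)
    for verb, role, literal in slots:
        groups[(role, literal)].add(verb)
    return {key: sorted(verbs) for key, verbs in groups.items()}


candidate_classes = group_by_complex_role(FRAME_SLOTS)
for (role, literal), verbs in candidate_classes.items():
    print(f"{role}({literal}): {', '.join(verbs)}")
```

Each group is only a candidate class: as the text notes, the criterion works reliably for some roles, while large groups still need manual subclassification.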

3 Main Classes

In comparison with VerbNet, which claims to contain 82 main verb classes and 395 subclasses, in VerbaLex we presently work with 82 semantic classes and 258 subclasses (6,393 verb lemmata). The question may be asked whether something like ‘main’ or ‘important’ verb classes exists. We can certainly distinguish the most general classes or groups such as the ones mentioned above in Section 1 (states, events, processes, etc.), but we need to work out consistent subclasses as well. This, in fact, means that for this purpose we need a kind of conceptual system, i.e. a sort of ontology that would allow us to establish the ‘main’ classes in a reasoned way.

3.1 Ontology for Verb Classes

If we have a look at the semantic verb classes (i.e. the starting data) as they are marked in VerbaLex, they naturally offer something that can be characterized as a ‘verb ontology’. The most difficult question is how to organize the verbs and which verbal items found in VerbaLex can be considered the main ones, i.e. the representatives of the individual classes. In VerbaLex we have classes containing, for example, items denoting social relations, activities, processes, episodes, events, etc. Present ontologies group together mostly nouns, as in WordNet or BSO (Hanks et al., 2007), Roget’s Thesaurus (Roget and Lloyd, 1982) or the SUMO/MILO ontology (Sumo/Milo, 2009). So we face a difficult problem here: we have been able to establish 82 verb classes based on the VerbaLex data (approx. 10,500 Czech verbs) but we are still looking for a way in which they can be organized. One possibility is to exploit the fact that language is anthropocentric and try to build the verb classes around the semantic type HUMAN BEING. Below we offer a preliminary list (an outline of the ontology) that may be characterized as a sort of shallow ontology based on the 82 verb classes as they appear in the VerbaLex database (the class names given in parentheses basically correspond to Levin’s classes so far):

• human activities
  – physical activities
    ∗ motion
      · body motion (body_internal_motion-49), running (run.51.3.2), driving (vehicle-51.4.1), body activities (eat-39.1, breathe-40.1.2), care about body (floss-41.2, dress-41.1)
      · putting (put-9), removing (remove-10.1), carrying (carry-11.4), pushing (push-12), holding (hold-15)
    ∗ creating (create-26.4), mixing (mix-22), building (build-26.1), preparing (preparing-26.3)
    ∗ destroying (destroy-44), cutting (cut-21), splitting (split-23), breaking (break-45.1)
  – sense perception
    ∗ seeing (see-30.1)
  – mental activities
    ∗ thinking (consider-29.9), evaluation (assessment-34), characterize (characterize-29.2), meaning (conjecture-29.5), concentration (focus-85)
• social relations
  – social activities
    ∗ cooperating (cooperate-71), helping (help-70), meeting (meet-36.3), stealing (steal-10.5), cheating (cheat-10.6), fighting (battle-36.4), killing (murder-42), defending (defend-83), giving (give-13.1), getting (get-13.5.1)
  – communication
    ∗ talking (say-37-7), writing (scribble-25.2), teaching (transfer_mesg-37.1), learning (learn-14)
  – physical contact
    ∗ touching (touch-20), hitting (hit-18.1), hugging (marry-36.2)
• states
  – existence (exist-47.1), living (lodge-46)
  – mental states
    ∗ feeling (amuse-31.1), dreaming (want-32), emotions (marvel-31.3)
• processes
  – weather (weather-57)
  – natural phenomena
    ∗ growing (grow-26.2), emission (light_emission-43.1, smell_emission-43.4, substance_emission-43.4)
  – beginning (begin-55.1), ending (stop-55.2)
• animals
  – (calve-28, animal sounds-38)

This outline relies on the classes that already exist in VerbaLex, i.e. it is based on real Czech data. The question of how to organize the particular items in the proposed verb ontology cannot yet be answered completely; the outline is rather an attempt to show how further analysis could proceed.
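Because the outline is a shallow tree, it can be stored as a nested mapping and flattened into paths, which makes it easy to ask where a given class sits. The sketch below encodes only two branches of the outline above. The nested-dict representation and the function name `class_paths` are illustrative assumptions, not the way VerbaLex stores its classes.

```python
# Two branches of the outline as a nested dict. Inner nodes are dicts,
# leaves are lists of class identifiers copied from the outline.
ONTOLOGY = {
    "states": {
        "existence": ["exist-47.1"],
        "living": ["lodge-46"],
        "mental states": {
            "feeling": ["amuse-31.1"],
            "dreaming": ["want-32"],
            "emotions": ["marvel-31.3"],
        },
    },
    "processes": {
        "weather": ["weather-57"],
        "natural phenomena": {"growing": ["grow-26.2"]},
    },
}


def class_paths(node, prefix=()):
    """Yield (path, class identifier) for every identifier in the tree."""
    if isinstance(node, dict):
        for label, child in node.items():
            yield from class_paths(child, prefix + (label,))
    else:
        for class_id in node:
            yield prefix, class_id


path_of = {class_id: path for path, class_id in class_paths(ONTOLOGY)}
print(" > ".join(path_of["marvel-31.3"]))
```

A flat path index of this kind could also be used to check that every one of the 82 classes has been placed somewhere in the outline.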

4 Semantic Roles and Verb Classes in VerbaLex

It can be observed that there is an obvious relation between the semantic roles and subcategorization features occurring in the VerbaLex frames and the verbs that share the same roles and features (Hlaváčková, 2007). In fact, some of them can serve as criteria for establishing particular classes. For example, the complex role SUBSTANCE(beverage:1) gives a reasonably consistent class (62 verbs) including verbs of drinking (drink), of preparing beverages (e.g. ferment, boil) and of handling them (draw, pour, etc.). Similarly, the complex role ENTITY(bird:1) yields 45 verbs related consistently to birds. The role PART(eye:1) provides a class of 79 verbs related to eyes (wink, rub and others). As can be expected, some complex roles provide larger groups of verbs; for instance, the role COMMUNICATION(communication:2) is related to a group of 340 verbs, which have to be further subclassified, yielding subtypes of communication.

We can expect that some subcategorization features will serve as semantic complements of the verbs belonging to the same semantic class. For example, we can expect the subcategorization features person/machine – object/location – instrument/substance/machine to appear in valency frames for verbs of cleaning (someone cleans something with something: Mother cleans the house., He cleans the surface with a damp cloth., She is cleaning the room with the carpet sweeper.). These relations have been investigated for Czech verbs in VerbaLex (Němcová, 2008, see also above).

However, many verbs display implicit (default) subcategorization features which are not included in their valency frames. For example, verbs like mést/sweep, vysávat/hoover, umývat/wash occur quite regularly in contexts like Use window coverings you can wash or clean easily... (found in BNC), in which the subcategorization feature instrument/substance/machine is not present. Thus we have to deal with the fact that some verbs do not require overtly expressed semantic complements. In spite of this, humans understand their meaning without any problems. The question arises whether in these cases we should not include the default subcategorization features in the frames as well (as is partially done in FrameNet, cf. Baker, Fillmore and Lowe, 1998).

Our roles and subcategorization features are clearly related to semantic roles, types and lexical sets as mentioned in (Hanks et al., 2007). In the Pattern Dictionary (Hanks and Pustejovsky, 2005) a distinction is made between a semantic type and a semantic role: “The semantic type is an intrinsic attribute of a noun, while a semantic role has the attribute thrust upon it by the context.” This definition helps us to understand the problem itself, but if we are looking for an algorithm that would allow us to recognize semantic types, roles and lexical sets in corpus texts, it is only a beginning. Perhaps lexical sets could help us here. According to (Hanks et al., 2007) a lexical set is a paradigmatic cluster of words that activate the same sense of a verb and that have something in common semantically. Deciding what counts as “the same” sense of a verb and therefore what constitutes a member of a relevant lexical set cannot be done by rote procedure.

Hanks is right about the ‘rote’ procedure, but to be able to process verb senses formally some procedure is needed anyway. How can it be constructed? It is obvious that it has to be based on corpus data. As Hanks shows, for instance, the verb wear is associated with the following lexical set: [[Garment]] suit, dress, hat, clothes, uniform, jeans, glove, jacket, trousers, helmet, shirt, coat, t-shirt, shoe, gown, sweater, outfit, boot, apron, scarf, tie, bra, pyjamas, stockings.

However, when we observe the behaviour of verbs such as vařit/cook in the corpus, we find that the most frequent subject of vařit/cook is not a cook, as one could expect, but mom, mum, mother, woman, grandmother, etc. So the question arises whether these nouns constitute the lexical set for this verb. If we follow the line of Hanks’ thinking the answer should be negative: it does not follow from the sense of vařit/cook that the semantic type cook determines the nouns mom, mum, mother, woman, grandmother as belonging to the respective lexical set. One possible solution is to try to obtain the relevant data from word sketches (Kilgarriff et al., 2004), which contain the collocational information we are looking for. Hanks’ solution is the Pattern Dictionary. Further research is necessary to solve this problem.
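The corpus observation about vařit/cook can be reproduced in miniature by counting collocates per grammatical relation. The sketch below is a deliberately simplified stand-in for a word sketch: the `TRIPLES` list is invented toy data, `top_collocates` is a hypothetical helper, and raw frequency is used where a real word sketch would rank collocates by a salience score.

```python
from collections import Counter

# Invented (verb, grammatical relation, collocate) triples standing in
# for dependency data extracted from a parsed corpus.
TRIPLES = [
    ("cook", "subject", "mother"),
    ("cook", "subject", "mother"),
    ("cook", "subject", "grandmother"),
    ("cook", "subject", "cook"),
    ("cook", "object", "soup"),
    ("wear", "object", "hat"),
]


def top_collocates(triples, verb, relation, n=3):
    """Return the n most frequent collocates of a verb in one relation."""
    counts = Counter(
        collocate
        for v, rel, collocate in triples
        if v == verb and rel == relation
    )
    return counts.most_common(n)


for collocate, frequency in top_collocates(TRIPLES, "cook", "subject"):
    print(collocate, frequency)
```

A frequency list of this kind shows which nouns actually occur with a verb. Deciding whether they form a lexical set in Hanks’ sense still requires a judgement about their shared semantic type, which is exactly the open problem described above.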

5 Czech Verb Classes in Princeton WordNet?

At present 8,844 Czech verb synsets have been linked to their English equivalents in Princeton WordNet v.2 through ILI, and further links are being added. In this way the semantic classes developed for Czech can be associated with the verbs in PWN, yielding new English classes which will differ from the VerbNet classes. Moreover, English verbs can be accompanied by the valency frames originally developed for Czech. In (Horák and Pala, 2008) we show that ‘Czech’ valency frames can be exploited for Bulgarian and Romanian as well as for English. The results obtained so far in the experiments with Bulgarian and Romanian are more than promising. In our view this approach can be extended to verb classes as well.

214

Dana Hlav´ aˇckov´ a et al.

5.1 Unlinkables (Lexical Gaps)

When linking Czech verbs to Princeton WordNet we have regularly encountered the problem of non-existent translation equivalents or, as we call them, ‘unlinkables’. This issue first appeared during the building of the EuroWordNet database, where the term ‘lexical gap’ was used (see e.g. Pala and Ševeček, 1999), and came up again within the Balkanet project (Balkanet, 2004), in which more attention was paid to this issue and virtual nodes were proposed as a solution.

What we are facing now with regard to Czech and English verbs is even more disturbing: consider the simple fact that Princeton WordNet now contains approx. 13,000 verbs. Our starting resource for VerbaLex, named Brief (Pala et al., 1997), contains at the moment approx. 16,000 Czech verbs, and the full list of Czech verbs includes approx. 35,000 items. Would it mean that we have to deal with more than 15,000 ‘unlinkable’ verbs? Hopefully not, but in our view one cause of this discrepancy is the different lexicographical treatment of prefixed verbs in Czech and phrasal verbs in English. In VerbaLex we found approx. 1,500 (15 %) ‘unlinkables’, and when we tried to look for them in a standard Czech-English dictionary we could not find their lexicalized translation equivalents. Looking at the list of the ‘unlinkables’ we come to the following conclusions:

• One set of the unlinkables is related to the aspectual pairs that are standard for Czech verbs: aspect is a grammatical category and practically every Czech verb must be either imperfective or perfective. It then happens that one member of the Czech aspect pair has a lexicalized translation equivalent and the other does not. For instance, the Czech imperfective verb brát has to take as its standard translation equivalent, but the perfective member of the aspect pair, vzít, cannot have take as its straightforward translation equivalent since it denotes an action that is finished. The number of aspect verb pairs in Czech in any case exceeds 10,000 items.
• A second set of unlinkables consists of verbs with prefixes and especially with double prefixes. For example, the Czech verb vy-skočit has to jump up as its translation equivalent, but the verb po-vy-skočit does not have a lexicalized translation equivalent in English and has to be translated by a sort of description, i.e. to jump up a little. The number of such double-prefixed verbs in Czech is certainly higher than 1,000.

The issue of lexical gaps or unlinkables is a separate, long-term task and we can hardly pay more attention to it in the present paper.

5.2 Czech Verb Classes in English and Russian

From what we said above with regard to English, the question naturally arises whether verb classes developed for Czech can be transferred to other languages as well. Since we work with Czech it is natural to ask this question about other Slavonic languages, Russian in particular. The easiest way to do it is to use the interlingual index (ILI) as it exists for WordNets. In this way the information existing in Czech WordNet can be relatively easily projected to English or Russian. In the worst case we can try to find the respective translation equivalents manually. We have tried to do this for the class containing 17 basic (Czech) verbs denoting actions in which INSTRUMENT(ball:1) is the complex semantic role. Table 1 shows that the classes obtained in this way for the three languages are consistent and that this technique can be applied to other verb classes as well.

Czech | English | Russian
centrovat:2/zacentrovat:1 | center:3, centre:1 | передавать мяч в центр
chňapat:n1/chňapnout:n1 | snatch:2 | хватать/схватить
lapat:n1/lapit:n4/lapnout:n1 | catch:5, capture:3 | поймать, схватить, хватать
kličkovat:n1, prokličkovat | dodge:1, dodge:2 | обводить/обвести
kopat:n1/kopnout:n1 | kick:3, kick:2 | ударять/ударить, бить
nahrát:n2/nahrávat:n2 | pass:21 | пасовать, делать передачу, передавать мяч
přihrát:1/přihrávat:1 | pass:21 | пасовать, делать передачу, передавать мяч
rozehrát:1/rozehrávat:1 | start passing:21 | разыграть/разыгрывать
přehazovat:n6/přehodit:n6 | throw over | перебрасывать
přehrát:n6/přehrávat:n6 | outplay:1 | переиграть, обыграть
přiklepávat:n4/přiklepnout:n4 | pass lightly, softly | делать мягкую передачу
přiťukávat:n2/přiťuknout:n2 | knock | делать точную передачу
smečovat:1/zasmečovat | smash:1, boom:2 | гасить (в теннисе)
střelit:n11/střílet:n11 | shoot:3 | ударить (по воротам), наносить удар
vypálit:n16/pálit | fire the ball | ударить (по воротам), наносить удар
vystřelit:n8 | volley:3 | ударить
vstřelit:1 | score:1, hit:13 | забить гол
vsítit:1 | score:1, hit:13 | вогнать мяч в сетку

Table 1: The verb classes based on the selectional preference INS(ball:1) in Czech, English and Russian
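The projection through ILI, together with the manual fallback for unlinkables, can be sketched as two dictionary lookups. Everything in the example is an assumption made for illustration: the ILI keys, the class labels and the three small mappings are invented, and `project_classes` is a hypothetical name. Only the verbs kopat, vyskočit and povyskočit and their English glosses come from the text.

```python
# Invented toy data: Czech verb -> class label, Czech verb -> ILI key,
# ILI key -> English literals. Real ILI records look different.
CZECH_CLASS = {
    "kopat": "ball-actions",
    "vyskočit": "jumping",
    "povyskočit": "jumping",
}
CZECH_TO_ILI = {"kopat": "ili-1", "vyskočit": "ili-2"}  # povyskočit has no link
ILI_TO_ENGLISH = {"ili-1": ["kick"], "ili-2": ["jump up"]}


def project_classes(czech_class, czech_to_ili, ili_to_english):
    """Carry class labels over ILI; collect verbs that cannot be linked."""
    projected = {}
    unlinked = []
    for verb, label in czech_class.items():
        ili_key = czech_to_ili.get(verb)
        if ili_key is None or ili_key not in ili_to_english:
            # No equivalent reachable through ILI: leave for manual translation.
            unlinked.append(verb)
            continue
        for literal in ili_to_english[ili_key]:
            projected.setdefault(literal, set()).add(label)
    return projected, sorted(unlinked)


projected, unlinked = project_classes(CZECH_CLASS, CZECH_TO_ILI, ILI_TO_ENGLISH)
print(projected)
print("needs manual translation:", unlinked)
```

Returning the unlinked verbs separately keeps the two cases from Section 5 apart: verbs whose class can be projected automatically, and unlinkables for which an equivalent has to be supplied by hand.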

6 Conclusions

The following results have been obtained: there is a lexical database VerbaLex with 8,844 Czech synsets linked to PWN 2.0, 21,193 literals, 10,482 lemmata, 19,556 frames, 82 main semantic classes and 258 subclasses. The frames and semantic classes can be used not only for Czech, for which they have been specifically developed, but also for other languages. The valency frames will have to be tested in practical applications, but it is already clear that they can serve well in syntactic analysis and also for searching in free texts coming from various domains. According to the first tests the coverage of the VerbaLex database is approx. 84 % for a corpus text. The semantic classes of Czech verbs represent a new result in Czech linguistics. Their testing is the next task, which has to be worked out separately in the near future.

Acknowledgements

This work has been partly supported by the Academy of Sciences of the Czech Republic under the projects LC526 and NPV II 2C06009, and by the Czech Grant Agency under the project GA201/05/2781.

References

Collin F. Baker, Charles J. Fillmore, and John B. Lowe (1998), The Berkeley FrameNet Project, in Proceedings of the COLING-ACL, pp. 86–90, Montreal, Canada.
BalkaNet (2004), Balkanet project, http://www.ceid.upatras.gr/Balkanet.
František Daneš and Zdeněk Hlavsa, editors (1987), Větné vzorce v češtině (Sentence Patterns in Czech), volume 23 of Studie a práce lingvistické, Academia, Praha.
EuroWordNet (1999), EuroWordNet project, http://www.illc.uva.nl/EuroWordNet/.
Patrick Hanks and Elisabetta Jezek (2008), Shimmering Lexical Sets, in Euralex XIII 2008 Proceedings, Pompeu Fabra University, Barcelona.
Patrick Hanks, Karel Pala, and Pavel Rychlý (May 10–11 2007), Towards an empirically well-founded ontology for NLP, in Proceedings of GL 2007, Fourth International Workshop on Generative Approaches to the Lexicon, Paris.
Patrick Hanks and James Pustejovsky (2005), A Pattern Dictionary for Natural Language Processing, Revue Française de linguistique appliquée, 10(2):63–82.
Dana Hlaváčková (2007), The Relations between Semantic Roles and Semantic Classes in VerbaLex, in Recent Advances in Slavonic Natural Language Processing RASLAN 2007, pp. 97–101, Masarykova univerzita, Brno.
Dana Hlaváčková and Aleš Horák (2006), VerbaLex – New Comprehensive Lexicon of Verb Valencies for Czech, in Computer Treatment of Slavic and East European Languages, pp. 107–115, Slovenský národný korpus, Bratislava, Slovakia.
Dana Hlaváčková, Aleš Horák, and Vladimír Kadlec (2006), Exploitation of the VerbaLex Verb Valency Lexicon in the Syntactic Analysis of Czech, 2006(4188):79–86.
Adam Kilgarriff, Pavel Rychlý, Pavel Smrž, and David Tugwell (2004), The Sketch Engine, in Proceedings of the Eleventh EURALEX International Congress, pp. 105–116, Universite de Bretagne-Sud, Lorient, France.
Karin Kipper, Anna Korhonen, Neville Ryant, and Martha Palmer (June, 2006), Extending VerbNet with Novel Verb Classes, in Fifth International Conference on Language Resources and Evaluation (LREC 2006), Genoa, Italy.
Beth Levin (1993), English Verb Classes and Alternations: A Preliminary Investigation, The University of Chicago Press, Chicago.
Iva Němcová (2008), Sémantické třídy sloves, Master’s thesis, Masarykova univerzita, Brno.
Karel Pala and Aleš Horák (2008), Can Complex Valency Frames be Universal?, in Second Workshop on Recent Advances in Slavonic Natural Languages Processing, RASLAN 2008, Masaryk University, Brno.
Karel Pala and Pavel Smrž (2004), Building Czech WordNet, 1–2(7):79–88.
Karel Pala and Pavel Ševeček (1997), Valence českých sloves, in Proceedings of Works of Philosophical Faculty at the Masaryk University Brno, pp. 41–54, MU, Brno.


Karel Pala and Pavel Ševeček (1999), EuroWordNet, Final Report LE4-8283, Faculty of Informatics, Masaryk University Brno, Amsterdam.
Peter Mark Roget and Susan M. Lloyd, editors (1982), Roget’s thesaurus of English words and phrases, Longman, Harlow, Essex.
VerbNet (2006), VerbNet, a class-based verb lexicon, project website, http://verbs.colorado.edu/~mpalmer/projects/verbnet.html.
Zdeněk Žabokrtský (2005), Valency Lexicon of Czech Verbs, Ph.D. thesis, Charles University, Prague, Czech Republic.
