XQVT: A Visual XQuery Language




Brendan Lucier
2004/12/11

Abstract

This paper is a research project submitted for credit in the Non-Traditional Databases course at the University of Waterloo. We present the XQuery Visualization Tool (XQVT), an XQuery interface built upon a graphical language. The XQVT language maintains all the power of XQuery. This paper describes the XQVT language and illustrates sample queries. A prototype language implementation and user interface are proposed.

1

Introduction

Over the last decade XML has emerged as a standard way to share data between organizations. With this growth comes the need for tools to assist in extracting data from XML documents. Most XML query operations have been performed using XQuery [8] since it was accepted for standardization by the W3C [17]. However, XQuery is a textual language that is written to be easily understood by software developers and relational database programmers. For example, the FLWOR expression is modeled after the Select-From-Where expression from SQL, and the syntax of a FLWOR expression is reminiscent of for loops and of functions returning values. Unfortunately, not all users of XQuery are experienced programmers. These users would benefit from an alternative language for expressing XML queries.

To this end, this paper proposes the XQuery Visualization Tool (XQVT, pronounced Excavate), an interface based upon a new graphical XQuery language. This language allows XML queries to be represented visually in an intuitive way. The XQVT language is built on top of XQuery, in that queries are run by first translating them into equivalent XQuery code. Most importantly, XQVT captures all of the computational power of XQuery. It is this last point that distinguishes XQVT from previous visual XQuery languages and makes XQVT a viable tool for experts as well as novices.

The visual XQVT language can be applied to a variety of tasks. It can be used as a learning tool for novice users, since simple queries can be represented intuitively without written code. It can be used as a visualization tool, providing an easy way to describe a complex XQuery fragment. It can be used for debugging, allowing a developer to isolate and independently run subparts of a larger query. Finally, since XQVT maintains all the computational power of XQuery, it can be used as an alternative language for developing queries of arbitrary complexity.

1.1

Design Principles

The XQVT language is designed to translate into XQuery. Since being accepted as a W3C standard, XQuery has become the dominant language for querying XML data. Numerous XQuery implementations exist, as well as interfaces between XQuery and relational databases. XQVT can make use of these XQuery tools through its translation into XQuery code. The XQVT language can be considered an extension of GXQL, which we shall describe in Section 2. The language was designed to achieve the following goals:

1. Power. The language should have the same computational power as XQuery.
2. Composability. Any query can be used as part of a larger query.
3. Scalability. Conceptually simple queries should be easy to represent.
4. Distinctiveness. The XQVT language should be more than just a visual representation of XQuery.
5. Naturality. The language should be close enough in paradigm to XQuery that an XQuery programmer will be comfortable working with it. Learning XQVT should be a useful step in learning XQuery.

1.2

Overview

Section 2 describes existing query languages and explains the influences they have had on XQVT. In Section 3, the basic syntax of XQVT is described through a series of sample queries. Section 4 supplements the sample queries by describing the semantics of XQVT more formally. A brief outline of a proposed algorithm to translate from XQVT to XQuery is given in Section 5. Section 6 substantiates the claim that XQVT has all the computational power of XQuery (but due to space constraints, the full proofs in Section 6 have been omitted). Our prototype interface for XQVT is described and motivated in Section 7. Finally, Section 8 describes future avenues of research for implementation and optimization of XQVT.

2

Related Work

There have been numerous graphical XML query languages proposed over the last five years. These can be loosely grouped into two categories based upon the way that they represent the structure of XML data: as trees or as nested windows. Despite this superficial difference, existing graphical XML query languages are similar in a number of respects, as will be discussed below.

2.1

Visual Query Languages

Languages that represent XML documents as trees evolved from graph query languages. The first widely known graph query language was G [5], which expressed queries over a labeled directed graph. A query in G is represented by a series of subgraphs (with various label rules) to search for; the query engine then finds all instances of those subgraphs in the graph. The purpose of G was to make it easier to create recursive queries for a relational database. The subsequent languages G+ [7] and then GraphLog [4] include user interfaces and various options for returning data.

The G-Log language [15] developed soon after. G-Log expresses queries as collections of rules, which are drawn as graphs just as in G. G-Log distinguishes between source rules and target rules; the source patterns must be matched in the database, and the target patterns will then be returned as output. This duality of source and target in G-Log heavily influenced subsequent graphical query languages. A variant of G-Log, WG-Log [2], was presented to perform web-based queries. WG-Log extends G-Log by adding notation specific to hypertext queries, such as homepages, links, and so on.

The first tree-based visual XML query language was XML-GL [3], which evolved from G-Log and WG-Log. An XML-GL query is drawn in two parts: the left-hand side selects information from an XML document, and the right-hand side describes how that information should be returned as output. The two sides are drawn as acyclic directed graphs, representing the tree structures in an XML document. Nodes from the left side are explicitly connected to nodes on the right side to describe what values should be returned as output. XML-GL is a versatile language, having almost as much representational power as XQuery (and contrary to claims in [9], XML-GL does indeed support sorting).

A problem with XML-GL was that novice users found it difficult to use. The graph languages upon which XML-GL was built are highly theoretical, designed to fulfill formal logic properties rather than to be user-friendly. Various query languages were therefore developed to offer limited query capabilities but do so in a user-friendly manner. These languages tend to use nested windows or tables to represent data structures; these are logically equivalent to the trees in XML-GL, but more concretely represent XML data.


Querying Semistructured Data By Example (QSByE) [11] uses nested tables and windows to represent Web queries. EquiX [1] provides a very simple interface that uses forms and an indentation scheme to describe queries. EquiX displays queries in a natural language format that would appeal to unskilled users, but it is severely limited in expressive power (it does not allow output formatting, for example). BBQ [14] (Blended Browsing and Querying) is a more powerful query tool that represents queries as a directory-tree structure (resembling Microsoft Windows Explorer). BBQ provides document browsing and querying capabilities through an intuitive user interface, but does not allow all queries representable in XML-GL.

Xing [10] is a recent visual language that is as powerful as XML-GL but uses a nested windows representation. The interface for Xing is influenced by database query languages like Doodle [6], which represent relational database relations with nested boxes. Xing queries have a selection part and construction part, like XML-GL, but structural relationships are visualized by drawing elements inside other elements, like Doodle. Like XML-GL, however, Xing requires a significant amount of special notation to represent different kinds of relationships between elements.

2.2

Visual XQuery Languages

There are two recent visual query languages based on XQuery: XQuery By Example (XQBE) [9] and the Graphical XQuery Language (GXQL) [13]. These languages are designed to translate directly into XQuery, and hence use existing XQuery implementations to execute queries. The syntax of XQBE is based heavily upon XML-GL. In particular, XQBE represents queries as tree graphs, and represents all parts of a query (including predicates, arithmetic operations, etc.) visually. On the other hand, GXQL uses a nested-windows approach like Xing or Doodle and allows certain non-visual entities (like predicates) to be expressed as text.

Despite their superficial differences, XQBE and GXQL are similar as languages. Both express queries in two steps: a selection followed by a construction. Given an input document, the query locates all XML subtrees that match the patterns dictated in the selection part. The matching information is passed to the construction part and formatted for output. This two-step paradigm makes these languages similar to XML-GL. However, the expressive power of XQBE and GXQL is significantly less than that of XML-GL. Since XQBE and GXQL were designed to be user-friendly XQuery interfaces, they do not have many of the complex features of Xing or XML-GL.

2.3

XQVT Contribution

The XQVT language is best described as an extension of GXQL. XQVT adds more power to the GXQL language by allowing a query to be composed of more than two components. Recall that XML-GL (and all visual XML query languages thereafter) has only two parts: a selection and a construction. XQVT generalizes this concept to allow users to have many selection and construction components in a query.

There are numerous advantages to this extension. Complex queries that are not possible with GXQL are simple to perform with XQVT. At the same time, simple queries are easily representable in XQVT, since they correspond to fewer components. XQBE and GXQL use many different symbols or drawing styles to achieve greater functionality. By contrast, XQVT uses its composability to generate complex queries and therefore has little notation to learn. Finally, the XQVT paradigm involves data flowing between components; an XQVT query thus loosely follows a pipeline model, which we feel many users will be comfortable with.

3

Syntax

We shall now describe the visual language that is the core of XQVT. First some basic concepts will be explained. The details of XQVT syntax will then be described through example queries.

Figure 1: The graphical symbols used in XQVT (Singleton Element, Sequence Element, Descendant, Connection, Selection Component, Construction Component, Atomic Value, Sequence of Values, Document).

3.1

Basic Concepts

An XQVT query is drawn upon a single canvas. The query is made up of subparts called components. Each component is represented by a rounded rectangle. These components are connected by bold arrows referred to as connections. The connections can be thought of as carrying data elements between components.

Each component performs operations upon data items. The operations are described by data nodes and relationships drawn inside the component. Elements are denoted by rectangles, and represent XML nodes. A rectangle can be shadowed to denote a sequence of nodes. Atomic data values are represented by ovals.

Element relationships are described in one of two ways. First, structural relationships between nodes in a document (e.g. child, parent) are described by nesting elements inside each other. This syntax is based heavily on that of GXQL [13]. Second, bold arrows are used to indicate that certain elements should be passed to a function or to other components. These bold arrows are equivalent to connections. The first connections retrieve data from input documents, represented by bold rectangles.

There are two types of components. Selection Components form output by only accepting input data that satisfies certain criteria. Construction Components form output by joining input streams together into some structure. Construction components are drawn in bold. Note that a component can contain other components. In particular, a full query is itself a component. Any XQVT query can be used in a larger query.
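
One way to picture the component model is as functions over streams of XML nodes. The Python sketch below is a loose analogy and not part of XQVT: a selection component is modelled as a filter, a construction component as a builder, and a connection as passing one function's output to the next. The function names, the predicate values, and the sample document are all hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document, used only for illustration.
SAMPLE = """
<bib>
  <book year="1994"><title>TCP/IP Illustrated</title><publisher>Addison-Wesley</publisher></book>
  <book year="1990"><title>Old Book</title><publisher>Addison-Wesley</publisher></book>
  <book year="2000"><title>Data on the Web</title><publisher>Morgan Kaufmann</publisher></book>
</bib>
"""

def select_books(root, publisher, after_year):
    """Selection component: pass on only the book nodes that satisfy the predicates."""
    return [
        book for book in root.findall("book")
        if book.findtext("publisher") == publisher
        and int(book.get("year")) > after_year
    ]

def construct_results(books):
    """Construction component: wrap the incoming books in a new structure."""
    results = ET.Element("results")
    for book in books:
        out = ET.SubElement(results, "book", year=book.get("year"))
        ET.SubElement(out, "title").text = book.findtext("title")
    return results

# A connection carries the output of one component to the next.
root = ET.fromstring(SAMPLE)
result = construct_results(select_books(root, "Addison-Wesley", 1991))
print(ET.tostring(result, encoding="unicode"))
```

Because each component is an ordinary function from inputs to outputs, the composed pair could itself be wrapped in a function and reused, which is the sense in which a full query is itself a component.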

3.2

Examples of Use

We shall now present sample queries and show how they are represented in XQVT. This will demonstrate the syntax of XQVT and introduce the XQVT semantics. The language semantics will be described in more detail in Section 4. Most of these queries are taken from the W3C XQuery Use Cases [16]. The Use Case sections and query numbers are given when appropriate.

It is assumed that the reader is familiar with an existing graphical XQuery language, such as XQBE or GXQL. Since XQVT handles simple queries similarly to these other languages, we will not give many examples of simple queries. The examples given here are meant to demonstrate the new features of XQVT that extend beyond previous languages.

Query 1: XMP Q1 – List books published by Addison-Wesley after 1991, including their year and title.

This is a simple query that was given as an example for GXQL. The XQVT representation is shown in Figure 2(a). The XQVT syntax is nearly identical to that in GXQL. The selection component on the left receives an input stream consisting of all top-level nodes in the input document. We select from that stream all subtrees that match the structure described in the component. In this case, we want all top-level bib nodes that contain book nodes, which in turn contain a publisher node and a year attribute. In the nested structure the attribute year is represented by a line, whereas the child publisher is represented by a nested box, just as in GXQL. Note the predicates on publisher and year. Predicates are specified with Where statements at the bottom of the component, which refer to variables in the component. In this case, the variable $p was assigned to the publisher node, and $y to the year attribute.

Figure 2: XQVT representations of sample queries 1 (a) and 2 (b)

From the document subtrees that satisfy our conditions, we take as output the book nodes of those trees. These are precisely the books satisfying the conditions of our query. These are passed to the construction component, which generates a new structure containing the titles and years of the books. A single results node wraps all the books. The results node is returned as output.

Query 2: Make a list of all books with their titles, including the editors only if they are affiliated with CITI.

This example does not come from the W3C use cases. However, it does appear in the papers describing both XQBE and GXQL, and distinguishes XQVT from these languages. This query includes a predicate (“affiliated with CITI”) that influences only a portion of the returned values (“editors”). Both XQBE and GXQL express this query by restricting predicates to particular portions of trees. XQBE represents restriction by adding predicates to its construction component, whereas GXQL uses a colour-coding scheme.

The XQVT representation of the query is shown in Figure 2(b). XQVT expresses this query by generating the list of editors in a separate (although nested) selection component. Nested selection components allow predicates to be scoped arbitrarily. This method mirrors the typical XQuery solution, which is to generate the list of editors in a nested expression.

The XQVT solution can be interpreted as follows. For each bib entry in the document, we return each subnode book. For each book, we also return another sequence of nodes. That sequence is, for each book, the set of editors whose affiliation is "CITI". Note that the sequence of editors may be empty for a particular book. Finally, we construct a tree containing each book element with title, plus each editor in the sequence of editors. Connecting the sequence of editors to an unlabelled node simply adds each element of that sequence into the return structure without modification.

Figure 3: XQVT representations of sample queries 3 (a) and 4 (b)

Query 3: XMP Q6 – For each book that has at least one author, list the title and first two authors, and an empty et-al element if the book has additional authors.

The XQVT representation is given in Figure 3(a). This query demonstrates the use of aggregates and functions. First, we consider each book in the document. For each book, we consider the set of all authors, as denoted by the shadowed box. We perform two independent operations with this sequence of authors.

First, we use a function to select only the first two authors. Functions are represented by rounded rectangles, just like components. The function used here contains an inline condition, and is used to select a subset of our sequence. This is a convenient way to apply a filter to a sequence.

Second, we use a nested component to return a single et-al node if the sequence of authors has length greater than 2. Otherwise, the nested component returns an empty sequence. Note that we assign the variable $a to a sequence node, so the predicate Where count($a) > 2 applies to the entire sequence of authors, not each individual author.

In the construction component, we generate a single bib node that will contain all the returned books. Each book contains its title, the sequence of the first two authors, and the sequence of (zero or one) et-al nodes. In this case, it is important to connect the sequence of authors to a shadowed box in the construction component. This dictates how the et-al nodes will be placed into the structure. This is explained in more detail in Section 4.3.

Query 4: TREE Q1 – Prepare a (nested) table of contents for Book1, listing all the sections and their titles. Preserve the original attributes of each element, if any.

This query demonstrates user-defined functions and recursion. See Figure 3(b). The main selection component is labelled local:toc. This allows the component to be used as a function at other points in the query. In particular, it can be used recursively. To generate a level of our table of contents, we construct a new tree for each section. This tree will contain information about that section, plus the table of contents for all subsections. The top-level results of the function call are placed in a toc node and returned.

This sample query also demonstrates the power of using multiple construction components. Instead of describing output structure all at once, one can construct smaller structures and use them as subtrees in a larger structure.

Figure 4: XQVT representations of sample queries 5 (a) and 6 (b)

Query 5: XMP Q4 – For each author in the bibliography, list the author’s name and the titles of all books by that author, grouped inside a result element.

This is an example of a truly non-simple query. It demonstrates independent sequences and constructing sequences of new nodes. See Figure 4(a). Let us explain how this query is interpreted step by step.

We consider every author node somewhere in the document, as denoted by the double-lined element. The lists of first and last names for authors are taken as output. The built-in function distinct is used to remove duplicated names. This function behaves the same way as the XQuery function distinct(). XQVT supports many other built-in functions; a full listing is beyond the scope of this report.

The second selection component accepts the two generated sequences, containing all first names and last names of authors. The next selection component considers each pair of first name and last name. The names are not grouped by author; we are considering the cross product of all first names with all last names. For each pair of names, the list of all books with that author is generated. That list and the corresponding first and last names are returned as long as there is at least one book in the list. That is, we return only first and last name combinations that actually correspond to authors.

In the construction component, we construct an author node for each pair of first and last names. The shadowed box for author indicates that an author node will be created for each tuple of subnodes. Because the list of titles connects to a sequence node, all the titles are grouped in the same author node. If the new node were drawn as a singleton, a separate author node would be created for each book title.

Figure 5: XQVT representation of sample query 7

Query 6: XMP Q12 – Find pairs of books that have different titles but the same set of authors (possibly in a different order).

See Figure 4(b). This query demonstrates another use of user-defined functions. We wish to sort two different lists of authors by first and last names. Instead of representing this operation twice, we draw a component that performs this operation and we label it. We can then call it twice as a function to perform the desired operations.

We also use the Strict keyword on the bib node. This indicates that the subchildren of bib must occur in the order in which they are drawn. In this case, it implies the condition $book1 [...]

[...] 1991 Return {$x2} } }
let $in1 := "http://bstore1.example.com/bib.xml"
let $x1 := construct1(select1(formatinput($in1)))
for $x2 in $x1/xqvt:sequence/xqvt:item
return $x2/*

Figure 9: XQuery Translation of Sample Query 1

[...] that graph. The details of this algorithm are beyond the scope of this report, but the result is an order in which variables should be assigned.

Once we know in what order we should express the elements of the component, generating the FLWOR expression is straightforward. We generate a sequence of For and Let statements that express the output connection. For statements correspond to singleton nodes, Let statements to sequence nodes. Nested nodes are traversed with XPath expressions. Note that we can traverse up a document as well as down; we simply apply ascending XPath axes as appropriate.

Once the output connection is expressed, we must also express the variables that are used for predicates. This is done in the Where clause, using the existential expression Some $x in [expression] Satisfies [. . . ]. Note that Some might be replaced by Exists, Not Some, etc. if keywords are used. Here [expression] will contain For and Let statements that express $x. This completes the FLWOR expression; we simply use a Return statement to return each output element encapsulated in an xqvt:item tag.

There are, of course, a number of details left undiscussed, but they are beyond the scope of this report. See Figure 9 for the XQuery code generated as a translation of Query 1. The formatinput function was not discussed; it simply wraps an input file into the communication structure.
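
To make the mapping from nodes to clauses concrete, the following Python sketch emits a FLWOR expression from a flat description of a component. It is an illustration under simplifying assumptions and not the translation algorithm of this paper: the variable ordering is taken as given, predicates are passed through as text rather than wrapped in Some ... Satisfies, and the Node class and to_flwor function are hypothetical names. The xqvt:item wrapper in the return clause is also an assumption, suggested by the path $x1/xqvt:sequence/xqvt:item in Figure 9.

```python
from dataclasses import dataclass

@dataclass
class Node:
    var: str        # variable name, without the leading "$"
    path: str       # XPath expression that reaches the node
    sequence: bool  # True for a shadowed (sequence) node

def to_flwor(nodes, predicates, output_var):
    """Emit For clauses for singleton nodes and Let clauses for sequence nodes."""
    lines = []
    for node in nodes:  # nodes are assumed to be in assignment order already
        if node.sequence:
            lines.append(f"let ${node.var} := {node.path}")
        else:
            lines.append(f"for ${node.var} in {node.path}")
    if predicates:
        lines.append("where " + " and ".join(predicates))
    # Assumed wrapper element; see the lead-in paragraph.
    lines.append(f"return <xqvt:item>{{${output_var}}}</xqvt:item>")
    return "\n".join(lines)

query = to_flwor(
    nodes=[
        Node("b", "$in/bib/book", sequence=False),
        Node("a", "$b/author", sequence=True),
    ],
    predicates=["count($a) > 2"],
    output_var="b",
)
print(query)
```

The singleton book node becomes a for clause and the shadowed author node becomes a let clause, so a predicate such as count($a) > 2 sees the whole sequence of authors rather than one author at a time.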


6

Computational Power

We have claimed that XQVT maintains all of the computational power of XQuery. In this section we justify this claim by demonstrating that XQVT can represent any query that is representable in XQuery.

6.1

Turing-Completeness

It was shown in [12] that the XQuery language is Turing-complete. That is, XQuery is as powerful as a Turing machine. It turns out that XQVT is also Turing-complete. Thus, in a theoretical sense, XQVT is as powerful as XQuery. Unfortunately, the proof that XQVT is Turing-complete is beyond the scope of this report. The idea of the proof is simply to follow the proof given in [12], replacing XQuery structures with equivalent XQVT constructions. In particular, the Turing-completeness proof for XQuery relies heavily on recursive functions, which are easily implemented in XQVT.

6.2

Implementation of XQuery

In this section we show that XQVT can be used to implement XQuery. This demonstrates more directly that XQVT is as powerful a language as XQuery. Recall that the full XQuery language can be reduced to a simpler core language [18], so it is sufficient to show that the XQuery Core can be represented in XQVT.

An XQuery consists of a preamble followed by a list of query expressions to evaluate. The preamble can be expressed in XQVT through text written on the canvas. It remains to translate XQuery expressions into XQVT. We need only demonstrate how atomic expressions are represented in XQVT and then show how these expressions can be combined into larger expressions. We shall implement each expression in a given query as a separate XQVT component.

First consider base expressions. Input documents are expressed as inputs on the left side of the query. A single atomic value or new node is represented as a component containing a single element populated with the desired value. A sequence of data (e.g. { 1, 2, 3 }) is represented by a sequence element populated with the given data.

We have created translations for all the XQuery Core expressions, but we will not describe them all here due to space constraints. We shall describe the implementation for only one expression: If $c1 Then $c2 Else $c3. This expression translates into three components: the first returns $c2 if $c1 is true (using a Where directive), the second returns $c3 if $c1 is false (again using a Where directive), and the third unions the results of the first two.

The translations of the various other expressions in the XQuery Core are similar to the one given above. Certain expressions, such as For $x At $i . . . and TypeSwitch, require special built-in functions that were not discussed in this paper.
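
The three-component translation of the conditional can be mimicked in ordinary code. The Python sketch below is an analogy only: sequences are modelled as lists, a Where directive as a guard that yields the empty sequence when it fails, and all function names are hypothetical.

```python
def then_component(c1, c2):
    """Return $c2 when the Where directive $c1 holds, else the empty sequence."""
    return list(c2) if c1 else []

def else_component(c1, c3):
    """Return $c3 when the Where directive not($c1) holds, else the empty sequence."""
    return list(c3) if not c1 else []

def union_component(left, right):
    """Concatenate the two branch results; at most one of them is non-empty."""
    return left + right

def if_then_else(c1, c2, c3):
    """Wire the three components together, as in the translation described above."""
    return union_component(then_component(c1, c2), else_component(c1, c3))

print(if_then_else(True, ["a", "b"], ["c"]))
print(if_then_else(False, ["a", "b"], ["c"]))
```

Since the two guards are mutually exclusive, the union always equals the chosen branch, which is why no dedicated conditional construct is needed in this model.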

7

Uses and Interface

Up until this point we have been discussing XQVT as an abstract language, but a visual language requires an intuitive interface in order to be usable. XQVT is a tool meant to serve numerous functions. XQVT therefore requires a highly customizable interface. We now describe the potential uses for XQVT in more detail, then describe an interface that meets the requirements for those uses.

7.1

XQVT Tools

Visualization: A visualization tool is meant to assist in the description of XQuery code. It is assumed that the author already knows what her query will do; she simply wishes to represent it. This is done either before or after the XQuery code is written. If before, the user will want to sketch out an XQVT query quickly. Editing power will be paramount; the interface should be nearly as easy to use as a pencil and eraser. If the XQVT query is required after the fact, the fundamental process is to convert XQuery to XQVT and display the results. The user should be able to edit the sizes and positions of the various icons to make the interface visually appealing. All relevant information for a query should be displayed on the canvas, so the entire query can be seen at a glance.

Programming: A programming tool assists in developing a new query. The emphasis is on making the full power of XQVT available at all times. The user should be allowed to draw any language element manually. There should also be the possibility to automatically perform common constructions (connecting two elements, adding a child node, etc.).

Learning Tool: XQVT can be used as a learning tool for a novice user. A new user may find the visual XQVT representation more intuitive than XQuery code. However, XQVT is a powerful language with options for complex queries. Thus, when used as a learning tool, XQVT should have many features disabled until the user is ready to use them. A query could be limited to two components, for example, to approximate the use of GXQL. Also, it should be possible to convert from XQVT to XQuery and back again, so the user can make modifications in one language and see how they are represented in the other.

Debugging: XQVT lends itself well to debugging, since it represents a query as components with information flowing between those components. A user can run a query while choosing to peek at certain connections. The XML data returned via those connections would then be displayed. The user may also wish to disable some parts of the XQVT query, similar to commenting in XQuery.
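
The debugging use can be illustrated with a small Python sketch in which a query is a list of named components and the user chooses which connections to inspect. This is a hypothetical model of the behaviour described above, not the proposed interface: run_pipeline, its peek and disabled parameters, and the two sample components are invented for illustration, and a disabled component is assumed to pass its input through unchanged.

```python
def run_pipeline(components, data, peek=(), disabled=()):
    """Run components in order, recording the data on each peeked connection.

    components: list of (name, function) pairs; each function maps a list to a list.
    peek:       names of components whose output connection should be recorded.
    disabled:   names of components to skip, passing their input through unchanged.
    """
    observed = {}
    for name, func in components:
        if name not in disabled:
            data = func(data)
        if name in peek:
            observed[name] = list(data)
    return data, observed

components = [
    ("select_recent", lambda years: [y for y in years if y > 1991]),
    ("construct_labels", lambda years: [f"book-{y}" for y in years]),
]

# Peek at the connection leaving the selection component.
result, observed = run_pipeline(components, [1990, 1994, 2000], peek={"select_recent"})
print(result)
print(observed)

# Disable the selection component, much like commenting out part of a query.
unfiltered, _ = run_pipeline(components, [1990, 1994, 2000], disabled={"select_recent"})
print(unfiltered)
```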

7.2

Interface Prototype

We now describe a prototype interface for XQVT. This interface is meant to handle the requirements of each of the uses presented above. Unfortunately this interface has not yet been implemented, so no screenshots are available. The interface is dominated by the drawing canvas. A menu allows such operations as: saving and loading XQVT files, importing modules (XQVT canvases containing only function components), declaring namespaces/version/etc., and executing a query. Initially the canvas is blank, but the user can load a query template. A template might include an input document, a selection component, a construction component, and some blank nodes connected between them. This would help a noice user to get started. New query entities (elements, components, etc.) can be created in two ways: manually, using a toolbar of drawing tools; or automatically, using context-sensitive menus. The toolbar is a collection of buttons, modeled after image editing programs. Each button represents a different drawing option, such as Pointer (select an object), Selection (drag to select a group of objects), Connection, Node, Component, etc. This allows a query to be drawn manually. Note that textual information, such as predicates and node names, cannot be drawn onto the canvas using the toolbar. Note also that it is possible to draw illegal configurations, such as a connection that is not connected to an element. Such errors are always ignored until the user attempts to run or verify the query. A menu option determines behaviour when a query runs: bad elements are either ignored and greyed out or generate error messages. A context-sensitive menu allows easy generation of common elements. By right-clicking on a node, one is presented the options of generating a connection, adding a child or parent construction, applying a keyword (Strict, Not, or All), etc. 
Right-clicking on a component allows switching between selection and construction modes, adding a title, adding a component command, and so on.

There is also a context-sensitive properties window that supports viewing and editing textual information about the selected element. This window is docked at the right-hand side of the screen. When a node is selected, the window has fields that display the node's name and variable assignment. This provides a simple way to modify this information. The properties window also attempts to display the data type and path of the node if this information is available. Other appropriate data is displayed if connections or components are selected.

We allow elements to be disabled on the canvas. They are drawn in a light grey colour and are meant to be ignored. This is useful for debugging purposes, allowing certain components to be deactivated. It is also useful for novice users. The "expand" command of a node uses a loaded schema to find a list of all children for that node. The node will automatically expand and all children will be added to the canvas as subnodes. These new nodes will all start out disabled. The user can then select which nodes to use and activate them. This behaviour is similar to that of GXQL, which draws elements in light blue until they are selected to be part of a query.

The user is allowed to draw arbitrary comments on the canvas. These are drawn in green to make it clear that they are comments. Other textual information, such as predicates, namespace definitions, and static types, is specified using menus. The interface might not draw all of this information (e.g. static types for all connections) on the canvas; this behaviour is customizable.
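Since the interface is unimplemented, the following is only a minimal sketch of how two of the behaviours above might be modelled: the "expand" command, which adds schema children as disabled subnodes, and the deferred check for illegal connections with its two run-time policies. Every name here (`Node`, `Connection`, `expand`, `check`, the `policy` values) is hypothetical, and the schema is assumed to be a plain mapping from an element name to its child names rather than a real XML Schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    disabled: bool = False  # disabled elements are drawn grey and ignored
    children: list[Node] = field(default_factory=list)


@dataclass
class Connection:
    source: Optional[Node]  # None models an end that was drawn but left unattached
    target: Optional[Node]
    disabled: bool = False


def expand(node: Node, schema: dict[str, list[str]]) -> list[Node]:
    """Add every child the (assumed) schema lists for `node` as a disabled subnode."""
    added = [Node(child, disabled=True) for child in schema.get(node.name, [])]
    node.children.extend(added)
    return added


def check(connections: list[Connection], policy: str = "error") -> list[str]:
    """Deferred validation, intended to run only on 'run' or 'verify'.

    policy="error":  return one message per illegal connection.
    policy="ignore": grey out (disable) each illegal connection instead.
    """
    messages = []
    for index, conn in enumerate(connections):
        if conn.disabled:
            continue  # already deactivated by the user or an earlier pass
        if conn.source is None or conn.target is None:
            if policy == "ignore":
                conn.disabled = True
            else:
                messages.append(f"connection {index}: not connected to an element")
    return messages


# Hypothetical usage: expand a "book" node, then activate one of its children.
schema = {"book": ["title", "author", "price"]}
book = Node("book")
new_nodes = expand(book, schema)
new_nodes[0].disabled = False  # the user activates "title"
active_names = [n.name for n in book.children if not n.disabled]
```

Keeping validation out of the drawing operations mirrors the description above, in which illegal configurations are tolerated on the canvas and only surface when the query is run or verified.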

8 Future Work

This paper has proposed a new language, algorithms for translation, and an interface, but implementation is not yet complete. In this section we describe what must be done to complete a usable implementation of XQVT. The steps outlined in this section would take approximately a year of development.

8.1 Implementation

In this paper we described a multi-purpose interface for XQVT. This interface has not yet been implemented. Thus, before XQVT can be evaluated, the interface must be programmed and made available. We propose to implement the interface in two stages. A preliminary interface will be created first, capable of designing, parsing, and translating XQVT queries. This interface would then be expanded to include all the features described in Section 7. These include automatic child generation based on schemas, debugging features, and various niceties of an intuitive graphical user interface. The implementation would also involve a rudimentary XQuery-to-XQVT conversion process.

8.2 Translation Improvements

We presented algorithms to translate from XQVT to XQuery and back again. However, these algorithms are not acceptable for real-world applications. In both directions, the queries generated are overly complex, making them inefficient and unreadable. It is necessary to develop optimization strategies that create more natural translations. We imagine that these strategies would involve combining the generated functions and components in appropriate cases. In any case, these improvements are an important step in creating a usable implementation of XQVT.

9 Conclusion

In this paper we presented XQVT, a visual language for representing XML queries. This language is built upon XQuery and is heavily influenced by XQuery semantics. Unlike other visual query languages, XQVT captures all of the expressive power of XQuery, making it a viable tool even for complex queries. We believe that XQVT can find uses in a number of tools to support XQuery. The XQVT language lends itself well to visualization, modification, and debugging tasks. It is also a viable option for introducing new users to XQuery, since simple queries are particularly easy to represent in XQVT. In addition, XQVT has some semantic features that make it an interesting language in its own right; XQuery developers may find XQVT a useful alternative for programming certain types of queries.

References

[1] S. Cohen, Y. Kanza, Y. A. Kogan, W. Nutt, Y. Sagiv, and A. Serebrenik. EquiX – a search and query language for XML. In J. of the Am. Soc. for Info. Sci. and Tech., 53(6):454–466, May 2002.
[2] S. Comai, E. Damiani, R. Posenato, and L. Tanca. A schema based approach to modeling and querying www data. In FQAS'98, May 1998.
[3] S. Comai, E. Damiani, and P. Fraternali. Computing Graphical Queries over XML Data. ACM TOIS, 19(4):371–430, 2001.
[4] M. P. Consens and A. O. Mendelzon. The G+/GraphLog Visual Query System. In Proc. ACM SIGMOD, 388, 1990.
[5] I. F. Cruz, A. O. Mendelzon, and P. T. Wood. A graphical query language supporting recursion. In Proc. ACM SIGMOD, 323–330, 1987.
[6] I. F. Cruz. Doodle: A Visual Language for Object-Oriented Databases. In ACM SIGMOD Conf. on Management of Data, 71–80, 1992.
[7] I. F. Cruz, A. O. Mendelzon, and P. T. Wood. G+: Recursive queries without recursion. In 2nd Int. Conf. on Expert Database Systems, 355–368, 1988.
[8] S. J. DeRose. XQuery: A Unified Syntax for Linking and Querying General XML Documents. In Proceedings of Query Languages '98, 1998.
[9] D. Braga and A. Campi. A Graphical Environment to Query XML Data with XQuery. In Fourth Int. Conf. on Web Information Systems Eng. (WISE'03), 31–40, 2003.
[10] M. Erwig. Xing: A Visual XML Query Language. Journal of Visual Languages and Computing, 14(1):5–45, 2003.
[11] I. E. Filha, A. Laender, and A. da Silva. Querying Semistructured Data by Example: The QSByE Interface. Presented at the International Workshop on Information Integration on the Web, 2001.
[12] S. Kepser. A proof of the Turing-completeness of XSLT and XQuery. Technical report SFB 441, Eberhard Karls Universität Tübingen, May 2002.
[13] Y. Liu, M. McCool, Z. Qin, and B. B. Yao. A Graphical XQuery Language Using Nested Windows. Technical report CS-2004-37, University of Waterloo, August 2004.
[14] K. D. Munroe and Y. Papakonstantinou. BBQ: A Visual Interface for Integrated Browsing and Querying of XML. In 5th IFIP 2.6 Working Conf. on Visual Database Systems, 2000.
[15] J. Paredaens, P. Peelman, and L. Tanca. G-Log: A Graph-Based Query Language. In IEEE Trans. on Knowledge and Data Eng., 7(3):436–453, June 1995.
[16] W3C. XML Query Use Cases. [http://www.w3.org/TR/xmlquery-use-cases], November 2003.
[17] W3C. XQuery: A Query Language for XML. [http://www.w3.org/TR/xquery/], October 2004.
[18] W3C. XQuery 1.0 and XPath 2.0 Formal Semantics, Appendix A: Normalized Core Grammar. [http://www.w3.org/TR/xquery-semantics/#sec_core], February 2004.
