Part 4. Transforming data into INSPIRE compliant data

Author: Jane Harper


Transformation process

1. Conceptual matching
• Understand data specification
• Flip chart exercise with domain experts: schema matching
• Discuss different options for publishing data
• Relationship source data vs INSPIRE theme
• Type of download service

2. Configure mappings/transformation & validate
• Choose ETL/transformation tool
• Configure schema mapping
• Generate data and validate


3. Publish data
• Create data or configure WFS
• Upload data or deploy web service
• Choose metadata tool
• Create INSPIRE discovery metadata
• URLs for accessing the data
• Additional metadata elements as required by data specification
• Update conformance statement
• Publish metadata in INSPIRE discovery service

Basic transformation operations

The next slides highlight a series of basic transformation operations:

1. Schema translation: matching and mapping
2. Coordinate conversion and transformation
3. Filtering and resampling
4. Edge matching
5. Other operations

Schema translation


• A schema is here defined as a formal description of a model
  – Conceptual schema = data structures, code lists etc. (UML)
  – Logical schema = physical structure (expressed in XSD)
  – Transfer files = XML/GML files

• Schema matching (finding semantically related objects)
  – Ontology, thesauri, dictionaries

• Schema mapping (finding transformation rules)
  – Reclassification
  – Data type conversion
  – Reference systems

• Schema transformation
  – Extract-Transform-Load (ETL)


Required knowledge

• INSPIRE directive, implementing rules and guidelines
• UML: to understand the target data model
• RDBMS: to understand relational models
• XML: needed to understand GML
• GML: needed to understand encoding
  – INSPIRE: GML v3.2.1

• Network services: needed to publish harmonised data
  – CSW to host metadata catalogue
  – WFS to download spatial data in GML format

• ETL tools: needed to conduct data operations
  – Used to convert data as closely as possible to target schema
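As an illustration of how a client might download harmonised data from a WFS, the sketch below builds a WFS 2.0.0 GetFeature request URL in key-value-pair form. The endpoint `https://example.org/wfs`, the helper name `build_getfeature_url` and the `count` limit are assumptions made for this example only; `gn:NamedPlace` is used purely as an illustrative feature type name.

```python
from urllib.parse import urlencode


def build_getfeature_url(endpoint, type_name, count=10):
    """Build a WFS 2.0.0 GetFeature request URL (key-value-pair encoding)."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,  # feature type to download
        "count": count,          # limit the number of returned features
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint, for illustration only.
url = build_getfeature_url("https://example.org/wfs", "gn:NamedPlace")
print(url)
```

A real service would publish its own endpoint and feature type names in its capabilities document.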


Schema matching

• The process of identifying that two expressions are semantically related


• Schema matching is the process of identifying corresponding concepts in the source schema (national data sets) and the target schema (INSPIRE specification).
• The matching process considers both language issues and semantic differences between the two schemas.
• The result of the schema matching process specifies how the data in the source schema correspond to the data in the target schema.
• Schema matching is the first step in the data transformation.


Matching process

• To start the matching process you need to:
  – Identify feature types in both the source schema and the target schema
  – Identify structural properties of the feature types
  – Identify attribute names in both schemas
  – Identify data-value types and characteristics

• The matching process can be performed manually as a desk study or with automated tools that use intelligent techniques.
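A very simple form of automated support can be sketched with plain string similarity. The example below uses Python's standard `difflib` module to propose, for each source attribute name, the most similar target attribute name. The attribute names, the `suggest_matches` helper and the 0.5 threshold are assumptions for illustration; real matching tools would also use thesauri, ontologies and data types, and the suggestions still need review by a domain expert.

```python
from difflib import SequenceMatcher


def suggest_matches(source_attrs, target_attrs, threshold=0.5):
    """Suggest the most similar target attribute name for each source attribute.

    Pure string similarity: it only proposes candidates and cannot replace
    the semantic check by a domain expert.
    """
    suggestions = {}
    for src in source_attrs:
        scored = [
            (SequenceMatcher(None, src.lower(), tgt.lower()).ratio(), tgt)
            for tgt in target_attrs
        ]
        score, best = max(scored)
        # Keep the candidate only if it is similar enough.
        suggestions[src] = best if score >= threshold else None
    return suggestions


# Hypothetical attribute names, for illustration only.
source = ["name", "language_code", "xcoord"]
target = ["name", "language", "geometry"]
suggestions = suggest_matches(source, target)
print(suggestions)
```

Attributes without a sufficiently similar candidate are returned as `None`, which flags them for manual matching.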


• The aim of schema matching is to make sure that features and attributes in both schemas are semantically related.
• The matching process results in a set of transformation (conversion) rules and a translation table.
• The matching table is used during mapping.

Matching (and filtering)

Schema matching example

• Translation table for matching the Geographical Names (GN) feature type NamedPlace (INSPIRE target) with Ortnamn (source).

An illustration of the schema matching process

Result of schema matching: Transformation rules table

Source data type      | Target data type | Conversion type   | Meaning
Code list             | Character String | CodelistToText    | Codelist value converted to character string
number*2              | GML object       | CoordinateToPoint | Coordinate pair converted to GML Point
Text or missing value | Text             | Assign(Value)     | Target value in brackets used instead of source text
Integer               | Character String | IntegerToText     | Integer value converted to character string
Char                  | Text             | Equal             | No conversion required
CodeList              | CodeList         | Assign            | Target value used instead of source value
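The conversion types in the table can be sketched as small functions. This is a minimal illustration, not the implementation of any particular ETL tool: the function names mirror the conversion types, while the sample record, code list, CRS identifier and `gml:id` value are hypothetical. The coordinate order written to `gml:pos` must follow the axis order of the chosen reference system, which this sketch does not check.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml/3.2"  # GML 3.2 namespace
ET.register_namespace("gml", GML_NS)


def codelist_to_text(code, codelist):
    """CodelistToText: look up the text label of a code list value."""
    return codelist[code]


def coordinate_to_point(c1, c2, srs_name, gml_id):
    """CoordinateToPoint: encode a coordinate pair as a gml:Point element."""
    point = ET.Element(f"{{{GML_NS}}}Point", {"srsName": srs_name})
    point.set(f"{{{GML_NS}}}id", gml_id)
    pos = ET.SubElement(point, f"{{{GML_NS}}}pos")
    pos.text = f"{c1} {c2}"  # written in the order given by the caller
    return point


def assign(_source_value, target_value):
    """Assign(Value): ignore the source value and use a fixed target value."""
    return target_value


def integer_to_text(value):
    """IntegerToText: convert an integer to a character string."""
    return str(value)


def equal(value):
    """Equal: no conversion required."""
    return value


# Hypothetical code list and source record, for illustration only.
codelist = {"1": "city", "2": "village"}
record = {"type": "2", "c1": 100.0, "c2": 200.0, "id": 42, "name": "Example"}

target = {
    "type": codelist_to_text(record["type"], codelist),
    "geometry": coordinate_to_point(
        record["c1"], record["c2"], "urn:ogc:def:crs:EPSG::3035", "P42"
    ),
    "language": assign(None, "swe"),
    "localId": integer_to_text(record["id"]),
    "name": equal(record["name"]),
}
print(target["type"], target["localId"], target["language"], target["name"])
print(ET.tostring(target["geometry"], encoding="unicode"))
```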

Schema matching example: Transport network

• Matching of feature types VAGL, VAGOVRL >> FormOfWay
• Matching of attributes VAGTYP >> FormOfWay values

Schema mapping

• Schema mapping determines how elements of the source schema are mapped to the corresponding elements of the target schema.
• Two levels of mapping
  – Feature mapping: the process of connecting source feature types to target feature types.
  – Attribute mapping: the process of connecting attributes of source feature types to attributes of target feature types.


Finding the transformation rules between the objects

• Reclassification
• Type conversions
  – Integer, Real, Date, Char and Varchar, BLOB, Enum, Spatial (point, linestring, polygon etc.)
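Both kinds of rule can be sketched in a few lines. In the example below, reclassification is a lookup table from source codes to target values, and a type conversion turns a character string into a date. The numeric VAGTYP codes, the date format and the helper names are hypothetical; the target strings only illustrate what FormOfWay values might look like and should be checked against the data specification.

```python
from datetime import date, datetime

# Hypothetical source codes for VAGTYP, mapped to illustrative FormOfWay values.
VAGTYP_TO_FORM_OF_WAY = {
    1: "motorway",
    2: "dualCarriageway",
    3: "singleCarriageway",
}


def reclassify(code, table, default=None):
    """Reclassification: translate a source code into a target value.

    Unknown codes return `default` so that they can be reported and
    added to the translation table.
    """
    return table.get(code, default)


def to_date(text, fmt="%Y%m%d"):
    """Type conversion: character string to date (format is an assumption)."""
    return datetime.strptime(text, fmt).date()


form_of_way = reclassify(2, VAGTYP_TO_FORM_OF_WAY)
unknown = reclassify(99, VAGTYP_TO_FORM_OF_WAY)
valid_from = to_date("20120315")
print(form_of_way, unknown, valid_from)
```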

Schema mapping

• Example of type conversions

Coordinate Conversion

• Spatial reference systems
  – Geocentric coordinates (X, Y, Z)
  – Geographic coordinates (lat, long, H)
  – Projected coordinates (Northing, Easting, Height)
  – Local coordinates (x, y, z)
  – Linear coordinates (startNode, Length, Distance, R/L)
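As a worked example of a conversion between two of these coordinate types, the sketch below converts geographic coordinates to geocentric (X, Y, Z) coordinates on the GRS80 ellipsoid using the standard closed formulas. It assumes that the height is an ellipsoidal height in metres; the function name is chosen for this example. In practice such conversions are normally left to a projection library or the ETL tool.

```python
import math

# GRS80 ellipsoid parameters.
A = 6378137.0          # semi-major axis in metres
F = 1 / 298.257222101  # flattening
E2 = F * (2 - F)       # first eccentricity squared


def geographic_to_geocentric(lat_deg, lon_deg, h=0.0):
    """Convert latitude/longitude (degrees) and ellipsoidal height (m) to X, Y, Z."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Radius of curvature in the prime vertical.
    n = A / math.sqrt(1 - E2 * math.sin(lat) ** 2)
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - E2) + h) * math.sin(lat)
    return x, y, z


# A point on the equator at the zero meridian lies on the X axis.
print(geographic_to_geocentric(0.0, 0.0))
```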


INSPIRE reference systems

Reference system
• For the horizontal component, within the geographical scope of ETRS89, the European Terrestrial Reference System 1989 (ETRS89) shall be used.

Ellipsoid
• The parameters of the GRS80 ellipsoid shall be used for the computation of latitude and longitude (ETRS89-GRS80) and for the computation of plane coordinates using a suitable map projection.

Map projection
• The ETRS89 Lambert Azimuthal Equal Area (ETRS-LAEA) projection shall be used when true area representation is required.
• The ETRS89 Lambert Conformal Conic (ETRS-LCC) projection shall be used for conformal mapping at scales smaller than or equal to 1:500.000.
• The ETRS89 Transverse Mercator (ETRS-TMzn) projection shall be used for conformal mapping at scales larger than 1:500.000.
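The map projection rules can be read as a small decision procedure. The sketch below encodes them as stated on this slide; the function name and its parameters are assumptions for this example. Note that a smaller scale means a larger scale denominator, so "smaller than or equal to 1:500.000" corresponds to a denominator of 500000 or more.

```python
def choose_projection(scale_denominator, need_true_area=False):
    """Pick an ETRS89 map projection following the rules on this slide.

    scale_denominator: for example 500000 for a scale of 1:500.000.
    need_true_area: True when true area representation is required.
    """
    if need_true_area:
        return "ETRS-LAEA"
    if scale_denominator >= 500_000:  # scale smaller than or equal to 1:500.000
        return "ETRS-LCC"
    return "ETRS-TMzn"                # scale larger than 1:500.000


print(choose_projection(1_000_000))
print(choose_projection(10_000))
print(choose_projection(10_000, need_true_area=True))
```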


Resampling or filtering

[Figure: Filtering | Resampling]

Edge Matching

• Old problem, dating back to the times when map sheets were digitised
• Occurs at borders between data sets / countries
• Geometric and other conditions to be fulfilled
  – Connections
  – Smoothness
  – 90-degree corners (for buildings etc.)
  – Conditions often solved by least squares adjustment or similar (averaging etc.)
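The simplest of these approaches, averaging, can be sketched as follows: line end points from two adjacent data sets are paired across the border, and each pair within a tolerance is moved to its midpoint so that the connection condition is met. This is a greedy nearest-neighbour sketch, not a least squares adjustment; the function name, the sample coordinates and the tolerance are assumptions for illustration.

```python
import math


def edge_match(ends_a, ends_b, tolerance):
    """Pair line end points of two adjacent data sets and average their positions.

    ends_a, ends_b: lists of (x, y) end points near the common border.
    Each point in ends_a is paired with its nearest unused point in ends_b;
    pairs within `tolerance` are assigned their midpoint as common position.
    Returns a list of (index_a, index_b, snapped_point).
    """
    matches = []
    used_b = set()
    for i, (ax, ay) in enumerate(ends_a):
        best_j, best_d = None, tolerance
        for j, (bx, by) in enumerate(ends_b):
            if j in used_b:
                continue
            d = math.hypot(ax - bx, ay - by)
            if d <= best_d:
                best_j, best_d = j, d
        if best_j is not None:
            bx, by = ends_b[best_j]
            matches.append((i, best_j, ((ax + bx) / 2, (ay + by) / 2)))
            used_b.add(best_j)
    return matches


# Hypothetical end points on either side of a border, for illustration only.
side_a = [(0.0, 10.0), (0.0, 20.0)]
side_b = [(0.4, 10.2), (0.2, 30.0)]
matches = edge_match(side_a, side_b, tolerance=1.0)
print(matches)
```

End points without a partner within the tolerance are left unchanged and can be reported for manual inspection.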


Other operations

Less common operations
  – Address matching (geocoding)
  – Transformation between temporal reference systems
  – Multiple representation
  – Topology
  – Merging old and new information
  – Multilinguality
  – Nomenclatures and taxonomies