August 24, 2005

"Web Feeds" and Geographic Information

With news stories we are all starting to get comfortable with the idea of dynamic or live feeds of content to our web pages.  In other areas we take dynamic data updates almost as a given, or at least as something that will shortly be the norm - areas like the stock market, defense, and even medical information are all moving or have already moved to "online - all of the time".  For reasons that are not totally clear this has not yet happened in the geographic information industry - in spite of the fact that few other kinds of information demand such integration as strongly as geographic information does.

The idea of news feeds - real time information syndication - offers an instructive model for real time integration of geographic information.  It shows that it is possible. And since some of these feeds already have geographic information associated with them (see www.georss.org), it seems only a small step from feeds of geo-oriented content to feeds of geoinformation itself.  This was the reason for the creation of GML and is at the heart of the meaning of the Geo-Web.

The news feed analogy also carries another point - the importance of combining the right information. While a news editor may juxtapose dissimilar stories, this is always done for a point - to achieve a particular effect. We need to understand how to do the same thing in the geo-domain - meaning real time data integration.

Automated news feeds also make us think about information quality.  Is this story true? As a publisher can I accept the feed without review?  Can I accept it from just anyone?  Similar issues exist, albeit in a more complex guise, in the world of geographic information.  Data quality is paramount. Garbage in, garbage out.  What do we know of the quality of what we are receiving?  How can we provide for high data quality in an online environment?  How do we restrict who can publish and what they can publish?

Posted by RLake at 00:39:34 | Permanent Link | Comments (0) |

August 23, 2005

What is the Geo-Web?

The notion of a Geo-Web can be viewed from various perspectives:

  • Real time syndication or web feed model.
  • Spatial data infrastructure.
  • Real time geographic data publication and synchronization
  • Simulation of the world - real time digital globe
  • From Services to Data - From Data to Services
  • Data is not a local private resource - beyond ETL & Cut and Paste

Real Time Syndication/Web Feed Model

In the news world - the idea of syndicating or integrating information from multiple sources is well understood.  With the advent of RSS - we can create a news page that dynamically integrates information from multiple sources or publishers.

In the geographic information world we are moving to doing the same thing but for geographic information - with real time publication and integration of geographic information from multiple sources - from governments (municipalities, regional governments, states) and private corporations (utilities, transportation companies, resource management ..).

Spatial Data Infrastructure (SDI)

This is an old idea - even older than the web - by which we would be able to source and integrate geographic information in real time.  Most countries in the world have some form of SDI program.  These programs have names like CGDI (Canada), NSDI (USA), NSDI (Japan) and so on.  The main problem with these programs is that they have been national and top down - rather than bottom up and driven from the data sources.  So the question of sustainability has been raised over and over again. There is even a Global SDI.

Real Time Data Publication and Synchronization

GML (and the associated WFS specification) provides the means to publish geographic information - independent of the storage model and storage software - and to synchronize such databases across the Internet.  This means that the technology now exists to realize the objectives of the Spatial Data Infrastructure.

Simulation of the World - The Digital Globe

We can think about a mass of information about the earth as a kind of model or simulation of the earth and the processes taking place on it.  This is a very old idea as well, dating back to the 60's at least (I am sure it is really much older).  We hold this model of the earth in our hand and use it to dream, to plan and to transact business with one another. I expect that this view of the Geo-Web will have more currency in the future as we increasingly integrate the environment into our notions of economics.

From Services to Data - From Data to Services

Geographic information is usually not an end in itself. I need information about "where" in order to make decisions. Is this mine likely to be promising?  If I can find oil can it be commercially extracted and brought to market?  If I go hiking today will I encounter snow? Where are our ships?  Will the ones transporting timber get to port on time or be delayed by the storm in the Atlantic?  Services like these may depend on access to many kinds of data - from many sources .. they build on the Geo-Web. At the same time they may create yet new data (the ship locations, the planned oil transportation corridor..) which form yet another part of the Geo-Web.

Beyond ETL - Beyond Cut and Paste

The geographic information world today - the world identified with GIS - treats geographic information as if it were a local and almost private resource.  People pass around files of data and cut and paste them into their application systems - at best they automate this to a degree with ETL (Extract - Transform - Load) - an automated version of cut and paste.  What problems demand - however - is a global and integrated view of geographic information - and that is the objective of the Geo-Web.

All of these ideas - all of these views - are part of the same thing - the Geo-Web.

Posted by RLake at 18:35:57 | Permanent Link | Comments (0) |

Is WGS84 Enough?

Most of the search engine based "geographies" assume that WGS84 is enough - that other coordinate systems are not required.  This note explores the validity of this approach.  Several issues need to be raised at the outset:

  • There are large amounts of data today which exist in hundreds of coordinate systems and conversion of all of this data is not going to happen.
  • WGS84 is limited in accuracy (today) to at best 1 metre.  For many applications this is fine - for many more it is not close to good enough.  Land surveys are typically done to 10 cm or better. You would not want to drive your car to +/- 1 m on a road and you would not want to lose 1 m of your property.
  • WGS 84 will eventually be replaced.
  • If I express the coordinates of a "straight line" segment (meaning linear interpolation) in WGS84 it will NOT be a shortest line on the surface of the earth. Consider taking two points (45,100), (45,120)  (45 = 45 degrees North latitude).  Linear interpolation in the (lat,lon) coordinates will generate a small circle arc connecting the two points - but the shortest path would be a Great Circle arc (or nearly a Great Circle, since the earth is modelled as an ellipsoid in WGS84 and not as a sphere).  So I MAY want to use other projective coordinates in some cases.
  • In many problems I don't really care where things are located relative to the entire earth - just where they are relative to one another.  Things like highway exits, road signs, rest stops are usually given in terms of driving distance along the highway from some road intersection.  These so called "linear" reference systems tend to dominate the discussion in the transportation sector.
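
The "straight line" point in the list above can be checked numerically. The following is a minimal sketch in Python, assuming a spherical earth with a mean radius of 6371 km (a simplification, since WGS84 uses an ellipsoid). The function names are illustrative only. It compares the length of the path along the 45 degree parallel with the great circle distance between the same two points.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; WGS84 itself is an ellipsoid


def parallel_arc_km(lat_deg, lon1_deg, lon2_deg, radius=EARTH_RADIUS_KM):
    """Length of the path traced by linear interpolation in (lat, lon)
    between two points on the same parallel (a small circle arc)."""
    dlon = math.radians(abs(lon2_deg - lon1_deg))
    return radius * math.cos(math.radians(lat_deg)) * dlon


def great_circle_km(lat1_deg, lon1_deg, lat2_deg, lon2_deg, radius=EARTH_RADIUS_KM):
    """Great circle distance on a sphere, using the haversine formula."""
    lat1, lat2 = math.radians(lat1_deg), math.radians(lat2_deg)
    dlat = lat2 - lat1
    dlon = math.radians(lon2_deg - lon1_deg)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


along_parallel = parallel_arc_km(45, 100, 120)
shortest = great_circle_km(45, 100, 45, 120)
print(f"along the parallel: {along_parallel:.1f} km")
print(f"great circle:       {shortest:.1f} km")
```

On this spherical model the path along the parallel comes out a few kilometres longer than the great circle route, over a total distance of roughly 1570 km.
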
Posted by RLake at 18:15:51 | Permanent Link | Comments (1) |

August 04, 2005

Coordinates in GML

Coordinates in GML can be a little confusing to the newbie.  This is mostly because GML has more than one way of writing them.  In many of the examples on this page coordinates are expressed using the gml:coordinates property tag.

For example:

 <gml:Point gml:id="p21" srsName="urn:ogc:def:crs:EPSG:6.6:4326">
    <gml:coordinates>45.67,88.56</gml:coordinates>
 </gml:Point>

Note that when we express coordinates in this way, the individual coordinates like 88.56 are not visible in the XML - the whole text content of the <gml:coordinates> element is just one string.  To make the coordinates visible in GML, GML 3 introduced the <gml:pos> element and then the <gml:posList> element.  {GML already had the <gml:coord> element - however this should be treated as a defect and NEVER used}.  Using the <gml:pos> element the above example is written:

 <gml:Point gml:id="p21" srsName="urn:ogc:def:crs:EPSG:6.6:4326">
    <gml:pos dimension="2">45.67 88.56</gml:pos>
 </gml:Point>

The posList element is used for lists of coordinate tuples, such as those of linear geometries. A <gml:LineString> using the <gml:coordinates> property would be written:

 <gml:LineString gml:id="p21" srsName="urn:ogc:def:crs:EPSG:6.6:4326">
    <gml:coordinates>45.67,88.56 55.56,89.44</gml:coordinates>
 </gml:LineString>

Using the <gml:posList> element this would be written as:

 <gml:LineString gml:id="p21" srsName="urn:ogc:def:crs:EPSG:6.6:4326">
    <gml:posList dimension="2">45.67 88.56 55.56 89.44</gml:posList>
 </gml:LineString>

How do I know which to use?

In the future use only <gml:pos> or <gml:posList>. 

Many current GML data servers (WFS) and conversion tools may only support <gml:coordinates>; however, this will quickly change over the next several months.  If you are creating data - use <gml:pos> or <gml:posList>.  If you are writing GML processing software it is a good idea to also support <gml:coordinates>.  Support for <gml:coord> is not necessary.
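
For those writing GML processing software, the advice above to accept both forms might be sketched as follows in Python. This is only an illustration, not a complete parser: it assumes the http://www.opengis.net/gml namespace, handles only the default <gml:coordinates> separators (a comma between coordinates, whitespace between tuples), and the function name read_tuples is made up for this example.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"  # GML namespace assumed for this sketch


def read_tuples(geometry, dimension=2):
    """Return the coordinate tuples of a GML geometry element as floats.

    Handles <gml:pos>, <gml:posList> and <gml:coordinates> (default
    separators only: "," between coordinates, whitespace between tuples).
    """
    for tag in ("pos", "posList"):
        node = geometry.find(f"{{{GML_NS}}}{tag}")
        if node is not None:
            values = [float(v) for v in node.text.split()]
            # A dimension attribute on the element overrides the default.
            dim = int(node.get("srsDimension") or node.get("dimension") or dimension)
            return [tuple(values[i:i + dim]) for i in range(0, len(values), dim)]

    node = geometry.find(f"{{{GML_NS}}}coordinates")
    if node is not None:
        return [tuple(float(c) for c in tup.split(","))
                for tup in node.text.split()]

    raise ValueError("no supported coordinate element found")


new_style = ET.fromstring(
    f'<gml:LineString xmlns:gml="{GML_NS}">'
    '<gml:posList dimension="2">45.67 88.56 55.56 89.44</gml:posList>'
    '</gml:LineString>'
)
old_style = ET.fromstring(
    f'<gml:LineString xmlns:gml="{GML_NS}">'
    '<gml:coordinates>45.67,88.56 55.56,89.44</gml:coordinates>'
    '</gml:LineString>'
)
print(read_tuples(new_style))
print(read_tuples(old_style))
```

Both calls yield the same two coordinate tuples, so the rest of a program need not care which encoding the data arrived in.
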

To understand the use of the srsName attribute see the Wikipedia article on GML and Coordinate Systems.

Posted by RLake at 01:33:41 | Permanent Link | Comments (1) |

August 03, 2005

GML Profiles

While the GML schemas are arranged in a modular fashion (e.g. one need only import geometryBasic0D1D.xsd for many applications), many have felt the need to have a lower bar to facilitate the broader and more rapid adoption of GML.  This is accommodated in GML by the use of profiles.  GML defines a profile (in the specification) and a number of profiles have been created or are being proposed, including:

  • A very simple GML Point Profile aimed at specification developers that have point geometric data but do not want to use the GML grammar.
  • A simple GML for Simple Features profile aimed at supporting vector feature requests and transactions (e.g. to/from a WFS).
  • A GML profile for GMJP2 (GML in JPEG 2000)
  • A GML profile for RSS (discussed below)

In addition, the GML specification provides a subset tool that can automatically generate compliant profiles of GML containing a user-specified list of components.

 

I believe that each of these is an important step forward in increasing the acceptance of GML.

 

We should note that Profiles are not to be confused with Application Schemas.  Profiles live in the GML namespace (http://www.opengis.net/gml) and define restricted subsets of GML.  Application schemas are XML vocabularies defined using GML and which live in an application-defined target namespace.  Application schemas can be built on specific GML profiles or use the full GML schema set.

 

When Points are sufficient:

The GML Point Profile contains a single GML object, namely a gml:Point.  It can be used in any XML Schema simply by importing the Point Profile and referencing the point as required.  A simple example might look as follows:

 

<PhotoCollection xmlns="http://www.myphotos.org" xmlns:gml="http://www.opengis.net/gml" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.myphotos.org MyGoodPhotos.xsd">
     <items>
          <Item>
               <name>Lynn Valley</name>
               <description>A shot of the falls from the suspension bridge</description>
               <where>North Vancouver</where>
               <position>
                    <gml:Point srsDimension="2" srsName="urn:ogc:def:crs:EPSG:6.6:4326">
                         <gml:pos>49.40 -123.26</gml:pos>
                    </gml:Point>
               </position>
          </Item>
     </items>
</PhotoCollection>

 

Note that in this case the ONLY GML is the gml:Point object.  The rest is defined by the photo-collection schema.
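
Because the gml:Point is the only GML in the document, a consumer can pick it out without knowing anything about the photo-collection vocabulary. Here is a rough Python sketch of that idea. The helper name gml_points is made up for this example, and the sample document is a trimmed copy of the one above.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"

DOC = """
<PhotoCollection xmlns="http://www.myphotos.org"
                 xmlns:gml="http://www.opengis.net/gml">
  <items>
    <Item>
      <name>Lynn Valley</name>
      <position>
        <gml:Point srsDimension="2" srsName="urn:ogc:def:crs:EPSG:6.6:4326">
          <gml:pos>49.40 -123.26</gml:pos>
        </gml:Point>
      </position>
    </Item>
  </items>
</PhotoCollection>
"""


def gml_points(root):
    """Yield (srsName, coordinates) for every gml:Point found, whatever
    the surrounding (non-GML) vocabulary looks like."""
    for point in root.iter(f"{{{GML_NS}}}Point"):
        pos = point.find(f"{{{GML_NS}}}pos")
        if pos is not None and pos.text:
            yield point.get("srsName"), tuple(float(v) for v in pos.text.split())


points = list(gml_points(ET.fromstring(DOC)))
print(points)
```

The same function would work unchanged on any other host schema that imports the Point Profile, which is much of the appeal of such a small profile.
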

 

 

Simple and Stupid is Often Best

 

The GML Profile for Simple Features is a more complete profile of GML than the Point Profile and provides sufficient support for a wide range of vector feature objects.  It includes:

 

1.     A reduced geometry model allowing 0d, 1d and 2d linear geometric objects (all based on linear interpolation) and the corresponding aggregate geometries (gml:MultiPoint, gml:MultiCurve, etc).

2.     A simplified feature model which can only be one level deep (in the general GML model, arbitrary nesting of features and feature properties is permitted).

3.     All non-geometric properties must be XML Schema simple types – i.e. cannot contain nested elements.

4.     Remote property value references (xlink:href) just like in the main GML specification.

 

Since the profile aims to provide a simple entry point it does not provide support for:

  • coverages
  • topology
  • observations
  • value objects (for real time sensor data)
  • nor support for dynamic features.

Nonetheless it will support a good variety of real world problems.

 

A Possible Profile for RSS

 

The RSS community may have need for a different GML profile, one that DOES NOT have the complexity of the Simple Features Profile from a geometry perspective, but DOES have items needed in a news feed, like:

  • time (Time position, time duration)
  • dynamic features
  • observations
  • simple geometry (Points, LineStrings, Polygons)
  • features

Here is a draft of this profile:

 

For those who want more:

Although it is not all that widely known, GML has incorporated a profiling tool as part of the specification since GML 3.  This profiling tool is referred to as a subset tool and is a pair of XSLT scripts written by Paul Daisey of the US Census Bureau.  These scripts permit the automatic generation of a profile, or of a profile starting point should you wish to do additional manual editing or schema restriction (remember that a profile is always a strict restriction of the full GML specification, and any application schema that can be generated using a profile must also be a valid application schema with respect to the full GML specification).  In fact, both of the profiles above were generated using the subset tool, followed by some manual edits to enforce some specific schema restrictions.

 

The subset tool can be used to generate profiles for many other reasons as well.  Simply list the elements/attributes you want included in the resultant profile schema and run the tool.  The result is a single profile schema file containing only the user-specified items and all of the element, attribute and type declarations on which the specified items depend.  Profile schemas have been created in this manner for a range of other specifications including IHO S-57 and GML in JPEG 2000.  This ensures that the application schemas developed on these profiles do not carry around any components that will never be used.

Posted by RLake at 23:40:08 | Permanent Link | Comments (4) |

GML and Coordinate Systems

You may have wondered how in GML we designate the coordinate system used to interpret the coordinates in a <gml:coordinates>, <gml:pos> or <gml:posList> element.

Unlike KML or geoRSS (at least thus far) GML does NOT assume a single, fixed coordinate system.  It must be specified by the creator of the data.  The coordinate system (what OGC calls a Coordinate Reference System) is specified using the srsName attribute. This is attached to a geometry object as shown in the following example:

    <gml:Point gml:id="p1" srsName="#srs36">
        <gml:coordinates>100,200</gml:coordinates>
    </gml:Point>

The value of the srsName attribute is a URI.  It refers to a definition of the coordinate reference system that is used to interpret the coordinates in the geometry.  This definition may live in a document (e.g. a flat file) or in an online web service (e.g. CRS Demonstration).

The srsName URI may also be a URN.  This then provides a well known string for referencing common CRS definitions. The OGC has developed a set of URN strings for some common coordinate systems.  It is intended that these URNs resolve through a URN resolver to CRS definitions.

GML provides a means of encoding CRS definitions.  A draft URN structure has been created by the OGC in part for this purpose and specific URN names have been created.
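
As a rough illustration of how software might treat srsName values, the Python sketch below tells a local reference such as #srs36 apart from a URN, and splits a URN like the urn:ogc:def:crs:EPSG:6.6:4326 string used in these posts into its parts. The seven-field layout (urn:ogc:def:objectType:authority:version:code) is an assumption based on that example and on the draft OGC URN structure. The names CrsUrn, parse_ogc_urn and classify_srs_name are invented for this sketch, and no actual URN resolution is attempted.

```python
from typing import NamedTuple


class CrsUrn(NamedTuple):
    object_type: str  # e.g. "crs"
    authority: str    # e.g. "EPSG"
    version: str      # e.g. "6.6" (may be empty)
    code: str         # e.g. "4326"


def parse_ogc_urn(urn):
    """Split an OGC definition URN into its named parts.

    Assumes the layout urn:ogc:def:objectType:authority:version:code.
    """
    parts = urn.split(":")
    if len(parts) != 7 or [p.lower() for p in parts[:3]] != ["urn", "ogc", "def"]:
        raise ValueError(f"not an OGC definition URN: {urn!r}")
    return CrsUrn(*parts[3:])


def classify_srs_name(srs_name):
    """Say what kind of URI an srsName value is."""
    if srs_name.startswith("#"):
        return "local reference"
    if srs_name.lower().startswith("urn:"):
        return "urn"
    return "uri"


print(classify_srs_name("#srs36"))
print(parse_ogc_urn("urn:ogc:def:crs:EPSG:6.6:4326"))
```

A real application would go one step further and look the parsed authority and code up in a CRS registry or pass the URN to a resolver.
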

 

Posted by RLake at 23:00:40 | Permanent Link | Comments (1) |

Information Sources

Some GML Information Sources

Related Information Sources

Posted by RLake at 22:36:50 | Permanent Link | Comments (1) |

Features and Geometry Properties

To understand GML it is necessary to understand the relationship between GML geometry objects and GML features.  A feature is an application object like a building, a river or a person. It may or may not have geometric aspects.  A geometry object is NOT a feature.  Note that in some GIS (especially older ones) a feature referred to something on a map and was more or less the same thing as a geometry object. This is NOT the case in GML.

We do not (except in abusing the language) speak in GML of point features or area features.  A feature can have various geometric properties that describe aspects or characteristics of the feature.  A Building for example might have a position given by a Point geometry object.  We do NOT think of the Building as a point, but we can say that the position of the building is given by a point. In GML we might write:

<abc:Building gml:id="SearsTower">
       <abc:position>
                <gml:Point>
                        <gml:coordinates>100,200</gml:coordinates>
                </gml:Point>
      </abc:position>
</abc:Building>

We might also say that the building has a "footprint" or an "extent" and write:

<abc:Building gml:id="SearsTower">
       <abc:extent>
                <gml:Polygon>
                        <gml:exterior>
                                <gml:LinearRing>
                                         <gml:coordinates>100,200 150,200 150,250 100,250 100,200</gml:coordinates>
                                </gml:LinearRing>
                       </gml:exterior>
                </gml:Polygon>
      </abc:extent>
</abc:Building>
 

Of course, our Building may have both a position and an extent, as well as other properties and could be encoded as:

<abc:Building gml:id="SearsTower">
       <gml:name>Sears Tower</gml:name>
       <abc:height>52</abc:height>
       <abc:position>
                <gml:Point>
                        <gml:coordinates>100,200</gml:coordinates>
                </gml:Point>
      </abc:position>
       <abc:extent>
                <gml:Polygon>
                        <gml:exterior>
                                <gml:LinearRing>
                                         <gml:coordinates>100,200 150,200 150,250 100,250 100,200</gml:coordinates>
                                </gml:LinearRing>
                       </gml:exterior>
                </gml:Polygon>
      </abc:extent>
</abc:Building>
 

GML provides the ability for features to share geometry with one another. This is accomplished using the remote property reference on a geometry property.  Remote properties are a general feature of GML borrowed from RDF.  If you see (or use) an xlink:href on a GML property it means that the value of the property is the resource referenced in the link.  This can be used for geometry property values.

Suppose we had a Building whose position was given by a Point with identifier p21 (gml:id = "p21").  Suppose also that this Point was also the position of a survey Monument.  We might then write in GML something as follows:

<abc:Building gml:id="SearsTower">
       <abc:position xlink:type="simple" xlink:href="#p21"/>
</abc:Building>

<abc:SurveyMonument gml:id="g234">
       <abc:position>
                <gml:Point gml:id="p21">
                        <gml:coordinates>100,200</gml:coordinates>
                </gml:Point>
      </abc:position>
</abc:SurveyMonument>

Note that the reference is to the shared point and NOT to the SurveyMonument, since the feature object can have more than one geometry property.
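
To make the remote property idea concrete, here is a small Python sketch of how a reader of GML might obtain the value of a geometry property whether it is given inline or by reference. It is a sketch only: it handles just local "#id" references within one document, the abc namespace URI and the wrapping abc:City element are hypothetical (added so the two features live in one well-formed document), and property_value is a made-up helper name.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"
XLINK_NS = "http://www.w3.org/1999/xlink"
ABC_NS = "http://example.org/abc"  # hypothetical application namespace

DOC = f"""
<abc:City xmlns:abc="{ABC_NS}" xmlns:gml="{GML_NS}" xmlns:xlink="{XLINK_NS}">
  <abc:Building gml:id="SearsTower">
    <abc:position xlink:type="simple" xlink:href="#p21"/>
  </abc:Building>
  <abc:SurveyMonument gml:id="g234">
    <abc:position>
      <gml:Point gml:id="p21">
        <gml:coordinates>100,200</gml:coordinates>
      </gml:Point>
    </abc:position>
  </abc:SurveyMonument>
</abc:City>
"""


def property_value(prop, root):
    """Return the value element of a GML property: the inline child if
    there is one, otherwise the element named by a local xlink:href."""
    if len(prop):
        return prop[0]
    href = prop.get(f"{{{XLINK_NS}}}href")
    if href is None or not href.startswith("#"):
        raise ValueError("only local '#id' references are handled in this sketch")
    target = href[1:]
    # Scan the document for the element carrying the matching gml:id.
    for element in root.iter():
        if element.get(f"{{{GML_NS}}}id") == target:
            return element
    raise LookupError(f"no element with gml:id {target!r}")


root = ET.fromstring(DOC)
building = root.find(f"{{{ABC_NS}}}Building")
point = property_value(building.find(f"{{{ABC_NS}}}position"), root)
print(point.tag, point.find(f"{{{GML_NS}}}coordinates").text)
```

The Building's position resolves to the very same gml:Point element that the SurveyMonument carries inline, which is the sense in which the two features share geometry.
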

Posted by RLake at 22:17:40 | Permanent Link | Comments (0) |

GML Geometries

GML provides a range of geometry objects that can be used in the description of application objects (features).  In GML 1.0 and GML 2.0 this list was quite short. The key geometry objects were:

  • Point
  • LineString
  • Polygon

A Polygon (closed region of space) was encoded in GML 1.0 and 2.0 as follows:

<gml:Polygon>
         <gml:outerBoundaryIs>
                 <gml:LinearRing>
                         <gml:coordinates>0,0 100,0 100,100 0,100 0,0</gml:coordinates>
                 </gml:LinearRing>
        </gml:outerBoundaryIs>
</gml:Polygon>
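
A reader of such polygons might extract the outer ring and check that it is closed, as in this minimal Python sketch. It accepts both the outerBoundaryIs property shown here and the gml:exterior property used in the Building examples in these posts. The function names are invented for the illustration, and only the default coordinate separators are handled.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"

POLYGON = f"""
<gml:Polygon xmlns:gml="{GML_NS}">
  <gml:outerBoundaryIs>
    <gml:LinearRing>
      <gml:coordinates>0,0 100,0 100,100 0,100 0,0</gml:coordinates>
    </gml:LinearRing>
  </gml:outerBoundaryIs>
</gml:Polygon>
"""


def outer_ring(polygon):
    """Return the outer ring of a gml:Polygon as a list of (x, y) tuples.

    Accepts the GML 1.0/2.0 property name (outerBoundaryIs) as well as
    the gml:exterior property name.
    """
    for name in ("outerBoundaryIs", "exterior"):
        coords = polygon.find(
            f"{{{GML_NS}}}{name}/{{{GML_NS}}}LinearRing/{{{GML_NS}}}coordinates"
        )
        if coords is not None:
            return [tuple(float(c) for c in t.split(","))
                    for t in coords.text.split()]
    raise ValueError("polygon has no outer boundary")


def is_closed(ring):
    """A linear ring must end on the coordinate it started from."""
    return len(ring) >= 4 and ring[0] == ring[-1]


ring = outer_ring(ET.fromstring(POLYGON))
print(ring, is_closed(ring))
```
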
Posted by RLake at 18:43:52 | Permanent Link | Comments (0) |

GML FAQ for RSS Geeks and others

This is an introduction (to begin with) to GML for RSS Geeks and other web developers. The objective is to take the fear and loathing of that big GML specification and make it something friendly and accessible.

To begin with - why was GML created?

GML was created to enable geographic transactions on the Internet, that is to support the dynamic sharing of geographic information - in effect to lay the foundation for a Geo-Web.

What makes GML so complex?  Why is the specification so long?

The model behind GML is not complex. It is largely borrowed from RDF. GML provides, effectively, an XML encoding of extended Entity-Relationship diagrams (meaning we have entities, relationships between them, and we allow for inheritance between entities).

The GML specification is long because (at least since GML 3) it describes a lot of primitive objects that are useful in building a geographic vocabulary.

What sort of basic or primitive objects does GML provide?

The list is now fairly long. It includes: 

  • features
  • various types of geometries (we will come back to that in a minute)
  • coordinate reference systems
  • time
  • dynamic features
  • coverages (that includes geographic images)
  • units of measure
  • styles for map presentation.

What is a feature in GML?

A feature is one of the key building blocks in GML. Features are meaningful entities in the user's world. GML does not try to define what a road is or what a river is. These are features in GML. GML provides the framework for users to create their own feature definitions.

Why is GML written in XML Schema?

It may come as a surprise but GML 1.0 did use RDF (not XML Schema). We switched to XML Schema in GML version 2.0 - because we thought that semantic description could be separated from object description and that XML Schema could better connect to the world of geographic databases. We kept many features from RDF however in the design of GML.

Can GML be written in RDF?

While that would not currently be normative - I would argue YES. GML provides core objects and rules for building application domain objects. Currently these rules and core objects are expressed in XML Schema - but they COULD be translated to RDF.

How do users build application vocabularies?

Suppose your application area is tourism and you are interested in objects like monuments, places of interest, museums, road exits, viewpoints etc. Using GML you would create an Application Schema that "describes" these objects for your purposes. This means you create the Application Schema using GML core objects and following the GML rules for creating application schemas.

Are Application Schemas Unique to GML?

No.  Other markup languages for geography, such as Google KML, use schema constructs also.  The main difference is that GML does not invent a new schema language as KML does, but rather uses an existing one, namely XML Schema.

Posted by RLake at 17:55:41 | Permanent Link | Comments (3) |