November 25, 2005

GML is the same for all applications

You may have noticed that different geospatial applications have developed a variety of different formats (see Is GML a format?) - VPF/VMAP for military data, S-57 for oceanographic and ENC data, numerous national data formats, and formats which depend on the vendor software being used. The point about GML is that all of these format differences go away - if you can read ONE GML file or stream you can read them all. This does not mean that you understand all of the semantics (you may or may not know what to do with a BSpline) - but you will know that it is a GML object and that it is a kind of curve segment. Before GML, a different reader had to be written for each and every format, hence different software was often needed for different applications just to read the data.
Posted by RLake at 02:58:48 | Permanent Link | Comments (2) |

Schemas and Profiles - what's the difference?

Lots of discussion has taken place as to the role of GML Application Schemas and GML profiles and a lot of this discussion has been misleading as the two items are often confused with one another.

GML Application Schema defines an application vocabulary:

A GML Application Schema defines an application vocabulary - meaning it defines the objects of interest (roads, runways, airspaces, survey lines, monuments, shipping lanes, wharves etc.) in a particular application domain. This definition is in terms of the properties of the object (e.g. number of lanes, surface type, geometry etc.). Some of these properties are geometric, topological and so forth. GML core schemas provide many of the objects (geometry, topology etc.) that are used to define the properties of the application objects.

GML Profile defines a Subset of the GML Grammar

GML provides a rich collection of basic object types - e.g. different kinds of objects, topological elements, coordinate reference components and so on. Not every database, nor every application schema, needs all of these objects. A profile is a subset that suits a particular domain. This can be defined by an application (and possibly a broad horizontal application schema) or it can be defined by some processing technology. Profiles of GML can be created by the "subset tool" that is part of the GML specification. Current profiles include the GMLJP2 profile, the Point profile and the simple features profile. Note that all such profiles ARE GML and one can create application schemas USING THEM.
Posted by RLake at 02:53:03 | Permanent Link | Comments (0) |

November 22, 2005

Schemas - why the big deal?

One of the aspects of GML that often receives comment is that of GML application schemas. For some reason many people in GIS and Cartography find this notion either complex or strange or both. This is so in spite of the fact that the same persons are already aware of relational schemas, or at least familiar with the concept of a table in a relational database. If I say I want to create a table containing persons and that table has the attributes (Name(string), Age(integer), Address(string)), no one gets excited. This means I have created (somewhere) a schema that looks like:

    Person(Name:String, Age:Integer, Address:string)

Great - right. For some reason the idea of a GML application schema seems more foreign - like saying in GML that I want to create a Person object with a schema (GML Application Schema) that defines a Person object with the properties Name, Age and Address. Not really so difficult, is it?

Note that in both cases we define a KIND OF THING (in this case a PERSON) and we DESCRIBE that KIND OF THING by specifying its attributes (relational schema) or properties (GML Schema), the specification of the properties including the property NAME (e.g. Age) and the property TYPE (e.g. Integer). Of course GML provides a whole set of special types for both KINDS OF THINGS (e.g. geometric, topological objects) and for PROPERTIES. In fact MOST of GML is simply a catalogue of these KINDS OF THINGS.
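To make the parallel concrete, here is a minimal sketch in Python (not GML or XML Schema syntax). It treats a schema as nothing more than a list of property names and types for one kind of thing, and checks an instance against it. The names PERSON_SCHEMA and conforms are hypothetical and exist only for this illustration.

```python
# Hypothetical illustration: a "schema" is just property NAME -> property TYPE
# for one KIND OF THING (here, a Person).
PERSON_SCHEMA = {"Name": str, "Age": int, "Address": str}


def conforms(instance, schema):
    """Return True if the instance has exactly the schema's properties,
    each holding a value of the declared type."""
    if set(instance) != set(schema):
        return False
    return all(isinstance(instance[name], typ) for name, typ in schema.items())


alice = {"Name": "Alice", "Age": 42, "Address": "1 Main St"}
bob = {"Name": "Bob", "Age": "unknown", "Address": "2 Side St"}

print(conforms(alice, PERSON_SCHEMA))  # Age is an integer, all properties present
print(conforms(bob, PERSON_SCHEMA))    # Age is a string, so this does not conform
```

A relational table definition and a GML application schema both play the role of PERSON_SCHEMA here. The difference lies only in the language used to write it down and the richer set of types GML offers.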

So there is no need to be afraid of GML application schemas.

How come other geographic encodings don't use schemas? Well, some do (e.g. KML) and some do not. The ones that do not typically do NOT provide a means to create new kinds of objects, something which is critical to GML and to geographic information in general. Note that when you pass DBF files around you ARE in effect passing schema information.

Why does GML use XML Schema to write application schemas? Well we could have created YET ANOTHER SCHEMA LANGUAGE.  This is the solution adopted by KML, SAIF and others, and which was initially used in GML v1.1.  The problem with this approach is that you then need to create ALL of the tools to work with these schemas - schema design, translation, and validation - AND the tools to ensure that data instances are in fact consistent with the developed schemas.  In GML we felt that it was better to use an existing, developed schema language.
Posted by RLake at 14:30:54 | Permanent Link | Comments (0) |

November 15, 2005

GML for Geographic Imagery

Perhaps you thought GML was only for vectors? Well think again. A new specification recently endorsed at the OGC (Open Geospatial Consortium) takes GML solidly into the world of imagery.

GML Coverages:

Of course, you may already be aware that GML provides imaging support via GML coverages. Loosely speaking a coverage is a sort of generalization of a geographic image - meaning it specifies the geometry of the coverage in some geographic space (e.g. some part of the earth's surface) and defines a function on that space. In image terms you can think of the geometry as the image "grid" and the function as the radiometry or pixel values. The geometry can be a gridded structure (like most raster images) or can be a tessellated field, collections of polygons, a set of curves or curve segments or just a random collection of points. The function provides values at these points - such as soil samples (random point collection), road surface type (curve segments), crop type (polygons) and brightness (gridded image).

How does this work in GML? Let us consider the gridded case. Pretty much as you would expect. GML just describes the structure of the GRID - the origin, deltaX, deltaY type of thing - actual points of the grid do not appear in GML. So the GML part of the geometry description is TINY.

GML also describes the function or value part of the coverage - in gridded terms - this means what the "pixels" mean - e.g. are they elevation (in meters)? reflectivity? etc. So this part is also TINY.
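A small Python sketch may help show why the description can stay tiny. It is not GML syntax. The class GridDescription, its field names and the sample numbers are all assumptions made up for this illustration. The point is only that an origin, two deltas and a grid size are enough to compute the position of any cell on demand, and that a name and unit are enough to say what the values mean.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridDescription:
    """Hypothetical stand-in for the small description a gridded coverage needs."""
    origin_x: float
    origin_y: float
    delta_x: float     # spacing between columns
    delta_y: float     # spacing between rows (negative when rows run downward)
    columns: int
    rows: int
    value_name: str    # what the "pixels" mean
    value_unit: str    # unit of measure for the values

    def position(self, col, row):
        """Compute the coordinates of one grid cell from the description alone."""
        if not (0 <= col < self.columns and 0 <= row < self.rows):
            raise IndexError("cell outside grid")
        return (self.origin_x + col * self.delta_x,
                self.origin_y + row * self.delta_y)


# Illustrative numbers only: a 1000 x 800 elevation grid with 30 unit spacing.
grid = GridDescription(500000.0, 5400000.0, 30.0, -30.0, 1000, 800, "elevation", "m")
print(grid.position(0, 0))
print(grid.position(10, 2))
```

Eight numbers and strings describe 800,000 cells. None of the cell positions needs to be written out.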

GML can allow you to define ANY parameters you like including their units of measure. So you can easily construct image descriptions for fancy satellite images with multiple bands or image structures like stereo pairs or triangulation blocks - in fact these specific image types will emerge from the next phase of the specification.

GML in JPEG 2000: (GMLJP2)

So how does this all work with JPEG 2000? What is JPEG 2000 anyways? JPEG 2000 (http://www.jpeg.org/jpeg2000/) is a powerful image specification that enables both lossy and lossless compression and enables the handling of multi-gigabyte images - even via the Internet.

A neat thing about JPEG 2000 is that it allows the use of XML as part of the image description. For Geographic Images this means using GML as the natural vehicle. The idea, originally suggested by Lucio Colaiacomo, has now been developed into a specification. With GML in JPEG 2000 (now called GMLJP2) we can stuff the GML description INSIDE the JPEG 2000 image to create a JPEG 2000 geographic image. This is like GeoTiff but enhanced, since not only is the geometry handled, but we can also describe the value side (radiometry for you image types) - moreover you can embed vector annotation, coordinate reference system descriptions, units of measure and even extracted or critical features INTO the image. Never again need an image be lost for the want of data about the image - it is right there INSIDE it.

Of course this would be no good without the software and LizardTech, ERM and others have stepped up to the plate already with supporting software. Also the imaging vendors have gotten into the act - with participation by SPOT Image and others. Once you can read ONE of these "formats" (see article below why GML is NOT a format) - you can read them ALL.

This is a major step forward for geographic imaging.
Posted by RLake at 01:35:27 | Permanent Link | Comments (0) |

November 13, 2005

GML, and KML - Why the fuss?

Various other blogs have been making comparisons between GML and KML. Such discourse is interesting; however, I think that most of them miss the point. Comments like KML is light and GML is heavy - or "I was like a kid in a candy store" - are misleading at best and border on being disingenuous. The key difference between GML and KML is not complexity nor expressiveness, and can be expressed in a single word - Google. Had Google decided to use GML (and THEY DID) - we would be saying the same things about GML as KML. This is not sour grapes - it is simply reality. One can easily argue that KML is already a profile of GML - just unofficially so.

Of course you would not want to express avionics in KML, nor the charts for ships at sea.  The point of GML was to enable profiles on which could be constructed application vocabularies for different domains.  The rise of KML reinforces the importance of XML in the geospatial domain and in no way reduces the efficacy or importance of GML.

Posted by RLake at 20:23:48 | Permanent Link | Comments (4) |

November 10, 2005

Is GML a format?

One often hears the term "data format" without much discussion as to what it means. People talk about converting from one format to another even when they express distinctly different semantics - for example "the conversion from Shape format to SVG format" and so on. While this "abuse of language" may be convenient it quite often makes it very unclear what is really going on.

What is a data format anyway?

Those that have been around long enough to remember when the discussion of database theory was still in fashion will remember the terms "physical and logical independence" of programs and data. Loosely put, this meant that one could write computer programs that could make data accesses without knowing the physical location and structure of the data elements they were reading or writing. Without such independence, data access software was very brittle and broke any time someone added a new "field" to a data file, even if that program made no use of the field in question. Such independence of programs and data was much touted as a key rationale for databases. Databases allowed the writer or reader to perform data access operations without knowledge of the structure of the data.

Programs might read data into internal record structures but these structures existed only in the program and were completely decoupled from the actual structures used by the data base for data storage. New "fields" could be added to the database without any impact on existing programs and requiring no change to their internal data structures.

What then are data formats? Essentially they are just data record structures that are written to a file. A format specification provides the structure of the records and their external semantics - e.g. the first 10 characters are the object ID and so on. Often these formats are isolated behind APIs but this does not change the nature of the format itself. Programs that deal with the formatted file are in the same position as data access software in the pre-database age.

Is this changed by the emergence of XML? Can we speak of GML as a format?

I would argue no. GML is NOT a format. Creators of software that reads or writes GML do not think about how the XML is laid out in a file and have no access to that layout. There are NO specifications for the length of records or even the order of the records within a file structure. Software accesses the data through various data models built by the parser (e.g. DOM, SAX etc.) and in which the items of interest are defined by the associated XML Schema (GML Application Schema). This means that such software is independent of the physical organization of the data - and really does deal with the data in terms of the logical model defined by XML (i.e. the XML Infoset).
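A short Python sketch using the standard library parser illustrates the point. The two documents below differ in prefix, attribute order, line breaks and indentation, yet the program reading them never sees any of that. The Point and pos elements and the GML namespace URI are as used in GML 3. The coordinate values, the srsName value and the helper read_point are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml"

# The same logical content, physically laid out in two different ways.
doc_a = ('<gml:Point xmlns:gml="http://www.opengis.net/gml" srsName="EPSG:4326">'
         '<gml:pos>49.25 -123.1</gml:pos></gml:Point>')

doc_b = """<?xml version="1.0"?>
<g:Point
    srsName = "EPSG:4326"
    xmlns:g = "http://www.opengis.net/gml">
  <g:pos>49.25 -123.1</g:pos>
</g:Point>"""


def read_point(xml_text):
    """Read a point through the parser's data model, not through byte offsets."""
    root = ET.fromstring(xml_text)
    pos = root.find(f"{{{GML}}}pos")
    coords = tuple(float(v) for v in pos.text.split())
    return root.get("srsName"), coords


print(read_point(doc_a))
print(read_point(doc_b))
print(read_point(doc_a) == read_point(doc_b))
```

Nothing in read_point refers to record lengths, column positions or line numbers, which is exactly the independence a fixed-record format cannot offer.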

One can thus think of GML (and any XML grammar) as a kind of "local database" that brings the independence of programs and data to the world of information exchange.

So GML is not a format.
Posted by RLake at 09:08:45 | Permanent Link | Comments (8) |

November 09, 2005

Embedding GML in "foreign" grammars

It seems that there are many people who would like to use GML in a non-GML grammar but are not sure how to do it.  This note is offered as a suggested best practice.

GML follows the so-called "object-property-value" rule borrowed more or less from RDF. This gives GML instances a particular striped structure - especially if the value is again a GML object. GML was intended to allow applications to define vocabularies of geographic objects - i.e. concrete or abstract real world objects such as political boundaries, buoys, airspaces, shipping lanes, roads and forest districts. Since such vocabularies are open ended, GML uses XML Schema and creates so-called application schemas that define these vocabularies.

Some people would, however, like to use GML within an existing XML grammar for such things as the expression of geometry, time, direction etc.  For such purposes the application schema route is not the best approach.  In this case it is suggested that ONLY GML objects be used.  GML objects are XML elements whose content models derive from one of the GML core schema objects such as features, geometries, topologies, time etc.  Their children are ALWAYS properties of the object.

To use a GML object, enclose it in a property in the "foreign" grammar.

Thus you might write:

<foreignGrammar>
      <gml:Object> .. </gml:Object>
</foreignGrammar>

Note that this approach is NOT recommended when defining geographic objects, and everything outside the GML must be handled by non-GML-aware software.
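As a rough illustration of how a consumer might handle such a document, the Python sketch below walks a foreign document and picks out the GML objects by namespace, together with the name of the foreign property that encloses each one. The report, title and location elements and the http://example.org/foreign namespace are hypothetical, as is the helper gml_objects. Only the GML namespace URI and the Point and pos elements come from GML itself.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"

# Hypothetical foreign grammar with one property (location) holding a GML object.
doc = """<report xmlns="http://example.org/foreign"
        xmlns:gml="http://www.opengis.net/gml">
  <title>Harbour inspection</title>
  <location>
    <gml:Point><gml:pos>49.29 -123.11</gml:pos></gml:Point>
  </location>
</report>"""


def in_gml(element):
    """True if the element's qualified name lies in the GML namespace."""
    return element.tag.startswith("{" + GML_NS + "}")


def gml_objects(xml_text):
    """Yield (foreign property name, GML object element) for each GML object
    that sits directly inside a non-GML element."""
    root = ET.fromstring(xml_text)
    for parent in root.iter():
        if in_gml(parent):
            continue  # children of a GML object are its properties, not new objects
        for child in parent:
            if in_gml(child):
                yield parent.tag.split("}")[-1], child


found = [(name, obj.tag.split("}")[-1]) for name, obj in gml_objects(doc)]
print(found)
```

The foreign software deals with report and title as it always has, and hands each element returned by gml_objects to whatever GML-aware code is available.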
Posted by RLake at 06:15:17 | Permanent Link | Comments (0) |

November 03, 2005

Authentication and Access Control

The GeoWeb will not succeed in the near term without a broadly accepted standard for authentication and access control. While in some parts of the world geographic information is free and open, different cultural norms, business models and views on security will ensure that not all information will be freely available to all people. This is of course already true of the Internet in general, whether we are talking about defense information or medical diagnostics.

By Authentication, I mean the ability of a web service to 1) determine information about an authenticated individual or organization and 2) to pass this information in a secure manner from one web service to another. 

By Access Control, I mean the ability to regulate access (read or write - and write will be increasingly important in the GeoWeb) to geographic information based on who the user is, what they want to do, and on which geographic resources they wish to operate.  Not everyone in a flood should be able to mark that a bridge is out -- but everyone should be able to report the observation that this is the case.
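The bridge example can be sketched as a simple decision over who, what and which resource. The Python below is only an illustration of that idea under made-up assumptions. It is not XACML and not any particular product's policy model. The role names, action names and the POLICY table are hypothetical.

```python
# Hypothetical policy: the (role, action, resource type) combinations that are allowed.
POLICY = {
    ("public", "read", "bridge"),
    ("public", "report_observation", "bridge"),
    ("emergency_official", "read", "bridge"),
    ("emergency_official", "report_observation", "bridge"),
    ("emergency_official", "mark_closed", "bridge"),
}


def is_permitted(role, action, resource_type):
    """Decide access from who the user is, what they want to do,
    and which kind of geographic resource they want to do it to."""
    return (role, action, resource_type) in POLICY


# Everyone may report that the bridge appears to be out...
print(is_permitted("public", "report_observation", "bridge"))
# ...but only an authorized role may change the bridge's official status.
print(is_permitted("public", "mark_closed", "bridge"))
print(is_permitted("emergency_official", "mark_closed", "bridge"))
```

A real deployment would express such rules in a policy language and evaluate them in a separate service, and might need to test the location of the resource as well as its type.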

Fortunately specifications for these purposes DO NOT need to be invented for the geospatial world. The OASIS organization has already created specifications for transporting authentication requests and responses, called SAML (Security Assertion Markup Language) and for transporting access control requests and responses - AND for the expression of Access Control Policies (XACML).  While there may be some need to provide spatial extensions to XACML (this is a matter of dispute) the bulk of the work is already done.

The key thing then is to get the spatial community to adopt these standards for Authentication and Access Control, and I believe we will see this happen within the next year. Geographic data servers (WFS) supporting SAML already exist (e.g. Galdos Cartalinea) and we expect others to follow suit in the near future.
Posted by RLake at 21:42:09 | Permanent Link | Comments (0) |

OnStar in the era of the GeoWeb

This morning I was listening to an advertisement for OnStar. It began with the OnStar operator taking a call from an OnStar customer who was having a heart attack. The operator was polite and immediately contacted 911 and then in a shared telephone conversation explained that the customer was located on "Cochrane Street". OnStar obtained the customer's location via GPS in the vehicle and then referenced this to a map in the OnStar system. Very cool, you say! But wait! - note HOW that information was transferred to the 911 system. Not by some automated means - but by a telephone call - by one person talking to another.

How would this work in the era of the GeoWeb?


The customer's position would be sent to OnStar from the GPS device and referenced to the road system. This is as now. When the OnStar system confirms with the customer that an emergency is in progress, the location of the customer is automatically sent to the appropriate 911 system. Paramedic vehicles immediately see the customer's location and the dispatcher can determine who is closest. There is no potential for confusion about Cochrane or Crane Street. The estimated time of arrival can be known almost immediately. Information managed in this way can be more accurate and more certain - increasing the value of both the 911 system and the OnStar system.

With GPS enabled telephones this will move from the vehicle to your telephone or PDA.
Posted by RLake at 19:09:15 | Permanent Link | Comments (0) |

Do we need to encode location in news feeds?

There has been a great deal of discussion of late about the use of geotags - meaning the encoding of location information within a web page or a news feed. Many proposals have been made for how to do this in RSS and Atom. See for earlier views of this blogger.

I would like to argue now that RSS, ATOM, and even web pages SHOULD NOT encode location information at all - the possible exception being the encoding of observations.  Otherwise the locations should be done ONLY by reference. Such a mechanism is simple, respects the notions of modularity and orthogonality in design and allows web feeds to "stick to their knitting".

I would propose that we have some sort of link mechanism like rdf:resource, the <a> in HTML or xlink:href as used in GML. In fact all might be used (in different places). The link then references a geographic or geometric object and the source for that object is then a standardized server such as an OGC WFS (Web Feature Service). Processing the return is up to the data consumer and the range of choices is wide indeed.

One possibility is to have a glink tag with the syntax

    <glink xlink:href = "http://www.myobjects.org/obbjects#t11">content</glink>

Here the content is a word in the news feed or the web page.

It is up to the reader to dereference the link and process the content returned by doing so. This could be to draw a map, create an animation etc.

So this might look like:

Hurricane Zeta is forming in the <glink xlink:href = "http://www.myobjects.org/obbjects#t11">Gulf</glink> and its <glink xlink:href="http://www.myobjects.org/obbjects#t11">track</glink> is such that landfall will be near <glink xlink:href ="http://www.myobjects.org/obbjects#t12">Corpus Christi</glink>.  Zeta has already done a great deal of damage throughout a <glink xlink:href ="http://www.myobjects.org/obbjects#t13">wide area</glink>.

This approach allows a great deal of flexibility.  Any word can be associated with a geographic context.  Any sort of geographic context can be supported. 

This seems much simpler than embedding information about Corpus Christi in dozens of web pages.  There is only one Corpus Christi and only one track for hurricane Zeta.
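On the reader's side, the first step is simply to collect the links. The Python sketch below parses a fragment of the hurricane example and lists each tagged word with the reference it carries. The glink tag is the proposal from this post. The wrapping item element and the helper extract_glinks are assumptions added so that the fragment parses as XML. Actually dereferencing the links (for example against a WFS) is left out.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# A fragment of the example text, wrapped in a hypothetical <item> element
# that declares the xlink prefix.
feed_item = (
    '<item xmlns:xlink="http://www.w3.org/1999/xlink">'
    'Hurricane Zeta is forming in the '
    '<glink xlink:href="http://www.myobjects.org/obbjects#t11">Gulf</glink>'
    ' and landfall will be near '
    '<glink xlink:href="http://www.myobjects.org/obbjects#t12">Corpus Christi</glink>.'
    '</item>'
)


def extract_glinks(xml_text):
    """Return (tagged words, referenced object URI) for every glink in the text."""
    root = ET.fromstring(xml_text)
    return [(g.text, g.get(f"{{{XLINK}}}href")) for g in root.iter("glink")]


links = extract_glinks(feed_item)
for words, uri in links:
    print(words, "->", uri)
```

What the reader does with each URI afterwards, whether drawing a map or building an animation, is entirely its own business, which is the modularity argued for above.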

What about observations? In the case of an observation - we take a "measurement" at some time and some location. The most obvious example would be that of taking a photograph. Again it would make sense to handle this the way images are handled in HTML. We don't embed the image, we reference it. So let's not embed the observation; let's reference it also.
Posted by RLake at 17:07:07 | Permanent Link | Comments (0) |