March 05, 2006

GML Complexity Re-visited

I have discussed the issue of GML complexity a number of times in this blog. Mostly we have looked at things like the number of tags, use of XML Schema, subject complexity and so forth. Most of it was pretty qualitative. We had no real measures of the complexity, nor comparisons to other established XML grammars to see how GML stacked up. Well, now some folks over at Microsoft, led by Stan Kitsis, have set about creating a number of XML Schema metrics and have applied these to a large number of schemas, GML among them. Their work used GML v3.1, which is close enough to the current release (GML v3.1.1) and the pending GML v3.2 that their results are fully reflective of the GML we are all working with or planning to use. The paper is entitled "Analysis of XML Schema Usage" and begins by developing a variety of metrics for XML Schema size and complexity and for the utilization of particular XML Schema features (e.g. model-group operators, simple type features, occurrence features, subtyping and friends, mixed content, wildcards, identity constraints and modularization).

They then provide statistics on the application of these metrics to a set of 63 schema projects from different IT sectors. Some were internal to Microsoft and some were external, including of course GML. The schemas included some 6,000 individual schema files, with roughly 82,000 global element names.

So how did GML stack up? There is not space to go over all of the findings, and I will leave that to Stan and the Microsoft folks. However, just a few items will give you the general idea.

Schema Size based on Lines of Code (LOC)

The table below shows how the 63 schemas are distributed across the LOC-based size categories, with GML's position noted.

LOC-based category    Definition            Schema count
Mini                  0 – 100               0
Small                 100 – 1,000           12
Medium                1,000 – 10,000        24
Large                 10,000 – 100,000      23 (includes GML at 10,291 lines)
Huge                  100,000 – …           4

It is clear from this measure that GML is at the bottom end of the large schemas.
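To make the binning concrete, here is a minimal Python sketch of the LOC-based categories. The bin edges come from the table; the function name `loc_category` is hypothetical, and treating each range as lower-inclusive and upper-exclusive is my assumption, since the table does not say which side a boundary value falls on.

```python
# LOC-based size categories as (name, lower bound, upper bound).
# Assumption: lower bound inclusive, upper bound exclusive.
LOC_CATEGORIES = [
    ("Mini", 0, 100),
    ("Small", 100, 1_000),
    ("Medium", 1_000, 10_000),
    ("Large", 10_000, 100_000),
    ("Huge", 100_000, None),  # open-ended top category
]


def loc_category(lines_of_code):
    """Return the size category name for a schema with the given line count."""
    if lines_of_code < 0:
        raise ValueError("line count cannot be negative")
    for name, low, high in LOC_CATEGORIES:
        if lines_of_code >= low and (high is None or lines_of_code < high):
            return name
    raise AssertionError("unreachable: the categories cover all non-negative counts")


# GML's line count as reported in the table above.
print(loc_category(10_291))
```

With GML at 10,291 lines, it sits only 291 lines above the Medium/Large boundary.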

Schema Size - Based on size in kilobytes.

The schemas in the study ranged from 6 Kbytes to 18 Mbytes. Most of these schemas (26 of the 63) are in the range of 100 Kbytes to 1 Mbyte, and this is indeed where we find GML, at 532 Kbytes. There were not many small schemas (only 6 under 10 Kbytes) and, as one might expect, not many really large schemas (only 11).

Number of Complex Type Definitions

Some people think GML is complex because it declares so many complex types - well does it?

According to the Microsoft study, the schemas were distributed by number of complex types (#CT) as follows:

#CT-based category    Definition       Schema count
Mini                  0 – 32           13
Small                 32 – 100         12
Medium                100 – 256        14
Large                 256 – 1,000      12
Huge                  1,000 – …        12

And GML? It comes in at 287 complex types, so again at the bottom end of the large schemas.
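For readers who want to try a similar count on their own schemas, here is a rough Python sketch using only the standard library. It counts every `complexType` element in a schema document, named or anonymous; whether the Microsoft study counts them exactly this way is an assumption on my part. The function names and the tiny sample schema are hypothetical, and the bin edges are again treated as lower-inclusive and upper-exclusive.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# #CT-based categories from the table above (lower inclusive, upper exclusive
# is an assumption; the table does not say).
CT_CATEGORIES = [
    ("Mini", 0, 32),
    ("Small", 32, 100),
    ("Medium", 100, 256),
    ("Large", 256, 1_000),
    ("Huge", 1_000, None),
]


def count_complex_types(xsd_text):
    """Count all xs:complexType elements (named and anonymous) in one schema document."""
    root = ET.fromstring(xsd_text)
    return sum(1 for _ in root.iter(f"{{{XSD_NS}}}complexType"))


def ct_category(count):
    """Return the #CT-based category name for a complex type count."""
    for name, low, high in CT_CATEGORIES:
        if count >= low and (high is None or count < high):
            return name
    raise ValueError("count must be non-negative")


# A made-up two-type schema, only for illustration (not taken from GML).
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="PointType">
    <xs:sequence>
      <xs:element name="pos" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="Road">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

sample_count = count_complex_types(SAMPLE_XSD)
print(sample_count, ct_category(sample_count))
print(ct_category(287))  # GML's count as reported in the study
```

A real schema project spans many files, so a full count would sum this over every schema document in the project.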

 


Posted by RLake at 23:09:32
Comments
1 - One contributing factor to GML's complexity is that it is built upon XML standards such as the XML Schema Definition Language (XSD), which has turned out to be notoriously complex and difficult to use (see the comments about XSD at http://www.25hoursaday.com/weblog/PermaLink.aspx?guid=7a01ea38-6937-453a-bab1-0660f027979c). Indeed, there are some similarities between the limited adoption of SOAP and GML because of this. This also raises the question of how prevalent "premature standardization" might be - just because a comprehensive "official Standard" exists doesn't necessarily mean it should be used. The real test of the effectiveness of "Standards" is how well they work in practice.

Written by: Roger at 2006/03/06 - 06:26:46
2 - You may note that GML is written in XML Schema - but has also been written in DTD and RDF - and could be ported to OWL or other schema languages in the future.

Written by: Ron Lake at 2006/03/06 - 17:22:20
3 - One of the questions that really needs to be asked with respect to GML and its adoption/uptake is "who is the intended consumer?" I believe that if the consumer is intended to be another computer application the current GML is ok - it is verbose, but at times you are forced to be verbose. However, if the consumer is a simple application (i.e. viewer type GIS) or a human, GML is simply way too complex - forget verbose.

Written by: Darrell O'Donnell at 2006/03/12 - 15:46:44
4 - I think the main issue is not how many 'types' GML has or how complex it is, but how easy it is to support in software. One of the biggest problems is that standards and specification writers often fail to understand that every new GML 'profile' means a whole new XML to support in software, and this takes time and money.

The best way to deal with the GML 'complexity' issue is through the new GML Simple Features Profile (GMLSF). GMLSF makes it much easier to use GML and WFS in multiple applications, and will translate to lower overall implementation costs and greater flexibility.

An article on using GMLSF in 'real-life' is available here http://www.directionsmag.com/article.php?article_id=1971&trv=1

Regards,
Jeff

Written by: Jeff Harrison at 2006/04/30 - 16:21:14
5 - GMLSF is a profile so falls under your issue of "means a whole new XML to support in software and this takes time and money".

I agree that GMLSF is useful - but there are a lot of problems (hence application schemas) where GMLSF does not do the trick. These include, for example, XMML, AIXML, IAXS, TransXML etc. So while useful, the restrictions are also a detraction.

Cheers

Ron

Written by: Ron Lake at 2006/05/01 - 15:37:54
7 - So you're saying that more profiles help interoperability?

The plain and simple fact that few want to come out and say is that proliferating more and more profiles in the interest of promoting geospatial interoperability is probably hurting geospatial interoperability.

Why? Because, in the words of one frustrated member of the Open-Geospatial .NET Community, right now XML schema validation often requires special domain (profile) dedicated coding to make it work..."One for Ordnance Survey, one for each and every GML data provider."

This is a big challenge for implementation.

A reasonable path forward is to begin to "keep it simple" (as a start), apply more software engineering rigor in the standards development process and for standards and specifications writers to remember that - every new GML 'profile' means a whole new XML to support in software.

Regards,
Jeff

Written by: Jeff Harrison at 2006/05/03 - 05:02:28
8 - In response to Jeff. No, I would not say that lots of profiles help. In fact I could argue that NO profiles are required, as profiles are an impediment to interoperability. On the other hand, profiles are useful to increase adoption by lowering the entry bar. So profiles definitely have their place. We are already well down the road, however, where there are a number of important application schemas (e.g. AIXML) that could NOT build on the SF profile. Please don't mix up profiles and application schemas. A profile is just a subset of GML with restrictions. There should NOT be different profiles for different users as you suggest - this is a misunderstanding - the "one for Ordnance Survey" line is simply incorrect. Ordnance Survey has an application schema - meaning a domain-specific schema built on the core GML elements and using XML Schema. This is no different than in the case of a relational schema - there are hundreds or tens of thousands of relational schemas - no one worries about that and this is NO different.

Written by: Ron Lake at 2006/05/05 - 06:30:54
9 - Ron,

I think it comes down to three points that folks could agree on -

1) One of the best ways to deal with the GML 'complexity' issue is through the new GML Simple Features Profile (GMLSF). GMLSF makes it easier to use GML and WFS in multiple applications, and can reduce overall implementation costs while providing a certain base level of interoperability.

2) There are problems where GMLSF does not do the trick. These include, for example, Aeronautical Information Exchange, hence application schemas like AIXML have a role in the geospatial information infrastructure.

3) When developing either profiles or application schemas, folks should apply software engineering rigor in the standards development process. Standards and specifications writers need to remember that every new GML 'profile' or 'application schema' means a whole new XML to support in software, and there is no magic to interoperability.

Regards,
Jeff


Written by: Jeff Harrison at 2006/05/08 - 05:29:10