revised: 27-Jul-2018

NMBGMR Draft Geologic Data Model - v. 1.0.4
Overview

In the interests of standardization and interoperability, we are migrating our geologic maps into the USGS NCGMP09 data model. However, our well data model is still actively being used. Documentation for the NMBGMR geology data model is being provided for users with legacy data in this model. We also hope that it will provoke discussion in the ongoing effort to improve NCGMP09 and other models.

Created by: Adam S. Read, Geoff Rawling, Daniel J. Koning, Gretchen Hoffman, Sean D. Connell, J. Michael Timmons, David McCraw, Glen Jones, Mark Mansell, & Shannon Williams

Introduction:

Like most other mapping agencies, the New Mexico Bureau of Geology and Mineral Resources (NMBGMR) has produced geologic maps for many years using a Geographic Information System (GIS). A GIS is essentially a geospatial database that stores information about the shape and position of mapped features as well as associated data. In order for a particular GIS-based map to be interoperable with other maps, their geospatial databases must be organized with a consistent structure. A data model is a standardized database structure (also called a database schema) that defines what features (or entities) are recorded, what their attributes are (often with a pre-defined set of possible values), and how they relate to one another.

The NMBGMR began investigating the implementation of a new geologic data model when the limitations of our existing rudimentary model became apparent. Our early model was based on an early USGS schema and was very basic. It was sufficient to attribute simple orientation data, contacts and faults (solid, dashed, dotted and queried), and polygons (attributed with a map unit abbreviation). This allowed us, in most cases, to create a paper plot (or PDF) of a useable geologic map. However, this system did not allow us to store detailed data regarding geologic features or to adequately show geologic relationships between them. For instance, planar fault orientation data was recorded as a separate record from linear slickenline data on the same fault and the topological relationship between them was lost. It was also not possible to record kinematic information regarding fault movement in the old model. Because of these and many other similar problems, the ability to analyze these geologic map spatial databases was somewhat limited and topological relationships between geologic features were lost or were hard to recreate when needed. We also had problems when attempting to compile geologic maps of larger areas because each quadrangle often had a 'custom' (i.e. seat-of-the-pants) data model applied as particular data types were encountered. This was an inconvenience when the number of GIS-based geologic maps was relatively small. However, as the number of GIS map projects multiplied, it became clear that we needed a more sophisticated geologic data model, both for consistency between maps, and to allow our geologists to record a wider variety of geologic information. It also became apparent that we needed a way to store feature level metadata, particularly the source ID for features compiled from other maps.

The NMBGMR Model

When we decided we needed a better data model, we first looked at existing geologic data models. At the time we found that existing models were either too complex to be practical or otherwise didn't fit our needs. So, we chose to create our own model from scratch, borrowing useful ideas from other models. Since both field-mapping and digitization of maps are already fairly labor-intensive, we didn't want to add needless complexity to the process of producing maps. However, we did want to have the ability to create a fully attributed geologic map in a GIS. The links in the banner above describe the feature classes and attributes of our model. An XML schema of our model is available for download.

Our Goals:

Build a model that is comprehensive enough to record all data relevant to the creation of geologic maps across the vast diversity of geologic features mapped in New Mexico (and beyond). Whether any particular map is attributed to the fullest extent possible, though, will depend on the intended use of the map and the interest and time available to do so.
Attempt to make it possible to separate observations from interpretations and geologic data from style of presentation much the same way that web standards separate style and content.
Make sure that any feature data presented using cartographic enhancements to maps is actually encoded in the geodatabase so that if the map were to be recreated solely from the geodatabase, no critical information would be lost. We decided to use feature classes to store symbology where the symbol position conveys geologic information. This decision predated ESRI cartographic representations which are another reasonable approach to this issue (but see next bullet).
Construct the model such that it will work well with ArcGIS software, particularly with regard to how symbology is handled (i.e. try to use three or fewer fields as the basis for symbology—the current maximum in ArcGIS without an additional definition query). Another alternative would be to use cartographic representations. However, cartographic representations require that all symbology be abstracted from an integer 'rule' code and also require orientation of symbols to be attributed opposite to azimuth conventions on geologic maps.
Attempt to make the model comprehensible to field geologists, cartographers, and end-users.
Allow the model to be simple enough that a geologist can still produce a map using long-established conventions from pre-GIS mapping projects or can create GIS products by digitizing published paper maps.
Make the model flexible enough to accommodate the wide range of mapping styles and interests of our field geologists.
Make the model granular enough that features, like faults or mine workings, can easily be turned on or off without resorting to definition queries.
Make the model modular so that typically unused feature classes or tables (e.g. metamorphic isograds) can be ignored or deleted from the geodatabase if they don't apply to a given map.
Make the benefits of a new model clear enough that the effort necessary to use it is apparent. Geologists in the field should be aware of what can be coded into the model and record their observations accordingly. Ideally, there should be either an attribute cheat-sheet, or some portable digital device to display model attributes and/or record field data.
Ensure that the model and symbolization used will portray geologic maps as they were intended by the author.
Generally avoid using numerical codes for attributes in favor of English terms, even at the expense of computation speed so that less technically proficient users or those using shapefile versions of the GIS data will have less difficulty understanding attributes.
Attempt to make attribute tables able to stand alone when Related Tables to them are broken (even at the expense of efficiency and somewhat redundant data storage -- i.e. don't insist on a "normalized" database).
Use field names and table names that can be exported to ArcInfo coverages and/or shapefile formats (table names should be 13 characters or less and field names should be 10 characters or less -- avoid unusual characters). Provide aliases where necessary to make these names easier to decipher.
Attempt to make the attribute tables extensible so that new features can be added to the model without extensive redesign of the table structure.
The model will be principally concerned with recording data relevant to a geologic map. Other types of data may be recorded in the model to a limited degree (i.e. geochronologic data or geophysical data). However, these data sets should have separate data models (existing or to be constructed) which may be linked to the geologic map data model.
We hope to create GUI tools that will speed up the process of feature digitization and attribution as well as ensure accurate data entry.
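The naming limits stated in the goals above (table names of 13 characters or fewer, field names of 10 characters or fewer, no unusual characters) lend themselves to a mechanical check. The following is a minimal sketch, not part of the NMBGMR tooling. The function names are hypothetical, and "unusual characters" is assumed to mean anything other than letters, digits, and underscores.

```python
import re

MAX_TABLE_NAME = 13  # table-name limit quoted in the goals above
MAX_FIELD_NAME = 10  # field-name limit quoted in the goals above

# Assumption: a "safe" name starts with a letter and uses only letters,
# digits, and underscores. The source does not define "unusual characters".
_SAFE_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def check_name(name, max_len):
    """Return a list of human-readable problems for one table or field name."""
    problems = []
    if len(name) > max_len:
        problems.append(f"{name!r} is {len(name)} characters (limit {max_len})")
    if not _SAFE_NAME.match(name):
        problems.append(f"{name!r} contains characters outside A-Z, 0-9, _")
    return problems


def check_schema(tables):
    """tables maps each table name to a list of its field names."""
    problems = []
    for table, fields in tables.items():
        problems += check_name(table, MAX_TABLE_NAME)
        for field in fields:
            problems += check_name(field, MAX_FIELD_NAME)
    return problems
```

For example, check_schema({"Fault_line": ["ShowMarker", "SlipBasis"]}) returns an empty list, while a hypothetical 12-character field name such as "PreferredAge" would be reported as too long for shapefile export.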

Schema

The NMBGMR Geologic Map Data Model is expected to evolve as various portions of it are implemented.

These web pages, which describe the current NMBGMR Geologic Data Model, are available for download as a zip archive.
An XML schema (4 MB, right-click and save XML file) of the current version of our data model can be imported into ArcGIS using ArcCatalog. This schema was created with ArcGIS Diagrammer. This schema is being actively updated as needed. Please note the version number at the top of this page.
There will inevitably be many feature classes and tables that are unused. A VB script can remove empty layers and tables from ArcMap (they won't be deleted from the geodatabase).

Change Log (from version 1.0):

v. 1.0.1

Modified 'Sources' table -- added several attributes.
Modified 'Lithology' table -- added several attributes and added values to related coded-value domains.
Added a boolean field 'ShowMarker' to Fault_line and Fold_line to allow toggling of automated line decoration (e.g. thrust teeth, syncline arrows).

v. 1.0.2

Modified Fold_Symbol feature class -- added FoldLineType field
Appended '_AA' to cross section feature class names. This will allow for unique feature class names when there are multiple cross sections.
Changed 'Section' field in Well_point to 'Sectn' (alias is Section) because of personal geodatabase reserved word.
Fixed alias for Lith_poly and Lith_poly_label.
Added Metamorphic_pt feature class to store metamorphic assemblage data recorded at a point.

v. 1.0.3

Extensive revisions to Attribute tables related to Well_point and the addition of several well-related feature classes.
Changed field 'URL' to SourceURL in the 'Sources' table (bibliographic citations) to avoid keyword problems on a web-based data-entry form.
Removed coded value domains for FGDC codes for ease of data import.
Made most fields nullable that were not nullable previously for ease of data import.
The schema is now a version 10 geodatabase and will no longer function on Arc 9.x systems.

v. 1.0.4

Split NMWells schema from this geologic map schema and made a simplified Well_spot table suitable for storing the location, name, symbology, and TD of a well that will appear on a geologic map.

Unresolved Issues:

Quadrangle Based: Our primary use for the data model is for 1:24,000 scale geologic maps of USGS 7.5-minute quadrangles. We think that each map should have a separate geodatabase rather than use a single statewide geodatabase that contains every quadrangle mapped. The advantages of a single statewide database include a seamless data set and simplicity of file management. The advantages of separate geodatabases for each quadrangle include portability, ease of distribution, ease of customization, and adaptability to different approaches to stratigraphy and mapping focus. Hopefully, this approach won't cause problems for us later.
GeoSciML: We don't know how compatible our model will be with the emerging Geoscience XML standard (GeoSciML) or if data export to this standard will be difficult. Of particular concern are the 'earth materials' properties which appear to be much more granular in GeoSciML. We chose to use a more flexible general approach that we feel is more suitable for making geologic maps. 'Earth materials' is such a broad topic that perhaps that could be a data model unto itself. However, it would be simple to incorporate the GeoSciML approach to 'Earth materials' the same way that the NCGMP09 model does. Another issue is Enumerated 'GeologicEvents' which is another broad topic. Currently, events related to geologic structures are confined to free text attributes for Ancestry, LastActive, and Comments and in traditional reports. It seems unlikely that even these fields will be routinely attributed, so we'll leave the fully relational GeologicEvents out for now. Again, this would be easy to incorporate into our model, but perhaps difficult to actually use.
Database normalization: The model may allow data inconsistencies. For instance, a fault (line) could have a 'SlipBasis' attribute set as "[map] pattern" but a point on that fault could have its 'SlipBasis' attribute set as "offset marker". Such inconsistencies will arise in the course of a mapping project, even with only one author, but this may actually be a benefit to the model. The repetition of attributes for mapped linear features like faults and particular observations taken on faults seems necessary to accommodate the way geologic maps are actually constructed in the field -- and the natural variation of most geologic features.

Other Geologic Data Models

Before we began creating our model, we looked at geologic data models developed by other organizations. At the time, we looked closely at the North American Geologic Map Data Model and its variants and an early ESRI model. The complexity of these models, coupled with the fact that they had not been significantly implemented, suggested that they were unsuitable for our needs.

National Cooperative Geologic Mapping Program 2009 (NCGMP09) Data Model

In response to difficulties encountered with the NADM, the USGS recently proposed a new draft standard: NCGMP09 for individual geologic map quadrangles. The NCGMP09 is much less complicated than the NADM, and is quite similar to the model that we developed in parallel. In some ways, our model is much more granular because we have separate feature classes for separate types of features (see comparison below).

Model Comparison

As noted above, our model developed in tandem with both the NCGMP09 and ESRI models and shares several design features with them -- but also has some important differences:

Feature Classes

We have far more granular feature classes in our model than the NCGMP09 model does and a different structure than the ESRI Geologic Mapping Template. The benefit of separate feature classes is that it is easier to create maps that display just the features of interest. For instance, if a structure map is needed, you can just display the faults, folds, and perhaps structure contour layers. To do this in the NCGMP09 model would require querying the data and perhaps exporting features to new feature classes to construct these derivative maps. Another benefit of feature classes dedicated to a particular type of feature is that attributes can be more specific for that feature type. Of course, the drawback of our approach is that having more feature classes can make geodatabases more complicated.

Confidence, Locational Accuracy, and Exposure

Traditionally, lines (generally contacts and faults) on printed geologic maps are either solid, dashed, dotted or queried. Solid lines were used to represent linear features that were confidently identified, well located, and exposed. Dotted lines were used for concealed features that a geologist felt reasonably confident in projecting beneath another unit. Queries along lines reflected decreased confidence about both existence and location. Dashed lines were more mysterious. Dashed lines could represent decreased confidence because a contact was mapped with binoculars or air photos, was poorly exposed, wasn't well located in areas of low relief, or was interpolated. The main problem with the standard line types used on paper geologic maps is that there are more inter-related attributes than can be effectively symbolized with such a simple system of line types.

We chose to attribute linear features using a combination of two attributes, 1) Exposure (exposed, obscured or intermittent, concealed) and 2) Scientific Confidence combining confidence regarding the existence and/or identification of a feature (certain, probable, uncertain). Note that positional (or locational) accuracy is not recorded for lines and does not affect our symbology (positional accuracy can be recorded for points along the line, however). There are several reasons for not attributing positional accuracy for linear features. First of all, it is much simpler not to because it quickly becomes very difficult to create a workable field symbology for use on paper field maps. Also, if a line were attributed as having an 'accurate' location, the assumption on the part of map users is that the line has actually been surveyed -- the way a road or pipeline might be. This is very rarely the case for lines on geologic maps. Positional accuracy usually varies continuously along the linear features. This variability can better be expressed by recording this data at the specific points where the positional accuracy was actually measured.
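The two-attribute scheme described above can be illustrated with a small sketch in which a line symbol is derived only from Exposure and Scientific Confidence. The attribute values come from the text; the styling rules (which exposure maps to which line pattern, and that only "uncertain" lines are queried) are assumptions made for illustration and are not the actual NMBGMR symbology.

```python
EXPOSURE = ("exposed", "obscured", "concealed")
CONFIDENCE = ("certain", "probable", "uncertain")

# Hypothetical styling rules, loosely following the paper-map conventions
# described above (solid, dashed, dotted, queried).
PATTERN_BY_EXPOSURE = {
    "exposed": "solid",
    "obscured": "dashed",
    "concealed": "dotted",
}


def line_symbol(exposure, confidence):
    """Derive a line style from the two attributes; position accuracy plays no part."""
    if exposure not in EXPOSURE:
        raise ValueError(f"unknown Exposure value: {exposure!r}")
    if confidence not in CONFIDENCE:
        raise ValueError(f"unknown Confidence value: {confidence!r}")
    pattern = PATTERN_BY_EXPOSURE[exposure]
    # Assumption: only the lowest confidence level adds query marks.
    return f"{pattern}, queried" if confidence == "uncertain" else pattern


def all_symbols():
    """Enumerate every Exposure/Confidence combination and its style."""
    return {(e, c): line_symbol(e, c) for e in EXPOSURE for c in CONFIDENCE}
```

Because each attribute has three values, the scheme yields nine combinations, which stays within the three-field symbology limit mentioned elsewhere in this document.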

Topology

We have defined a number of important topology rules that should be valid for any geologic map. Most of these rules are obvious: no gaps between polygons, contacts must overlie polygon boundaries, contacts can't dangle, etc. These rules help identify and fix inevitable digitization errors. Other rules require that fold and fault measurements should lie on their respective line types or be marked as exceptions. These exceptions will additionally have an attribute "MappedFeature" set to FALSE so that they can easily be symbolized as minor structures.

A more fundamental topologic relationship exists for point data that can have measurements for both planar and linear data, like faults with slickenlines, fold axial planes and fold axes, foliations with extension lineations, or bedding with paleocurrent vectors. For all these types of features, planar and linear data resides in the same record. Of course, there are many ways to store such a relationship in a database, but this method is by far the simplest to see and understand when editing or viewing the database. Many other geologic data models store one point for a fault plane and another for the slickenline in that plane. It then becomes very difficult to extract this key data from what is really a single data point.
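As a rough illustration of keeping planar and linear data in one record, the sketch below stores a fault plane and its slickenline together and uses that pairing to check that the lineation actually lies in the plane. The class and field names are hypothetical, not the model's actual attribute names. Angles are assumed to be in degrees, with azimuths measured clockwise from north and plunge measured downward from horizontal.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class FaultMeasurement:
    """One data point: a fault plane plus an optional lineation in that plane."""
    x: float
    y: float
    dip_azimuth: float                 # direction of dip
    dip: float                         # dip angle of the plane
    lin_trend: Optional[float] = None  # trend of the slickenline
    lin_plunge: Optional[float] = None # plunge of the slickenline

    def has_lineation(self) -> bool:
        return self.lin_trend is not None and self.lin_plunge is not None

    def angle_off_plane(self) -> Optional[float]:
        """Angle in degrees between the lineation and the plane (0 = lies in plane)."""
        if not self.has_lineation():
            return None
        d, dip = math.radians(self.dip_azimuth), math.radians(self.dip)
        t, p = math.radians(self.lin_trend), math.radians(self.lin_plunge)
        # Unit vectors in (east, north, up) coordinates.
        normal = (math.sin(dip) * math.sin(d), math.sin(dip) * math.cos(d), math.cos(dip))
        line = (math.cos(p) * math.sin(t), math.cos(p) * math.cos(t), -math.sin(p))
        dot = sum(n * l for n, l in zip(normal, line))
        return math.degrees(math.asin(min(1.0, abs(dot))))
```

A check like this is only straightforward when both measurements share a record; if the plane and the lineation were stored as separate points, they would first have to be matched up spatially.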

Our line feature classes are structured somewhat differently than other data models. Lithologic contacts are separated into two feature classes: Lith_Contacts and Concealed_Contacts. Additionally, faults are stored and fully attributed in Fault_line rather than being combined with contacts as in the NCGMP09 model. After faults are attributed, non-concealed faults (that participate in polygon topology) are copied to the Lith_Contacts layer where they will retain their LineClass attribute of 'fault'. Before building polygons, the Map_Boundary polyline is also copied to the Lith_Contacts layer and the topology is validated. Faults that dangle are deleted from the Lith_Contacts layer and any other topology problems are fixed. When there are no topology errors -- and no exceptions -- polygons can be built from the Lith_Contacts layer (and attributed using Lith_poly_label points if present).
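The polygon-building preparation described above can be sketched in plain Python, independent of ArcGIS. This is only an illustration of the logic, under several assumptions: lines are already split at every junction, endpoints that meet have exactly equal coordinates, and the record keys ("ends", "line_class", "exposure") are hypothetical stand-ins for the real attributes.

```python
from collections import Counter


def find_dangles(layer):
    """Return ids of lines that have an endpoint shared with no other line end."""
    counts = Counter()
    for rec in layer.values():
        start, end = rec["ends"]
        counts[start] += 1
        counts[end] += 1
    return sorted(
        lid for lid, rec in layer.items()
        if counts[rec["ends"][0]] == 1 or counts[rec["ends"][1]] == 1
    )


def build_lith_contacts(lines):
    """Combine contacts, map boundary, and non-concealed faults; drop dangling faults.

    Returns the cleaned layer and the ids of any dangles that remain
    (these would be topology errors to fix by hand).
    """
    layer = {
        lid: rec for lid, rec in lines.items()
        if not (rec["line_class"] == "fault" and rec["exposure"] == "concealed")
    }
    while True:
        # Removing one dangling fault can expose another, so repeat until stable.
        dangling = [lid for lid in find_dangles(layer)
                    if layer[lid]["line_class"] == "fault"]
        if not dangling:
            break
        for lid in dangling:
            del layer[lid]
    return layer, find_dangles(layer)


def demo_lines():
    """A square map boundary, one contact across it, and two faults."""
    def rec(a, b, line_class, exposure="exposed"):
        return {"ends": (a, b), "line_class": line_class, "exposure": exposure}
    return {
        "b1": rec((0, 0), (1, 0), "boundary"), "b2": rec((1, 0), (2, 0), "boundary"),
        "b3": rec((2, 0), (2, 2), "boundary"), "b4": rec((2, 2), (1, 2), "boundary"),
        "b5": rec((1, 2), (0, 2), "boundary"), "b6": rec((0, 2), (0, 0), "boundary"),
        "c1": rec((1, 0), (1, 1), "contact"), "c2": rec((1, 1), (1, 2), "contact"),
        "f1": rec((1, 1), (1.5, 1.5), "fault"),               # dangles inside a polygon
        "f2": rec((0, 0), (2, 2), "fault", "concealed"),      # concealed: not copied
    }
```

With the demo data, build_lith_contacts(demo_lines()) keeps the boundary and contact segments, leaves out both faults, and reports no remaining dangles, so polygons could be built from the result.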

Lithologic Classification

We chose to proceed from very general lithologic attributes to more specific attributes:

LithClass: (LithType)
- Sedimentary: (siliciclastic, mixed, nonsiliciclastic)
- Volcanic: (lava flow, dome, ash, volcaniclastic)
- Intrusive: (plutonic, hypabyssal, dike, sill)
- Metamorphic: (metasedimentary, metavolcanic, metaplutonic, unknown protolith)
- Anthropogenic: (disturbed land, artificial fill, tailings, dump)
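The general-to-specific hierarchy listed above maps naturally onto a lookup table. The sketch below uses the LithClass and LithType values exactly as listed; the function name is hypothetical and the check is only an illustration of how such a pairing might be validated during data entry.

```python
# LithClass -> allowed LithType values, copied from the list above.
LITH_TYPES = {
    "Sedimentary": ("siliciclastic", "mixed", "nonsiliciclastic"),
    "Volcanic": ("lava flow", "dome", "ash", "volcaniclastic"),
    "Intrusive": ("plutonic", "hypabyssal", "dike", "sill"),
    "Metamorphic": ("metasedimentary", "metavolcanic", "metaplutonic",
                    "unknown protolith"),
    "Anthropogenic": ("disturbed land", "artificial fill", "tailings", "dump"),
}


def is_valid_lithology(lith_class, lith_type):
    """True when lith_type is one of the types listed under lith_class."""
    return lith_type in LITH_TYPES.get(lith_class, ())
```

For instance, is_valid_lithology("Intrusive", "dike") is true, while is_valid_lithology("Volcanic", "dike") is false because "dike" is listed only under Intrusive.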

The most specific lithologic attributes are divided into PrimaryLithology and SecondaryLithology which could either use uncontrolled terms or use the NGMDB vocabularies.

In addition to a long Text field for UnitDescription, we also include a ShortDescription field suitable for the map legend.

Geologic Events

A geologic events table as specified in the NCGMP09 model is not currently part of our model. However, features like faults have attributes for Ancestry and LastActive. Our Lithology table has attributes for min/max/preferred age, GeneticEnvironment and DepositionalSystem. Having all geologic features linked to attributes about their geologic history sounds like it could be very useful, but that may be extremely difficult to implement.

Attributes

Almost all attributes in our model are human-readable text. This is less space efficient than using numeric codes and precludes the use of ArcGIS subtypes, but it makes the database much easier to comprehend, especially if it has been exported to a shapefile. Many of these have coded value domains that provide pick lists for acceptable attributes. Aliases show the text used as the code within brackets to facilitate editing when more than one feature is attributed with the ArcMap field calculator. For instance, “obscured” is a code that has the alias: “[obscured]: Intermittantly exposed / obscured by colluvium”.
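The bracketed-code alias convention described above is simple to generate and to parse. The following sketch assumes the alias format shown in the example ("[code]: description"); the function names are hypothetical.

```python
import re

# Matches "[code]: description" as in the alias example above.
_ALIAS = re.compile(r"^\[(?P<code>[^\]]+)\]:\s*(?P<description>.+)$")


def make_alias(code, description):
    """Build a pick-list alias that shows the stored code in brackets."""
    return f"[{code}]: {description}"


def code_from_alias(alias):
    """Recover the stored code from an alias, e.g. for use in a field calculation."""
    match = _ALIAS.match(alias)
    if match is None:
        raise ValueError(f"alias does not follow the '[code]: text' pattern: {alias!r}")
    return match.group("code")
```

Applied to the alias quoted above, code_from_alias returns the stored value "obscured", which is the text an editor would type into the field calculator.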

Some feature classes like DataPoint are just containers for the location of point data that can be attributed in separate tables as needed. In general, however, most feature classes have a fairly comprehensive set of attributes. These could be expanded as the need arises. Another approach is to use extended attributes as used by the NCGMP09 and ESRI Geologic Mapping Template. This allows for uncommonly used attributes to be stored in a separate related table. In these models, one table is used for extended attributes for all feature classes by relying on user-maintained keys specifying the parent feature and the parent feature class. This approach seems difficult to manage if a large number of extended attributes are used. Perhaps feature classes could have rarely used and extended attributes in a dedicated table and use one-to-one relationship classes to maintain the link between features and attributes. For instance, Fabric_point could store rarely used attributes and extended attributes in a table called Fabric_pt_attr. This wouldn't rely on user-maintained database keys and would allow for automatic deletion of attribute data when the parent feature is deleted. The downside of this approach would be more tables in the geodatabase.
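The dedicated one-to-one attribute table suggested above can be sketched with SQLite standing in for the geodatabase. This is an analogy only: a foreign key with ON DELETE CASCADE plays the role that a relationship class would play in ArcGIS. The table names Fabric_point and Fabric_pt_attr come from the text; the column names are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a parent table and a 1:1 attribute table."""
    con = sqlite3.connect(":memory:")
    con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
    con.executescript("""
        CREATE TABLE Fabric_point (
            OBJECTID   INTEGER PRIMARY KEY,
            FabricType TEXT
        );
        -- Sharing the parent's key as the primary key makes the link one-to-one.
        CREATE TABLE Fabric_pt_attr (
            OBJECTID   INTEGER PRIMARY KEY
                       REFERENCES Fabric_point(OBJECTID) ON DELETE CASCADE,
            GrainShape TEXT
        );
    """)
    return con


def demo():
    """Insert a feature with extended attributes, delete it, count orphaned rows."""
    con = make_db()
    con.execute("INSERT INTO Fabric_point VALUES (1, 'foliation')")
    con.execute("INSERT INTO Fabric_pt_attr VALUES (1, 'elongate')")
    con.execute("DELETE FROM Fabric_point WHERE OBJECTID = 1")
    remaining = con.execute("SELECT COUNT(*) FROM Fabric_pt_attr").fetchone()[0]
    con.close()
    return remaining
```

Here demo() returns 0: deleting the parent feature removes its attribute row without any user-maintained keys, at the cost of one extra table per feature class.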

Correlation of Map Units (CMU)

We don't currently encode the correlation of map units diagram within our geodatabases. The flexibility of producing custom diagrams in a graphics program is somewhat offset by the utility of standardization as advocated by the NCGMP09. The encoding scheme in the NCGMP09 seems rather complicated for practical use however.

Relationship Feature Classes

We have set up geodatabase relationship classes between features and stand-alone tables. For instance Lith_poly units are in a relationship class with the Lithology table. Relationship classes have the advantage over standard database joins or relates in that the relationship is stored in the geodatabase itself and not in the ArcMap MXD file.

Subcrop

We include feature classes for creating a bedrock map beneath alluvium/colluvium or other cover. These derivative maps are very useful for hydrologic modeling, basin analysis, and other geotechnical projects.

Symbology

When we began constructing our model, cartographic representations were not available in ArcMap and symbology was (and still is) limited to combinations of three attributes at a time. Our feature classes were designed with this in mind. Many feature classes had somewhat generic attributes based on a Class, Type, SubType attribute hierarchy. This has evolved somewhat over time, but we have tried wherever possible to limit the number of attributes that must be considered to define symbology to three.

Cartographic representations are another approach to symbology but require that all symbology be abstracted from a code. They also require orientation of symbols to be attributed opposite to azimuth conventions on geologic maps. One problem with the FGDC standard and the ESRI approach to using cartographic representations of it is that a number of the symbol codes refer to multiple features that should be symbolized separately. For instance:

2.11.13

Lineation on inclined fault surface—Tick shows fault dip value and direction; arrow shows bearing and plunge of lineation

While the fault plane and lineation (slickenline) are both fundamentally part of one data point measurement (see above), they need to be symbolized with two instances of the data. Of course, there are individual FGDC codes for each of these elements, but it might be useful to eliminate the FGDC codes that aren't granular enough to apply to individual features and data types. Another problem with the code approach is that it would be very easy for the code to not be synchronized with the actual attributes of the feature which would become very confusing to end-users. One way to get around this problem would be to have separate joined tables that allow determination of symbol codes based on attributes. This has the added benefit of separating style from content the same way that standards-compliant HTML encodes content while CSS applies styles for Web pages.
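The joined-table idea above can be sketched as a lookup keyed on feature attributes, so that the symbol code is derived rather than stored independently. Everything in this example is hypothetical: the Class/Type/SubType values, the placeholder codes (which are deliberately not real FGDC codes), and the function names.

```python
# Hypothetical rule table: (Class, Type, SubType) -> placeholder symbol code.
SYMBOL_RULES = {
    ("fault", "normal", "exposed"): "SYM-001",
    ("fault", "normal", "concealed"): "SYM-002",
    ("fault", "thrust", "exposed"): "SYM-003",
}


def symbol_for(feature):
    """Look up the symbol code implied by a feature's attributes (None if no rule)."""
    key = (feature["Class"], feature["Type"], feature["SubType"])
    return SYMBOL_RULES.get(key)


def stale_codes(features):
    """Return ids of features whose stored code disagrees with their attributes."""
    return [
        f["id"] for f in features
        if f.get("SymbolCode") is not None and f["SymbolCode"] != symbol_for(f)
    ]
```

If symbols are always obtained through symbol_for, the style lives entirely in the rule table and cannot drift from the content. The stale_codes check covers the other case, where a code was stored on the feature and the attributes were edited afterward.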

Wells

An extensive revision of how our model handles well locations and data related to wells was made at version 1.0.3. It would be possible to use just the well-related part of our data model for those who already have a schema that suits their needs for other geologic data. The well schema became complex enough that we decided to detach it from the rest of the geologic data model. Please see the discussion on the NM Wells Data Model page for more information.

Where do we go from here?

Eventually, some consensus will probably be reached and a single geology data model will be widely adopted and be interoperable with GeoSciML. This will make it much easier for anybody who tries to produce compilations, create derivative maps, or to perform spatial analyses of existing geologic map data. While our model has been working reasonably well for us, we don't presume that it will be the model adopted. Nonetheless, we do hope that some of the ideas presented by this model will be borrowed by other models -- just as we have done.
