NMBGMR Draft Geologic Data Model - v. 0.90
revised: 28-Aug-2007
Introduction:
The New Mexico Bureau of Geology and Mineral Resources (NMBGMR) has used a Geographic Information System (GIS) to produce geologic maps for many years. A GIS is essentially a geo-spatial database that stores information about the shape and position of a mapped feature as well as data associated with it. For a GIS to be useful, the geo-spatial database must be organized with a consistent structure. A data model is a standardized database structure (also called a database schema) that defines what features (or entities) are recorded, what their attributes are (sometimes with a limited number of choices), and how they relate to one another.
The NMBGMR began investigating the implementation of a new geologic data model when the limitations of our existing rudimentary model became apparent. Our early model could attribute only simple orientation data, contacts and faults (solid, dashed, dotted and queried), and polygons (map unit abbreviation). This was sufficient, in most cases, to create a paper plot (or PDF) of a usable geologic map. However, it offered little opportunity to store detailed data about geologic features or to adequately show relationships between them. For instance, planar fault data was recorded as a separate record from linear slickenline data on the same fault, so the topological relationship between them was lost. It was also not possible to record kinematic information about fault movement in the old model. Because of these and many similar problems, our ability to analyze these geologic map spatial databases was limited, and topological relationships between geologic features were lost or were hard to recreate when needed. We also had problems when attempting to compile geologic maps of larger areas because each quadrangle often had a 'custom' data model applied as particular data types were encountered. This was only an inconvenience while the number of GIS-based geologic maps was relatively small. However, as the number of GIS map projects multiplied, it became clear that we needed a more sophisticated geologic data model, both for consistency between maps and to allow our geologists to record a wider variety of geologic information.
Other Geologic Data Models
We began by looking at geologic data models developed by other organizations, specifically the North American Geologic Map Data Model (NADM) and its variants. The complexity of these models, coupled with the fact that they have yet to be widely implemented, suggested that they were unsuitable for our needs.
Particular criticisms of the NADM and its variants include:
- The complexity of the model makes it very difficult to use by a general audience.
- The complexity of the model is likely to frustrate geologists and cartographers alike and to slow map production considerably.
- Complex relationships between tables render individual tables incomprehensible when these joins/relates are broken, and it may not be obvious what these relationships are or how to re-establish them.
- Extensive use of numerical codes to describe geologic features, while computationally efficient, makes the database difficult to understand. The speed of modern computers, the low cost of storage, and the relatively small size of geodatabases make coded values less necessary.
- Every implementation of the 'standard' has resulted in extensive custom modifications of the data model.
The NMBGMR Model
Because existing geologic data models are too complex to be practical and otherwise don't fit our needs, we chose to create our own model from scratch (while borrowing useful ideas from other models). Since both field mapping and digitization of maps are already fairly labor-intensive, we don't want to add needless complexity to the process of producing maps. That said, we do want to have the ability to create a fully attributed geologic map in a GIS. Whether any particular map is attributed to the fullest extent possible will depend on the intended use of the map and on the interest and time available to do so.
Our Goals:
- Build a model that is comprehensive enough to record all data relevant to the creation of geologic maps across the vast diversity of geologic features mapped in New Mexico.
- Make sure that any feature data presented using cartographic enhancements to maps is actually encoded in the geodatabase so that if the map were to be recreated solely from the geodatabase, no information would be lost.
- Attempt to make the model comprehensible to field geologists, cartographers, and end-users.
- Allow the model to be simple enough that if a geologist wants to produce a map using established conventions from previous mapping projects, it won't cause problems.
- Make the model flexible enough to accommodate the wide range of mapping styles and interests of our field geologists.
- Make the model modular so that rarely used portions of the model (e.g. metamorphic isograds) can be ignored or deleted from the geodatabase if they don't apply to a given map.
- Make the benefits of a new model apparent and the effort necessary to use it worthwhile.
- Ensure that the model and symbolization used will portray geologic maps as they were intended by the author.
- Generally avoid using numerical codes for attributes in favor of English terms, even at the expense of computation speed so that the maps are easily accessible to a wide audience.
- Attempt to make attribute tables able to stand alone when joins and relates to them are broken (even at the expense of efficiency and somewhat redundant data storage -- i.e. don't strive to make a 'normalized' database).
- Construct the model such that it will work well with ArcGIS software, particularly with regard to how symbology is handled (i.e. try to use three or fewer fields as the basis for symbology—the current maximum in ArcGIS without an additional query).
- Use fieldnames and table names that can be exported to ArcInfo coverages and/or shapefile formats (table names should be 13 characters or less and field names should be 10 characters or less -- avoid unusual characters).
- Attempt to make the attribute tables extensible so that new features can be added to the model without extensive redesign of the table structure.
- The model will be principally concerned with recording data relevant to a geologic map. Other types of data may be recorded in the model to a limited degree (e.g. geochronologic data or geophysical data). However, these datasets should have separate data models (to be constructed) which may be linked to the geologic map data model.
- We hope to create GUI tools that will speed up the process of feature attribution as well as ensure accurate data entry.
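The naming goal above (table names of 13 characters or less, field names of 10 characters or less, no unusual characters) lends itself to an automated check. The following is only a minimal sketch and is not part of the NMBGMR model: the function names are hypothetical, and the rule that a name must start with a letter and contain only letters, digits, and underscores is an assumed interpretation of "avoid unusual characters".

```python
import re

# Length limits are taken from the naming goal above.
MAX_TABLE_NAME = 13
MAX_FIELD_NAME = 10

# ASSUMPTION: "unusual characters" means anything other than letters,
# digits, and underscores, with a leading letter required.
SAFE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def check_name(name, max_len):
    """Return a list of problems found with one table or field name."""
    problems = []
    if len(name) > max_len:
        problems.append(f"{name!r} has {len(name)} characters; limit is {max_len}")
    if not SAFE_NAME.fullmatch(name):
        problems.append(f"{name!r} contains characters outside A-Z, a-z, 0-9, _")
    return problems


def check_schema(schema):
    """Check a schema given as {table_name: [field_name, ...]}."""
    problems = []
    for table, fields in schema.items():
        problems += check_name(table, MAX_TABLE_NAME)
        for field in fields:
            problems += check_name(field, MAX_FIELD_NAME)
    return problems
```

For example, `check_name("MappedFtr", MAX_FIELD_NAME)` returns an empty list, whereas the unabbreviated `"MappedFeature"` would be reported as too long for a field name, which is presumably why the abbreviated form is used elsewhere in this document.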
Unresolved Issues:
- Our primary use for the data model is for 1:24,000 scale geologic maps of USGS 7.5-minute quadrangles. We think that each map should have a separate (file-based) geodatabase rather than use a single statewide geodatabase that contains every quadrangle mapped. The advantages of a single statewide database include seamlessness and simplicity of file management. The advantages of separate databases for each quad include portability, ease of distribution, ease of customization, and adaptability to different approaches to stratigraphy and mapping focus. Hopefully, this approach won't cause problems for us later.
- We are unsure of the best approach to actually constructing/modifying the database schema—whether by hand with ArcCatalog, via CASE or UML tools, or via XML.
- Parent/child/sibling relationships between features, including features in different feature classes, could be maintained using globally unique identifiers (GUIDs) or by topology rules. Using GUIDs seems cumbersome, so for now we'll try using topology rules. For example, fault measurements need to lie on fault lines. We would then need to allow exceptions to this rule for minor faults. However, in order to generate the correct symbology, an attribute MappedFtr (MappedFeature) will need to be set to true or false so that a query can determine the symbology intent.
- We don't know if our model will be compatible with the emerging Geoscience XML standard (GeoSciML) or if data export to this standard will be possible. Of particular concern are the 'earth materials' properties which appear to be much more granular in GeoSciML. We chose to use a more flexible general approach that we feel is more suitable for making geologic maps. 'Earth materials' is such a broad topic that perhaps that could be a data model unto itself.
- Database normalization: The model may allow data inconsistencies. For instance, a fault (line) could have a 'SlipBasis' attribute set to "map pattern" while a point on that fault has its 'SlipBasis' attribute set to "offset marker". Such inconsistencies will arise in the course of a mapping project, even with only one author, but this may actually be a benefit of the model. Repeating attributes on mapped linear features such as faults and on observations taken on faults that are too small to map as lines seems necessary to accommodate the way geologic maps are actually constructed in the field and the natural variation of most geologic features.
- It is unclear how best to manage repeated data collection at a single site. For measurements we don't want to appear on the map (like multiple paleocurrent measurements, or populations of fault measurements), we could set the DisplayScale to zero and have repeated Station numbers. However, a better approach would probably be to have one feature and StationID in Data_point, linked on StationID to a separate related table that has a user-defined structure.
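The related-table approach suggested in the last item above can be illustrated with a small in-memory database. This is only a sketch under stated assumptions and is not the NMBGMR schema: Data_point and StationID come from the text, but the table Station_meas, the columns X, Y, MeasType and Azimuth, and all of the values are hypothetical and invented for illustration. SQLite stands in for the geodatabase.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- One mapped feature per site (hypothetical subset of columns).
    CREATE TABLE Data_point (
        StationID TEXT PRIMARY KEY,
        X REAL,
        Y REAL
    );
    -- Hypothetical related table; in practice its structure would be
    -- user-defined, as the text suggests.
    CREATE TABLE Station_meas (
        StationID TEXT REFERENCES Data_point(StationID),
        MeasType  TEXT,
        Azimuth   REAL
    );
""")

# Made-up illustrative values: one site with three repeated measurements.
con.execute("INSERT INTO Data_point VALUES ('ST-001', 350000.0, 3800000.0)")
con.executemany(
    "INSERT INTO Station_meas VALUES (?, ?, ?)",
    [
        ("ST-001", "paleocurrent", 112.0),
        ("ST-001", "paleocurrent", 118.0),
        ("ST-001", "paleocurrent", 105.0),
    ],
)


def measurements_at(con, station_id):
    """Return every (MeasType, Azimuth) row recorded at one station."""
    rows = con.execute(
        "SELECT MeasType, Azimuth FROM Station_meas "
        "WHERE StationID = ? ORDER BY rowid",
        (station_id,),
    )
    return rows.fetchall()
```

With this layout only one point per site appears in Data_point, so the map shows a single feature, while `measurements_at(con, "ST-001")` retrieves the full population of measurements for analysis.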
This is a DRAFT of the NMBGMR Geologic Map Data Model and is expected to change significantly as various portions of it are implemented.