shrug-l: Data Drift in GeoDataBases

Clay, Linc Linc.Clay@dep.state.fl.us
Fri, 18 Oct 2002 13:42:27 -0400


We have provided Mark some direct responses based on our experience.
For the good of the order, since this is an interesting topic, here are
those responses.

DEP's general response --

Mark,

My guess is that the coordinate creep is a function of the data type or
precision (or, more correctly, scale) in which the coordinates are
stored.  The GDB may use a different data type, so you may in essence be
seeing "rounding" or truncation error when moving from one form to
another.  This would also explain why the creep was retained when you
went back to the native form.

-Linc

DEP's ArcSDE specific response --

Mark,

My answer assumes that your GeoDataBase is ArcSDE, but it is probably
true for personal GDBs as well if they use the same concepts for storage
and access, which I believe they do.  If I understand the problem
correctly . . .

ArcSDE will always be offset from the original data source at some level
of precision.  Vertices in many of our layers are 2mm (.002) 'off' from
the source data.  This is inherent in the way ArcSDE works.  ArcSDE
stores coordinates as whole integers (faster access through easier
computation).  These integers are created by a precision scale factor
that you supply to ArcSDE when loading data (we normally use 2000).
ArcSDE would take a number such as 652123.1524786, multiply it by the
factor, and store the whole-number part as 1304246304.  Then, when
retrieving the data, it divides by the factor (2000 for this example) to
get 652123.152 for 'proper' display.  Obviously, 652123.1524786 and
652123.152 are not coincident.  And if the original ten-thousandths
numeral had been 5 or greater, the coordinate pair that this number
helped construct might be sitting even further away when it comes back
out of the database.
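
The arithmetic above can be sketched in a few lines of Python.  This is
only an illustration of the numbers in this message, not ArcSDE code:
the function names are made up, the fraction is assumed to be dropped
(as the 1304246304 example suggests) rather than rounded, and any other
adjustment ArcSDE may apply to coordinates is ignored.

```python
import math

SCALE = 2000  # precision scale factor used in the example above

def to_stored(coord, scale=SCALE):
    """Hypothetical load step: scale the coordinate and drop the fraction."""
    return math.floor(coord * scale)

def from_stored(stored, scale=SCALE):
    """Hypothetical retrieval step: divide the stored integer by the factor."""
    return stored / scale

original = 652123.1524786
stored = to_stored(original)      # whole integer kept in the database
restored = from_stored(stored)    # value handed back on retrieval
drift = original - restored       # offset between source and retrieved value

print(stored, restored, drift)
```

The drift here is a little under half a millimeter if the units are
meters, which is the kind of offset described above.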

In other words, ArcSDE is (in effect) snapping coordinates/vertices to a
grid.  That grid might be a decimeter grid, a millimeter grid, a
nanometer grid, etc., depending on the precision scale factor used when
a layer is created.  This phenomenon is not normally going to depend on
the precision of the data source, in terms of seeing a difference
between a single-precision and a double-precision ARC/INFO coverage.  I
would guess the difference between the two is usually beyond the
precision level that ArcSDE is requested to store.
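
To make the grid idea concrete, here is a small sketch assuming the
coordinates are in meters and that the grid cell size is simply one
divided by the precision scale factor.  The snap() helper is a made-up
name for illustration, and it snaps downward to match the truncation in
the earlier example.

```python
import math

def snap(coord, scale):
    """Snap a coordinate down to a grid whose cell size is 1/scale."""
    return math.floor(coord * scale) / scale

coord = 652123.1524786  # example value from this message, assumed meters

# A factor of 10 gives a decimeter grid, 1000 a millimeter grid,
# and 2000 a half-millimeter grid.
for scale in (10, 1000, 2000):
    print(f"factor {scale:>4}: cell size {1 / scale} m, snapped {snap(coord, scale)}")
```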

The drift is definitely going to be present in shapefiles and coverages
created from ArcSDE layers because, as far as ArcSDE is concerned, the
number is 652123.152, not 652123.1524786, and ArcSDE has no knowledge of
the original.
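
A round trip can be sketched the same way.  The load() and export()
helpers below are hypothetical stand-ins for loading into and exporting
from a layer, again assuming a factor of 2000 and truncation; Decimal is
used so that the comparison is exact rather than subject to
floating-point noise.

```python
from decimal import Decimal, ROUND_FLOOR

SCALE = Decimal(2000)  # precision scale factor assumed for the layer

def load(coord):
    """Hypothetical load: scale and keep only the whole-integer part."""
    return int((coord * SCALE).to_integral_value(rounding=ROUND_FLOOR))

def export(stored):
    """Hypothetical export: divide the stored integer by the factor."""
    return Decimal(stored) / SCALE

source = Decimal("652123.1524786")
first_export = export(load(source))         # e.g. a shapefile made from the layer
second_export = export(load(first_export))  # reloading and exporting again

print(first_export, second_export, source - first_export)
```

In this sketch the second pass changes nothing, since the exported value
already sits on the grid, but the lost digits of the source never come
back.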

When determining the precision factor for a layer, ESRI suggests using
only as much precision as you need.  For instance, we break this rule
when we use the precision factor of 2000 for data whose source is
1:2,000,000.  A factor of only 1 or 10 might suffice, but of course it
wouldn't 'look' good if users zoomed in to 1:1 scale or greater in their
view and compared that layer to the source data.  ESRI's reasoning is
that greater precision increases the database size, and their assumption
is that you never want more data than you need to get the job done
because it involves greater cost in terms of disk space and computation
time.
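
One rough way to see why a larger factor costs more is to look at how
big the stored integers get.  This sketch only counts the bits needed
for the scaled example value; how ArcSDE actually encodes integers on
disk is not covered here, so treat the link to database size as an
assumption.

```python
import math

def stored_value(coord, scale):
    """Hypothetical stored integer for a coordinate at a given factor."""
    return math.floor(coord * scale)

coord = 652123.1524786  # example value from this message

# Bits needed to hold the stored integer at each precision scale factor.
bits = {scale: stored_value(coord, scale).bit_length() for scale in (1, 10, 2000)}

for scale, n in bits.items():
    print(f"factor {scale:>4}: stored {stored_value(coord, scale)} needs {n} bits")
```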

-Eric Brockwell

Linc Clay
Bureau of Information Systems/GIS
2600 Blair Stone Rd., MS 6520
Tallahassee, FL 32399-2400
Voice: 850-245-8295 / SC 205-8295
Fax: 850-245-8263 / SC 205-8263
E-mail: linc.clay@dep.state.fl.us