shrug-l: Data Drift in GeoDataBases
Clay, Linc
Linc.Clay@dep.state.fl.us
Fri, 18 Oct 2002 13:42:27 -0400
We have provided Mark some direct responses based on our experience. For
the good of the order, since this is an interesting topic, here are those
responses.
DEP's general response --
Mark,
My guess is that the coordinate creep is a function of the data type or
precision (or scale, more correctly) in which the coordinates are stored.
The GDB may use a different data type, so you may, in essence, be seeing
rounding or truncation error when moving from one form to another. This
would also explain why the creep was retained when you went back to the
native form.
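
As a rough illustration of this guess, the sketch below pushes a coordinate
through a narrower floating-point type and back. It assumes, purely for
illustration, that one format holds 32-bit floats and the other 64-bit
floats; the actual data types involved are not known here. The sample
coordinate is borrowed from the ArcSDE response later in this message, and
the helper name through_float32 is hypothetical.

```python
import struct

def through_float32(value: float) -> float:
    """Store a 64-bit float as a 32-bit float, then read it back as 64-bit."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

original = 652123.1524786              # sample coordinate (assumed 64-bit source)
narrowed = through_float32(original)   # value after a trip through the narrower type
restored = float(narrowed)             # converting back cannot recover the lost digits

print(narrowed)                        # 652123.125
print(restored == narrowed)            # True: the creep is retained
```

At this magnitude a 32-bit float can only step in increments of 0.0625, so
the value lands on the nearest such step and stays there.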
-Linc
DEP's ArcSDE specific response --
Mark,
My answer assumes that your GeoDataBase is ArcSDE, but it is probably true
for personal GDBs as well if they use the same concepts for storage and
access, which I believe they do. If I understand the problem correctly . . .
ArcSDE will always be offset from the original data source at some level
of precision. Vertices in many of our layers are 2mm (.002) 'off' from the
source data. This is inherent in the nature of the way ArcSDE works.
ArcSDE stores coordinates as whole integers (faster access through easier
computation). These integers are created by multiplying each coordinate by
a precision scale factor that you supply to ArcSDE when loading data (we
normally use 2000) and keeping only the whole-number part. ArcSDE would
take a number 652123.1524786 and store it as 1304246304 and then, when
retrieving the data, divide it by the factor (2000 for this example) to
get 652123.152 for 'proper' display. Obviously, 652123.1524786 and
652123.152 are not coincident. And if the original ten-thousandths numeral
had been 5 or greater, the coordinate pair that this number helped
construct might be sitting even further away when it comes back out of the
database.
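
The arithmetic above can be sketched in a few lines. This is only a model
of the description, not ArcSDE code: it assumes the fractional part is
dropped (which is what the figures above suggest, since
652123.1524786 * 2000 = 1304246304.9572) and it ignores any coordinate
offsets ArcSDE may apply. The function names are hypothetical.

```python
from decimal import Decimal

def to_stored(coord: str, scale: int) -> int:
    """Multiply by the scale factor and drop the fraction (assumed behavior)."""
    return int(Decimal(coord) * scale)

def from_stored(stored: int, scale: int) -> Decimal:
    """Divide the stored integer by the same scale factor."""
    return Decimal(stored) / Decimal(scale)

original = "652123.1524786"
stored = to_stored(original, 2000)
restored = from_stored(stored, 2000)
drift = Decimal(original) - restored

print(stored)    # 1304246304
print(restored)  # 652123.152
print(drift)     # 0.0004786
```

Decimal is used instead of float so that the printed digits match the
hand calculation exactly.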
In other words, ArcSDE is (in effect) snapping coordinates/vertices to a
grid. That grid might be a decimeter grid, a millimeter grid, a nanometer
grid, etc., depending on the precision scale factor used when a layer is
created. This phenomenon is not normally going to depend on the precision
of the data source, in the sense of seeing a difference between a
single-precision and a double-precision ARC/INFO coverage. I would guess
the difference between the two is usually beyond the precision level that
ArcSDE is requested to store.
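
To make the grid idea concrete, here is a small sketch relating the scale
factor to the grid spacing. It assumes the spacing is simply 1 divided by
the scale factor, in whatever units the coordinates use (meters would give
a decimeter grid at a factor of 10), and that snapping drops the fraction
as in the example above. grid_spacing and snap are hypothetical names.

```python
from decimal import Decimal

def grid_spacing(scale: int) -> Decimal:
    """Distance between neighboring grid positions, in coordinate units."""
    return Decimal(1) / Decimal(scale)

def snap(coord: str, scale: int) -> Decimal:
    """Coordinate as it would come back after integer storage (assumed truncation)."""
    return Decimal(int(Decimal(coord) * scale)) / Decimal(scale)

for scale in (10, 1000, 2000):
    print(scale, grid_spacing(scale), snap("652123.1524786", scale))
```

With these inputs the spacing comes out as 0.1, 0.001, and 0.0005
respectively, and the sample coordinate lands on 652123.1 at a factor of 10
and on 652123.152 at factors of 1000 and 2000.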
The drift is definitely going to be present in shapefiles and coverages
created from ArcSDE layers because, as far as ArcSDE is concerned, the
number is 652123.152, not 652123.1524786, and ArcSDE has no knowledge of
the original.
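
A short sketch of why the original cannot be recovered on export, under the
same assumptions as before (multiply by the factor, drop the fraction, no
offsets). The second input value is made up for illustration, and the
function names are hypothetical.

```python
from decimal import Decimal

SCALE = 2000  # precision scale factor used in the example above

def to_stored(coord: str) -> int:
    """Integer kept in the database (assumed truncation)."""
    return int(Decimal(coord) * SCALE)

def export(stored: int) -> Decimal:
    """Coordinate written to a shapefile or coverage on the way out."""
    return Decimal(stored) / Decimal(SCALE)

a = to_stored("652123.1524786")   # value from the example above
b = to_stored("652123.1520001")   # a different, made-up source value

exported = export(a)
reloaded = to_stored(str(exported))

print(a == b)          # True: two different originals share one stored integer
print(exported)        # 652123.152
print(reloaded == a)   # True: reloading the export lands on the same grid point
```

Because distinct source values collapse to the same integer, nothing in the
stored layer says which one it started from.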
When determining the precision factor for a layer, ESRI suggests using
only as much precision as you need. For instance, we break this rule when
we use the precision factor of 2000 for data whose source is 1:2,000,000.
A factor of only 1 or 10 might suffice, but of course it wouldn't 'look'
good if users zoomed in to 1:1 scale or greater in their view and compared
that layer to the source data. ESRI's reasoning is that greater precision
increases the database size, and their assumption is that you never want
more data than you need to get the job done because it involves greater
cost in terms of disk space and computation time.
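
As a very rough feel for that trade-off, the sketch below counts the bits
needed to hold the stored integer at different precision factors. This is
only a proxy: how ArcSDE actually encodes integers on disk is not described
here, so the bit count is an assumption standing in for "bigger numbers
cost more". stored_bits is a hypothetical helper.

```python
from decimal import Decimal

def stored_bits(coord: str, scale: int) -> int:
    """Bits needed for the scaled, truncated integer (rough size proxy only)."""
    return int(Decimal(coord) * scale).bit_length()

for scale in (1, 10, 2000):
    print(scale, stored_bits("652123.1524786", scale))
```

For the sample coordinate this gives 20, 23, and 31 bits, so each
coordinate grows by about half again going from a factor of 1 to 2000.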
-Eric Brockwell
Linc Clay
Bureau of Information Systems/GIS
2600 Blair Stone Rd., MS 6520
Tallahassee, FL 32399-2400
Voice: 850-245-8295 / SC 205-8295
Fax: 850-245-8263 / SC 205-8263
E-mail: linc.clay@dep.state.fl.us