<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Duane,</p>
<p>At the risk of being cast out as a heretic (especially on this
list), I'd argue that much of your analysis is not spatial; it's
more analytical, statistical, operational, business intelligence,
machine learning, etc. Your most relevant prediction tools will
not be "map based." <br>
</p>
<p>First thought was yes, totally agree: Excel has all the basic
stat tools and is easy to use. Correlation is a snap to calculate.
Linear (and some non-linear) regression analysis functions are
included and work great. The chi-square test is included. (Note: the
Analysis ToolPak ships with Excel, but you have to tell Excel to
load it.) <br>
</p>
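<p>For anyone who wants to see what those spreadsheet functions compute, here is a minimal Python sketch of the Pearson correlation and a least-squares line (the same quantities Excel's CORREL, SLOPE and INTERCEPT return). The data values and variable names are made up for illustration only.</p>

```python
from math import sqrt

def pearson_and_line(xs, ys):
    """Return (r, slope, intercept) for paired samples xs, ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / sqrt(sxx * syy)           # Pearson correlation coefficient
    slope = sxy / sxx                   # least-squares slope
    intercept = mean_y - slope * mean_x
    return r, slope, intercept

# Hypothetical data: housing-condition score vs. later call count per area.
housing_score = [1, 2, 3, 4, 5]
later_calls = [2, 4, 5, 4, 5]
r, slope, intercept = pearson_and_line(housing_score, later_calls)
print(round(r, 3), round(slope, 2), round(intercept, 2))
```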
<p>I suggest asking a current master's-level or PhD candidate in
statistics or applied social sciences (with a recent and solid
stats background) to <i>briefly </i>consult on the stats end
here, especially if you intend to publish. Unless you use those
skills frequently, they fade from memory pretty quickly. It's always
good to get two sets of eyes on the stats approach anyway. <br>
</p>
<p>Preliminary thoughts.<br>
</p>
<p><b>Independent variables:</b><br>
dilapidated housing (housing conditions data)<br>
street lighting (street maps of lighting)<br>
prior crime occurred (geocoded from street address)<br>
prior fire occurred (geocoded from street address)<br>
</p>
<p><b>Dependent variables:</b><br>
<i>future </i>crime call occurs (roll officers)<br>
<i>future </i>fire call occurs (roll truck)</p>
<p>One of the biggest errors in prediction is using independent
variables that are <i>not yet available </i>when the FUTURE
prediction has to be made. You have to lag the independent
variables back in time so that they are published and loadable by
the time you make the FUTURE prediction. A high correlation with data that is
only available concurrently is worthless for <i>prediction</i>. <br>
</p>
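<p>The lagging idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name, the month-index keys and the "prior_crimes" field are hypothetical, and a real lag would be set by how late each data source is actually published.</p>

```python
def build_lagged_samples(features_by_period, outcomes_by_period, lag=1):
    """Pair each period's outcome with features from `lag` periods earlier."""
    samples = []
    for period in sorted(outcomes_by_period):
        feature_period = period - lag
        # Keep only pairs whose features already existed before the outcome.
        if feature_period in features_by_period:
            samples.append((feature_period, period,
                            features_by_period[feature_period],
                            outcomes_by_period[period]))
    return samples

# Hypothetical monthly data for one area, keyed by month index.
features = {1: {"prior_crimes": 3}, 2: {"prior_crimes": 5}, 3: {"prior_crimes": 2}}
outcomes = {1: 0, 2: 1, 3: 1}   # 1 = a call occurred in that month

samples = build_lagged_samples(features, outcomes, lag=1)
for feature_period, outcome_period, feats, outcome in samples:
    print(feature_period, outcome_period, feats, outcome)
```

<p>Month 1 has no earlier features, so it drops out of the training set; that loss of the earliest periods is the price of an honest prediction setup.</p>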
One approach might be: create a grid over the area and assign all
variables to a grid cell. Then consider each cell a sample. At this
point the analysis is "flat", not really spatial. <br>
<br>
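<p>A rough sketch of the grid step, assuming incidents are already geocoded to latitude/longitude. The origin, the cell size in degrees and the sample points are hypothetical; a real project would more likely use a projected coordinate system and cells sized in meters or feet.</p>

```python
from collections import Counter
from math import floor

def cell_for(lat, lon, origin_lat, origin_lon, cell_deg):
    """Return the (row, col) grid cell containing a point."""
    row = floor((lat - origin_lat) / cell_deg)
    col = floor((lon - origin_lon) / cell_deg)
    return row, col

# Hypothetical geocoded incidents (lat, lon) and grid parameters.
incidents = [(30.443, -84.287), (30.447, -84.283), (30.472, -84.255)]
origin_lat, origin_lon, cell_deg = 30.40, -84.30, 0.01

# One count per cell: each cell then becomes one sample in the "flat" analysis.
counts = Counter(
    cell_for(lat, lon, origin_lat, origin_lon, cell_deg)
    for lat, lon in incidents
)
print(counts)
```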
Then you start to think: what about the potential influence of
"nearby" cells? And counts/trends over time, recency, occurrence nearby?
Perhaps more independent variables? Did the assessor's office or
other departments know the property was vacant? Owner-occupied or
rented? Occupied by a business with a SIC code known to use highly
flammable supplies? How about known prior bad actors and where they
live and visit? Things can get complex quickly. (Can we scan license
plates? Run facial ID?)<br>
<br>
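<p>One simple way to bring "nearby" cells into an otherwise flat table is to add a neighbor count as an extra independent variable. The sketch below sums prior incidents in the eight surrounding cells; the function name and the counts are hypothetical, and other neighborhood definitions (distance bands, time decay) are equally plausible.</p>

```python
def neighbor_total(counts, cell):
    """Sum counts in the eight cells surrounding `cell` (the cell itself excluded)."""
    row, col = cell
    total = 0
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if (d_row, d_col) != (0, 0):
                total += counts.get((row + d_row, col + d_col), 0)
    return total

# Hypothetical prior-incident counts per (row, col) grid cell.
prior_counts = {(4, 1): 2, (4, 2): 1, (5, 2): 3, (7, 4): 1}

for cell in sorted(prior_counts):
    print(cell, prior_counts[cell], neighbor_total(prior_counts, cell))
```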
As the thirst for more and better independent variables continues
unabated, the<b> Extract, Transform, Load</b> (ETL) workload will grow
dramatically. The tool chain used to produce more or less continuous
predictions needs to be efficient. You need to be able to add in
future data <i>streams </i>(in near real time, vs. batch). You
need to be able to automate as much "data acquisition" as possible:
write the converter(s) once and use them many times. If the people doing
the data collection and cleanup work turn over in their jobs,
the process itself needs to document clearly what they did, so the
next person doesn't start from scratch again and again. ETL<i> as a
separate function</i> is not to be underestimated. None of it is
mapping oriented. <br>
<br>
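<p>As a small illustration of "write the converter once," here is a sketch of one converter that maps a source file into a common record layout. The column names ("Incident Address", "Call Date") and the output fields are assumptions for the example, not the layout of any actual department export.</p>

```python
import csv
import io

def convert_fire_calls(raw_csv_text):
    """Convert one (hypothetical) fire-call export into the common record layout."""
    records = []
    for row in csv.DictReader(io.StringIO(raw_csv_text)):
        records.append({
            "event_type": "fire",
            # Normalize addresses so the geocoder sees consistent input.
            "address": row["Incident Address"].strip().upper(),
            "date": row["Call Date"].strip(),
        })
    return records

# Hypothetical raw export with inconsistent spacing and capitalization.
raw = "Incident Address,Call Date\n 123 Main St ,2019-03-01\n45 oak ave,2019-03-02\n"
records = convert_fire_calls(raw)
for record in records:
    print(record)
```

<p>Each new data source gets its own small converter that emits the same record layout, so the cleanup rules live in code rather than in one person's head.</p>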
Of course you want to do some really solid interviews with the fire
chief and police staff with long experience: find out which data
sources they use and the "intelligence" they apply in predicting trouble. Let
those in-depth interviews guide the process of ferreting out the
best independent variables you can grab in the time frame you need
them. <br>
<br>
Here are some key search terms/phrases in this area for police work,
from source #2 below (I'm sure there are parallels for fire):<br>
<blockquote>"machine learning" AND crime/offense<br>
crime/offense AND predict*/forecast*/map*<br>
"predictive policing"<br>
"risk terrain modeling"<br>
"prospective hot spot/hot-spot analysis/mapping"<br>
"prospective hot-spotting"<br>
"spatiotemporal crime forecasting"<br>
"predictive/prospective crime mapping/analysis"<br>
</blockquote>
<b>Background Papers<br>
<br>
</b>Office of Justice Programs (unit of DOJ)<br>
RAND report on Predictive Policing<br>
<a class="moz-txt-link-freetext" href="https://www.ncjrs.gov/pdffiles1/nij/grants/243830.pdf">https://www.ncjrs.gov/pdffiles1/nij/grants/243830.pdf</a><br>
[RAND tends to do great work but may be slightly dated?]<br>
<p class="MsoNormal">A Scoping Review Of Predictive Analysis
Techniques For
Predicting Criminal Events<br>
<a class="moz-txt-link-freetext" href="https://www.researchgate.net/profile/Lieven_Pauwels/publication/321833027_A_scoping_review_of_predictive_analysis_techniques_for_predicting_criminal_events/links/5a33e45b45851532e82c9411/A-scoping-review-of-predictive-analysis-techniques-for-predicting-criminal-events.pdf">https://www.researchgate.net/profile/Lieven_Pauwels/publication/321833027_A_scoping_review_of_predictive_analysis_techniques_for_predicting_criminal_events/links/5a33e45b45851532e82c9411/A-scoping-review-of-predictive-analysis-techniques-for-predicting-criminal-events.pdf</a><br>
[Good literature review]<br>
</p>
<p class="MsoNormal">Most of the above is likely to be overkill,
especially to start. Still, there may be some nuggets in there to
help avoid a false start.<br>
</p>
<p class="MsoNormal">If you wanted to "try out" current predictive
technology, perhaps fund a grad student or two at FSU? I'm thinking
use Python and standard (well understood) <b>Python
libraries</b> for statistics and machine learning. Keep it all <i>really
simple. Let them focus on <b>demonstrating</b> the
prediction/learning side. </i>Output the predictions with lat/long info
attached. To start, just import their predictive output into your
mapping systems. Up front, the Python analytics would play really
well with traditional mapping products downstream. Best of all,
Python is now mainstream, with incredible pre-written libraries
already "on the shelf," likely to be around a long time, and ready to be
strung together. <br>
</p>
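<p class="MsoNormal">To make the hand-off concrete, here is a minimal sketch of writing predictions with lat/long attached as GeoJSON, a format most mapping products can import. The prediction tuples and the "risk" property name are hypothetical; a plain CSV with latitude and longitude columns would serve the same purpose.</p>

```python
import json

def predictions_to_geojson(predictions):
    """Build a GeoJSON FeatureCollection from (lat, lon, risk) tuples."""
    features = []
    for lat, lon, risk in predictions:
        features.append({
            "type": "Feature",
            # GeoJSON coordinate order is [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"risk": risk},
        })
    return {"type": "FeatureCollection", "features": features}

# Hypothetical model output: cell-center latitude, longitude, predicted risk.
predictions = [(30.445, -84.285, 0.72), (30.475, -84.255, 0.10)]
geojson_text = json.dumps(predictions_to_geojson(predictions), indent=2)
print(geojson_text)
```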
<p class="MsoNormal">Avoid a "one-off" totally custom solution. Make
sure FSU knows you want the simplest solution possible, using only
the most standard Python coding and well-established libraries.
Don't let it get esoteric. <br>
</p>
<p class="MsoNormal">Rick<br>
</p>
--
<pre class="moz-signature" cols="72">Richard J. Labs, CFA, CPA
CL&B Capital Management, LLC
Phone: 315-637-0915
E-mail (preferred for efficiency): <a class="moz-txt-link-abbreviated" href="mailto:rick@clbcm.com">rick@clbcm.com</a>
3209 Yorktown Dr, Tallahassee, FL 32312
June-August: 408B Holiday Harbour, Canandaigua, NY 14424</pre>
</body>
</html>