CPRA Request For hosting E911 Realtime Data On data.smcgov.org
Technical Issues
Robert Harker, Open Data Evangelist
September 11th, 2015
===== E911 call data =====
E911 calls are 911 emergency calls not routed to law enforcement
The county already publishes realtime E911 call data to the web at:
http://www.firedispatch.com/
Data published includes:
datetime
Incident type
Incident number
Address:
Medical aid calls are anonymised to the street level
Other aid calls include exact street addresses
Responding organization
Unit responding(?)
Other information field (multiple units responding?)
This information is published to the public Internet using a contracted service provided by San Mateo Regional Network, Inc.:
http://www.smrn.com/
They also provide the actual E911 call center software as managed software as a service (SaaS).
===== Data pipeline =====
Standard Jenkins build job:
Data submitted -> archive -> pre-process/validate -> upload to Socrata ->
-> verify data by downloading -> archive download results
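The stages above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `upload` and `download` callables stand in for whatever Socrata client the Jenkins job would actually use, and the required-field rule in `validate` is a guess at what pre-processing might check.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Dict, List

Row = Dict[str, str]

# Assumed minimum set of fields a row must carry to pass validation.
REQUIRED_FIELDS = ("Date", "Time", "Incident Number")


def validate(rows: List[Row]) -> List[Row]:
    """Keep only rows that have a non-empty value for every required field."""
    return [r for r in rows if all(r.get(f) for f in REQUIRED_FIELDS)]


def archive(rows: List[Row], directory: Path, label: str) -> Path:
    """Write rows to a JSON file named after a hash of its content."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()[:12]
    path = directory / f"{label}-{digest}.json"
    path.write_bytes(payload)
    return path


def run_pipeline(
    rows: List[Row],
    directory: Path,
    upload: Callable[[List[Row]], None],    # hypothetical Socrata upload hook
    download: Callable[[], List[Row]],      # hypothetical Socrata download hook
) -> bool:
    """Archive, validate, upload, then verify by downloading and archiving."""
    archive(rows, directory, "submitted")
    clean = validate(rows)
    upload(clean)
    fetched = download()
    archive(fetched, directory, "downloaded")
    return fetched == clean
```

Passing the upload and download steps in as callables keeps the sketch independent of any particular Socrata library and makes the verify step easy to test with an in-memory stand-in.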
===== Proposed Realtime Data Feed Solution =====
Need for County relay between data source and Socrata
===== E911 Realtime Call Feed =====
The firedispatch site receives its data as a ??? feed from the central E911 call center software.
The data feed is:
Data stream:
Data source:
Data destination:
Protocol used:
Format of data stream:
===== Data Structure Of firedispatch Web Site =====
What is the name of the table(s) that contains the data:
Are any of them small static translation tables for data in columns:
Fire station ID number to fire station name, unit type ID to unit type name
What are the column names for the data that is published on firedispatch.com:
Datetime:
Type of incident:
Incident number:
Equipment dispatched:
Address:
Department responding:
Other data: looks like additional units dispatched:
X,Y location: can be inferred from map location of a marker
Does this make sense? Overkill? Underkill?
===== Data Upgrade Request =====
If possible, addresses that are currently anonymised to the street level should be upgraded to anonymised block-level addresses.
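As a rough illustration of what block-level anonymisation could look like: the sketch below assumes "block level" means rounding the house number down to the nearest hundred, which may not match the county's own definition. The function name `to_block_level` and the `BLOCK` wording are hypothetical.

```python
import re

# A leading run of digits followed by the rest of the street address.
_HOUSE_NUMBER = re.compile(r"^(\d+)\s+(.*)$")


def to_block_level(address: str, block_size: int = 100) -> str:
    """Round the house number down to its block; leave other addresses alone."""
    cleaned = address.strip()
    match = _HOUSE_NUMBER.match(cleaned)
    if match is None:
        # No house number (already street level), so nothing to obscure.
        return cleaned
    number, street = int(match.group(1)), match.group(2)
    block = (number // block_size) * block_size
    return f"{block} BLOCK {street}"


print(to_block_level("2036 HARDING AV"))   # 2000 BLOCK HARDING AV
print(to_block_level("430 E O KEEFE ST"))  # 400 BLOCK E O KEEFE ST
print(to_block_level("LAKEVIEW DR"))       # LAKEVIEW DR
```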
===== Data that is published on firedispatch =====
Here is what I found so far:
Date,Time,Response time(?),Incident Number,Fire Department,
Incident Type,Street Address,City,"Units Dispatched"
1/1/2015,12:00:00 AM,(25 min),CCF150010001,Central County Fire,
Medical aid,LAKEVIEW DR,HIL,"E33"
1/1/2015,1:31:11 AM,(12 min),SMF150010004,San Mateo Fire,
Full assignment response,2036 HARDING AV,SMO,"BC5,E21,E24,E26,PT21"
1/1/2015,1:51:13 AM,(28 min),MNF150010006,Menlo Park Fire,
Fire alarm - smoke detector,430 E O KEEFE ST,EPA,"E1"
For Medical Aid calls, the block-level location is published as data points on the realtime map.
The text address, however, is only published to the street level.
A street-level address is not very useful for placing a call on a map.
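The sample above wraps each record across two physical lines. A minimal parsing sketch follows, assuming that a line ending in a comma always continues on the next line (this would misfire if a complete record ever ended with an empty field). The helper names are hypothetical, and the embedded sample repeats two of the rows listed above.

```python
import csv

SAMPLE = """\
Date,Time,Response time(?),Incident Number,Fire Department,
Incident Type,Street Address,City,"Units Dispatched"
1/1/2015,12:00:00 AM,(25 min),CCF150010001,Central County Fire,
Medical aid,LAKEVIEW DR,HIL,"E33"
1/1/2015,1:51:13 AM,(28 min),MNF150010006,Menlo Park Fire,
Fire alarm - smoke detector,430 E O KEEFE ST,EPA,"E1"
"""


def join_wrapped_lines(text):
    """Merge physical lines into logical records (trailing comma = continues)."""
    logical, pending = [], ""
    for line in text.splitlines():
        pending += line.strip()
        if not pending.endswith(","):
            logical.append(pending)
            pending = ""
    if pending:
        logical.append(pending)
    return logical


def parse_incidents(text):
    """Return one dict per incident, keyed by the header names."""
    reader = csv.reader(join_wrapped_lines(text))
    header = next(reader)
    return [dict(zip(header, row)) for row in reader]


incidents = parse_incidents(SAMPLE)
print(incidents[0]["Incident Number"])  # CCF150010001
print(incidents[1]["Street Address"])   # 430 E O KEEFE ST
```

Because the units column is quoted, `csv.reader` keeps a multi-unit value together as one field; it can then be split on commas if a list of units is wanted.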
===== Source Database Information Security =====
Basic philosophy:
Make all DB queries read-only, selecting only a minimal set of columns, and run them from an unprivileged DB account. The DB admin controls access.
To limit the data that my SQLtoREST program can access:
Create a new unprivileged read-only Microsoft SQL Server account.
Explicitly grant read-only access to exactly the columns in the table(s) that are already published.
Perform any address obfuscation, such as reducing addresses to street level or block level, on the SQL server side before the data is returned to my SQLtoREST program.
Could this be a stored procedure on the server?
Limit the age of records on the server side to less than 14 days old.
For additional safety, use a TLS or SSH tunnel authenticated with shared public keys so that access is only allowed from a County designated data translation host.
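The query side of these limits might look something like the sketch below. It is illustrative only: the table name `incidents`, the column names, and the use of `sqlite3` as a stand-in for Microsoft SQL Server are all assumptions, and the server-side grants described above would remain the real control. The client-side whitelist is just a second line of defence.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical names for the columns that are already published.
PUBLISHED_COLUMNS = ("incident_number", "incident_time", "incident_type", "street")
MAX_AGE_DAYS = 14


def build_query(columns):
    """Build a SELECT over whitelisted columns only, with an age cutoff."""
    unknown = set(columns) - set(PUBLISHED_COLUMNS)
    if unknown:
        raise ValueError(f"columns not published: {sorted(unknown)}")
    return (
        f"SELECT {', '.join(columns)} FROM incidents "
        "WHERE incident_time >= ? ORDER BY incident_time"
    )


def recent_incidents(conn, columns, now):
    """Fetch rows no older than MAX_AGE_DAYS relative to `now`."""
    cutoff = (now - timedelta(days=MAX_AGE_DAYS)).isoformat(sep=" ")
    return conn.execute(build_query(columns), (cutoff,)).fetchall()
```

Column names are checked against a fixed whitelist before being placed in the SQL text, and the cutoff is passed as a bound parameter rather than formatted into the string.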
===== Previous year of historical E911 call data =====
The previous year of E911 call data is requested to provide a useful dataset for the County and the public to experiment (play) with.
Additionally, metadata (information) about how the request was satisfied will be included.
Fulfillment metadata:
Name of dataset
Description of dataset
Date generated
Internal dataset name
SQL query or script used to generate the data
Employee/title legally responsible for the data
Employee/title actually generating the data
Any additional information about the dataset the department thinks is useful
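One possible machine-readable shape for this metadata is sketched below, so that it could travel alongside the dataset as a small JSON file. The class name, field names, and JSON layout are all hypothetical; they simply mirror the list above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class FulfillmentMetadata:
    """One record describing how a data request was satisfied."""
    dataset_name: str
    description: str
    date_generated: str
    internal_dataset_name: str
    query_or_script: str          # SQL query or script used to generate the data
    responsible_employee: str     # employee/title legally responsible
    generating_employee: str      # employee/title actually generating the data
    additional_information: str = ""

    def to_json(self) -> str:
        """Serialize the record as indented JSON."""
        return json.dumps(asdict(self), indent=2)
```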