Come take a look over my shoulder on a typical day

I thought I’d explain a bit about how my open data projects function behind the scenes so people get an insight into the level of labour and expertise they require.

I administer my servers using CPanel (which takes care of most administrative functions for me) and any data in my open data projects is held in a MySQL database. This allows me to perform queries on individual tables or across multiple tables. These queries allow users to search the data to get just the cells they want, transforming an Excel/CSV file like this which has 91 columns (87 of which are the QPS offence types)…

…into the simple chart below. You can only see the first 12 columns in the image above. This is because that is all that fits into the screen. The columns continue on, and on, and on for 91 columns/fields which is WHY we need to use a database simply just to find the row or cell that we want. To do this we need to use MySQL and end up with results like this:

That little table might look simple but getting those numbers is deceptively hard work! To begin with, MySQL is used to ask the database for specific data and can also be used to transform that data through algorithms. The example above has to perform an algorithm to transform (summed & average) the 12 data points for each District per offence type into a single number and THEN comparing each to all the others to allow them to be ranked to see which District has the most or least recorded rates for each (or all) offences.

The actual query is quite ridiculous and looks like this:

Giant query to rank all Districts on all offences


SELECT ERP,District,crime_table.Division,

(

(sum(Homicide_Murder)/12)+

(sum(Manslaughter_excl_by_driving)/12)+

(sum(Manslaughter_Unlawful_Striking_Causing_Death)/12)+

(sum(Driving_Causing_Death)/12)+

(sum(Attempted_Murder)/12)+

(sum(Conspiracy_to_Murder)/12)+

(sum(Kidnapping_Abduction)/12)+

(sum(Extortion)/12)+

(sum(Stalking)/12)+

(sum(Life_Endangering_Acts)/12)+

(sum(Arson)/12)+

(sum(Breach_Domestic_Violence_Protection_Order)/12)+

(sum(Grievous_Assault)/12)+

(sum(Serious_Assault)/12)+

(sum(Serious_Assault_Other)/12)+

(sum(Common_Assault)/12)+

(sum(Rape_and_Attempted_Rape)/12)+

(sum(Other_Sexual_Offences)/12)+

(sum(Armed_Robbery)/12)+

(sum(Unarmed_Robbery)/12)+

(sum(Stealing_from_Dwellings)/12)+

(sum(Shop_Stealing)/12)+

(sum(Unlawful_Use_of_Motor_Vehicle)/12)+

(sum(Vehicles_steal_from_enter_with_intent)/12)+

(sum(Other_Stealing)/12)+

(sum(Possess_Property_Suspected_Stolen)/12)+

(sum(Receiving_Stolen_Property)/12)+

(sum(Possess_etc_Tainted_Property)/12)+

(sum(Other_Handling_Stolen_Goods)/12)+

(sum(Other_Property_Damage)/12)+

(sum(Unlawful_Entry_With_Intent_Dwelling)/12)+

(sum(Unlawful_Entry_Without_Violence_Dwelling)/12)+

(sum(Unlawful_Entry_With_Violence_Dwelling)/12)+

(sum(Unlawful_Entry_With_Intent_Shop)/12)+

(sum(Unlawful_Entry_With_Intent_Other)/12)+

(sum(Trespassing_and_Vagrancy)/12)+

(sum(Fraud_by_Computer)/12)+

(sum(Fraud_by_Cheque)/12)+

(sum(Fraud_by_Credit_Card)/12)+

(sum(Identity_Fraud)/12)+

(sum(Other_Fraud)/12)+

(sum(Trafficking_Drugs)/12)+

(sum(Possess_Drugs)/12)+

(sum(Sell_Supply_Drugs)/12)+

(sum(Other_Drug_Offences)/12)+

(sum(Found_in_Places_Used_for_Purpose_of_Prostitution_Offences)/12)+

(sum(Have_Interest_in_Premises_Used_for_Prostitution_Offences)/12)+

(sum(Knowingly_Participate_in_Provision_Prostitution_Offences)/12)+

(sum(Public_Soliciting)/12)+

(sum(Procuring_Prostitution)/12)+

(sum(Permit_Minor_to_be_at_a_Place_Used_for_Prostitution_Offences)/12)+

(sum(Advertising_Prostitution)/12)+

(sum(Other_Prostitution_Offences)/12)+

(sum(Unlawful_Possess_Concealable_Firearm)/12)+

(sum(Unlawful_Possess_Firearm_Other)/12)+

(sum(Bomb_Possess_and_or_use_of)/12)+

(sum(Possess_and_or_use_other_weapons_restricted_items)/12)+

(sum(Weapons_Act_Offences_Other)/12)+

(sum(Good_Order_Offences)/12)+

(sum(Disobey_Move_on_Direction)/12)+

(sum(Resist_Incite_Hinder_Obstruct_Police)/12)+

(sum(Fare_Evasion)/12)+

(sum(Public_Nuisance)/12)+

(sum(Traffic_and_Related_Offences)/12)+

(sum(Dangerous_Operation_of_a_Vehicle)/12)+

(sum(Drink_Driving)/12)+

(sum(Disqualified_Driving)/12)+

(sum(Interfere_with_Mechanism_of_Motor_Vehicle)/12)+

(sum(Liquor_excl_Drunkenness)/12)+

(sum(Gaming_Racing_Betting_Offences)/12)+

(sum(Stock_Related_Offences)/12)+

(sum(Miscellaneous_Offences)/12)+

(sum(Other_Offences)/12)

/73)


as sum_of_all


 


FROM crime_table_2016 JOIN other_table ON crime_table_2016.Division=other_table.Division


 


GROUP By crime_table.Division


ORDER BY

(

(sum(Homicide_Murder)/12)+

(sum(Manslaughter_excl_by_driving)/12)+

(sum(Manslaughter_Unlawful_Striking_Causing_Death)/12)+

(sum(Driving_Causing_Death)/12)+

(sum(Attempted_Murder)/12)+

(sum(Conspiracy_to_Murder)/12)+

(sum(Kidnapping_Abduction)/12)+

(sum(Extortion)/12)+

(sum(Stalking)/12)+

(sum(Life_Endangering_Acts)/12)+

(sum(Arson)/12)+

(sum(Breach_Domestic_Violence_Protection_Order)/12)+

(sum(Grievous_Assault)/12)+

(sum(Serious_Assault)/12)+

(sum(Serious_Assault_Other)/12)+

(sum(Common_Assault)/12)+

(sum(Rape_and_Attempted_Rape)/12)+

(sum(Other_Sexual_Offences)/12)+

(sum(Armed_Robbery)/12)+

(sum(Unarmed_Robbery)/12)+

(sum(Stealing_from_Dwellings)/12)+

(sum(Shop_Stealing)/12)+

(sum(Unlawful_Use_of_Motor_Vehicle)/12)+

(sum(Vehicles_steal_from_enter_with_intent)/12)+

(sum(Other_Stealing)/12)+

(sum(Possess_Property_Suspected_Stolen)/12)+

(sum(Receiving_Stolen_Property)/12)+

(sum(Possess_etc_Tainted_Property)/12)+

(sum(Other_Handling_Stolen_Goods)/12)+

(sum(Other_Property_Damage)/12)+

(sum(Unlawful_Entry_With_Intent_Dwelling)/12)+

(sum(Unlawful_Entry_Without_Violence_Dwelling)/12)+

(sum(Unlawful_Entry_With_Violence_Dwelling)/12)+

(sum(Unlawful_Entry_With_Intent_Shop)/12)+

(sum(Unlawful_Entry_With_Intent_Other)/12)+

(sum(Trespassing_and_Vagrancy)/12)+

(sum(Fraud_by_Computer)/12)+

(sum(Fraud_by_Cheque)/12)+

(sum(Fraud_by_Credit_Card)/12)+

(sum(Identity_Fraud)/12)+

(sum(Other_Fraud)/12)+

(sum(Trafficking_Drugs)/12)+

(sum(Possess_Drugs)/12)+

(sum(Sell_Supply_Drugs)/12)+

(sum(Other_Drug_Offences)/12)+

(sum(Found_in_Places_Used_for_Purpose_of_Prostitution_Offences)/12)+

(sum(Have_Interest_in_Premises_Used_for_Prostitution_Offences)/12)+

(sum(Knowingly_Participate_in_Provision_Prostitution_Offences)/12)+

(sum(Public_Soliciting)/12)+

(sum(Procuring_Prostitution)/12)+

(sum(Permit_Minor_to_be_at_a_Place_Used_for_Prostitution_Offences)/12)+

(sum(Advertising_Prostitution)/12)+

(sum(Other_Prostitution_Offences)/12)+

(sum(Unlawful_Possess_Concealable_Firearm)/12)+

(sum(Unlawful_Possess_Firearm_Other)/12)+

(sum(Bomb_Possess_and_or_use_of)/12)+

(sum(Possess_and_or_use_other_weapons_restricted_items)/12)+

(sum(Weapons_Act_Offences_Other)/12)+

(sum(Good_Order_Offences)/12)+

(sum(Disobey_Move_on_Direction)/12)+

(sum(Resist_Incite_Hinder_Obstruct_Police)/12)+

(sum(Fare_Evasion)/12)+

(sum(Public_Nuisance)/12)+

(sum(Traffic_and_Related_Offences)/12)+

(sum(Dangerous_Operation_of_a_Vehicle)/12)+

(sum(Drink_Driving)/12)+

(sum(Disqualified_Driving)/12)+

(sum(Interfere_with_Mechanism_of_Motor_Vehicle)/12)+

(sum(Liquor_excl_Drunkenness)/12)+

(sum(Gaming_Racing_Betting_Offences)/12)+

(sum(Stock_Related_Offences)/12)+

(sum(Miscellaneous_Offences)/12)+

(sum(Other_Offences)/12)

/73)


DESC


It is my job to construct a query (and/or) algorithm that will create the result that will provide meaningful results for people to interact with. Making the column names into the variable $offence means that it can be replaced (in the query) by any of the 87 offences in the QPS data. The reason I’m including only 73 in the query above is because some of the columns in the table are subtotals of others. The form you see on the home page allows you to select the offence type which is then plugged into a query like the one above that ranks all Districts or and Divisions on a particular offence (as opposed to across every offence).

The SEIFA charts (also accessible from the home page) are simply hand coded tables with divs inside each table cell. These divs have varying widths and background colour which is generated on the fly from the query that creates the table. This is because I used in-line css style tag for width and background colour with variables that get their values from the query.

<td class=’quintiles’ style=’width:”.number_format(($row[‘one’]*2)).”px; background:RGBA(117, 158, 52,0.”.number_format($row[‘one’]/10).”)’>”.$row[‘one’].”%</td>

The width and colour opacity change with the data from the table, creating different widths and background colour dependent on the values in the SEIFA data I copied out of the POLSIS Community Profiles. This provides a visual representation of the data which allows comparison of the percentage of the local population in each SEIFA quintile from least to most advantaged for each District or Division.

The POLSIS Community Profiles do not provide comparison, just the raw data. Creating a CSV of this data and then creating the MySQL query and then the PHP, HTML & CSS code to create the coloured table is what I have done with the raw data from 15 different District Community Profiles.

In order to make the raw crime rate data usable, I needed a way to plot the data that could vary the size of the bars or graph height so that the resulting chart would come out the same size whether it was displaying 100 crimes per month or just 1, depending on whether the query was charting a prolific crime like traffic offences over many years or a rare offence such as homicide. This is what the graphing API’s do- but you do tend to then lose that comparison between less prolific crimes as the vertical axis collapses or expands so that all the charts end up the same size. Sometimes you just have to decide what information is best to preserve and which is ok to lose in order to make data that varies wildly from one extreme to the other fit in the same space on a page.

To use Google Charts I have to create MySQL queries to get the data that I want and embed it in the JavaScript supplied by Google using PHP. So I’m now using MySQL, PHP, JavaScript, HTML & CSS like a sandwich to create the charts that you see.

 

The MySQL asks the data table for the data for a certain offence and the PHP embeds the MySQL query into the JavaScript:

If this all looks onerous and complicated, that’s because it is but I hope I’ve been able to communicate some of the broader takes undertaken with creating open data projects. There are many issues I could not touch on including the sociological issues that impact on crime rates which are better dealt with in a separate post.

Ultimately, I hope I have  provided a little insight into what actually goes into a typical project of mine as I get the feeling that the range of skills that I use are not particularly well understood.

Print Friendly, PDF & Email
Liked it? Take a second to support Rosie on Patreon!