There was a comment on a recent Hacker News thread about a world airport Voronoi map that said “if only there was a webpage/software where someone could click/select points on a map…and a user Voronoi diagram would be created ;-).”
I knew such a tool already existed, but I thought I might as well try to implement one myself, so I put together the pieces:
- The deldir R package can quickly create Voronoi lines for a given set of two-dimensional points.
- Google Maps, since I knew how to handle events like clicking to add a point, dragging the points, and double-clicking to remove. Plus it’s easy to draw the Voronoi lines.
- Shiny can link that quick calculation of Voronoi lines with the front-end maps library API, so that user events and the server-side data stay in sync.
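As a rough illustration of the first piece, here is a minimal sketch of computing Voronoi line segments with deldir. The sample coordinates are hypothetical, and the sketch treats longitude and latitude as plain planar x/y values, which is an assumption and not necessarily how the app handles them.

```r
library(deldir)

# Hypothetical sample points (x = longitude, y = latitude); not taken from the app
pts <- data.frame(
  x = c(-80.24, -78.64, -80.84, -79.79, -77.94),
  y = c(36.10, 35.78, 35.23, 36.07, 34.23)
)

# Compute the Delaunay triangulation and Dirichlet (Voronoi) tessellation
tess <- deldir(pts$x, pts$y)

# dirsgs has one row per Voronoi edge, with endpoints (x1, y1) and (x2, y2)
voronoi_edges <- tess$dirsgs[, c("x1", "y1", "x2", "y2")]
print(voronoi_edges)
```

Each row of `voronoi_edges` is one segment that a front-end map library could draw as a line between the two endpoints.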
A live demo version is available on shinyapps.io: https://modernresearch.shinyapps.io/create-voronoi/
The user interface is pretty simple – just click to add points, drag to change, and double-click to remove, and the Voronoi will update automatically.
There’s also an option to change the lines from straight to geodesics (following the curve of the earth).
The code is on GitHub and MIT-licensed.
I’d love to add a way to load sets of points at once (US state capitals, 100 tallest mountains, sports arenas, etc.) when I have more time to work on this.
The previous post was all about an open data / open source strategy.
There’s plenty of data available from public sites that can be turned into useful tools, and one of the sources we’ve focused on recently is police response data.
Winston-Salem Police Department publishes a simple text file daily, containing information about all the responses from the previous day: report number, address, time of day, and the type of issue (for example, vandalism, motor vehicle theft, arson, etc.).
But the interface isn’t very useful: no aggregation, no filtering, no visualization, nothing but daily text files.
WSPD does contract with crimemapping.com to display individual responses on a map, which can be filtered going back several months, and users can even receive email alerts for activity within a specified radius of any address.
That’s great! But of course we wondered if an open-source tool would be possible.
So we’ve released a minimal version of a Shiny dashboard as an example:
And we’ve also released the source code here:
Continue reading “Winston-Salem Police Response Data”
We’ve started offering clients open data and open source strategy consulting, and hope to have some case studies in the coming months.
Releasing data or software (or even both) to the public seems counterproductive to good business – how could it do anything but feed the competition, or draw attention to mistakes?
It’s worth thinking one step deeper:
- Releasing code and data can be a conspicuous demonstration of competence, a great way to draw incoming leads and even recruit to your organization.
- It also signals that your value is greater than the contents of files – you’re not worried about giving away the data, because where you really shine is customization and customer service, which is much harder to copy.
- In a world where a million companies are selling software and services for a monthly subscription, there’s a sharp discontinuity in user acquisition at $0. Start them for free, then offer upgrades in the form of additional features or support.
- Competitors may learn from your data and/or your code, but that’s a short-term effect – if the overall market grows, that’s good for you, even if your competitor gets a chunk of that market.
- If you have mistakes in your code or your data, you want to know as soon as possible, and what better way than to increase use by offering it for free – it’s almost like outsourced testing, and in our experience, fixing a publicly known bug or inaccuracy with grace and polite thanks to the finder does no harm to your reputation.
We talk about the fourth point (competitors) a lot – if we’re unable to fit a project into our schedule, we don’t hesitate to refer to competitors. If there are more groups delivering successful projects in our field, that’s great!
Continue reading “Open Data and Cool Maps”
I’ve been exploring some of the lesser-known Amazon Web Services (AWS) tools recently, and when I saw the in-console demo of Rekognition, the image labeling/classification service, I knew I needed to find an application for testing.
Then I forgot about it for a few weeks.
And THEN! I came across this archive of all the Sports Illustrated covers in the Sports Illustrated Vault. The rest happened in a whirlwind.
Continue reading “Sports Illustrated Cover Image Analysis with AWS Rekognition”
I was working on a task a few days ago that required converting a lot of timestamps into dates (not times, just dates). Thankfully, as.Date seemed to be working fine as a default approach, but then an error showed up:
Error in charToDate(x) :
character string is not in a standard unambiguous format
Fair enough, nobody expects to get through parsing dates without a few hangups.
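For context, here is a small sketch of how this error can be triggered and one common way to avoid the format guessing altogether, namely passing an explicit format. The sample timestamps are hypothetical, and this is not necessarily the cause or the fix discussed in the full post.

```r
# Hypothetical timestamps, including a blank entry
stamps <- c("2017-03-01 08:15:00", "", "2017-03-02 17:40:12")

# With an explicit format, as.Date does not have to guess;
# trailing characters (the time part) are ignored and the blank becomes NA
dates <- as.Date(stamps, format = "%Y-%m-%d")
print(dates)

# Without a format, a string that matches none of the default formats
# raises the "standard unambiguous format" error; capture its message
err <- tryCatch(
  as.Date("March 1, 2017"),
  error = function(e) conditionMessage(e)
)
print(err)
```

Passing `format` explicitly trades a little typing for predictable behavior, since the result no longer depends on which formats as.Date tries first.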
So I focused in on the vector that was causing the trouble, and didn’t see anything obviously wrong – those timestamps looked just like the other columns of timestamps that had worked thus far.
There were some blanks, but there were blanks in the other columns too, and besides,
as.Date("") is just
Continue reading “Order Dependence in as.Date”
I gave a talk at Monday night’s Winston-Salem R Users Group that covered a lot of the base R package, and also showed a brief demo of how to use packrat when packages are necessary and code has to be shared across multiple team members and/or environments.
The idea came from a comment at a previous meeting about the dangers of trying to maintain common versions of code within and across teams, not just to avoid surprising errors, but also to ensure reproducibility.
So my recommended approach was:
- Learn as much as you can about the details of the base package. It’s a huge package, and a lot of common needs can be handled simply and effectively.
- When you need a package (and there are certainly useful and necessary packages), use a system like packrat to keep dependencies systematically managed.
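The second recommendation might look roughly like the sketch below. The helper names and the `project_dir` argument are hypothetical, and this is one possible workflow, not necessarily the one shown in the talk.

```r
# Hypothetical helper: set up packrat for a project directory.
# enter = FALSE avoids switching the current R session into packrat mode.
bootstrap_packrat <- function(project_dir) {
  packrat::init(project = project_dir, enter = FALSE)
  invisible(project_dir)
}

# Hypothetical helper: record the currently installed package versions
# (call this after installing or upgrading packages in the project)
record_dependencies <- function(project_dir) {
  packrat::snapshot(project = project_dir)
  invisible(project_dir)
}

# Hypothetical helper: a teammate rebuilds the same library from the lockfile
rebuild_library <- function(project_dir) {
  packrat::restore(project = project_dir)
  invisible(project_dir)
}
```

The idea is that one person calls `bootstrap_packrat()` and `record_dependencies()`, commits the resulting packrat files alongside the code, and everyone else calls `rebuild_library()` to get matching package versions.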
Most of the content wouldn’t be a surprise to daily R users, but I did throw in some things that either 1) surprised me when I first learned them, or 2) increased my productivity so much that I think everyone should know them.
Continue reading “Base R When Possible, Packrat When Not”
I grew up in Raleigh, North Carolina, the state capital and seat of Wake County, which contains several other municipalities.
When I was in middle school in the year 2000, I would have laughed if you had told me any of the following:
- The county’s population would grow by 43.5%, from 627k to 900k, in the next ten years.
- The county’s population would actually reach 1 million in the next fifteen years.
- Cary would become the seventh-most-populous municipality in North Carolina.
- Holly Springs and Fuquay-Varina would be considered reasonable suburbs to Raleigh.
- By 2017 Wake County would have 172 public schools (27 high schools) for 155,000 students.
Well, here we are. Wake County and Raleigh have made a million of those “best places to live/work/retire” lists since the year 2000, and the county has seen growth that puts it in the top ten or fifteen fastest-growing counties in the country, depending on how you measure.
So what does that rate of development look like on a map of land parcels? Good question, keep reading for more…
Continue reading “Growth and Development in Wake County, NC”