
Archive for category: Projects

How to Turn a Status History Table into a Daily Status Table Using PostgreSQL, with the Help of Some Window Functions

Let’s say you have a table of the format:

object_id |        timestamp      | old_status | new_status
32049     | 2018-02-01 14:32:22   | TRUE       | FALSE
32049     | 2018-03-02 12:30:20   | FALSE      | TRUE
32049     | 2018-03-04 18:30:50   | TRUE       | FALSE
52053     | 2019-02-09 00:30:50   | FALSE      | TRUE
52053     | 2019-02-09 00:30:53   | TRUE       | FALSE
52053     | 2019-02-10 16:30:56   | FALSE      | TRUE

Where each entry, for simplicity’s sake, is one change in boolean status for the object: TRUE → FALSE or FALSE → TRUE. This is a fairly common history table that you’d find in the database of many applications, for example a profile settings history with a search engine visibility toggle.

Let’s say that you want to build a lookup table in your warehouse so that you can easily discover whether, on any given day, the toggle in question is set to a specific state, in this case TRUE. This happens to be the situation we found ourselves in at WayUp, where we wanted an easy way to know whether a job was active during a given period.

This is the output we’re going for, where every row is a date on which the object is active. If the object is not active on that date, there is no entry for it.

For the example above, our work should yield the following. Since object 52053 was set to TRUE on 2/10/2019, it is still TRUE on 2/12/2019 (today). We will also assume that both objects were created on 1/31/2018.

object_id | date
32049     | 2018-01-31 (creation as active)
32049     | 2018-02-01
32049     | 2018-03-02
32049     | 2018-03-03
32049     | 2018-03-04
52053     | 2019-02-09
52053     | 2019-02-10
52053     | 2019-02-11
52053     | 2019-02-12 (today)

How do you do it?

It’s a deceptively hard problem, because the result we’re going for is a lookup table where we need dates in between, and inclusive of, the toggles that are set to TRUE. We also need dates after the last toggle was set to TRUE up through the current date. If the history table doesn’t include the initial state the object was created in (which is very often the case), we will need to handle that case as well.

Step One: Build a Date-Based History Table

We’ll first need the history of each object by date, because that’s the final format we’ll be storing the table in. (Even if your final use case for this lookup is rollups by month or quarter, having it broken out by date makes those rollups easy using SELECT DISTINCT across any time period.)
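As an example of such a rollup, here’s a sketch that finds every object active at any point in March 2018 (daily_active is an assumed name for the finished lookup table):

```sql
-- Distinct objects with at least one active day in the range
SELECT DISTINCT object_id
FROM daily_active
WHERE date BETWEEN DATE '2018-03-01' AND DATE '2018-03-31';
```

Because the lookup is one row per active day, any period rollup is just a range filter plus DISTINCT.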

Let’s go back to our original sample table.

object_id |        timestamp      | old_status | new_status
32049     | 2018-02-01 14:32:22   | TRUE       | FALSE
32049     | 2018-03-02 12:30:20   | FALSE      | TRUE
32049     | 2018-03-04 18:30:50   | TRUE       | FALSE
52053     | 2019-02-09 00:30:50   | FALSE      | TRUE
52053     | 2019-02-09 00:30:53   | TRUE       | FALSE
52053     | 2019-02-10 16:30:56   | FALSE      | TRUE

The first thing we do is append the current status of each object at the current time. The reason: we will eventually need to add active-date rows for objects that were previously set to active, and are thus still active today. This can be done with a simple union, with a NULL for new_status.
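Here’s one way to sketch that union in PostgreSQL, using DISTINCT ON to grab each object’s most recent row and carry its latest new_status forward into old_status (table and column names as in the sample above):

```sql
SELECT object_id, timestamp, old_status, new_status
FROM status_history

UNION ALL

-- One synthetic "now" row per object: its latest known status
-- becomes old_status, and new_status is left NULL
(
  SELECT DISTINCT ON (object_id)
      object_id
    , NOW() AS timestamp
    , new_status AS old_status
    , NULL::boolean AS new_status
  FROM status_history
  ORDER BY object_id, timestamp DESC
);
```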

object_id |        timestamp            | old_status | new_status
32049     | 2018-02-01 14:32:22         | TRUE       | FALSE
32049     | 2018-03-02 12:30:20         | FALSE      | TRUE
32049     | 2018-03-04 18:30:50         | TRUE       | FALSE
32049     | 2019-02-12 23:59:43 (now)   | FALSE      | NULL
52053     | 2019-02-09 00:30:50         | FALSE      | TRUE
52053     | 2019-02-09 00:30:53         | TRUE       | FALSE
52053     | 2019-02-10 16:30:56         | FALSE      | TRUE
52053     | 2019-02-12 23:59:43 (now)   | TRUE       | NULL

Now it’s easy to build a day-by-day history table using some window functions. Remember the DISTINCT, otherwise you will get duplicate rows.

SELECT DISTINCT
     object_id
   , DATE(timestamp) AS date
   , FIRST_VALUE(old_status) OVER w AS old_status
   , LAST_VALUE(new_status) OVER w AS new_status
FROM status_history
WINDOW w AS (
     PARTITION BY object_id, DATE(timestamp)
     ORDER BY timestamp
     ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
   )

The output will give you the first old status and last new status of each date. Note object #52053 on 2/9: although it was active on this date, it started and ended as inactive. At some point, we’ll need to remember to account for this use case.

object_id |    date    | old_status | new_status
32049     | 2018-02-01 | TRUE       | FALSE
32049     | 2018-03-02 | FALSE      | TRUE
32049     | 2018-03-04 | TRUE       | FALSE
32049     | 2019-02-12 | FALSE      | NULL
52053     | 2019-02-09 | FALSE      | FALSE
52053     | 2019-02-10 | FALSE      | TRUE
52053     | 2019-02-12 | TRUE       | NULL

Step Two: Prepare the Join

You can begin to see now what we’ll need to do. We’ll need to insert new rows for each object between every row where new_status is TRUE and the next row where old_status is TRUE. These rows will need to be bounded by date. We already have an upper bound: the “date” column. But we need to add a lower bound as well, which means we need to know, for each row, the previous date on which the toggle was changed. This can be accomplished with the LAG() window function.

You may realize at this point that we have one more problem. For the first entry for each object, there will be no previous date: it will be NULL. This gives us an opportunity to address another use case: objects that are created with the toggle already TRUE, without any corresponding entry in the history table. So we’ll replace that first prev_toggle cell with each object’s creation date. We’re also going to offset this date by one; you’ll see why in the next step.

SELECT
     object_id
   , date
   , old_status
   , new_status
   , COALESCE(
       LAG(date,1) OVER(PARTITION BY object_id ORDER BY date)
       , DATE(object.created_at) - INTERVAL '1 DAY'
     ) AS prev_toggle
FROM daily_status
JOIN object USING (object_id)

The output is now ready to be transformed with a simple join.

object_id |    date    | old_status | new_status | prev_toggle
32049     | 2018-02-01 | TRUE       | FALSE      | 2018-01-30
32049     | 2018-03-02 | FALSE      | TRUE       | 2018-02-01
32049     | 2018-03-04 | TRUE       | FALSE      | 2018-03-02
32049     | 2019-02-12 | FALSE      | NULL       | 2018-03-04
52053     | 2019-02-09 | FALSE      | FALSE      | 2018-01-30
52053     | 2019-02-10 | FALSE      | TRUE       | 2019-02-09
52053     | 2019-02-12 | TRUE       | NULL       | 2019-02-10

Step Three: Join and Union

We’re now going to do our join in two steps. First, we’re going to join against a table of all objects cross-joined with all dates (which we’ll build below) to fill in the missing dates before each row where old_status is TRUE. We know for a fact that every one of these dates is active.

object_id |    date    | old_status | new_status | prev_toggle
32049     | 2018-01-31 < ======== HERE
32049     | 2018-02-01 | TRUE       | FALSE      | 2018-01-30
32049     | 2018-03-02 | FALSE      | TRUE       | 2018-02-01
32049     | 2018-03-03 < ======== HERE
32049     | 2018-03-04 | TRUE       | FALSE      | 2018-03-02
32049     | 2019-02-12 | FALSE      | NULL       | 2018-03-04
52053     | 2019-02-09 | FALSE      | FALSE      | 2018-01-30
52053     | 2019-02-10 | FALSE      | TRUE       | 2019-02-09
52053     | 2019-02-11 < ======== HERE
52053     | 2019-02-12 | TRUE       | NULL       | 2019-02-10

Then, we’re going to append all the dates in which a toggle was set to TRUE at some point. For this we’ll go back to our original status change table.

For this join, we’ll need a list of all objects cross-joined against all dates. This is easy using an all_dates table, with one row per calendar date, which you can generate from a CSV for this purpose.
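If you don’t already have such a table, PostgreSQL’s generate_series() can build one without any CSV; a sketch (the start date is an assumption, picked to predate all object creation dates):

```sql
-- One row per calendar day from the start date through today
CREATE TABLE all_dates AS
SELECT day::date AS date
FROM generate_series(DATE '2018-01-01', CURRENT_DATE, INTERVAL '1 day') AS t(day);
```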

SELECT
     object.object_id
   , AS date
FROM object
JOIN all_dates
   ON >= DATE(object.created_at)
   AND <= CURRENT_DATE

This will yield a potentially massive table, even when truncating the dates as shown above. It should perform fine up to a couple million objects, though, which is OK for most practical purposes.

object_id | date
32049     | 2018-01-31
32049     | 2018-02-01
32049     | 2018-02-02
...       | ...
32049     | 2019-02-12
52053     | 2018-01-31
52053     | 2018-02-01
52053     | 2018-02-02
...       | ...
52053     | 2019-02-12

Now, the join and the union can be done inside the same query. Remember, we’re using the JOIN to create additional rows before each row where old_status is TRUE, using the prev_toggle field as a lower bound and the date field as an upper bound. The reason we offset the creation date by one is so we could fill in rows between the creation date and a TRUE toggle using the same bounded method. The UNION is used to append the dates on which we know the toggle to be active, and to remove duplicates.

SELECT
     all_time_objects.object_id
   , AS active_date
FROM daily_status_with_prev
JOIN all_time_objects ON
    daily_status_with_prev.old_status IS TRUE
    AND <=
    AND > daily_status_with_prev.prev_toggle
    AND all_time_objects.object_id = daily_status_with_prev.object_id

UNION

SELECT
     status_history.object_id
   , DATE(status_history.timestamp) AS active_date
FROM status_history
WHERE old_status IS TRUE
OR new_status IS TRUE

The result is exactly what we were aiming for! One row per date each object was active.

object_id | date
32049     | 2018-01-31
32049     | 2018-02-01
32049     | 2018-03-02
32049     | 2018-03-03
32049     | 2018-03-04
52053     | 2019-02-09
52053     | 2019-02-10
52053     | 2019-02-11
52053     | 2019-02-12

Used correctly, this new table is extremely powerful for analytics and rollups. It’s hard to make, but well worth it!

February 12, 2019
A Crossword Puzzle With a Theme

I took a stab at making my own crossword puzzle, complete with thematic clues and even thematic grid shapes to match. Have fun, and if you want the solution, drop a note in the comments.

[Image: C6 Grid]


Across

1. Battleship, for example
5. Like hydrochloric and nitric
9. Chit chat
12. Madison and Park, e.g.
13. Paradise dweller
14. Lounge
15. Has notoriety, for a herpetologist?
20. “Things Fall Apart” author Chinua
21. Castles may have them
22. Name in waffles
25. Narrow cut
26. Game play, to a subway hooligan?
32. Support group introduction
33. “Cool!”
34. Many Canada-residing First Nations
36. Nuisance pest
37. Dutch town
38. Some cereals
39. One may be viral
40. Insufficient
42. Assault vehicle of Star Wars fame
43. The wealthy elite, for a statistician?
46. Botswana’s House of Chiefs: _______ ya Dikgosi
47. Rocket Craig
48. Activewear brand
49. City in Argentina
54. Police command, for an interior decorator?
57. Lot of mine
58. Mon follower
59. Ones in a casino
60. Put on a swim cap, perhaps
61. Home of “Woodstock for Capitalists”
62. “Aw, ______!”


Down

1. Lady born Stefani
2. May be part of a café order
3. Controlled crystal
4. Elementary Latin verb
6. Roadside threat: abbr.
7. TiVo, e.g.
8. Looks like it does
9. Film set
10. Troubles
11. Meet for a spell
16. Settle ________
17. Started
18. College dept.: ____ Sci
19. Jacket style
23. Country that writes a 26-down
24. Rashly, perhaps
26. Letter from the people of 23-down
27. Big ring name
28. Periodic lamentation of “Arrested Development”‘s Buster
29. Trademark lawyer admonition
30. Odor
31. Where Cortés sailed
35. List ender: abbr.
40. Expected to, à la Ricky Ricardo
41. AT&T, e.g.
44. Moose, to Bach
45. Film style
48. With a draft
50. State of Zion
51. “No _______.”
52. Actor Baldwin
53. Student’s turf
54. In
55. Giant word
56. Open land

July 31, 2016
ReservationHop Does a Hard Pivot: We are Now OK Shift

I have been relatively silent recently regarding ReservationHop. We have been doing a lot of exploration (or in startup lingo, “customer discovery”) as we try to find a good niche for innovation in restaurants that benefits consumers and optimizes some part of the dining experience. Consequently, in the last couple of months, we have built and tested various prototypes with several end users. These include:

– A reservation marketplace
– A reservation ticketing system
– A new online reservation system
– Restaurant table and server management software
– Easy restaurant scheduling software

It was on the last development that we started to see an opportunity in the form of service industry workers who have a lot of trouble changing out their shifts. This primary use case, allowing service workers to easily get their shifts covered in a way that fits into their existing workflow, is a niche in which we have seen the most traction.

Enter OK Shift, which we soft-launched a couple weeks ago and have been beta testing in several locations around the country. OK Shift is an SMS-based, cross-industry scheduling tool that lets hourly service industry employees get shift coverage, secure manager approval, manage their schedule and communicate with their coworkers.


59% of American workers are paid by the hour. There are 4.5M food and beverage service industry workers in the US alone, and millions more service jobs in other fields: transportation, hospitality, and medical care. It’s a gigantic market.

Moreover, the technology that almost all service workers use to manage their last-minute shift coverage is run-of-the-mill text messaging. Not everyone has a smartphone, but everyone has an SMS-enabled cell, and anyone who has worked in the service industry will tell you that the only way to get shifts covered is to text or call coworkers on an individual, case-by-case basis.

The best part about OK Shift is that it’s free, and signing up is easy. If you work in the service industry, simply text HELLO to 513-OK-SHIFT (513-657-4438).

Why did we decide to ultimately pivot away from ReservationHop? It became clear that the marketplace amongst restaurants wasn’t as big as we had hoped, for a couple reasons. First, after personally speaking to many of the best restaurants in San Francisco, as well as elsewhere, we could see that the value add we were bringing to the table wasn’t compelling enough to inspire a change in behavior. There was a real hesitance on the part of restaurants to mark up their prices in the form of paid reservations, for fear that they would lose control over dictating the value of their product. And there were, of course, branding concerns for many restaurants beyond simply maximizing revenue.

And finally, there are only a handful of restaurants where consumers are willing to pay for a reservation. Of these, we could only provide a niche service for certain prime times or weekends. In order to build a long-term sustainable business, the market would have to expand significantly beyond exclusive restaurants.

As I wrote in the past, we set out originally to work with restaurants and that’s what we intend to do as we evolve into this latest iteration. The past couple of months have been extremely educational and I am excited to move forward with OK Shift. We have spent the time to build and ship a researched, well-considered and tested product that we believe will solve a giant pain amongst a huge portion of the population.

ReservationHop was an experiment, and an important stepping stone in building a sustainable business. Now, as we move forward as OK Shift, the sky’s the limit.

November 6, 2014
OK Shift: Manage your workplace shifts via text message

The idea was simple: why isn’t there a good system for hourly workers (60% of the US workforce) to swap their shifts out via text message?

Enter OK Shift, a text message-based system that allows hourly workers to swap their shifts via text message and get those swaps approved by managers. The system allows for shift management, directory management, the ability for managers to call off a shift or call in workers, and, more importantly, the ability to communicate with the entire team at once via group chat.


It all works by texting the number 513-OK-SHIFT (513-657-4438) and the system prompts you through the rest: posting shifts, adding coworkers, adding managers to approve, and broadcasting.

On the technical side, it uses the Twilio API with a PHP/MySQL backend (like all my projects). It’s free for now, but eventually I want to add a paid scheduling layer on top of the system for management.

Try it out!

October 28, 2014
A Personal API

Why is the API model traditionally built around a central entity node connected to many consumer nodes, rather than the other way around? Why is it possible for me to connect with various APIs from cloud services like Twilio and Dropbox but I can’t create an API for myself that allows companies to connect with me?

Instead of going to the cloud, why can’t the cloud come to me?

Put it this way. I watch Star Trek on both Hulu and Netflix. My episode history is out of sync between the two platforms. Why is that? Because these are separate services with their own backends. There’s no way for them to talk to each other, because there’s only one point of intersection: me.

But what if I could store my own episode history in a personal API, which then Hulu & Netflix would talk to? Both would have permission to update my episode history, and both would have read access. I would give Netflix billing access to the banking endpoint of my API, and so they would enable additional access on their platform. They could push content to my API endpoint and it could be synced between all my devices, including my phone which would also have read access to my API.

Everything would remain “in the cloud,” but the cloud would be my own personal cloud. A mini-cloud, if you will.

There are several types of information that could be stored in a personal API:

  • My personal contact information
  • My correspondence
  • My media
  • My preferences: brands, things I read, movies I like
  • My shopping history
  • My payment information
  • My medical history and prescriptions

Pretty much all the things I do online I could do with a personal API, but there would be a few advantages introduced by creating a new protocol:

  • I would be able to control my own data. Companies/services would need to request access to my data on an individualized basis. I would only give data that would be needed for each service.
  • Privacy becomes completely up to me. I would be able to control how access to my data is granted and revoked. My data is in one place, accessible only through authentication to my API. I can revoke access tokens upon request.
  • “Add-ons” to my API service could be enabled like encryption or new REST endpoints, that would allow me to evolve what my API is able to achieve.
  • I could create direct P2P connections with fellow users of the Personal API protocol without having to connect through a third party server.
  • The protocol could integrate with multiple devices, but the nature of these devices would need to change. For example, if I wanted to send a message to my brother, right now I send a text message to his phone which gets routed through AT&T’s cell phone towers (for example). But with a personal API, I would send a message to his API endpoint, and his devices would all pull from it. So it would be like iMessage, but an iMessage that would integrate with *everything* I interact with.
  • On that note, the “internet of things” becomes much more possible. Instead of having to program all my devices, my devices would be adapted to me. When I buy a new product, it requests access to my API, and then can interact with other services that also have access to my API.
  • It weakens the government data dragnet. Right now, one clandestine program by the NSA can tap into Facebook once, and have access to everyone’s data. With a distributed personal API, the government would need to focus its attention on just nefarious or dangerous individuals. The legal status of a personal API would be more akin to a lockbox in my house than a self-storage center that is analogous to the current cloud.

If a personal API protocol were to be created, that would only be the first and easiest step. Cloud services would need to play ball, adapting their account creation and sign in systems, not to mention data access and storage, to work off of my personal cloud rather than their common cloud.

Speaking of Facebook, everything I have listed above is something Facebook could create tomorrow (or Google or Apple), and they may even be considering doing so. They certainly have access to the data necessary to create the API. But they also have the problem of centrally storing that data, creating a single point of failure/weakness/whatever. A centrally stored backend does not meet the criteria of a truly personal API listed above. It needs to be distributed.

Again, it’s a speculative concept, and I can’t be the first person to think of it, but I would be interested in A) whether anyone has fleshed out an idea like this a bit more or attempted to build it, and B) either way, whether anyone would be interested in working on something like this with me.

Would love your input in the comments.

October 13, 2014
EphChat: An Ephemeral Chat Program Written in PHP, JS + Firebase

I took a break from ReservationHop today to build a new chat program.


EphChat, which stands for, you guessed it, “Ephemeral Chat,” is a chat program with a twist. No data is stored server side, and messages are only visible to participants for 60 seconds before they fade away into nothingness.

Anyone can create a new chatroom with a random URL hash, or name their own. Users are anonymous, but you can edit your name if you wish. Messages are encrypted all the way to the server, where they are relayed to the chatroom participants and then immediately deleted. User sessions are stored until a user disconnects, then they are deleted, too.

Why did I build this?  Well, partly as an experiment with Firebase, but also because I like the idea of people being able to communicate in an encrypted, anonymous way without governments snooping on them.  Reporters can use this to do sensitive interviews; protesters under despotic regimes can use it to organize resistance.

The code is up on GitHub, which I felt was necessary to provide transparency into the app’s inner workings and security.  I don’t usually make my repositories public, for fear of being ripped apart by the hackersphere, but if anyone is going to use this app they’re going to want to know how it works.

I had a lot of trouble with Firebase’s security rules, but I think I figured it out.  Here’s the current security schema:

{
  "rules": {
    "rooms": {
      "$RoomId": {
        "connections": {
          ".read": true,
          ".write": "auth.username == newData.child('FBUserId').val()"
        },
        "messages": {
          "$any": {
            ".write": "!newData.exists() || root.child('rooms').child(newData.child('RoomId').val()).child('connections').hasChild(newData.child('FBUserId').val())",
            ".validate": "newData.hasChildren(['RoomId','FBUserId','userName','userId','message']) && newData.child('message').val().length >= 1",
            ".read": "root.child('rooms').child(data.child('RoomId').val()).child('connections').hasChild(data.child('FBUserId').val())"
          }
        },
        "poll": {
          ".write": "auth.username == newData.child('FBUserId').val()",
          ".read": true
        }
      }
    }
  }
}

I welcome any and all feedback on the app, especially security.

Go ahead and start a new chatroom!

July 17, 2014
ReservationHop Does a Soft Pivot

It has been a crazy holiday weekend.

In three days we went from relative obscurity to being the punching bag of the entire tech industry. I suppose some might envy me for all the media attention I’ve received for a side project I built in my underwear one night after waiting in line for a burrito, but that sort of attention does not a legitimate business make. Getting covered in CNN has its perks, to be sure, but a business needs customers, and most of all, trust.

Let’s start with customers. Opening a firehose of traffic on ReservationHop, with the sales that followed, showed that there is indeed a validated secondary market for restaurant reservations. Paid restaurant reservations are not only desirable, but the market is heading that way as people awaken to the inefficiencies in the current system. Some restaurants, annoyed by empty tables reserved for no-shows and short-sats, are moving towards ticketing and deposits anyway. Multiple chefs and owners have pointed to OpenTable as more of a problem than a solution. A paid reservation system makes sense as a way to dissuade no-shows, distribute covers throughout the week, and even increase fairness for customers. As Tyler Cowen put it in the New York Times, “Money is ultimately a more egalitarian force than privilege, as everyone’s greenbacks are worth the same.”

The biggest criticism we have received has not been about the principle of selling reservations, but rather the methods we initially employed to hack this project into existence. We appreciate the criticism and honest feedback, which is why today ReservationHop is doing a “soft pivot” to address the same customer demand, and in addition work with the restaurants directly to cut them in on the deal. We believe that restaurants can benefit from selling reservations for a couple tables per weekend. This will not only reduce no-shows and mediate demand for their peak reservations slots in favor of off-peak times, but they will also get paid for filling these tables, instead of the other way around.

It was never our intention to harm the restaurants. In fact, as we promised from the beginning, we called 4-6 hours in advance to cancel the 15 or so reservations that didn’t get claimed this weekend, so restaurants would not have to deal with no-shows.

In addition, I spent a lot of time in the last couple days speaking and meeting with restaurant owners personally, offering my apologies for the troubles we may have caused them and discussing how we may work together in the future on the massive opportunity that has presented itself. It has not been lost on many restaurants that with the sort of media coverage ReservationHop is receiving and the hundreds of local customers begging for instant access to their tables, they are not only getting free advertising as the hottest ticket in town, but are given the opportunity to make money filling their best tables at near-zero risk of no-shows. This is of course an opportunity that we need to explore with them over the next couple of weeks.

This also means that ReservationHop will be evolving, as all early-stage startups do, as we experiment to find product-market fit. We may find that our early assumptions about customers or restaurants are faulty, or that there are better, more scalable services we can offer to the foodies of San Francisco. Or we may find that this entire venture doesn’t have a large enough addressable market. One of the interesting things about the last couple of days is how our initial experiment in customer demand was taken to be “what we do,” with little acknowledgement or understanding (at least outside of the lean-startup-model-aware tech community) that rapid iterations on business models are the norm. As far as I can tell, it is rare for early-stage startups to have this much press attention this early in the game. One of the challenges for us will be to navigate the extreme press scrutiny while simultaneously experimenting to find a model that works.

As we evolve, we will continue to let customers have exclusive access to the best tables in the city, while making a new promise to restaurants: we hear your concerns, and we want to work with you. As always, if you are in the restaurant business, please drop us a line.

July 8, 2014
How I Became the Most Hated Person in San Francisco, for a Day

This morning I put the finishing touches on, and launched, ReservationHop, a site where I’m selling reservations I booked up at hot SF restaurants this Fourth of July weekend and beyond.


I built it over the weekend after waiting at Off the Grid for 30 minutes for a burrito from Señor Sisig, and realized that there’s got to be a market for the time people spend waiting for tables at our finest city dining establishments.  Turns out I’m not the first person to think it, as there are two startups doing this very thing in New York City (here and here).

It’s a simple site with a simpler backend. I book reservations under assumed names, list them on ReservationHop, and price them according to the cost of the restaurant and how far in advance they need to be booked up. I don’t use OpenTable; I call the restaurants directly. And I have a policy of calling and canceling reservations that don’t get snapped up, because I don’t want to hurt the restaurants (the assumption being that on-demand restaurants with high walk-in traffic won’t have trouble filling those tables).

I anticipated some mild interest when I launched this morning, emailing the 20 or so potential customers I had interviewed at Off the Grid, plus some friends. I expected to have to make somewhat of an effort to get people to discover what I’m doing. I never expected a maelstrom of internet hate.

Not all of the responses have been negative, but an overwhelming number of them have been.

I totally understand the frustration people have with SF’s particular brand of “innovation.” It seems that everywhere you look, cherished public resources are being claimed by startups, whether it’s Google laying claim to bus stops or parking apps laying claim to, well, parking spaces. I’d half expect someone to come along one day, put picnic blankets down in Dolores Park, and sell them at $25 apiece.

I also understand that this represents, as one Tweeter put it, “a caricature of SF tech bro shithead.” And as someone who spends a lot of time complaining to my friends about how much of an insular bubble San Francisco has become, what with apps built by the 0.1% for the 0.1%, I completely agree. In fact, I would have much preferred the media raise this much of a fuss about Drillbit or The Creative Action Network or any of my other startups over the years.

But there’s something peculiar about SF, in that our media seems to love hating on stuff like this, so I guess I’m not surprised that I got Valleywagged almost immediately, followed by a post from The Next Web. I responded to an interview request from TechCrunch so it’s written up there too.

Meanwhile, traffic has gone through the roof. Here’s my actual Google Analytics graph from today.

[Screenshot: Google Analytics traffic graph]

I guess you can say that any press is good press.

But let’s talk about the questions/criticisms everyone has. What was I thinking! How dare I sell something that’s free! Is this even legal? Is it ethical? Restaurants are going to hate this!

To be honest, I haven’t spent a lot of time thinking through these questions. I built this site as an experiment in consumer demand for a particular product, and the jury’s still out on whether it will work. But I can tell you what I have thought through.

The initial criticism has been about the fact that restaurant reservations are free, and I shouldn’t be selling them. First off, reservations aren’t free. Restaurant tables are limited, in high demand and people wait a good long time as walk-ins to get them. Reservations take time and planning to make and the restaurant assumes an opportunity cost from booking them. My friend joked that it took me less time to build this site than most people spend hunting for OpenTable reservations in a given year.

What about ethics? We are talking about an asset that most people don’t think about having a value. That doesn’t necessarily mean that it doesn’t have a value, or that people wouldn’t be willing to pay for it. For instance, no one would have thought that taking a cab during rush hour should cost more than a normal ride, until Uber launched surge pricing and we realized that people are willing to pay for it. Clearly, the service of booking a reservation in advance has value to patrons. This is evidenced by the startups doing this right now in New York City.

If someone does pay for it willingly, is it really unethical? The consumer has made a choice, the reservation stands, the restaurant gets a table filled as planned, and I have made money for providing the service. That seems perfectly ethical to me. I am aware that the ethical conundrum is around the “what if” question: If I book a table and no one buys it, the restaurant loses business, doesn’t it? I don’t know if that’s true yet, and I’m also working at a volume so low that it probably won’t matter. I cancel reservations 4 hours beforehand if they don’t get bought, and certainly a restaurant that’s booked weeks in advance won’t have trouble filling a table from its high walk-in traffic, or from someone who gets lucky and snaps up the reservation for free on OpenTable.

But more importantly, I think that a paid reservation gives customers skin in the game, which means that restaurants might even see fewer no-shows if paid reservations become a thing. When Alinea introduced ticketing (pre-paid reservations), they dropped their rate of no-shows by 75%. That’s a pretty good deal in an industry with razor-thin margins. I’m just speculating on whether this might provide value for restaurants; I can’t speak for them and need to parse this out over the next couple of days.

So, back to becoming the most hated person in SF. I learned a lot today about how media, culture and technology in this city interact, and I have to say that overall, I think that the people who have sent me violent threats via email and Twitter, while excessive, may have a point. So in the interest of ethics and fairness, I want to talk to restaurants about working with them directly on a better reservation system. I’ve heard that OpenTable is loathed by many restaurants who don’t want to pay to fill tables. There may be a ticketing solution to high-demand restaurants. If you’re a restaurant, please drop me a line.

And if you’re a regular Jane or Joe, and you missed an opportunity to get a reservation at a hot SF restaurant for your first wedding anniversary this weekend, check to see if there are any reservations available for you at

UPDATE: We have made a “soft pivot” to address feedback from the restaurant and tech industries. Read more here.

July 3, 2014 · 132 comments · Read More
Announcing Drillbit: Who’s on your Mailing List?

The genesis of this idea was a couple weeks ago when my cofounder said: “Would it be possible to see what percent of our email list was female or male based on their names alone?” Thus Drillbit was born.

In the last couple of weeks I have been poring over data sets and trying different formulas to find the best way to break down a list of seemingly random name data into digestible information. The resulting app allows anyone to upload their mailing lists and see who’s in them and, in perhaps the coolest feature, segment their lists as well.

The Project

Drillbit uses publicly available datasets to create a likely demographic profile of mailing lists based on first and last names. Upload your mailing, customer or user list with first and last names, and based on that information we will create an age, gender and demographic profile of your list.

The Datasets

Listed here are the foundational datasets of this project, including those for analysis tools that haven’t yet been released.


The essential principle behind Drillbit is that an individual’s first and last names betray a lot of information about his or her background, origins, language, gender, and even income and ideology. Names can be both varied in their originality and popularity as well as conservative in their staying power. A surname can be passed down for generations, whereas first names have a tendency to be cyclical.

As an example, take the name “Max.” It is a common name, or common enough it would seem, that one could find out very little information from the name alone. But as it turns out, “Max” may only seem common to us given its surge in popularity in the late ’80s and early ’90s, the birth years of the rapidly matriculating Generation Y. In 1974, only 400 Maxes were born nationwide!

Of course, baby name popularity is not a new idea. But the variance is astounding, and not just in terms of popularity. In 2012, the two most popular baby names for boys and girls were “Jacob” and “Sophia.” Unlike “Max,” both of these popular names seem to have spent the last 100 years on the up-and-coming list.

With this amount of unique variance in names–some names jump and others sink, some names are like fads and others never really take off–it isn’t surprising that, in the aggregate, it is possible to take a list of people and determine how old they are likely to be.

So that’s what I did. Using the above datasets on name popularity, I was able to come up with some pretty convincing initial results, benchmarking against existing lists I knew well.

The first step was to condense the data I had into a table comparing year of birth and gender with the % likelihood that any random “Michael” born in the last century was actually born in that year. For example, if 10,000 Michaels were born between 1900 and 2000, and 1,000 of them were born in 1950, then 1950-M-MICHAEL has a 10% likelihood; i.e., given a random Michael, there is a 10% chance he was born in 1950.
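The likelihood table above can be sketched in a few lines. This is a minimal illustration, not the project’s actual code, and the birth counts are invented toy numbers rather than real dataset values:

```python
def year_likelihoods(births_by_year):
    """Convert raw birth counts for one (name, gender) pair into
    P(born in that year | name, gender)."""
    total = sum(births_by_year.values())
    return {year: count / total for year, count in births_by_year.items()}

# Toy data: 10,000 male Michaels total, 1,000 of them born in 1950.
michael_m = {1950: 1000, 1970: 4000, 1990: 5000}
print(year_likelihoods(michael_m)[1950])  # -> 0.1, i.e. a 10% likelihood
```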

With the charts above, you can see how this would play out. If you were to use Drillbit to upload a list of 5000 Jacobs, you would see the age match pattern roughly cohere to the above chart. The more Jacobs there are, the higher confidence we would have in the result.
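Aggregating over a whole list works the same way: average the per-name distributions, skipping names that aren’t in the database. The sketch below is illustrative only; the names, distributions, and the helper itself are made up for the example:

```python
from collections import defaultdict

def list_age_profile(names, dist_by_name):
    """Average the birth-year distributions of every recognized name;
    unrecognized names are simply skipped."""
    totals, matched = defaultdict(float), 0
    for name in names:
        dist = dist_by_name.get(name.lower())
        if dist is None:
            continue  # name not in the database
        matched += 1
        for year, p in dist.items():
            totals[year] += p
    return {y: t / matched for y, t in totals.items()} if matched else {}

# Toy distributions: a young-skewing name and an old-skewing name.
dists = {"jacob": {1990: 0.2, 2000: 0.8}, "mildred": {1930: 0.9, 1990: 0.1}}
profile = list_age_profile(["Jacob", "Mildred", "Xx"], dists)  # "Xx" is skipped
```

The more names in the list that match, the more the noise in any one name’s distribution averages out.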

There are some obvious complications with this model. The first is that although 10% of all Michaels might have been born in 1950, they would be over 60 now, and their chance of being around is much smaller than that of a Michael born in 2000. That’s where actuarial data comes in. Using the above actuarial table divided by gender, I was able to normalize the distribution based on likelihood of survival in each age cohort. No matter how many Maxes were born in the 1910s, there aren’t many left today.
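The actuarial step amounts to weighting each birth-year likelihood by that cohort’s chance of being alive today, then renormalizing. A sketch, with invented survival rates standing in for the real actuarial table:

```python
def survival_adjusted(likelihoods, survival_rate):
    """Weight each birth-year likelihood by the cohort's probability of
    surviving to today, then renormalize so the result sums to 1."""
    weighted = {y: p * survival_rate[y] for y, p in likelihoods.items()}
    total = sum(weighted.values())
    return {y: w / total for y, w in weighted.items()}

raw = {1915: 0.5, 1990: 0.5}          # equal raw birth-year likelihoods
survival = {1915: 0.01, 1990: 0.99}   # toy survival-to-today rates
adjusted = survival_adjusted(raw, survival)
# The 1990 cohort now dominates: almost no 1915-born Maxes are left.
```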

The second problem is that names are not strictly one gender or the other; in fact, most names in the database aren’t 100% single-gender, Michael included. It became clear that age data had to be computed per gender, not on totals. Names that are popular with one gender are not necessarily popular with the other at the same time.

To compensate for this, age data was tabulated separately by gender, all the way through the actuarial normalization. Female names were rated and graded against each other, male names separately against each other, and only at the end were the two normalized against each other.

Compared to age, gender and race were much easier. Gender analysis was a simpler form of the age analysis: likely names were divided by gender and then normalized by age. Race/ethnicity data was also quite simple, based on surnames; the data was already organized by the Census, albeit 13 years ago, so getting it into a searchable database wasn’t tricky.
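The gender split reduces to a ratio of birth counts. Again a toy sketch with made-up counts, not the project’s real data:

```python
def gender_share(male_count, female_count):
    """P(gender | first name) from total birth counts per gender."""
    total = male_count + female_count
    return {"M": male_count / total, "F": female_count / total}

print(gender_share(9800, 200))  # -> {'M': 0.98, 'F': 0.02}
```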


There are some obvious limitations to my method. The first is in the nature of large numbers, or small numbers as the case may be. If you were to put a list of 2 names into Drillbit, it would spit out a similar-looking demographic profile running the gamut of all ages and perhaps some different races as well. There are few names that are reliably “Black” names or “Over 65” names (although there are a few names with a 100% incidence within one to five years; I challenge you to find them). Like with any aggregate data project, the larger the list, the more reliable Drillbit will be.

The other limitation is in any sort of list that comes with existing biases. Say, a list of NBA players (heavily 25-35 and black) or a list of sitting US Congresspeople (heavily male, white, and 35-55). These inherent biases will be reflected in the analysis, but probably not to the extent that they could be. This is the House of Representatives, according to Drillbit:

Obviously 18-24 year old congresspeople would be impossible. And yet, even with a small list of 435 names, the trends in age in reality poke through.

In short, you shouldn’t use Drillbit to analyze a list whose composition you already know to skew heavily toward one or two demographics. However, it’s worth noting that Congress is 18.3% female, and Drillbit predicted 20.5% based on names alone. Not shabby.

The final inherent bias that’s worth mentioning is the skew toward younger ages. Since younger people are overwhelmingly more likely to be alive, post-normalization numbers skew younger. In addition, during development I had an “Under 25” category, but it became apparent that although my database could detect age variability all the way down to age 0, babies aren’t going to be on mailing lists, and they were throwing off all the results. So to compensate for the younger skew, I made a judgment call to cut off at 18 and not track any younger cohorts, even though some websites may have 13-18 year olds as users.

Now that you know more about how I did it, upload a list and try it out!

June 21, 2013 · Comments are Disabled · Read More
Saving Civilization from Al-Qaeda (and the Weather)

You may have read the article in the New Republic last month about how 300,000 ancient books and manuscripts in the libraries of Timbuktu were evacuated in secret to protect them from Ansar Dine, an Al Qaeda cell. The manuscripts not only survived the burning of the Timbuktu library, but were smuggled in footlockers all the way to Bamako, the capital of Mali, where they are currently being hidden away by volunteers until they can be returned.

The problem is, Bamako is in the south and the climate is much wetter and thus more destructive to the manuscripts. As I write this, thousands of unique, priceless artifacts chronicling history, philosophy, science, literature, law and religion from the peak of medieval Islamic and North African civilization are slowly being eaten away by mildew.

I have teamed up with T160K, Timbuktu Libraries in Exile, to help drive attention and funding toward the preservation of these manuscripts, and by proxy, the preservation of civilization itself. There is nothing more offensive than religious zealots imposing their backwards ideologies on free thinking people. At T160K, we seek to keep these manuscripts safe from the elements while they wait out the Islamist threat in Mali.

300,000 unique books and manuscripts have survived there for 800 years. They should still be around 800 years from now.

There are several things you can do to help, but right now we need to fund the Indiegogo campaign.

Fund the Indiegogo Campaign · Find Out More

May 21, 2013 · Comments are Disabled · Read More