I'm working on a site called Cyclopath. It's been live for years--go ahead and check it out. It's a geowiki--a map of the Twin Cities that anyone can edit and contribute to--made for and by cyclists. You can add paths if they don't exist, tag points of interest, add notes on routes ('Lots of potholes here!'), or rate the bikeability of a path. It aims to be a valuable resource, a place for bikers to share information about navigating the cities by bike. Pretty cool, huh?
Now try finding a route. Unlike with Google Maps, where you just enter a start and end location and hit enter, Cyclopath gives you some options. You can tell it whether you want a route that's more direct, or easier to bike. You can also rate tags--different traits of roads, like hills or bike lanes--to tell Cyclopath whether you want it to favor or disfavor them. All this information goes into the customized route that it gives you after a few seconds. This is all well and good; go and enjoy the route, and maybe rate it afterwards or tag something cool you saw.
But how do you think Cyclopath comes up with this route? Technically it uses the A* algorithm, a graph-search algorithm that finds the lowest-cost path between two nodes. In this case, "cost" refers to how bikeable (or not) a path is, and how long it is. Cyclopath tries to find you a route that strikes just the right balance between distance and bikeability (a balance that you can adjust, as mentioned above). But this relies on knowing your bikeability rating for every edge, so it can run calculations on it. Clearly no one rates every edge; in fact, many edges haven't been rated by anyone--one of the biggest problems facing Cyclopath. Yet it has to be able to recommend any edge, so it can provide routes from anywhere to anywhere. So it needs a way to know, or guess, how you would rate a given edge even if you actually haven't. That is the focus of my research.
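To make the idea concrete, here's a minimal sketch of A*-style route finding on a toy graph. The graph, the ratings, and the cost formula that blends length with a 0-4 bikeability rating are all my own illustrative assumptions, not Cyclopath's actual code; with a zero heuristic, as used here, A* degenerates to Dijkstra's algorithm.

```python
import heapq

# graph[node] = list of (neighbor, length_km, bikeability_rating 0-4)
# -- a made-up toy network, not real Cyclopath data.
graph = {
    "A": [("B", 1.0, 4), ("C", 1.0, 1)],
    "B": [("D", 1.0, 4)],
    "C": [("D", 0.5, 1)],
    "D": [],
}

def edge_cost(length, rating, bike_weight=0.5):
    """Blend distance with inverted bikeability; bike_weight tunes the balance."""
    penalty = (4 - rating) / 4  # 0 for a perfect edge, 1 for the worst
    return length * (1 + bike_weight * penalty)

def a_star(start, goal, heuristic=lambda n: 0.0):
    """Plain A*: pop the lowest (cost + heuristic) frontier entry until goal."""
    frontier = [(heuristic(start), 0.0, start, [start])]
    best = {}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if best.get(node, float("inf")) <= cost:
            continue
        best[node] = cost
        for nbr, length, rating in graph[node]:
            c = cost + edge_cost(length, rating)
            heapq.heappush(frontier, (c + heuristic(nbr), c, nbr, path + [nbr]))
    return None

print(a_star("A", "D"))  # the longer-but-pleasant A-B-D beats the short, rough A-C-D
```

Note how the `bike_weight` knob plays the role of the direct-vs-bikeable slider: at 0 the search is pure shortest-path, and raising it makes poorly rated edges progressively more expensive.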
Reid (the now-departed doctoral student who founded Cyclopath) has already done research into different 'predictors' (different methods for guessing how users will rate edges). Some simply consider factors like the type of road and presence of bike lanes; others categorize the roads and assign default ratings to each category. Still others average how other users have rated an edge and predict the average. And, most interestingly, some predictors take into account how a user has rated other edges. Reid implemented a few dozen of these algorithms and tested them to see how accurate they are and how often they agreed with one another on which edge to choose. But that was as far as he got; his thesis left open the question of whether different predictors really mattered in recommending routes.
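Two of those predictor styles can be sketched in a few lines. The function names, data layout, and sample ratings below are my own assumptions for illustration, not Reid's implementations: one predictor just returns the edge's average rating, and a personalized one shifts that average by how the user tends to rate relative to everyone else.

```python
from statistics import mean

# ratings[edge_id] = {user_id: rating} -- toy data, not real Cyclopath records
ratings = {
    "lake_st":  {"alice": 2, "bob": 3},
    "greenway": {"alice": 4, "bob": 4, "carol": 3},
}

def predict_edge_average(edge_id, default=2.5):
    """Predict the average of everyone's ratings for this edge."""
    edge = ratings.get(edge_id, {})
    return mean(edge.values()) if edge else default

def predict_personalized(user_id, edge_id, default=2.5):
    """Shift the edge average by the user's mean offset from the crowd."""
    rated = [(e, r[user_id]) for e, r in ratings.items() if user_id in r]
    if not rated:  # never rated anything: fall back to the plain average
        return predict_edge_average(edge_id, default)
    offset = mean(r - predict_edge_average(e) for e, r in rated)
    return predict_edge_average(edge_id, default) + offset
```

The second predictor is the "take into account how a user has rated other edges" flavor: a rider who consistently rates below the crowd gets lower predictions everywhere, even on edges they've never seen.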
That's where I came in. This semester I've been testing the most promising predictors to see if they come up with different routes. Ultimately we were hoping to learn how to improve Cyclopath by personalizing the route recommender, so it could give users routes that they would really enjoy. I did this pretty much by brute force: I randomly pulled 500 saved route requests from our records, fed each of them into the predictors I was testing (this took about 12 hours total), then compared the different routes generated by each predictor. This let me find some interesting things, like how often the predictors gave the same route, and I was able to group them together by which ones usually agreed. But things got better last week when I figured out how to turn the results into cool pictures.
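The comparison step itself is simple to sketch. One reasonable way to score how much two recommended routes agree is Jaccard similarity on their edge sets; the predictor names and routes below are made-up examples, and this is an assumed metric, not necessarily the one I used in the actual analysis.

```python
from itertools import combinations

# routes[predictor] = list of edge ids making up that predictor's route
# for one request -- illustrative stand-ins, not real results.
routes = {
    "edge_average": ["e1", "e2", "e3", "e4"],
    "personalized": ["e1", "e2", "e5", "e4"],
    "road_class":   ["e9", "e8", "e7", "e4"],
}

def overlap(a, b):
    """Jaccard similarity of two routes' edge sets (1.0 = identical route)."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

# Pairwise agreement: predictors that usually score high together cluster.
for p1, p2 in combinations(routes, 2):
    print(f"{p1} vs {p2}: {overlap(routes[p1], routes[p2]):.2f}")
```

Averaging these pairwise scores over all 500 requests gives exactly the kind of agreement matrix you can group predictors by.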
This is a rather epically long route I tested, from Jordan to North St. Paul. The three colored routes are the different options the ten predictors came up with. The blue and green paths are pretty similar; the red one is quite different. I have ways of measuring this, but it's very helpful to be able to see them. The next step is looking for tangible patterns in the routes the predictors give, including surveying bikers to see which ones they prefer. I hope this sounds at least kind of interesting. I'd love to talk more about it!