Here’s my day. Not because I think I did anything groundbreaking, but because it might help you to see how I tackle data science problems with my team. I’ll keep the details project-centric and not very in-depth, so you won’t have to know why we kept 250 factors in our SVD calculations (but I’d LOVE to tell you all about it!!)
It’s 10:02 PM. As per the past ten or so days, I’m finally getting around to writing. I’m incredibly analytical, so I’m well aware that it takes me about twenty minutes to get to around 400 words. After an hour, I’m somewhere between 750 and 1,150 words. That’s quite a standard deviation. I suppose the second half-hour is more of a downhill run some nights. Unfortunately, my brain hasn’t been trained well enough yet to predict how many words this post will be, and thus, when I’ll finish.
This morning, as a team, we worked on a new project to build a travel location recommender, sourcing data from all over the internet. The objective is to help people find the next great place to visit, personalized to each person using it.
We have fantastic music recommenders, and movie recommenders, and product recommenders, but where’s that one tool that, when we’re yearning to travel, gives us that perfect next spot?
I’m excited about this project for a personal reason: several years ago I had the incredible pleasure of spending a month on Bali. I even scootered around the entire island for 20 days, surfing, meeting locals, hiking a volcano. I really loved the experience. I thought I did all the “good stuff”. Then about a month later I started hearing, over and over, about some Indonesian islands that were gaining popularity: the Gili Islands.
For the next few months I heard about these islands seemingly everywhere I went. The bagger at the grocery store had been there! But the advice was too late for me. I was gone, already thousands of miles away.
Word of mouth failed when I was there. And the internet hadn’t provided for me. That’s why this project is so much fun to work on. The why is so helpful as a motivator, even to math-driven data scientists. That anchor engages us every step of the way.
Today, David, Andrey (data scientists on our team) and I worked on some text processing, taking stacks of data from a myriad of online sources and compiling them into a cohesive story: one story for each location. In the beginning, it’s the similarities between the destinations that will help us make recommendations. We’ve assembled tens of thousands of destinations, described by words and symbols, some that we’ve never heard of before. There’s a lot of manual research, and a lot of smart decision making.
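For the curious, here’s a tiny sketch of what “similarity between destinations” can look like. To be clear, this is not our actual pipeline: I’m assuming a plain TF-IDF weighting with cosine similarity as a stand-in technique, and the three one-line descriptions in `DOCS` are made up for illustration. The function names (`tfidf_vectors`, `cosine`, `most_similar`) are hypothetical too.

```python
import math
from collections import Counter

# Toy, made-up descriptions (an assumption for illustration only);
# a real version would compile one "story" per destination from many sources.
DOCS = {
    "Bali": "surfing volcano hiking beaches temples island",
    "Gili Islands": "beaches snorkeling surfing island quiet",
    "Reykjavik": "glaciers volcano hiking northern lights",
}


def tfidf_vectors(docs):
    """Turn each description into a sparse {term: weight} TF-IDF vector."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many descriptions does each term appear?
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    vectors = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        vectors[name] = {
            # Term frequency times a smoothed inverse document frequency.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(name, vectors, top_n=2):
    """Rank the other destinations by similarity to `name`, best first."""
    scores = [
        (other, cosine(vectors[name], vec))
        for other, vec in vectors.items()
        if other != name
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


vectors = tfidf_vectors(DOCS)
for place, score in most_similar("Bali", vectors):
    print(f"{place}: {score:.3f}")
```

With these toy descriptions, the Gili Islands rank above Reykjavik for someone who liked Bali, simply because they share more of its words. That’s the whole idea at small scale; with tens of thousands of destinations you’d want a proper library and a lot more care about the text itself.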
On top of that effort, we’re building out a simple interface to showcase what we’ve built. Over the years, this ability to showcase the math/recommendations/predictive tools has served us as well as the data science itself. I’ve learned that if I can’t explain our work, it’s not of much value. When we start any project, we begin with the end in mind, usually with a visualization. If it doesn’t work conceptually, we tear it apart and start again until it’s simple to understand and supportable by the intelligent methods we’re so familiar with.
Time check: It’s 10:23 and I’m at 515 words. It’s a good night! I think all the talk about islands is inspiring me.
I’ll be sure to post about how that project progresses and provide a link as soon as there’s an MVP to share. I mean, you only have your next big plane ticket purchase to lose!
That was most of my morning.
The afternoon involved a trip to the dentist for a cleaning (I’m doing fine).
Then I rushed to a meeting…that…got…cancelled. How many times do we all rush out of something to head to the next…cancellation? It’s wonderful (sometimes) to have the additional time, but it really makes me long for predictive calendar insights. (Anyone working on this?)
Then it was home for a few hours spent reaching out to prospects. There are so many good ideas out there, and it’s a lot of fun talking with founders and managers who need help. This is a wonderful time to be a data scientist.
The gym, then dinner, then some online research. I’m thinking of video blogging and I’m assembling some notes and thoughts on how to approach it. Always. Be. Data. Driven.
And now, here I am. 714 words in, and it’s 10:30 PM. Flying tonight.
These are fun. I’ll do these posts now and again. I hope you enjoy ‘em!