The Explore-Exploit Dilemma — Lambda in ML and in Life

T

In machine learning, there is a concept of explore vs exploit. How often do you explore versus exploit? For example, when the algorithm is doing its supervised learning, you need to know how often to actually utilize the data that you have to get the right result. Let’s say you are trying to classify animals. You collect data, like how many legs they have, whether they produce milk, or whether they lay eggs. You might have a data set that says, for instance, a duck has two legs, is a bird, lays eggs, and produces no milk. You would feed that into a machine, and after training it, the machine has to figure out how often it should expand its horizons and learn something new, versus relying on what it already knows. For instance, if the machine learns that animals that produce milk are mammals, how often does it stick with that conclusion versus making room for the possibility that some mammals might not produce milk? 

This concept applies to people as well, not just machines. How often do you reflect on your best practices or current practices to determine if they are still good enough? For example, in your department’s approval process for deploying code to development, you might think about whether it’s worth challenging. That’s like exploring—you decide if it’s worth learning or changing something. But if you challenge everything, you won’t have time to get any actual work done. If you’re constantly second-guessing your approach, whether it’s using the tech stack you know or considering other options, you might never get anything done because you’re spending all your time exploring.

Each of us needs to find our balance. If you always only exploit your skills, you’ll never learn new things. After a few years, you’ll end up with outdated skills and won’t be competitive. On the other hand, if your exploration rate is set too high and you’re always exploring, you might not get anything practical done. In machine learning, for stochastic gradient descent, lambda is seen as the exploration rate, and is typically tuned to some low factor, say 10%. In life, you can think of it similarly. Google, for example, used to have a policy allowing employees to explore new ideas one day a week, believing that it would bring innovation (20%). But not all organizations are like Google, and some have different ways to encourage exploration, like requiring employees to complete a certain number of training hours each year.

Personally, I believe in keeping my lambda around 0.1 — spending about 10% of my time exploring. When I make decisions, like choosing a tech stack, I reflect on whether the justification for the choice is still valid. If it is, I stick with it. If something sounds questionable, then I challenge it and make a change. That’s how I like to grow in an organization as well. Challenging things can step on people’s toes, but it can also lead to good results and help everyone as a result.

Improvement comes from challenging the status quo. Even if your challenge is dismissed, at least you’ve learned something about the organization. If you don’t challenge, you won’t improve. The world didn’t get to a five-day, 40-hour workweek by everyone keeping their heads down and working their jobs. Innovation, whether incremental or disruptive, comes from asking questions and finding better ways of doing things than following the status quo.

When I do something I haven’t done in a while, I think about whether my methods are still state of the art. I keep up with tech radar articles to stay informed about new solutions. I don’t want to be only looking for solutions when I have a problem. Instead, I look for solutions and problems in advance, so when a problem arises, I already have several options to consider. I organize these ideas in my notes and bring them out when necessary. This helps me make better, more informed decisions quickly, which is important because when you’re faced with a problem, you often don’t have time to explore too much. That’s where your balance, or your lambda factor, is crucial.

Featured photo by Dariusz Sankowski on Unsplash

About the author

Clarence Cai

Add Comment

Clarence Cai

Get in touch

Reach out if you have something to discuss, you can find me in the following socials.