Term Overview: Model Parameters vs Model Hyperparameters
In this guide we're going to walk through two different terms in the machine learning space: model parameters and model hyperparameters.

We're going to split them up because I think the easiest way of understanding them, in addition to seeing a basic example, is by seeing their two definitions separately.

And so even though they have similar names, there's a very key difference that you need to understand in order to work with them. So the first one is the model parameter.


The definition for that is that it is a part of the model that is learned from historical training data. So if we were to look at this algorithm output right here,


our model parameters are elements such as the mean or the standard deviation, like you can see on the top right-hand side. Those are the elements that were learned from the training data set. So they're not the data directly; instead, they are the values that were generated once the data went through the algorithm, such as the standard deviation, and they were used to build out that trained model. So right here, as you can see with this set of graphs, we have four different sets of parameters: means that range from 0 to -2, and standard deviations going from 0.2 all the way to 0.5.


And then when you place those inside of the algorithm, it generates these various curves. You don't have to worry about the graph itself here; I want you to just focus on those items on the top right-hand side, because those are the model parameters. They are what is being used to generate the different curves, or whatever type of output the algorithm produces. Now that is a model parameter.
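To make that concrete, here is a minimal sketch of how a mean and standard deviation can be learned from a training data set. The sample values and the use of NumPy are assumptions for illustration; they are not from the guide itself.

```python
import numpy as np

# Hypothetical training data set (values made up for illustration)
training_data = np.array([1.2, 0.8, -0.3, 0.5, 1.9, -1.1, 0.7, 0.2])

# These model parameters are learned from the historical training data
mu = training_data.mean()    # learned mean
sigma = training_data.std()  # learned standard deviation

print(f"Learned mean: {mu:.2f}")
print(f"Learned standard deviation: {sigma:.2f}")

# The learned parameters can then be used by the model, for example to
# evaluate the normal (Gaussian) density curve at a new point
x = 0.0
pdf = (1 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
print(f"Density at x = {x}: {pdf:.3f}")
```

Notice that we never typed in the mean or the standard deviation ourselves; they came out of the data, which is exactly what makes them model parameters.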

Now if that is still not perfectly clear, let's look at a programming example, because in a language such as Python, functions have parameters. So right here


we have a basic greeting function, and the greeting function takes in a couple of parameters: a first name and a last name. Then when we call that function down on the bottom, we pass in two arguments, Jon and Snow, and those values are used in order to return the greeting.
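The exact code isn't reproduced in the text, but a minimal version of that greeting function might look like this (the wording of the returned string is an assumption):

```python
def greeting(first, last):
    # first and last are the function's parameters
    return f"Hi {first} {last}, welcome to the program!"

# Jon and Snow are the values passed in when the function is called
print(greeting("Jon", "Snow"))
```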

Now, this is a very basic example, but it may also help you understand how parameters are used, even if in a slightly different way than they are used directly in machine learning. So those are model parameters: they are the different elements that are generated from the machine learning training data set.

Now let's talk about hyperparameters. When it comes to model hyperparameters, the definition is that a hyperparameter is a configuration that is external to the model and whose value cannot be estimated from the data.


So remember, when we talked about model parameters, every one of them was an element generated from the training data set. When we generate a mean, or an average, or a standard deviation, those are elements that the algorithm was able to learn from the training data that we passed into it.

With a model hyperparameter, these are elements that the model can't learn. One of the best examples of this is the k-nearest neighbors algorithm, because right here we have a hyperparameter, and in the k-nearest neighbors algorithm that value is k. Notice that k is not something that the system generated automatically; instead, it was something that we, as the implementers of the algorithm, had to provide.


And so I think that's one of the best examples of what a hyperparameter is, because it's something that the system didn't generate. It's not like a standard deviation or a mean; it is simply an input that we pass in, and typically a hyperparameter is something that we need to experiment with quite a bit. With k-nearest neighbors, for example, we're typically not going to be able to guess k perfectly right away. We're going to have to analyze the data, test it, and then eventually find whatever that k value should be.
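As a rough sketch of that kind of experimentation (assuming scikit-learn and its built-in iris data set, neither of which is mentioned in the guide), trying out a few values of k might look something like this:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a sample data set and split it into training and test sets
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# k is the hyperparameter: we supply it, the model does not learn it from the data
for k in [1, 3, 5, 7, 9]:
    model = KNeighborsClassifier(n_neighbors=k)
    model.fit(X_train, y_train)  # fit the model on the training data
    print(f"k = {k}: test accuracy = {model.score(X_test, y_test):.3f}")
```

The point of the loop is that no single k is handed to us by the algorithm; we compare results for several candidate values and pick the one that performs best.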