Exercise 3
Part of the course Machine Learning for Materials and Chemistry.
Loss functions constitute the key metric that is minimized in order to improve a model. Many different choices are possible, and each choice affects the predictions of the model. This means that the loss function must be specified to fully define a model.
Different loss functions can yield highly disparate predictive power, as exemplified in this work estimating bond dissociation energies.
Task 3.1: Loss functions
Let's fit a function of the form `y = a*exp(b*x)`, which can be logarithmized as `log(y) = log(a) + b*x`. Write loss functions for the squared error of the residuals in the original form and in the logarithmized form. Use the following signature: `original_loss(params: tuple[float]) -> float`.
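A minimal sketch of what these two loss functions could look like. The exercise only fixes the signature, so the module-level arrays `xs` and `ys`, the example data values, and the interpretation of `params` as `(a, b)` are all assumptions:

```python
import numpy as np

# Example data to close over; replace with your own (values are assumptions).
xs = np.arange(10, dtype=float)
ys = 2.0 * np.exp(0.5 * xs)

def original_loss(params: tuple[float]) -> float:
    """Squared error of the residuals in the original form y = a*exp(b*x)."""
    a, b = params
    residuals = ys - a * np.exp(b * xs)
    return float(np.sum(residuals**2))

def log_loss(params: tuple[float]) -> float:
    """Squared error of the residuals in the logarithmized form
    log(y) = log(a) + b*x. Requires a > 0 and all ys > 0."""
    a, b = params
    if a <= 0:
        return float("inf")  # log(a) undefined; penalize heavily
    residuals = np.log(ys) - (np.log(a) + b * xs)
    return float(np.sum(residuals**2))
```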
Task 3.2: Fit models
Write a function `random_data(c: float) -> tuple[list[float], list[float]]` that returns the values 0-9 for `xs` and some exponential of the form `y = a*exp(b*x) + c*epsilon`, where `epsilon` is elementwise normally distributed random noise. Fit the data using the loss functions of Task 3.1 and `scipy.optimize.minimize`, once for `c = 0` and once for `c` being some non-zero value. Discuss the results.
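One possible setup, as a hedged sketch: the ground-truth parameters `a = 5.0`, `b = 0.5`, the noise scale `c = 2.0`, and the choice of the Nelder-Mead method are arbitrary, and the inner loss functions repeat the Task 3.1 pattern so the snippet is self-contained:

```python
import numpy as np
from scipy.optimize import minimize

def random_data(c: float) -> tuple[list[float], list[float]]:
    """Return xs = 0..9 and ys = a*exp(b*x) + c*epsilon with normal noise."""
    a, b = 5.0, 0.5                      # example ground truth (assumption)
    xs = np.arange(10, dtype=float)
    epsilon = np.random.default_rng().standard_normal(xs.shape)
    ys = a * np.exp(b * xs) + c * epsilon
    return list(xs), list(ys)

def fit_both(c: float) -> None:
    """Fit (a, b) with both losses on one random data set and print the results."""
    xs_list, ys_list = random_data(c)
    xs, ys = np.asarray(xs_list), np.asarray(ys_list)

    def original_loss(params):
        a, b = params
        return np.sum((ys - a * np.exp(b * xs)) ** 2)

    def log_loss(params):
        a, b = params
        # log is undefined for non-positive values; large noise can push ys below 0
        if a <= 0 or np.any(ys <= 0):
            return np.inf
        return np.sum((np.log(ys) - (np.log(a) + b * xs)) ** 2)

    for name, loss in [("original", original_loss), ("log", log_loss)]:
        result = minimize(loss, x0=(1.0, 1.0), method="Nelder-Mead")
        print(f"c={c}: {name:>8} loss -> a={result.x[0]:.3f}, b={result.x[1]:.3f}")

fit_both(0.0)   # noise-free: both losses should recover a and b
fit_both(2.0)   # noisy: compare how each loss weights small vs. large y
```

Comparing the two fitted parameter sets for the noisy case should show the effect to discuss: the log-scale loss weights relative errors, so noise on the small-`y` points influences it far more than it does the original-scale loss.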
Task 3.3: Motivation for loss functions
Consider the effects you saw in Task 3.2.
- What could be possible motivations to choose one loss function over another?
- Is there a best loss function?
- Which problems do you see for loss functions in multivariate models?