Exercise 3
Part of the course Machine Learning for Materials and Chemistry.
Loss functions constitute the key metric that is minimized in order to improve a model. Many different choices are possible, and each choice affects the predictions of the model. This means that the loss function must be specified to fully define a model.
Different loss functions can yield highly disparate predictive power, as exemplified in this work estimating bond dissociation energies.
Task 3.1: Loss functions
Let's fit a function of the form `y = a*exp(b*x)`, which can be logarithmized as `log(y) = log(a) + b*x`. Write loss functions for the squared error of the residuals in the original form and in the logarithmized form. Use the following signature: `original_loss(params: tuple[float]) -> float`.
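A minimal sketch of what these two loss functions could look like. The exercise only fixes the signature, so the module-level arrays `xs` and `ys`, the example data values, and the interpretation of `params` as `(a, b)` are all assumptions:

```python
import numpy as np

# Example data to close over; replace with your own (values are assumptions).
xs = np.arange(10, dtype=float)
ys = 2.0 * np.exp(0.5 * xs)

def original_loss(params: tuple[float]) -> float:
    """Squared error of the residuals in the original form y = a*exp(b*x)."""
    a, b = params
    residuals = ys - a * np.exp(b * xs)
    return float(np.sum(residuals**2))

def log_loss(params: tuple[float]) -> float:
    """Squared error of the residuals in the logarithmized form
    log(y) = log(a) + b*x. Requires a > 0 and all ys > 0."""
    a, b = params
    if a <= 0:
        return float("inf")  # log(a) undefined; penalize heavily
    residuals = np.log(ys) - (np.log(a) + b * xs)
    return float(np.sum(residuals**2))
```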
Task 3.2: Fit models
Write a function `random_data(c: float) -> tuple[list[float], list[float]]` that returns the values 0-9 for `xs` and some exponential of the form `y = a*exp(b*x) + c*epsilon`, where `epsilon` is elementwise normally distributed random noise. Fit the data using the loss functions of Task 3.1 and `scipy.optimize.minimize`, once for `c = 0` and once for `c` being some non-zero value. Discuss the results.
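One possible setup, as a hedged sketch: the ground-truth parameters `a = 5.0`, `b = 0.5`, the noise scale `c = 2.0`, and the choice of the Nelder-Mead method are arbitrary, and the inner loss functions repeat the Task 3.1 pattern so the snippet is self-contained:

```python
import numpy as np
from scipy.optimize import minimize

def random_data(c: float) -> tuple[list[float], list[float]]:
    """Return xs = 0..9 and ys = a*exp(b*x) + c*epsilon with normal noise."""
    a, b = 5.0, 0.5                      # example ground truth (assumption)
    xs = np.arange(10, dtype=float)
    epsilon = np.random.default_rng().standard_normal(xs.shape)
    ys = a * np.exp(b * xs) + c * epsilon
    return list(xs), list(ys)

def fit_both(c: float) -> None:
    """Fit (a, b) with both losses on one random data set and print the results."""
    xs_list, ys_list = random_data(c)
    xs, ys = np.asarray(xs_list), np.asarray(ys_list)

    def original_loss(params):
        a, b = params
        return np.sum((ys - a * np.exp(b * xs)) ** 2)

    def log_loss(params):
        a, b = params
        # log is undefined for non-positive values; large noise can push ys below 0
        if a <= 0 or np.any(ys <= 0):
            return np.inf
        return np.sum((np.log(ys) - (np.log(a) + b * xs)) ** 2)

    for name, loss in [("original", original_loss), ("log", log_loss)]:
        result = minimize(loss, x0=(1.0, 1.0), method="Nelder-Mead")
        print(f"c={c}: {name:>8} loss -> a={result.x[0]:.3f}, b={result.x[1]:.3f}")

fit_both(0.0)   # noise-free: both losses should recover a and b
fit_both(2.0)   # noisy: compare how each loss weights small vs. large y
```

Comparing the two fitted parameter sets for the noisy case should show the effect to discuss: the log-scale loss weights relative errors, so noise on the small-`y` points influences it far more than it does the original-scale loss.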
Task 3.3: Motivation for loss functions
Consider the effects you saw in Task 3.2.
- What could be possible motivations to choose one loss function over another?
- Is there a best loss function?
- Which problems do you see for loss functions in multivariate models?