8. Accelerated Discovery#

💡 Geoffrey Hinton: It’s quite conceivable that humanity is just a passing phase in the evolution of intelligence.

Lecture slides

8.1. 🤖 x 🧪 Closed-loop optimisation#

The combination of automation and optimisation is powerful. Closed-loop workflows are of growing importance in materials research for many reasons, including:

  1. Efficiency: Efficient allocation of resources, both in terms of time and materials. By continuously updating experimental parameters based on real-time feedback, we can reduce the number of trials needed to reach optimal outcomes.

  2. Adapt to changing conditions: Adaptive decision-making, ensuring that experiments remain effective even when external factors fluctuate. This adaptability is highly valuable for complex systems where traditional trial-and-error approaches are prone to fail.

  3. Exploration of large parameter spaces: Many materials science problems involve high-dimensional parameter spaces where exhaustive exploration is impractical. Techniques such as Bayesian optimisation can efficiently sample and search these spaces to identify optimal configurations and make discoveries.

  4. Data-driven insights: Generation of valuable data from ongoing experiments. This data can be analysed to gain a deeper understanding of the underlying processes and relationships, facilitating scientific discoveries and supporting future efforts.

Today we will make use of the scikit-optimize package.

# Installation of libraries
!pip install scikit-optimize --quiet
# Import of modules
import numpy as np 
import matplotlib.pyplot as plt 
from scipy.stats import norm  # Statistical functions
from skopt import gp_minimize, dummy_minimize  # Bayesian optimisation and random-sampling baseline
from skopt.utils import create_result  # Utility functions for skopt
from sklearn.metrics import r2_score  # R-squared metric

8.2. Bayesian optimisation (BO)#

BO is a powerful technique for optimising complex and expensive-to-evaluate functions. It combines probabilistic modelling and decision theory to search for the optimal set of parameters. In materials research, parameters such as chemical composition, sample thickness, and processing conditions can be optimised.
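
As a concrete (and entirely hypothetical) illustration, such processing parameters can be encoded as a search space for scikit-optimize; the variable names and ranges below are invented for demonstration, not taken from a real instrument.

# Hypothetical sketch: encoding a processing recipe as a skopt search space
# (names and ranges are illustrative only)
from skopt.space import Real, Integer, Categorical

search_space = [
    Real(0.0, 1.0, name="dopant_fraction"),              # chemical composition
    Real(10.0, 500.0, name="film_thickness_nm"),          # sample thickness
    Integer(100, 900, name="anneal_temp_C"),              # processing condition
    Categorical(["Ar", "N2", "O2"], name="atmosphere"),   # growth environment
]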

BO aims to find the global minimum (or maximum) of an objective function, \(O(x)\), where \(x\) represents a set of parameters or design variables. Instead of an exhaustive search, BO builds a surrogate model, typically a Gaussian Process (GP), that approximates the true objective function. This surrogate model captures both the mean \(\mu(x)\) and the uncertainty \(\sigma(x)\) associated with \(O(x)\). The GP is defined as:

\[ O(x) \sim \text{GP}(\mu(x), k(x, x')) \]

where \(k(x, x')\) is a kernel function that quantifies the similarity between two input points \(x\) and \(x'\).
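
To make the surrogate concrete, the short sketch below fits a GP to a handful of invented observations using scikit-learn's GaussianProcessRegressor (scikit-optimize builds on the same machinery) and queries its mean and standard deviation at new points.

# Minimal sketch: fit a GP surrogate to a few invented observations and
# query its mean mu(x) and uncertainty sigma(x) at unseen inputs
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

X_obs = np.array([[-4.0], [-1.0], [0.5], [3.0]])   # observed inputs
y_obs = np.sin(X_obs).ravel()                      # observed outputs O(x)

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), normalize_y=True)
gp.fit(X_obs, y_obs)

X_query = np.linspace(-5, 5, 5).reshape(-1, 1)
mu, sigma = gp.predict(X_query, return_std=True)   # surrogate mean and std
print(np.round(mu, 2), np.round(sigma, 2))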

The search balances exploration and exploitation through an acquisition function \(\alpha(x)\), which trades off sampling uncertain regions (exploration) against refining promising ones (exploitation). The next point to evaluate is chosen by maximising it:

\[ x_{\text{next}} = \arg \max_x \alpha(x) \]

Common acquisition functions include Probability of Improvement (PI), Expected Improvement (EI), and the Upper Confidence Bound (UCB). Each quantifies, in its own way, how worthwhile it is to evaluate a candidate point relative to the current best solution.
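
To see how an acquisition function scores candidates, the sketch below implements Expected Improvement for minimisation directly from the surrogate mean and standard deviation. This is a simplified illustration, not scikit-optimize's internal code.

# Illustrative sketch of Expected Improvement (EI) for a minimisation problem,
# computed from the surrogate mean mu, standard deviation sigma, and the
# best objective value observed so far
def expected_improvement(mu, sigma, y_best, xi=0.01):
    sigma = np.maximum(sigma, 1e-12)        # guard against division by zero
    improvement = y_best - mu - xi          # predicted gain over the current best
    z = improvement / sigma
    return improvement * norm.cdf(z) + sigma * norm.pdf(z)

# A candidate with a worse mean but larger uncertainty can still score highest
mu_candidates = np.array([0.2, 0.5])
sigma_candidates = np.array([0.05, 0.8])
print(expected_improvement(mu_candidates, sigma_candidates, y_best=0.3))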

Curious about the kernel function?

The kernel determines the covariance structure of the GP. A commonly used kernel, and the default in scikit-learn's GaussianProcessRegressor, is the Radial Basis Function (RBF):

\[ k(x, x') = \sigma^2 \exp\left(-\frac{\|x - x'\|^2}{2l^2}\right) \]

where:

  • \(\sigma^2\) is the signal variance, which controls the overall magnitude of function variations,

  • \(l\) is the length scale, which determines how quickly the function values change with respect to input differences.

There are also many other choices, such as the Matérn kernel, which differ in how they model smoothness and continuity.
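
The sketch below evaluates the RBF formula above for two 1D inputs (with arbitrary hyperparameters) to show how the covariance decays with distance.

# Sketch: the RBF kernel from the formula above, with arbitrary hyperparameters
def rbf_kernel(x, x_prime, signal_variance=1.0, length_scale=1.0):
    sq_dist = np.sum((np.atleast_1d(x) - np.atleast_1d(x_prime))**2)
    return signal_variance * np.exp(-sq_dist / (2 * length_scale**2))

print(rbf_kernel(0.0, 0.0))   # identical inputs: covariance equals sigma^2
print(rbf_kernel(0.0, 2.0))   # distant inputs: covariance decays towards zero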

8.3. Building a BO model#

8.3.1. Step 1: Target function#

We can start by generating a simple sine-like target function with added noise to keep things interesting. This acts as our “virtual experiment”, i.e. we can call the function to obtain an output.

# Fixing the random seed for reproducibility
np.random.seed(42)

# Define the target function
def target_function(x):
    x = np.atleast_1d(x) # Ensure x is an array
    return np.sin(x[0]) + 0.1 * x[0] + 0.5 * np.random.randn()

# Generate data for visualisation
x_values = np.linspace(-5, 5, 200).reshape(-1, 1)
y_values = np.vectorize(target_function)(x_values)

# Plot the target function
plt.figure(figsize=(5, 4))
plt.plot(x_values, y_values, 'r-', alpha=0.5, label='Target Function')
plt.xlabel('Input')
plt.ylabel('Output')
plt.legend()
plt.show()

8.3.2. Step 2: Polynomial baseline#

Let’s randomly sample the target function and fit a simple polynomial to get a feel for how well a standard regression model captures the data.

# Generate sample points from the target function
num_initial_points = 
initial_points = np.random.uniform(-5, 5, num_initial_points)
initial_values = np.vectorize(target_function)(initial_points)

# Plot the sample points
plt.figure(figsize=(5, 4))
plt.plot(x_values, y_values, 'r-', alpha=0.5, label='Target Function')
plt.scatter(initial_points, initial_values, color='blue', marker='o', label='Initial Samples')
plt.xlabel('Input')
plt.ylabel('Output')
plt.legend()
plt.show()
Code hint: Try `num_initial_points = 10`
# Perform a polynomial fit
degree =   # Adjust the degree of the polynomial fit
coefficients = np.polyfit(initial_points, initial_values, degree)
poly_fit = np.poly1d(coefficients)

# Calculate R^2
y_pred = poly_fit(initial_points)
r_squared = r2_score(initial_values, y_pred)

# Plot the sample points and polynomial fit
plt.figure(figsize=(5, 4))
plt.plot(x_values, y_values, 'r-', alpha=0.5, label='Target Function')
plt.scatter(initial_points, initial_values, color='blue', marker='o', label='Initial Samples')
plt.plot(x_values, poly_fit(x_values), 'g--', label=f'Polynomial Fit (degree {degree})\n$R^2 = {r_squared:.4f}$')
plt.xlabel('Input')
plt.ylabel('Output')
plt.legend()
plt.show()
🐢 Take a beat: Adjust the degree of the polynomial to see how good the fit is. Start with `degree = 2` and gradually increase it.
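
If you prefer to scan degrees systematically rather than editing the cell each time, a quick loop like the sketch below (reusing the arrays defined above) prints the R² for a range of degrees.

# Optional sketch: scan polynomial degrees and report R^2 on the sampled points
for d in range(1, 9):
    fit = np.poly1d(np.polyfit(initial_points, initial_values, d))
    print(f"degree {d}: R^2 = {r2_score(initial_values, fit(initial_points)):.3f}")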

8.3.3. Step 3: Gaussian Process#

Now we can move to Bayesian Optimisation with a Gaussian Process model. The optimisation progress is visualised by plotting the target function, optimisation steps, and a colourbar indicating the step number.

⏱️ This may take a minute to run. Reverend Bayes makes computers work hard!
# Optimise the target function using Bayesian Optimisation
result = gp_minimize(target_function, [(-5.0, 5.0)], n_calls=50, random_state=42)

# Perform random sampling for comparison
random_result = dummy_minimize(target_function, [(-5.0, 5.0)], n_calls=50, random_state=42)

# Plot the Gaussian Process model after optimisation
x_gp = np.array(result.x_iters).reshape(-1, 1)
y_gp = result.func_vals

# Plot the target function
plt.figure(figsize=(5, 4))
plt.plot(x_values, y_values, 'r-', alpha=0.5, label='Target function')

# Plot the optimisation steps with a colormap
plt.scatter(x_gp, y_gp, c=range(len(x_gp)), cmap='viridis', marker='o', label='Step number')

# Add colorbar to indicate the progress
cbar = plt.colorbar()
cbar.set_label('Step number')

plt.title('BO: Gaussian Process Model')
plt.xlabel('Input')
plt.ylabel('Output')
plt.legend()
plt.show()

We can use plot_gaussian_process from scikit-optimize to visualise the fitted surrogate, showing its mean, confidence intervals, and the sampled observations.

from skopt.plots import plot_gaussian_process as plot_gp

# Plot the Gaussian Process model with confidence intervals
plt.figure(figsize=(5, 4))
plot_gp(result)

# Add the target function for reference
plt.plot(x_values, y_values, 'r-', alpha=0.25, label='Target function')

plt.title('Confidence Intervals')
plt.xlabel('Input')
plt.ylabel('Output')
plt.legend()
plt.show()

The plot shows the GP mean (dashed green), confidence intervals (shaded), and sampled observations (red). The target function (light red) is also overlaid. The confidence region narrows where more observations exist and widens in unexplored areas, reflecting uncertainty in the GP model.
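
You can also query the fitted surrogate directly. The sketch below assumes the `result` object from the optimisation cell above and relies on `gp_minimize` storing its fitted models in `result.models`; the models work in skopt's internal coordinates, hence the `result.space.transform` call.

# Sketch: query the final surrogate for its predicted uncertainty at two points
gp_model = result.models[-1]                 # last fitted GP from gp_minimize
x_test = [[-4.9], [result.x[0]]]             # an edge point and the best point found
x_trans = result.space.transform(x_test)     # map to the model's internal space
mean, std = gp_model.predict(x_trans, return_std=True)
for point, s in zip(x_test, std):
    print(f"x = {point[0]:+.2f} -> predicted std = {s:.3f}")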

We should always have a benchmark to compare our model to. This block extracts the best results from BO and random sampling, then compares and visualises their performance over optimisation steps.

# Extract the cumulative minimum values
bo_min_values = np.minimum.accumulate(result.func_vals)
random_min_values = np.minimum.accumulate(random_result.func_vals)

# Plot the cumulative minimum values vs steps for both methods
plt.figure(figsize=(5, 4))
plt.plot(range(1, len(bo_min_values) + 1), bo_min_values, 'o-', label='Bayesian Optimisation')
plt.plot(range(1, len(random_min_values) + 1), random_min_values, 'x-', label='Random Sampling')

plt.title('Does BO Beat Random Sampling?')
plt.xlabel('Step')
plt.ylabel('Cumulative Minimum Value')
plt.legend()
plt.show()

BO (blue) converges faster to a lower minimum value. Random sampling (orange) only improves when it happens to land on a better point, and it tends to plateau. This highlights BO’s advantage of structured search over purely random exploration.
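
For a quick numerical check, both result objects store the best value and location found:

# Numeric comparison of the best values found by each strategy
print(f"BO best value:     {result.fun:.3f} at x = {result.x[0]:.3f}")
print(f"Random best value: {random_result.fun:.3f} at x = {random_result.x[0]:.3f}")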

8.4. 🚨 Exercise 8#

💡 Coding exercises: The exercises are designed to apply what you have learned with room for creativity. It is fine to discuss solutions with your classmates, but the actual code should not be directly copied.

8.4.1. Your details#

import numpy as np

# Insert your values
Name = "No Name" # Replace with your name
CID = 123446 # Replace with your College ID (as a numeric value with no leading 0s)

# Set a random seed using the CID value
CID = int(CID)
np.random.seed(CID)

# Print the message
print("This is the work of " + Name + " [CID: " + str(CID) + "]")

8.4.2. Problem#

The Department of Materials has purchased a new automated thin-film deposition system. The machine has two dials that provide a 2D parameter space (x, y) for materials processing. We can define a (hypothetical) target loss function for optimising the transition temperature of our candidate thin-film superconductors as:

# Target function for materials processing with x and y "dials"
def supermat(inputs):
    x, y = inputs
    a, b = 2, 5.1 / (2 * np.pi**2)
    c = 3 / np.pi
    r, s, t = 4, 10, 1 / (8 * np.pi)

    term1 = a * (y - b * x**2 + c * x - r)**2
    term2 = s * (1 - t) * np.cos(x)
    term3 = s

    return term1 + term2 + term3

# Example usage:
dials = [2.0, 3.0]
result = supermat(dials)
print(f"Experiment by setting dials to ({dials[0]}, {dials[1]}): {result}")

The tasks will be provided in class.

#Empty block for your answers
#Empty block for your answers
Task hint: Remember to first define the target function and then call it using `gp_minimize()`
📓 Submission: When your notebook is complete in Google Colab, go to File > Download and choose .ipynb. The completed file should be uploaded to Blackboard under assignments for MATE70026.

8.5. 🌊 Dive deeper#