Bayesian Analysis: Tb2TiO7 (emcee), HEiDi¶
This tutorial demonstrates a practical two-stage workflow for single-crystal diffraction analysis with EasyDiffraction.
In the first stage, we run a fast local refinement to obtain a sensible point estimate and parameter uncertainties. In the second stage, we use these refined values to define fit bounds and then sample the posterior distribution with emcee.
The example uses constant-wavelength neutron single-crystal diffraction data for Tb2TiO7 measured on HEiDi at FRM II.
The goal is not only to obtain a good fit, but also to answer Bayesian questions such as:
- Which parameter values are most probable?
- How broad are the credible intervals?
- Which parameters are strongly correlated?
- How much uncertainty propagates into the calculated reflection intensities?
🛠️ Import Library¶
import easydiffraction as ed
📦 Define Project¶
The project object keeps structures, experiments, fit settings, and plotting utilities together in a single place. We will build the full workflow inside this object.
project = ed.Project(name='tbti_heidi_emcee')
project.save_as(dir_path='projects/ed_22_tbti_heidi_emcee')
Saving project 📦 'tbti_heidi_emcee' to '../../../projects/ed_22_tbti_heidi_emcee'
├── 📄 project.cif
├── 📁 structures/
├── 📁 experiments/
├── 📁 analysis/
│ └── 📄 analysis.cif
└── 📁 reports/
└── 📄 tbti_heidi_emcee.html
🧩 Define Structure¶
For this example we start from a CIF file describing the Tb2TiO7 pyrochlore structure. Loading the structure from CIF is convenient because it preserves a realistic starting model without rebuilding the full structure by hand.
structure_path = ed.download_data(id=20, destination='data')
Getting data...
Data #20: Tb2Ti2O7 (crystal structure)
✅ Data #20 already present at '../../../data/ed-20.cif'. Keeping existing.
project.structures.add_from_cif_path(structure_path)
structure = project.structures['tbti']
Render the structure to confirm the pyrochlore model loaded from CIF as expected before configuring the experiment.
project.display.structure(struct_name='tbti')
Structure 🧩 'tbti' (Atom view type: 'covalent')
wheel = zoom
right-drag = pan
🔬 Define Experiment¶
Next we download the measured reflection data, create a neutron single-crystal experiment, and configure the crystal link, wavelength, and extinction model.
data_path = ed.download_data(id=19, destination='data')
Getting data...
Data #19: Tb2Ti2O7, HEiDi (MLZ)
✅ Data #19 already present at '../../../data/ed-19.xye'. Keeping existing.
project.experiments.add_from_data_path(
name='heidi',
data_path=data_path,
sample_form='single crystal',
beam_mode='constant wavelength',
radiation_probe='neutron',
)
Data loaded successfully
Experiment 🔬 'heidi'. Number of data points: 220.
experiment = project.experiments['heidi']
Link the crystal structure to the experiment and set its scale factor.
experiment.linked_crystal.id = 'tbti'
experiment.linked_crystal.scale = 1.0
Set the instrument wavelength and starting extinction parameters. These values provide the initial experiment description for the local refinement.
experiment.instrument.setup_wavelength = 0.793
experiment.extinction.mosaicity = 35000
experiment.extinction.radius = 10
🚀 Initial Refinement¶
Before Bayesian sampling, it is useful to run a deterministic fit. This gives us:
- a good point estimate near the best-fit region,
- uncertainties from the local optimizer,
- a quick check that the model and experiment are configured sensibly.
In this tutorial we refine a small set of structural and extinction parameters while keeping occupancies fixed.
structure.atom_sites['O1'].fract_x.free = True
structure.atom_sites['Ti'].occupancy.free = False
structure.atom_sites['O1'].occupancy.free = False
structure.atom_sites['O2'].occupancy.free = False
structure.atom_sites['Tb'].adp_iso.free = True
structure.atom_sites['Ti'].adp_iso.free = True
structure.atom_sites['O1'].adp_iso.free = True
structure.atom_sites['O2'].adp_iso.free = True
experiment.linked_crystal.scale.free = True
experiment.extinction.radius.free = True
We keep using the default LMFIT Levenberg-Marquardt minimizer as a fast local optimizer. Its main purpose here is to provide a stable starting point and uncertainty estimates for the Bayesian run.
project.analysis.minimizer.show_supported()
Minimizer types
| Type | Description | ||
|---|---|---|---|
| 1 | bumps | BUMPS library using the default Levenberg-Marquardt method | |
| 2 | bumps (amoeba) | BUMPS library with Nelder-Mead simplex method | |
| 3 | bumps (de) | BUMPS library with differential evolution method | |
| 4 | bumps (dream) | BUMPS library with DREAM Bayesian sampling | |
| 5 | bumps (lm) | BUMPS library with Levenberg-Marquardt method | |
| 6 | dfols | DFO-LS library for derivative-free least-squares optimization | |
| 7 | emcee | emcee affine-invariant ensemble Bayesian sampling | |
| 8 | lmfit | LMFIT library using the default Levenberg-Marquardt method | |
| 9 | lmfit (least_squares) | LMFIT library with SciPy's trust region reflective algorithm | |
| 10 | * | lmfit (leastsq) | LMFIT library with Levenberg-Marquardt least squares method |
project.analysis.fit()
Standard fitting
📋 Using experiment 🔬 'heidi' for 'single' fitting
🚀 Starting fit process with 'lmfit (leastsq)'...
📈 Goodness-of-fit progress:
| iteration | time (s) | χ² | change / status | |
|---|---|---|---|---|
| 1 | 1 | 0.53 | 592.02 | |
| 2 | 11 | 1.32 | 191.19 | 67.7% ↓ |
| 3 | 19 | 1.95 | 36.84 | 80.7% ↓ |
| 4 | 29 | 2.63 | 18.99 | 48.4% ↓ |
| 5 | 37 | 3.26 | 12.74 | 32.9% ↓ |
| 6 | 62 | 5.13 | 12.71 |
🏆 Best goodness-of-fit (reduced χ²) is 12.71 at iteration 61
✅ Fitting complete.
Saving project 📦 'tbti_heidi_emcee' to '../../../projects/ed_22_tbti_heidi_emcee'
├── 📄 project.cif
├── 📁 structures/
│ └── 📄 tbti.cif
├── 📁 experiments/
│ └── 📄 heidi.cif
├── 📁 analysis/
│ └── 📄 analysis.cif
└── 📁 reports/
└── 📄 tbti_heidi_emcee.html
The fit-results display summarizes the locally refined values and their estimated uncertainties.
project.display.fit.results()
⚙️ Settings used:
| Name | Value | Description | |
|---|---|---|---|
| 1 | max_iterations | 1000 | Maximum solver iterations. |
📋 Least-squares fit results:
| Metric | Value | |
|---|---|---|
| 1 | 🧪 Minimizer | lmfit (leastsq) |
| 2 | ✅ Overall status | success |
| 3 | ⏱️ Fitting time (seconds) | 5.13 |
| 4 | 🔁 Iterations | 59 |
| 5 | 📏 Goodness-of-fit (reduced χ²) | 12.71 |
| 6 | 📏 R-factor (Rf, %) | 7.67 |
| 7 | 📏 R-factor squared (Rf², %) | 8.12 |
| 8 | 📏 Weighted R-factor (wR, %) | 8.52 |
📈 Refined parameters:
| datablock | category | entry | parameter | units | start | value | s.u. | change | |
|---|---|---|---|---|---|---|---|---|---|
| 1 | tbti | atom_site | Tb | adp_iso | Ų | 0.5300 | 0.5319 | 0.0190 | 0.36 % ↑ |
| 2 | tbti | atom_site | Ti | adp_iso | Ų | 0.4800 | 0.4776 | 0.0311 | 0.50 % ↓ |
| 3 | tbti | atom_site | O1 | fract_x | 0.3280 | 0.3280 | 0.0001 | 0.00 % ↓ | |
| 4 | tbti | atom_site | O1 | adp_iso | Ų | 0.4500 | 0.4504 | 0.0165 | 0.09 % ↑ |
| 5 | tbti | atom_site | O2 | adp_iso | Ų | 0.2300 | 0.2375 | 0.0259 | 3.25 % ↑ |
| 6 | heidi | extinction | radius | μm | 10.0000 | 26.4607 | 1.0003 | 164.61 % ↑ | |
| 7 | heidi | linked_crystal | scale | 1.0000 | 2.9236 | 0.0488 | 192.36 % ↑ |
• value = refined value from least-squares minimization
• s.u. = standard uncertainty (one sigma), from the covariance matrix
• change = relative change from start, in %; ↑ = increase, ↓ = decrease
The correlation plot shows how strongly the refined parameters move together in the local refinement. The measured-vs-calculated plot shows how well the refined crystal model reproduces the measured reflection intensities.
project.display.fit.correlations()
project.display.pattern(expt_name='heidi')
🎲 Prepare Sampling¶
Bayesian samplers require finite bounds for the free parameters. Instead of setting them manually, we derive them from the uncertainties estimated in the local refinement.
The helper method set_fit_bounds_from_uncertainty centers the bounds
on the current parameter value and expands them by a chosen multiple of
the reported uncertainty.
The default multiplier is 4. In this single-crystal tutorial we use
a tighter value of 1.5 to keep the sampling window closer to the
locally refined solution.
Show unset fit bounds before setting them from the local refinement uncertainties.
project.display.parameters.free()
Free parameters for both structures (🧩 data blocks) and experiments (🔬 data blocks)
| datablock | category | entry | parameter | value | uncertainty | min | max | units | |
|---|---|---|---|---|---|---|---|---|---|
| 1 | tbti | atom_site | Tb | adp_iso | 0.53188 | 0.01900 | -inf | inf | Ų |
| 2 | tbti | atom_site | Ti | adp_iso | 0.47759 | 0.03113 | -inf | inf | Ų |
| 3 | tbti | atom_site | O1 | fract_x | 0.32803 | 0.00009 | -inf | inf | |
| 4 | tbti | atom_site | O1 | adp_iso | 0.45043 | 0.01652 | -inf | inf | Ų |
| 5 | tbti | atom_site | O2 | adp_iso | 0.23747 | 0.02588 | -inf | inf | Ų |
| 6 | heidi | extinction | radius | 26.46071 | 1.00034 | -inf | inf | μm | |
| 7 | heidi | linked_crystal | scale | 2.92356 | 0.04884 | -inf | inf |
Set fit bounds for all free parameters using multiplier=1.5. In this
tutorial that means the posterior pair plot will later refer to a
±1.5 × uncertainty region in its title. To widen the sampling window,
increase the multiplier explicitly.
for param in project.free_parameters:
param.set_fit_bounds_from_uncertainty(multiplier=1.5)
Displaying the free parameters again is a convenient way to confirm that the fit bounds have been assigned as expected before launching the sampler.
project.display.parameters.free()
Free parameters for both structures (🧩 data blocks) and experiments (🔬 data blocks)
| datablock | category | entry | parameter | value | uncertainty | min | max | units | |
|---|---|---|---|---|---|---|---|---|---|
| 1 | tbti | atom_site | Tb | adp_iso | 0.53188 | 0.01900 | 0.50338 | 0.56038 | Ų |
| 2 | tbti | atom_site | Ti | adp_iso | 0.47759 | 0.03113 | 0.43090 | 0.52428 | Ų |
| 3 | tbti | atom_site | O1 | fract_x | 0.32803 | 0.00009 | 0.32789 | 0.32817 | |
| 4 | tbti | atom_site | O1 | adp_iso | 0.45043 | 0.01652 | 0.42565 | 0.47520 | Ų |
| 5 | tbti | atom_site | O2 | adp_iso | 0.23747 | 0.02588 | 0.19866 | 0.27629 | Ų |
| 6 | heidi | extinction | radius | 26.46071 | 1.00034 | 24.96020 | 27.96123 | μm | |
| 7 | heidi | linked_crystal | scale | 2.92356 | 0.04884 | 2.85030 | 2.99682 |
🎲 Run Sampling¶
We now switch from the local minimizer to the Bayesian emcee sampler.
The settings below are intentionally small so the tutorial runs quickly. For production analysis you would usually increase the number of steps and often the burn-in as well. emcee also lets you tune how walkers are initialized, how many walkers are used, and which proposal move drives the ensemble.
The burn setting is auto-resolved when left unset. Here we override
steps with a smaller value to keep the tutorial fast, and the
effective burn-in is recomputed automatically.
project.analysis.minimizer.show_supported()
Minimizer types
| Type | Description | ||
|---|---|---|---|
| 1 | bumps | BUMPS library using the default Levenberg-Marquardt method | |
| 2 | bumps (amoeba) | BUMPS library with Nelder-Mead simplex method | |
| 3 | bumps (de) | BUMPS library with differential evolution method | |
| 4 | bumps (dream) | BUMPS library with DREAM Bayesian sampling | |
| 5 | bumps (lm) | BUMPS library with Levenberg-Marquardt method | |
| 6 | dfols | DFO-LS library for derivative-free least-squares optimization | |
| 7 | emcee | emcee affine-invariant ensemble Bayesian sampling | |
| 8 | lmfit | LMFIT library using the default Levenberg-Marquardt method | |
| 9 | lmfit (least_squares) | LMFIT library with SciPy's trust region reflective algorithm | |
| 10 | * | lmfit (leastsq) | LMFIT library with Levenberg-Marquardt least squares method |
project.analysis.minimizer.type = 'emcee'
⚠️ Switching minimizer type removes these settings: • max_iterations
⚠️ Switching minimizer type adds these settings with defaults: • burn_in_steps=1000 • initialization_method='ball' • parallel_workers=0 • population_size=32 • proposal_moves='de' • random_seed=None • sampling_steps=5000 • thinning_interval=1
Current minimizer changed to
emcee
project.analysis.minimizer.sampling_steps = 500 # lower than the default 3000
project.analysis.minimizer.burn_in_steps = 100 # lower than the default 600
project.analysis.minimizer.population_size = 16 # lower than the default 32
project.analysis.minimizer.random_seed = 42 # fixed seed for reproducible output
project.analysis.fit()
Standard fitting
📋 Using experiment 🔬 'heidi' for 'single' fitting
🚀 Starting fit process with 'emcee'...
📈 Bayesian sampling progress:
| iteration | progress | time (s) | log posterior | phase | |
|---|---|---|---|---|---|
| 1 | 1/601 | 1.30 | -1354.04 | pre-processing | |
| 2 | 34/601 | 5.7% | 19.10 | -1354.16 | burn-in |
| 3 | 67/601 | 11.1% | 38.25 | -1354.73 | burn-in |
| 4 | 100/601 | 16.6% | 56.55 | -1354.54 | burn-in |
| 5 | 101/601 | 16.8% | 57.12 | -1354.54 | sampling |
| 6 | 126/601 | 21.0% | 70.58 | -1354.42 | sampling |
| 7 | 151/601 | 25.1% | 84.46 | -1354.69 | sampling |
| 8 | 176/601 | 29.3% | 96.33 | -1354.90 | sampling |
| 9 | 201/601 | 33.4% | 107.93 | -1354.61 | sampling |
| 10 | 226/601 | 37.6% | 119.71 | -1354.43 | sampling |
| 11 | 251/601 | 41.8% | 129.40 | -1355.11 | sampling |
| 12 | 276/601 | 45.9% | 138.42 | -1354.95 | sampling |
| 13 | 301/601 | 50.1% | 147.41 | -1354.54 | sampling |
| 14 | 326/601 | 54.2% | 156.41 | -1354.54 | sampling |
| 15 | 351/601 | 58.4% | 166.63 | -1355.46 | sampling |
| 16 | 376/601 | 62.6% | 178.10 | -1354.90 | sampling |
| 17 | 401/601 | 66.7% | 189.60 | -1354.90 | sampling |
| 18 | 426/601 | 70.9% | 201.12 | -1354.62 | sampling |
| 19 | 451/601 | 75.0% | 212.90 | -1354.82 | sampling |
| 20 | 476/601 | 79.2% | 224.47 | -1354.82 | sampling |
| 21 | 501/601 | 83.4% | 234.67 | -1354.77 | sampling |
| 22 | 526/601 | 87.5% | 243.77 | -1355.31 | sampling |
| 23 | 551/601 | 91.7% | 252.77 | -1354.51 | sampling |
| 24 | 576/601 | 95.8% | 261.77 | -1355.19 | sampling |
| 25 | 601/601 | 100.0% | 270.75 | -1355.37 | sampling |
| 26 | 306.56 | post-processing |
✅ Bayesian sampling complete.
⚠️ Convergence diagnostics indicate the posterior may be poorly mixed.
Saving project 📦 'tbti_heidi_emcee' to '../../../projects/ed_22_tbti_heidi_emcee'
├── 📄 project.cif
├── 📁 structures/
│ └── 📄 tbti.cif
├── 📁 experiments/
│ └── 📄 heidi.cif
├── 📁 analysis/
│ ├── 📄 analysis.cif
│ └── 📄 results.h5
└── 📁 reports/
└── 📄 tbti_heidi_emcee.html
📊 Inspect Results¶
The fit-results display now includes sampler settings, convergence diagnostics, committed parameter values, and posterior summary statistics.
project.display.fit.results()
⚙️ Settings used:
| Name | Value | Description | |
|---|---|---|---|
| 1 | sampling_steps | 500 | Total sampler iterations per chain. |
| 2 | burn_in_steps | 100 | Sampler iterations discarded as warm-up. |
| 3 | thinning_interval | 1 | Sampler thinning interval. |
| 4 | population_size | 16 | Number of chains or walkers. |
| 5 | parallel_workers | 0 | Worker count; 0 uses all available CPUs. |
| 6 | initialization_method | ball | emcee walker initialization method. |
| 7 | random_seed | 42 | Random seed; None uses a system-derived seed. |
| 8 | proposal_moves | de | Single emcee proposal move; move mixtures are not persisted in v1. |
📋 Bayesian fit results:
| Metric | Value | |
|---|---|---|
| 1 | 🧪 Sampler | emcee |
| 2 | ❌ Overall status | failed |
| 3 | 💬 Engine message | emcee sampling completed |
| 4 | ⏱️ Fitting time (seconds) | 306.56 |
| 5 | 📏 Goodness-of-fit (reduced χ²) | 12.72 |
| 6 | 📏 R-factor (Rf, %) | 7.66 |
| 7 | 📏 R-factor squared (Rf², %) | 8.12 |
| 8 | 📏 Weighted R-factor (wR, %) | 8.53 |
| 9 | 📉 Best log-posterior | -1354.19 |
| 10 | 📊 Convergence status | failed |
| 11 | 📊 Max r-hat | 1.071 |
| 12 | 📊 Min ess bulk | 365.730 |
| 13 | 📊 Draws per chain | 501 |
| 14 | 📊 Chains | 16 |
📈 Committed parameters:
| datablock | category | entry | parameter | units | start | value | s.u. | change | |
|---|---|---|---|---|---|---|---|---|---|
| 1 | tbti | atom_site | Tb | adp_iso | Ų | 0.5319 | 0.5323 | 0.0053 | 0.08 % ↑ |
| 2 | tbti | atom_site | Ti | adp_iso | Ų | 0.4776 | 0.4771 | 0.0083 | 0.10 % ↓ |
| 3 | tbti | atom_site | O1 | fract_x | 0.3280 | 0.3280 | 0.0000 | 0.00 % ↑ | |
| 4 | tbti | atom_site | O1 | adp_iso | Ų | 0.4504 | 0.4514 | 0.0046 | 0.21 % ↑ |
| 5 | tbti | atom_site | O2 | adp_iso | Ų | 0.2375 | 0.2383 | 0.0069 | 0.35 % ↑ |
| 6 | heidi | extinction | radius | μm | 26.4607 | 26.4197 | 0.2839 | 0.15 % ↓ | |
| 7 | heidi | linked_crystal | scale | 2.9236 | 2.9260 | 0.0141 | 0.08 % ↑ |
• value = estimate written back to the project (best posterior sample)
• s.u. = standard uncertainty (one sigma), posterior standard deviation
• change = relative change from start, in %; ↑ = increase, ↓ = decrease
📊 Posterior distribution:
| datablock | category | entry | parameter | units | median | 95% CI | r-hat | ess bulk | |
|---|---|---|---|---|---|---|---|---|---|
| 1 | tbti | atom_site | Tb | adp_iso | Ų | 0.5324 | [0.5219, 0.5430] | 1.033 | 366.9 |
| 2 | tbti | atom_site | Ti | adp_iso | Ų | 0.4767 | [0.4610, 0.4935] | 1.038 | 391.7 |
| 3 | tbti | atom_site | O1 | fract_x | 0.3280 | [0.3280, 0.3281] | 1.071 | 378.6 | |
| 4 | tbti | atom_site | O1 | adp_iso | Ų | 0.4504 | [0.4412, 0.4592] | 1.026 | 365.7 |
| 5 | tbti | atom_site | O2 | adp_iso | Ų | 0.2376 | [0.2248, 0.2512] | 1.044 | 467.8 |
| 6 | heidi | extinction | radius | μm | 26.4760 | [25.9298, 27.0930] | 1.040 | 412.5 | |
| 7 | heidi | linked_crystal | scale | 2.9241 | [2.8981, 2.9521] | 1.026 | 383.7 |
• 95% CI = 95% credible interval (2.5%-97.5%, asymmetric)
• r-hat = Gelman-Rubin diagnostic (good convergence: r-hat <= 1.01)
• ess bulk = bulk effective sample size (typically >= 400)
⚠️ ess bulk < 400: Consider longer sampling or reparameterization.
The correlation and posterior-pair plots are complementary:
plot_param_correlationssummarizes pairwise structure in a compact matrix.plot_posterior_pairsshows marginal densities on the diagonal and posterior contours off-diagonal. In this tutorial its title also reminds you that the display region follows the±1.5 × uncertaintybounds defined above, while numeric subplot ranges are omitted to keep the grid readable.
project.display.fit.correlations()
project.display.posterior.pairs()
The one-dimensional posterior distributions below make it easier to inspect individual parameters in isolation, including asymmetry or multimodality.
project.display.posterior.distribution()
Finally, the posterior predictive plot propagates the sampled parameter uncertainty into the calculated single-crystal reflection intensities.
project.display.posterior.predictive(expt_name='heidi')
❌ PlotlyPlotter._get_diagonal_shape() missing 2 required positional arguments: 'minimum' and 'maximum'