Bayesian Analysis: Tb2TiO7 (`emcee`), HEiDi¶

This tutorial demonstrates a practical two-stage workflow for single-crystal diffraction analysis with EasyDiffraction.

In the first stage, we run a fast local refinement to obtain a sensible point estimate and parameter uncertainties. In the second stage, we use these refined values to define fit bounds and then sample the posterior distribution with emcee.

The example uses constant-wavelength neutron single-crystal diffraction data for Tb2TiO7 measured on HEiDi at FRM II.

The goal is not only to obtain a good fit, but also to answer Bayesian questions such as:

Which parameter values are most probable?
How broad are the credible intervals?
Which parameters are strongly correlated?
How much uncertainty propagates into the calculated reflection intensities?

🛠️ Import Library¶

In [2]:

Copied!

import easydiffraction as ed
import easydiffraction as ed

📦 Define Project¶

The project object keeps structures, experiments, fit settings, and plotting utilities together in a single place. We will build the full workflow inside this object.

In [3]:

Copied!

project = ed.Project(name='tbti_heidi_emcee')
project = ed.Project(name='tbti_heidi_emcee')

In [4]:

Copied!

project.save_as(dir_path='projects/ed_22_tbti_heidi_emcee')
project.save_as(dir_path='projects/ed_22_tbti_heidi_emcee')

Saving project 📦 'tbti_heidi_emcee' to '../../../projects/ed_22_tbti_heidi_emcee'

├── 📄 project.cif

├── 📁 structures/

├── 📁 experiments/

├── 📁 analysis/

│   └── 📄 analysis.cif

└── 📁 reports/

    └── 📄 tbti_heidi_emcee.html

🧩 Define Structure¶

For this example we start from a CIF file describing the Tb2TiO7 pyrochlore structure. Loading the structure from CIF is convenient because it preserves a realistic starting model without rebuilding the full structure by hand.

In [5]:

Copied!

structure_path = ed.download_data(id=20, destination='data')
structure_path = ed.download_data(id=20, destination='data')

Getting data...

Data #20: Tb2Ti2O7 (crystal structure)

✅ Data #20 already present at '../../../data/ed-20.cif'. Keeping existing.

In [6]:

Copied!

project.structures.add_from_cif_path(structure_path)
project.structures.add_from_cif_path(structure_path)

In [7]:

Copied!

structure = project.structures['tbti']
structure = project.structures['tbti']

Render the structure to confirm the pyrochlore model loaded from CIF as expected before configuring the experiment.

In [8]:

Copied!

project.display.structure(struct_name='tbti')
project.display.structure(struct_name='tbti')

Structure 🧩 'tbti' (Atom view type: 'covalent')

Loading plot…

drag = rotate
wheel = zoom
right-drag = pan

🔬 Define Experiment¶

Next we download the measured reflection data, create a neutron single-crystal experiment, and configure the crystal link, wavelength, and extinction model.

In [9]:

Copied!

data_path = ed.download_data(id=19, destination='data')
data_path = ed.download_data(id=19, destination='data')

Getting data...

Data #19: Tb2Ti2O7, HEiDi (MLZ)

✅ Data #19 already present at '../../../data/ed-19.xye'. Keeping existing.

In [10]:

Copied!





project.experiments.add_from_data_path(
    name='heidi',
    data_path=data_path,
    sample_form='single crystal',
    beam_mode='constant wavelength',
    radiation_probe='neutron',
)
project.experiments.add_from_data_path(
    name='heidi',
    data_path=data_path,
    sample_form='single crystal',
    beam_mode='constant wavelength',
    radiation_probe='neutron',
)

Data loaded successfully

Experiment 🔬 'heidi'. Number of data points: 220.

In [11]:

Copied!

experiment = project.experiments['heidi']
experiment = project.experiments['heidi']

Link the crystal structure to the experiment and set its scale factor.

In [12]:

Copied!

experiment.linked_crystal.id = 'tbti'
experiment.linked_crystal.scale = 1.0
experiment.linked_crystal.id = 'tbti'
experiment.linked_crystal.scale = 1.0

Set the instrument wavelength and starting extinction parameters. These values provide the initial experiment description for the local refinement.

In [13]:

Copied!

experiment.instrument.setup_wavelength = 0.793
experiment.instrument.setup_wavelength = 0.793

In [14]:

Copied!

experiment.extinction.mosaicity = 35000
experiment.extinction.radius = 10
experiment.extinction.mosaicity = 35000
experiment.extinction.radius = 10

Before Bayesian sampling, it is useful to run a deterministic fit. This gives us:

a good point estimate near the best-fit region,
uncertainties from the local optimizer,
a quick check that the model and experiment are configured sensibly.

In this tutorial we refine a small set of structural and extinction parameters while keeping occupancies fixed.

In [15]:

Copied!





structure.atom_sites['O1'].fract_x.free = True

structure.atom_sites['Ti'].occupancy.free = False
structure.atom_sites['O1'].occupancy.free = False
structure.atom_sites['O2'].occupancy.free = False

structure.atom_sites['Tb'].adp_iso.free = True
structure.atom_sites['Ti'].adp_iso.free = True
structure.atom_sites['O1'].adp_iso.free = True
structure.atom_sites['O2'].adp_iso.free = True
structure.atom_sites['O1'].fract_x.free = True

structure.atom_sites['Ti'].occupancy.free = False
structure.atom_sites['O1'].occupancy.free = False
structure.atom_sites['O2'].occupancy.free = False

structure.atom_sites['Tb'].adp_iso.free = True
structure.atom_sites['Ti'].adp_iso.free = True
structure.atom_sites['O1'].adp_iso.free = True
structure.atom_sites['O2'].adp_iso.free = True

In [16]:

Copied!

experiment.linked_crystal.scale.free = True
experiment.extinction.radius.free = True
experiment.linked_crystal.scale.free = True
experiment.extinction.radius.free = True

We keep using the default LMFIT Levenberg-Marquardt minimizer as a fast local optimizer. Its main purpose here is to provide a stable starting point and uncertainty estimates for the Bayesian run.

In [17]:

Copied!

project.analysis.minimizer.show_supported()
project.analysis.minimizer.show_supported()

Minimizer types

		Type	Description
1		bumps	BUMPS library using the default Levenberg-Marquardt method
2		bumps (amoeba)	BUMPS library with Nelder-Mead simplex method
3		bumps (de)	BUMPS library with differential evolution method
4		bumps (dream)	BUMPS library with DREAM Bayesian sampling
5		bumps (lm)	BUMPS library with Levenberg-Marquardt method
6		dfols	DFO-LS library for derivative-free least-squares optimization
7		emcee	emcee affine-invariant ensemble Bayesian sampling
8		lmfit	LMFIT library using the default Levenberg-Marquardt method
9		lmfit (least_squares)	LMFIT library with SciPy's trust region reflective algorithm
10	*	lmfit (leastsq)	LMFIT library with Levenberg-Marquardt least squares method

In [18]:

Copied!

project.analysis.fit()
project.analysis.fit()

Standard fitting

📋 Using experiment 🔬 'heidi' for 'single' fitting

🚀 Starting fit process with 'lmfit (leastsq)'...

📈 Goodness-of-fit progress:

	iteration	time (s)	χ²	change / status
1	1	0.53	592.02
2	11	1.32	191.19	67.7% ↓
3	19	1.95	36.84	80.7% ↓
4	29	2.63	18.99	48.4% ↓
5	37	3.26	12.74	32.9% ↓
6	62	5.13	12.71

🏆 Best goodness-of-fit (reduced χ²) is 12.71 at iteration 61

✅ Fitting complete.

Saving project 📦 'tbti_heidi_emcee' to '../../../projects/ed_22_tbti_heidi_emcee'

├── 📄 project.cif

├── 📁 structures/

│   └── 📄 tbti.cif

├── 📁 experiments/

│   └── 📄 heidi.cif

├── 📁 analysis/

│   └── 📄 analysis.cif

└── 📁 reports/

    └── 📄 tbti_heidi_emcee.html

The fit-results display summarizes the locally refined values and their estimated uncertainties.

In [19]:

Copied!

project.display.fit.results()
project.display.fit.results()

⚙️ Settings used:

	Name	Value	Description
1	max_iterations	1000	Maximum solver iterations.

📋 Least-squares fit results:

	Metric	Value
1	🧪 Minimizer	lmfit (leastsq)
2	✅ Overall status	success
3	⏱️ Fitting time (seconds)	5.13
4	🔁 Iterations	59
5	📏 Goodness-of-fit (reduced χ²)	12.71
6	📏 R-factor (Rf, %)	7.67
7	📏 R-factor squared (Rf², %)	8.12
8	📏 Weighted R-factor (wR, %)	8.52

📈 Refined parameters:

	datablock	category	entry	parameter	units	start	value	s.u.	change
1	tbti	atom_site	Tb	adp_iso	Å²	0.5300	0.5319	0.0190	0.36 % ↑
2	tbti	atom_site	Ti	adp_iso	Å²	0.4800	0.4776	0.0311	0.50 % ↓
3	tbti	atom_site	O1	fract_x		0.3280	0.3280	0.0001	0.00 % ↓
4	tbti	atom_site	O1	adp_iso	Å²	0.4500	0.4504	0.0165	0.09 % ↑
5	tbti	atom_site	O2	adp_iso	Å²	0.2300	0.2375	0.0259	3.25 % ↑
6	heidi	extinction		radius	μm	10.0000	26.4607	1.0003	164.61 % ↑
7	heidi	linked_crystal		scale		1.0000	2.9236	0.0488	192.36 % ↑

  • start      = parameter value before refinement
  • value      = refined value from least-squares minimization
  • s.u.       = standard uncertainty (one sigma), from the covariance matrix
  • change     = relative change from start, in %; ↑ = increase, ↓ = decrease

The correlation plot shows how strongly the refined parameters move together in the local refinement. The measured-vs-calculated plot shows how well the refined crystal model reproduces the measured reflection intensities.

In [20]:

Copied!

project.display.fit.correlations()
project.display.fit.correlations()

Loading plot…

In [21]:

Copied!

project.display.pattern(expt_name='heidi')
project.display.pattern(expt_name='heidi')

Loading plot…

🎲 Prepare Sampling¶

Bayesian samplers require finite bounds for the free parameters. Instead of setting them manually, we derive them from the uncertainties estimated in the local refinement.

The helper method set_fit_bounds_from_uncertainty centers the bounds on the current parameter value and expands them by a chosen multiple of the reported uncertainty.

The default multiplier is 4. In this single-crystal tutorial we use a tighter value of 1.5 to keep the sampling window closer to the locally refined solution.

Show unset fit bounds before setting them from the local refinement uncertainties.

In [22]:

Copied!

project.display.parameters.free()
project.display.parameters.free()

Free parameters for both structures (🧩 data blocks) and experiments (🔬 data blocks)

	datablock	category	entry	parameter	value	uncertainty	min	max	units
1	tbti	atom_site	Tb	adp_iso	0.53188	0.01900	-inf	inf	Å²
2	tbti	atom_site	Ti	adp_iso	0.47759	0.03113	-inf	inf	Å²
3	tbti	atom_site	O1	fract_x	0.32803	0.00009	-inf	inf
4	tbti	atom_site	O1	adp_iso	0.45043	0.01652	-inf	inf	Å²
5	tbti	atom_site	O2	adp_iso	0.23747	0.02588	-inf	inf	Å²
6	heidi	extinction		radius	26.46071	1.00034	-inf	inf	μm
7	heidi	linked_crystal		scale	2.92356	0.04884	-inf	inf

Set fit bounds for all free parameters using multiplier=1.5. In this tutorial that means the posterior pair plot will later refer to a ±1.5 × uncertainty region in its title. To widen the sampling window, increase the multiplier explicitly.

In [23]:

Copied!

for param in project.free_parameters:
    param.set_fit_bounds_from_uncertainty(multiplier=1.5)
for param in project.free_parameters:
    param.set_fit_bounds_from_uncertainty(multiplier=1.5)

Displaying the free parameters again is a convenient way to confirm that the fit bounds have been assigned as expected before launching the sampler.

In [24]:

Copied!

project.display.parameters.free()
project.display.parameters.free()

Free parameters for both structures (🧩 data blocks) and experiments (🔬 data blocks)

	datablock	category	entry	parameter	value	uncertainty	min	max	units
1	tbti	atom_site	Tb	adp_iso	0.53188	0.01900	0.50338	0.56038	Å²
2	tbti	atom_site	Ti	adp_iso	0.47759	0.03113	0.43090	0.52428	Å²
3	tbti	atom_site	O1	fract_x	0.32803	0.00009	0.32789	0.32817
4	tbti	atom_site	O1	adp_iso	0.45043	0.01652	0.42565	0.47520	Å²
5	tbti	atom_site	O2	adp_iso	0.23747	0.02588	0.19866	0.27629	Å²
6	heidi	extinction		radius	26.46071	1.00034	24.96020	27.96123	μm
7	heidi	linked_crystal		scale	2.92356	0.04884	2.85030	2.99682

🎲 Run Sampling¶

We now switch from the local minimizer to the Bayesian emcee sampler.

The settings below are intentionally small so the tutorial runs quickly. For production analysis you would usually increase the number of steps and often the burn-in as well. emcee also lets you tune how walkers are initialized, how many walkers are used, and which proposal move drives the ensemble.

The burn setting is auto-resolved when left unset. Here we override steps with a smaller value to keep the tutorial fast, and the effective burn-in is recomputed automatically.

In [25]:

Copied!

project.analysis.minimizer.show_supported()
project.analysis.minimizer.show_supported()

Minimizer types

		Type	Description
1		bumps	BUMPS library using the default Levenberg-Marquardt method
2		bumps (amoeba)	BUMPS library with Nelder-Mead simplex method
3		bumps (de)	BUMPS library with differential evolution method
4		bumps (dream)	BUMPS library with DREAM Bayesian sampling
5		bumps (lm)	BUMPS library with Levenberg-Marquardt method
6		dfols	DFO-LS library for derivative-free least-squares optimization
7		emcee	emcee affine-invariant ensemble Bayesian sampling
8		lmfit	LMFIT library using the default Levenberg-Marquardt method
9		lmfit (least_squares)	LMFIT library with SciPy's trust region reflective algorithm
10	*	lmfit (leastsq)	LMFIT library with Levenberg-Marquardt least squares method

In [26]:

Copied!

project.analysis.minimizer.type = 'emcee'
project.analysis.minimizer.type = 'emcee'

⚠️ Switching minimizer type removes these settings:                                                                               
   • max_iterations

⚠️ Switching minimizer type adds these settings with defaults:                                                                    
   • burn_in_steps=1000                                                                                                           
   • initialization_method='ball'                                                                                                 
   • parallel_workers=0                                                                                                           
   • population_size=32                                                                                                           
   • proposal_moves='de'                                                                                                          
   • random_seed=None                                                                                                             
   • sampling_steps=5000                                                                                                          
   • thinning_interval=1

Current minimizer changed to

emcee

In [27]:

Copied!





project.analysis.minimizer.sampling_steps = 500  # lower than the default 3000
project.analysis.minimizer.burn_in_steps = 100  # lower than the default 600
project.analysis.minimizer.population_size = 16  # lower than the default 32
project.analysis.minimizer.random_seed = 42  # fixed seed for reproducible output
project.analysis.minimizer.sampling_steps = 500  # lower than the default 3000
project.analysis.minimizer.burn_in_steps = 100  # lower than the default 600
project.analysis.minimizer.population_size = 16  # lower than the default 32
project.analysis.minimizer.random_seed = 42  # fixed seed for reproducible output

In [28]:

Copied!

project.analysis.fit()
project.analysis.fit()

Standard fitting

📋 Using experiment 🔬 'heidi' for 'single' fitting

🚀 Starting fit process with 'emcee'...

📈 Bayesian sampling progress:

	iteration	progress	time (s)	log posterior	phase
1	1/601		1.30	-1354.04	pre-processing
2	34/601	5.7%	19.10	-1354.16	burn-in
3	67/601	11.1%	38.25	-1354.73	burn-in
4	100/601	16.6%	56.55	-1354.54	burn-in
5	101/601	16.8%	57.12	-1354.54	sampling
6	126/601	21.0%	70.58	-1354.42	sampling
7	151/601	25.1%	84.46	-1354.69	sampling
8	176/601	29.3%	96.33	-1354.90	sampling
9	201/601	33.4%	107.93	-1354.61	sampling
10	226/601	37.6%	119.71	-1354.43	sampling
11	251/601	41.8%	129.40	-1355.11	sampling
12	276/601	45.9%	138.42	-1354.95	sampling
13	301/601	50.1%	147.41	-1354.54	sampling
14	326/601	54.2%	156.41	-1354.54	sampling
15	351/601	58.4%	166.63	-1355.46	sampling
16	376/601	62.6%	178.10	-1354.90	sampling
17	401/601	66.7%	189.60	-1354.90	sampling
18	426/601	70.9%	201.12	-1354.62	sampling
19	451/601	75.0%	212.90	-1354.82	sampling
20	476/601	79.2%	224.47	-1354.82	sampling
21	501/601	83.4%	234.67	-1354.77	sampling
22	526/601	87.5%	243.77	-1355.31	sampling
23	551/601	91.7%	252.77	-1354.51	sampling
24	576/601	95.8%	261.77	-1355.19	sampling
25	601/601	100.0%	270.75	-1355.37	sampling
26			306.56		post-processing

✅ Bayesian sampling complete.

⚠️ Convergence diagnostics indicate the posterior may be poorly mixed.

Saving project 📦 'tbti_heidi_emcee' to '../../../projects/ed_22_tbti_heidi_emcee'

├── 📄 project.cif

├── 📁 structures/

│   └── 📄 tbti.cif

├── 📁 experiments/

│   └── 📄 heidi.cif

├── 📁 analysis/

│   ├── 📄 analysis.cif

│   └── 📄 results.h5

└── 📁 reports/

    └── 📄 tbti_heidi_emcee.html

📊 Inspect Results¶

The fit-results display now includes sampler settings, convergence diagnostics, committed parameter values, and posterior summary statistics.

In [29]:

Copied!

project.display.fit.results()
project.display.fit.results()

⚙️ Settings used:

	Name	Value	Description
1	sampling_steps	500	Total sampler iterations per chain.
2	burn_in_steps	100	Sampler iterations discarded as warm-up.
3	thinning_interval	1	Sampler thinning interval.
4	population_size	16	Number of chains or walkers.
5	parallel_workers	0	Worker count; 0 uses all available CPUs.
6	initialization_method	ball	emcee walker initialization method.
7	random_seed	42	Random seed; None uses a system-derived seed.
8	proposal_moves	de	Single emcee proposal move; move mixtures are not persisted in v1.

📋 Bayesian fit results:

	Metric	Value
1	🧪 Sampler	emcee
2	❌ Overall status	failed
3	💬 Engine message	emcee sampling completed
4	⏱️ Fitting time (seconds)	306.56
5	📏 Goodness-of-fit (reduced χ²)	12.72
6	📏 R-factor (Rf, %)	7.66
7	📏 R-factor squared (Rf², %)	8.12
8	📏 Weighted R-factor (wR, %)	8.53
9	📉 Best log-posterior	-1354.19
10	📊 Convergence status	failed
11	📊 Max r-hat	1.071
12	📊 Min ess bulk	365.730
13	📊 Draws per chain	501
14	📊 Chains	16

📈 Committed parameters:

	datablock	category	entry	parameter	units	start	value	s.u.	change
1	tbti	atom_site	Tb	adp_iso	Å²	0.5319	0.5323	0.0053	0.08 % ↑
2	tbti	atom_site	Ti	adp_iso	Å²	0.4776	0.4771	0.0083	0.10 % ↓
3	tbti	atom_site	O1	fract_x		0.3280	0.3280	0.0000	0.00 % ↑
4	tbti	atom_site	O1	adp_iso	Å²	0.4504	0.4514	0.0046	0.21 % ↑
5	tbti	atom_site	O2	adp_iso	Å²	0.2375	0.2383	0.0069	0.35 % ↑
6	heidi	extinction		radius	μm	26.4607	26.4197	0.2839	0.15 % ↓
7	heidi	linked_crystal		scale		2.9236	2.9260	0.0141	0.08 % ↑

  • start      = parameter value before sampling
  • value      = estimate written back to the project (best posterior sample)
  • s.u.       = standard uncertainty (one sigma), posterior standard deviation
  • change     = relative change from start, in %; ↑ = increase, ↓ = decrease

📊 Posterior distribution:

	datablock	category	entry	parameter	units	median	95% CI	r-hat	ess bulk
1	tbti	atom_site	Tb	adp_iso	Å²	0.5324	[0.5219, 0.5430]	1.033	366.9
2	tbti	atom_site	Ti	adp_iso	Å²	0.4767	[0.4610, 0.4935]	1.038	391.7
3	tbti	atom_site	O1	fract_x		0.3280	[0.3280, 0.3281]	1.071	378.6
4	tbti	atom_site	O1	adp_iso	Å²	0.4504	[0.4412, 0.4592]	1.026	365.7
5	tbti	atom_site	O2	adp_iso	Å²	0.2376	[0.2248, 0.2512]	1.044	467.8
6	heidi	extinction		radius	μm	26.4760	[25.9298, 27.0930]	1.040	412.5
7	heidi	linked_crystal		scale		2.9241	[2.8981, 2.9521]	1.026	383.7

  • median       = 50th percentile of the marginal posterior
  • 95% CI       = 95% credible interval (2.5%-97.5%, asymmetric)
  • r-hat        = Gelman-Rubin diagnostic (good convergence: r-hat <= 1.01)
  • ess bulk     = bulk effective sample size (typically >= 400)

⚠️ r-hat > 1.01: Consider longer sampling, better initialization, or reparameterization.
⚠️ ess bulk < 400: Consider longer sampling or reparameterization.

The correlation and posterior-pair plots are complementary:

plot_param_correlations summarizes pairwise structure in a compact matrix.
plot_posterior_pairs shows marginal densities on the diagonal and posterior contours off-diagonal. In this tutorial its title also reminds you that the display region follows the ±1.5 × uncertainty bounds defined above, while numeric subplot ranges are omitted to keep the grid readable.

In [30]:

Copied!

project.display.fit.correlations()
project.display.fit.correlations()

Loading plot…

In [31]:

Copied!

project.display.posterior.pairs()
project.display.posterior.pairs()

Loading plot…

The one-dimensional posterior distributions below make it easier to inspect individual parameters in isolation, including asymmetry or multimodality.

In [32]:

Copied!

project.display.posterior.distribution()
project.display.posterior.distribution()

Loading plot…

Finally, the posterior predictive plot propagates the sampled parameter uncertainty into the calculated single-crystal reflection intensities.

In [33]:

Copied!

project.display.posterior.predictive(expt_name='heidi')
project.display.posterior.predictive(expt_name='heidi')

❌ PlotlyPlotter._get_diagonal_shape() missing 2 required positional arguments: 'minimum' and 'maximum'

Bayesian Analysis: Tb2TiO7 (emcee), HEiDi¶

🛠️ Import Library¶

📦 Define Project¶

🧩 Define Structure¶

🔬 Define Experiment¶

🚀 Initial Refinement¶

🎲 Prepare Sampling¶

🎲 Run Sampling¶

📊 Inspect Results¶

Bayesian Analysis: Tb2TiO7 (`emcee`), HEiDi¶