.. default-role:: math Tutorial ======== Function Generation ------------------- To generate all functions at a given complexity (here complexity 5), one simply needs to run the following. .. code-block:: python import esr.generation.duplicate_checker runname = 'core_maths' comp = 5 esr.generation.duplicate_checker.main(runname, comp) In ``esr.generation.duplicate_checker`` we have predefined a few sets of functions which we believe would be useful. However, one simply needs to add another option to the start of that script to define a new run: .. code-block:: python if runname == 'keep_duplicates': basis_functions = [["x", "a"], # type0 ["square", "exp", "inv", "sqrt_abs", "log_abs"], # type1 ["+", "*", "-", "/", "pow"]] # type2 elif runname == 'core_maths': basis_functions = [["x", "a"], # type0 ["inv"], # type1 ["+", "*", "-", "/", "pow"]] # type2 elif runname == 'ext_maths': basis_functions = [["x", "a"], # type0 ["inv", "sqrt_abs", "square", "exp"], # type1 ["+", "*", "-", "/", "pow"]] # type2 elif runname == 'osc_maths': basis_functions = [["x", "a"], # type0 ["inv", "sin"], # type1 ["+", "*", "-", "/", "pow"]] # type2 where type 0, 1 and 2 functions are nullary, unary, and binary, respectively. Fitting to a dataset -------------------- Suppose we have already generated the equations required for the ``CCLikelihood`` class. In the following we show the steps that are required to fit the complexity 5 functions to these data. The various ``fitting'' functions run rely on the output of the previous script, so the order cannot change. .. code-block:: python import esr.fitting.test_all import esr.fitting.test_all_Fisher import esr.fitting.match import esr.fitting.combine_DL import esr.fitting.plot from esr.fitting.likelihood import CCLikelihood comp = 5 likelihood = CCLikelihood() esr.fitting.test_all.main(comp, likelihood) esr.fitting.test_all_Fisher.main(comp, likelihood) esr.fitting.match.main(comp, likelihood) esr.fitting.combine_DL.main(comp, likelihood) esr.fitting.plot.main(comp, likelihood) Once you have run this more many complexities, you can plot the pareto front and save it to file using the following function. .. code-block:: python import esr.plotting.plot esr.plotting.plot.pareto_plot(likelihood.out_dir, 'pareto.png') Fitting a single function ------------------------- One may wish to fit a single function to your data, instead of the full library produced during the function generation step. For example, suppose we wish to fit a `\Lambda` CDM expansion history to the cosmic chronometer data. This function is `y(x) = \theta_0 + \theta_1 x^3` which can be represented as the tree `[+, \theta_0, \times, \theta_1, {\rm pow}, x, 3]` although we will rewrite `\theta_0` as ``a0``, `\theta_1` as ``a1`` and `\times` as ``*``. The following script initially loads the cosmic chronometer data then fits the function to this data, returning the negative log-likelihood and the description length. .. code-block:: python from esr.fitting.fit_single import single_function from esr.fitting.likelihood import CCLikelihood cc_like = CCLikelihood() labels = ["+", "a0", "*", "a1", "pow", "x", "3"] basis_functions = [["x", "a"], # type0 ["inv"], # type1 ["+", "*", "-", "/", "pow"]] # type2 logl_lcdm_cc, dl_lcdm_cc = single_function(labels, basis_functions, cc_like, verbose=True) One can also fit the function directly from writing it as a string. This will convert the string to a list of labels, which are also returned. Note that this conversion if not guaranteed to produce the tree representation with the shortest description length, but does provide an upper limit on the DL of a function. .. code-block:: python from esr.fitting.fit_single import fit_from_string from esr.fitting.likelihood import CCLikelihood cc_like = CCLikelihood() basis_functions = [["x", "a"], # type0 ["inv"], # type1 ["+", "*", "-", "/", "pow"]] # type2 logl_lcdm_cc, dl_lcdm_cc, labels = fit_from_string("a0 + a1 * x ** 3", basis_functions, cc_like, verbose=True) Custom Likelihoods ------------------ To fit a function to your own data, one must create an alternative likelihood using the parent class ``esr.fitting.likelihood.Likelihood``. In the ``__init__()`` for this likelihood, you must define ``xvar``, ``yvar`` and ``yerr`` (the x, y and error on y variables) and a function ``negloglike(self, a, eq_numpy, **kwargs)`` which returns the negative log-likelihood. For example, a Gaussian likelihood can be defined as .. code-block:: python from esr.fitting.likelihood import Likelihood import numpy as np import os class GaussLikelihood(Likelihood): """Likelihood class used to fit a function directly using a Gaussian likelihood Args: :data_file (str): Name of the file containing the data to use :run_name (str): The name to be associated with this likelihood, e.g. 'my_esr_run' :data_dir (str, default=None): The path containing the data and cov files :fn_set (str, default='core_maths'): The name of the function set to use with the likelihood. Must match one of those defined in ``generation.duplicate_checker`` """ def __init__(self, data_file, run_name, data_dir=None, fn_set='core_maths'): super().__init__(data_file, data_file, run_name, data_dir=data_dir, fn_set=fn_set) self.ylabel = r'$y$' # for plotting self.xvar, self.yvar, self.yerr = np.loadtxt(self.data_file, unpack=True) def negloglike(self, a, eq_numpy, **kwargs): """Negative log-likelihood for a given function. Args: :a (list): parameters to subsitute into equation considered :eq_numpy (numpy function): function to use which gives y Returns: :nll (float): - log(likelihood) for this function and parameters """ ypred = self.get_pred(self.xvar, np.atleast_1d(a), eq_numpy) if not np.all(np.isreal(ypred)): return np.inf nll = np.sum(0.5 * (ypred - self.yvar) ** 2 / self.yerr ** 2 + 0.5 * np.log(2 * np.pi) + np.log(self.yerr)) if np.isnan(nll): return np.inf return nll although note that this is already included as ``esr.fitting.likelihood.GaussLikelihood``. If you want to use a different set of functions than ``core_maths`` then this can be passed using the ``fn_set`` argument. We can then combine the above code with the below to fit a mock dataset .. code-block:: python import esr.fitting.test_all import esr.fitting.test_all_Fisher import esr.fitting.match import esr.fitting.combine_DL import esr.fitting.plot np.random.seed(123) x = np.random.uniform(0.1, 5, 100) y = 0.5 * x ** 2 yerr = np.full(x.shape, 1.0) y = y + yerr * np.random.normal(size=len(x)) np.savetxt('data.txt', np.array([x, y, yerr]).T) likelihood = GaussLikelihood('data.txt', 'gauss_example', data_dir=os.getcwd()) comp = 5 esr.fitting.test_all.main(comp, likelihood) esr.fitting.test_all_Fisher.main(comp, likelihood) esr.fitting.match.main(comp, likelihood) esr.fitting.combine_DL.main(comp, likelihood) esr.fitting.plot.main(comp, likelihood) We also have a Poisson likelihood already implemented, which can be run as .. code-block:: python from esr.fitting.likelihood import PoissonLikelihood import numpy as np import os import esr.fitting.test_all import esr.fitting.test_all_Fisher import esr.fitting.match import esr.fitting.combine_DL import esr.fitting.plot np.random.seed(123) x = np.random.uniform(0.1, 5, 100) y = 0.5 * x ** 2 y = np.random.poisson(y) np.savetxt('data.txt', np.array([x, y]).T) likelihood = PoissonLikelihood('data.txt', 'poisson_example', data_dir=os.getcwd()) comp = 5 esr.fitting.test_all.main(comp, likelihood) esr.fitting.test_all_Fisher.main(comp, likelihood) esr.fitting.match.main(comp, likelihood) esr.fitting.combine_DL.main(comp, likelihood) esr.fitting.plot.main(comp, likelihood)