qharv.sieve package

Submodules

qharv.sieve.mean_df module

qharv.sieve.mean_df.categorize_columns(cols)[source]

Categorize the column names of a mean dataframe.

Parameters

cols (list) – a list of column names

Returns

(excol, mcol, ecol)

excol are columns of exact values with no error bar (possibly labels), mcol are mean columns, and ecol are error columns.

Return type

(list, list, list)

Examples

>>> rcol, mcol, ecol = categorize_columns(mdf.columns)
>>> xyye(df, 'Pressure', 'LocalEnergy', xerr=True)
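
The categorization above can be illustrated with a minimal standalone sketch. It assumes that mean and error columns are identified by the suffixes `_mean` and `_error` (the naming used elsewhere on this page); the helper name `categorize_by_suffix` is hypothetical and this is not necessarily how qharv implements it.

```python
def categorize_by_suffix(cols, msuffix="_mean", esuffix="_error"):
    """Split column names into (exact, mean, error) lists by suffix.

    Hypothetical helper for illustration; the suffixes are assumptions.
    """
    cols = list(cols)
    mcol = [c for c in cols if c.endswith(msuffix)]
    ecol = [c for c in cols if c.endswith(esuffix)]
    # everything that is neither a mean nor an error column is "exact"
    excol = [c for c in cols if c not in mcol and c not in ecol]
    return excol, mcol, ecol


cols = ["rs", "series", "LocalEnergy_mean", "LocalEnergy_error"]
excol, mcol, ecol = categorize_by_suffix(cols)
print(excol, mcol, ecol)
```
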
qharv.sieve.mean_df.create(mydf)[source]
qharv.sieve.mean_df.dfme(df, cols, no_error=False, weight_name=None)[source]

Average scalar quantities over a set of calculations.

Parameters
  • df (pd.DataFrame) – a mean dataframe containing labels+col_mean+col_error

  • cols (list) – a list of column names, e.g. [‘E_tot’, ‘KE_tot’]

  • weight_name (str, optional) – name of weight column, default None, i.e. every entry has the same weight

Returns

averaged database

Return type

pd.DataFrame

qharv.sieve.mean_df.linex(df_in, vseries, dseries, names, labels=None, sorted_df=False, sort_col=None)[source]

Linearly extrapolate to 2*DMC-VMC. Hint: this can also do time-step extrapolation if tau1 = 2*tau2.

Parameters
  • df_in (pd.DataFrame) – database, must contain [‘series’] + name_mean, name_error for name in names

  • vseries (int) – VMC series index

  • dseries (int) – DMC series index

  • names (list) – a list of observable names to be extrapolated

  • labels (list, optional) – a list of label columns to keep along with the observables, default None

  • sorted_df (bool, optional) – VMC and DMC entries in input df are aligned

  • sort_col (str, optional) – column used to sort entries before subtraction

Returns

extrapolated entry

Return type

pd.DataFrame
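
As a rough illustration of the 2*DMC-VMC formula, the sketch below extrapolates a single observable and propagates its error bar. It assumes the VMC and DMC errors are statistically independent; the function name `linear_extrap` and the input numbers are made up for the example and are not part of qharv.

```python
import math


def linear_extrap(vm, ve, dm, de):
    """Return (mean, error) of 2*DMC - VMC for one observable.

    Assumes independent VMC and DMC errors (illustrative sketch only).
    """
    ym = 2.0 * dm - vm
    # Var[2*d - v] = 4*Var[d] + Var[v] for independent d and v
    ye = math.sqrt((2.0 * de) ** 2 + ve ** 2)
    return ym, ye


# made-up inputs: VMC = 1.0 +/- 0.03, DMC = 1.2 +/- 0.02
ym, ye = linear_extrap(1.0, 0.03, 1.2, 0.02)
print(ym, ye)
```

The same expression covers the time-step hint: a straight line through values y1 at tau1 and y2 at tau2 = tau1/2 has intercept 2*y2 - y1 at tau = 0.
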

qharv.sieve.mean_df.taw(ym, ye, weights)[source]

twist average with weights
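
A weighted twist average can be sketched as follows. This is a generic weighted mean with standard error propagation for independent twists, written without NumPy; it is an assumption about what the function computes, not a copy of the qharv implementation, and the name `weighted_average` is hypothetical.

```python
import math


def weighted_average(ym, ye, weights):
    """Weighted mean and propagated error over twists.

    ym, ye, weights: equal-length sequences of floats.
    Assumes errors at different twists are independent.
    """
    wtot = sum(weights)
    mean = sum(w * m for w, m in zip(weights, ym)) / wtot
    # errors add in quadrature, each scaled by its normalized weight
    error = math.sqrt(sum((w * e) ** 2 for w, e in zip(weights, ye))) / wtot
    return mean, error


# made-up data for two twists with equal weights
mean, error = weighted_average([1.0, 3.0], [0.3, 0.4], [1.0, 1.0])
print(mean, error)
```
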

qharv.sieve.mean_df.xyye(df, xname, yname, sel=None, xerr=False, yerr=True, sort=False)[source]

Get x vs. y data from a mean data frame.

Parameters
  • df (pd.DataFrame) – mean dataframe

  • xname (str) – name of x variable

  • yname (str) – name of y variable

  • sel (np.array, optional) – boolean selector for subset, default is all

  • xerr (bool, optional) – x variable has statistical error, default False

  • yerr (bool, optional) – y variable has statistical error, default True

  • sort (bool, optional) – sort x

Returns

(x, ym, ye) OR (xm, xe, ym, ye) if xerr=True

Examples

>>> xyye(df, 'rs', 'LocalEnergy')
>>> xyye(df, 'Pressure', 'LocalEnergy', xerr=True)

qharv.sieve.scalar_df module

qharv.sieve.scalar_df.mean_error_scalar_df(df, nequil=0)[source]

Get the mean and error from a dataframe of raw (per-block) scalar data. Takes a dataframe having columns [‘LocalEnergy’, ‘Variance’, …] to a dataframe having columns [‘LocalEnergy_mean’, ‘LocalEnergy_error’, …].

Parameters
  • df (pd.DataFrame) – raw scalar dataframe, presumably generated using qharv.scalar_dat.parse with extra label columns added to identify the different runs.

  • nequil (int, optional) – number of equilibration blocks to throw out for each run, default 0 (keep all data).

Returns

mean_error dataframe

Return type

pd.DataFrame
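
For a single column of a single run, the reduction from per-block data to a mean and error can be sketched as below. The sketch discards `nequil` blocks and uses the naive standard error of the mean; it does not correct for autocorrelation between blocks, which a production analysis may need to do. The function name `mean_and_error` and the data are illustrative assumptions.

```python
import math
import statistics


def mean_and_error(trace, nequil=0):
    """Mean and naive standard error of a per-block scalar trace.

    Drops the first nequil blocks; assumes the remaining blocks are
    uncorrelated (no autocorrelation correction is applied).
    """
    data = list(trace)[nequil:]
    ym = statistics.mean(data)
    ye = statistics.stdev(data) / math.sqrt(len(data))
    return ym, ye


# made-up trace; the first two blocks are treated as equilibration
blocks = [5.0, 3.0, 1.0, 2.0, 3.0, 2.0]
ym, ye = mean_and_error(blocks, nequil=2)
print(ym, ye)
```
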

qharv.sieve.scalar_df.merge_list(dfl, labels)[source]

Merge a list of DataFrames sharing common labels

Parameters
  • dfl (list) – a list of pd.DataFrame objects

  • labels (list) – a list of column labels

Returns

merged df

Return type

pd.DataFrame

qharv.sieve.scalar_df.mix_est_correction(mydf, vseries, dseries, namesm, series_name='series', group_name='group', kind='linear', drop_missing_twists=False)[source]

extrapolate dmc energy to zero time-step limit

Parameters
  • mydf (pd.DataFrame) – dataframe of VMC and DMC mixed estimators

  • vseries (int) – VMC series id

  • dseries (int) – DMC series id

  • namesm (list) – list of DMC mixed estimator names to extrapolate

  • series_name (str, optional) – column name identifying the series

  • kind (str, optional) – extrapolation kind, must be either ‘linear’ or ‘log’

Returns

an entry copied from the smallest time-step DMC entry, then edited with extrapolated pure estimators. Note: the series index is not changed.

Return type

pd.Series

qharv.sieve.scalar_df.poly_extrap_to_x0(myx, myym, myye, order, return_fit=False)[source]

fit 1D data to a 1D polynomial and extrapolate to x=0

The fit proceeds in two steps. The first polyfit does not take error into account. It estimates the extrapolated value, which is then used to set up a trust region (bounds). Using the trust region, curve_fit can robustly estimate the error of the extrapolation.

Parameters
  • myx (np.array) – x values

  • myym (np.array) – y values

  • myye (np.array) – y errors (1 sigma)

  • order (int) – order of 1D polynomial

  • return_fit (bool, optional) – if true, then return fit parameters

Returns

floats (y0m, y0e), y mean and error at x=0

Return type

2-tuple
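
To show what an extrapolated value and error at x=0 look like, here is a simplified sketch for the order=1 case only. Note that it swaps in a different technique: a closed-form weighted least-squares line fit, rather than the two-step polyfit and bounded curve_fit procedure described above. The function name `linear_extrap_to_x0` and the data are hypothetical.

```python
import math


def linear_extrap_to_x0(x, ym, ye):
    """Weighted least-squares fit of y = a + b*x; return (a, error of a).

    Uses weights 1/ye**2 and the standard closed-form solution.
    Order-1 illustration only, not the library's two-step procedure.
    """
    if len(x) < 2:
        raise ValueError("need at least two points for a line fit")
    w = [1.0 / e ** 2 for e in ye]
    s = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, ym))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, ym))
    delta = s * sxx - sx * sx
    y0m = (sxx * sy - sx * sxy) / delta  # intercept at x=0
    y0e = math.sqrt(sxx / delta)         # 1-sigma error of the intercept
    return y0m, y0e


# made-up data lying exactly on y = 1 + 2*x
x = [0.01, 0.02, 0.04]
y = [1.02, 1.04, 1.08]
e = [0.01, 0.01, 0.01]
y0m, y0e = linear_extrap_to_x0(x, y, e)
print(y0m, y0e)
```
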

qharv.sieve.scalar_df.reblock(trace, block_size, min_nblock=4, with_sigma=False)[source]

block scalar trace to remove autocorrelation; see usage example in reblock_scalar_df

Parameters
  • trace (np.array) – a trace of scalars, may have multiple columns. Note: the leading dimension is assumed to be the number of current blocks.

  • block_size (int) – size of block in units of current block.

  • min_nblock (int, optional) – minimum number of blocks needed for meaningful statistics, default is 4.

Returns

re-blocked trace.

Return type

np.array
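
The idea of re-blocking can be sketched for a one-column trace in plain Python. The sketch averages consecutive groups of `block_size` entries, drops any trailing remainder, and raises an error when fewer than `min_nblock` blocks result; the last two behaviors are assumptions made for the example, and `reblock_trace` is a hypothetical name rather than the qharv function.

```python
def reblock_trace(trace, block_size, min_nblock=4):
    """Average consecutive groups of block_size entries of a 1D trace.

    Trailing entries that do not fill a whole block are dropped
    (an assumption of this sketch).
    """
    nblock = len(trace) // block_size
    if nblock < min_nblock:
        raise ValueError(
            "only %d blocks after re-blocking, need %d" % (nblock, min_nblock)
        )
    return [
        sum(trace[i * block_size:(i + 1) * block_size]) / block_size
        for i in range(nblock)
    ]


# made-up trace of 9 blocks; the last entry is dropped with block_size=2
trace = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
new_trace = reblock_trace(trace, 2)
print(new_trace)
```
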

qharv.sieve.scalar_df.reblock_scalar_df(df, block_size, min_nblock=4)[source]

create a re-blocked scalar dataframe from a current scalar dataframe; see reblock for details

qharv.sieve.scalar_df.ts_extrap(calc_df, issl, obsl, tname='timestep', series_name='series', **kwargs)[source]

extrapolate all dmc observables to zero time-step limit

Parameters
  • calc_df (pd.DataFrame) – must contain columns [tname, series_name]

  • issl (list) – list of DMC series index to use in fit

  • obsl (list) – a list of observable names to extrapolate

Returns

an entry copied from the smallest time-step DMC entry, then edited with extrapolated energy and corresponding info. Note: the series number is unchanged.

Return type

pd.Series

qharv.sieve.scalar_df.ts_extrap_obs(calc_df, sel, tname, obs, order=1)[source]

extrapolate a single dmc observable to zero time-step limit

Parameters
  • calc_df (pd.DataFrame) – must contain columns [tname, obs_mean, obs_error]

  • sel (np.array) – boolean selector array

  • tname (str) – timestep column name, e.g. ‘timestep’

  • obs (str) – observable column name, e.g. ‘LocalEnergy’

Returns

(myx, y0m, y0e) of type (list, float, float) containing (timesteps, t=0 value, t=0 error)

Return type

tuple

qharv.sieve.scalar_h5 module

qharv.sieve.scalar_h5.extract_twists(fh5, **suffix_kwargs)[source]

Extract an observable at all twists from an HDF5 archive

each twist should be a group at the root

Example

twist000
    myr
    gr_mean
    gr_error
twist001
    myr
    gr_mean
    gr_error

Parameters

fh5 (str) – h5 file location

Returns

(meta data, mean, error)

Return type

(dict, np.array, np.array)

qharv.sieve.scalar_h5.get_ymean_yerror(fp, twist0, msuffix='_mean', esuffix='_error')[source]
qharv.sieve.scalar_h5.twist_average_h5(fh5, weights=None, **suffix_kwargs)[source]

twist average data in an HDF5 archive

see extract_twists for h5 file format

Parameters

fh5 (str) – h5 file location

Returns

(meta data, mean, error)

Return type

(dict, np.array, np.array)

qharv.sieve.scalar_h5.twist_concat_h5(fh5, name, twists=None)[source]

Module contents