qharv.sieve package
Submodules
qharv.sieve.mean_df module
-
qharv.sieve.mean_df.categorize_columns(cols)
Categorize the column names of a mean dataframe.
- Parameters
cols (list) – a list of column names
- Returns
- (excol, mcol, ecol)
excol are columns of exact values with no error bar (possibly labels); mcol are mean columns; ecol are error columns.
- Return type
(list, list, list)
Examples
>>> rcol, mcol, ecol = categorize_columns(mdf.columns)
>>> xyye(df, 'Pressure', 'LocalEnergy', xerr=True)
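A minimal sketch of how such a categorization could work, assuming the `_mean` / `_error` suffix convention that mean dataframes in this package use. `categorize_columns_sketch` is a hypothetical stand-in, not the library's implementation.

```python
def categorize_columns_sketch(cols):
    """Hypothetical stand-in: split names into (exact, mean, error) by suffix."""
    mcol = [c for c in cols if c.endswith('_mean')]
    ecol = [c for c in cols if c.endswith('_error')]
    # anything without a recognized suffix is treated as exact (e.g. labels)
    excol = [c for c in cols if c not in mcol and c not in ecol]
    return excol, mcol, ecol


cols = ['rs', 'LocalEnergy_mean', 'LocalEnergy_error',
        'Pressure_mean', 'Pressure_error']
excol, mcol, ecol = categorize_columns_sketch(cols)
print(excol, mcol, ecol)
```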
-
qharv.sieve.mean_df.dfme(df, cols, no_error=False, weight_name=None)
Average scalar quantities over a set of calculations.
- Parameters
df (pd.DataFrame) – a mean dataframe containing labels+col_mean+col_error
cols (list) – a list of column names, e.g. ['E_tot', 'KE_tot']
weight_name (str, optional) – name of weight column, default None, i.e. every entry has the same weight
- Returns
averaged database
- Return type
pd.DataFrame
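As a rough illustration of averaging means with error bars, the sketch below combines independent estimates with optional weights. It assumes uncorrelated errors and standard error propagation; the helper name `weighted_mean_error` is hypothetical and the formula is not confirmed to be what `dfme` uses internally.

```python
import math


def weighted_mean_error(means, errors, weights=None):
    """Weighted average of independent estimates and its propagated error."""
    if weights is None:
        # default: every entry has the same weight
        weights = [1.0] * len(means)
    wtot = sum(weights)
    mean = sum(w * m for w, m in zip(weights, means)) / wtot
    # errors of independent estimates add in quadrature
    err = math.sqrt(sum((w * e) ** 2 for w, e in zip(weights, errors))) / wtot
    return mean, err


ym, ye = weighted_mean_error([1.0, 3.0], [0.3, 0.4])
print(ym, ye)
```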
-
qharv.sieve.mean_df.linex(df_in, vseries, dseries, names, labels=None, sorted_df=False, sort_col=None)
Linearly extrapolate to 2*DMC-VMC.
Hint: can also do time-step extrapolation if tau1 = 2*tau2.
- Parameters
df_in (pd.DataFrame) – database, must contain ['series'] + name_mean, name_error for name in names
vseries (int) – VMC series index
dseries (int) – DMC series index
names (list) – a list of observable names to be extrapolated
labels (list, optional) – a list of label columns to keep along with observables, default None
sorted_df (bool, optional) – VMC and DMC entries in input df are aligned
sort_col (str, optional) – column used to sort entries before subtraction
- Returns
extrapolated entry
- Return type
pd.DataFrame
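The 2*DMC-VMC formula and its error bar can be written out for a single observable. This is a sketch that assumes the VMC and DMC estimates are statistically independent; `linear_extrap` is a hypothetical helper, not part of qharv.

```python
import math


def linear_extrap(vm, ve, dm, de):
    """Return (mean, error) of 2*DMC - VMC for independent inputs."""
    mean = 2 * dm - vm
    # propagate errors: the DMC error enters with a factor of 2
    err = math.sqrt((2 * de) ** 2 + ve ** 2)
    return mean, err


# VMC: 1.0 +/- 0.03, DMC: 1.2 +/- 0.02
ym, ye = linear_extrap(1.0, 0.03, 1.2, 0.02)
print(ym, ye)
```

The same arithmetic covers the time-step hint: with values at tau1 = 2*tau2, passing the tau1 result as the first estimate and the tau2 result as the second gives a linear extrapolation to tau = 0.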
-
qharv.sieve.mean_df.xyye(df, xname, yname, sel=None, xerr=False, yerr=True, sort=False)
Get x vs. y data from a mean data frame.
- Parameters
df (pd.DataFrame) – mean dataframe
xname (str) – name of x variable
yname (str) – name of y variable
sel (np.array, optional) – boolean selector for subset, default is all
xerr (bool, optional) – x variable has statistical error, default False
yerr (bool, optional) – y variable has statistical error, default True
sort (bool, optional) – sort x
- Returns
(x, ym, ye) OR (xm, xe, ym, ye) if xerr=True
Examples
>>> xyye(df, 'rs', 'LocalEnergy')
>>> xyye(df, 'Pressure', 'LocalEnergy', xerr=True)
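To make the return shape concrete, here is a sketch of the default case (xerr=False, yerr=True) written directly with pandas. It assumes the `_mean` / `_error` column suffixes; `xyye_sketch` is a hypothetical stand-in and the data are made up for illustration.

```python
import pandas as pd


def xyye_sketch(df, xname, yname, sel=None, sort=False):
    """Hypothetical stand-in for xyye with xerr=False, yerr=True."""
    if sel is not None:
        df = df.loc[sel]  # boolean selector for a subset of rows
    if sort:
        df = df.sort_values(xname)
    x = df[xname].values
    ym = df[yname + '_mean'].values
    ye = df[yname + '_error'].values
    return x, ym, ye


df = pd.DataFrame({
    'rs': [2.0, 1.0],
    'LocalEnergy_mean': [-0.5, -0.6],
    'LocalEnergy_error': [0.01, 0.02],
})
x, ym, ye = xyye_sketch(df, 'rs', 'LocalEnergy', sort=True)
print(x, ym, ye)
```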
qharv.sieve.scalar_df module
-
qharv.sieve.scalar_df.mean_error_scalar_df(df, nequil=0)
Get the mean and error from a dataframe of raw (per-block) scalar data. Takes a dataframe having columns ['LocalEnergy', 'Variance', …] to a dataframe having columns ['LocalEnergy_mean', 'LocalEnergy_error', …].
- Parameters
df (pd.DataFrame) – raw scalar dataframe, presumably generated using qharv.scalar_dat.parse with extra label columns added to identify the different runs.
nequil (int, optional) – number of equilibration blocks to throw out for each run, default 0 (keep all data).
- Returns
mean_error dataframe
- Return type
pd.DataFrame
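The per-column reduction can be illustrated on one trace. This sketch discards the equilibration blocks and then takes the naive standard error of the mean; it ignores autocorrelation, which the library may well handle differently, and `mean_error` is a hypothetical helper.

```python
import math
import statistics


def mean_error(trace, nequil=0):
    """Mean and naive standard error after dropping equilibration blocks."""
    data = trace[nequil:]
    mean = statistics.mean(data)
    # sample standard deviation divided by sqrt(number of blocks)
    err = statistics.stdev(data) / math.sqrt(len(data))
    return mean, err


blocks = [5.0, 1.0, 2.0, 3.0, 2.0]  # first block is not equilibrated
ym, ye = mean_error(blocks, nequil=1)
print(ym, ye)
```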
-
qharv.sieve.scalar_df.merge_list(dfl, labels)
Merge a list of DataFrames sharing common labels.
- Parameters
dfl (list) – a list of pd.DataFrame objects
labels (list) – a list of column labels
- Returns
merged df
- Return type
pd.DataFrame
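One plausible way to merge several dataframes on shared label columns is to fold `pd.merge` over the list. This is a sketch under that assumption; `merge_list_sketch` is hypothetical and the join type used by the library is not stated in the docs.

```python
from functools import reduce

import pandas as pd


def merge_list_sketch(dfl, labels):
    """Fold an inner merge on the label columns over a list of dataframes."""
    return reduce(lambda left, right: pd.merge(left, right, on=labels), dfl)


df1 = pd.DataFrame({'group': [0, 1], 'LocalEnergy_mean': [-1.0, -1.1]})
df2 = pd.DataFrame({'group': [0, 1], 'Variance_mean': [0.2, 0.3]})
merged = merge_list_sketch([df1, df2], ['group'])
print(merged)
```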
-
qharv.sieve.scalar_df.mix_est_correction(mydf, vseries, dseries, namesm, series_name='series', group_name='group', kind='linear', drop_missing_twists=False)
extrapolate dmc energy to zero time-step limit
- Parameters
mydf (pd.DataFrame) – dataframe of VMC and DMC mixed estimators
vseries (int) – VMC series id
dseries (int) – DMC series id
namesm (list) – list of DMC mixed estimator names to extrapolate
series_name (str, optional) – column name identifying the series
kind (str, optional) – extrapolation kind, must be either 'linear' or 'log'
- Returns
- an entry copied from the smallest time-step DMC entry,
then edited with extrapolated pure estimators. Note: the series index is not changed.
- Return type
pd.Series
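For orientation, the two extrapolation kinds can be sketched for a single value. The linear form is 2*DMC-VMC; the 'log' form is assumed here to be the commonly used DMC^2/VMC, which the source does not confirm. `pure_estimate` is a hypothetical helper and error propagation is left out.

```python
def pure_estimate(vmc, dmc, kind='linear'):
    """Extrapolate a mixed estimator toward the pure estimator."""
    if kind == 'linear':
        return 2 * dmc - vmc
    if kind == 'log':
        # assumed form: ln(pure) = 2*ln(DMC) - ln(VMC)
        return dmc ** 2 / vmc
    raise ValueError("kind must be 'linear' or 'log'")


lin = pure_estimate(1.0, 2.0, kind='linear')
log = pure_estimate(1.0, 2.0, kind='log')
print(lin, log)
```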
-
qharv.sieve.scalar_df.poly_extrap_to_x0(myx, myym, myye, order, return_fit=False)
fit 1D data to a 1D polynomial and extrapolate to x=0
The fit proceeds in two steps. The first polyfit does not take error into account. It estimates the extrapolated value, which is then used to set up a trust region (bounds). Using the trust region, curve_fit can robustly estimate the error of the extrapolation.
- Parameters
myx (np.array) – x values
myym (np.array) – y values
myye (np.array) – y errors (1 sigma)
order (int) – order of 1D polynomial
return_fit (bool, optional) – if true, then return fit parameters
- Returns
floats (y0m, y0e), y mean and error at x=0
- Return type
2-tuple
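A simplified illustration of extrapolating to x=0 with an error bar follows. It swaps the two-step polyfit + bounded curve_fit procedure described above for a single weighted `numpy.polyfit` with `cov='unscaled'`, so it is a stand-in for the idea rather than the library's algorithm; `poly_extrap_sketch` is a hypothetical name.

```python
import numpy as np


def poly_extrap_sketch(x, ym, ye, order):
    """Weighted polynomial fit; return value and 1-sigma error at x=0."""
    x, ym, ye = (np.asarray(a, dtype=float) for a in (x, ym, ye))
    # weights are 1/sigma; 'unscaled' trusts ye as the true uncertainties
    popt, pcov = np.polyfit(x, ym, order, w=1.0 / ye, cov='unscaled')
    # the last coefficient is the constant term, i.e. the value at x=0
    return popt[-1], np.sqrt(pcov[-1, -1])


x = [0.1, 0.2, 0.3, 0.4]
ym = [1.2, 1.4, 1.6, 1.8]  # exactly y = 1 + 2*x
ye = [0.1, 0.1, 0.1, 0.1]
y0m, y0e = poly_extrap_sketch(x, ym, ye, 1)
print(y0m, y0e)
```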
-
qharv.sieve.scalar_df.reblock(trace, block_size, min_nblock=4, with_sigma=False)
block scalar trace to remove autocorrelation; see usage example in reblock_scalar_df
- Parameters
trace (np.array) – a trace of scalars, may have multiple columns. Note: the leading dimension is assumed to be the number of current blocks.
block_size (int) – size of block in units of current block.
min_nblock (int,optional) – minimum number of blocks needed for meaningful statistics, default is 4.
- Returns
re-blocked trace.
- Return type
np.array
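Re-blocking amounts to averaging consecutive groups of `block_size` entries along the leading dimension. The sketch below assumes leftover trailing entries are dropped and that too few blocks raises an error; both choices, and the name `reblock_sketch`, are assumptions rather than documented behavior.

```python
import numpy as np


def reblock_sketch(trace, block_size, min_nblock=4):
    """Average consecutive blocks along the leading dimension."""
    trace = np.asarray(trace, dtype=float)
    nblock = len(trace) // block_size
    if nblock < min_nblock:
        raise ValueError('only %d blocks left after reblocking' % nblock)
    # assumption: drop trailing entries that do not fill a whole block
    trimmed = trace[:nblock * block_size]
    shape = (nblock, block_size) + trimmed.shape[1:]
    return trimmed.reshape(shape).mean(axis=1)


out = reblock_sketch(np.arange(10.0), 2)
print(out)
```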
-
qharv.sieve.scalar_df.reblock_scalar_df(df, block_size, min_nblock=4)
create a re-blocked scalar dataframe from a current scalar dataframe; see reblock for details
-
qharv.sieve.scalar_df.ts_extrap(calc_df, issl, obsl, tname='timestep', series_name='series', **kwargs)
extrapolate all dmc observables to zero time-step limit
- Parameters
calc_df (pd.DataFrame) – must contain columns [tname, series_name]
issl (list) – list of DMC series index to use in fit
obsl (list) – a list of observable names to extrapolate
- Returns
an entry copied from the smallest time-step DMC entry, then edited with extrapolated energy and corresponding info. Note: the series number is unchanged.
- Return type
pd.Series
-
qharv.sieve.scalar_df.ts_extrap_obs(calc_df, sel, tname, obs, order=1)
extrapolate a single dmc observable to zero time-step limit
- Parameters
calc_df (pd.DataFrame) – must contain columns [tname, obs_mean, obs_error]
sel (np.array) – boolean selector array
tname (str) – timestep column name, e.g. 'timestep'
obs (str) – observable column name, e.g. 'LocalEnergy'
- Returns
(myx, y0m, y0e) of type (list, float, float) containing (timesteps, t=0 value, t=0 error)
- Return type
tuple
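The role of the boolean selector can be shown with plain arrays: pick the DMC rows, then fit a line in the time step and read off the intercept. The data and the series numbering are made up for illustration, and the error estimate is omitted here.

```python
import numpy as np

# hypothetical columns of a mean dataframe; series 0 stands for a VMC run
series = np.array([0, 1, 2, 3])
timestep = np.array([0.0, 0.04, 0.02, 0.01])
emean = np.array([-0.9, -1.08, -1.04, -1.02])  # DMC rows follow -1 - 2*t
eerr = np.array([0.01, 0.01, 0.01, 0.01])

sel = series > 0  # boolean selector: keep DMC entries only
myx = timestep[sel]
# order=1 fit weighted by 1/sigma; coefficients come highest power first
slope, y0m = np.polyfit(myx, emean[sel], 1, w=1.0 / eerr[sel])
print(myx, y0m)
```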
qharv.sieve.scalar_h5 module
-
qharv.sieve.scalar_h5.extract_twists(fh5, **suffix_kwargs)
Extract an observable at all twists from an HDF5 archive.
Each twist should be a group at the root.
Example
- twist000
myr gr_mean gr_error
- twist001
myr gr_mean gr_error
- Parameters
fh5 (str) – h5 file location
- Returns
(meta data, mean, error)
- Return type
(dict, np.array, np.array)