#!/usr/bin/env python # coding: utf-8 # # # # # BSSN Time-Evolution C Code Generation Library # # ## Author: Zach Etienne # # ## This module implements a number of helper functions for generating C-code kernels that solve Einstein's equations in the covariant BSSN formalism [described in this NRPy+ tutorial notebook](Tutorial-BSSN_formulation.ipynb). The kernel generation process includes the creation of the BSSN RHS expressions for the Method of Lines time integration, evaluation of BSSN constraints as a measure of numerical error, computation of the 3-Ricci tensor $\mathcal{R}{ij}$, and the enforcement of the conformal 3-metric constraint $det \bar{\gamma}{ij} = det \hat{\gamma}_{ij}$. Additional functionality includes the generation of Weyl scalar $\psi_4$ related to gravitational wave strain, and its tetrad. # # **Notebook Status:** Not yet validated # # **Validation Notes:** This module has NOT been validated to exhibit convergence to zero of the Hamiltonian constraint violation at the expected order to the exact solution *after a short numerical evolution of the initial data* (see [plots at bottom](#convergence)), and all quantities have been validated against the [original SENR code](https://bitbucket.org/zach_etienne/nrpy). # # ### NRPy+ modules that generate needed symbolic expressions: # * [BSSN/BSSN_constraints.py](../edit/BSSN/BSSN_constraints.py); [\[**tutorial**\]](Tutorial-BSSN_constraints.ipynb): Hamiltonian constraint in BSSN curvilinear basis/coordinates # * [BSSN/BSSN_RHSs.py](../edit/BSSN/BSSN_RHSs.py); [\[**tutorial**\]](Tutorial-BSSN_time_evolution-BSSN_RHSs.ipynb): Generates the right-hand sides for the BSSN evolution equations in singular, curvilinear coordinates # * [BSSN/BSSN_gauge_RHSs.py](../edit/BSSN/BSSN_gauge_RHSs.py); [\[**tutorial**\]](Tutorial-BSSN_time_evolution-BSSN_gauge_RHSs.ipynb): Generates the right-hand sides for the BSSN gauge evolution equations in singular, curvilinear coordinates # * [BSSN/Enforce_Detgammahat_Constraint.py](../edit/BSSN/Enforce_Detgammahat_Constraint.py); [**tutorial**](Tutorial-BSSN_enforcing_determinant_gammabar_equals_gammahat_constraint.ipynb): Generates symbolic expressions for enforcing the $\det{\bar{\gamma}}=\det{\hat{\gamma}}$ constraint # # ## Introduction: # Here we use NRPy+ to generate the C source code kernels necessary to generate C functions needed/useful for evolving forward in time the BSSN equations, including: # 1. the BSSN RHS expressions for [Method of Lines](https://reference.wolfram.com/language/tutorial/NDSolveMethodOfLines.html) time integration, with arbitrary gauge choice. # 1. the BSSN constraints as a check of numerical error # # # # Table of Contents # $$\label{toc}$$ # # This notebook is organized as follows # # 1. [Step 1](#importmodules): Import needed Python modules # 1. [Step 2](#helperfuncs): Helper Python functions for C code generation # 1. [Step 3.a](#bssnrhs): Generate symbolic BSSN RHS expressions # 1. [Step 3.b](#bssnrhs_c_code): `rhs_eval()`: Register C function for evaluating BSSN RHS expressions # 1. [Step 3.c](#ricci): Generate symbolic expressions for 3-Ricci tensor $\bar{R}_{ij}$ # 1. [Step 3.d](#ricci_c_code): `Ricci_eval()`: Register C function for evaluating 3-Ricci tensor $\bar{R}_{ij}$ # 1. [Step 4.a](#bssnconstraints): Generate symbolic expressions for BSSN Hamiltonian & momentum constraints # 1. [Step 4.b](#bssnconstraints_c_code): `BSSN_constraints()`: Register C function for evaluating BSSN Hamiltonian & momentum constraints # 1. [Step 5](#enforce3metric): `enforce_detgammahat_constraint()`: Register C function for enforcing the conformal 3-metric $\det{\bar{\gamma}_{ij}}=\det{\hat{\gamma}_{ij}}$ constraint # 1. [Step 6.a](#psi4): `psi4_part_{0,1,2}()`: Register C function for evaluating Weyl scalar $\psi_4$, in 3 parts (3 functions) # 1. [Step 6.b](#psi4_tetrad): `psi4_tetrad()`: Register C function for evaluating Weyl scalar $\psi_4$ tetrad # 1. [Step 6.c](#swm2): `SpinWeight_minus2_SphHarmonics()`: Register C function for evaluating spin-weight $s=-2$ spherical harmonics # 1. [Step 7](#validation): Confirm above functions are bytecode-identical to those in `BSSN/BSSN_Ccodegen_library.py` # 1. [Step 8](#latex_pdf_output): Output this notebook to $\LaTeX$-formatted PDF file # # # # Step 1: Import needed Python modules \[Back to [top](#toc)\] # $$\label{importmodules}$$ # In[1]: # RULES FOR ADDING FUNCTIONS TO THIS ROUTINE: # 1. The function must be runnable from a multiprocessing environment, # which means that the function # 1.a: cannot depend on previous function calls. # 1.b: cannot create directories (this is not multiproc friendly) # Step P1: Import needed NRPy+ core modules: from __future__ import print_function from outputC import lhrh, add_to_Cfunction_dict # NRPy+: Core C code output module import finite_difference as fin # NRPy+: Finite difference C code generation module import NRPy_param_funcs as par # NRPy+: Parameter interface import grid as gri # NRPy+: Functions having to do with numerical grids import indexedexp as ixp # NRPy+: Symbolic indexed expression (e.g., tensors, vectors, etc.) support import reference_metric as rfm # NRPy+: Reference metric support from pickling import pickle_NRPy_env # NRPy+: Pickle/unpickle NRPy+ environment, for parallel codegen import os, time # Standard Python modules for multiplatform OS-level functions, benchmarking import BSSN.BSSN_RHSs as rhs import BSSN.BSSN_gauge_RHSs as gaugerhs import loop as lp # # # # Step 2: Helper Python functions for C code generation \[Back to [top](#toc)\] # $$\label{helperfuncs}$$ # # * `print_msg_with_timing()` gives the user an idea of what's going on/taking so long. Also outputs timing info. # * `get_loopopts()` sets up options for NRPy+'s `loop` module # * `register_stress_energy_source_terms_return_T4UU()` registers gridfunctions for $T^{\mu\nu}$ if needed and not yet registered. # In[2]: ############################################### # Helper Python functions for C code generation # print_msg_with_timing() gives the user an idea of what's going on/taking so long. Also outputs timing info. def print_msg_with_timing(desc, msg="Symbolic", startstop="start", starttime=0.0): CoordSystem = par.parval_from_str("reference_metric::CoordSystem") elapsed = time.time()-starttime if msg == "Symbolic": if startstop == "start": print("Generating symbolic expressions for " + desc + " (" + CoordSystem + " coords)...") return time.time() # else: print("Finished generating symbolic expressions for "+desc+ " (" + CoordSystem + " coords) in "+str(round(elapsed, 1))+" seconds. Next up: C codegen...") elif msg == "Ccodegen": if startstop == "start": print("Generating C code for "+desc+" (" + CoordSystem + " coords)...") return time.time() # else: print("Finished generating C code for "+desc+" (" + CoordSystem + " coords) in "+str(round(elapsed, 1))+" seconds.") # get_loopopts() sets up options for NRPy+'s loop module def get_loopopts(points_to_update, enable_SIMD, enable_rfm_precompute, OMP_pragma_on, enable_xxs=True): loopopts = points_to_update + ",includebraces=False" if enable_SIMD: loopopts += ",enable_SIMD" if enable_rfm_precompute: loopopts += ",enable_rfm_precompute" elif not enable_xxs: pass else: loopopts += ",Read_xxs" if OMP_pragma_on != "i2": loopopts += ",pragma_on_"+OMP_pragma_on return loopopts # register_stress_energy_source_terms_return_T4UU() registers gridfunctions # for T4UU if needed and not yet registered. def register_stress_energy_source_terms_return_T4UU(enable_stress_energy_source_terms): if enable_stress_energy_source_terms: registered_already = False for i in range(len(gri.glb_gridfcs_list)): if gri.glb_gridfcs_list[i].name == "T4UU00": registered_already = True if not registered_already: return ixp.register_gridfunctions_for_single_rank2("AUXEVOL", "T4UU", "sym01", DIM=4) # else: return ixp.declarerank2("T4UU", "sym01", DIM=4) return None # # # # Step 3.a: Generate symbolic BSSN RHS expressions \[Back to [top](#toc)\] # $$\label{bssnrhs}$$ # # First, we generate the symbolic expressions. Be sure to call this function from within a `reference_metric::enable_rfm_precompute="True"` environment if reference metric precomputation is desired. # # # `BSSN_RHSs__generate_symbolic_expressions()` supports the following features # # * (`"OnePlusLog"` by default) Lapse gauge choice # * (`"GammaDriving2ndOrder_Covariant"` by default) Shift gauge choice # * (disabled by default) Kreiss-Oliger dissipation # * (disabled by default) Stress-energy ($T^{\mu\nu}$) source terms # * (enabled by default) "Leave Ricci symbolic": do not compute the 3-Ricci tensor $\bar{R}_{ij}$ within the BSSN RHSs, which only adds to the extreme complexity of the BSSN RHS expressions. Instead, leave computation of $\bar{R}_{ij}$=`RbarDD` to a separate function. Doing this generally increases C-code performance by about 10%. # # Two lists are returned by this function: # # 1. `betaU`: the un-rescaled shift vector $\beta^i$, which is used to perform upwinding. # 1. `BSSN_RHSs_SymbExpressions`: the BSSN RHS symbolic expressions, using the `lhrh` named-tuple to store a list of LHSs and RHSs, where each LHS and RHS is defined as follows # 1. LHS = BSSN gridfunction whose time derivative is being computed at grid point `i0,i1,i2`, and # 1. RHS = time derivative expression for a given variable at the given point. # In[3]: def BSSN_RHSs__generate_symbolic_expressions(LapseCondition="OnePlusLog", ShiftCondition="GammaDriving2ndOrder_Covariant", enable_KreissOliger_dissipation=True, enable_stress_energy_source_terms=False, leave_Ricci_symbolic=True): ###################################### # START: GENERATE SYMBOLIC EXPRESSIONS starttime = print_msg_with_timing("BSSN_RHSs", msg="Symbolic", startstop="start") # Returns None if enable_stress_energy_source_terms==False; otherwise returns symb expressions for T4UU T4UU = register_stress_energy_source_terms_return_T4UU(enable_stress_energy_source_terms) # Evaluate BSSN RHSs: import BSSN.BSSN_quantities as Bq par.set_parval_from_str("BSSN.BSSN_quantities::LeaveRicciSymbolic", str(leave_Ricci_symbolic)) rhs.BSSN_RHSs() if enable_stress_energy_source_terms: import BSSN.BSSN_stress_energy_source_terms as Bsest Bsest.BSSN_source_terms_for_BSSN_RHSs(T4UU) rhs.trK_rhs += Bsest.sourceterm_trK_rhs for i in range(3): # Needed for Gamma-driving shift RHSs: rhs.Lambdabar_rhsU[i] += Bsest.sourceterm_Lambdabar_rhsU[i] # Needed for BSSN RHSs: rhs.lambda_rhsU[i] += Bsest.sourceterm_lambda_rhsU[i] for j in range(3): rhs.a_rhsDD[i][j] += Bsest.sourceterm_a_rhsDD[i][j] par.set_parval_from_str("BSSN.BSSN_gauge_RHSs::LapseEvolutionOption", LapseCondition) par.set_parval_from_str("BSSN.BSSN_gauge_RHSs::ShiftEvolutionOption", ShiftCondition) gaugerhs.BSSN_gauge_RHSs() # Can depend on above RHSs # Restore BSSN.BSSN_quantities::LeaveRicciSymbolic to False par.set_parval_from_str("BSSN.BSSN_quantities::LeaveRicciSymbolic", "False") # Add Kreiss-Oliger dissipation to the BSSN RHSs: if enable_KreissOliger_dissipation: thismodule = "KO_Dissipation" diss_strength = par.Cparameters("REAL", thismodule, "diss_strength", 0.1) # *Bq.cf # *Bq.cf*Bq.cf*Bq.cf # cf**1 is found better than cf**4 over the long term. alpha_dKOD = ixp.declarerank1("alpha_dKOD") cf_dKOD = ixp.declarerank1("cf_dKOD") trK_dKOD = ixp.declarerank1("trK_dKOD") betU_dKOD = ixp.declarerank2("betU_dKOD", "nosym") vetU_dKOD = ixp.declarerank2("vetU_dKOD", "nosym") lambdaU_dKOD = ixp.declarerank2("lambdaU_dKOD", "nosym") aDD_dKOD = ixp.declarerank3("aDD_dKOD", "sym01") hDD_dKOD = ixp.declarerank3("hDD_dKOD", "sym01") for k in range(3): gaugerhs.alpha_rhs += diss_strength * alpha_dKOD[k] * rfm.ReU[k] # ReU[k] = 1/scalefactor_orthog_funcform[k] rhs.cf_rhs += diss_strength * cf_dKOD[k] * rfm.ReU[k] # ReU[k] = 1/scalefactor_orthog_funcform[k] rhs.trK_rhs += diss_strength * trK_dKOD[k] * rfm.ReU[k] # ReU[k] = 1/scalefactor_orthog_funcform[k] for i in range(3): if "2ndOrder" in ShiftCondition: gaugerhs.bet_rhsU[i] += diss_strength * betU_dKOD[i][k] * rfm.ReU[k] # ReU[k] = 1/scalefactor_orthog_funcform[k] gaugerhs.vet_rhsU[i] += diss_strength * vetU_dKOD[i][k] * rfm.ReU[k] # ReU[k] = 1/scalefactor_orthog_funcform[k] rhs.lambda_rhsU[i] += diss_strength * lambdaU_dKOD[i][k] * rfm.ReU[k] # ReU[k] = 1/scalefactor_orthog_funcform[k] for j in range(3): rhs.a_rhsDD[i][j] += diss_strength * aDD_dKOD[i][j][k] * rfm.ReU[k] # ReU[k] = 1/scalefactor_orthog_funcform[k] rhs.h_rhsDD[i][j] += diss_strength * hDD_dKOD[i][j][k] * rfm.ReU[k] # ReU[k] = 1/scalefactor_orthog_funcform[k] # We use betaU as our upwinding control vector: Bq.BSSN_basic_tensors() betaU = Bq.betaU # END: GENERATE SYMBOLIC EXPRESSIONS ###################################### lhs_names = ["alpha", "cf", "trK"] rhs_exprs = [gaugerhs.alpha_rhs, rhs.cf_rhs, rhs.trK_rhs] for i in range(3): lhs_names.append("betU" + str(i)) rhs_exprs.append(gaugerhs.bet_rhsU[i]) lhs_names.append("lambdaU" + str(i)) rhs_exprs.append(rhs.lambda_rhsU[i]) lhs_names.append("vetU" + str(i)) rhs_exprs.append(gaugerhs.vet_rhsU[i]) for j in range(i, 3): lhs_names.append("aDD" + str(i) + str(j)) rhs_exprs.append(rhs.a_rhsDD[i][j]) lhs_names.append("hDD" + str(i) + str(j)) rhs_exprs.append(rhs.h_rhsDD[i][j]) # Sort the lhss list alphabetically, and rhss to match. # This ensures the RHSs are evaluated in the same order # they're allocated in memory: lhs_names, rhs_exprs = [list(x) for x in zip(*sorted(zip(lhs_names, rhs_exprs), key=lambda pair: pair[0]))] # Declare the list of lhrh's BSSN_RHSs_SymbExpressions = [] for var in range(len(lhs_names)): BSSN_RHSs_SymbExpressions.append(lhrh(lhs=gri.gfaccess("rhs_gfs", lhs_names[var]), rhs=rhs_exprs[var])) print_msg_with_timing("BSSN_RHSs", msg="Symbolic", startstop="stop", starttime=starttime) return [betaU, BSSN_RHSs_SymbExpressions] # # # # Step 3.b: `rhs_eval()`: Register C code for BSSN RHS expressions \[Back to [top](#toc)\] # $$\label{bssnrhs_c_code}$$ # # `add_rhs_eval_to_Cfunction_dict()` supports the following features # # * (enabled by default) reference-metric precomputation # * (disabled by default) "golden kernels", which greatly increases the C-code generation time in an attempt to reduce computational cost. Most often this results in no speed-up. # * (enabled by default) SIMD output # * (disabled by default) splitting of RHSs into smaller pieces (multiple loops) to improve performance. Doesn't help much. # * (`"OnePlusLog"` by default) Lapse gauge choice # * (`"GammaDriving2ndOrder_Covariant"` by default) Shift gauge choice # * (disabled by default) enable Kreiss-Oliger dissipation # * (disabled by default) add stress-energy ($T^{\mu\nu}$) source terms # * (enabled by default) "Leave Ricci symbolic": do not compute the 3-Ricci tensor $\bar{R}_{ij}$ within the BSSN RHSs, which only adds to the extreme complexity of the BSSN RHS expressions. Instead, leave computation of $\bar{R}_{ij}$=`RbarDD` to a separate function. Doing this generally increases C-code performance by about 10%. # * (`"i2"` by default) OpenMP pragma acts on which loop (assumes `i2` is outermost and `i0` is innermost loop). For axisymmetric or near-axisymmetric calculations, `"i1"` may be *significantly* faster. # # Also to enable parallel C-code kernel generation, the NRPy+ environment is pickled and returned. # In[4]: def add_rhs_eval_to_Cfunction_dict(includes=None, rel_path_to_Cparams=os.path.join("."), enable_rfm_precompute=True, enable_golden_kernels=False, enable_SIMD=True, enable_split_for_optimizations_doesnt_help=False, LapseCondition="OnePlusLog", ShiftCondition="GammaDriving2ndOrder_Covariant", enable_KreissOliger_dissipation=False, enable_stress_energy_source_terms=False, leave_Ricci_symbolic=True, OMP_pragma_on="i2", func_name_suffix=""): if includes is None: includes = [] if enable_SIMD: includes += [os.path.join("SIMD", "SIMD_intrinsics.h")] enable_FD_functions = bool(par.parval_from_str("finite_difference::enable_FD_functions")) if enable_FD_functions: includes += ["finite_difference_functions.h"] # Set up the C function for the BSSN RHSs desc = "Evaluate the BSSN RHSs" func_name = "rhs_eval" + func_name_suffix params = "const paramstruct *restrict params, " if enable_rfm_precompute: params += "const rfm_struct *restrict rfmstruct, " else: params += "REAL *restrict xx[3], " params += """ const REAL *restrict auxevol_gfs,const REAL *restrict in_gfs,REAL *restrict rhs_gfs""" betaU, BSSN_RHSs_SymbExpressions = \ BSSN_RHSs__generate_symbolic_expressions(LapseCondition=LapseCondition, ShiftCondition=ShiftCondition, enable_KreissOliger_dissipation=enable_KreissOliger_dissipation, enable_stress_energy_source_terms=enable_stress_energy_source_terms, leave_Ricci_symbolic=leave_Ricci_symbolic) # Construct body: preloop="" enableCparameters=True # Set up preloop in case we're outputting code for the Einstein Toolkit (ETK) if par.parval_from_str("grid::GridFuncMemAccess") == "ETK": params, preloop = set_ETK_func_params_preloop(func_name) enableCparameters=False FD_outCparams = "outCverbose=False,enable_SIMD=" + str(enable_SIMD) FD_outCparams += ",GoldenKernelsEnable=" + str(enable_golden_kernels) loopopts = get_loopopts("InteriorPoints", enable_SIMD, enable_rfm_precompute, OMP_pragma_on) FDorder = par.parval_from_str("finite_difference::FD_CENTDERIVS_ORDER") starttime = print_msg_with_timing("BSSN_RHSs (FD order="+str(FDorder)+")", msg="Ccodegen", startstop="start") if enable_split_for_optimizations_doesnt_help and FDorder == 6: loopopts += ",DisableOpenMP" BSSN_RHSs_SymbExpressions_pt1 = [] BSSN_RHSs_SymbExpressions_pt2 = [] for lhsrhs in BSSN_RHSs_SymbExpressions: if "BETU" in lhsrhs.lhs or "LAMBDAU" in lhsrhs.lhs: BSSN_RHSs_SymbExpressions_pt1.append(lhrh(lhs=lhsrhs.lhs, rhs=lhsrhs.rhs)) else: BSSN_RHSs_SymbExpressions_pt2.append(lhrh(lhs=lhsrhs.lhs, rhs=lhsrhs.rhs)) preloop += """#pragma omp parallel { """ preloopbody = fin.FD_outputC("returnstring", BSSN_RHSs_SymbExpressions_pt1, params=FD_outCparams, upwindcontrolvec=betaU) preloop += "\n#pragma omp for\n" + lp.simple_loop(loopopts, preloopbody) preloop += "\n#pragma omp for\n" body = fin.FD_outputC("returnstring", BSSN_RHSs_SymbExpressions_pt2, params=FD_outCparams, upwindcontrolvec=betaU) postloop = "\n } // END #pragma omp parallel\n" else: preloop += "" body = fin.FD_outputC("returnstring", BSSN_RHSs_SymbExpressions, params=FD_outCparams, upwindcontrolvec=betaU) postloop = "" print_msg_with_timing("BSSN_RHSs (FD order="+str(FDorder)+")", msg="Ccodegen", startstop="stop", starttime=starttime) add_to_Cfunction_dict( includes=includes, desc=desc, name=func_name, params=params, preloop=preloop, body=body, loopopts=loopopts, postloop=postloop, rel_path_to_Cparams=rel_path_to_Cparams, enableCparameters=enableCparameters) return pickle_NRPy_env() # # # # Step 3.c: Generate symbolic expressions for 3-Ricci tensor $\bar{R}_{ij}$ \[Back to [top](#toc)\] # $$\label{ricci}$$ # # As described above, we find a roughly 10% speedup by computing the 3-Ricci tensor $\bar{R}_{ij}$ separately from the BSSN RHS equations and storing the 6 independent components in memory. Here we construct the symbolic expressions for all 6 independent components of $\bar{R}_{ij}$ (which is symmetric under interchange of indices). # # `Ricci__generate_symbolic_expressions()` does not support any input parameters. # # One list is returned by `Ricci__generate_symbolic_expressions()`: `Ricci_SymbExpressions`, which contains a list of expressions for the six independent components of $\bar{R}_{ij}$, using the `lhrh` named-tuple to store a list of LHSs and RHSs, where each LHS and RHS is defined as follows # # 1. LHS = gridfunction representation of the component of $\bar{R}_{ij}$, computed at grid point i0,i1,i2, and # 1. RHS = expression for given component of $\bar{R}_{ij}$. # In[5]: def Ricci__generate_symbolic_expressions(): ###################################### # START: GENERATE SYMBOLIC EXPRESSIONS starttime = print_msg_with_timing("3-Ricci tensor", msg="Symbolic", startstop="start") # Evaluate 3-Ricci tensor: import BSSN.BSSN_quantities as Bq par.set_parval_from_str("BSSN.BSSN_quantities::LeaveRicciSymbolic", "False") # Register all BSSN gridfunctions if not registered already Bq.BSSN_basic_tensors() # Next compute Ricci tensor Bq.RicciBar__gammabarDD_dHatD__DGammaUDD__DGammaU() # END: GENERATE SYMBOLIC EXPRESSIONS ###################################### # Must register RbarDD as gridfunctions, as we're outputting them to gridfunctions here: foundit = False for i in range(len(gri.glb_gridfcs_list)): if "RbarDD00" in gri.glb_gridfcs_list[i].name: foundit = True if not foundit: ixp.register_gridfunctions_for_single_rank2("AUXEVOL", "RbarDD", "sym01") Ricci_SymbExpressions = [lhrh(lhs=gri.gfaccess("auxevol_gfs", "RbarDD00"), rhs=Bq.RbarDD[0][0]), lhrh(lhs=gri.gfaccess("auxevol_gfs", "RbarDD01"), rhs=Bq.RbarDD[0][1]), lhrh(lhs=gri.gfaccess("auxevol_gfs", "RbarDD02"), rhs=Bq.RbarDD[0][2]), lhrh(lhs=gri.gfaccess("auxevol_gfs", "RbarDD11"), rhs=Bq.RbarDD[1][1]), lhrh(lhs=gri.gfaccess("auxevol_gfs", "RbarDD12"), rhs=Bq.RbarDD[1][2]), lhrh(lhs=gri.gfaccess("auxevol_gfs", "RbarDD22"), rhs=Bq.RbarDD[2][2])] print_msg_with_timing("3-Ricci tensor", msg="Symbolic", startstop="stop", starttime=starttime) return Ricci_SymbExpressions # # # # Step 3.d: `Ricci_eval()`: Register C function for evaluating 3-Ricci tensor $\bar{R}_{ij}$ \[Back to [top](#toc)\] # $$\label{ricci_c_code}$$ # # `add_Ricci_eval_to_Cfunction_dict()` supports the following features # # * (enabled by default) reference-metric precomputation # * (disabled by default) "golden kernels", which greatly increases the C-code generation time in an attempt to reduce computational cost. Most often this results in no speed-up. # * (enabled by default) SIMD output # * (disabled by default) splitting of RHSs into smaller pieces (multiple loops) to improve performance. Doesn't help much. # * (`"i2"` by default) OpenMP pragma acts on which loop (assumes `i2` is outermost and `i0` is innermost loop). For axisymmetric or near-axisymmetric calculations, `"i1"` may be *significantly* faster. # # Also to enable parallel C-code kernel generation, the NRPy+ environment is pickled and returned. # In[6]: def add_Ricci_eval_to_Cfunction_dict(includes=None, rel_path_to_Cparams=os.path.join("."), enable_rfm_precompute=True, enable_golden_kernels=False, enable_SIMD=True, enable_split_for_optimizations_doesnt_help=False, OMP_pragma_on="i2", func_name_suffix=""): if includes is None: includes = [] if enable_SIMD: includes += [os.path.join("SIMD", "SIMD_intrinsics.h")] enable_FD_functions = bool(par.parval_from_str("finite_difference::enable_FD_functions")) if enable_FD_functions: includes += ["finite_difference_functions.h"] # Set up the C function for the 3-Ricci tensor desc = "Evaluate the 3-Ricci tensor" func_name = "Ricci_eval" + func_name_suffix params = "const paramstruct *restrict params, " if enable_rfm_precompute: params += "const rfm_struct *restrict rfmstruct, " else: params += "REAL *restrict xx[3], " params += "const REAL *restrict in_gfs, REAL *restrict auxevol_gfs" # Construct body: Ricci_SymbExpressions = Ricci__generate_symbolic_expressions() FD_outCparams = "outCverbose=False,enable_SIMD=" + str(enable_SIMD) FD_outCparams += ",GoldenKernelsEnable=" + str(enable_golden_kernels) loopopts = get_loopopts("InteriorPoints", enable_SIMD, enable_rfm_precompute, OMP_pragma_on) FDorder = par.parval_from_str("finite_difference::FD_CENTDERIVS_ORDER") starttime = print_msg_with_timing("3-Ricci tensor (FD order="+str(FDorder)+")", msg="Ccodegen", startstop="start") # Construct body: preloop="" enableCparameters=True # Set up preloop in case we're outputting code for the Einstein Toolkit (ETK) if par.parval_from_str("grid::GridFuncMemAccess") == "ETK": params, preloop = set_ETK_func_params_preloop(func_name) enableCparameters=False if enable_split_for_optimizations_doesnt_help and FDorder >= 8: loopopts += ",DisableOpenMP" Ricci_SymbExpressions_pt1 = [] Ricci_SymbExpressions_pt2 = [] for lhsrhs in Ricci_SymbExpressions: if "RBARDD00" in lhsrhs.lhs or "RBARDD11" in lhsrhs.lhs or "RBARDD22" in lhsrhs.lhs: Ricci_SymbExpressions_pt1.append(lhrh(lhs=lhsrhs.lhs, rhs=lhsrhs.rhs)) else: Ricci_SymbExpressions_pt2.append(lhrh(lhs=lhsrhs.lhs, rhs=lhsrhs.rhs)) preloop = """#pragma omp parallel { #pragma omp for """ preloopbody = fin.FD_outputC("returnstring", Ricci_SymbExpressions_pt1, params=FD_outCparams) preloop += lp.simple_loop(loopopts, preloopbody) preloop += "#pragma omp for\n" body = fin.FD_outputC("returnstring", Ricci_SymbExpressions_pt2, params=FD_outCparams) postloop = "\n } // END #pragma omp parallel\n" else: body = fin.FD_outputC("returnstring", Ricci_SymbExpressions, params=FD_outCparams) postloop = "" print_msg_with_timing("3-Ricci tensor (FD order="+str(FDorder)+")", msg="Ccodegen", startstop="stop", starttime=starttime) add_to_Cfunction_dict( includes=includes, desc=desc, name=func_name, params=params, preloop=preloop, body=body, loopopts=loopopts, postloop=postloop, rel_path_to_Cparams=rel_path_to_Cparams, enableCparameters=enableCparameters) return pickle_NRPy_env() # # # # Step 4.a: Generate symbolic expressions for BSSN Hamiltonian & momentum constraints \[Back to [top](#toc)\] # $$\label{bssnconstraints}$$ # # Next output the C code for evaluating the BSSN Hamiltonian and momentum constraints [(**Tutorial**)](Tutorial-BSSN_constraints.ipynb). In the absence of numerical error, these constraints should evaluate to zero. However, it does not due to numerical (typically truncation) error. # # We will therefore measure the constraint violations to gauge the accuracy of our simulation, and, ultimately determine whether errors are dominated by numerical finite differencing (truncation) error as expected. # # `BSSN_constraints__generate_symbolic_expressions()` supports the following features: # # * (disabled by default) add stress-energy ($T^{\mu\nu}$) source terms # * (disabled by default) output Hamiltonian constraint only # # One list is returned by `BSSN_constraints__generate_symbolic_expressions()`: `BSSN_constraints_SymbExpressions`, which contains a list of expressions for the Hamiltonian and momentum constraints (4 elements total), using the `lhrh` named-tuple to store a list of LHSs and RHSs, where each LHS and RHS is defined as follows # # 1. LHS = gridfunction representation of the BSSN constraint, computed at grid point i0,i1,i2, and # 1. RHS = expression for given BSSN constraint # In[7]: def BSSN_constraints__generate_symbolic_expressions(enable_stress_energy_source_terms=False, leave_Ricci_symbolic=True, output_H_only=False): ###################################### # START: GENERATE SYMBOLIC EXPRESSIONS starttime = print_msg_with_timing("BSSN constraints", msg="Symbolic", startstop="start") # Define the Hamiltonian constraint and output the optimized C code. par.set_parval_from_str("BSSN.BSSN_quantities::LeaveRicciSymbolic", str(leave_Ricci_symbolic)) import BSSN.BSSN_constraints as bssncon # Returns None if enable_stress_energy_source_terms==False; otherwise returns symb expressions for T4UU T4UU = register_stress_energy_source_terms_return_T4UU(enable_stress_energy_source_terms) bssncon.BSSN_constraints(add_T4UUmunu_source_terms=False, output_H_only=output_H_only) # We'll add them below if desired. if enable_stress_energy_source_terms: import BSSN.BSSN_stress_energy_source_terms as Bsest Bsest.BSSN_source_terms_for_BSSN_constraints(T4UU) bssncon.H += Bsest.sourceterm_H for i in range(3): bssncon.MU[i] += Bsest.sourceterm_MU[i] BSSN_constraints_SymbExpressions = [lhrh(lhs=gri.gfaccess("aux_gfs", "H"), rhs=bssncon.H)] if not output_H_only: BSSN_constraints_SymbExpressions += [lhrh(lhs=gri.gfaccess("aux_gfs", "MU0"), rhs=bssncon.MU[0]), lhrh(lhs=gri.gfaccess("aux_gfs", "MU1"), rhs=bssncon.MU[1]), lhrh(lhs=gri.gfaccess("aux_gfs", "MU2"), rhs=bssncon.MU[2])] par.set_parval_from_str("BSSN.BSSN_quantities::LeaveRicciSymbolic", "False") print_msg_with_timing("BSSN constraints", msg="Symbolic", startstop="stop", starttime=starttime) # END: GENERATE SYMBOLIC EXPRESSIONS ###################################### return BSSN_constraints_SymbExpressions # # # # Step 4.b: `BSSN_constraints()`: Register C function for evaluating BSSN Hamiltonian & momentum constraints \[Back to [top](#toc)\] # $$\label{bssnconstraints_c_code}$$ # # `add_BSSN_constraints_to_Cfunction_dict()` supports the following features # # * (enabled by default) reference-metric precomputation # * (disabled by default) "golden kernels", which greatly increases the C-code generation time in an attempt to reduce computational cost. Most often this results in no speed-up. # * (enabled by default) SIMD output # * (disabled by default) splitting of RHSs into smaller pieces (multiple loops) to improve performance. Doesn't help much. # * (disabled by default) add stress-energy ($T^{\mu\nu}$) source terms # * (disabled by default) output Hamiltonian constraint only # * (`"i2"` by default) OpenMP pragma acts on which loop (assumes `i2` is outermost and `i0` is innermost loop). For axisymmetric or near-axisymmetric calculations, `"i1"` may be *significantly* faster. # # Also to enable parallel C-code kernel generation, the NRPy+ environment is pickled and returned. # In[8]: def add_BSSN_constraints_to_Cfunction_dict(includes=None, rel_path_to_Cparams=os.path.join("."), enable_rfm_precompute=True, enable_golden_kernels=False, enable_SIMD=True, enable_stress_energy_source_terms=False, leave_Ricci_symbolic=True, output_H_only=False, OMP_pragma_on="i2", func_name_suffix=""): if includes is None: includes = [] if enable_SIMD: includes += [os.path.join("SIMD", "SIMD_intrinsics.h")] enable_FD_functions = bool(par.parval_from_str("finite_difference::enable_FD_functions")) if enable_FD_functions: includes += ["finite_difference_functions.h"] # Set up the C function for the BSSN constraints desc = "Evaluate the BSSN constraints" func_name = "BSSN_constraints" + func_name_suffix params = "const paramstruct *restrict params, " if enable_rfm_precompute: params += "const rfm_struct *restrict rfmstruct, " else: params += "REAL *restrict xx[3], " params += """ const REAL *restrict in_gfs, const REAL *restrict auxevol_gfs, REAL *restrict aux_gfs""" # Construct body: BSSN_constraints_SymbExpressions = BSSN_constraints__generate_symbolic_expressions(enable_stress_energy_source_terms, leave_Ricci_symbolic=leave_Ricci_symbolic, output_H_only=output_H_only) preloop="" enableCparameters=True # Set up preloop in case we're outputting code for the Einstein Toolkit (ETK) if par.parval_from_str("grid::GridFuncMemAccess") == "ETK": params, preloop = set_ETK_func_params_preloop(func_name) enableCparameters=False FD_outCparams = "outCverbose=False,enable_SIMD=" + str(enable_SIMD) FD_outCparams += ",GoldenKernelsEnable=" + str(enable_golden_kernels) FDorder = par.parval_from_str("finite_difference::FD_CENTDERIVS_ORDER") starttime = print_msg_with_timing("BSSN constraints (FD order="+str(FDorder)+")", msg="Ccodegen", startstop="start") body = fin.FD_outputC("returnstring", BSSN_constraints_SymbExpressions, params=FD_outCparams) print_msg_with_timing("BSSN constraints (FD order="+str(FDorder)+")", msg="Ccodegen", startstop="stop", starttime=starttime) add_to_Cfunction_dict( includes=includes, desc=desc, name=func_name, params=params, preloop=preloop, body=body, loopopts=get_loopopts("InteriorPoints", enable_SIMD, enable_rfm_precompute, OMP_pragma_on), rel_path_to_Cparams=rel_path_to_Cparams, enableCparameters=enableCparameters) return pickle_NRPy_env() # # # # Step 5: `enforce_detgammahat_constraint()`: Register C function for enforcing the conformal 3-metric $\det{\bar{\gamma}_{ij}}=\det{\hat{\gamma}_{ij}}$ constraint \[Back to [top](#toc)\] # $$\label{enforce3metric}$$ # # To ensure stability when solving the BSSN equations, we must enforce the conformal 3-metric $\det{\bar{\gamma}_{ij}}=\det{\hat{\gamma}_{ij}}$ constraint (Eq. 53 of [Ruchlin, Etienne, and Baumgarte (2018)](https://arxiv.org/abs/1712.07658)), as [documented in the corresponding NRPy+ tutorial notebook](Tutorial-BSSN_enforcing_determinant_gammabar_equals_gammahat_constraint.ipynb). This function imposes the $\det{\bar{\gamma}_{ij}}=\det{\hat{\gamma}_{ij}}$ constraint. # # Applying curvilinear boundary conditions should affect the initial data at the outer boundary, and will in general cause the $\det{\bar{\gamma}_{ij}}=\det{\hat{\gamma}_{ij}}$ constraint to be violated there. Thus after we apply these boundary conditions, we must always call the routine for enforcing the $\det{\bar{\gamma}_{ij}}=\det{\hat{\gamma}_{ij}}$ constraint. # # `add_enforce_detgammahat_constraint_to_Cfunction_dict()` supports the following features # # * (enabled by default) reference-metric precomputation # * (disabled by default) "golden kernels", which greatly increases the C-code generation time in an attempt to reduce computational cost. Most often this results in no speed-up. # * (`"i2"` by default) OpenMP pragma acts on which loop (assumes `i2` is outermost and `i0` is innermost loop). For axisymmetric or near-axisymmetric calculations, `"i1"` may be *significantly* faster. # # Also to enable parallel C-code kernel generation, the NRPy+ environment is pickled and returned. # In[9]: def add_enforce_detgammahat_constraint_to_Cfunction_dict(includes=None, rel_path_to_Cparams=os.path.join("."), enable_rfm_precompute=True, enable_golden_kernels=False, OMP_pragma_on="i2", func_name_suffix=""): # This function disables SIMD, as it includes cbrt() and abs() functions. if includes is None: includes = [] # This function does not use finite differences! # enable_FD_functions = bool(par.parval_from_str("finite_difference::enable_FD_functions")) # if enable_FD_functions: # includes += ["finite_difference_functions.h"] # Set up the C function for enforcing the det(gammabar) = det(gammahat) BSSN algebraic constraint desc = "Enforce the det(gammabar) = det(gammahat) (algebraic) constraint" func_name = "enforce_detgammahat_constraint" + func_name_suffix params = "const paramstruct *restrict params, " if enable_rfm_precompute: params += "const rfm_struct *restrict rfmstruct, " else: params += "REAL *restrict xx[3], " params += "REAL *restrict in_gfs" # Construct body: enforce_detg_constraint_symb_expressions = EGC.Enforce_Detgammahat_Constraint_symb_expressions() preloop="" enableCparameters=True # Set up preloop in case we're outputting code for the Einstein Toolkit (ETK) if par.parval_from_str("grid::GridFuncMemAccess") == "ETK": params, preloop = set_ETK_func_params_preloop(func_name, enable_SIMD=False) enableCparameters=False FD_outCparams = "outCverbose=False,enable_SIMD=False" FD_outCparams += ",GoldenKernelsEnable=" + str(enable_golden_kernels) starttime = print_msg_with_timing("Enforcing det(gammabar)=det(gammahat) constraint", msg="Ccodegen", startstop="start") body = fin.FD_outputC("returnstring", enforce_detg_constraint_symb_expressions, params=FD_outCparams) print_msg_with_timing("Enforcing det(gammabar)=det(gammahat) constraint", msg="Ccodegen", startstop="stop", starttime=starttime) enable_SIMD = False add_to_Cfunction_dict( includes=includes, desc=desc, name=func_name, params=params, preloop=preloop, body=body, loopopts=get_loopopts("AllPoints", enable_SIMD, enable_rfm_precompute, OMP_pragma_on), rel_path_to_Cparams=rel_path_to_Cparams, enableCparameters=enableCparameters) return pickle_NRPy_env() # # # # Step 6.a: `psi4_part_{0,1,2}()`: Register C function for evaluating Weyl scalar $\psi_4$, in 3 parts (3 functions) \[Back to [top](#toc)\] # $$\label{psi4}$$ # # $\psi_4$ is a complex scalar related to the gravitational wave strain via # # $$ # \psi_4 = \ddot{h}_+ - i \ddot{h}_\times. # $$ # # We construct the symbolic expression for $\psi_4$ as described in the [corresponding NRPy+ Jupyter notebook](Tutorial-Psi4.ipynb), in three parts. The below `add_psi4_part_to_Cfunction_dict()` function will construct any of these three parts `0`, `1,` or `2`, and output the part to a function `psi4_part0()`, `psi4_part1()`, or `psi4_part2()`, respectively. # # `add_psi4_part_to_Cfunction_dict()` supports the following features # # * (`"0"` by default) which part? (`0`, `1,` or `2`), as described above # * (disabled by default) "setPsi4tozero", which effectively turns this into a dummy function -- for when $\psi_4$ is not needed, and it's easier to just set `psi_4=0` instead of calculating it. # * (`"i2"` by default) OpenMP pragma acts on which loop (assumes `i2` is outermost and `i0` is innermost loop). For axisymmetric or near-axisymmetric calculations, `"i1"` may be *significantly* faster. # # Also to enable parallel C-code kernel generation, the NRPy+ environment is pickled and returned. # In[10]: def add_psi4_part_to_Cfunction_dict(includes=None, rel_path_to_Cparams=os.path.join("."), whichpart=0, setPsi4tozero=False, OMP_pragma_on="i2"): starttime = print_msg_with_timing("psi4, part " + str(whichpart), msg="Ccodegen", startstop="start") # Set up the C function for psi4 if includes is None: includes = [] includes += ["NRPy_function_prototypes.h"] desc = "Compute psi4 at all interior gridpoints, part " + str(whichpart) name = "psi4_part" + str(whichpart) params = """const paramstruct *restrict params, const REAL *restrict in_gfs, REAL *restrict xx[3], REAL *restrict aux_gfs""" body = "" gri.register_gridfunctions("AUX", ["psi4_part" + str(whichpart) + "re", "psi4_part" + str(whichpart) + "im"]) FD_outCparams = "outCverbose=False,enable_SIMD=False,CSE_sorting=none" if not setPsi4tozero: # Set the body of the function # First compute the symbolic expressions psi4.Psi4(specify_tetrad=False) # We really don't want to store these "Cparameters" permanently; they'll be set via function call... # so we make a copy of the original par.glb_Cparams_list (sans tetrad vectors) and restore it below Cparams_list_orig = par.glb_Cparams_list.copy() par.Cparameters("REAL", __name__, ["mre4U0", "mre4U1", "mre4U2", "mre4U3"], [0, 0, 0, 0]) par.Cparameters("REAL", __name__, ["mim4U0", "mim4U1", "mim4U2", "mim4U3"], [0, 0, 0, 0]) par.Cparameters("REAL", __name__, ["n4U0", "n4U1", "n4U2", "n4U3"], [0, 0, 0, 0]) body += """ REAL mre4U0,mre4U1,mre4U2,mre4U3,mim4U0,mim4U1,mim4U2,mim4U3,n4U0,n4U1,n4U2,n4U3; psi4_tetrad(params, in_gfs[IDX4S(CFGF, i0,i1,i2)], in_gfs[IDX4S(HDD00GF, i0,i1,i2)], in_gfs[IDX4S(HDD01GF, i0,i1,i2)], in_gfs[IDX4S(HDD02GF, i0,i1,i2)], in_gfs[IDX4S(HDD11GF, i0,i1,i2)], in_gfs[IDX4S(HDD12GF, i0,i1,i2)], in_gfs[IDX4S(HDD22GF, i0,i1,i2)], &mre4U0,&mre4U1,&mre4U2,&mre4U3,&mim4U0,&mim4U1,&mim4U2,&mim4U3,&n4U0,&n4U1,&n4U2,&n4U3, xx, i0,i1,i2); """ body += "REAL xCart_rel_to_globalgrid_center[3];\n" body += "xx_to_Cart(params, xx, i0, i1, i2, xCart_rel_to_globalgrid_center);\n" body += "int ignore_Cart_to_i0i1i2[3]; REAL xx_rel_to_globalgridorigin[3];\n" body += "Cart_to_xx_and_nearest_i0i1i2_global_grid_center(params, xCart_rel_to_globalgrid_center,xx_rel_to_globalgridorigin,ignore_Cart_to_i0i1i2);\n" for i in range(3): body += "const REAL xx" + str(i) + "=xx_rel_to_globalgridorigin[" + str(i) + "];\n" body += fin.FD_outputC("returnstring", [lhrh(lhs=gri.gfaccess("in_gfs", "psi4_part" + str(whichpart) + "re"), rhs=psi4.psi4_re_pt[whichpart]), lhrh(lhs=gri.gfaccess("in_gfs", "psi4_part" + str(whichpart) + "im"), rhs=psi4.psi4_im_pt[whichpart])], params=FD_outCparams) par.glb_Cparams_list = Cparams_list_orig.copy() elif setPsi4tozero: body += fin.FD_outputC("returnstring", [lhrh(lhs=gri.gfaccess("in_gfs", "psi4_part" + str(whichpart) + "re"), rhs=sp.sympify(0)), lhrh(lhs=gri.gfaccess("in_gfs", "psi4_part" + str(whichpart) + "im"), rhs=sp.sympify(0))], params=FD_outCparams) enable_SIMD = False enable_rfm_precompute = False print_msg_with_timing("psi4, part " + str(whichpart), msg="Ccodegen", startstop="stop", starttime=starttime) add_to_Cfunction_dict( includes=includes, desc=desc, name=name, params=params, body=body, loopopts=get_loopopts("InteriorPoints", enable_SIMD, enable_rfm_precompute, OMP_pragma_on, enable_xxs=False), rel_path_to_Cparams=rel_path_to_Cparams) return pickle_NRPy_env() # # # # Step 6.b: `psi4_tetrad()`: Register C function for evaluating Weyl scalar $\psi_4$ tetrad \[Back to [top](#toc)\] # $$\label{psi4_tetrad}$$ # # Computing $\psi_4$ requires that an observer tetrad be specified. We adopt a "quasi-Kinnersley tetrad" as described in [the corresponding NRPy+ tutorial notebook](Tutorial-Psi4_tetrads.ipynb). # # `add_psi4_tetrad_to_Cfunction_dict()` supports the following features # # * (disabled by default) "setPsi4tozero", which effectively turns this into a dummy function -- for when $\psi_4$ is not needed, and it's easier to just set `psi_4=0` instead of calculating it. # # Also to enable parallel C-code kernel generation, the NRPy+ environment is pickled and returned. # In[11]: def add_psi4_tetrad_to_Cfunction_dict(includes=None, rel_path_to_Cparams=os.path.join("."), setPsi4tozero=False): starttime = print_msg_with_timing("psi4 tetrads", msg="Ccodegen", startstop="start") # Set up the C function for BSSN basis transformations desc = "Compute tetrad for psi4" name = "psi4_tetrad" # First set up the symbolic expressions (RHSs) and their names (LHSs) psi4tet.Psi4_tetrads() list_of_varnames = [] list_of_symbvars = [] for i in range(4): list_of_varnames.append("*mre4U" + str(i)) list_of_symbvars.append(psi4tet.mre4U[i]) for i in range(4): list_of_varnames.append("*mim4U" + str(i)) list_of_symbvars.append(psi4tet.mim4U[i]) for i in range(4): list_of_varnames.append("*n4U" + str(i)) list_of_symbvars.append(psi4tet.n4U[i]) paramsindent = " " params = """const paramstruct *restrict params,\n""" + paramsindent list_of_metricvarnames = ["cf"] for i in range(3): for j in range(i, 3): list_of_metricvarnames.append("hDD" + str(i) + str(j)) for var in list_of_metricvarnames: params += "const REAL " + var + "," params += "\n" + paramsindent for var in list_of_varnames: params += "REAL " + var + "," params += "\n" + paramsindent + "REAL *restrict xx[3], const int i0,const int i1,const int i2" # Set the body of the function body = "" outCparams = "includebraces=False,outCverbose=False,enable_SIMD=False,preindent=1" if not setPsi4tozero: for i in range(3): body += " const REAL xx" + str(i) + " = xx[" + str(i) + "][i" + str(i) + "];\n" body += " // Compute tetrads:\n" body += " {\n" # Sort the lhss list alphabetically, and rhss to match: lhss, rhss = [list(x) for x in zip(*sorted(zip(list_of_varnames, list_of_symbvars), key=lambda pair: pair[0]))] body += outputC(rhss, lhss, filename="returnstring", params=outCparams) body += " }\n" elif setPsi4tozero: body += "return;\n" loopopts = "" print_msg_with_timing("psi4 tetrads", msg="Ccodegen", startstop="stop", starttime=starttime) add_to_Cfunction_dict( includes=includes, desc=desc, name=name, params=params, body=body, loopopts=loopopts, rel_path_to_Cparams=rel_path_to_Cparams) return pickle_NRPy_env() # # # # Step 6.c: `SpinWeight_minus2_SphHarmonics()`: Register C function for evaluating spin-weight $s=-2$ spherical harmonics \[Back to [top](#toc)\] # $$\label{swm2}$$ # # After evaluating $\psi_4$ at all interior gridpoints on a numerical grid, we next decompose $\psi_4$ into spin-weight $s=-2$ spherical harmonics, which are documented in [this NRPy+ tutorial notebook](Tutorial-SpinWeighted_Spherical_Harmonics.ipynb). # # `SpinWeight_minus2_SphHarmonics()` supports the following features # # * (`"8"` by default) `maximum_l`, the maximum $\ell$ mode to output. Symbolic expressions $(\ell,m)$ modes up to and including `maximum_l` will be output. # # Also to enable parallel C-code kernel generation, the NRPy+ environment is pickled and returned. # In[12]: def add_SpinWeight_minus2_SphHarmonics_to_Cfunction_dict(includes=None, rel_path_to_Cparams=os.path.join("."), maximum_l=8): starttime = print_msg_with_timing("Spin-weight s=-2 Spherical Harmonics", msg="Ccodegen", startstop="start") # Set up the C function for computing the spin-weight -2 spherical harmonic at theta,phi: Y_{s=-2, l,m}(theta,phi) prefunc = r"""// Compute at a single point (th,ph) the spin-weight -2 spherical harmonic Y_{s=-2, l,m}(th,ph) // Manual "inline void" of this function results in compilation error with clang. void SpinWeight_minus2_SphHarmonics(const int l, const int m, const REAL th, const REAL ph, REAL *reYlmswm2_l_m, REAL *imYlmswm2_l_m) { """ # Construct prefunc: outCparams = "preindent=1,outCfileaccess=a,outCverbose=False,includebraces=False" prefunc += """ switch(l) { """ for l in range(maximum_l + 1): # Output values up to and including l=8. prefunc += " case " + str(l) + ":\n" prefunc += " switch(m) {\n" for m in range(-l, l + 1): prefunc += " case " + str(m) + ":\n" prefunc += " {\n" Y_m2_lm = SWm2SH.Y(-2, l, m, SWm2SH.th, SWm2SH.ph) prefunc += outputC([sp.re(Y_m2_lm), sp.im(Y_m2_lm)], ["*reYlmswm2_l_m", "*imYlmswm2_l_m"], "returnstring", outCparams) prefunc += " }\n" prefunc += " return;\n" prefunc += " } // END switch(m)\n" prefunc += " } // END switch(l)\n" prefunc += r""" fprintf(stderr, "ERROR: SpinWeight_minus2_SphHarmonics handles only l=[0,"""+str(maximum_l)+r"""] and only m=[-l,+l] is defined.\n"); fprintf(stderr, " You chose l=%d and m=%d, which is out of these bounds.\n",l,m); exit(1); } void lowlevel_decompose_psi4_into_swm2_modes(const int Nxx_plus_2NGHOSTS1,const int Nxx_plus_2NGHOSTS2, const REAL dxx1, const REAL dxx2, const REAL curr_time, const REAL R_ext, const REAL *restrict th_array, const REAL *restrict sinth_array, const REAL *restrict ph_array, const REAL *restrict psi4r_at_R_ext, const REAL *restrict psi4i_at_R_ext) { for(int l=2;l<="""+str(maximum_l)+r""";l++) { // The maximum l here is set in Python. for(int m=-l;m<=l;m++) { // Parallelize the integration loop: REAL psi4r_l_m = 0.0; REAL psi4i_l_m = 0.0; #pragma omp parallel for reduction(+:psi4r_l_m,psi4i_l_m) for(int i1=0;i1=0) sprintf(filename,"outpsi4_l%d_m+%d-r%.2f.txt",l,m, (double)R_ext); FILE *outpsi4_l_m; // 0 = n*dt when n=0 is exactly represented in double/long double precision, // so no worries about the result being ~1e-16 in double/ld precision if(curr_time==0) outpsi4_l_m = fopen(filename, "w"); else outpsi4_l_m = fopen(filename, "a"); fprintf(outpsi4_l_m,"%e %.15e %.15e\n", (double)(curr_time), (double)psi4r_l_m,(double)psi4i_l_m); fclose(outpsi4_l_m); } } } """ desc = "" name = "driver__spherlikegrids__psi4_spinweightm2_decomposition" params = r"""const paramstruct *restrict params, REAL *restrict diagnostic_output_gfs, const int *restrict list_of_R_ext_idxs, const int num_of_R_ext_idxs, REAL *restrict xx[3],void xx_to_Cart(const paramstruct *restrict params, REAL *restrict xx[3],const int i0,const int i1,const int i2, REAL xCart[3])""" body = r""" // Step 1: Allocate memory for 2D arrays used to store psi4, theta, sin(theta), and phi. const int sizeof_2Darray = sizeof(REAL)*(Nxx_plus_2NGHOSTS1-2*NGHOSTS)*(Nxx_plus_2NGHOSTS2-2*NGHOSTS); REAL *restrict psi4r_at_R_ext = (REAL *restrict)malloc(sizeof_2Darray); REAL *restrict psi4i_at_R_ext = (REAL *restrict)malloc(sizeof_2Darray); // ... also store theta, sin(theta), and phi to corresponding 1D arrays. REAL *restrict sinth_array = (REAL *restrict)malloc(sizeof(REAL)*(Nxx_plus_2NGHOSTS1-2*NGHOSTS)); REAL *restrict th_array = (REAL *restrict)malloc(sizeof(REAL)*(Nxx_plus_2NGHOSTS1-2*NGHOSTS)); REAL *restrict ph_array = (REAL *restrict)malloc(sizeof(REAL)*(Nxx_plus_2NGHOSTS2-2*NGHOSTS)); // Step 2: Loop over all extraction indices: for(int ii0=0;ii0 # # # Step 7: Confirm above functions are bytecode-identical to those in `BSSN/BSSN_Ccodegen_library.py` \[Back to [top](#toc)\] # $$\label{validation}$$ # In[13]: import BSSN.BSSN_Ccodegen_library as BCL import sys funclist = [("print_msg_with_timing", print_msg_with_timing, BCL.print_msg_with_timing), ("get_loopopts", get_loopopts, BCL.get_loopopts), ("register_stress_energy_source_terms_return_T4UU", register_stress_energy_source_terms_return_T4UU, BCL.register_stress_energy_source_terms_return_T4UU), ("BSSN_RHSs__generate_symbolic_expressions", BSSN_RHSs__generate_symbolic_expressions, BCL.BSSN_RHSs__generate_symbolic_expressions), ("add_rhs_eval_to_Cfunction_dict", add_rhs_eval_to_Cfunction_dict, BCL.add_rhs_eval_to_Cfunction_dict), ("Ricci__generate_symbolic_expressions", Ricci__generate_symbolic_expressions, BCL.Ricci__generate_symbolic_expressions), ("add_Ricci_eval_to_Cfunction_dict", add_Ricci_eval_to_Cfunction_dict, BCL.add_Ricci_eval_to_Cfunction_dict), ("BSSN_constraints__generate_symbolic_expressions", BSSN_constraints__generate_symbolic_expressions, BCL.BSSN_constraints__generate_symbolic_expressions), ("add_BSSN_constraints_to_Cfunction_dict", add_BSSN_constraints_to_Cfunction_dict, BCL.add_BSSN_constraints_to_Cfunction_dict), ("add_enforce_detgammahat_constraint_to_Cfunction_dict", add_enforce_detgammahat_constraint_to_Cfunction_dict, BCL.add_enforce_detgammahat_constraint_to_Cfunction_dict), ("add_psi4_part_to_Cfunction_dict", add_psi4_part_to_Cfunction_dict, BCL.add_psi4_part_to_Cfunction_dict), ("add_psi4_tetrad_to_Cfunction_dict", add_psi4_tetrad_to_Cfunction_dict, BCL.add_psi4_tetrad_to_Cfunction_dict), ("add_SpinWeight_minus2_SphHarmonics_to_Cfunction_dict", add_SpinWeight_minus2_SphHarmonics_to_Cfunction_dict, BCL.add_SpinWeight_minus2_SphHarmonics_to_Cfunction_dict) ] if sys.version_info.major >= 3: import inspect, re for func in funclist: # https://stackoverflow.com/questions/20059011/check-if-two-python-functions-are-equal # remove line continuations s1 = re.sub(r'\s*\\\n',' ',inspect.getsource(func[1])) s2 = re.sub(r'\s*\\\n',' ',inspect.getsource(func[2])) if s1 != s2: print("inspect.getsource(func[1]):") print(s1) with open('s1.txt','w') as fd: print(s1,file=fd) with open('s2.txt','w') as fd: print(s2,file=fd) print("inspect.getsource(func[2]):") print(s2) print("ERROR: function " + func[0] + " is not the same as the Ccodegen_library version!") sys.exit(1) print("PASS! ALL FUNCTIONS ARE IDENTICAL") else: print("SORRY CANNOT CHECK FUNCTION IDENTITY WITH PYTHON 2. PLEASE update your Python installation.") # # # # Step 8: Output this notebook to $\LaTeX$-formatted PDF file \[Back to [top](#toc)\] # $$\label{latex_pdf_output}$$ # # The following code cell converts this Jupyter notebook into a proper, clickable $\LaTeX$-formatted PDF file. After the cell is successfully run, the generated PDF may be found in the root NRPy+ tutorial directory, with filename # [Tutorial-BSSN_time_evolution-C_codegen_library.pdf](Tutorial-BSSN_time_evolution-C_codegen_library.pdf) (Note that clicking on this link may not work; you may need to open the PDF file through another means.) # In[14]: import cmdline_helper as cmd # NRPy+: Multi-platform Python command-line interface cmd.output_Jupyter_notebook_to_LaTeXed_PDF("Tutorial-BSSN_time_evolution-C_codegen_library")