Modified Oregonator: an Approach from the Complex Networks Theory Oregonador Modificado: un Enfoque desde la Teoría de Redes Complejas

Within the framework of Systems Biology, this paper proposes the complex network theory as a fundamental tool for determining the most critical dynamic variables in complex biochemical mechanisms. The Belousov-Zhabotinsky reaction is proposed as a study model and as a complex bipartite network. By determining the structural property authority, the most relevant dynamic variables are specified, and a mathematical model of the Belousov-Zhabotinsky reaction is obtained. The bidirectional coupling of the proposed model was made with other models associated with biological processes, finding synchronization phenomena when varying the coupling parameter. The time series obtained from the numerical solution of the coupled models were used to construct their images using the Gramian Angular Field technique. In the end, a supervised learning tool is proposed for the classification of the type of coupling by analyzing the images, obtaining score percentages above 94%. The hereby proposed methodology could be extended to the experimental field in order to determine anomalies in the coupling and synchronization of different physiological oscillators.


INTRODUCTION
It is possible to study any physiological process through an intricate network of biochemical reaction mechanisms from the systems biology paradigm [1] [2] [3] [4] , such mechanisms are responsible for regulating a wide variety of processes of utmost importance for life, through complex positive and negative feedback systems [5] [6] . A prominent example of negative feedback is the thyroid hormone regulatory mechanism [7] carried out by either thyroid cells or the leptin-insulin axis [8] , which is strongly related to metabolic processes. Discrepancies in these feedback mechanisms can lead to multiple pathologies at the systemic level [7] . With the development of experimental tools for studying such complex systems, the need to understand the underlying dynamics of these regulatory mechanisms also arose, so biochemists undertook the task of studying the chemistry of these mechanisms to determine the kinetic parameters of the reactions participating in these processes. Meanwhile, biophysicists had to translate these biochemical processes to mathematical models using tools based on Dynamical Systems Theory (DST) and the discoveries made by biochemists [9] [10] [11] .
B. Belousov was a Russian biophysicist who pioneered in the study of complex regulatory mechanisms present in biochemical processes at an experimental level. In 1950 he was given the task of proposing a reaction mechanism analogous to the Krebs cycle to study feedback processes [12] [13] [14] . Belousov conceived a chemical mixture made essentially of citric acid, bromate ions, and cerium ions in an acid medium under constant agitation [13] . During the reaction, Belousov observed that the mixture changed from being transparent to a yellow hue and vice versa. The phenomenon occurred time after time, being reminiscent of the Krebs cycle feedback processes.
Nevertheless, his work was never officially published because reviewers concluded that the process was caused by mixture impurities, and that the phenome-non violated the natural laws of thermodynamics [13] [14] . Years later, Prigogine defined the basis of thermodynamics of irreversible processes [12] [13] [14] .
Later on, another Russian biophysicist, A. Zhabotisky resumed Belousov's work. He replaced citric acid with malonic acid and cerium ions with iron ions to visualize chemical species concentration-oscillations obtaining a solution that shifted from red to blue and vice versa [12] . This chemical mechanism was thus named the Belousov-Zhabotinsky (BZ) reaction [12] [13] [14] . The work of Belousov and Zhabotinsky set the basis for the study of oscillatory biochemical processes [14] .
In 1972, Field, Köros and Noyes proposed the first mathematical model in differential equations that described the underlying dynamics of the BZ reaction (FKN Model) and laid the groundwork for mathematical modeling of chemical mechanisms with oscillatory behaviors [15] [16] . This opened a vast field of study for the development of mathematical models in the area of systems biology [17] . The FKN model is extremely robust; however, it considers many chemical species as dynamic variables of the system, which makes it difficult to handle from an analytical point of view [15] [ 16] . To solve this problem, in 1974, Field and Noyes proposed a much simpler mechanism to describe reaction dynamics based on the FKN model, also known as Oregonator because it was created at the University of Oregon. However, said mathematical model was constructed from a reaction mechanism that consists of only five irreversible chemical reactions and that describes the BZ reaction qualitatively [18] . Years later, in 1990, Györgyi, Turányi, and Field proposed a detailed mechanism of the BZ reaction, that consisted of 80 chemical reactions and 27 chemical species and which is currently the most accepted reaction mechanism [19] . The BZ mechanism has two subsets, the first consists of a set of inorganic chemical reactions while the second one consists of organic chemical reactions. Just one year later, in his seminal work A history of chemical oscillations and waves, Zhabotinsky presented a mathematical model in differential equations based on a subset of chemical reactions from the work of Györgyi et al., and mentioned that it is possible to explain BZ reaction dynamics with said subset [12] . At this point, the following questions arise: How reliable and valid is to reduce an extremely complex reaction mechanism to a subset of reactions?
What criteria should be used to make this reduction?
Could reducing the mechanism eliminate important system information? Is it possible to use mathematical tools to carry out this reduction properly without losing information? And if this is possible, can these mathematical techniques be used to study biochemical regulatory mechanisms to identify the most critical system variables?
To answer these questions, it is necessary to understand that DST is not the only mathematical tool used to study the complex mechanisms of biochemical regulation [20] . There are tools such as models of agents and cellular automata that have been used to explain at least qualitatively the phenomena that emerge from different physiological processes [20] . However, the tool that has attracted the most attention today is the Complex Networks Theory (CNT) based on graph theory [21] [22] . The CNT has been mainly developed by physicists, with the pioneering works of Barabasi et al., [23] . This theory has allowed us to understand the emergence of extremely complex behaviors in systems that involve a large number of variables because, through statistics, it enables us to study the structural properties of the networks to identify the variables or entities with greater relevance in the system of interest [24] . Multiple measures of centrality can be used to determine the level of importance of a variable in a complex network, such as degree, clustering, hub, authority, etc., [25] . In Biomedical research, CNT is widely used to study many phenomena including disease propagation, genetic regulation networks, protein-protein interaction networks, and identification of possible therapeutic targets in complex biochemical reaction mechanisms [25] . In their excellent work, Costa et al. describe the CNT as a vital tool for Systems Biology [26] .
It is well known that the emergence of chronic degenerative diseases (such as insulin resistance, diabetes mellitus II, cancer, cardiovascular diseases among others) result from the mismatch of a wide variety of physiological processes that are in turn regulated by an intricate network of biochemical reactions, which makes it difficult to study the interaction, coupling, and activation that may exist between different physiological processes [27] [28] [29] 30] . The CNT could facilitate the study of different biochemical processes related to each other taking these processes as complex networks where the participating chemical species can be considered nodes or vertices, and all possible physicochemical interactions between them as links or edges. It is possible to identify the chemical species with greater relevance by determining network centrality properties, and applying the standard chemical kinetics techniques (CK) [31] gives rise to mathematical models in differential equations that facilitate the study of coupling and synchronization phenomena between different physiological processes and their possible relationship with various pathologies (see Figure 1).
It is necessary to identify when a group of physiological processes is coupled or synchronized, which requires tools that permit identifying such phenomena. Thanks to the development of machine learning, it is possible to create models capable of learning to recognize or identify a series of patterns with high precision through the acquisition of experience (data) [32] . Machine learning can be divided in supervised and unsupervised learning and can be further classified into different combinations of these [32] [33] . In general, supervised learning consists of providing a series of input data with their respective label to a model so that the model is generalized and subsequently allows the label to be predicted knowing only the input data or characteristics [32] [33] . On the other hand, unsupervised learning consists of providing only the input data or characteristics to the model without its label, to identify specific patterns of information [32] [33] . Each supervised learning model must go through a training, validation, and testing process to ensure generalization [32] [33] . Supervised learning models can be divided into two large groups: classification models and regression models; in the former, the variable is to be predicted as a qualitative or categorical variable. In regression models, the variable to be predicted is a quantitative variable, either continuous or discrete [32] [ 33] . Hereby, the following question arises: Is it possible to use a supervised learning model to predict the coupling or synchronization of biochemical processes represented by systems of differential equations?
To answer this question, it is necessary to ask another one: How can data be provided to the supervised learning model to achieve the prediction of a coupling state? The answer is that by modeling biochemical processes as systems of differential equations and coupling them unidirectionally or bidirectionally, it is possible to numerically solve these models, from which the change in the concentration of chemical species with respect to time is obtained. Therefore, it is possible to determine if, for any value of the coupling parameter, these chemical species are synchronized. The degree of synchronization can be determined by evaluating some nonlinear metric, such as the fractal dimension of the time series, evaluation of Lyapunov exponents, entropy, the study of the synchronization variety in the phase space of the variables under study, etc. [35] [36] [37] [38] , then it is possible to use some of these metrics to determine, through a supervised learning model, whether these systems are coupled or not. It is also possible to obtain images from the numerical solution of the systems of differential equations; such is the case of recurrence diagrams [39]  However, they do not describe the actual biochemical process. For the construction of the network diagrams, the Gephi software was used (https://gephi.org/) [34] . [40] or Gramian angular field images (gaf ) [40] , which can be used with a supervised learning model to determine if a group of variables, in this case, chemical species related to important biochemical processes, are coupled or not, and thereafter, to associate a possible state of synchronization with various pathologies.
Currently, it is possible to monitor blood concentration of a variety of chemical species related to critical metabolic processes, which are associated with different pathologies [41] [44] . On the other hand, using mathematical models that represent the dynamics of these variables and using their numerical solution to obtain their respective gaf images to train, validate and test the supervised learning model, offers a solution to this problem. The ability of each mathematical model to represent a biochemical process depends on its degree of complexity, i.e., whether the model involves not only metabolic biochemical processes but also epigenetic processes [45] .
To analyze the possibility of using CNT in biochemical regulatory mechanisms for identifying the most critical variables of the system and building a system of differential equations that model it, the BZ reaction was used as a study model since it is currently the most complex oscillating reaction discovered so far and it has been used to study the biochemical feedback mechanisms present in the Krebs cycle [12] [19] . On the other hand, to emulate the effects of coupling and synchronization, the obtained model was bidirectionally coupled with other models associated with oscillatory biochemical processes, and its numerical solution was used to obtain the gaf images.
Finally, a supervised learning tool was used to identify the type of models coupled using the gaf images.

MATERIALS AND METHODS
For this study, the general mechanism of the BZ reaction proposed by Györgyi et al. [19] was used. A network was built considering two types of nodes or vertices, the "chemical species" type nodes, and the "chemical reaction" type nodes. The link or edge between species and chemical reactions is given by the reaction rate constant of each of the reactions, which gives place to a "bipartite" network [20] [25] . Once the network was built, the structural property authority was evaluated using the algorithm proposed by Kleinberg, which was initially proposed to determine the level of importance and information flow of websites, to reveal the sites with the highest traffic in a virtual hyperlink environment [46] . In this work, the structural property authority was used to determine the importance of each of the chemical species involved in the BZ reaction mechanism. To build a system of nonlinear differential equations capable of describing the BZ reaction mechanism, it was assumed that the most relevant variables have the most significant flow of "chemical" information. Once the most critical chemical species in the reaction mechanism were determined under the authority criteria, the mathematical model was constructed using standard techniques of CK [31] . When the model was obtained, the effects of synchronization that could emerge due to its coupling with different chemical oscillators were studied to emulate the synchronization processes present in different biochemical systems in mammals [47] . To this end, a bidirectional coupling was carried out with three different models [48] . The first one was an identical model, the second one was the model proposed by Levefer also called Brusselator [49] [50] , which shows interesting autocatalytic processes and, lastly, the model proposed by Selkov that describes the oscillatory behaviors present in glycolysis [51] [52] . For each case, the coupling parameter was varied and the numerical solution of the systems of coupled differential equations was obtained using the standard fourth-order Runge Kutta technique [53] . Once the time series of the numerical solu-tion of the differential equations were obtained, the image of the time series of the bidirectionally coupled variables was constructed using the gaf technique [54] , which was described extensively by Wang et al [55] . In where t i corresponds to the time interval between each value of the time series and N is a constant factor [55] . Lastly, the addition/subtraction between each point is determined to identify possible temporal correlations within different time intervals [55] . It is possible to obtain two types of gaf, the sum (gasf= cos (φ i +φ j )) and the subtraction (gadf= sin (φ i +φ j )). These values are used to construct the Grammar matrix that is used to obtain the image (see Figure 2) [55] . This technique in combination with supervised learning tools has been used to study time series of electroencephalograms [56] , electrocardiograms [57] , signals obtained from biosensors [58] , etc. After the gasf and gadf images were obtained, the supervised learning model was generated using the Orange Data Mining software (https://orange.biolab. si/) [58] [59] , which is a visual toolbox where it is possible to build workflows that allow the use of widgets for analysis and data processing, supervised and unsupervised learning tools, data visualization and model evaluation. It also has extensions for text mining, spectroscopy, complex network analysis, time series, bioinformatics, and image analysis [59] [60] [61] . The transfer learning technique, which is an artificial intelligence technique that consists of pre-training a model with an extensive database and the experience gained from said training to apply it to another problem that may be completely different, was used to process the gaf images [61] [62] [63] . This technique is used in image processing as follows: deep convolutional neural networks (CNN) are used, which are pre-trained with a large number of images of all kinds, later, activations of the penultimate layer of the model (CNN codes) are used to represent the images with vectors (embedded), i.e., CNN is used as a feature extractor or descriptor, allowing supervised learning models to be used and thus obtaining high precision values in image classification [61] [62] [63] . In Orange, it is possible to embed images using different CNNs, including Google's Inception V3 neural network that has been pre-trained with the ImageNet database consisting of 1.2 million images. The neural network has 2048 nodes in its penultimate layer, so each image represents it with a vector of dimension 2048 [33] [64] (CNN by default in Orange). On the other hand, embedding images can also be done with CNN SqueezeNet, which is a much simpler network than Inception V3; nevertheless, it achieves a precision close to that of CNN AlexNet on the ImageNet database, this CNN represents each image as a vector of dimension 1000 [65] . It is also possible to embed with CNN's VGG16 and VGG19 proposed by the Visual Geometry Laboratory of the University of Oxford [66] , in the same way, pretrained with the ImageNet database, the CNN DeepLoc pre-trained with 21882 images [67] and CNN Painters, a pre-trained network with 79433 images [59] [60] [61] . For embedding the images, Orange sends them to an external server, except for CNN SquezeeNet, which is done locally, making embedding faster. In this work, the gaf images were embedded using CNN Google Inception V3 and SquezeeNet. Subsequently, a super- After embedding the images with the CNN, the same six evaluation techniques were used to study the effect of reducing dimensions and to avoid overfitting [33] [68] [69] . For both procedures the classifier confusion matrix was obtained to determine the classification metrics: Classification Accuracy (CA), Precision (P), Recall (R), and F1-score (F1) [33] [70] . and one between 0.9 and 1 (0.93%). This last one is the most important node, considering the authority centrality criterion. In the histogram, it is easy to distinguish that there are few nodes with a high authority value, making it possible to identify the chemical species with greater importance, possibly due to powerlaw-like behavior [71] . Power laws are present in a wide variety of biological phenomena and are related to processes of a universal nature [72] .

RESULTS AND DISCUSSION
As expected, the chemical node or species with higher authority is the H + cation because the chemical reaction at the experimental level is carried out in the presence of sulfuric acid (H 2 SO 4 ), which is a source of H + and allows the formation of bromous acid (HBrO 2 ), an essential variable for the chemical feedback mechanism that enables the existence of periodic behaviors [12] [13] [14] . Furthermore, we can note that H 2 O is the second chemical species with higher authority since the precursor solutions for the chemical reaction use deionized water as a solvent [12] . The third most crucial chemical species is the carboxy radical (*COOH) inasmuch as the breakdown of malonic acid as a precursor agent results in short-lived chemical species [12] [19] . The presence of Br* radicals is mainly because during the progress of the reaction molecular bromine (Br 2 ) occurs due to the bromate anion precursor agent (BrO 3 ), said Br 2 results in the formation of bromide ions (Br -) and subsequently to Br* radicals, which are fundamental chemical species for the feedback mechanism [12] [19] . Because the chemical reaction includes organic components, specifically malonic acid and its derivatives, it has carboxyl or carbonic acid groups that can be decomposed into carbon dioxide (CO 2 ), which is seen in the form of gas bubbles during the reaction [12] [13] [14] .  [34] .
Lastly, we can see that the oxidized form of the Cerium metal (Ce 4+ ) is also among the ten nodes with the highest authority value, this results from said chemical species acting as a catalytic agent in the chemical reaction and that alongside its reduced form (Ce 3+ ), it gives rise to the typical color changes of the BZ reaction [12] [18] . With these results, it is possible to observe that applying CNT to the BZ reaction mechanism proposed by Györgyi et al. [19] permits identifying the chemical species with greater relevance under the criterion of authority. Such chemical species appear naturally in the mechanism described by Zhabotinsky, either in its ionic form or in the form of a precursor compound [12] , so it is possible to approximate the BZ reaction to said subset of reactions. Still, it is essential to mention that it is possible to use other centrality criteria to identify the nodes with greater relevance [25] . The mechanism proposed by Zhabotisnky uses Iron (Fe) as a catalyst instead of Cerium (Ce) [12] . Below is the chemical mechanism studied by Zhabotinsky replacing Fe with Ce. BZ Chemical model [12] 1. H + +HBrO 3  Where the variables X, Y, and Z are the variables that generally describe the chemical mechanism of the BZ reaction [12] . Using the law of mass action, the following system of nonlinear differential equations is obtained.
It models the change of X, Y, and Z with respect to time and as a function of the chemical concentrations of the precursors [12] :  Where k i are the reaction rate constants and F is a stoichiometric factor used as an adjustment parameter [12] [13] . It is possible to simplify the model by dividing the numerator and denominator of the term by k 9 , then doing a geometric series development and neglecting terms of greater order k 8 BZ ≈ k 8 BZ is obtained. Therefore, assuming that the concentration of H + remains constant and that F= ƒ, then the model proposed by Zhabotinsky can be reduced to [14] : The model described above is very similar to that proposed by Field and Noyes [13] [18] , except for the term k 12 B, which is related to the rate at which chemical species derived from malonic acid, specifically CHBr(COOH) 2 by its decomposition, it leads to the formation of new Brions, which cannot be neglected from the mathematical model because they have a strong implication in the feedback process of the BZ reaction [12] [19] since the excessive production of these can lead to complete inhibition of oscillations. Also, these ions actively participate in the production of HBrO 2 , which in turn facilitates the process of changing the oxidation state of the Ce catalyst.

Where
However, as in the model proposed by Field and Noyes, the parameter ε', is very small compared to ε [13] [18] ; therefore, it is possible to consider that the variable �, remains in a stationary state, so the system of Equations 4 can be reduced to: Amemiya et al. [73] , and Krug et al. [74] , associated the term to the sensitivity of the BZ reaction to the presence of oxygen and its photosensitivity. In contrast, we associate this term with the decomposition of the derivatives of the malonic acid that give rise to Brreduction. The chemical mechanism proposed by Györgyi et al. was used as a starting point [19] identifying the most relevant chemical species through the CNT under the authority criterion, which approximates the reaction mechanism BZ, to the subset of reactions proposed by Zhabotisky [12] .
To emulate the synchronization processes present in different physiological systems, the mathematical model obtained in this work was coupled with different oscillators, all of them dimensionless. The parameters of the models were selected according to the stability criteria to ensure the presence of oscillations [13] [18] . The coupling was linear bidirectional in all cases, and the coupling force is modulated by parame-

Equations 8
In Figure 4a, the numerical solution of the system of Equations 6 is shown. A full synchronization can be seen between variables u 1 and u 2 , i.e., the oscillation rhythms are completely coupled [48] [75] . This type of behavior is common in biochemical processes carried out by cells that share the same microenvironment and can be stimulated either by a chemical or physical mechanism [76] [77] . On the other hand, in Figure 4b the gasf image of the variable u 1 can be seen, while in Figure 4c, the gadf image of the variable u 2 is shown.
A characteristic pattern of bidirectional coupling between identical oscillators with periodic behavior can be seen in both figures.
In Figure 5a, the numerical solution for the system of Equations 7 is shown. Almost complete synchronization can be seen between the variables u 1 , w 1 and u 2 [48] [75] . Anyway, this system of coupled differential equations is very sensitive to the value of the coupling parameter. Also, Figure 5b shows the gasf image of the variable u 1 , while Figure 5c shows the gadf image of the variable u 2 . In both figures, a characteristic pattern for a quasi-periodic system can be seen [78] . In Figure 6a, the numerical solution of the system of Equations 8 is shown. The phase synchronization between the variables u 1 , u 2 and w 1 [48] [75] can be seen.
The NOM model oscillation rhythm causes variable u 2 to enter in-phase synchronization; however, the variable w 2 decays completely to zero. In the Selkov model, the variable u 2 is related to the concentration of ATP (adenosine triphosphate), while the variable w 2 is related to the concentration of ADP (adenosine diphosphate) [51 [52] , so when said system is coupled with the NOM model, the synchronization process that could exist between the biochemical mechanism of glycolysis and the Krebs cycle is emulated.   Mismatches in the oscillation rhythms of physiological processes can lead to a wide variety of metabolic problems [7] , so it is essential to know the type of coupling that can exist between oscillators (unidirectional, bidirectional, linear, nonlinear, etc., [48] .) and whether these are coupled or not at any given time.
To determine coupled oscillators type, the coupling parameter was varied between one and ten for the system of Equations 6, between one and ten for the system of Equations 7 and lastly, between 0.1 and 1 for the system of Equations 8 (because this system is highly sensitive to bidirectional coupling), to generate the images of the time series using the gaf technique.
Once obtained, the images were embedded to subsequently train the supervised learning model using the procedure described in the materials and methods section (see Figure 7). Figure 8 shows the general Orange workflow for this procedure. In the supplementary material (Tables TS-1 Tables ST-1). Likewise, it is possible to observe that there are no major differences between the results obtained by CNN's Inception V3 and SquezeeNet.
Godec et al., use transfer learning to embed biomedical images and mention that this technique allows them to obtain high precision values in classification models using small databases [61] . Godec et al., also found no major differences in the CA and F1 values of the logistic regression classifier when using either the CNN Inception V3 or the CNN SquezeeNet to embed the images [61] .
It is worth mentioning that sometimes when the database used for training supervised learning models has a higher number of characteristics or descriptors compared to the sample size (as is often the case in biomedical databases), or is unbalanced, i.e., there is a greater amount of data from one class than from the rest, it is possible to overfit the model, leading to erroneous results [ [83] . Therefore, we have also implemented a PCA after embedding the images with CNN's to study the effect of the reduction of dimensions in the classification of the type of coupled oscillators, using the same models of supervised learning and the same evaluation methods.
When performing the PCA implementation after embedding the images with CNN SquezeeNet using nine main components, which explain 95% of the total variance, the logistic regression model obtains the  one hyperparameter, which is used as a regularization or penalization [33] [67] [84] [85] . Therefore, logistic regression can be used as a classification model for the type of coupled oscillators. Table 2 shows the classification metrics for the logistic regression model for each of the evaluation methods, using descriptors obtained directly from CNN's as training data.  [61] .     embedding the images will depend on whether, as users, we want our images to be sent to an external server for embedding. For privacy and security reasons, we prefer them to be embedded locally [61] .

CONCLUSIONS
In the framework of Systems Biology, mathematical modeling of biochemical mechanisms involved in different physiological processes is of vital importance because it allows us to understand the non-linear dynamics that underlie these phenomena. This is why the use of mathematical tools and computational systems for the analysis of the complex feedback mechanisms present in living systems is necessary. The CNT is a mathematical tool that allows studying these mechanisms with a holistic approach and provides valuable information on each of the entities that make up the system [26] [87] .
When determining the authority structural property of the complex network obtained from the BZ reaction mechanism proposed by Györgyi et al., and using it as the centrality criterion, the variables with the highest relevance were identified, i.e., those chemical species that have the greatest flow of information and that could participate in the emergence of collective properties of the system. Identification of these variables led to the construction of a nonlinear system of differential equations similar to the reduction of the FKN model proposed by Field and Noyes (Oregonator) and which also explains the phenomenology of the BZ reaction.
Hence, this result answers the question of using mathematical tools to reduce complex reaction mechanisms without losing generality. Therefore, it is possible to use this methodology in the study of nonlinear dynamics present in biochemical and physiological processes.
On top of that, by applying this methodology to biological systems, it is possible to translate any biochemical or physiological process to a mathematical model and study the phenomena of synchronization between different regulatory mechanisms [88]