Optimal Prediction in Petroleum Geology
by Guangren Shi
1 Introduction

In recent years, regression and classification methods have seen enormous success in many fields of business and science, but their application to petroleum geology (PG) is still at an initial stage. This is because PG differs from other fields: its data are of miscellaneous types, huge quantity and varied measuring precision, and its results carry many uncertainties. Up to now, the most popular methods employed for nonlinear PG problems are the following three regression methods and three classification methods [1]. The three regression methods are the regression of support vector machine (R-SVM), the back-propagation neural network (BPNN), and multiple regression analysis (MRA); the three classification methods are the classification of support vector machine (C-SVM), the naïve Bayesian (NBAY), and the Bayesian successive discrimination (BAYSD). However, when these six methods are applied to a real-world problem, they often produce mixed results with varied success rates, owing to the problem's nonlinearity degree and the different solution accuracies of the different methods. The purpose of this paper, therefore, is to show how to select a proper method for a real problem, i.e. a proposed method optimization. For intuitive illustration, eight case studies are demonstrated, the first two for the classification-regression problem and the other six for the classification problem.

2 Regression and Classification Methods

The aforementioned regression and classification methods share the same sample data. The essential difference between the two types of methods is that the output of a regression method is a real-type value that in general differs from the real number given in the corresponding learning sample, whereas the output of a classification method is an integer-type value that must be one of the integers defined in the learning samples. In data-science terms, the integer-type value is called a discrete attribute, while the real-type value is called a continuous attribute. The six methods (R-SVM, BPNN, MRA, C-SVM, NBAY, BAYSD) use the same known parameters and share the same unknown to be predicted; they differ only in approach and in calculation results. Assume that there are n learning samples, each associated with m+1 numbers (x1, x2, …, xm, y*) and a set of observed values (xi1, xi2, …, xim, yi*), with i=1, 2, …, n. In principle n>m, but in actual practice n>>m. The n samples associated with m+1 numbers are defined as n vectors:

xi=(xi1, xi2, …, xim, yi*)  (i=1, 2, …, n)    (1)

where n is the number of learning samples; m is the number of independent variables in samples; xi is the ith learning sample vector; xij is the value of the jth independent variable in the ith learning sample, j=1, 2, …, m; and yi* is the observed value of the ith learning sample. Equation 1 is the expression of learning samples. Let x0 be the general form of a vector (xi1, xi2, …, xim). The principles of BPNN, MRA, NBAY and BAYSD are the same, i.e. to try to construct an expression, y=y(x0), such that Eq. 2 is minimized. Certainly, these four methods use different approaches and obtain calculation results of differing accuracies.

∑(i=1 to n) [y(x0i) − yi*]² → min    (2)

where y(x0i) is the calculation result of the dependent variable in the ith learning sample; and the other symbols have been defined in Eq. 1.
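To make the sample notation of Eqs. 1 and 2 concrete, the following minimal sketch (with assumed toy data, not values from any case study) lays out the learning samples as an n×m array and evaluates the least-squares objective that BPNN, MRA, NBAY and BAYSD aim to minimize.

```python
import numpy as np

n, m = 10, 3                        # n learning samples, m independent variables
X = np.random.rand(n, m)            # xij: the j-th independent variable of the i-th sample
y_star = np.random.rand(n)          # yi*: observed values of the learning samples

def objective(y_fn):
    """Sum of squared differences between calculated and observed values (Eq. 2)."""
    y_calc = np.array([y_fn(x0) for x0 in X])
    return np.sum((y_calc - y_star) ** 2)

# Any candidate fitting formula y = y(x0) can be scored this way.
print(objective(lambda x0: x0.mean()))
```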
However, the principle of the R-SVM and C-SVM methods is also to construct an expression, y=y(x0), but such that the margin based on the support vector points is maximized, so as to obtain the optimal separating line. This y=y(x0) is called the fitting formula obtained in the learning process. The fitting formulas of different methods are different. In this paper, y is defined as a single variable. The workflow is as follows: the 1st step is the learning process, using the n learning samples to obtain a fitting formula; the 2nd step is the learning validation, substituting the n learning samples (xi1, xi2, …, xim) into the fitting formula to get the prediction values (y1, y2, …, yn), so as to verify the fitness of a method; and the 3rd step is the prediction process, substituting the k prediction samples expressed by Eq. 3 into the fitting formula to get the prediction values (yn+1, yn+2, …, yn+k).

xi=(xi1, xi2, …, xim)  (i=n+1, n+2, …, n+k)    (3)

where k is the number of prediction samples; xi is the ith prediction sample vector; and the other symbols have been defined in Eq. 1. Equation 3 is the expression of prediction samples. Among the six methods, only MRA is a linear method whereas the other five are nonlinear methods; this is because MRA constructs a linear function whereas the other five construct nonlinear functions. To express the calculation accuracies of the prediction variable y for learning and prediction samples when the six methods are used, the following four types of residuals are defined. The absolute relative residual for each sample, R(%), is defined as

R(%) = |(yi − yi*)/yi*| × 100    (4)

where yi is the calculation result of the dependent variable in the ith sample; and the other symbols have been defined in Eqs. 1 and 3. R(%) is the fitting residual expressing the fitness for a sample in the learning or prediction process. It is noted that zero must not be taken as a value of yi*, to avoid floating-point overflow. Therefore, for a regression method, a sample is deleted if its yi*=0; and for a classification method, positive integers are taken as the values of yi*. The mean absolute relative residual for all learning samples, R1(%), is defined as

R1(%) = (1/n) ∑(i=1 to n) Ri(%)    (5)

where all symbols have been defined in Eqs. 1 and 4. R1(%) is the fitting residual expressing the fitness of the learning process. The mean absolute relative residual for all prediction samples, R2(%), is defined as

R2(%) = (1/k) ∑(i=n+1 to n+k) Ri(%)    (6)

where all symbols have been defined in Eqs. 3 and 4. R2(%) is the fitting residual expressing the fitness of the prediction process. The total mean absolute relative residual for all samples, R*(%), is defined as

R*(%) = [1/(n+k)] ∑(i=1 to n+k) Ri(%)    (7)

where all symbols have been defined in Eqs. 1, 3 and 4. If there are no prediction samples (k=0), then R*(%)=R1(%). R*(%) is the fitting residual expressing the fitness of the learning and prediction processes together, and is adopted to express the following two rules.

2.1 Rule 1: Nonlinearity Degree of a Studied Problem

Since MRA is a linear method, its R*(%) for a studied problem expresses the nonlinearity degree of the y=y(x) to be solved, i.e. the nonlinearity degree of the studied problem. This nonlinearity degree can be divided into three classes: weak when R*(%)≤10, moderate when 10<R*(%)≤30, and strong when R*(%)>30.

2.2 Rule 2: Solution Accuracy of a Given Method Application

Whether for the linear method (MRA) or the nonlinear methods (R-SVM, BPNN, C-SVM, NBAY, BAYSD), the R*(%) of a studied problem expresses the accuracy of the y=y(x) obtained by each method, i.e. the solution accuracy of the studied problem solved by each method. This solution accuracy can be divided into three classes: high when R*(%)≤10, moderate when 10<R*(%)≤30, and low when R*(%)>30.
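As a sketch of how Eqs. 4–7 and the two rules can be computed, the helper below is our own assumption rather than the author's code; Eq. 7 is taken here as the mean of R(%) over all n+k samples, which reduces to R1(%) when k=0 as stated above.

```python
import numpy as np

def abs_relative_residual(y, y_star):
    """R(%) of Eq. 4 for each sample; samples with yi* = 0 are assumed removed."""
    return np.abs((y - y_star) / y_star) * 100.0

def residual_summary(y_learn, ystar_learn, y_pred=(), ystar_pred=()):
    """R1(%), R2(%) and R*(%) of Eqs. 5-7."""
    r_learn = abs_relative_residual(np.asarray(y_learn), np.asarray(ystar_learn))
    if len(y_pred) == 0:                          # k = 0, so R*(%) = R1(%)
        return r_learn.mean(), None, r_learn.mean()
    r_pred = abs_relative_residual(np.asarray(y_pred), np.asarray(ystar_pred))
    r_total = np.concatenate([r_learn, r_pred]).mean()
    return r_learn.mean(), r_pred.mean(), r_total

def degree(r_total):
    """Rule 1 (nonlinearity) / Rule 2 (solution accuracy) classes."""
    return ("weak / high" if r_total <= 10 else
            "moderate / moderate" if r_total <= 30 else
            "strong / low")
```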
2.3 Methodology of the Six Methods in the Research

Through the learning process, each method constructs its own function y=y(x). The methodologies of the six methods (R-SVM, BPNN, MRA, C-SVM, NBAY, BAYSD) are described below. It is noted that the y=y(x) created by BPNN is an implicit expression, i.e. it cannot be expressed as a usual mathematical formula, whereas those of the other five methods are explicit expressions, i.e. they can be expressed as usual mathematical formulas. Because the support vector machine (SVM) has both a classification method (C-SVM) and a regression method (R-SVM), SVM is introduced first. Since the 1990's, SVM has been gradually applied in the natural and social sciences, and especially widely in this century. SVM is a machine-learning approach based on statistical learning theory. It is essentially performed by converting a real-world problem (the original space) into a new higher-dimensional feature space using a kernel function, and then constructing a linear discriminant function in the new space to replace the nonlinear discriminant function in the original space. Theoretically, SVM can obtain the global optimal solution and avoid converging to a local optimal solution, as can possibly occur in BPNN, though this problem in BPNN is rare if BPNN is properly coded [1]. The SVM procedure includes two principal methods: 1) C-SVM, such as the binary classification [1]–[6] and the ν-binary classification [7]; and 2) R-SVM, such as the ε-regression [1] [4] [6] and the ν-regression [8]. In the case studies, the binary classification is employed for C-SVM and the ε-regression for R-SVM. Moreover, under strongly nonlinear conditions it is better to take the RBF (radial basis function) as the kernel function than the linear, polynomial or sigmoid functions [1] [4], and thus the kernel function used in C-SVM and R-SVM is the RBF.

1) R-SVM

A technique of R-SVM, the ε-regression [1] [4], has been employed. The formula created by this technique is an expression with respect to a vector x, the so-called nonlinear function y=R-SVM(x1, x2, …, xm):

y = ∑(i=1 to n) (αi − αi*) K(xi, x) + b    (8)

where α and α* are the vectors of Lagrange multipliers, α=(α1, α2, …, αn) and α*=(α1*, α2*, …, αn*), with 0≤αi≤C and 0≤αi*≤C, where C is the penalty factor, and with the constraint ∑(i=1 to n)(αi − αi*)=0; K(xi, x)=exp(−γ‖xi − x‖²) is the RBF kernel function; γ is the regularization parameter, γ>0; and b is the offset of the separating hyperplane, which can be calculated using the free vectors xi. These free xi are those vectors corresponding to αi>0 or αi*>0, on which the final R-SVM model depends. αi, αi*, C and γ can be solved using the dual quadratic optimization:

max over α, α*:  −(1/2) ∑(i=1 to n) ∑(j=1 to n) (αi − αi*)(αj − αj*) K(xi, xj) − ε ∑(i=1 to n) (αi + αi*) + ∑(i=1 to n) yi* (αi − αi*)    (9)

where ε (ε>0) is determined by the user. It is noted that in the case studies the formulas corresponding to Eq. 8 are not concretely written out due to their large size.
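A minimal sketch of the ε-regression of Eqs. 8–9 with the RBF kernel, using scikit-learn's SVR (which wraps LIBSVM, the tool of [4]); ε=0.1 and the termination accuracy 0.001 follow the fixed settings quoted in Section 2.3, while the toy data, C and γ are our assumptions.

```python
import numpy as np
from sklearn.svm import SVR

X = np.random.uniform(-1, 1, size=(10, 3))     # features pre-normalized to [-1, 1]
y_star = np.random.rand(10)                    # observed values yi*
r_svm = SVR(kernel="rbf", C=1.0, gamma=0.5, epsilon=0.1, tol=0.001)
r_svm.fit(X, y_star)                           # learning process: builds Eq. 8
print(r_svm.support_)                          # indices of the free (support) vectors xi
```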
2) BPNN

The BPNN procedure has been widely applied since the 1980's [1] [9] [10], and the application of BPNN is still predominant. The formula created by BPNN is an expression with respect to m parameters (x1, x2, …, xm) [1]:

y=BPNN(x1, x2, …, xm)    (10)

where BPNN is a nonlinear function which cannot be expressed as a usual mathematical formula and so is an implicit expression. BPNN consists of one input layer, one or more hidden layers, and one output layer. In the case studies, only one hidden layer is employed. There is no theory yet to determine how many hidden layers are needed for a given case, but when the output layer has only one node, one hidden layer is enough. Moreover, it is also difficult to determine how many nodes a hidden layer should have. To reduce the local-minima problem, it is suggested to use the large estimate Nhidden=2(Ninput+Noutput)−1, where Nhidden is the number of hidden nodes, Ninput is the number of input nodes and Noutput is the number of output nodes. The values of the network learning rate for the output layer and the hidden layer are within (0, 1), and in practice they can be the same. The term back-propagation refers to the way [1] [11] the error computed at the output side is propagated backward from the output layer to the hidden layer, and finally to the input layer. Each iteration of BPNN consists of two sweeps: forward, to calculate a solution by using a sigmoid activation function; and backward, to compute the error and thus to adjust the weights and thresholds for the next iteration. The iterations are performed repeatedly until the solution agrees with the desired value within a required tolerance. The error is taken as the root mean square error [1] [12]:

RMSE(%) = sqrt[(1/n) ∑(i=1 to n) (yi − yi*)²] × 100    (11)

where yi and yi* are under the conditions of normalization in the learning process. RMSE(%) is used in the conditions for terminating network learning.

3) MRA

The MRA procedure has been widely applied since the 1970's [1] [13] [14], and the successive regression analysis, the most popular MRA technique, is still a very useful tool. The formula created by this technique is a linear combination with respect to m parameters (x1, x2, …, xm) plus a constant term, the so-called linear function y=MRA(x1, x2, …, xm) [1]:

y=b0+b1x1+b2x2+…+bmxm    (12)

where the constants b0, b1, b2, …, bm are deduced using regression criteria and calculated by the successive regression analysis of MRA. Eq. 12 is the so-called "regression equation". In rare cases an introduced xk can be deleted from the regression equation, and in much rarer cases a deleted xk can be introduced into the regression equation again. Therefore, Eq. 12 is usually solved via m iterations.

4) C-SVM

A technique of C-SVM, the binary classifier [1] [4], has been employed. The formula created by this technique is an expression with respect to a vector x, the so-called nonlinear function y=C-SVM(x1, x2, …, xm):

y = sgn[∑(i=1 to n) αi yi* K(xi, x) + b]    (13)

where α is the vector of Lagrange multipliers, α=(α1, α2, …, αn), with 0≤αi≤C, where C is the penalty factor, and with the constraint ∑(i=1 to n) αi yi*=0; K(xi, x)=exp(−γ‖xi − x‖²) is the RBF kernel function; γ is the regularization parameter, γ>0; and b is the offset of the separating hyperplane, which can be calculated using the free vectors xi. These free xi are those vectors corresponding to αi>0, on which the final C-SVM model depends. αi, C and γ can be solved using the dual quadratic optimization:

max over α:  ∑(i=1 to n) αi − (1/2) ∑(i=1 to n) ∑(j=1 to n) αi αj yi* yj* K(xi, xj)    (14)

It is noted that in the case studies the formulas corresponding to Eq. 13 are not concretely written out due to their large size.
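Correspondingly, a minimal sketch of the binary C-SVM of Eqs. 13–14 using scikit-learn's SVC (also LIBSVM-based [4]) with the RBF kernel; the termination accuracy 0.001 follows Section 2.3, while the toy labels, C and γ are assumptions.

```python
import numpy as np
from sklearn.svm import SVC

X = np.random.uniform(-1, 1, size=(20, 3))   # features pre-normalized to [-1, 1]
y_class = np.repeat([1, 2], 10)              # two assumed classes
c_svm = SVC(kernel="rbf", C=32.0, gamma=0.5, tol=0.001)
c_svm.fit(X, y_class)                        # learning process: builds Eq. 13
print(len(c_svm.support_))                   # number of free (support) vectors xi
```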
5) NBAY

The NBAY procedure has been applied since the 1990's, and widely in this century [1] [15]. The following introduces one NBAY technique, the naïve Bayesian. The formula created by this technique is a set of nonlinear products with respect to m parameters (x1, x2, …, xm) [1] [16] [17]:

Nl(x) = ∏(j=1 to m) {1/[√(2π) σjl]} exp[−(xj − μjl)²/(2σjl²)]  (l=1, 2, …, L)    (15)

where l is the class number; L is the number of classes; Nl(x) is the discrimination function of the lth class with respect to x; σjl is the mean square deviation (standard deviation) of xj in Class l; and μjl is the mean of xj in Class l. Eq. 15 is the so-called naïve Bayesian discrimination function. Once Eq. 15 is created, any sample shown by Eq. 1 or Eq. 3 can be substituted into Eq. 15 to obtain L values: N1, N2, …, NL. If

Nlb = max{N1, N2, …, NL}, then y=lb    (16)

for this sample. Eq. 16 is the so-called nonlinear function y=NBAY(x1, x2, …, xm).

6) BAYSD

The BAYSD procedure has been applied since the 1990's, and widely in this century [1] [18] [19]. The following introduces the BAYSD technique. The formula created by this technique is a set of nonlinear combinations with respect to m parameters (x1, x2, …, xm) plus two constant terms [1]:

Bl(x) = ln(pl) + c0l + ∑(j=1 to m) cjl xj  (l=1, 2, …, L)    (17)

where l is the class number; L is the number of classes; Bl(x) is the discrimination function of the lth class with respect to x; cjl is the coefficient of xj in the lth discrimination function; and pl and c0l are the two constant terms of the lth discrimination function. The constants pl, c0l, c1l, c2l, …, cml are deduced using the Bayesian theorem and calculated by the successive Bayesian discrimination. Eq. 17 is the so-called Bayesian discrimination function. In rare cases an introduced xk can be deleted from the Bayesian discrimination function, and in much rarer cases a deleted xk can be introduced again. Therefore, Eq. 17 is usually solved via m iterations. Once Eq. 17 is created, any sample shown by Eq. 1 or Eq. 3 can be substituted into Eq. 17 to obtain L values: B1, B2, …, BL. If

Blb = max{B1, B2, …, BL}, then y=lb    (18)

for this sample. Eq. 18 is the so-called nonlinear function y=BAYSD(x1, x2, …, xm).
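Eq. 17 is linear in x with a class-prior term, which is the same functional form produced by scikit-learn's LinearDiscriminantAnalysis; the sketch below is therefore only an analogue of BAYSD, and does not reproduce the successive (stepwise) introduction and deletion of variables described above. The toy data are assumptions.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X = np.random.rand(30, 3)
y_class = np.repeat([1, 2, 3], 10)               # assumed integer class labels
baysd_like = LinearDiscriminantAnalysis()
baysd_like.fit(X, y_class)                       # learning process
B = baysd_like.decision_function(X)              # values analogous to Bl(x) in Eq. 17
y = baysd_like.classes_[np.argmax(B, axis=1)]    # decision rule of Eq. 18
```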
The six methods applied to the eight case studies below have the following points in common. At the start of a BPNN or SVM run, each feature in x and y of the learning samples and each feature in x of the prediction samples are normalized: features are normalized to [0, 1] for BPNN, and to [−1, 1] for SVM (C-SVM and R-SVM). There is no feature normalization for samples in MRA, NBAY and BAYSD. In order to keep results comparable among different applications, a) for BPNN, only one hidden layer is employed, the number of hidden nodes is defined by the fixed formula with respect to the numbers of input and output nodes described above, and the network learning rates for the output layer and the hidden layer are fixed to 0.6; and b) for SVM, the termination calculation accuracy TCA is fixed to 0.001, the RBF is taken as the kernel function, and the insensitive function ε in R-SVM is fixed to 0.1.

3 Two Case Studies of the Classification-Regression Problem

3.1 Case Study 1: The Permeability Prediction of an Oilfield

The objective of this case study is to calculate rock permeability and to conduct rock permeability classification (RPC) for oil/gas wells, which has practical value when experimental data are limited. Using data of 12 samples from an oilfield in China, each containing 3 independent variables (x1 = irreducible water saturation, x2 = surface area to volume ratio of rock, x3 = porosity) and an experimental measurement (y* = rock permeability), [20] adopted BPNN for the prediction of permeability. In this case study, among these 12 samples, 10 are taken as learning samples and one as a prediction sample for the prediction of both permeability and RPC (Tables 1 and 2); permeability is predicted by R-SVM, BPNN and MRA, and RPC by C-SVM, NBAY and BAYSD.

1) Regression to Calculate Rock Permeability

Using the 10 learning samples (Table 1) and R-SVM, BPNN and MRA, the following three functions of rock permeability (y) with respect to the 3 independent variables (x1, x2, x3) have been constructed. Using R-SVM, the result is an explicit nonlinear function corresponding to Eq. 8:

y=R-SVM(x1, x2, x3)    (19)

with C=1, γ=0.…, and 10 free vectors xi. The BPNN used consists of 3 input layer nodes, 1 output layer node and 7 hidden layer nodes. The result is an implicit nonlinear function corresponding to Eq. 10:

y=BPNN(x1, x2, x3)    (20)

with the optimal learning time count topt=… and RMSE(%)=0.5156×10−2.

[Table 1 Input data for permeability prediction of an oilfield: 10 learning samples and one prediction sample, each with x1 = irreducible water saturation (%), x2 = surface area to volume ratio of rock (cm2/cm3), x3 = porosity (%), y* = rock permeability (10−3 μm2) determined by experimentation, and its RPC determined by Table 2; for the prediction sample the y* values are not input data but are used for calculating R(%).]

Table 2 Rock permeability classification based on rock permeability quantity
Rock permeability classification | Rock permeability (10−3 μm2) | RPC
High permeability | >200 | 1
Intermediate permeability | [50, 200] | 2
Low permeability | <50 | 3

Using MRA, the result is an explicit linear function corresponding to Eq. 12:

y=653.63+2.5543x1−0.…x2+35.859x3    (21)

Equation 21 yields a residual variance of 0.5749 and a multiple correlation coefficient of 0.652. From the regression process, the rock permeability (y) is shown to depend on the 3 independent variables in decreasing order: x2, x3 and x1. Substituting the values of the 3 independent variables (x1, x2, x3) given by the 10 learning samples and the one prediction sample (Table 1) into Eqs. 19, 20 and 21, respectively, the rock permeability (y) of each sample is obtained (Table 3). From Table 4, it is shown that a) the nonlinearity degree of this studied problem is strong, since the R*(%) of MRA is 269.7; and b) the solution accuracies of R-SVM, BPNN and MRA are low, moderate and low, respectively. Therefore, R-SVM and MRA are obviously unavailable; and though the solution accuracy of BPNN is moderate, it is also inapplicable.

2) Classification to Calculate RPC

Using the 10 learning samples (Table 1) and C-SVM, NBAY and BAYSD, the following three functions of RPC (y) with respect to the 3 independent variables (x1, x2, x3) have been constructed. Using C-SVM, the result is an explicit nonlinear function corresponding to Eq. 13:

y=C-SVM(x1, x2, x3)    (22)

with C=128, γ=0.03125, 8 free vectors xi, and the cross-validation accuracy CVA=72.7273%.

[Table 3 Prediction results of rock permeability of an oilfield: the experimental y* and, for each of the 11 samples, the y and R(%) obtained by R-SVM, BPNN and MRA.]
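Before the method comparison of Table 4, the following sketch reproduces the shape of the 3-7-1 network quoted above (3 inputs, Nhidden=2(3+1)−1=7 hidden nodes, 1 output) with scikit-learn's MLPRegressor as a stand-in for the author's BPNN code; the sigmoid activation, the 0.6 learning rate and the [0, 1] normalization follow Section 2.3, while the data and the iteration cap are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import MinMaxScaler

X = np.random.rand(10, 3)                          # 10 learning samples, 3 variables
y_star = np.random.rand(10)
X01 = MinMaxScaler((0, 1)).fit_transform(X)        # features normalized to [0, 1]
y01 = (y_star - y_star.min()) / (y_star.max() - y_star.min())  # y normalized too
bpnn = MLPRegressor(hidden_layer_sizes=(7,), activation="logistic",
                    solver="sgd", learning_rate_init=0.6, max_iter=5000)
bpnn.fit(X01, y01)                                 # forward/backward sweeps per iteration
```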
Table 4 Comparison among the applications of regression methods (R-SVM, BPNN and MRA) to rock permeability prediction of an oilfield
Method | Fitting formula | R1(%) | R2(%) | R*(%) | Dependence of y on (x1, x2, x3), in decreasing order | Time consumed on PC (Intel Core 2) | Solution accuracy
R-SVM | Nonlinear, explicit | … | … | 126.0 | N/A | 3 s | Low
BPNN | Nonlinear, implicit | … | … | … | N/A | 30 s | Moderate
MRA | Linear, explicit | … | … | 269.7 | x2, x3, x1 | <1 s | Low

Using NBAY, the result is an explicit nonlinear discriminant function corresponding to Eq. 15:

Nl(x)  (l=1, 2, 3)    (23)

where for l=1, σj1=(4.437, 371.561, 2.498) and μj1=(9.25, 665, 16.425); for l=2, σj2=(6.549, 488.409, 4.507) and μj2=(17.333, 1635.667, 17.567); and for l=3, σj3=(6.128, 728.052, 4.082) and μj3=(19.667, 2543.333, 17). Using BAYSD, the result is an explicit nonlinear discriminant function corresponding to Eq. 17:

Bl(x)  (l=1, 2, 3)    (24)

From the successive process, RPC (y) is shown to depend on the 3 independent variables in decreasing order: x2, x3 and x1. Though MRA is a regression method rather than a classification method, it provides the nonlinearity degree of the studied problem, and thus it is required to run MRA. Using MRA, the result is an explicit linear function corresponding to Eq. 12:

y=2.4398+…−0.…x3    (25)

Equation 25 yields a residual variance of 0.… and a multiple correlation coefficient of 0.…. From the regression process, RPC (y) is shown to depend on the 3 independent variables in decreasing order: x2, x3 and x1. Substituting the values of the 3 independent variables (x1, x2, x3) given by the 10 learning samples and the one prediction sample (Table 1) into Eq. 22, Eq. 23 (followed by Eq. 16), Eq. 24 (followed by Eq. 18) and Eq. 25, respectively, the RPC (y) of each sample is obtained (Table 5).

[Table 5 Prediction results of rock permeability classification of an oilfield: y* (RPC: 1–high permeability, 2–intermediate permeability, 3–low permeability, determined by Table 2) and, for each sample, the y and R(%) obtained by C-SVM, NBAY, BAYSD and MRA.]

From Table 6, it is shown that a) the nonlinearity degree of this studied problem is moderate, since the R*(%) of MRA is 15.6; and b) the solution accuracies of C-SVM, NBAY and BAYSD are high, moderate and high, respectively. Therefore, C-SVM and BAYSD are obviously available; though the solution accuracy of NBAY is moderate, it is inapplicable.

Table 6 Comparison among the applications of classification methods (C-SVM, NBAY and BAYSD) to rock permeability classification of an oilfield
Method | Fitting formula | R1(%) | R2(%) | R*(%) | Dependence of y on (x1, x2, x3), in decreasing order | Time consumed on PC (Intel Core 2) | Solution accuracy
C-SVM | Nonlinear, explicit | 0 | 0 | 0 | N/A | 5 s | High
NBAY | Nonlinear, explicit | … | … | 16.7 | N/A | <1 s | Moderate
BAYSD | Nonlinear, explicit | 0 | 0 | 0 | x2, x3, x1 | 1 s | High
MRA | Linear, explicit | … | … | 15.6 | x2, x3, x1 | <1 s | Moderate nonlinearity
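The concrete per-class means μjl and deviations σjl of Eq. 23 are exactly what a Gaussian naïve-Bayes fit estimates, so scikit-learn's GaussianNB can serve as a stand-in for NBAY; the toy data below are assumptions, not the Table 1 values.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

X = np.random.rand(10, 3)                          # 10 learning samples, 3 variables
y_rpc = np.array([1, 1, 1, 1, 2, 2, 2, 3, 3, 3])   # assumed RPC labels
nbay = GaussianNB()
nbay.fit(X, y_rpc)                                 # estimates mu_jl and sigma_jl per class
print(nbay.theta_)                                 # class means mu_jl, as quoted for Eq. 23
y = nbay.predict(X)                                # argmax of Nl(x), the rule of Eq. 16
```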
3.2 Case Study 2: The Oil Layer Classification of the Keshang Formation

The objective of this case study is to calculate the oil layer productivity index (OLPI) and to conduct oil layer classification (OLC) in glutenite for oil wells, which has practical value when oil test data are limited. Using data of 14 learning samples and 4 prediction samples from the Keshang Formation in District 8 of the Kelamayi Oilfield in western China, each containing 3 parameters (x1 = porosity, x2 = permeability, x3 = resistivity) and an oil test result (y* = OLPI and OLC), see Table 10.1 in [1]; [1] adopted R-SVM, BPNN and MRA for the prediction of OLPI, and C-SVM and BAYSD for the prediction of OLC. In this case study, NBAY is added for the prediction of OLC, and the results related to the method comparison are listed in Tables 7 and 8.

4 Six Case Studies of the Classification Problem

4.1 Case Study 3: The Reservoir Classification of the Fuxin Uplift

The objective of this case study is to conduct reservoir classification (RC), which has practical value when well test data are limited. Using data of 25 samples from the fourth stage of Nanquan in the Fuxin Uplift of the Jilin Oilfield in eastern China, each containing 5 parameters (x1 = sandstone thickness, x2 = porosity, x3 = permeability, x4 = carbonate content, x5 = mud content) and a well test result (y* = reservoir type), [21] adopted Bayesian discrimination for the prediction of RC. Among these 25 samples, 20 were taken as learning samples and 3 as prediction samples for the prediction of RC by C-SVM and BAYSD in [1]. In this case study, among these 25 samples, 20 are taken as learning samples and one as a prediction sample (Table 9) by C-SVM, NBAY and BAYSD.

Table 7 Comparison among the applications of regression methods (R-SVM, BPNN and MRA) to the oil layer productivity index of the Keshang Formation
Method | Fitting formula | R1(%) | R2(%) | R*(%) | Dependence of y on (x1, x2, x3), in decreasing order | Time consumed on PC (Intel Core 2) | Solution accuracy
R-SVM | Nonlinear, explicit | 209.… | … | 355.8 | N/A | 3 s | Low
BPNN | Nonlinear, implicit | … | … | … | N/A | 30 s | Moderate
MRA | Linear, explicit | 81.… | … | 155.5 | x2, x3, x1 | <1 s | Low

Table 8 Comparison among the applications of classification methods (C-SVM, NBAY and BAYSD) to oil layer classification of the Keshang Formation
Method | Fitting formula | R1(%) | R2(%) | R*(%) | Dependence of y on (x1, x2, x3), in decreasing order | Time consumed on PC (Intel Core 2) | Solution accuracy
C-SVM | Nonlinear, explicit | 0 | 0 | 0 | N/A | 5 s | High
NBAY | Nonlinear, explicit | 7.… | … | 11.11 | N/A | <1 s | Moderate
BAYSD | Nonlinear, explicit | 0 | 0 | 0 | x2, x1, x3 | 1 s | High
MRA | Linear, explicit | … | … | … | x2, x1, x3 | <1 s | Weak nonlinearity

[Table 9 Input data for reservoir classification of the Fuxin Uplift: 20 learning samples and one prediction sample, each with x1 = sandstone thickness (m), x2 = porosity (%), x3 = permeability (10−3 μm2), x4 = carbonate content (%), x5 = mud content (%), and y* = RC (1–excellent, 2–good, 3–average, 4–poor) determined by the well test; for the prediction sample the y* value is not input data but is used for calculating R(%).]

Using the 20 learning samples (Table 9) and C-SVM, NBAY and BAYSD, the following three functions of RC (y) with respect to the 5 independent variables (x1, x2, x3, x4, x5) have been constructed. Using C-SVM, the result is an explicit nonlinear function corresponding to Eq. 13:

y=C-SVM(x1, x2, x3, x4, x5)    (26)

with C=32, γ=0.5, 16 free vectors xi, and CVA=90%. Using NBAY, the result is an explicit nonlinear discriminant function corresponding to Eq. 15:

Nl(x)  (l=1, 2, 3, 4)    (27)

where for l=1, σj1=(1.175, 1.287, 2.142, 1.111, 2.037) and μj1=(4.4, 15.052, 6.757, 2.493, 6.75); for l=2, σj2=(0.728, 0.998, 0.389, 0.877, 1.683) and μj2=(4.1, 15.327, 1.145, 2.645, 8.618); for l=3, σj3=(1.682, 2.914, 1.205, 2.325, 3.536) and μj3=(0, 0, 0, 0, 0); and for l=4, σj4=(3.1, 0.66, 0.627, 0.71, 0.71) and μj4=(7, 8.2, 0.11, 5.5, 29.01). Using BAYSD, the result is an explicit nonlinear discriminant function corresponding to Eq. 17:

Bl(x)  (l=1, 2, 3, 4)    (28)

From the successive process, RC (y) is shown to depend on the 5 independent variables in decreasing order: x5, x3, x1, x4 and x2.
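Each C-SVM run above reports its C, γ and a cross-validation accuracy (CVA), e.g. C=32, γ=0.5 and CVA=90% for Eq. 26, which matches the usual LIBSVM grid-search practice; the sketch below shows one assumed way to reproduce such a selection with scikit-learn (it is not the author's tool). The data are toy values, not Table 9.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X = np.random.uniform(-1, 1, size=(20, 5))
y_rc = np.repeat([1, 2, 3, 4], 5)            # assumed RC labels
grid = GridSearchCV(SVC(kernel="rbf", tol=0.001),
                    param_grid={"C": [2.0**k for k in range(-5, 16, 2)],
                                "gamma": [2.0**k for k in range(-15, 4, 2)]},
                    cv=5)                    # cross-validation over the learning samples
grid.fit(X, y_rc)
print(grid.best_params_, grid.best_score_)   # chosen (C, gamma) and a CVA-like score
```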
Though MRA is a regression method rather than a classification method, it provides the nonlinearity degree of the studied problem, and thus it is required to run MRA. Using MRA, the result is an explicit linear function corresponding to Eq. 12:

y=1.895−0.0…x1+…x2−0.…x3−0.0…x4+0.0…x5    (29)

Equation 29 yields a residual variance of 0.… and a multiple correlation coefficient of 0.…. From the regression process, RC (y) is shown to depend on the 5 independent variables in decreasing order: x5, x3, x1, x4 and x2. Substituting the values of the 5 independent variables (x1, x2, x3, x4, x5) given by the 20 learning samples and the one prediction sample (Table 9) into Eq. 26, Eq. 27 (followed by Eq. 16), Eq. 28 (followed by Eq. 18) and Eq. 29, respectively, the RC (y) of each sample is obtained (Table 10).

[Table 10 Prediction results of reservoir classification of the Fuxin Uplift: y* (RC: 1–excellent, 2–good, 3–average, 4–poor, determined by the well test) and, for each sample, the y and R(%) obtained by C-SVM, NBAY, BAYSD and MRA.]

From Table 11, it is shown that a) the nonlinearity degree of this studied problem is moderate, since the R*(%) of MRA is 13.41; and b) the solution accuracies of C-SVM, NBAY and BAYSD are all high. Therefore, C-SVM, NBAY and BAYSD are all available.

Table 11 Comparison among the applications of classification methods (C-SVM, NBAY and BAYSD) to reservoir classification of the Fuxin Uplift
Method | Fitting formula | R1(%) | R2(%) | R*(%) | Dependence of y on (x1, x2, x3, x4, x5), in decreasing order | Time consumed on PC (Intel Core 2) | Solution accuracy
C-SVM | Nonlinear, explicit | 0 | 0 | 0 | N/A | 5 s | High
NBAY | Nonlinear, explicit | … | … | 1.19 | N/A | <1 s | High
BAYSD | Nonlinear, explicit | 0 | 0 | 0 | x5, x3, x1, x4, x2 | 1 s | High
MRA | Linear, explicit | … | … | 13.41 | x5, x3, x1, x4, x2 | <1 s | Moderate nonlinearity

4.2 Case Study 4: The Reservoir Classification of the Baibao Oilfield

The objective of this case study is to conduct reservoir classification (RC), which has practical value when well test data are limited. Using data of 29 samples from the Chang 3 and Chang 4+5 members of the Triassic formation of the Baibao Oilfield, Ordos Basin, western China, each containing 4 parameters (x1 = mud content, x2 = porosity, x3 = permeability, x4 = permeability variation coefficient) and a well test result (y* = reservoir type), [22] adopted Bayesian discrimination for the prediction of RC. Among these 29 samples, 25 were taken as learning samples and 3 as prediction samples for the prediction of RC by C-SVM and BAYSD in [1]. In this case study, among these 29 samples, 24 are taken as learning samples and 3 as prediction samples for the prediction of RC (Table 12) by C-SVM, NBAY and BAYSD. Using the 24 learning samples (Table 12) and C-SVM, NBAY and BAYSD, the following three functions of RC (y) with respect to the 4 independent variables (x1, x2, x3, x4) have been constructed. Using C-SVM, the result is an explicit nonlinear function corresponding to Eq. 13:

y=C-SVM(x1, x2, x3, x4)    (30)

with C=2048, γ=0.03125, 8 free vectors xi, and CVA=95.8333%.
Using NBAY, the result is an explicit nonlinear discriminant function corresponding to Eq. 15:

Nl(x)  (l=1, 2, 3, 4)    (31)

where for l=1, σj1=(1.551, 2.709, 2.913, 0.046) and μj1=(10.87, 17.91, 33.252, 0.51); for l=2, σj2=(1.617, 1.056, 4.27, 0.14) and μj2=(13.825, 12.783, 17.917, 1.043); for l=3, σj3=(5.562, 1.462, 0.546, 0.456) and μj3=(17.987, 8.011, 0.85, 1.792); and for l=4, σj4=(3.1, 0.66, 0.627, 0.71) and μj4=(15.3, 0.84, 0.652, 2.11). Using BAYSD, the result is an explicit nonlinear discriminant function corresponding to Eq. 17:

Bl(x)  (l=1, 2, 3, 4)    (32)

From the successive process, RC (y) is shown to depend on the 4 independent variables in decreasing order: x3, x2, x4 and x1.

[Table 12 Input data for reservoir classification of the Baibao Oilfield: 24 learning samples and 3 prediction samples from wells of the Chang 3 and Chang 4+5 members, each with x1 = mud content (%), x2 = porosity (%), x3 = permeability (10−3 μm2), x4 = permeability variation coefficient, and y* = RC (1–excellent, 2–good, 3–average, 4–poor) determined by the well test; for the prediction samples the y* values are not input data but are used for calculating R(%).]

Though MRA is a regression method rather than a classification method, it provides the nonlinearity degree of the studied problem, and thus it is required to run MRA. Using MRA, the result is an explicit linear function corresponding to Eq. 12:

y=3.3863−…x1−0.0…x2−0.0…x3+0.2729x4    (33)

Equation 33 yields a residual variance of 0.05788 and a multiple correlation coefficient of 0.…. From the regression process, RC (y) is shown to depend on the 4 independent variables in decreasing order: x2, x3, x4 and x1. Substituting the values of the 4 independent variables (x1, x2, x3, x4) given by the 24 learning samples and the 3 prediction samples (Table 12) into Eq. 30, Eq. 31 (followed by Eq. 16), Eq. 32 (followed by Eq. 18) and Eq. 33, respectively, the RC (y) of each sample is obtained (Table 13).

[Table 13 Prediction results of reservoir classification of the Baibao Oilfield: y* (RC: 1–excellent, 2–good, 3–average, 4–poor, determined by the well test) and, for each sample, the y and R(%) obtained by C-SVM, NBAY, BAYSD and MRA.]

From Table 14, it is shown that a) the nonlinearity degree of this studied problem is weak, since the R*(%) of MRA is 8.77; and b) the solution accuracies of C-SVM, NBAY and BAYSD are all high. Therefore, C-SVM, NBAY and BAYSD are all available.

Table 14 Comparison among the applications of classification methods (C-SVM, NBAY and BAYSD) to reservoir classification of the Baibao Oilfield
Method | Fitting formula | R1(%) | R2(%) | R*(%) | Dependence of y on (x1, x2, x3, x4), in decreasing order | Time consumed on PC (Intel Core 2) | Solution accuracy
C-SVM | Nonlinear, explicit | 0 | 0 | 0 | N/A | 5 s | High
NBAY | Nonlinear, explicit | 0 | 0 | 0 | N/A | <1 s | High
BAYSD | Nonlinear, explicit | 0 | 0 | 0 | x3, x2, x4, x1 | 1 s | High
MRA | Linear, explicit | … | … | 8.77 | x2, x3, x4, x1 | <1 s | Weak nonlinearity

4.3 Case Study 5: The Oil Layer Classification of the Lower H3 Formation

The objective of this case study is to conduct oil layer classification in clayey sandstone, which has practical value when oil test data are limited.
Using data of 24 learning samples and 8 prediction samples from the lower H3 Formation in the Xia'ermen Oilfield of the Henan oil province in central China, each containing 8 parameters (x1 = SP, x2 = Rxo1, x3 = Rxo2, x4 = Δt, x5 = Ra0.45, x6 = Ra4, x7 = Cond, x8 = Cal) and an oil test result (y* = oil layer type), see Table 10.7 in [1]; [1] adopted C-SVM and BAYSD. In this case study, NBAY is added, and the results related to the method comparison are listed in Table 15.

Table 15 Comparison among the applications of classification methods (C-SVM, NBAY and BAYSD) to oil layer classification of the Lower H3 Formation
Method | Fitting formula | R1(%) | R2(%) | R*(%) | Dependence of y on the independent variables xj (j=1, 2, …, 8), in decreasing order | Time consumed on PC (Intel Core 2) | Solution accuracy
C-SVM | Nonlinear, explicit | 0 | 0 | 0 | N/A | 5 s | High
NBAY | Nonlinear, explicit | 0 | 0 | 0 | N/A | <1 s | High
BAYSD | Nonlinear, explicit | 0 | 0 | 0 | x5, x8, x6, x1, x3, x2, x4, x7 | 1 s | High
MRA | Linear, explicit | … | … | … | x5, x8, x6, x1, x4, x2, x3, x7 | <1 s | Weak nonlinearity

4.4 Case Study 6: The Trap Evaluation of the Northern Kuqa Depression

The objective of this case study is to conduct the optimal selection of traps, which has practical value in the stage of rolling exploration. Using data of 27 learning samples and 2 prediction samples from the Northern Kuqa Depression of the Tarim Basin in western China, each sample (a trap) containing 14 parameters (x1 = unit structure, x2 = trap type, x3 = petroliferous formation, x4 = trap depth, x5 = trap relief, x6 = trap closed area, x7 = formation HC identifier, x8 = data reliability, x9 = trap coefficient, x10 = source rock coefficient, x11 = reservoir coefficient, x12 = preservation coefficient, x13 = configuration coefficient, x14 = resource quantity) and an integrated evaluation result by geologists (y* = trap quality), see Table 3.6 in [1]; [1] adopted C-SVM and BAYSD. In this case study, NBAY is added, and the results related to the method comparison are listed in Table 16.

Table 16 Comparison among the applications of classification methods (C-SVM, NBAY and BAYSD) to trap classification of the Northern Kuqa Depression
Method | Fitting formula | R1(%) | R2(%) | R*(%) | Dependence of y on the independent variables xj (j=1, 2, …, 14), in decreasing order | Time consumed on PC (Intel Core 2) | Solution accuracy
C-SVM | Nonlinear, explicit | 0 | 0 | 0 | N/A | 5 s | High
NBAY | Nonlinear, explicit | 9.… | … | 10.34 | N/A | <1 s | Moderate
BAYSD | Nonlinear, explicit | 0 | 0 | 0 | x12, x10, x9, x4, x5, x11, x8, x6, x14, x1, x2, x7, x13, x3 | 1 s | High
MRA | Linear, explicit | … | … | … | x12, x10, x9, x4, x5, x11, x8, x6, x14, x1, x2, x7, x13, x3 | <1 s | Moderate nonlinearity

4.5 Case Study 7: The Fracture Prediction of Wells An1 and An2

The objective of this case study is to predict fractures using conventional well-logging data, which has practical value when imaging log and core sample data are limited. Using data of 29 learning samples and 4 prediction samples from Wells An1 and An2 in the Anpeng Oilfield, at the southeast of the Biyang Sag in the Nanxiang Basin in central China, each containing 7 parameters (x1 = acoustic time, x2 = compensated neutron density, x3 = compensated neutron porosity, x4 = micro-spherically focused resistivity, x5 = deep laterolog resistivity, x6 = shallow laterolog resistivity, x7 = absolute difference of x5 and x6) and the imaging log (y* = fracture identification), see Table 3.10 in [1]; [1] adopted C-SVM. In this case study, NBAY and BAYSD are added, and the results related to the method comparison are listed in Table 17.
Table 17 Comparison among the applications of classification methods (C-SVM, NBAY and BAYSD) to fracture prediction of Wells An1 and An2
Method | Fitting formula | R1(%) | R2(%) | R*(%) | Dependence of y on the independent variables xj (j=1, 2, …, 7), in decreasing order | Time consumed on PC (Intel Core 2) | Solution accuracy
C-SVM | Nonlinear, explicit | 0 | 0 | 0 | N/A | 5 s | High
NBAY | Nonlinear, explicit | … | … | 10.61 | N/A | <1 s | Moderate
BAYSD | Nonlinear, explicit | 0 | 0 | 0 | x5, x2, x4, x7, x3, x6, x1 | 1 s | High
MRA | Linear, explicit | … | … | … | x5, x2, x4, x7, x3, x6, x1 | <1 s | Moderate nonlinearity

4.6 Case Study 8: The Gas Layer Classification of the Tabamiao Area

The objective of this case study is to conduct gas layer classification in tight sandstones, which has practical value when gas test data are limited. Using data of 38 learning samples and 2 prediction samples from the Tabamiao area at the northeast of the Yishan Slope in the Ordos Basin in western China, each containing 3 parameters (x1 = porosity, x2 = permeability, x3 = gas saturation) and a gas test result (y* = gas layer type), see Table 4.1 in [1]; [1] adopted C-SVM. In this case study, NBAY and BAYSD are added, and the results related to the method comparison are listed in Table 18.

Table 18 Comparison among the applications of classification methods (C-SVM, NBAY and BAYSD) to gas layer classification of the Tabamiao area
Method | Fitting formula | R1(%) | R2(%) | R*(%) | Dependence of y on (x1, x2, x3), in decreasing order | Time consumed on PC (Intel Core 2) | Solution accuracy
C-SVM | Nonlinear, explicit | 0 | 0 | 0 | N/A | 5 s | High
NBAY | Nonlinear, explicit | 31.… | … | 29.58 | N/A | <1 s | Moderate
BAYSD | Nonlinear, explicit | … | … | 7.71 | x3, x1, x2 | 1 s | High
MRA | Linear, explicit | … | … | … | x3, x1, x2 | <1 s | Moderate nonlinearity

5 Summary of Eight Case Studies

From Tables 4, 6, 7, 8, 11, 14, 15, 16, 17 and 18, Tables 19 and 20 summarize the two rules of the proposed method optimization applied to the eight case studies; Table 19 covers the classification-regression problem, while Table 20 covers the classification problem. From Table 19, it is seen that if a regression problem has strong nonlinearity, it is required to convert the problem from regression to classification. From Tables 19 and 20, it is seen that if a classification problem is of weak or moderate nonlinearity, SVM and BAYSD are available, whereas NBAY is only sometimes available. Among the aforementioned six methods, MRA and BAYSD can each serve as a pioneering dimension-reduction tool, because both can give the dependence of the predicted value (y) on the independent variables (x1, x2, …, xm) in decreasing order. However, because MRA performs data analysis under linear correlation whereas BAYSD does so under nonlinear correlation, in applications the preferable tool is BAYSD, and the next is MRA. Such a tool is called "pioneering" because whether its reduction succeeds or not must be validated by a high-accuracy nonlinear tool (e.g. BPNN for a regression problem, C-SVM for a classification problem), so as to determine how many independent variables can be reduced. For instance, [6] indicated that a 16-D problem (x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, y) can be reduced to an 8-D problem (x1, x4, x5, x8, x10, x11, x15, y). Comparing C-SVM and BAYSD when the nonlinearity degree of a studied problem is weak or moderate, both are available, but a) BAYSD runs much faster than C-SVM, b) it is easy to code the BAYSD program whereas it is very complicated to code the C-SVM program, and c) BAYSD can serve as a pioneering dimension-reduction tool; so BAYSD is conditionally better than C-SVM.
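A sketch of the pioneering dimension-reduction workflow just described, under our own assumptions: variables are ranked by the absolute standardized coefficients of a linear fit (an MRA-style stand-in for BAYSD's ordering), the weakest are dropped, and cross-validated C-SVM accuracy validates the cut.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X = np.random.rand(30, 5)
y = np.repeat([1, 2, 3], 10)                   # assumed class labels
Z = (X - X.mean(axis=0)) / X.std(axis=0)       # standardize so coefficients are comparable
order = np.argsort(-np.abs(LinearRegression().fit(Z, y).coef_))  # decreasing dependence
keep = order[:3]                               # keep the strongest variables (assumed cut)
svc = SVC(kernel="rbf")
full = cross_val_score(svc, X, y, cv=5).mean()
reduced = cross_val_score(svc, X[:, keep], y, cv=5).mean()
print(order + 1, full, reduced)                # 1-based ranking; accuracy should not degrade
```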
Table 19 Summary of Case studies 1 and 2
Case study (nonlinearity defined by Rule 1) | R-SVM | BPNN | MRA | C-SVM | NBAY | BAYSD
Case study 1, regression (Table 4): strong | Unavailable (R*(%)=126.0) | Unavailable | Unavailable (R*(%)=269.7) | N/A | N/A | N/A
Case study 1, classification (Table 6): moderate | N/A | N/A | R*(%)=15.6 | Available (R*(%)=0) | Unavailable (R*(%)=16.7) | Available (R*(%)=0)
Case study 2, regression (Table 7): strong | Unavailable (R*(%)=355.8) | Unavailable | Unavailable (R*(%)=155.5) | N/A | N/A | N/A
Case study 2, classification (Table 8): weak | N/A | N/A | R*(%)=7 | Available (R*(%)=0) | Unavailable (R*(%)=11.11) | Available (R*(%)=0)

Table 20 Summary of Case studies 3 to 8
Case study (nonlinearity defined by Rule 1) | MRA | C-SVM | NBAY | BAYSD
Case study 3, classification (Table 11): moderate | R*(%)=13.41 | Available (R*(%)=0) | Available (R*(%)=1.19) | Available (R*(%)=0)
Case study 4, classification (Table 14): weak | R*(%)=8.77 | Available (R*(%)=0) | Available (R*(%)=0) | Available (R*(%)=0)
Case study 5, classification (Table 15): weak | R*(%)=… | Available (R*(%)=0) | Available (R*(%)=0) | Available (R*(%)=0)
Case study 6, classification (Table 16): moderate | R*(%)=… | Available (R*(%)=0) | Unavailable (R*(%)=10.34) | Available (R*(%)=0)
Case study 7, classification (Table 17): moderate | R*(%)=… | Available (R*(%)=0) | Unavailable (R*(%)=10.61) | Available (R*(%)=0)
Case study 8, classification (Table 18): moderate | R*(%)=… | Available (R*(%)=0) | Unavailable (R*(%)=29.58) | Available (R*(%)=7.71)

6 Conclusions

In this paper, a method optimization is proposed. Through the aforementioned eight case studies, five major conclusions can be drawn: 1) the proposed two rules (the nonlinearity degree of a studied problem, and the solution accuracy of a given method application) of the proposed method optimization are practical; 2) the total mean absolute relative residual R*(%) of MRA can be used to measure the nonlinearity degree of a studied problem, and thus MRA should be run first; 3) none of R-SVM, BPNN and MRA can be applied to regression problems with strong nonlinearity, but C-SVM, NBAY or BAYSD can be applied if such problems are converted from regression to classification; 4) if a classification problem has weak or moderate nonlinearity, SVM and BAYSD are available, whereas NBAY is only sometimes available; 5) in general, the preferable method for regression problems is BPNN, the next being R-SVM and MRA, while the preferable method for classification problems with weak or moderate nonlinearity is C-SVM or BAYSD, the next being NBAY.

Acknowledgment

This work was supported by the Research Institute of Petroleum Exploration and Development (RIPED) and PetroChina.

References
[1] Shi G. "Data Mining and Knowledge Discovery for Geoscientists." Elsevier Inc, USA, 2013
[2] Shi G. "The use of support vector machine for oil and gas identification in low-porosity and low-permeability reservoirs." Int J Math Model Numer Optimisa 1(1/2): 75-87, 2009
[3] Shi G, Yang X. "Optimization and data mining for fracture prediction in geosciences." Procedia Comput Sci 1(1): 1353-1360, 2010
[4] Chang C, Lin C. "LIBSVM: a library for support vector machines, Version 3.1." Retrieved from /~cjlin/libsvm, 2011
[5] Zhu Y, Shi G. "Identification of lithologic characteristics of volcanic rocks by support vector machine." Acta Petrolei Sinica 34(2): 312-322, 2013
[6] Shi G, Zhu Y, Mi S, Ma J, Wan J. "A big data mining in petroleum exploration and development." Adv Petrol Expl Devel 7(2): 1-8, 2014
[7] Chang C, Lin C. "Training ν-support vector classifiers: Theory and algorithms." Neural Comput 13(9): 2119-2147, 2001
[8] Chang C, Lin C. "Training ν-support vector regression: Theory and algorithms." Neural Comput 14(8): 1959-1977, 2002
[9] Darabi H, Kavousi A, Moraveji M, Masihi M.
"3D fracture modeling in Parsi oil field using artificial intelligence tools." J Petro Sci Eng 71(1-2): 67-76, 2010
[10] Labani MM, Kadkhodaie-Ilkhchi A, Salahshoor K. "Estimation of NMR log parameters from conventional well log data using a committee machine with intelligent systems: A case study from the Iranian part of the South Pars gas field, Persian Gulf Basin." J Petro Sci Eng 72(1-2): 175-185, 2010
[11] Güler İ, Übeyli ED. "Detection of ophthalmic artery stenosis by least-mean squares backpropagation neural network." Comput Biol Med 33(4): 333-343, 2003
[12] Hush DR, Horne BG. "Progress in supervised neural networks." IEEE Sig Proc Mag 10(1): 8-39, 1993
[13] Sharma MSR, O'Regan M, Baxter CDP, Moran K, Vaziri H, Narayanasamy R. "Empirical relationship between strength and geophysical properties for weakly cemented formations." J Petro Sci Eng 72(1-2): 134-142, 2010
[14] Singh J, Shaik B, Singh S, Agrawal VK, Khadikar PV, Deeb O, Supuran CT. "Comparative QSAR study on para-substituted aromatic sulphonamides as CAII inhibitors: information versus topological (distance-based and connectivity) indices." Chem Biol Drug Design 71: 244-259, 2008
[15] Ramoni M, Sebastiani P. "Robust Bayes classifiers." Artificial Intelligence 125(1-2): 207-224, 2001
[16] Tan P, Steinbach M, Kumar V. "Introduction to Data Mining." Pearson Education, Boston, MA, USA, 2005
[17] Han J, Kamber M. "Data Mining: Concepts and Techniques, 2nd ed." Morgan Kaufmann, San Francisco, CA, USA, 2006
[18] Denison DGT, Holmes CC, Mallick BK, Smith AFM. "Bayesian Methods for Nonlinear Classification and Regression." John Wiley & Sons Inc, Chichester, England, UK, 2002
[19] Shi G. "Four classifiers used in data mining and knowledge discovery for petroleum exploration and development." Adv Petrol Expl Devel 2(2): 12-23, 2011
[20] Yang J, Yang C, Zhang Y, Cui L, Wang L. "Permeability prediction method based on improved neural network." Lithologic Reservoirs 23(1): 98-102, 2011
[21] Fu D, Xu J, Wang G. "Reservoir classification and evaluation based on Q cluster analysis combined with Bayesian discrimination algorithm." Sci Tech Review 29(3): 29-33, 2011
[22] Zhang W, Li X, Jia G. "Quantitative classification and evaluation of low permeability reservoirs, with the Chang 3 and Chang 4+5 of the Triassic Formation in the Baibao Oil Field as examples." Sci Tech Review 26(21): 61-65, 2008

Author

Guangren Shi was born in Shanghai, China, in February 1940, and is a Professor qualified to direct Ph.D. students. He graduated from Xi'an Jiaotong University, China, in 1963, majoring in applied mathematics (1958–1963). Since 1963, he has been engaged in computer applications for petroleum exploration and development. In the last 30 years, his research has covered two fields: basin modeling (petroleum systems) and data mining for geosciences. In recent years, however, he has focused more on the latter.
He has more than 50 years of professional experience, having worked at the computer center of the Daqing Oilfield, petroleum ministry, China, as Associate Engineer and Head of the software group (1963–1967); the computer center of the Shengli Oilfield, petroleum ministry, China, as Engineer and Director (1967–1978); the computer center of the petroleum ministry, China, as Engineer and Head of the software group (1978–1985); the Aldridge Laboratory of Applied Geophysics, Columbia University, New York City, U.S.A., as Visiting Scholar (1985–1987); the computer application technology research department of the Research Institute of Petroleum Exploration and Development (RIPED), China National Petroleum Corporation (CNPC), China, as Professor and Director (1987–1997); RIPED, PetroChina Company Limited (PetroChina), China, as Professor qualified to direct Ph.D. students and Deputy Chief Engineer (1997–2001); and the department of specialists in RIPED of PetroChina, China, as Professor qualified to direct Ph.D. students (2001–present). He has published eight books, three of them in English: 1) Shi G. R. 2013. Data Mining and Knowledge Discovery for Geoscientists. Elsevier Inc, USA. 367 pp; 2) Shi G. R. 2005. Numerical Methods of Petroliferous Basin Modeling, 3rd edition. Petroleum Industry Press, Beijing, China. 338 pp, which was reviewed by Mathematical Geosciences in 2009; and 3) Shi G. R. 2000. Numerical Methods of Petroliferous Basin Modeling, 2nd edition. Petroleum Industry Press, Beijing, China. 233 pp, which was reviewed by Mathematical Geology in 2006. He has also published 74 articles, 16 of them in English, e.g. three articles indexed by SCI: 1) Shi G. R. 2009. A simplified dissolution-precipitation model of the smectite to illite transformation and its application. Journal of Geophysical Research-Solid Earth, 114, B, doi:10.1029/2009JB006406; 2) Shi G. R. 2008. Basin modeling in the Kuqa Depression of the Tarim Basin (Western China): A fully temperature-dependent model of overpressure history. Mathematical Geosciences, 40(1): 47–62; and 3) Shi G. R., Zhou X. X., Zhang G. Y., Shi X. F., Li H. H. 2004. The use of artificial neural network analysis and multiple regression for trap quality evaluation: a case study of the Northern Kuqa Depression of Tarim Basin in western China. Marine and Petroleum Geology, 21(3): 411–420. Prof. Shi is a Member of the Society of Petroleum Engineers (International), a Member of the Chinese Association of Science and Technology, and a Member of the Petroleum Society of China. He also serves as Regional Editor (Asia) of the International Journal of Mathematical Modelling and Numerical Optimisation, and as a Member of the Editorial Board of the Journal of Petroleum Science Research. He has received three honors: 1) A Person Studying Overseas and Returning with Excellent Contribution, conferred by the Ministry of Education of China (1991); 2) the Special Government Allowance, awarded by the State Council of China (1994); and 3) the Grand Award of Sun Yueqi Energy, awarded by the Ministry of Science-Technology of China (1997). He has also received four Science-Technology Progress awards, one of which is a China National Award, the other three being from CNPC and PetroChina.