International Journal on Soft Computing, Artificial Intelligence and Applications (IJSCAI), Vol. 2, No. 2, April 2013

STUDY OF ε-SMOOTH SUPPORT VECTOR REGRESSION AND COMPARISON WITH ε-SUPPORT VECTOR REGRESSION AND POTENTIAL SUPPORT VECTOR MACHINES FOR PREDICTION FOR THE ANTITUBERCULAR ACTIVITY OF OXAZOLINES AND OXAZOLES DERIVATIVES

Doreswamy¹ and Chanabasayya M. Vastrad²

¹Department of Computer Science, Mangalore University, Mangalagangotri-574 199, Karnataka, India
Doreswamyh@yahoo.com

²Department of Computer Science, Mangalore University, Mangalagangotri-574 199, Karnataka, India
channu.vastrad@gmail.com

ABSTRACT

This study proposes a new smoothing method for solving ε-support vector regression (ε-SVR), which tolerates a small error in fitting a given data set nonlinearly. The method is a smooth unconstrained optimization reformulation of the traditional linear programming problem associated with ε-insensitive support vector regression. We term this reformulated problem ε-smooth support vector regression (ε-SSVR). The performance and predictive ability of ε-SSVR are investigated and compared with other methods, namely LIBSVM (ε-SVR) and P-SVM. In the present study, two Oxazolines and Oxazoles molecular descriptor data sets were evaluated. We demonstrate the merits of our algorithm in a series of experiments. Preliminary experimental results illustrate that the proposed approach improves regression performance and learning efficiency. In both studied cases, the predictive ability of the ε-SSVR model is comparable or superior to that obtained by LIBSVM and P-SVM. The results indicate that ε-SSVR can be used as a powerful alternative modeling method for regression studies. The experimental results show that the presented ε-SSVR algorithm performs more precisely and effectively than LIBSVM and P-SVM in predicting antitubercular activity.

KEYWORDS

ε-SSVR, Newton-Armijo, LIBSVM, P-SVM

DOI: 10.5121/ijscai.2013.2204

1. INTRODUCTION

This paper addresses supervised learning of real-valued functions. We consider a sequence $S = \{(x_1, y_1), \ldots, (x_m, y_m)\}$ of descriptor-target pairs, where the descriptors are vectors in $\mathbb{R}^n$ and the targets are real-valued scalars, $y_i \in \mathbb{R}$. Our aim is to learn a function $f: \mathbb{R}^n \to \mathbb{R}$ that closely approximates the target values from their corresponding descriptor vectors. Such a function is usually referred to as a regression function, or a regressor for short. The main aim of regression problems is to find a function $f(x)$ that can correctly predict the target values $y$ of new input descriptor data points $x$ by learning from the given training data set $S$. Here, learning from a given training data set means finding a linear surface that tolerates a small error in fitting this training data set. Very small errors that fall within some tolerance, say $\varepsilon$, are ignored by using an ε-insensitive loss function, which may lead to improved generalization ability. Following the idea of support vector machines (SVMs) [1-4], the function $f(x)$ is also made as flat as possible in fitting the training data. This problem is called ε-support vector regression (ε-SVR), and a descriptor data point $x_i \in \mathbb{R}^n$ is called a support vector if $|f(x_i) - y_i| \ge \varepsilon$. Generally, ε-SVR is formulated as a constrained minimization problem [5-6], specifically a convex quadratic programming problem or a linear programming problem [7-9]. Such formulations introduce $2m$ additional nonnegative variables and $2m$ inequality constraints, which enlarge the problem size and could increase the computational complexity of solving the problem. In our approach, we change the model slightly and apply the smoothing methods that have been widely used for solving important mathematical programming problems [10-14] and the support vector machine for classification [15] to solve the problem directly as an unconstrained minimization problem. We name this reformulated problem ε-smooth support vector regression (ε-SSVR). Because the objective function of our unconstrained minimization problem is infinitely differentiable, we use a fast Newton-Armijo technique to solve this reformulation. It has been shown that the sequence generated by the Newton-Armijo technique converges to the unique solution globally and quadratically [15]. Taking advantage of the ε-SSVR formulation, we only need to solve a system of linear equations iteratively instead of solving a convex quadratic program or a linear program, as is the case with conventional ε-SVR. Thus, we do not need to use any sophisticated optimization package to solve ε-SSVR. We apply this approach to regression on an Oxazolines and Oxazoles molecular descriptor dataset.

The proposed ε-SSVR model has strong mathematical properties, such as strong convexity and infinite differentiability. To demonstrate the capability of the proposed ε-SSVR in solving regression problems, we employ ε-SSVR to predict the antituberculosis activity of Oxazolines and Oxazoles agents. We also compare our ε-SSVR model with P-SVM [16-17] and LIBSVM [18] in terms of prediction accuracy. The proposed ε-SSVR algorithm is implemented in MATLAB.

A word about our notation and background material is given below. All vectors are column vectors throughout this paper. For a vector $x$ in the $n$-dimensional real descriptor space $\mathbb{R}^n$, the plus function $x_+$ is defined as $(x_+)_i = \max\{0, x_i\}$, $i = 1, \ldots, n$. The scalar (inner) product of two vectors $x$ and $y$ in $\mathbb{R}^n$ is denoted by $x^\top y$, and the $p$-norm of $x$ is denoted by $\|x\|_p$. For a matrix $A \in \mathbb{R}^{m \times n}$, $A_i$ is the $i$th row of $A$, which is a row vector in $\mathbb{R}^n$. A column vector of ones of arbitrary dimension is denoted by $\mathbf{1}$. For $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times l}$, the kernel $K(A, B)$ maps $\mathbb{R}^{m \times n} \times \mathbb{R}^{n \times l}$ into $\mathbb{R}^{m \times l}$. In particular, if $x$ and $y$ are column vectors in $\mathbb{R}^n$, then $K(x^\top, y)$ is a real number, $K(A, x) = K(x^\top, A^\top)^\top$ is a column vector in $\mathbb{R}^m$, and $K(A, A^\top)$ is an $m \times m$ matrix. If $f$ is a real-valued function defined on $\mathbb{R}^n$, the gradient of $f$ at $x$ is denoted by $\nabla f(x)$, which is a row vector in $\mathbb{R}^n$, and the $n \times n$ Hessian matrix of second partial derivatives of $f$ at $x$ is denoted by $\nabla^2 f(x)$. The base of the natural logarithm is denoted by $e$.

2. MATERIALS AND ALGORITHMS

2.1 The Data Set

The molecular descriptors of 100 Oxazolines and Oxazoles derivatives [19-20], which act as H37Rv inhibitors, were analyzed. These molecular descriptors were generated using the Padel-Descriptor tool [21]. The dataset covers a diverse set of molecular descriptors with a wide range of inhibitory activities against H37Rv. The pIC50 (observed biological activity) values range from -1 to 3. The dataset can be arranged in a data matrix. This data matrix x contains m samples (molecular structures) in rows and n descriptors in columns. The vector y of order m × 1 denotes the measured activity of interest, i.e. pIC50. Before modeling, the dataset is scaled.

2.2 The Smooth ε-Support Vector Regression (ε-SSVR)

We consider a given dataset $S$ that consists of $m$ points in the $n$-dimensional real descriptor space $\mathbb{R}^n$, denoted by the matrix $A \in \mathbb{R}^{m \times n}$, and $m$ real-valued observations, one associated with each descriptor. That is, $S = \{(A_i, y_i) \mid A_i \in \mathbb{R}^n,\ y_i \in \mathbb{R},\ i = 1, \ldots, m\}$. We would like to find a nonlinear regression function $f(x)$ that tolerates a small error in fitting this given data set. This can be achieved by using the ε-insensitive loss function, which sets an ε-insensitive "tube" around the data within which errors are ignored. Also, following the idea of support vector machines (SVMs) [1-4], the function $f(x)$ is made as flat as possible in fitting the training data set. We start with the linear regression function, expressed as $f(x) = x^\top w + b$. This problem can be formulated as an unconstrained minimization problem as follows:

\[
\min_{(w,b) \in \mathbb{R}^{n+1}} \ \frac{1}{2} w^\top w + C\, \mathbf{1}^\top |\xi|_\varepsilon \tag{1}
\]

where $|\xi|_\varepsilon \in \mathbb{R}^m$, with $(|\xi|_\varepsilon)_i = \max\{0, |A_i w + b - y_i| - \varepsilon\}$, denotes the fitting errors, and the positive control parameter $C$ weights the tradeoff between the fitting errors and the flatness of the regression function $f(x)$. To handle the ε-insensitive loss function in the objective function of the above minimization problem, it is traditionally reformulated as a constrained minimization problem expressed as follows:

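\[
\begin{aligned}
\min_{(w,b,\xi,\xi^*)} \quad & \frac{1}{2} w^\top w + C\, \mathbf{1}^\top (\xi + \xi^*) \\
\text{subject to} \quad & Aw + \mathbf{1}b - y \le \mathbf{1}\varepsilon + \xi, \\
& -Aw - \mathbf{1}b + y \le \mathbf{1}\varepsilon + \xi^*, \\
& \xi,\ \xi^* \ge 0.
\end{aligned} \tag{2}
\]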

Problem (2), which is equivalent to problem (1), is a convex quadratic minimization problem with n + 1 free variables, 2m nonnegative variables, and 2m inequality constraints. However, introducing more variables and constraints in the formulation enlarges the problem size and could increase the computational complexity of solving the regression problem.

In our smoothing approach, we change the model slightly and solve it directly as an unconstrained minimization problem without adding any new variables or constraints.

Figure 1. (a) $|x|_\varepsilon^2$ and (b) $p_\varepsilon^2(x, \alpha)$ with α = 5, ε = 1.

That is, the square of the 2-norm ε-insensitive loss, $\| |Aw + \mathbf{1}b - y|_\varepsilon \|_2^2$, is minimized with weight $\frac{C}{2}$ in place of the 1-norm of the ε-insensitive loss as in Eqn. (1). In addition, we add the term $\frac{1}{2} b^2$ to the objective function to induce strong convexity and to guarantee that the problem has a unique global optimal solution. This yields the following unconstrained minimization problem:

\[
\min_{(w,b) \in \mathbb{R}^{n+1}} \ \frac{1}{2} \left( w^\top w + b^2 \right) + \frac{C}{2} \sum_{i=1}^{m} |A_i w + b - y_i|_\varepsilon^2 \tag{3}
\]

This formulation has been proposed in active set support vector regression [22] and solved in its dual form. Motivated by the smooth support vector machine for classification (SSVM) [15], the squared ε-insensitive loss function in the above formulation can be accurately approximated by a smooth function that is infinitely differentiable and is described below. Thus, we are able to use a fast Newton-Armijo algorithm to solve the approximation problem. Before we derive the smooth approximation function, we note some useful observations:

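\[
\begin{aligned}
|x|_\varepsilon &= \max\{0, |x| - \varepsilon\} \\
&= \max\{0, x - \varepsilon\} + \max\{0, -x - \varepsilon\} \\
&= (x - \varepsilon)_+ + (-x - \varepsilon)_+ .
\end{aligned} \tag{4}
\]

In addition, $(x - \varepsilon)_+ \cdot (-x - \varepsilon)_+ = 0$ for all $x \in \mathbb{R}$ and $\varepsilon > 0$. Thus, we have

\[
|x|_\varepsilon^2 = (x - \varepsilon)_+^2 + (-x - \varepsilon)_+^2 . \tag{5}
\]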
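The identity in Eqn. (5) can be checked numerically. The following is a minimal Python sketch of that check; the function names `plus`, `eps_insensitive` and `eps_insensitive_sq_split` are illustrative choices and do not come from the paper.

```python
def plus(x):
    """Plus function x_+ = max{0, x} for a scalar x."""
    return max(0.0, x)


def eps_insensitive(x, eps):
    """epsilon-insensitive loss |x|_eps = max{0, |x| - eps}."""
    return max(0.0, abs(x) - eps)


def eps_insensitive_sq_split(x, eps):
    """Right-hand side of Eqn. (5): (x - eps)_+^2 + (-x - eps)_+^2."""
    return plus(x - eps) ** 2 + plus(-x - eps) ** 2


if __name__ == "__main__":
    eps = 1.0
    for x in (-3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.5):
        lhs = eps_insensitive(x, eps) ** 2
        rhs = eps_insensitive_sq_split(x, eps)
        print(f"x={x:+.1f}  |x|_eps^2={lhs:.4f}  split form={rhs:.4f}")
```

The two columns agree because at most one of the two plus terms is nonzero for any given x, so the cross term in the square vanishes.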

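In SSVM [15], the plus function $x_+$ is approximated by a smooth p-function, $p(x, \alpha) = x + \frac{1}{\alpha} \log(1 + e^{-\alpha x})$, $\alpha > 0$. It is straightforward to replace $|x|_\varepsilon^2$ by a very accurate smooth approximation given by:

\[
p_\varepsilon^2(x, \alpha) = \left( p(x - \varepsilon, \alpha) \right)^2 + \left( p(-x - \varepsilon, \alpha) \right)^2 . \tag{6}
\]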
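As an illustration of Eqn. (6), the sketch below evaluates the p-function and the smooth approximation for scalar inputs. It is a minimal example under stated assumptions: the names `p` and `p_eps_sq` are chosen here, and the p-function is computed in the algebraically equivalent form log(1 + exp(αx))/α, rearranged to avoid overflow, which is an implementation choice rather than something the paper prescribes.

```python
import math


def p(x, alpha):
    """Smooth p-function p(x, alpha) = x + log(1 + exp(-alpha * x)) / alpha.

    Evaluated as (max(t, 0) + log(1 + exp(-|t|))) / alpha with t = alpha * x,
    which is the same quantity but never calls exp() on a large argument.
    """
    t = alpha * x
    return (max(t, 0.0) + math.log1p(math.exp(-abs(t)))) / alpha


def p_eps_sq(x, alpha, eps):
    """Smooth approximation of |x|_eps^2 from Eqn. (6)."""
    return p(x - eps, alpha) ** 2 + p(-x - eps, alpha) ** 2


if __name__ == "__main__":
    # Same setting as Figure 1: alpha = 5, eps = 1.
    for x in (0.0, 1.0, 2.0):
        exact = max(0.0, abs(x) - 1.0) ** 2
        print(x, exact, p_eps_sq(x, alpha=5.0, eps=1.0))
```

Since p(x, α) is never smaller than the plus function, the smooth value printed in the last column is never below the exact squared loss.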

Figure 1 illustrates the squared ε-insensitive loss function and its smooth approximation in the case of α = 5 and ε = 1. We call this approximation the $p_\varepsilon^2$-function with smoothing parameter α. This $p_\varepsilon^2$-function is used here to replace the squared ε-insensitive loss function of Eqn. (3) to obtain our smooth support vector regression (ε-SSVR):

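\[
\begin{aligned}
\min_{(w,b) \in \mathbb{R}^{n+1}} \Phi_{\varepsilon,\alpha}(w, b)
&:= \min_{(w,b) \in \mathbb{R}^{n+1}} \ \frac{1}{2} \left( w^\top w + b^2 \right) + \frac{C}{2} \sum_{i=1}^{m} p_\varepsilon^2(A_i w + b - y_i, \alpha) \\
&= \min_{(w,b) \in \mathbb{R}^{n+1}} \ \frac{1}{2} \left( w^\top w + b^2 \right) + \frac{C}{2}\, \mathbf{1}^\top p_\varepsilon^2(Aw + \mathbf{1}b - y, \alpha),
\end{aligned} \tag{7}
\]

where $p_\varepsilon^2(Aw + \mathbf{1}b - y, \alpha) \in \mathbb{R}^m$ is defined componentwise by $p_\varepsilon^2(Aw + \mathbf{1}b - y, \alpha)_i = p_\varepsilon^2(A_i w + b - y_i, \alpha)$.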

This problem is a strongly convex minimization problem without any constraints. It is not difficult to show that it has a unique solution. Additionally, the objective function in Eqn. (7) is infinitely differentiable, thus we can use a fast Newton-Armijo technique to solve the problem.

Before we solve the problem in Eqn. (7), we have to show that the solution of Eqn. (3) can be obtained by solving Eqn. (7) with α approaching infinity.

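We begin with a simple lemma that bounds the difference between the squared ε-insensitive loss function $|x|_\varepsilon^2$ and its smooth approximation $p_\varepsilon^2(x, \alpha)$.

Lemma 2.2.1. For $x \in \mathbb{R}$ and $|x| < \sigma + \varepsilon$:

\[
p_\varepsilon^2(x, \alpha) - |x|_\varepsilon^2 \le 2 \left( \frac{\log 2}{\alpha} \right)^2 + \frac{2\sigma}{\alpha} \log 2, \tag{8}
\]

where $p_\varepsilon^2(x, \alpha)$ is defined in Eqn. (6).

Proof. We consider three cases. For $-\varepsilon \le x \le \varepsilon$, $|x|_\varepsilon = 0$ and $p(x, \alpha)^2$ is a continuous increasing function, so we have

\[
p_\varepsilon^2(x, \alpha) - |x|_\varepsilon^2 = p(x - \varepsilon, \alpha)^2 + p(-x - \varepsilon, \alpha)^2 \le 2\, p(0, \alpha)^2 = 2 \left( \frac{\log 2}{\alpha} \right)^2,
\]

since $x - \varepsilon \le 0$ and $-x - \varepsilon \le 0$.

For $\varepsilon < x < \varepsilon + \sigma$, using the result in SSVM [15] that $p(x, \alpha)^2 - (x_+)^2 \le \left( \frac{\log 2}{\alpha} \right)^2 + \frac{2\sigma}{\alpha} \log 2$ for $|x| < \sigma$, we have

\[
\begin{aligned}
p_\varepsilon^2(x, \alpha) - (|x|_\varepsilon)^2
&= \left( p(x - \varepsilon, \alpha) \right)^2 + \left( p(-x - \varepsilon, \alpha) \right)^2 - (x - \varepsilon)_+^2 \\
&\le \left( p(x - \varepsilon, \alpha) \right)^2 - (x - \varepsilon)_+^2 + \left( p(0, \alpha) \right)^2 \\
&\le 2 \left( \frac{\log 2}{\alpha} \right)^2 + \frac{2\sigma}{\alpha} \log 2 .
\end{aligned}
\]

Likewise, for the case $-\varepsilon - \sigma < x < -\varepsilon$, we have

\[
p_\varepsilon^2(x, \alpha) - (|x|_\varepsilon)^2 \le 2 \left( \frac{\log 2}{\alpha} \right)^2 + \frac{2\sigma}{\alpha} \log 2 .
\]

Hence, $p_\varepsilon^2(x, \alpha) - |x|_\varepsilon^2 \le 2 \left( \frac{\log 2}{\alpha} \right)^2 + \frac{2\sigma}{\alpha} \log 2$.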
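The bound in Eqn. (8) can be probed numerically. The sketch below is illustrative only: it samples the gap on a grid inside |x| < σ + ε for a few values of α and compares the largest sampled gap with the right-hand side of Eqn. (8). The helper names `gap`, `bound` and `worst_gap`, the grid size and the choice σ = 2 are assumptions made for this example, and a grid check is of course not a proof.

```python
import math


def p(x, alpha):
    """Smooth p-function, evaluated in an overflow-safe form."""
    t = alpha * x
    return (max(t, 0.0) + math.log1p(math.exp(-abs(t)))) / alpha


def gap(x, alpha, eps):
    """p_eps^2(x, alpha) - |x|_eps^2, the left-hand side of Eqn. (8)."""
    smooth = p(x - eps, alpha) ** 2 + p(-x - eps, alpha) ** 2
    return smooth - max(0.0, abs(x) - eps) ** 2


def bound(alpha, sigma):
    """Right-hand side of Eqn. (8)."""
    return 2.0 * (math.log(2.0) / alpha) ** 2 + 2.0 * sigma / alpha * math.log(2.0)


def worst_gap(alpha, eps, sigma, steps=2001):
    """Largest gap on an evenly spaced grid strictly inside |x| < sigma + eps."""
    half = 0.999 * (sigma + eps)
    xs = (-half + 2.0 * half * k / (steps - 1) for k in range(steps))
    return max(gap(x, alpha, eps) for x in xs)


if __name__ == "__main__":
    for alpha in (1.0, 5.0, 50.0, 500.0):
        print(alpha, worst_gap(alpha, eps=1.0, sigma=2.0), bound(alpha, sigma=2.0))
```

Both printed columns shrink as α grows, which is the behaviour the lemma is used for in the convergence argument that follows.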

By Lemma 2.2.1, as the smoothing parameter α approaches infinity, the unique solution of Eqn. (7) approaches the unique solution of Eqn. (3). We shall show this for a function $f_\varepsilon(x)$ given in Eqn. (9) below, which includes the objective function of Eqn. (3), and for a function $g_\varepsilon(x, \alpha)$ given in Eqn. (10) below, which includes the SSVR function of Eqn. (7).

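Theorem 2.2.2. Let $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^{m \times 1}$. Define the real-valued functions $f_\varepsilon(x)$ and $g_\varepsilon(x, \alpha)$ on the $n$-dimensional real molecular descriptor space $\mathbb{R}^n$:

\[
f_\varepsilon(x) = \frac{1}{2} \sum_{i=1}^{m} |A_i x - b_i|_\varepsilon^2 + \frac{1}{2} \|x\|_2^2 \tag{9}
\]

and

\[
g_\varepsilon(x, \alpha) = \frac{1}{2} \sum_{i=1}^{m} p_\varepsilon^2(A_i x - b_i, \alpha) + \frac{1}{2} \|x\|_2^2, \tag{10}
\]

with $\varepsilon, \alpha > 0$.

1. There exists a unique solution $\bar{x}$ of $\min_{x \in \mathbb{R}^n} f_\varepsilon(x)$ and a unique solution $\bar{x}_\alpha$ of $\min_{x \in \mathbb{R}^n} g_\varepsilon(x, \alpha)$.

2. For all $\alpha > 0$, we have the following inequality:

\[
\|\bar{x}_\alpha - \bar{x}\|_2^2 \le m \left( \left( \frac{\log 2}{\alpha} \right)^2 + \xi\, \frac{\log 2}{\alpha} \right), \tag{11}
\]

where $\xi$ is defined as follows:

\[
\xi = \max_{1 \le i \le m} |(A\bar{x} - b)_i| . \tag{12}
\]

Thus $\bar{x}_\alpha$ converges to $\bar{x}$ as α goes to infinity, with an upper bound given by Eqn. (11).

The proof can be adapted from the results in SSVM [15] and is thus omitted here. We now describe a Newton-Armijo algorithm for solving the smooth problem (7).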

2.2.1 A NEWTON-ARMIJO ALGORITHM FOR ε-SSVR

By utilizing the results of the preceding section and taking advantage of the twice differentiability of the objective function in Eqn. (7), we describe a globally and quadratically convergent Newton-Armijo algorithm for solving Eqn. (7).
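The derivatives that a Newton method needs for Eqn. (7) follow from the chain rule. The worked derivation below is a sketch and is not taken from the paper: the stacked variable $z$, the augmented matrix $\tilde{A}$, the residual $r$ and the vectors $g$ and $h$ are notation introduced here for illustration, and the gradient is written as a column vector for convenience.

```latex
z = \begin{pmatrix} w \\ b \end{pmatrix}, \qquad
\tilde{A} = \begin{pmatrix} A & \mathbf{1} \end{pmatrix}, \qquad
r = \tilde{A} z - y

p'(x, \alpha) = \frac{1}{1 + e^{-\alpha x}}, \qquad
p''(x, \alpha) = \frac{\alpha\, e^{-\alpha x}}{\left(1 + e^{-\alpha x}\right)^2}

g_i = 2\, p(r_i - \varepsilon, \alpha)\, p'(r_i - \varepsilon, \alpha)
    - 2\, p(-r_i - \varepsilon, \alpha)\, p'(-r_i - \varepsilon, \alpha)

h_i = 2 \left[ p'(r_i - \varepsilon, \alpha)^2 + p(r_i - \varepsilon, \alpha)\, p''(r_i - \varepsilon, \alpha) \right]
    + 2 \left[ p'(-r_i - \varepsilon, \alpha)^2 + p(-r_i - \varepsilon, \alpha)\, p''(-r_i - \varepsilon, \alpha) \right]

\nabla \Phi_{\varepsilon,\alpha}(z) = z + \frac{C}{2}\, \tilde{A}^\top g, \qquad
\nabla^2 \Phi_{\varepsilon,\alpha}(z) = I + \frac{C}{2}\, \tilde{A}^\top \operatorname{diag}(h)\, \tilde{A}
```

Because $p$, $p'$ and $p''$ are all positive, every $h_i$ is positive, so the Hessian is the identity plus a positive semidefinite matrix. It is therefore positive definite, and a Newton system built from it always has a unique solution.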

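Algorithm 2.3.1 Newton-Armijo Algorithm for ε-SSVR

Start with any choice of initial point $(w^0, b^0) \in \mathbb{R}^{n+1}$. Having $(w^i, b^i)$, terminate if the gradient of the objective function of Eqn. (7) is zero, that is, $\nabla \Phi_{\varepsilon,\alpha}(w^i, b^i) = 0$. Otherwise, calculate $(w^{i+1}, b^{i+1})$ as follows:

1. Newton direction: Determine the direction $d^i \in \mathbb{R}^{n+1}$ by setting equal to zero the linearization of $\nabla \Phi_{\varepsilon,\alpha}(w, b)$ around $(w^i, b^i)$, which results in $n + 1$ linear equations with $n + 1$ variables:

\[
\nabla^2 \Phi_{\varepsilon,\alpha}(w^i, b^i)\, d^i = -\nabla \Phi_{\varepsilon,\alpha}(w^i, b^i)^\top . \tag{13}
\]

2. Armijo step size [1]: Choose a step size $\lambda_i \in \mathbb{R}$ such that

\[
(w^{i+1}, b^{i+1}) = (w^i, b^i) + \lambda_i d^i, \tag{14}
\]

where $\lambda_i = \max\{1, \frac{1}{2}, \frac{1}{4}, \ldots\}$ such that

\[
\Phi_{\varepsilon,\alpha}(w^i, b^i) - \Phi_{\varepsilon,\alpha}\left( (w^i, b^i) + \lambda_i d^i \right) \ge -\delta \lambda_i \nabla \Phi_{\varepsilon,\alpha}(w^i, b^i)\, d^i, \tag{15}
\]

where $\delta \in \left(0, \frac{1}{2}\right)$.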

Note that an important difference between our smoothing approach and that of the traditional SVR [7-9] is that we are solving a linear system of equations (13) here, rather than solving a quadratic program, as is the case with the conventional SVR.
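The steps of the algorithm above can be sketched for the linear model f(x) = xᵀw + b as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: it uses no kernel, the names `fit_essvr`, `evaluate`, `loss_terms` and `solve` are hypothetical, the default parameter values are arbitrary, and a small hand-written Gaussian elimination stands in for a library linear solver so that the example needs only the standard library.

```python
import math


def p(x, a):
    """Smooth p-function p(x, a), evaluated in an overflow-safe form."""
    t = a * x
    return (max(t, 0.0) + math.log1p(math.exp(-abs(t)))) / a


def sigmoid(t):
    """First derivative of p with respect to x, evaluated at t = a * x."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def loss_terms(r, eps, a):
    """Value, first and second derivative in r of p_eps^2(r, a) from Eqn. (6)."""
    val = d1 = d2 = 0.0
    for s in (1.0, -1.0):  # branches p(r - eps, a) and p(-r - eps, a)
        u = s * r - eps
        pu, dp = p(u, a), sigmoid(a * u)
        ddp = a * dp * (1.0 - dp)
        val += pu * pu
        d1 += s * 2.0 * pu * dp
        d2 += 2.0 * (dp * dp + pu * ddp)
    return val, d1, d2


def evaluate(z, rows, y, C, eps, a):
    """Objective of Eqn. (7), its gradient and its Hessian at z = (w, b)."""
    n = len(z)
    val = 0.5 * sum(t * t for t in z)
    g = list(z)
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for row, yi in zip(rows, y):
        x = list(row) + [1.0]  # append 1 so that x . z = A_i w + b
        r = sum(xi * zi for xi, zi in zip(x, z)) - yi
        v, d1, d2 = loss_terms(r, eps, a)
        val += 0.5 * C * v
        for i in range(n):
            g[i] += 0.5 * C * d1 * x[i]
            for j in range(n):
                H[i][j] += 0.5 * C * d2 * x[i] * x[j]
    return val, g, H


def solve(M, v):
    """Solve M d = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [list(M[i]) + [v[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda k: abs(aug[k][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for k in range(c + 1, n):
            f = aug[k][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[k][j] -= f * aug[c][j]
    d = [0.0] * n
    for k in reversed(range(n)):
        d[k] = (aug[k][n] - sum(aug[k][j] * d[j] for j in range(k + 1, n))) / aug[k][k]
    return d


def fit_essvr(rows, y, C=100.0, eps=0.1, a=5.0, delta=0.25, tol=1e-8, max_iter=50):
    """Newton-Armijo iteration for the linear epsilon-SSVR problem; returns (w, b)."""
    z = [0.0] * (len(rows[0]) + 1)
    for _ in range(max_iter):
        val, g, H = evaluate(z, rows, y, C, eps, a)
        if math.sqrt(sum(t * t for t in g)) < tol:
            break
        d = solve(H, [-t for t in g])                  # Newton direction, Eqn. (13)
        slope = sum(gi * di for gi, di in zip(g, d))   # negative for a descent direction
        lam = 1.0
        while lam > 1e-12:                             # Armijo backtracking, Eqn. (15)
            trial = [zi + lam * di for zi, di in zip(z, d)]
            if val - evaluate(trial, rows, y, C, eps, a)[0] >= -delta * lam * slope:
                break
            lam *= 0.5
        z = trial                                      # update of Eqn. (14)
    return z[:-1], z[-1]


if __name__ == "__main__":
    rows = [[0.0], [1.0], [2.0], [3.0], [4.0]]
    targets = [1.0, 3.0, 5.0, 7.0, 9.0]  # y = 2x + 1
    print(fit_essvr(rows, targets))
```

Each iteration costs one (n + 1) × (n + 1) linear solve, which is the point made in the note above: no quadratic programming package is involved. On the toy data the fitted slope and intercept should come out close to 2 and 1, slightly shrunk by the regularization term.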

2.3 LIBSVM

LIBSVM [18] is a library for support vector machines. LIBSVM is currently one of the most widely used SVM software packages. This software contains C-support vector classification (C-SVC), ν-support vector classification (ν-SVC), ε-support vector regression (ε-SVR) and ν-support vector regression (ν-SVR). All SVM formulations supported in LIBSVM are quadratic minimization problems.

2.4 Potential-Support Vector Machines (P-SVM)

P-SVM [16-17] is a supervised learning method used for classification and regression. Like standard support vector machines, it is based on kernels. Kernel methods approach the problem by mapping the data into a high-dimensional feature space, where each coordinate corresponds to one feature of the data items, transforming the data into a set of points in a Euclidean space. In that space, a variety of methods can be used to find relations between the data.

2.5 Experimental Evaluation

In order to evaluate how well each method generalizes to unseen data, we split the entire data set into two parts, the training set and the testing set. The training data was used to generate the regression function, that is, the function learned from the training data; the testing set, which is not involved in the training procedure, was used to evaluate the prediction ability of the resulting regression function. We also used a stratification scheme in splitting the entire data set to keep the "similarity" between training and testing data sets [23]. That is, we tried to make the training set and the testing set have similar observation distributions. A smaller testing error indicates better prediction ability. We performed tenfold cross-validation on each data set [24] and reported the average testing error in our numerical results. Table 1 gives the features of the two descriptor datasets.

Table 1: Features of two descriptor datasets

Data set (Molecular Descriptors of
Oxazolines and Oxazoles Derivatives)    Train Size    Test Size    Attributes
Full                                    75 x 254      25 x 254     254
Reduced                                 75 x 71       25 x 71      71
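The tenfold cross-validation described above can be sketched as an index generator. This is a minimal example under stated assumptions: the name `kfold_indices` and its `seed` parameter are illustrative, and the folds are cut from a plain (optionally shuffled) ordering, so the stratification scheme mentioned in the text is not reproduced here.

```python
import random


def kfold_indices(m, k=10, seed=None):
    """Yield (train, test) index lists for k folds over m samples.

    Fold sizes differ by at most one. If seed is given, the sample order is
    shuffled reproducibly before the folds are cut.
    """
    order = list(range(m))
    if seed is not None:
        random.Random(seed).shuffle(order)
    start = 0
    for fold in range(k):
        size = m // k + (1 if fold < m % k else 0)
        test = order[start:start + size]
        train = order[:start] + order[start + size:]
        yield train, test
        start += size


if __name__ == "__main__":
    for train, test in kfold_indices(100, k=10, seed=0):
        print(len(train), len(test))
```

Averaging a test-set error over the ten (train, test) pairs gives the kind of average testing error reported in the results.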

In all experiments, the 2-norm relative error was chosen to evaluate the discrepancy between the predicted values and the observations. For an observation vector $y$ and the prediction vector $\hat{y}$, the 2-norm relative error (SRE) of the two vectors $y$ and $\hat{y}$ is defined as follows:

\[
\mathrm{SRE} = \frac{\|y - \hat{y}\|_2}{\|y\|_2} \tag{16}
\]

In statistics, the mean absolute error is a quantity used to measure how close predictions are to the eventual outcomes. The mean absolute error (MAE) is given by

\[
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |\hat{y}_i - y_i| = \frac{1}{n} \sum_{i=1}^{n} |e_i| \tag{17}
\]

As the name suggests, the mean absolute error is an average of the absolute errors $e_i = \hat{y}_i - y_i$, where $\hat{y}_i$ is the prediction and $y_i$ the observed value.
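Eqns. (16) and (17) translate directly into code. The following is a minimal sketch on plain Python lists; the function names `sre` and `mae` and the example vectors are chosen here for illustration.

```python
import math


def sre(y, y_hat):
    """2-norm relative error of Eqn. (16): ||y - y_hat||_2 / ||y||_2."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)))
    den = math.sqrt(sum(a * a for a in y))
    return num / den


def mae(y, y_hat):
    """Mean absolute error of Eqn. (17)."""
    return sum(abs(b - a) for a, b in zip(y, y_hat)) / len(y)


if __name__ == "__main__":
    observed = [3.0, 4.0]
    predicted = [3.0, 1.0]
    print(sre(observed, predicted), mae(observed, predicted))
```

For this toy pair the error vector is (0, 3) and the observation has norm 5, so SRE is 3/5 and MAE is 3/2.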

In statistics, the coefficient of determination, denoted $R^2$ and pronounced R squared, is used in the context of statistical models whose main purpose is the prediction of future outcomes on the basis of other related information. $R^2$ is most often seen as a number between 0 and 1, used to describe how well a regression line fits a set of data. An $R^2$ near 1 indicates that a regression line fits the data well, while an $R^2$ close to 0 indicates that a regression line does not fit the data very well. It is the proportion of variability in a data set that is accounted for by the statistical model. It provides a measure of how well future outcomes are likely to be predicted by the model.

\[
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{18}
\]

The predictive power of the models was assessed using the calculated statistical parameters standard error of prediction (SEP) and relative error of prediction (REP%), defined as follows:

\[
\mathrm{SEP} = \left[ \frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n} \right]^{0.5} \tag{19}
\]

\[
\mathrm{REP}(\%) = \frac{100}{\bar{y}} \left[ \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2 \right]^{0.5} \tag{20}
\]

The performance of the models was evaluated in terms of root mean square error (RMSE), which is defined as:

\[
\mathrm{RMSE} = \sqrt{ \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n} } \tag{21}
\]

where $\hat{y}_i$, $y_i$ and $\bar{y}$ are the predicted, observed and mean activity, respectively.
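The remaining measures, Eqns. (18) to (21), can be sketched in the same way. The function names below are illustrative, and the example vectors are made up for the demonstration. Note that SEP in Eqn. (19) and RMSE in Eqn. (21) are the same quantity as written, so `rmse` simply delegates to `sep`.

```python
import math


def r_squared(y, y_hat):
    """Coefficient of determination, Eqn. (18)."""
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def sep(y, y_hat):
    """Standard error of prediction, Eqn. (19)."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(y, y_hat)) / len(y))


def rep_percent(y, y_hat):
    """Relative error of prediction in percent, Eqn. (20)."""
    mean = sum(y) / len(y)
    return 100.0 / mean * sep(y, y_hat)


def rmse(y, y_hat):
    """Root mean square error, Eqn. (21); identical to Eqn. (19) as written."""
    return sep(y, y_hat)


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.0, 2.0, 3.0, 6.0]
    print(r_squared(observed, predicted), sep(observed, predicted),
          rep_percent(observed, predicted), rmse(observed, predicted))
```

Because REP divides by the mean observed activity, it becomes large when that mean is close to zero, which is worth keeping in mind when reading REP values for activities on a pIC50 scale that spans negative and positive values.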

3. RESULTS AND DISCUSSION

In this section, we demonstrate the effectiveness of the proposed ε-SSVR approach by comparing it to LIBSVM (ε-SVR) and P-SVM. In the following experiments, training is done with the Gaussian kernel function $k(x_i, x_j) = \exp\left(-\gamma \|x_i - x_j\|^2\right)$, where γ is the width parameter of the Gaussian kernel and $i, j = 1, \ldots, l$. We perform tenfold cross-validation on each dataset and record the average testing error in our numerical results. The performance of ε-SSVR for regression depends on the combination of several parameters: the capacity parameter C, the ε of the ε-insensitive loss function, and the kernel parameter γ. C is a regularization parameter that controls the tradeoff between maximizing the margin and minimizing the training error. In practice, the parameter C is varied through a wide range of values and the optimal performance is assessed using a separate test set. The effect of the regularization parameter C on the RMSE is shown in Figure 1a for the full descriptor dataset and Figure 1b for the reduced descriptor dataset.
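The Gaussian kernel used in these experiments can be sketched as a function that builds the kernel matrix between two sets of descriptor rows. This is an illustrative example on plain lists; the name `gaussian_kernel` and the sample points are assumptions made here, and `gamma` plays the role of the kernel width parameter in the text.

```python
import math


def gaussian_kernel(A, B, gamma):
    """Kernel matrix with entries exp(-gamma * ||A_i - B_j||^2).

    A and B are lists of equal-length rows; the result has len(A) rows and
    len(B) columns.
    """
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(row_a, row_b)))
         for row_b in B]
        for row_a in A
    ]


if __name__ == "__main__":
    X = [[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]]
    for row in gaussian_kernel(X, X, gamma=0.02):
        print(["%.4f" % v for v in row])
```

A larger gamma makes the entries fall off faster with distance, so each training point influences a smaller neighbourhood; this is why the parameter has to be tuned together with C and ε.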

Figure 1a. The selection of the optimal capacity factor C (8350) for ε-SSVR (ε = 0.1, γ = 0.0217)

For the full descriptor dataset, the RMSE value for the ε-SSVR model at the selected optimal parameter C is 0.3563, which is smaller than the RMSE values for the other two models, LIBSVM (ε-SVR) and P-SVM, which are 0.3665 and 0.5237. Similarly, for the reduced descriptor dataset, the RMSE value for the ε-SSVR model at the selected optimal parameter C is 0.3339, which is smaller than the RMSE values for LIBSVM (ε-SVR) and P-SVM, which are 0.3791 and 0.5237. The optimal value for ε depends on the type of noise present in the data, which is usually unknown. Even if enough knowledge of the noise is available to select an optimal value for ε, there is the practical consideration of the number of resulting support vectors. ε-insensitivity prevents the entire training set from meeting boundary conditions and so allows for the possibility of sparsity in the solution of the dual formulation.

Figure 1b. The selection of the optimal capacity factor C (1000000) for ε-SSVR (ε = 0.1, γ = 0.02)

So, choosing the appropriate value of ε is critical in theory. To find an optimal ε, the root mean square error (RMSE) of LOO cross-validation was calculated for different values of ε. The curves of RMSE versus epsilon (ε) are shown in Figure 2a and Figure 2b.

Figure 2a. The selection of the optimal epsilon (0.1) for ε-SSVR (C = 1000, γ = 0.02)

For the full descriptor dataset, the RMSE value for the ε-SSVR model at the selected optimal epsilon (ε) is 0.3605. The RMSE value for the LIBSVM (ε-SVR) model, 0.3665, is close and comparable to that of the proposed model, while the RMSE value for the P-SVM model, 0.5237, is considerably larger.

Figure 2b. The selection of the optimal epsilon (0.1) for ε-SSVR (C = 10000000, γ = 0.01)

Similarly, for the reduced descriptor dataset, the RMSE value for the ε-SSVR model at the selected optimal epsilon (ε) is 0.3216, which is smaller than the RMSE values for the other two models, LIBSVM (ε-SVR) and P-SVM, which are 0.3386 and 0.4579.

Figure 3a. The selection of the optimal γ (0.02) for ε-SSVR (C = 1000, ε = 0.1)

Parameter tuning was conducted for ε-SSVR, where the γ parameter in the Gaussian kernel function was varied from 0.01 to 0.09 in steps of 0.01 to select the optimal value. The value of γ is chosen by minimizing the LOO tuning error rather than by directly minimizing the training error. The curves of RMSE versus gamma (γ) are shown in Figure 3a and Figure 3b.
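The tuning loop described above can be sketched as a leave-one-out grid search. This is a generic illustration, not the authors' tuning code: `loo_rmse`, `select_gamma` and `make_fit` are hypothetical names, and `make_fit(gamma)` is assumed to return a training function that maps a training set to a `predict(x)` callable, for example a kernel ε-SSVR trainer with the other parameters held fixed.

```python
import math


def loo_rmse(fit, X, y):
    """Leave-one-out RMSE; fit(X_train, y_train) must return a predict(x) callable."""
    sq = 0.0
    for i in range(len(y)):
        predict = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        sq += (predict(X[i]) - y[i]) ** 2
    return math.sqrt(sq / len(y))


def select_gamma(make_fit, X, y, grid):
    """Return the gamma in grid with the smallest LOO RMSE, plus all scores."""
    scores = {g: loo_rmse(make_fit(g), X, y) for g in grid}
    return min(scores, key=scores.get), scores


# Grid from the text: 0.01 to 0.09 in steps of 0.01.
gamma_grid = [round(0.01 * k, 2) for k in range(1, 10)]
```

The same pattern applies to the ε and C curves in Figures 1 and 2: only the parameter that `make_fit` closes over changes, while the other two are held at their current values.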

Figure 3b. The selection of the optimal γ (0.01) for ε-SSVR (C = 1000000, ε = 0.1)

For the full descriptor dataset, the RMSE value for the ε-SSVR model at the selected optimal parameter γ is 0.3607, which is smaller than the RMSE values for the other two models, LIBSVM (ε-SVR) and P-SVM, which are 0.3675 and 0.5224. Similarly, for the reduced descriptor dataset, the RMSE value for the ε-SSVR model at the selected optimal parameter γ is 0.3161, which is smaller than the RMSE values for LIBSVM (ε-SVR) and P-SVM, which are 0.3386 and 0.4579.

The statistical parameters calculated for the ε-SSVR, LIBSVM (ε-SVR) and P-SVM models are presented in Table 2 and Table 3.

Table 2. Performance comparison between ε-SSVR, ε-SVR and P-SVM for the full descriptor dataset

(ε, C, γ)              Algorithm   Train R²   Test R²   MAE      SRE      SEP      REP(%)
(0.1, 1000, 0.0217)    ε-SSVR      0.9790     0.8183    0.0994   0.1071   0.3679   53.7758
(0.1, 1000, 0.0217)    ε-SVR       0.9825     0.8122    0.0918   0.0979   0.3741   54.6693
(0.1, 1000, 0.0217)    P-SVM       0.8248     0.6166    0.2510   0.3093   0.5345   78.1207
(0.1, 8350, 0.0217)    ε-SSVR      0.9839     0.8226    0.0900   0.0939   0.3636   53.1465
(0.1, 8350, 0.0217)    ε-SVR       0.9825     0.8122    0.0918   0.0979   0.3741   54.6693
(0.1, 8350, 0.0217)    P-SVM       0.8248     0.6166    0.2510   0.3093   0.5345   78.1207
(0.1, 1000, 0.02)      ε-SSVR      0.9778     0.8181    0.1019   0.1100   0.3681   53.8052
(0.1, 1000, 0.02)      ε-SVR       0.9823     0.8113    0.0922   0.0984   0.3750   54.8121
(0.1, 1000, 0.02)      P-SVM       0.8248     0.6186    0.2506   0.3093   0.5332   77.9205

In these tables, the statistical parameters R-square ($R^2$), mean absolute error (MAE), 2-norm relative error (SRE), standard error of prediction (SEP) and relative error of prediction (REP%) obtained by applying the ε-SSVR, ε-SVR and P-SVM methods to the test set indicate a good external predictability of the models.

Table 3. Performance comparison between ε-SSVR, ε-SVR and P-SVM for the reduced descriptor dataset

(ε, C, γ)                Algorithm   Train R²   Test R²   MAE      SRE      SEP      REP(%)
(0.1, 1000000, 0.02)     ε-SSVR      0.9841     0.8441    0.0881   0.0931   0.3408   49.8084
(0.1, 1000000, 0.02)     ε-SVR       0.9847     0.7991    0.0827   0.0914   0.3870   56.5533
(0.1, 1000000, 0.02)     P-SVM       0.8001     0.7053    0.2612   0.3304   0.4687   68.4937
(0.1, 10000000, 0.01)    ε-SSVR      0.9849     0.8555    0.0851   0.0908   0.3282   47.9642
(0.1, 10000000, 0.01)    ε-SVR       0.9829     0.8397    0.0892   0.0967   0.3456   50.5103
(0.1, 10000000, 0.01)    P-SVM       0.8002     0.7069    0.2611   0.3303   0.4673   68.3036
(0.1, 1000000, 0.01)     ε-SSVR      0.9796     0.8603    0.0964   0.1056   0.3226   47.1515
(0.1, 1000000, 0.01)     ε-SVR       0.9829     0.8397    0.0892   0.0967   0.3456   50.5103
(0.1, 1000000, 0.01)     P-SVM       0.8002     0.7069    0.2611   0.3303   0.4673   68.3036

The experimental results show that experiments carried out on the reduced descriptor dataset give better results than those on the full descriptor dataset. As can be seen from Table 3, the results of the ε-SSVR models are better than those obtained by the ε-SVR and P-SVM models for the reduced descriptor dataset.

Figure 4. Correlation between observed and predicted values for training set and test set generated by ε-SSVR

Figures 4, 5 and 6 are scatter plots for the three models, showing the correlation between the observed values and the predicted antituberculosis activity in the training and test sets.

Figure 5. Correlation between observed and predicted values for training set and test set generated by ε-SVR

Figure 6. Correlation between observed and predicted values for training set and test set generated by the P-SVM algorithm

Our numerical results have demonstrated that ε-SSVR is a powerful tool for solving regression problems and can handle massive data sets without sacrificing any prediction accuracy. In the tuning process of these experiments, we found that LIBSVM and P-SVM become very slow when the control parameter C becomes larger, while ε-SSVR is quite robust to the control parameter C, although we solved the ε-insensitive regression problem as an unconstrained minimization problem.

4. CONCLUSION

In the present work, we studied ε-SSVR, which is a smooth unconstrained optimization reformulation of the traditional quadratic program associated with ε-insensitive support vector regression. We have compared the performance of the ε-SSVR, LIBSVM and P-SVM models on two datasets. The obtained results show that ε-SSVR can be used to derive statistical models with better qualities and better generalization capabilities than linear regression methods. The ε-SSVR algorithm exhibits better overall performance and better predictive ability than the LIBSVM and P-SVM models. The experimental results indicate that ε-SSVR has high precision and good generalization ability.

ACKNOWLEDGEMENTS

We gratefully thank the Department of Computer Science, Mangalore University, Mangalore, India for technical support of this research.

REFERENCES

[1] Jan Luts, Fabian Ojeda, Raf Van de Plas, Bart De Moor, Sabine Van Huffel, Johan A.K. Suykens, "A tutorial on support vector machine-based methods for classification problems in chemometrics", Analytica Chimica Acta 665 (2010) 129–145

[2] Hongdong Li, Yizeng Liang, Qingsong Xu, "Support vector machines and its applications in chemistry", Chemometrics and Intelligent Laboratory Systems 95 (2009) 188–198

[3] Jason Weston, "Support Vector Machines and Statistical Learning Theory", NEC Labs America, 4 Independence Way, Princeton, USA. http://www.cs.columbia.edu/~kathy/cs4701/documents/jason_svm_tutorial.pdf

[4] Aly Farag and Refaat M Mohamed, "Regression Using Support Vector Machines: Basic Foundations", http://www.cvip.uofl.edu/wwwcvip/research/publications/TechReport/SVMRegressionTR.pdf

[5] Chih-Jen Lin, "Optimization, Support Vector Machines, and Machine Learning", http://www.csie.ntu.edu.tw/~cjlin/talks/rome.pdf

[6] Max Welling, "Support Vector Regression", http://www.ics.uci.edu/~welling/teaching/KernelsICS273B/SVregression.pdf

[7] Alex J. Smola and Bernhard Schölkopf, "A tutorial on support vector regression", Statistics and Computing 14: 199–222, 2004

[8] Qiang Wu and Ding-Xuan Zhou, "SVM Soft Margin Classifiers: Linear Programming versus Quadratic Programming", www6.cityu.edu.hk/ma/doc/people/zhoudx/LPSVMfinal.pdf

[9] Laurent El Ghaoui, "Convex Optimization in Classification Problems", www.stanford.edu/class/ee392o/mit022702.pdf

[10] Donghui Li and Masao Fukushima, "Smoothing Newton and Quasi-Newton methods for mixed complementarity problems", Computational Optimization and Applications, 17, 203-230, 2000

[11] C. Chen and O.L. Mangasarian, "Smoothing Methods for Convex Inequalities and Linear Complementarity problems", Math. Programming, vol. 71, no. 1, pp. 51-69, 1995

[12] X. Chen, L. Qi and D. Sun, "Global and superlinear convergence of the smoothing Newton method and its application to general box constrained variational inequalities", Mathematics of Computation, Volume 67, Number 222, April 1998, Pages 519-540

[13] X. Chen and Y. Ye, "On Homotopy-Smoothing Methods for Variational Inequalities", SIAM J. Control and Optimization, vol. 37, pp. 589-616, 1999.

[14] Peter W. Christensen, "A semi-smooth Newton method for elasto-plastic contact problems", International Journal of Solids and Structures 39 (2002) 2323–2341

[15] Y. J. Lee and O. L. Mangasarian, "SSVM: A smooth support vector machine for classification", Computational Optimization and Applications, Vol. 22, No. 1, 2001, pp. 5-21.

[16] Sepp Hochreiter and Klaus Obermayer, "Support Vector Machines for Dyadic Data", Neural Computation, 18, 1472-1510, 2006. http://ni.cs.tu-berlin.de/software/psvm/index.html

[17] Ismael F. Aymerich, Jaume Piera and Aureli Soria-Frisch, "Potential Support Vector Machines and Self-Organizing Maps for phytoplankton discrimination", in Proceedings of the International Joint Conference on Neural Networks, IJCNN 2010, Barcelona, Spain, 18-23 July, 2010

[18] C.-C. Chang and C.-J. Lin, 2010, "LIBSVM: A Library for Support Vector Machines", http://www.csie.ntu.edu.tw/~cjlin/libsvm

[19] Andrew J. Phillips, Yoshikazu Uto, Peter Wipf, Michael J. Reno, and David R. Williams, "Synthesis of Functionalized Oxazolines and Oxazoles with DAST and Deoxo-Fluor", Organic Letters, 2000, Vol. 2, No. 8, 1165-1168

[20] Moraski GC, Chang M, Villegas-Estrada A, Franzblau SG, Möllmann U, Miller MJ., "Structure-activity relationship of new anti-tuberculosis agents derived from oxazoline and oxazole benzyl esters", Eur J Med Chem. 2010 May;45(5):1703-16. doi: 10.1016/j.ejmech.2009.12.074. Epub 2010 Jan 14.

[21] "Padel-Descriptor", http://padel.nus.edu.sg/software/padeldescriptor/

[22] David R. Musicant and Alexander Feinberg, "Active Set Support Vector Regression", IEEE Transactions on Neural Networks, Vol. 15, No. 2, March 2004

[23] Ian H. Witten and Eibe Frank, "Data Mining: Practical Machine Learning Tools and Techniques", Second Edition, Elsevier.

[24] Payam Refaeilzadeh, Lei Tang, Huan Liu, "Cross-Validation", http://www.cse.iitb.ac.in/~tarung/smt/papers_ppt/ency-cross-validation.pdf

Authors

Doreswamy received the B.Sc degree in Computer Science and the M.Sc degree in Computer Science from the University of Mysore in 1993 and 1995 respectively, and the Ph.D degree in Computer Science from Mangalore University in 2007. After completing his post-graduate degree, he joined St. Joseph's College, Bangalore, and served as Lecturer in Computer Science from 1996 to 1999. He was then elevated to the position of Reader in Computer Science at Mangalore University in 2003. He was the Chairman of the Department of Post-Graduate Studies and Research in Computer Science from 2003-2005 and from 2009-2008, and served in various capacities at Mangalore University. At present he is the Chairman of the Board of Studies and Associate Professor in Computer Science at Mangalore University. His areas of research interest include Data Mining and Knowledge Discovery, Artificial Intelligence and Expert Systems, Bioinformatics, Molecular Modelling and Simulation, Computational Intelligence, Nanotechnology, Image Processing and Pattern Recognition. He has been granted a Major Research Project entitled "Scientific Knowledge Discovery Systems (SKDS) for Advanced Engineering Materials Design Applications" by the funding agency University Grant Commission, New Delhi, India. He has published about 30 peer-reviewed papers in national and international journals and conferences. He received the SHIKSHA RATTAN PURASKAR for his outstanding achievements in the year 2009 and the RASTRIYA VIDYA SARASWATHI AWARD for outstanding achievement in his chosen field of activity in the year 2010.

Chanabasayya M. Vastrad received the B.E. degree and the M.Tech. degree in 2001 and 2006 respectively. He is currently working towards his Ph.D degree in Computer Science and Technology under the guidance of Dr. Doreswamy in the Department of Post-Graduate Studies and Research in Computer Science, Mangalore University.
