By Hongmin Cai (auth.), Petra Perner (eds.)

This book constitutes the refereed proceedings of the 11th Industrial Conference on Data Mining, ICDM 2011, held in New York, USA, in September 2011.

The 22 revised full papers presented were carefully reviewed and selected from 100 submissions. The papers are organized in topical sections on data mining in medicine and agriculture, data mining in marketing, data mining for industrial processes and in telecommunication, multimedia data mining, theoretical aspects of data mining, data warehousing, web mining and information mining.

While black-box methods such as neural nets and support vector machines have been heavily used in technical domains in the past, methods with explanation capability are preferred in medical domains. Nowadays, after more work on the advantages and disadvantages of the methods, data mining methods with explanation capability are also used in technical domains. C4.5 is the most preferred method since it works well on average regardless of the data set being used. This method can easily learn a decision tree without heavy user interaction, whereas with neural nets a lot of time is spent on training the net.

It splits the whole data set into blocks of n and runs cross-validation. The output of cross-validation is the mean accuracy. As you might know from statistics, it is much better to estimate a measure by computing it separately on each block of a data set split into blocks and averaging the results than to estimate it from a single shot on the whole data set. Moreover, the variance of the accuracy gives another hint about how good the measure is: if the variance is high, there is much noise in the data; if the variance is low, the result is much more stable.
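The block-wise evaluation described above can be sketched as follows. This is a minimal illustration, not the tool the text refers to: the function names, the scalar features and the 1-nearest-neighbor placeholder classifier are all assumptions made for the example. It returns the mean accuracy over the blocks together with the variance of the per-block accuracies.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k blocks (hypothetical helper)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def nearest_neighbor_predict(train_x, train_y, x):
    """Placeholder classifier: label of the closest training point (scalar features)."""
    best = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_y[best]


def cross_validate(xs, ys, k=5, seed=0):
    """Return (mean accuracy, variance of accuracy, per-block accuracies)."""
    folds = k_fold_indices(len(xs), k, seed)
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        # Train on every block except the held-out one.
        train_x = [xs[i] for i in range(len(xs)) if i not in held]
        train_y = [ys[i] for i in range(len(xs)) if i not in held]
        correct = sum(
            nearest_neighbor_predict(train_x, train_y, xs[i]) == ys[i]
            for i in held_out
        )
        accuracies.append(correct / len(held_out))
    # Population variance of the block accuracies: high values suggest noisy data.
    return statistics.mean(accuracies), statistics.pvariance(accuracies), accuracies
```

A low variance across blocks indicates a stable estimate, while a high variance suggests that the accuracy depends strongly on which block is held out.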

$$(\forall i \in \{1, \ldots, n\}) \quad \varphi_i(\mathbf{v}[n+2]) = |\mathbf{e}_i[n+2]^T \mathbf{v}[n+2]| = |w_i| \tag{41}$$

where $\mathbf{e}_i[n+2]$ are the unit vectors and $\mathbf{v}[n+2] = [\mathbf{w}[n]^T, \theta, \beta]^T$. The modified CPL criterion function $\Psi_H(\mathbf{v}[n+2])$ has the following form [9]:

$$\Psi_H(\mathbf{v}[n+2]) = \Phi_H(\mathbf{v}[n+2]) + \lambda \sum_{i \in \{1, \ldots, n\}} \gamma_i \, \varphi_i(\mathbf{v}[n+2]) \tag{42}$$

where $\lambda$ ($\lambda \geq 0$) is the cost level and the feature costs $\gamma_i$ are typically equal to one. The criterion function $\Psi_H(\mathbf{v}[n+2])$ (42), like the function $\Phi_H(\mathbf{v}[n+2])$ (32), is convex and piecewise-linear (CPL).

…, $w_{\lambda n}^{*}]^T$ (22). (43)

The optimal vector $\mathbf{v}_{\lambda}^{*}[n+2]$ (43) allows one to define both the interval regression model (4) and the following decision rule of the linear classifier, which operates on elements of the sets $H^{+}$ or $H^{-}$ (25) ($\mathbf{z}[n+2] = \mathbf{z}_j^{+}[n+2]$ or $\mathbf{z}[n+2] = \mathbf{z}_j^{-}[n+2]$ (21)).
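To make the structure of the modified criterion (42) concrete, the sketch below evaluates the feature-cost term for a given vector v[n+2]. It is an illustration under stated assumptions only: the function names are hypothetical, and because the definition of ΦH (32) is not reproduced in this excerpt, its value is passed in as an already computed number. Note that only the first n components (the weights wi) are penalized, while θ and β are not.

```python
def feature_cost_penalty(v, n, gammas=None):
    """Sum of gamma_i * |w_i| over the first n components of v (the weights).

    v is assumed to be laid out as [w_1, ..., w_n, theta, beta].
    """
    if len(v) != n + 2:
        raise ValueError("v must have n + 2 components")
    if gammas is None:
        gammas = [1.0] * n  # feature costs are typically equal to one
    # phi_i(v) = |e_i^T v| = |w_i| for i in 1..n, as in (41)
    return sum(g * abs(w) for g, w in zip(gammas, v[:n]))


def modified_criterion(phi_h_value, v, n, lam, gammas=None):
    """Psi_H(v) = Phi_H(v) + lambda * sum_i gamma_i * phi_i(v), as in (42).

    phi_h_value is the value of Phi_H(v), assumed to be computed elsewhere.
    """
    if lam < 0:
        raise ValueError("the cost level lambda must be non-negative")
    return phi_h_value + lam * feature_cost_penalty(v, n, gammas)
```

With unit feature costs the added term is λ times the sum of the absolute weights, so increasing λ increases the price paid for every nonzero weight.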

