Given discretized functional data f defined over a grid, find the knots that are quasi-optimal for
the piecewise constant basis. The knots are 'quasi-optimal' because optimality is verified only locally,
at each step when a knot is added to the already existing ones. There is no guarantee that the final set of knots
is globally optimal among all possible sets of knots of the same size.
The function uses 'add_split' to evaluate L new knots, considering only the intervals that contain
more than the specified (by M) number of points. If the function cannot return the L additional knots
that were requested, a warning reports the maximal number of new knots that can be added.
Each knot is added so that it achieves the largest drop in the AMSE (which is the sum
of mean squared errors across all functional data).
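The greedy step described above can be illustrated with a minimal base R sketch. It is not the package's implementation: `interval_sse`, `best_split`, and `greedy_knots` are hypothetical names introduced here, and the rule that intervals with fewer than M points are skipped is an assumption taken from the description of M, so the package's exact eligibility rule may differ.

```r
# Hypothetical helper: squared error of the best constant fit to each row of f
# on columns (a+1):b, summed over all rows.
interval_sse <- function(f, a, b) {
  block <- f[, (a + 1):b, drop = FALSE]
  sum((block - rowMeans(block))^2)
}

# Hypothetical helper: the split of the interval (a, b] with the largest error drop.
# Assumption: intervals with fewer than M points are not split.
best_split <- function(f, a, b, M) {
  if (b - a < max(M, 2)) return(NULL)
  total <- interval_sse(f, a, b)
  cand <- (a + 1):(b - 1)
  drops <- vapply(cand, function(s) {
    total - interval_sse(f, a, s) - interval_sse(f, s, b)
  }, numeric(1))
  list(knot = cand[which.max(drops)], drop = max(drops))
}

# Greedy loop: add up to L knots, one at a time, each with the largest drop.
greedy_knots <- function(f, knots, L, M = 5) {
  for (l in seq_len(L)) {
    props <- lapply(seq_len(length(knots) - 1), function(i) {
      best_split(f, knots[i], knots[i + 1], M)
    })
    props <- Filter(Negate(is.null), props)
    if (length(props) == 0) {
      warning("only ", l - 1, " new knots could be added")
      break
    }
    drops <- vapply(props, function(p) p$drop, numeric(1))
    knots <- sort(c(knots, props[[which.max(drops)]]$knot))
  }
  knots
}

# Toy data: two curves with a single jump after the 4th grid point.
f <- matrix(rep(c(0, 0, 0, 0, 1, 1, 1, 1, 1, 1), each = 2), nrow = 2)
greedy_knots(f, knots = c(0, 10), L = 1, M = 2)
```

On this toy data the single added knot lands at 4, the position of the jump. Each step is optimal only given the knots already chosen, which is why the result is quasi-optimal rather than globally optimal.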
add_knots(f, f_v = NULL, knots, L, M = 5)
Argument | Description |
---|---|
f | functional data, i.e. a matrix of values, row-wise evaluated over equidistant arguments (not given) |
f_v | validation functional data, i.e. a matrix of values, row-wise evaluated over equidistant arguments, used to validate the decrease in the AMSE over the knots selected from the functional data f |
knots | input knots, an ordered sequence of K integers in the range 0:nx, where nx=dim(f)[2]. The intervals between knots are open on the left-hand side (knots[i]) and closed on the right-hand side (knots[i+1]), i.e. the i-th interval is knots[i]+1,...,knots[i+1] |
L | number of additional knots |
M | the minimal number of points per interval between knots required for a further split to be considered in the optimization. The default is 5. If there are fewer than M points in an interval, that interval is not split further. The program will stop if there are too few points to find the requested number of knots under the given restriction on M. |
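The interval convention for knots and the role of M can be checked on a small example. This is only an illustrative sketch. The values of `nx`, `knots`, and `M` are made up, and the eligibility test `sizes >= M` is an assumption based on the description of M above, so the exact rule applied by the package may differ.

```r
nx <- 10
knots <- c(0, 4, 10)   # ordered integers in 0:nx
M <- 5

# Grid indices of the i-th interval: knots[i]+1, ..., knots[i+1]
intervals <- lapply(seq_len(length(knots) - 1), function(i) {
  (knots[i] + 1):knots[i + 1]
})

sizes <- diff(knots)    # number of grid points in each interval
eligible <- sizes >= M  # assumption: intervals with fewer than M points are not split

intervals
sizes
eligible
```

Taking 0 as the first knot makes the first interval start at grid index 1, so the intervals partition 1:nx without overlap.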
A list with the following values (component names as printed in the example output):
Fknots: the sequence of K+L (old and new) knots
FAMSE: the corresponding sequence of K+L-1 average mean squared errors within the intervals between consecutive knots
APPRERR: the decreasing sequence of the averaged squared L2 norms ||f_1 - hat f_1l||_2^2 + ... + ||f_n - hat f_nl||_2^2, l = 0,...,L, where hat f_il is the piecewise constant approximation of f_i with l knots added to the input knots.
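The approximation error in the last component can be sketched in base R as follows. `approx_error` is a hypothetical helper, not a function of the package, and its normalization (the mean over all rows and grid points) is an assumption chosen to match the initial AMSE value computed in the example, `mean((nx-1)/nx*apply(f,1,var))`.

```r
# Hypothetical helper: fit each row of f by its mean on every interval
# knots[i]+1, ..., knots[i+1], then return the mean squared residual.
approx_error <- function(f, knots) {
  fitted <- f
  for (i in seq_len(length(knots) - 1)) {
    cols <- (knots[i] + 1):knots[i + 1]
    # The vector of row means is recycled down each selected column.
    fitted[, cols] <- rowMeans(f[, cols, drop = FALSE])
  }
  mean((f - fitted)^2)
}

set.seed(1)
f <- matrix(rnorm(5 * 20), nrow = 5)  # toy data: 5 curves on a grid of 20 points
nx <- ncol(f)

e0 <- approx_error(f, c(0, nx))       # no inner knots
e1 <- approx_error(f, c(0, 10, nx))   # one inner knot added

# With no inner knots the error reduces to the rescaled row variances.
all.equal(e0, mean((nx - 1) / nx * apply(f, 1, var)))
e1 <= e0
```

Adding a knot can never increase this error, because the interval means minimize the squared error on each interval. This is why the returned sequence is decreasing.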
Nassar, H., Podgórski, K. Empirically driven orthonormal bases for functional data analysis. Proceedings of European Numerical Mathematics and Advanced Applications Conference 2019. Cognitive Systems, Department of Applied Mathematics and Computer Science, Technical University of Denmark, Denmark
Basna, R., Nassar, H., Podgórski, K. Machine Learning Assisted Orthonormal Bases Selection for Functional Data Analysis. (preprint)
split for constructing a split at a given knot; opt_split for finding the optimal split within one interval; add_splitw for selecting the optimal split from a set of potential splits.
n=10 #number of samples
#f=rbetafda(n) #generating data
#f=rbetafda(n,ta=3,tb=3) #generating data
nx=1000
f=rbetafda(n,nx,ta=3,tb=3) #generating data
nx=dim(f)[2] #size of the equidistant one dimensional grid
hf=1/(nx+1) #increment size
grid=matrix(seq(hf, 1-hf, by=hf), nrow=1) #grid
xx=vector()
for(i in 1:(nx-1)){
  Q=split(f,i)
  xx[i]=Q[1]*i/nx+Q[2]*(nx-i)/nx
}
plot(xx)
AMSE=c(mean((nx-1)/nx*apply(f,1,var)))
knots=c(0,nx)
#We take zero as the location of the first knot since we want the intervals pointed
# by 'knots' to be open-close, i.e. the k-th interval is 'knots[k]+1, knots[k+1]'
K=length(knots)
KS=add_knots(f,knots=knots,L=10, M = 5)
#> proposed splits is 645
#> proposed splits is 292 865
#> [1] "printing the new knot"
#> [1] 645
#> proposed splits is 133 458 865
#> [1] "printing the new knot"
#> [1] 292
#> proposed splits is 133 458 761 967
#> [1] "printing the new knot"
#> [1] 865
#> proposed splits is 59 211 458 761 967
#> [1] "printing the new knot"
#> [1] 133
#> proposed splits is 59 211 373 555 761 967
#> [1] "printing the new knot"
#> [1] 458
#> proposed splits is 59 211 373 555 761 925 994
#> [1] "printing the new knot"
#> [1] 967
#> proposed splits is 59 211 373 555 706 814 925 994
#> [1] "printing the new knot"
#> [1] 761
#> proposed splits is 59 211 373 555 706 814 925 984 NA
#> [1] "printing the new knot"
#> [1] 994
#> proposed splits is 59 211 373 555 706 814 897 948 984 NA
#> [1] "printing the new knot"
#> [1] 925
#> proposed splits is 59 211 373 506 602 706 814 897 948 984 NA
#> [1] "printing the new knot"
#> [1] 555
KS
#> $Fknots
#> [1] 0 133 292 458 555 645 761 865 925 967 994 1000
#>
#> $FAMSE
#> [1] 0.017752663 0.013880798 0.012814407 0.003403344 0.003627512 0.011056448
#> [7] 0.012445392 0.006045039 0.008473167 0.023907391 0.104279808
#>
#> $APPRERR
#> [1] 0.27794695 0.16970367 0.09762801 0.05974294 0.04799061 0.03666111
#> [7] 0.02599530 0.01828658 0.01567433 0.01369670 0.01191857
#>
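The returned list can be inspected with a few base R calls. The sketch below does not call the package. It copies the Fknots and APPRERR values printed above into plain vectors, so it only illustrates one possible way of reading the output.

```r
# Values copied from the printed output of KS above.
Fknots <- c(0, 133, 292, 458, 555, 645, 761, 865, 925, 967, 994, 1000)
APPRERR <- c(0.27794695, 0.16970367, 0.09762801, 0.05974294, 0.04799061, 0.03666111,
             0.02599530, 0.01828658, 0.01567433, 0.01369670, 0.01191857)

diff(Fknots)              # number of grid points in each interval between knots
-diff(APPRERR)            # drop in the approximation error from each added knot
1 - APPRERR / APPRERR[1]  # fraction of the initial error removed after l knots
```

The interval sizes sum to nx = 1000, and the error decreases with every added knot, as the description of the returned values states.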