High dimensional chaos (HDS).

Welcome to the Web Site dedicated to the algorithm

for estimation of the correlation dimension d of High Dimensional Signals.

The new version has been uploaded on 29 Dec 2018. Some new details have been added and the algorithm has been accelerated.

What has been modified in the version-2 of the HDS toolkit?   New algorithm details are marked with (new)  ....

The modifications marked with (new) were developed after publishing the papers below and are not described in the cited papers.

 

1.      K. Michalak, (2011) Modifications of the Takens-Ellner algorithm for medium- and high-dimensional signals, Physical Review E, Vol. 83(2):026206,

2.     K.P. Michalak. How to estimate the correlation dimension of high-dimensional signals?  AIP Chaos, 24(2):033118, 2014.

3.  K.P. Michalak. Estimating correlation dimension of high-dimensional signals - quick algorithm. AIP Advances 8, 105201 (2018); doi: 10.1063/1.5013255

The notation being used in the functions dhds.m and dhdsall.m  corresponds to the notation being used in the above mentioned articles.

 

DOWNLOAD  HDS-TOOLKIT-v2   (50MB) (updated 29.12.2018)

This is the beta version of the toolkit.

If something does not go, please write me the email with the description of your problem. I try help.

Download the former version 1 if necessary...

 

 There are 2 functions dedicated for estimating correlation dimension d

dhds.m - the short version of algorithm is implemented

and

dhdsall.m - the full version of algorithm is implemented

 

The general call of dhds.m function is:


[ dcopt, Wopt, dcW, dcM, dcMc, WV, Pmax, other ] =

dhds (series, quick precise, MaxIterationTime, Da,  DeflErr,  norm,  NW Wmax WV,  Pmin,  dcW,  fitting,  figures)

 

However, you can use the simplest call:   d = dhds ( series )

 

Use empty string:  ''    if you want use the default values of subsequent input parameters.


  

Let us analyze the input parameters:
 

(1)  series - the signal being analyzed,

     use empty string or nothing for looking the exemplary calculation of the 3xLorenz signal having d = about 6 (new).

 

(2) quick precise_surrog: this parameter joins 2 parapmeters: speed and surrogate number .(new)

    In general, please use the number in the format ssp :    ss - surrogates number, p = 1, 2 or 3
      Joining 2 variables in one is connected with the necessity to keep the compatibility with the version described in citation [3]
      The variables will be separated in future versions
   p - makes it possible to set just one of 3 options of speed/precision:

  Use 1 for preliminary estimation and 3 for final estimation, if necessary. Change other parameters if the results are not satisfactory.

  - 1 or 'quick' - quick, low precision, no surrogates
     -2 or 'medium' - medium (default), , no surrogates
    - 3  or 'precise' - slow, high precision, , no surrogates.
   ss - the number of surrogates calculated to compare result for the original signal with surrogates. (new)

- e.g 11 - quick, 1 surrogate
   - e.g. 72 - medium, 7 surrogates
   - e.g. 503 - precise, 50 surrogates
      The algorithm uses the same parameters for surrogates for subsequent W-iterations: B ED W L as that used for the original signal

 


(3) MaxIterationTime: this parameter makes it possible to schedule the calculations with respect to its duration. It denotes the maximal duration (in seconds) of every W-iteration. There is more di cult to nd distances that are shorter than assumed max when the estimated complexity for the given W increases and the calculations become signi cantly longer. The solution is to increase the accepted de ection error e which determines higher max. The estimation of d becomes possible, however, it is performed with a bit lower precision due to higher e. After MaxIterationTime is reached without nding Da distances, the algorithm takes decision: (a) if more than 50% of Da is reached until now, the calculation follows up to reaching Da, however, e is increased for next iteration; if less than 50% distances is reached, W-iteration is repeated with higher e. e increases by 1% by default and cannot excess 30%. The default MaxIterationTime is 15 s, 40 s or 120 s (new, due to accelaration of the algorithm) depending on quick_precise_surrog value.

It is possible, also, to change the HEADER's variable 'UserAction'. If 'UserAction'=0 than the calculations are repeated using increased Deflection Error e. If 'UserAction'=1, the algorithm stops and asks for the user's decision: 'Follow' using current e'  or  'Break and repeat using increased e'.
 

(4) Da: assumed nbr of distances being shorter than the estimated distance dmax. dmax is estimated based on assumed deflection error e and used embedding dimension m. When omitted, default value of Da depends on the used quick_precise_surrog parameter: Da = 1000 or 3000 or 10000.


(5) DeflErr(1:2): the vector with 2 elements. The 1-st element describes the assumed deflection error e for the first few W-iterations to estimate current
dmax. The default value is 4%. It is increased automatically when the complexity d for the given W increases and it is difficult to find assumed number of distances Da in the assumed MaxIterationTime. Increasing e reduces slighlty the precision of calculations and stronlgy reduces calculation time.

It must be pointed, however, that to small e may shift the results into `noise` region of d=fn(Pmax) relation.

You can use higher initial values e.g. 5-10%

- if the signal is significantly charged with noise,

- if the expected complexity d is very high

- if the signal is short

- if you want obtain robust results in very quick time.

You may use lower values if the complexity of the signal is rather low, the signal is noise-free (generated by some differential equations) and you would like reduce the error connected with e-correction. High values accelerate strongly the calculations for small Ws.

The calculation time for large Ws (large d estimated) should be rather regulated by using MaxIterationTime

 

The 2nd element of this input parameter DeflErr(2) = ecut represents the deflection error for which the dcut is estimated. It must be equal or greater than DeflErr(1)=e. dcut determines the cut-off distance and, thus, the length of the actual dmax vector in the memory space and the duration of its sorting, after Da is reached. The distances shorter than dcut and longer than dmax are the reserve if increase in e is necessary for the given W-iteration. The iteration has not to be repeated. It influences also the Figures being printed as a output of the function. If you use higher value, the relation d = fn (Pmax) is broader. The acceptable values lies between 1 and 30. The unit is [%]. By default, it is 5%.

Both these parameters are automatically increased  by 1 if the assumed number of distances Da being shorter than dmax is not reached in the assumed MaxIterationTime.
The calculation for the given W-iteration is repeated using DeflErr increased by 1 making the estimation of dw possible in the assumed time, however, with a bit lower precision.


(6) normalization : preliminary normalisation of the signal. Acceptable values are 0 and 1. If 1 (default) - the series is converted to possess a normal distribution (0,1). Normalisation can reduce the calculation time, because the higher ratio of distances is shorter than assumed
dmax and Pmax is higher for the assumed deflection error e. Be careful  if the histogram of the signal is strongly non-gaussian, and especially, if the histogram possesses the maximum/a on its borders. Normalisation makes it also possible to use preliminary estimated histogram slopes (stored in attached deltamaxM_PmaxM_normal.mat file). They have not to be estimated individually for the given signal.
 

(7) NW_Wmax_WV: Here, you can set the vector WV with your Window Width values used for calculations. You can just set the vector WV with the assumed W values. E.g. you want to use the same W values for different similar signals.

If the number of elements is less than 4 the algorithm finds default values depending on the number of elements:
- 1-element number : it is interpreted as NW
- 2-element vector it is interpreted as: [NW, Wmax]
- 3-element vector: it is interpreted as: [NW, Wmax, Wmaxdensity]
        NW is the number of elements in WV vector (default is 40),  

  Wmax - maximal value of the vector WV, by default 10t (autocorelation time of the signal).
    Wmaxdensity denotes the value of W for which W values are chosen most dense.
    If you expect/know the range of plateau in dw=fn(W) relation then set Wmaxdensity to be in the center of this plateau in order to increase a bit the precision of polynomial fitting (default is 2.5 ).
 

- if the parameter is 2-dimensional matrix (2 x NW) - the second row is treated as default embedding dimension m for W's defined in the first row (new)

- if the parameter is 3-dimensional matrix (3 x NW)- the second row - as above - embedding dimension, third row is treated as J for W's defined in the first row  and m defined in the second row. (new)

 

- if [0,Wmax] - dw's are calculated for increasing W's calculated using a rule:
      Wnext = Wprevious * Wincr... up to Wmax value starting from W=5samples.
      Use this option as the most universal and independent on . Wincr is equal to 1.1-1.3 depending on quick precise value.


- if 'fntau' or 'tau' or " (empty string): (DEFAULT)

      sets 40 values W from 0.1 to 10 due to estimated autocorrelation time of the signal, unequally distributed with the maximum density for about 2.5 tau .
The distribution of default W-values can be modified in this method by changing the default parameters in the HEADER of the function: NW=40,  Wmax_over_tau=10 and defaultWdense_over_tau=2.5.


(8) Pmin: This parameter determines the method of the choice of Pmin parameter. This parameter does not influence strongly the result of final d estimation if Pmax is high enough. As a rule, it is necessary to exclude the initial part of the vector with the very short distances which do not create a regular plateau. In the case of noise contaminated signal, Pmin represents the range of the strong influence of noise.

The interpretation of this parameter depends on its value:

- If 0 < Pmin < 0.1 - it is treated as Pmin. You can set this value if you have observed the begin of the influence of noise on the graph d=fn(Pmax,W) in the preliminary run of the algorithm. 
  - If Pmin > 0.1 - it denotes the deflection error e (in %) for which Pmin should be estimated.
  - If Pmin = 0  - than no Pmin-cut-off is used and d is estimated using ALL the distances shorter than
dmax (Imin=1). Use this value if you want see on the graph all the relation d=fn(Pmax,W) with the noise contamination region.
  - If Pmin = -1 - The default value. It determines the automatic approximate determination of Pmin, separately for every W-iteration. Pmin is chosen to be the smallest Pmax for which the estimated d lies in the range 66-150% of the preliminary/roughly estimated d.

Look for the variable PminCut=1.5 in the HEADER of the function to modify this range.

E.g. PminCut=1.2 denotes the range dinit/1.2  < dinit < dinit* 1.2


(9) dcWmethod : This parameter determines the method of the estimation dw for the current W-iteration from the relation d=fn(Pmax).
     The question arises, which fragment of this relation should be the base for the estimation of dw?

  It is observed that the relation d=fn(Pmax) possesses lacunarities in the plateau region. Thus, in order to increase the precision, a number of points from this relation should be taken to estimate dw as a mean over these points. The parameter dcWmethod determines the the choice of the range of indices  Imaxdown  - Imaxup  which are used to estimate this mean.
 

First:  Imaxopt is estimated based on dinit, e and m.   Iminopt is estimated based on Pmin.

choose:
1 - range is maximally broad: Imaxdown = Iminopt + 200,      Imaxup = Icut,
2 - (default) range is broad:   Imaxdown = 0.05*Imaxopt ,     Imaxup = 1.5*Imaxopt
3 - range is medium broad:    Imaxdown = 0.12*Imaxopt ,     Imaxup = 1.2*Imaxopt
4- range is narrow:                  Imaxdown = 0.3*Imaxopt ,       Imaxup =    1 *Imaxopt
5- use 1 point only:    Imax = Imaxopt

(additional small modifications may be added to these rules for small Imaxopt if necessary)
 

(10) fitting_method: This parameter determines the method of polynomial fitting applied to the relation d=fn(W).
- if 0:   looks for optimal fitting over different polynomials and self-defined functions in the function 'optimal_fitting.m`.

Here, you can define other functions than polynomials to find optimal fitting. The procedure works on symbolic expressions and solving differential equations, thus be careful with defining variables to be optimized (especially, do not use the variables in the denominanitve - symbolic solver fails in this case).
- if [4,5,6] (or higher): define the degrees of the polynomials to be fitted.
- if [4]: there is only 1 possible minimum in the 3-degree 1st derivative. You get unambiguous result: no plateau  or 1 plateau and dopt estimated.
- if [5] or [6]: there are 0, 1 or 2 possible minima in the 4th- and 5th-degree 1st derivatives.
- if [7] or [8]: there are 0, 1, 2 or 3 possible minima in the 6th- and 7th-degree 1st derivatives.
- if [4,5] [4 6] [4 5 9] [4 6 8 10]: you can use every combination of the polynomial degrees to be used. The graphs and results will be printed for every polynomial used. Increase NW when using higher degree polynomials to increase fitting precision.
DEFAULT is [4 6 8].
 

(10) figures :
0 - turns off plotting figures
1 - figures are plotted once after the last W-iteration.
2 - (default) turns on plotting intermediate figures after every W-iteration.
 

 

(12) signal_number : important only for figure names to to create unique figure names when the function is repeated in the loop. This number is added to figure names to distinguish figure names while calling to function in the loop for subsequent signals. Other case - figure are over-saved for subsequent calls of the function.

 

(13) fighandleset: vector with fighandles to use while printing figures.

when ommitted or 0 or -1 or NaN - new figures are created for every function loop. However, when using signal_number  - they are saved with unique names.

(1:3) handles for 3 main figures printed and refreshed after every W-iteration

(4)  handle for figure showing the difference between d for original and shuffled signals , only if the number of surrogates defined in parameter(2): quick precise_surrog  is higher than 0

(5-10)  - dedicated for future use

(11-..) handles for figures with polynomial fittig , odd number - without e-correction, even number - with e-correction

(11-12) first defined in (10)fitting_method polynomial degree  or   for the optimal_fitting procedure fitting_method =0

(13-14) second defined in (10)fitting_method polynomial degree

(15-16) third defined in (10)fitting_method polynomial degree

...

 

(14)  -EachToEach (new) - defines the algorithm for the choice of pairs of points in m-dimensional space to calculate distances...

- if 1 - the approach `Each-To-Each` from the A matrix is used to calculate distances. However, the algorithm stops after reaching Da distances shorter than dmax. In order to avoid biases, the order of basal points for comparing with other ones are chosen randomly.

Based on the former W-iteration, the automatically estimated parameter Jreduction lowers the preliminary estimated J if A matrix is expected to be too small to get sufficient number of distances Da being shorter than dmax.

(Lower J  ->  larger A matrix  ->  more distances to calculate using EachToEach approach)

- if 0 - (default) repeating the  Random Joining Procedure is used, as described in [1,2,3]. This approach may lead to repeating the pairs in subsequent RJP repetitions. However, if the number of RJP repetitions becomes too large, the algorithm jumps automatically to EachToEach approach in next W-iterations. If EachToEach becomes insuficient - J is lowered to get bigger A matrix with m-dimensional points.

 

 

 

OUTPUT PARAMETERS:

(new): Due to the repeatition of the algorithm for subsequent surrogates, the necessity occured to store the results for all surrogate repetitions. Thus, the index s is added in the output paramters.

s = 1                       - the reuslts for original signal,  

s = 2 .. NSurr+1   -  the results for  1..NSurr surrogates,   NSurr - the number of surrogates defined in quick_precise_surrog variable.

 

 


I. dopt {i,j,s}(k) : final estimators of dimensional complexity
i=1 - results without e-correction, i=2 - results with e-correction
j - the number of the fitted polynomial, index into ` fitting` vector
k - the number of plateaus for the given polynomial fitting. In case of higher degree polynomials, the number of the plateaus may be higher than 1.
II. Wopt {i,j,s}(k) : W's corresponding to dopt.   i, j, k as above.
III. dcW(1, : , s) - vector with dw's estimated for subsequent W's without e-correction.
      dcW(2, : , s) - with e-correction.
IV. dcM - matrix (sizePmaxV, sizeWV, NSurr+1) with d estimated without e-correction for the individual pairs Pmax, W.

Used for printing d=fn(Pmax,W) relation.
V. dcM_c - the same but with e-correction
VI. WV - vector with W's for which dw were estimated to build the relation dw = fn(W)
VII. PmaxV - defined in the HEADER of the function as: PmaxV_default = 10(8:0:01:0:01). Necessary for printing graphs.
VIII. other - structure keeping many other intermediate parameters. For details - see file dhds.m, -  the end of the file.

 

Exemplary use:

Analyzed signal - is the sum of 3 Lorenz signals possessing similar time scales.


 

The short relation d=fn(Pmax,W) after 40 W-iterations: (Figure 2 in the alorithm)


 

The exemplary relation d=fn(Pmax) for the last W-iteration  (Figure 1 in the algorithm)

The upper relation (with e-correction) reflects the lower one (without e-correction) being increased by about 10% due to the used deflection error e=10% corresponding to such very small Pmax's.

 


 

Exemplary 6-degree polynomial fitting applied to dw = fn(W) relation  after finishing the calculations for the original signal

 


 

Comparison with the surrogates:

The difference between <dsurr> and dorig  . The increase in dsurr after shuffling the phases in fft (surrogate signals) points to the chaotic structure which is visible especially in the range of W = 400-800 px.


 

 

If you would like to be informed about the new events on this web site, please write the email to the author :  kmichalak@amu.edu.pl