Life is about optimizing paths (one's own and possibly others') in an environment shaped by tradeoffs (if `homo oeconomicus' comes to mind, you may stop reading here as well). I here review the well-known Bias-Variance Dilemma as well as the less well-known Forecast AT-Dilemma and Signal-Extraction ATS-Trilemma. Whereas the first is a generic (unspecialized) statistical tradeoff, the latter two specifically address forecasting and real-time signal extraction (though their essence is generic, too). A strongly counter-intuitive example is provided in order to shed some light into the darkness of the ATS-Trilemma.
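The Bias-Variance Dilemma can be made concrete with a minimal generic R sketch of my own (purely illustrative, not taken from the book): fit polynomials of increasing degree to noisy data. The in-sample error falls monotonically with the degree (shrinking bias), while the out-of-sample error typically turns upwards again beyond some point (exploding variance).

```r
# Minimal bias-variance sketch (illustrative only): polynomial regressions
# of increasing degree fitted to noisy observations of a smooth function.
set.seed(7)
n <- 50
x <- seq(-1, 1, length.out = n)
f <- sin(3 * x)                          # true signal
y_train <- f + rnorm(n, sd = 0.3)        # training sample
y_test  <- f + rnorm(n, sd = 0.3)        # fresh noise, same design points
mse <- sapply(1:12, function(d) {
  fit <- lm(y_train ~ poly(x, d))
  c(insample  = mean((y_train - fitted(fit))^2),  # falls monotonically (bias)
    outsample = mean((y_test  - fitted(fit))^2))  # typically U-shaped (variance)
})
round(mse, 3)
```

The in-sample column is guaranteed to decrease (nested least-squares fits); the out-of-sample column is where the dilemma bites.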
May 2 2013: The R-code is available (see below) and I added hyperlinks (in the pdf): a very convenient navigation facility, indeed. I also added further references to recent applications, research reports and papers on DFA as well as on MDFA.
May 3: I added colors to figs 67-68 (beautiful) and I propose a new figure 69 which supports my claims better (conceptually). I also added a bad `smart' idea at the end of the book (an answer has been provided in 1).
May 4: I added references to the `bad smart idea' at the end of the book. Also, some references and links were added in section 8 (replication and customization of the model-based approach). I also added some comments to the R-code.
May 6: after user feedback (thanks, David) I corrected the code in exercise 4, p.11 and exercise 4, p.36. The output in fig.4 is now smoother and the periodogram in fig.15 is `as it should be': both periodograms (blue and red) are virtually indistinguishable in the passband but the red spectrum nearly vanishes in the stopband (as it should be, once more). My R-code (posted below) is corrected too.
May 9: A discussion of the trilemma and the dilemmata is proposed in 1.
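Regarding the May 6 fix: the qualitative passband/stopband behaviour described there can be reproduced with a few lines of generic R (my own sketch, not the corrected exercise code). Filter a series with a crude low-pass and compare the periodograms of input and output: the red spectrum should track the blue one in the passband and nearly vanish in the stopband.

```r
# Illustrative sketch (not the book's exercise code): periodograms of a
# series before (blue) and after (red) a simple low-pass filter.
set.seed(42)
x <- rnorm(1000)                                 # white-noise input
y <- stats::filter(x, rep(1/11, 11), sides = 2)  # 11-term moving average (low-pass)
y <- y[!is.na(y)]                                # drop boundary NAs
per_x <- spec.pgram(x, plot = FALSE, taper = 0)
per_y <- spec.pgram(y, plot = FALSE, taper = 0)
plot(per_x$freq, per_x$spec, type = "l", col = "blue", log = "y",
     xlab = "frequency (cycles/sample)", ylab = "periodogram")
lines(per_y$freq, per_y$spec, col = "red")       # strongly damped at high frequencies
```

The moving average is of course a very blunt low-pass; it merely illustrates the spectral picture, not the designs discussed in the book.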
I'm happy to announce the upcoming release of my new on-line DFA-Book (a manuscript for teaching and more). In summary, I have collected within 200 pages my latest work on DFA, previously spread in loose form over many tutorials, into a unifying manuscript. I also included a whole bunch of new results, including comprehensive finite-sample distributions of filter coefficients, of amplitude functions and of time-shift functions (I never saw the latter two distributions in any published work), as well as in- and out-of-sample finite-sample distributions of Curvature and Peak-Correlation measures (am I crazy, seriously?). I included:
A comprehensive atheoretical introduction to the frequency-domain with many, many examples
Revisions, vintages, tentacle plot
Replication and customization of classic filters (model-based MSE)
Comprehensive in- and out-of-sample distributions (see above)
ATS-error components plus four new measures: Curvature and Peak-Correlation as well as Selectivity and Mean-Shift. I show that suitably customized filters outperform the best theoretical MSE-designs (assuming knowledge of the DGP) out-of-sample in all dimensions simultaneously (at the cost of Accuracy, of course...).
A lot more (`older' but invariably up-to-date material)...
All formulas, results, plots and tables are obtained by the R-code as published in the book. No series to download: everything is based on R-data. I use the Sweave environment to generate the LaTeX file, so my manuscript is perfectly reproducible. This is a long, unifying, comprehensive, detailed and well-documented tutorial for grasping the mysteries behind DFA. You may ask: what about MDFA? Absolutely nothing! This is DFA, only DFA, but everything on DFA: nothing more could ever be added on the topic.
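For readers unfamiliar with Sweave, a minimal (purely hypothetical) .Rnw fragment shows the mechanism: R chunks and \Sexpr{} expressions live inside the LaTeX source and are (re-)evaluated at build time, so no number or figure in the document can drift out of sync with the code.

```latex
% Minimal illustrative Sweave file (hypothetical, not the book's source).
\documentclass{article}
\begin{document}
The sample mean is \Sexpr{round(mean(rnorm(100)), 3)}.
<<periodogram, fig=TRUE>>=
spec.pgram(rnorm(200), taper = 0)   # figure regenerated at each build
@
\end{document}
```

Building with `R CMD Sweave file.Rnw` followed by `pdflatex file.tex` regenerates every result.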
It is completed!
Release-time of version 0.0.0: tomorrow, 12-noon, Swiss-timing. First here, first served.
I'm working on a series of three scripts intended for my lectures: #1 SARIMA-models, #2 MDFA/GARCH and #3 state-space models (the SARIMA and state-space scripts can be found in the category `teaching', see 0 and 00). My approach is heavily empirical and I provide comprehensive R-code to illustrate the relevant topics (practically relevant from my point of view). I'm making steady progress on the `outstanding' (in the sense of: not yet available...) MDFA script and I feel that it could serve as a good introduction/tutorial to the more sophisticated forecasting problems handled by MDFA (customized real-time signal extraction). In fact, I have long realized that my tutorials (category tutorials) might be `too specialized' and insufficiently `fleshed out' for a broader audience. In this perspective, the new script will provide a kind of unifying, self-contained and comprehensive empirical support for those users who are not primarily, or not exclusively, interested in theory as summarized in my elements vade-mecum 1. So I really see added value beyond a pure teaching facility.
PS: the 100-page mark was hit today; another ~100 to come. To be seen here soon!
In 1 and 2 I released my latest teaching material on SARIMA and state-space models. Yet another document/hand-out/script/book on time series. Why?
I reject any deterministic approach (linear or polynomial fits to the data): this is nonsense!
I'm not interested in nice-looking smoothers: real-time performance is all we need (i.e. I emphasize filters).
My experience with multivariate approaches (VAR(MA), state-space, DFM, ...) is: good in-sample fit, poor/disappointing out-of-sample performance!
This is mainly due to overfitting noise through the classical short-term one-step-ahead estimation paradigm.
Therefore I use MDFA, which emphasizes mid-/long-term dynamics and allows for `tailored' regularization and customization (material for the middle script, in preparation).
I want `effective' techniques which perform well in the context of economic data: SARIMA-family, GARCH-family, Filters (DFA/MDFA), adaptive models (state space).
But that's quite a lot!
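The real-time point above (filters, not smoothers) is easily made concrete in base R, in a generic sketch of my own (nothing to do with MDFA itself): a symmetric moving average is undefined at the current boundary of the sample, precisely where a decision has to be made, whereas a causal filter delivers an estimate there, at the price of a time-shift.

```r
# Generic sketch (base R, not MDFA): symmetric smoother vs. causal
# (real-time) filter at the current boundary of the sample.
set.seed(1)
x <- cumsum(rnorm(200))                                 # random-walk 'trend'
smoother <- stats::filter(x, rep(1/13, 13), sides = 2)  # two-sided: needs future data
realtime <- stats::filter(x, rep(1/13, 13), sides = 1)  # one-sided: past data only
is.na(smoother[200])   # TRUE: no smoothed value where it matters most
is.na(realtime[200])   # FALSE: the causal filter delivers, but delayed
```

The smoother's NAs at the sample end are exactly the historian's comfort zone; the practitioner lives at observation 200.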
I want to emphasize a practical perspective: when working on SARIMA-models, students must be able to identify processes and produce forecasts after three weeks (12 hours). After 8-10 weeks (32-40 hours of course) they must be able to work on real-world data.
I typically rely on data from forecasting competitions where students can benchmark their performance against established competitors.
After 8-10 weeks (32-40 hours of course) performance is pretty good: on average, typical classes invariably rank in the top ten.
I don't want to depend on the arbitrariness of extraneous publishers' policies (discontinued publication, pricing, ...).
I want to be able to update documents at any time, if necessary:
Emphasizing new aspects (depending on input from ongoing research projects), or
removing outdated/obsolete material.
I release and discuss some `secret sauce' which is not to be found in other books/papers/scripts/hand-outs.
The above requirements demand a large amount of flexibility which would conflict with rigid/uncontrollable publication strategies: there are interesting books available but I feel much better with my own stuff!
Below you'll find the script on state-space models. As for the SARIMA-script I emphasize a practical perspective:
I don't derive the Kalman filter (hints are given and the tougher parts are treated in the appendix, but this is not to be considered `state-of-the-art').
I'm not interested in the Kalman-smoother because users/practitioners emphasize a real-time perspective: smoothers are good for historians.
I provide an extensively worked-out real-world example in section 9, where I propose modifications to the original filter equations which are likely to improve performance in our typically non-Gaussian, non-iid world (at least they did for me in a whole bunch of applications, including competitions).
I look at existing R-packages in section 10 and compare their results with the outcome of my `tweaked' code (section 9).
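For orientation only: the textbook filter recursions for the simplest local-level model fit into a dozen lines of R. This is standard material (not the modified equations of section 9, and not an excerpt from the script's code).

```r
# Textbook Kalman filter for a local-level model (illustrative, standard
# material; NOT the tweaked equations of section 9).
# Model: y[t] = mu[t] + eps[t],  mu[t] = mu[t-1] + eta[t]
kalman_local_level <- function(y, sig2_eps, sig2_eta, a1 = 0, P1 = 1e7) {
  n <- length(y)
  a <- numeric(n); P <- numeric(n)          # filtered (real-time) mean/variance
  a_pred <- a1; P_pred <- P1                # (quasi-)diffuse initialization
  for (t in seq_len(n)) {
    F_t  <- P_pred + sig2_eps               # prediction-error variance
    K_t  <- P_pred / F_t                    # Kalman gain
    a[t] <- a_pred + K_t * (y[t] - a_pred)  # update: blend prediction and data
    P[t] <- P_pred * (1 - K_t)
    a_pred <- a[t]                          # one-step-ahead prediction
    P_pred <- P[t] + sig2_eta
  }
  list(filtered = a, variance = P)
}
set.seed(1)
mu <- cumsum(rnorm(300, sd = 0.5))          # unobserved level
y  <- mu + rnorm(300)                       # noisy observations
kf <- kalman_local_level(y, sig2_eps = 1, sig2_eta = 0.25)
```

Note that the filtered estimate at time t uses only observations up to t: it is a real-time estimate in the above sense, in contrast to the smoother.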
As for the time series script, I use the Sweave package and therefore every single number, table and figure in my script is reproducible. Instead of providing the Sweave file, I here post the pdf and the R-code, organized and numbered by section. You'll also need a single data file for the real-world example in section 9.
I'm testing this material for the first time this Spring and next Fall. If something doesn't work, seems suspect or is plainly wrong, please tell me: I'll revise and repost. The orthography is unchecked!
May 6: A SEFBlog reader, David Friskin, provided a better numerical solution to the general state-space model in section 9. I included his solution on p.64 with some comments. David used heavy numerical computations (simulated annealing and genetic algorithms) which I explicitly avoid in my courses. Be warned: state-space models require number-crunching; proponents of the approach do not emphasize this point sufficiently clearly. It is a practically relevant problem!