DV 34 gets down to the practical issues of collecting data and using trade simulation software in Part 3 of his series on testing trading strategies

 Back testing – Part 3 The Testing Process 

In this article we look at the actual testing (or data collection) process. As an example, I will use my own recent back test results from a simple strategy on 12 years of data.

B)                                                          THE TESTING PROCESS 

i)                    Define a Clear Rule-Based Strategy to Test (Mechanical or Discretionary)  

In this example the strategy being tested is a simple discretionary swing trading (days to weeks) harmonic pattern system, using only candlesticks, Fibonacci retracement/ expansion clusters and one exponential moving average (EMA), which was used very rarely.

The entries and exits varied slightly due to price action clues, usually in an attempt to reduce risk or take profits before key support/ resistance levels, but were generally quite close to the PRZ (potential reversal zone).

The rules were very simple:

a)        Test only one timeframe at a time

b)        Test at least 30x trades (minimum!) more = better

c)        Identify pattern

d)       1x entry, 1x stop loss – 2x targets (scale out/ 50% each)

e)        Min. overall R:R per trade of 1.3:1 or better (ideally much larger, i.e. >3:1)

f)         No trailing stop loss until 1st target is hit, then trail stop to just above breakeven and behind lower highs/ higher lows on final position till stopped out or T2 hit

g)        There were only 3x possible outcomes for trades: (no partial loss or breakeven trades)

          i.     A Loss (-1R), unless the stop was gapped over, which is uncommon in FX

          ii.    Small Profit (0 to ~1.5R): T1 hit, T2 stopped out slightly above entry

          iii.   Large Profit (1.5R or better): T2 hit, or trailed stop hit on the 2nd position
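
Rules (d), (e) and (g) can be sketched in a few lines of Python. This is only an illustration, not part of the original test: the function names are hypothetical, and I am assuming that "overall R:R" means the reward of the two targets blended 50/50 and divided by the initial risk.

```python
def overall_reward_to_risk(entry, stop, target1, target2, scale=0.5):
    """Blended reward:risk for one entry, one stop and two targets.

    scale is the fraction of the position closed at target 1
    (50% in the rules above). Works for longs and shorts.
    """
    risk = abs(entry - stop)
    if risk == 0:
        raise ValueError("entry and stop must differ")
    reward1 = abs(target1 - entry)
    reward2 = abs(target2 - entry)
    return (scale * reward1 + (1 - scale) * reward2) / risk


def passes_min_rr(entry, stop, target1, target2, min_rr=1.3):
    """Rule (e): only take setups with a blended R:R of min_rr or better."""
    return overall_reward_to_risk(entry, stop, target1, target2) >= min_rr


def classify_outcome(result_r):
    """Rule (g): bucket a finished trade by its result in R multiples."""
    if result_r < 0:
        return "loss"
    if result_r < 1.5:
        return "small profit"
    return "large profit"
```

For a hypothetical long at 1.3000 with a stop at 1.2950 and targets at 1.3050 and 1.3150, the blended reward is 100 pips against 50 pips of risk, so the setup comes out at roughly 2:1 and passes rule (e).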

ii)                  Decide on a Testing Process

(i.e. back testing (proof of concept), forward/ walk-forward testing (verification)) 

In this case I will be back testing for proof of concept, with live forward demo testing to follow if the strategy shows a big enough edge.

iii)                The Types Of Testing

  • Back testing (Proof of concept)

  1. Scrolling back over charts
  2. Trading Simulator with unseen data
  3. Automated Strategy Development (outside of the scope of this article)

  • Forward Testing (The acid test... verification on unseen data)

  1. Walk Forward Analysis - Slicing off a part of unseen data to verify the initial test
  2. Real Time Demo Testing
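
The idea of slicing off a part of unseen data can be sketched as a simple chronological split. This is a minimal illustration only (a single holdout slice, not a full rolling walk-forward analysis); the function name and the 25% fraction are my own assumptions.

```python
def holdout_split(bars, holdout_fraction=0.25):
    """Split chronologically ordered data into a test part and an unseen part.

    The most recent holdout_fraction of the data is kept back and
    only looked at once the initial back test is finished.
    """
    if not 0 < holdout_fraction < 1:
        raise ValueError("holdout_fraction must be between 0 and 1")
    holdout_size = round(len(bars) * holdout_fraction)
    cut = len(bars) - holdout_size
    return bars[:cut], bars[cut:]


# Hypothetical example: 12 yearly chunks of data
years = list(range(2001, 2013))
test_data, unseen_data = holdout_split(years)
```

Here the first nine years would be used for the initial test and the last three kept back for verification.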

For this article we are doing very simple MANUAL back testing only. There are two primary ways of doing this: using a trading simulator or scrolling back over charts.

Trading Simulator

For my back tests I use a trading simulator. The main reasons for this are:

a)      You can upload your own data to test

b)      Allows you to test the data in real time (even using multiple timeframes) and make trading decisions at the hard right edge of the chart on whatever timeframe you choose.

c)      You can add popular built in indicators or get some indicators programmed if the platform does not have them already.

d)     Can scroll at variable speed (bar by bar or at an increased speed of your choice)

e)      You can place orders on the chart, and include a fixed or variable spread (depending on data)

f)       It records all of your trade entries and exits for you; the downside is that it does not export comments or show commission/ trading costs

g)      Can export to a spreadsheet package such as Microsoft Excel, OpenOffice.org, Apple Numbers or NeoOffice for further analysis later, which speeds up the testing process.
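
Point (e), including a spread on your orders, is easy to get wrong if you record trades by hand. Below is a rough sketch of what a simulator does for you. It assumes the chart shows bid prices and that the pair is quoted to 4 decimal places; the function name and the 3 pip default are for illustration only.

```python
PIP = 0.0001  # assumption: pip size for a 4-decimal FX pair


def fill_price(bid_price, side, spread_pips=3.0, pip=PIP):
    """Price a market order is filled at when the chart shows bid prices.

    Buys are filled at the ask (bid + spread), sells at the bid.
    """
    if side == "buy":
        return bid_price + spread_pips * pip
    if side == "sell":
        return bid_price
    raise ValueError("side must be 'buy' or 'sell'")


# Hypothetical example: buying and immediately selling at the same chart price
cost_in_pips = (fill_price(1.3000, "buy") - fill_price(1.3000, "sell")) / PIP
```

Every round trip costs the spread, which adds up quickly on a strategy with small targets.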

Two examples of trading simulators I am aware of include:

i)     Forex Tester 2: this costs around $150-$200 as a one-off purchase and has optional data subscriptions plus free delayed monthly 1min data updates

ii)   Trade Interceptor: cost = free (apparently). Another trader I respect mentioned this program, although I personally have no experience with it

Scrolling back over charts:

The problem I personally found with simply scrolling back over charts is that you can easily see what happened next, which creates a very clear hindsight bias.

The other issue is that if your strategy relies on multiple timeframes, you will have difficulty lining the timeframes up so that every decision is made using only the data available at that decision point.
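
To illustrate the multiple timeframe problem: at any decision point on the lower timeframe, you should only use higher timeframe bars that have already closed. A minimal sketch follows, assuming each bar is stored as a hypothetical (open_time, close_price) pair sorted by time.

```python
from datetime import datetime, timedelta


def last_closed_bar(bars, bar_length, decision_time):
    """Return the most recent bar that has fully closed by decision_time.

    bars: list of (open_time, close_price) tuples sorted by open_time.
    Returns None if no bar has closed yet.
    """
    closed = [bar for bar in bars if bar[0] + bar_length <= decision_time]
    return closed[-1] if closed else None


# Hypothetical daily bars; the decision is made on the 1hr chart at 10:00 on 4 Jan
daily_bars = [
    (datetime(2012, 1, 2), 1.2935),
    (datetime(2012, 1, 3), 1.3050),
    (datetime(2012, 1, 4), 1.2940),  # still forming at the decision point
]
usable = last_closed_bar(daily_bars, timedelta(days=1), datetime(2012, 1, 4, 10))
```

The bar for 4 January is ignored because its close was not yet known at 10:00, so the decision is based on the 3 January bar.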

iv)                Collate and set up data for testing

  • Check data quality, Absolutely Vital...!

  1. Are there any anomalies (large spikes, large gaps, missing data etc.)?
  2. Check the server time of the data: when do candles "close"? This can change the appearance of the candles/ bars dramatically
  3. Include a realistic spread in your tests
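
The first check, looking for gaps and spikes, can be roughly automated before you start testing. This is a sketch under my own assumptions: the `Bar` layout, the function names and the thresholds are hypothetical and would need adjusting to your data.

```python
from collections import namedtuple
from datetime import datetime, timedelta

Bar = namedtuple("Bar", "time open high low close")


def find_time_gaps(bars, expected_step):
    """Return (previous_time, next_time) pairs where bars are further apart than expected."""
    gaps = []
    for prev, cur in zip(bars, bars[1:]):
        if cur.time - prev.time > expected_step:
            gaps.append((prev.time, cur.time))
    return gaps


def find_spikes(bars, max_range):
    """Return bars whose high-low range exceeds max_range (in price units)."""
    return [bar for bar in bars if bar.high - bar.low > max_range]


# Hypothetical 1hr bars with a 3 hour hole and one suspicious 245 pip bar
bars = [
    Bar(datetime(2012, 1, 2, 0), 1.2950, 1.2960, 1.2940, 1.2955),
    Bar(datetime(2012, 1, 2, 1), 1.2955, 1.2965, 1.2950, 1.2960),
    Bar(datetime(2012, 1, 2, 4), 1.2960, 1.3200, 1.2955, 1.2965),
]
gaps = find_time_gaps(bars, timedelta(hours=1))
spikes = find_spikes(bars, 0.0100)
```

Weekend closes will also show up as gaps, so the flagged bars still need checking by eye against a chart from another data source.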

v)                  Scroll through and execute orders on historical data (preferably unseen) using your rules/ strategy and record raw results

Try to trade the "hard right edge" during testing as much as possible 

Below is a weekly chart showing all of the trades taken on the 1hr chart. Zooming out, you can easily see which market conditions suit the strategy and where it will be less effective.

This strategy is a correction trading strategy - by using Fibonacci ratios in trading you are effectively looking for possible turning points.

Therefore, looking at the results above, very strong trends tend to render the system ineffective on that particular timeframe.

It is worth remembering that there are always other instruments and timeframes that offer trading opportunities during these periods. 

This strategy does trade relatively infrequently (~18 trades/ year) although you have to remember that this is only one pair/ one timeframe… and there are plenty of others out there! 

vi)        Develop a record keeping method  

In this case I am using Forex Tester 2 to record my trades although you could just as easily manually enter the details into a spreadsheet yourself.

To analyze the raw data I am using Microsoft Excel to work out the key statistics, which we will cover in the next article.

A screenshot of the raw results exported from Forex Tester 2 is shown below. Although there are actually 445 trades shown on the test screen above, this reduces to 215 trades once the scaled positions are counted as one trade per setup (2x orders/ setup) and trades with less than 10 pips risk, which was my minimum, are eliminated.

The risk on these trades varied from 10 pips to 123 pips per trade, the latter during very volatile periods (namely the GFC, 2008-2009), so ignore the profit column: the position sizing is not correct and is not required to determine expectancy.
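
As a rough sketch of that reduction, the two orders of each setup can be merged into one trade and the low risk setups dropped. The field names below (`setup_id`, `risk_pips`, `result_pips`) are hypothetical and are not the column names of the simulator's export; I am also assuming both orders are the same size and share the same stop.

```python
from collections import defaultdict


def collate_setups(orders, min_risk_pips=10):
    """Merge the orders of each setup into one trade and drop low-risk setups.

    orders: list of dicts with 'setup_id', 'risk_pips' and 'result_pips'.
    The result of a setup is the average of its equally sized orders.
    """
    by_setup = defaultdict(list)
    for order in orders:
        by_setup[order["setup_id"]].append(order)

    trades = []
    for setup_id, legs in by_setup.items():
        risk = legs[0]["risk_pips"]
        if risk < min_risk_pips:
            continue  # below the minimum risk, so excluded from the test
        avg_result = sum(leg["result_pips"] for leg in legs) / len(legs)
        trades.append({
            "setup_id": setup_id,
            "risk_pips": risk,
            "result_pips": avg_result,
            "result_r": avg_result / risk,
        })
    return trades


# Hypothetical export: three setups, two orders each
orders = [
    {"setup_id": 1, "risk_pips": 20, "result_pips": 20},
    {"setup_id": 1, "risk_pips": 20, "result_pips": 60},
    {"setup_id": 2, "risk_pips": 8, "result_pips": 8},
    {"setup_id": 2, "risk_pips": 8, "result_pips": 30},
    {"setup_id": 3, "risk_pips": 50, "result_pips": -50},
    {"setup_id": 3, "risk_pips": 50, "result_pips": -50},
]
trades = collate_setups(orders)
```

In this made-up example six orders collapse to two trades: setup 1 at +2R and setup 3 at -1R, with setup 2 dropped for having only 8 pips of risk.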

There is some cleanup work to do to this raw data but we will cover that in the next article 

The purpose of collating key trade information/ record keeping is to allow you to analyze the results and pull out key statistics that will tell you if you have a possible mathematical edge 

There are a couple of notes on the data above, namely:

i)          A fixed spread of 3 pips is already included in the simulator's order entry system. This is reasonable for this pair; the spread needs to be in line with your broker's average spread, or even slightly conservative (wider)

ii)        Each setup has two orders entered as I am scaling out at 2x targets. I will collate these together in the next article into a single trade (as it is effectively a single trade setup).

iii)      Also note that the position size is the same for all trades at 0.1 lots. This is deliberate: I want to find the reward to risk ratios, expectancy and other statistics first, which are independent of how much you risk per trade.

iv)      As a caveat, position sizing IS crucial for preventing large drawdowns, overleveraging your account and large losses, but it is not so important when evaluating expectancy and statistics. We will cover this later in the series using the results of this test as an example.
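
To see why the fixed 0.1 lot size does not matter at this stage, here is a small sketch showing that a result expressed in R (profit divided by the amount risked) is the same at any position size. The pip value of 10 per standard lot and the sample results are assumptions for illustration, not figures from my test.

```python
def r_multiple(result_pips, risk_pips, lots=0.1, pip_value_per_lot=10.0):
    """Result of a trade in R: money made divided by money risked."""
    profit = result_pips * pip_value_per_lot * lots
    risked = risk_pips * pip_value_per_lot * lots
    return profit / risked  # the lot size cancels out


def expectancy_r(r_multiples):
    """Average result per trade, in R."""
    return sum(r_multiples) / len(r_multiples)


# Same hypothetical trade at two position sizes
small = r_multiple(40, 20, lots=0.1)
large = r_multiple(40, 20, lots=1.0)

# Hypothetical results of four trades, in R
average_r = expectancy_r([2.0, -1.0, -1.0, 0.5])
```

Both position sizes give the same R multiple, which is why the profit column can be ignored for now and position sizing left for later in the series.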

In the next article we will look at cleaning up the raw data, summarising the results into key statistics and working out which of them are useful.

Hope this helps,