Forecast Production with Variable Shifts, Seasonal Cycles and Growth
Sometimes
forecasting production seems like it is just a step ahead of
using crystal balls, Ouija boards and reading goat entrails.
Complicate the problem with variable shift schedules, seasonal
fluctuations and underlying growth, and it seems almost insolvable.
But recognizing these facts and compensating for them make it
puts the problem back into the realm of solvability.
The
accompanying spreadsheet shows an analysis of such a situation.
The raw data consists of dates, and the number of units produced
on the dates. The shift structure is two 10-hour shifts (20
hours) on Monday, Tuesday, Wednesday and Thursday, and a 12-hour
shift on Friday, Saturday and Sunday.
It’s
About Time
Forecasting
is about looking forward in time and making a projection based
on looking back in time. Time can be expressed as Excel expresses
it namely, the number of days since January 1, 1900. Using this
scale would produce extreme coefficients for the function that
describes the events. So a scale starting with August 15, 2007
(the first date for which data is available) is used. This value
is referred to as the Elapsed or E-Date.
The
formula in cell C2 duplicated to the end of data is:
=A2-$A$2+1
Converting
Apples to Oranges
Production
output varies greatly from day to day primarily because of the
difference in the number of hours worked in each day (20 or
12).
The
first adjustment to be made is to convert the daily production
values into a common scale. The logical thing to do is to convert
the number of units produced per shift to units per hour.
A
couple of intermediate steps have to happen first:
·
Determine the day of the week
·
Determine the shift
·
Determine the number of hours on the shift.
To
determine the day of the week, use the weekday formula in cell
D2:
=WEEKDAY(A2,2)
The
second argument, 2, means that Monday is the first day of the
week (1) and Sunday is the seventh day of the week (7). Using
this option makes the following if statement that determines
the number of hours worked during the day easier to formulate.
The
number of hours that the shifts on duty work during a day depends
on the day of the week. This information is displayed in cell
E2
=IF(D2<5,20,12)
Finally,
the number of units per hour can be calculated in F2
=B2/E2
The
Charts
The
function is in the eye of the beholder. Visual inspection will
reveal what components are in it.
First
look is at the raw data.

Looking
at this data, it appears that the data shifts up as time goes
on. There is a growth trend. To see it better, add a trend line
to the graph to see if the data “wraps around” it.

This
exercise confirms that part of the formula is mx + b. Where
m is the growth rate and b is a steady-state baseline. Sometimes
growth is not linear. It may be exponential or part of a larger
cycle. Both of these can be approximated by a second degree
or higher order equation. In this particular case, the linear
fit looks good.
It
is also apparent that there is a cyclical component to the function.
This can be better seen if the trend line (growth + baseline)
is subtracted from the raw data.

One
of the obvious things about this chart is that the signal has
a cyclical component. It is interesting to note, that although
lines usually obscure patterns in scattered data, in this case
keeping them in makes seeing the cyclical component easier.
This
chart reveals another factor, namely that the amplitude of the
cycle is also increasing with respect to time. So in addition
to the growth and steady-state component, the function can be
described with: A * sin(Bx +C).
So
the entire function looks like mx + b + A * Sin(Bx +C)
Solver
Once
the model for the function exists, the parameters can be computed
using solver. The solver parameters are set up in cells N1:S4

The
initial parameter settings for the Solver solution are set up
in cells O2:S2 and in cell O4.
These
Parameters (A, B, C, m and b) need to be applied to E-Date:
Cell G2 contains =$R$2*C2+$S$2= mx + b
Cell H2 contains =$O$2 *SIN(2*PI()*C2/$P$2+$Q$2) = A * Sin (Bx
+ C)
Cell I2 contains =G2+H2 = mx + b + A * Sin(Bx +C)
Column
I contains the computed values according to the formula.
Column
J contains the formula =(F2-I2)^2 which is the square of the
difference between the computed value and the actual value.
Which
brings us back to cell N4. Since the objective is to minimize
the sum of the squares of the difference to get the least squares
fit, cell O4 contains the formula =SUM(J2:J577)
It
is now time to engage solver. Note that Solver may have to be
added using Tools ®
Addins.
Select
Tools -> Solver

In
the Set Target Cell Block, select (or type in) $O$4 –
this is the cell to minimize. Assure that the Min radio button
is selected.
With
the cursor in the By Changing Cells block, select cells
O2:S2 – these are the parameters to vary to fit the function
to the data.
Click
on solve.

Keep
the solver solution and click OK.
Re-engage
Solver several times until the parameters are no longer changing.
Usually this takes only two or three iterations.
Solver
should present a display like the following:

Which
means the function is 2.03 x + 1950 -51.87 * Sin (2* PI() *
x / -19.26 + 16.39) give or take a couple of decimal points.
Where x is E-Date.
A
plot of raw data, trend line and seasonality all on one graph
looks like:

Note
that the sine wave does a pretty good job cutting through the
center of mass for the raw data.
Back
Testing
One
way to test the accuracy of a model is to see how well it predicts
“history.” The Back Test tab of the spreadsheet does exactly
this. To get the data needed, the units per hour is re-converted
into production units per day. In other words, the model (which
shows data in units per hour) is multiplied by 12 or 20 depending
on the shifts on duty for the day.
It
is obvious from looking at the graph that there is a significant
amount of noise in this signal so that the ability to predict
any specific day’s production volume is an exercise in frustration.
However, according to the least squares fit methodology, the
combined magnitude of the noise on either side of the function
is the same: the area under both curves should be equal. In
other words, the longer the period of measurement, the more
likely it is that the model will fit reality – the noise will
cancel itself out.
A look at
the actual and predicted values for monthly periods shows that
the model is fairly accurate.

There
are some anomalous peaks in the variance around year end 2007
and year end 2008. Beyond the plain numbers, there is a knowledge
that there is a drop in production around the holiday season.
An adjustment to the model could be made to incorporate this
knowledge, or in a more practical sense, a manual adjustment
can be made.
Download
in PDF Format