I recently posted a method for demand planning based on joint probability distribution functions derived from averages and variances. While robust for items with higher sales frequency, my experience is that the method isn't very stable for low-frequency items due to churn and susceptibility to extreme observations (not to mention that the normal distribution may not be appropriate). In this post I'll highlight a distribution-free approach based on Monte Carlo methods.
Monte Carlo methods are well documented elsewhere, so I won't go into much detail. Suffice it to say that by repeatedly sampling from the receipt and sales distributions one can estimate the underlying joint lead-time demand distribution. The algorithm is pretty simple:
1. For each item
2. Randomly pick a lead-time in days using the receipts for that item
3. For each day in the lead-time from step 2, pick a day from the sampling period and add the number of orders for that day (often zero) to a running total
4. For each order in the total from step 3, randomly pick an order from the order file and add its quantity to a running quantity total
5. Repeat steps 2-4 several hundred times, saving the quantity total for each iteration
6. Sort the iterations in ascending order and set the order point equal to the iteration that corresponds to the target service level
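The steps above can be sketched in a few lines of Python (the post's implementation is in T-SQL; the sample data here is hypothetical, standing in for the #receipts and #salesorders tables):

```python
import random

# Hypothetical inputs standing in for the #receipts and #salesorders tables
lead_times = [48, 19, 54]             # order-to-receipt gaps, in days
orders_per_day = [0] * 114 + [1] * 6  # daily order counts over the stock period
order_quantities = [1, 10, 35, 10, 5, 5]

def one_iteration():
    days = random.choice(lead_times)                                      # step 2
    n_orders = sum(random.choice(orders_per_day) for _ in range(days))    # step 3
    return sum(random.choice(order_quantities) for _ in range(n_orders))  # step 4

totals = sorted(one_iteration() for _ in range(1000))  # steps 5 and 6
order_point = totals[int(0.95 * len(totals)) - 1]      # 95% service level
```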
In terms of the distributions to pick from, there are a number of approaches. The first is to use the raw data by taking the top record from a randomly sorted table, a method that can become time consuming with lots of SKUs. A second approach is to summarize the raw transactions into a cumulative distribution table and add a cumulative percent of total column; the random value is chosen by selecting the record where the cumulative percent of total corresponds to a random number between 0 and 1. The third approach is to assume underlying distributions (i.e. orders per day is Poisson, order quantity is log-normal, etc.) and choose a value from that distribution using a random number. This post will use a combination of the first two methodologies.
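The second approach amounts to inverting a cumulative distribution with a random draw. A minimal Python sketch (the day counts here are hypothetical, mirroring the #calendar idea used later in the post):

```python
import bisect
import random

# Hypothetical summary: order count per day -> number of days observed
day_counts = {0: 114, 1: 5, 2: 1}
values = sorted(day_counts)
total = sum(day_counts.values())

# Cumulative percent of total for each order count
cdf = []
running = 0.0
for v in values:
    running += day_counts[v] / total
    cdf.append(running)
cdf[-1] = 1.0  # guard against floating-point round-off

def sample():
    # Pick the first value whose cumulative share covers the random draw
    return values[bisect.bisect_left(cdf, random.random())]
```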
The data is the same as in the prior post (no cost data will be used, but I've left it in for consistency).
create table #item (item varchar(3), daysinstock int, servicetarget float, stdcost float, carrycost float, ordercost float)
insert into #item select 'abc', 120, .95,1.5, .12, 15
create table #receipts (purchaseorder int, item varchar(3), orderdate int, receiptdate int)
insert into #receipts select 456,'abc', 20130109, 20130226
insert into #receipts select 789,'abc', 20130325, 20130413
insert into #receipts select 567,'abc', 20131007, 20131130
create table #salesorders (salesorder int, salesdate int, item varchar(3), quantity int)
insert into #salesorders select 123, 20130208, 'abc', 1
insert into #salesorders select 234, 20131014, 'abc', 10
insert into #salesorders select 345, 20130413, 'abc', 35
insert into #salesorders select 456, 20130409, 'abc', 10
insert into #salesorders select 323, 20131105, 'abc', 5
insert into #salesorders select 432, 20130515, 'abc', 5
Step 2 of the algorithm picks a day at random, which implies a calendar with lots of days that have no sales. Rather than build a calendar tailored to each item and select a day at random as in the first approach, construct a summary distribution table.
create table #calendar(item varchar(3), ordercount int, daycount int, distribution float)
The following inserts the number of days for each order count by item. The sub-query (sq) counts the number of orders by day which is then summarized as number of order days for each order count.
insert into #calendar (item, ordercount, daycount)
select item, ordercount, count(1) from
(select i.item, so.salesdate, count(1) ordercount from #salesorders so
join #item i on i.item = so.item group by i.item, so.salesdate) sq
group by ordercount, item
The days with no orders are calculated as the days in stock from the #item table less a sub-query totaling the number of days with orders from #calendar.
insert into #calendar (item, ordercount, daycount) select item, 0, daysinstock - (select sum(daycount) from #calendar c where c.item = i.item) from #item i
The last calendar step is to update the cumulative distribution, which uses two sub-queries. The first totals the day count by order count, which is then standardized by dividing by the total number of days from the second sub-query. To save time, the second sub-query could be replaced by incorporating the days in stock value into the #calendar table.
update c set distribution = (select cast(sum(daycount) as float) from #calendar c1 where c1.item = c.item and c1.ordercount <= c.ordercount) / (select cast(sum(daycount) as float) from #calendar c2 where c2.item = c.item) from #calendar c
The next step is the looping program; the final results are stored in a table called #orderpoint and consist of the item, lead-time demand quantity and order point quantity. A scratchpad table called #quantity is also required to hold the output for each iteration.
create table #orderpoint(item varchar(3), leadtimequantity int, orderpoint int)
create table #quantity( quantity int)
A number of variables are also required for tracking; the first four are self-explanatory, @it1 tracks the overall iterations (set at 1,000 here) and @it2 is a general purpose counter.
declare @item varchar(3),@servicetarget int, @leadtime int, @ordercount int, @quantity int, @it1 int,@it2 int
The program runs until every item in the #item table is also in the #orderpoint table. The looping statements are annotated with comments describing their function. Two functions are used for the random selection. The first is newid(), which is built into SQL Server and is really a pseudo-random sequence of characters. Applying checksum() and then taking the rightmost three digits divided by 999 converts it into a random number between 0 and 1. The second is dbo.fx_rand(), a wrapper around rand() described in the appendix.
while 1=1 --loop for each item
begin
set @item = (select top 1 item from #item where item not in (select item from #orderpoint))
if @item is null break
set @servicetarget = (select cast(servicetarget*100 as int) from #item where item = @item)
set @it1 = 1000
truncate table #quantity
while @it1>0 --loop for each iteration picking a lead time
begin
set @leadtime = (select top 1 datediff(d, convert(char, orderdate,112), convert(char, receiptdate,112)) from #receipts where item = @item order by newid())
set @it2 = 1
set @ordercount = 0
while @it2 <= @leadtime --loop for each day in lead time picking an order count
begin
set @ordercount=@ordercount+ (select min(ordercount) from #calendar where item = @item and distribution >=dbo.fx_rand())
set @it2 = @it2+1
end
set @it2 = 1
set @quantity = 0
while @it2 <= @ordercount --loop for each order in order count picking a quantity
begin
set @quantity=@quantity+(select top 1 quantity from #salesorders where item = @item order by newid())
set @it2 = @it2+1
end
insert into #quantity select @quantity
set @it1 = @it1-1
end
insert into #orderpoint select @item,leadtimequantity, max(sq2.quantity) from
(select avg(quantity) leadtimequantity from #quantity) sq1,
(select top (@servicetarget) percent quantity from #quantity order by quantity) sq2
group by leadtimequantity
end
The final insert after the iteration loop loads the lead-time demand and the order point to the working file, which would then be loaded to an MRP system.
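That final insert computes an average for lead-time demand and takes the maximum of the top-N-percent rows for the order point. In Python terms the same calculation looks like this (the quantities below are hypothetical stand-ins for the #quantity table, chosen to echo the figures in the graphs):

```python
import math

# Hypothetical iteration results standing in for the #quantity table
quantities = sorted([0, 0, 5, 10, 15, 20, 22, 35, 50, 71])
service_target = 0.95

lead_time_demand = sum(quantities) // len(quantities)  # avg(quantity), integer like SQL
k = math.ceil(service_target * len(quantities))        # TOP n PERCENT rounds row count up
order_point = quantities[k - 1]                        # max of the top 95 percent
```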
An interesting property of this approach is that it can be used to graphically illustrate the demand profile of an item. Since each iteration has an equal chance of occurring, a graph of the #quantity table for a given item shows all of the equally likely outcomes. Consider the graphs below. The first shows the values for 1,000 iterations (which took about 2 seconds on my Gateway in SQL Server Express), with an average of about 22 units. The second is sorted by quantity, with the black arrows indicating the median (15 units) and the 95th percentile (71 units); the red arrow to the left is the number of iterations with no volume. The third shows the lead-time volume by frequency: the red arrow is nearly 20% of the distribution, meaning there is nearly a one in five chance of selling nothing after placing an order. The joys of intermittent demand.
Appendix: Dynamic Random Numbers
The rand() command in T-SQL is not dynamic; using it in a query will return the same value for each record returned. I know of two work-arounds, though I'm not sure which is the more random.
First, newid() is a built-in SQLServer function that returns a pseudo-random sequence of characters that can be used for sorting as is:
select * from #table order by newid()
It can also be used to generate a random number between 0 and 1:
select cast(right(checksum(newid()),3) as float)/999
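As a rough Python analogue of what that expression does (hash a fresh unique value, keep the rightmost three digits, scale into [0, 1]; uuid4 and hash stand in for newid() and checksum() here):

```python
import uuid

def pseudo_rand():
    # Hash a fresh unique value (standing in for checksum(newid())),
    # keep the rightmost three digits, and scale into [0, 1]
    n = abs(hash(uuid.uuid4().hex))
    return int(str(n)[-3:]) / 999.0
```

Note that either version only has three digits of granularity, which is plenty for picking rows from a distribution table but coarser than a true uniform draw.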
The second alternative is more involved but is built on the rand() function. First create a view that references rand(), then a function that returns the result of the view:
create view vw_rand
as
select rand() random
create function dbo.fx_rand ()
returns float
as
begin
return (select random from vw_rand)
end