I have previously described how distributed recursion solves the pooling problem. That note explained it as an issue of quality balancing, but the method does have some mathematical underpinnings which can provide further insight into how it works, particularly if you are the sort of person who likes to see the equations.

The pooling problem arises because of the need to set up blending equations in the LP model where the qualities of the components are not known in advance, because they are themselves blends. Why is this a problem? Because you can't make a linear equation. Let's write it out and see.

Consider a maximum specification on quality “Q” for a blend, “c”:

**[1]** Q_{c} ≤ **MAX** where "c" is a mixture of "a" and "b".

The average Quality of "c" can be calculated from the Volumes of the components blended, given their quality contributions.

**[2]** Q_{c} = (Q_{a} * V_{ac} + Q_{b} * V_{bc})/(V_{ac} + V_{bc})

I’ve used V for Volume, by the way, as an arbitrary choice over W for Weight to deal with the problem of not being able to use Q again for Quantity without causing major confusion. The quantities will be volumes or tons depending on the basis of the model material balances – and the blend basis for the quality could be weight or volume, assuming that the co-efficients for the component contributions have been suitably scaled.

Written that way, however, it can’t be solved as part of an LP model, because the “V”s are variables so the division isn’t allowed.
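As a quick sanity check of equation [2], here is a minimal sketch in Python. All the qualities and volumes are made-up illustrative numbers, not values from the article:

```python
# Blended quality of "c" from its components (equation [2]).
# Qualities and volumes here are illustrative, not from the article.
Q_a, Q_b = 0.5, 2.0        # component quality contributions
V_ac, V_bc = 300.0, 100.0  # volumes of "a" and "b" blended into "c"

Q_c = (Q_a * V_ac + Q_b * V_bc) / (V_ac + V_bc)
print(Q_c)  # 0.875
```

This is exactly the calculation an LP can't hold as a row, because the denominator contains decision variables.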

However, this can be re-arranged to be a linear function:

**[3]** Q_{a} x V_{ac} + Q_{b} x V_{bc} - MAX x (V_{ac} + V_{bc}) ≤ 0
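The rearrangement from [1]/[2] to [3] just multiplies through by the total volume, so the two tests always agree whenever that volume is positive. A quick check with made-up numbers:

```python
# The ratio test [1]/[2] and the linear form [3] give the same verdict
# whenever total blended volume is positive. Numbers are illustrative.
Q_a, Q_b, MAX = 0.5, 2.0, 1.0
V_ac, V_bc = 300.0, 100.0

Q_c = (Q_a * V_ac + Q_b * V_bc) / (V_ac + V_bc)      # ratio form [2]
lhs = Q_a * V_ac + Q_b * V_bc - MAX * (V_ac + V_bc)  # linear form [3]
print(Q_c <= MAX, lhs <= 0.0)  # True True
```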

where **black** items are constants, and **blue** items are variables. Another row is used to drive a single variable to be equal to the total quantity, V_{c}.

**[4]** Q_{a} * V_{ac} + Q_{b} * V_{bc} - MAX * (V_{c}) ≤ 0

No pooling problem yet. But if a (or b) is itself a blend, its quality co-efficient should be a variable not a constant. The specification constraint on Q_{c} becomes

**[5]** Q_{a} * V_{ac} + Q_{b} * V_{bc} - MAX * (V_{c}) ≤ 0, with Q_{a} now a variable rather than a constant.

This is back to not being suitable for LP, as two variables are multiplied together for the contribution from component "a". It's not that **Q_{a}** is difficult to calculate [**Q_{a}** = ∑(Q_{x} * V_{xa}) / ∑(V_{xa}), where "x" indicates one or more components], but even if you substitute that into the specification equation there is no re-arrangement that will convert it to an acceptable linear equation. Now we have a pooling problem.

It's not uncommon in applied mathematics to come up against equations that are in the "wrong" form, or simply very difficult to solve, so of course techniques have been developed for circumventing such roadblocks. A Taylor Expansion is a means of predicting the value of a function for some set of variables by adding up the derivatives of the terms multiplied by the changes in the inputs. The concept is very simple. If you have a function N = *f*(X,Y,Z) that you have solved for {X_{0},Y_{0},Z_{0}}, then the value for {X_{n},Y_{n},Z_{n}} is approximately

**[6]** N ≈ f(X_{0},Y_{0},Z_{0}) + (X_{n}-X_{0}) x dN/dX + (Y_{n}-Y_{0}) x dN/dY + (Z_{n}-Z_{0}) x dN/dZ

If the original function has higher order terms the derivatives can themselves be quite complex equations - but as long as each is easier to solve than the original, the expansion is a useful technique. (If a derivative term is still hard to solve, you can repeat the process recursively, doing an expansion to approximate its behaviour until you have something you can calculate - although it would be trendier to apply machine learning, a neural net or some other AI technique, rather than actually working through the layers of maths.) To be useful for us in the context of SLP, we need to end up with a linear equation. Fortunately, the expanded specification equation is linear.
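As a small worked illustration of the expansion [6], here is a made-up two-variable function (nothing here comes from the blending model itself):

```python
# First-order Taylor expansion of an illustrative function
# N = f(X, Y) = X*Y + Y**2, around a solved base point (X0, Y0).
def f(x, y):
    return x * y + y ** 2

X0, Y0 = 2.0, 3.0   # base point, assumed already "solved"
Xn, Yn = 2.1, 3.05  # nearby point we want to predict

# Partial derivatives at the base point: dN/dX = Y, dN/dY = X + 2*Y.
approx = f(X0, Y0) + (Xn - X0) * Y0 + (Yn - Y0) * (X0 + 2.0 * Y0)
print(f(Xn, Yn), approx)  # true value vs first-order estimate
```

The estimate lands close to the true value because the step away from the base point is small; the curvature terms the expansion drops grow with the square of that step.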

A product of two variables, such as the Quality x Volume term for component "a", expands with one term for each variable to:

**[7]** Q_{a}V_{ac} ≈ Qa_{0} x Vac_{0} + dQV/dV x (Vac_{1}-Vac_{0}) + dQV/dQ x (Qa_{1}-Qa_{0})

The derivatives are simple: *d*QV/*d*V = Q and *d*QV/*d*Q = V. So

**[8]** Q_{a}V_{ac} ≈ **Qa_{0}** x **Vac_{0}** + **Qa_{0}** x ΔV_{ac} + **Vac_{0}** x ΔQ_{a}

where variables are **blue** and the **bold** constants are recursed values that are changed between iterates, typically to match the actual value from the solution of the previous approximation. So we have

Assumed Quality x Assumed Volume + Assumed Quality x delta-Volume + Assumed Volume x delta-Quality.

Note that this function is very sensitive to changes in the inputs. The surface described by simultaneous changes in both quality and volume is strongly curved, and the approximation loses accuracy at an accelerating rate the further the solution moves from the base point. Adding a **ΔQ_{a}** x **ΔV_{ac}** term would bring the quality prediction back to the actual value - but that's not linear, so we can't. This linear equation can be used because we can create other equations to force the Delta variables to take on the correct values.
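The size of that lost curvature is easy to see numerically. With made-up assumed and actual values, the gap between the true product and the linear estimate [8] is exactly the ΔQ x ΔV cross term:

```python
# Linearization [8] of the bilinear Q*V term drops exactly the
# ΔQ*ΔV cross term. Assumed and actual values are illustrative.
Qa0, Vac0 = 1.0, 200.0   # recursed (assumed) quality and volume
Qa1, Vac1 = 1.1, 220.0   # actual values in the next solution

dQ, dV = Qa1 - Qa0, Vac1 - Vac0
approx = Qa0 * Vac0 + Qa0 * dV + Vac0 * dQ   # equation [8]
exact = Qa1 * Vac1
print(exact - approx, dQ * dV)  # the gap is the missing cross term
```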

Delta volume, **ΔV _{ac}** , is driven by a material balance row to be equal to the difference between the assumed and actual quantity of component “a” to product “c”.

**[9]** Vac_{1} - Vac_{0} - ΔV_{ac} = 0, with a fixed bound on **Vac_{0}** to force it equal to the assumed amount.

The variable **ΔQ_{a}** can be driven by a quality balance row:

**[10]** **Qa_{0}** x **Va_{0}** + **Va_{0}** x ΔQ_{a} + **Qa_{0}** x ΔV_{a} - ∑(Q_{x} * V_{xa}) = 0

where **Va_{0}** is an estimated total volume for "a" (not just the part that blends to "c"), and ΔV_{a} is the difference between the assumed and actual volume of "a", driven by the same type of material balance row used for getting a value for the change in the volume of "a" blended to "c". Now we have a set of linear equations that could be used in the model to approximate the non-linear element of the blending.

We could stop here and use this design in an SLP model, using estimated qualities and quantities for all the blends to create our linear approximations - but the equation can be manipulated further to make something that will recurse better - with fewer assumed values and better scaling.
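To see the pass-to-pass recursion idea in motion, here is a toy loop in pure Python. Everything about it is invented for illustration: one pool "a" made from a cheap feed x1 (quality 0.4) plus up to 50 units of x2 (quality 1.6), blended with a blendstock "b" (quality 2.0) into 400 units of product "c" with spec MAX = 1.0. Each pass solves the linearized, binding spec in closed form (standing in for the LP solve), then updates the assumed pool quality from the blend actually made:

```python
# Toy SLP-style quality recursion for one pool. All numbers invented.
# Pool "a": feed x2 (Q=1.6) capped at 50 units, remainder x1 (Q=0.4).
# Product "c": 400 units, max spec 1.0, blended from pool "a" and "b".
Q_b, MAX, V_c = 2.0, 1.0, 400.0

Qa = 0.4  # initial assumed pool quality
for _ in range(20):
    # Binding linearized spec: Qa*V_ac + Q_b*V_bc = MAX*V_c
    V_bc = V_c * (MAX - Qa) / (Q_b - Qa)
    V_ac = V_c - V_bc
    # Actual pool quality for that pool volume (x2 capped at 50 units)
    Qa_new = (0.4 * (V_ac - 50.0) + 1.6 * 50.0) / V_ac
    if abs(Qa_new - Qa) < 1e-9:
        break  # assumed and actual quality agree - converged
    Qa = Qa_new
print(round(Qa, 4))  # converges to about 0.6087
```

This is only the simple quality-recursion skeleton, not the full distributed-recursion matrix structure described above, but it shows the essential loop: assume, solve, compare, re-assume.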

First, by recognizing that Assumed Volume **V_{0}** plus delta-Volume ΔV is the actual volume made, V_{1}, we can simplify the approximation of the Q_{a}V_{ac} term in the product specification and the component's quality balance:

**[11]** Q_{a}V_{ac} ≈ **Qa_{0}** x Vac_{1} + **Vac_{0}** x ΔQ_{a} *from [8]*

**[12]** **Qa_{0}** x Va_{1} + **Va_{0}** x ΔQ_{a} - ∑(Q_{x} * V_{xa}) = 0 *from [10]*

To finish removing assumed Volume numbers, divide the **ΔQ_{a}** terms by **Va_{0}**. The component quality balance equation becomes

**[13]** **Qa_{0}** x Va_{1} + ΔQ_{a} - ∑(Q_{x} * V_{xa}) = 0 *from [12]*

which drives **ΔQ_{a}** = Va_{1} x (Qa_{1} - Qa_{0}), making it the volume x change-in-quality variable - familiar as the error vector in Distributed Recursion.
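That relationship can be confirmed numerically. With an invented pool recipe, solving [13] for ΔQ_a reproduces Va_1 x (Qa_1 - Qa_0) exactly:

```python
# Solving the quality balance [13] for ΔQ_a reproduces
# Va_1 * (Qa_1 - Qa_0). Feed qualities and volumes are illustrative.
Q_x = {"x1": 0.4, "x2": 1.6}       # qualities of the pool's feeds
V_xa = {"x1": 150.0, "x2": 50.0}   # feed volumes into pool "a"

Va_1 = sum(V_xa.values())                               # actual pool volume
Qa_1 = sum(Q_x[k] * V_xa[k] for k in Q_x) / Va_1        # actual pool quality
Qa_0 = 0.9                                              # assumed quality

dQa = sum(Q_x[k] * V_xa[k] for k in Q_x) - Qa_0 * Va_1  # rearranged [13]
print(dQa, Va_1 * (Qa_1 - Qa_0))  # both approximately -40
```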

In the Q_{a}V_{ac} term approximation that quality error is now multiplied by an assumed co-efficient calculated as the Volume of "a" used in "c" divided by the total Volume of "a".

**[14]** Q_{a}V_{ac} ≈ **Qa_{0}** x Vac_{1} + (**Vac_{0}**/**Va_{0}**) x ΔQ_{a} *from [11]*

In other words, Estimated Quality * Actual Quantity + Fraction A to C * Quality-Quantity Error, a.k.a. Distributed Recursion. (Q.E.D.)
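One last numeric sketch (invented values again) comparing the distributed-recursion estimate [14] with the true bilinear contribution:

```python
# Distributed recursion estimate [14] of the Q_a * V_ac contribution
# versus its true value. All assumed/actual numbers are illustrative.
Qa_0, Qa_1 = 0.9, 0.7        # assumed vs actual pool quality
Va_0, Va_1 = 190.0, 200.0    # assumed vs actual total pool volume
Vac_0, Vac_1 = 95.0, 120.0   # assumed vs actual volume of "a" into "c"

dQa = Va_1 * (Qa_1 - Qa_0)   # the quality*quantity error variable
estimate = Qa_0 * Vac_1 + (Vac_0 / Va_0) * dQa
actual = Qa_1 * Vac_1
print(estimate, actual)  # close, and exact once the recursed quality matches
```

When the recursed quality Qa_0 matches the actual Qa_1, the error variable is zero and the estimate is exact - which is what convergence of the recursion means.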

This is a better equation for SLP for several reasons:

- Quantities can make for large co-efficients, and so result in poorly scaled matrices; distribution fractions always lie between 0 and 1.

- Distributions are often more stable in the solution than amounts. There are more things that will affect the amounts that are made (such as the duration of the planning period) than the disposition of the streams. Consider the extreme example of a pool that has just one active use. The distribution factor is always 1, no matter how much is made, so the quality contribution is correct and converged from the beginning.

- Initial estimates for distribution factors are easy to come up with. They are not sensitive to the amounts to be made and so the same values are reasonable for a range of cases. An equal fraction to all places may not be the most likely outcome, but it is at least a plausible one, easy to work out and connects information about changes in quality to all the uses.

So, while SLP with Distributed Recursion, like most other non-linear optimization techniques, relies on heuristics to move from one approximation to another in search of a solution to the full problem, it's not all rules of thumb - there are some good mathematical underpinnings inside the box too.

From Kathy's Desk 3rd February 2022.

*Comments and suggestions gratefully received via the usual e-mail addresses.*