Notes from Kathy's Desk

Richard Tan, my long-term colleague at Haverly, passed away unexpectedly at the end of April. Over his career, he was responsible for conceiving of and implementing some of the best features in our optimization products. I would like to share with you some of my memories and thoughts about the 25 years that we worked together.

One of my readers wrote to me to ask why, since we are solving non-linear problems, we use SLP instead of something like a Generalized Reduced Gradient method – as you can find in the Excel Solver. There are lots of non-linear optimization methods – and the general characteristic of them is that they make an approximation to the problem, solve it, inspect the solution and try again – just like SLP. So, the question then becomes, if you had access to another non-linear optimization technique, what would you gain by switching?

The difference between refinery planning and scheduling may be obvious if you have worked in the commercial department, but from the outside the distinction is less clear. I usually explain that planning is about what you want to do while scheduling is deciding when you want to do it.

Although LP in the refinery business usually means a production planning model, it is also possible to apply optimization methods to product distribution problems - deciding which depot should supply each customer.

A type-1 Special Ordered Set (SOS) is a convenient way of constraining a model to use only one of a group of options in the solution. One application is piece-wise linear representations of processing units – where more than one mode or base-delta base representation has been made to improve accuracy over a wider range of inputs – instead of trying to represent it all with one line.

Emissions! How low can you go? Use an alternative objective function in your LP model to minimize emissions instead of costs.

Does setting a minimum objective function value in an SLP model solve local optima problems?

You've been offered a new computer! What should you ask for?

We are often asked for computer hardware recommendations to reduce the time required to solve GRTMPS optimization problems. Let's answer the hardware question and throw in some additional suggestions that can reduce run times.

Batch Limits: How do you tell a model that you can buy/sell 20,000 or 40,000 but not 31,415 - and why you might need a multi-period model with inventory if you do.

At our most recent user conference I was struck by how different the content felt to previous meetings. The agenda as always included material on the latest Haverly product developments, and best practices in the boiling-oil business, but that's definitely not all we are talking about these days...

A client recently asked how to re-optimize the final matrix from an SLP model outside of GRTMPS. It's not hard if you know which files to save and remember a few DOS commands. Here are some instructions, in case anyone else is interested in going in through the back door and playing around inside the black box.

Refinery planning models are usually written with one underlying material balance basis – weight or volume – with one standard unit of measurement. Volume models, particularly in the US, are done in barrels, but it is common to find that the Hydrogen, and other gases, have been balanced in MSCF instead. Solids like Coke and Sulphur might be counted in tons. Why? Primarily, because that is how these materials are normally measured, so the values are more familiar and easier to interpret.

Team	Played	Wins	Draws	Losses	Goals For	Goals Against	Points
A	3	2	1	0	3	1	7
B	3	1	2	0	2	1	5
C	3	1	0	2	2	2	3
D	3	0	1	2	2	5	1

Group Results tables from the group stages of football tournaments make nifty logic puzzles. Knowing the Wins, Draws, Losses, Goals For and Goals Against, can you work out the scores of the individual games?
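For readers who would rather let the computer do the deduction, here is a minimal brute-force sketch in Python. It is not part of the original puzzle: the function names (`tally`, `solve`) are made up for illustration, and the only data used is the table above. It tries every score for the six games, bounded by each team's Goals For, and keeps the combinations that reproduce the table.

```python
from itertools import combinations

# team: (wins, draws, losses, goals_for, goals_against), copied from the table
TABLE = {
    "A": (2, 1, 0, 3, 1),
    "B": (1, 2, 0, 2, 1),
    "C": (1, 0, 2, 2, 2),
    "D": (0, 1, 2, 2, 5),
}
GAMES = list(combinations(TABLE, 2))  # the six pairings: A-B, A-C, ..., C-D


def tally(scores):
    """Build each team's (W, D, L, GF, GA) from the scores assigned so far."""
    stats = {team: [0, 0, 0, 0, 0] for team in TABLE}
    for (first, second), (g1, g2) in zip(GAMES, scores):
        for team, gf, ga in ((first, g1, g2), (second, g2, g1)):
            s = stats[team]
            s[0] += gf > ga
            s[1] += gf == ga
            s[2] += gf < ga
            s[3] += gf
            s[4] += ga
    return stats


def solve(scores=()):
    """Depth-first search over game scores, pruning as soon as a count overshoots."""
    stats = tally(scores)
    if any(got > want for t in TABLE for got, want in zip(stats[t], TABLE[t])):
        return []
    if len(scores) == len(GAMES):
        # Keep the assignment only if every team's line matches exactly.
        if all(tuple(stats[t]) == TABLE[t] for t in TABLE):
            return [dict(zip(GAMES, scores))]
        return []
    first, second = GAMES[len(scores)]
    found = []
    for g1 in range(TABLE[first][3] + 1):       # cannot score more than Goals For
        for g2 in range(TABLE[second][3] + 1):
            found += solve(scores + ((g1, g2),))
    return found


solutions = solve()
for solution in solutions:
    for (first, second), (g1, g2) in solution.items():
        print(f"{first} {g1} - {g2} {second}")
```

The number of entries in `solutions` also tells you whether the table pins the scores down uniquely.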

It can be very useful to give each process unit a special operation to represent on stream days to control availability and capacity, particularly when setting up cases with shutdowns.

Here is a trivia quiz for GRTMPS users and other LP and/or Excel aficionados. I’ve tried for a mix of obvious and obscure. Here is a chance to prove how clued up and erudite you are - or maybe to pick up some useful tidbits.

While the reports of individual optimizations can show us much of interest, most planning questions are answered by comparing the results of cases. With the Report Generator tool you can not only see numbers from multiple runs side by side, you can also create virtual cases which show the deltas between values.

If..then logic can be used in planning models to represent situations where an option can only be, or must be, active when a condition is met.

'Tis the season....at least in Northern latitudes - for shorts, fans, lunch in the garden and summer vacations ... and to blend less butane into gasoline because the permitted maximum vapour pressure specification is lower when the weather is expected to be warmer. At least you probably won't have to worry as much about meeting cloud and pour on the diesel blending. How do you handle these seasonal changes in specifications in your planning models?

I have previously described how distributed recursion solves the pooling problem (Desk Note #4) – focussing on it as an issue of quality balancing. I mentioned that this method does also have some mathematical underpinnings which I would write about later. After being reminded by a reader, here is an explanation of how distributed recursion can be derived from a Taylor expansion – and why the extra transformation that gives us distribution factors is useful for stable optimization.
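As a rough sketch of the idea, here is the textbook first-order expansion of a pooled quality term. The notation is an assumption made for this illustration and may differ from the Desk Note itself: q is the pool quality, x_j the flow from the pool to destination j, P the total pool volume, and a subscript 0 marks the current estimate.

```latex
% Quality carried from the pool to destination j: bilinear in q and x_j
q\,x_j \;\approx\; q_0\,x_j + x_{j,0}\,(q - q_0)

% Summing the second term over all destinations, with P_0 = \sum_j x_{j,0}:
E \;=\; \sum_j x_{j,0}\,(q - q_0) \;=\; P_0\,(q - q_0)

% Each destination receives its share of E through a distribution factor:
x_{j,0}\,(q - q_0) \;=\; d_j\,E, \qquad d_j = \frac{x_{j,0}}{P_0}, \qquad \sum_j d_j = 1
```

Written this way, the LP carries a single error variable E per pool quality, and the factors d_j say how much of it lands at each destination.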

Distribution planning models usually include targets for product quantities to be delivered reflecting commitments that have already been made. Customer N is due to receive X m3 of gasoline. Ideally, your refineries will be able to supply product to fulfill all the demands, but plans don’t always work out and you might find yourself having to manage a shortfall. Contracts sometimes include penalty clauses requiring compensation to be paid if all the contracted product is not delivered. These need to be represented in the model so it can give guidance in how to allocate the product that is available.

If you have multiple distillation towers at your refinery, you may find that the model allocates different crude slates to each. If operational constraints mean that crudes will be mixed in advance and run as a single feed, then the model is over-optimizing by directing them to the towers in different proportions. This is exactly the issue that is solved by pooling.

If you have multiple towers at your refinery, you may find that the model allocates different crude slates to each. If you have the flexibility to achieve that, then this is exactly the kind of solution you want. But if your operating options are more restricted and such plans cannot be implemented, the model is over-optimizing and you need to add some constraints on the blends to guide it to more realistic plans.

The SSI system that is used to exchange data between model databases and Excel workbooks is useful for many tasks, and even more so when the additional capabilities of the Multi-SSI panel and the Workflow Tool for handling multiple imports are taken into account.

Process unit representations in planning models need to include capacity controls so that the optimization takes into account limits on how much material can be processed. The obvious and easy one to set up is a count of the feed.

Process vectors are usually written per unit of feed, so if you put a 1 as a loading factor on the capacity control and give the maximum, the unit has a size. Historically, weight models were set up with weight-based limits because that was easy and linear, but as it is almost certain that physical constraints on the actual plant are volumetric flows, volume capacity controls would be more accurate.

Does your model have a lot of old clutter in it? Cleaning up your model can make it easier to understand, reduce its run time and improve stability. Unused pool qualities are a good de-cluttering target.

In GRTMPS you can define “generic” operations on distillation units that automatically expand for whichever set of crudes is available in the case. This is more convenient than having to write out operating modes for every crude. But if you have multiple towers or modes and some crudes that cannot be processed in all of them, you need to exclude those feeds from the expansion.

Planning models are usually run so that the optimization maximizes profit margin with no limits set on what can be spent to achieve it. But what if credit is tight? The global economy and political landscape have had quite a few upsets so far this century. For some companies that has certainly meant operating with limited access to finance. Feedstock evaluation changes from being an assessment of how valuable each option is to a question of what is the best combination that can be obtained for the money available.

Crude oil evaluation is one of the most common optimization applications in the oil industry. The predicted profit margin/bbl for each grade relative to the others gives a “pecking order” of preferred feedstocks for a refinery. Even a difference of a few cents per barrel translates to a large sum of money, so it is very worrying when a crude that looked good drops from the top to the bottom of the list from one assessment to the next. Does it mean that the original evaluation was wrong?

Help! My data is in rows, but I need columns. How can I get from:

[image: small rows] to [image: small columns]?

Where in the reports can you see the margins on all the process units?

Blending Scaling? Choose 1.

How does a planning model decide how much process unit capacity to use?

Have you ever right-clicked on a scroll bar?

Case generator is a flexible tool for setting up loads of cases to go at the press of a button – and a 181 file of initial estimates can be very helpful in improving recursion behaviour. So how, asks a client, can I use one with the other?

Spreadsheet or Database? Let’s be honest about our preferences. Given a pile of data to work with your first click will be the one that opens Excel. Databases are great for validation but they are lousy at maths so we in the GRTMPS team try to offer you the best of both – so you can use the input database to protect yourself from time wasting over undefined codes, but use a spreadsheet to manipulate data. The tool for passing data in and out of the model database via Excel is called SSI – for SpreadSheet Import.

What do cleaning a fish, climbing a mountain peak, and LP models have in common?

There should be mass balance across a refinery – every molecule of feed stock that comes in comes back out again (eventually) – and we hope, mostly as useful products. But the reality is that when inputs and outputs are compared there will be some discrepancy. How should this be handled in our planning models?

If you have ever sat and stared at a screen, absolutely puzzled by the behaviour of your code, model, simulator, etc., you will appreciate the aptness of Dawkins' observation that...

*W* Data for recursed qualities missing for components of pool S1# at RR. Values assumed ZERO.

Systematically varying an operating parameter such as severity is a fairly common task for the user of a refinery planning model. Such exercises help us to confirm that a model is appropriately responsive to changes in conditions, allowing you to gain some understanding of the economic impacts as well as to see how the overall system adapts.

It is very useful for the refinery industry that many of the stream qualities that we need to predict blend linearly. But many don't. Vapour pressure, octane, cloud point and viscosity are examples of properties where a simple proportional average of the measured values of the components will not tell you what value the mixture will have. Many such properties can be handled linearly after all, however, if a blending index is applied.
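To make the idea concrete, here is a minimal Python sketch of blending vapour pressure through an index. It assumes the widely quoted power-law index RVP^1.25 and volume-based blending; that exponent, the function names and the sample figures are assumptions for illustration, not necessarily what your model or GRTMPS uses.

```python
RVP_EXPONENT = 1.25  # assumed power-law index exponent, for illustration only


def rvp_index(rvp):
    """Convert a vapour pressure into an index that blends linearly by volume."""
    return rvp ** RVP_EXPONENT


def blend_rvp(components):
    """components: list of (volume, rvp) pairs. Returns the blend's vapour pressure."""
    total_volume = sum(volume for volume, _ in components)
    blended_index = sum(volume * rvp_index(rvp) for volume, rvp in components) / total_volume
    # Convert the blended index back into a vapour pressure.
    return blended_index ** (1.0 / RVP_EXPONENT)


def linear_average(components):
    """The simple proportional average, shown for comparison."""
    total_volume = sum(volume for volume, _ in components)
    return sum(volume * rvp for volume, rvp in components) / total_volume


# Hypothetical blend: 90 parts of an 8 psi stock with 10 parts of 52 psi butane.
recipe = [(90.0, 8.0), (10.0, 52.0)]
print("index blend   :", round(blend_rvp(recipe), 2))
print("linear average:", round(linear_average(recipe), 2))
```

Inside the LP only the index values appear, so the blend specification stays a linear constraint; the conversion back to a vapour pressure is for reporting.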

Clearing out a pile of old magazines while tidying up my desk, I came across an old review of "Unknown Quantity: A Real and Imagined History of Algebra" by John Derbyshire (2006). Since I basically earn my living messing about with simultaneous equations and I like a clever title, I thought I would get hold of a copy and have a look.

“I have a multi-refinery model and it reports the crude marginal values at each site. Where can I find the over-all marginal value for a crude?”

When writing a non-linear process unit representation most of the effort goes into building the Process Simulator Interface (PSI) spreadsheet itself. However, once that has been completed, it is necessary to link each calculated output PUP with the variables that it depends on in the Adherent Recursion panel to connect it to the LP model. Here is how to save some time using the Spreadsheet Import (SSI) and PSI Analyzer tools.

If you have the multi-core extension to GRTMPS so you can run multiple optimizations simultaneously, the Queue Manager will default Max Jobs to the number of cores. On my i7 laptops that would be 8, but I normally set it to...

If the price of a crude is less than the marginal value, why doesn’t it buy more?

When there is a Process Simulator Interface (PSI) workbook connected to your GRTMPS model the solution values that are used as inputs are passed into the calculations on each recursion pass. These aren’t left behind in the InputValue section of the workbook after the run though, so if you want to set up the simulator to match a particular LP solution, you need to put them in yourself. The Haverly Excel Add-in includes a tool for this that you will find very useful for analysing and debugging.

Do most use a single “transfer price” for gas and diesel and other products or do they use a series of pricing levels…..?

The gap between not doing something at all and its maximum level often includes a range where the quantity is too small to be practical. A minimum would force the option to be active, while it might be more profitable to leave it at zero. You can incorporate this kind of condition into an LP model using semi-continuous bounds.
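The contrast between a plain minimum and a semi-continuous bound can be sketched in a few lines of Python. This is a toy with a single option and a constant margin per unit, not a GRTMPS feature; the function name and numbers are hypothetical. In a real model the same condition is usually written with a binary variable y as lower·y ≤ x ≤ upper·y.

```python
def best_activity(margin, lower, upper, semi_continuous):
    """Return the profit-maximising activity x for one option.

    Plain minimum:         x must lie in [lower, upper].
    Semi-continuous bound: x is either 0 or lies in [lower, upper].
    With a linear objective the optimum sits at an end of the allowed range,
    so it is enough to compare the end points (plus zero when allowed).
    """
    candidates = [lower, upper]
    if semi_continuous:
        candidates.append(0.0)
    return max(candidates, key=lambda x: margin * x)


# An unprofitable option (margin of -2 per unit) with a practical range of 5 to 20:
print(best_activity(-2.0, 5.0, 20.0, semi_continuous=False))  # forced to the minimum
print(best_activity(-2.0, 5.0, 20.0, semi_continuous=True))   # free to stay at zero
```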

GRTMPS offers several ways to temporarily deactivate data records that are not currently needed. But sometimes it is just more useful, for example when combining independent models to make a larger multi-site one, to be able to obliterate a whole set of information. This can easily be done with an OMNI:DELETE TABLE command.

Where does negative sulphur come from?

Swing cuts are a well-established method for representing flexibility in cut-points on refinery distillation towers. One of the first tech support questions I had this year was about reporting and controlling the cut point temperatures on a unit using that method, given that they are determined by how much of the swing has blended up or down into the adjacent core cuts.

Do you think "Dilbert" is a documentary? To mark the end of another busy year and give you something to play with over the holiday season, I have been inspired by modern digital encryption to create a new coding system, “semi-primary”, to challenge your cracking skills.

Did you know that very small numbers are bad for the stability of linear optimization? If you see a zero value for some yield or property in a solution, you will usually be right in assuming that it is actually zero, or so small that it has rounded to it in the report. However, just occasionally, if you calculate out the value using your input data and the solution activities, you will find that something should be there. It might be very small, but it should not actually be zero. Here are some suggestions for improving the scaling of process unit representations.

Oil refinery and other process industry optimization problems are largely covered by Linear Programming models. Most variables represent continuous quantities, such as the amount of a component to mix into a blend, that are allowed to take on any real number value between a minimum and maximum. However, there are some constraints that are best handled with discrete variables. A model that contains both linear and discrete variables is an example of Mixed Integer Programming (MIP) and is traditionally solved using a Branch and Bound algorithm.

In GRTMPS, non-linear equations can be connected to the model directly using Adherent Recursion. Below we’ll present a simple way to fit a polynomial function to process unit data and then use that in a model to drive the linear approximations that are needed for each optimization pass.
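As a rough illustration of the two steps, here is a small Python sketch: a least-squares quadratic fit, followed by the value and slope at the current operating point, which is the kind of local linear approximation a recursion pass needs. The data is synthetic and the names (`fit_quadratic`, `linearise`, severity, yield) are hypothetical; how GRTMPS itself consumes the result through Adherent Recursion is not shown here.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x**2 using the normal equations."""
    power_sums = [sum(x ** k for x in xs) for k in range(5)]
    rhs = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x3 system: row i is [S(i), S(i+1), S(i+2) | T(i)]
    a = [[power_sums[i + j] for j in range(3)] + [rhs[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def linearise(coeffs, x0):
    """Value and slope of the fitted curve at x0: the local linear approximation."""
    c0, c1, c2 = coeffs
    value = c0 + c1 * x0 + c2 * x0 ** 2
    slope = c1 + 2.0 * c2 * x0
    return value, slope


# Synthetic "plant data": yield as a function of severity (made-up numbers).
severity = [0.0, 10.0, 20.0, 30.0, 40.0]
yield_pct = [2.0 + 0.5 * s - 0.01 * s * s for s in severity]

coeffs = fit_quadratic(severity, yield_pct)
value, slope = linearise(coeffs, 10.0)
print("coefficients:", [round(c, 4) for c in coeffs])
print("at severity 10: yield", round(value, 4), "slope", round(slope, 4))
```

On each pass the slope would be re-evaluated at the latest solution value, so the straight line handed to the LP follows the curve as the optimization moves.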

A critical requirement for anyone making blended products like gasoline or diesel is that the properties – such as density, sulphur, octane, cloud point - of the final mixtures are within the legally required specifications. Refinery optimization models obviously need to have equations that represent these constraints.

I travel a lot. My colleagues in Haverly travel a lot. Amongst us we have probably experienced every possible reason for delay that you can imagine. Some distilled wisdom is offered here from our collective experience of air travel and business trips, particularly those dreaded long-haul flights that land you in a different time zone.

Why is the marginal value on the total blend different from the marginal value of the component?

Are you paranoid enough about backing up your work? How many hours would you lose if your computer just wouldn’t boot or if that core document, spreadsheet or database you have just spent a week on was corrupted?

Haverly’s Matrix Analyzer is a very useful tool for combining the matrix structure with the solution values to see how the equations that make up your model are working - all in Excel so you can tinker around with it.

As an example of how it can be used, this note takes a look at how block operation affects process unit limits.

If you constrained an operating parameter to take a specific value, forcing it away from the optimal value, you would expect to see an incentive on the limit. But what if it always came out blank, no matter what value you fixed it to? What could be going on?

Installing GRTMPS on another computer? You can copy across all your g5 preferences and run history as long as you are installing the same version.

Marginal values – the additional profit to be made if a constraint is relaxed – are one of the benefits of optimizing planning problems with Linear Programming as they can help us understand the economic drivers of the solution. Refinery planning models are normally written with a balance row for each hydrocarbon material being tracked. As equality rows they are always constraining and so we can see a marginal value for each stream – but what exactly do they mean?

What makes this case different from the base? It worked yesterday, what’s changed? Why don’t we get the same answer? I have previously written about how the GRTMPS compare tool can be used to identify differences between spreadsheets that contain OMNI format input tables. Here's a tool included in MS Office 2016 that can identify the differences between any pair of Excel workbooks, to help you when you have spreadsheets containing other data formats.

This is a linear programming problem written in MPS format. Can you make sense of it?

What makes this case different from the base? It worked yesterday, what’s changed? Why don’t we get the same answer?

When did you last take a good look at your initial pool property estimates? A good first approximation should put your optimization on a good path and save you some recursion passes. Recursion Monitor is a useful tool for checking out your starting qualities; you can compare first to last pass values and check them for internal consistency.

Would you expect the objective function of an integrated refinery model (two or more sites) to normally be lower or higher than the sum of the value of the individual models?

How did I get here? The recursion monitor is a useful tool as it provides an easy way to see what is going on with the recursed parts of your model across each recursion pass. Looking at the final reports only ever shows you where you arrived, not how you got there. Taking a look at what is going on during earlier passes will give you insights into a model’s solution path and help you resolve problems with instability and infeasibility.

Fancy an infinite amount of money? UNBD as a solution status is offering you just that – but unbounded solutions are not very likely to prove true out in the real world so they are no better a basis for a plan than an infeasibility.

The Matrix is the question and the Solution Print is the answer.

Have you ever tried to track a stream with multiple uses through a solution?

Fancy yourself as a cryptographer? Here's a little code cracking challenge to keep the brain active over the year end holidays.

It's a bit of a waste to make the solution MDB on every run, if you aren't necessarily going to use it, but even more of a bother if you want it and don't have it. Well you can just…..

Many regulatory regimes include product specifications that cover not just individual batches of a particular grade, but also the overall average which is exported from the refinery over multiple grades. These regulations sometimes include incentives for doing even better than the legal requirement – effectively paying you to blend with giveaway. Including such an incentive in your LP model is easier than you might think.

Does it make any difference if you use a FIX limit or equal MIN and MAX constraints?

Do you have the Haverly Add-in activated? Originally for indexing GRTMPS input data spreadsheets, it now has some useful functions for working with your PSI workbooks and for analysing solutions.

When a model doesn’t converge, one possible cause is that some of the adherent recursion PUPs don’t converge. To address this problem, one can use the adherent recursion slope damping to reduce the variations of PUPs between recursion passes and so help the model to converge.

Maths boring? Never! Here are some books and movies that illustrate the dramatic (and even comic) potential of a story that revolves around mathematics and computing.

This is a guide to the hierarchy of data types in the GRTMPS input database: Model, Case, and Base/Alternate.

A butterfly flaps its wings in the Andes and ….

Do you ever find yourself grumbling because you got the scaling wrong on a set of numbers, or reversed your positives and negatives? The Multiply and Divide options in Paste Special allow you to sort the problem out with just a few clicks.

If you are working with GRTMPS database models and entering your data via the interface panels, the older GRTMPS data table system may be something of a mystery to you. However, since the database information is exported into these tables before being processed, it can be very helpful to be able to recognize the connections between panels and tables. Debugging tools such as data check, run time messages and file compare all refer to data in the internal tables, so knowing how the names work will make you more efficient. Here is a brief guide.

Every variable and equation in an LP matrix generated via GRTMPS has a unique name built from user assigned codes and internal elements. Understanding them will help in debugging model problems, such as infeasibilities.

NO…. but, YES.

If you’ve ever wasted a few hours debugging a broken Excel workbook, then you will be as excited as I was when I learned about Go To Special last year. This does a lot of really interesting things, including finding all the cells in a worksheet with errors in them.

If you have a model with multiple locations and / or periods, you may want to create limits that control sub-totals across all or some of these places and times. If you can buy a stream at 3 locations in 3 periods, how many constraints would you need to cover every possible sub-set? How do you put them in the model?

Are you using the “spooling” option when you submit runs? It might save you some run time.

Have you ever entered a formula into Excel only to have it treat it as text and just sit there displaying what you typed without resolving it?

When you are setting up the case data for your monthly plan, what price should you use for the crude or other materials that you have already bought? Quite a few people I have asked thought the answer was obvious, but they did not all come up with the same answer.

Distributed Recursion SLP* models require initial estimates for the pool properties that are being optimized and the pool’s distribution factors. By default the pool property values are taken from the blending data and the error distribution is an even division over all the ways the pool can be used. However, you can override these numbers by using a “181 file” as an additional input to the model. Sometimes this helps the optimization converge sooner and may find you a better value.

Sorting your stream list can help you manage your data and make your reports easier to read. In most full database models, this is quite a long list and it can be challenging to use and maintain as it is unlikely to fit on one screen. (Even if you use a table-based model, you probably have a list of crude streams here – so read on.)

The trend towards larger and larger models works against our desire for fast run times. If you are adding many crudes, periods and / or locations, adjusting the OMNI settings for your GRTMPS model might help speed things up again. It might even be essential to keep it running.

This is an introduction to the fundamental issue that brought recursion into refinery planning models and how this approximation allows us to optimize the qualities of products where some of the components are themselves mixtures of varying qualities.

HOW DO YOU SEARCH FOR "*", "?" and "~" ?

If you are working with GRTMPS data in a spreadsheet - data tables, SSIs, etc. - you probably have some cells that contain asterisks (*) and question marks (?) since GRTMPS uses these as wild cards, replacing them with specific period, location and crude codes when the data is processed. The challenge in doing a Find for a specific entry with one of these characters is that Excel uses them as wild cards in search terms, “?” for any single character and “*” for any group of characters (as does the find in Windows Explorer and many editors).

WHAT DO YOU DO WHEN H/XPRESS FREEZES OVER?

Have you had a run that appears to freeze in the optimize step? It might well have already done some recursion passes, but now it’s just sitting there in the Queue Manager like it is never going to finish.

Have you ever wondered why pool qualities sometimes have marginal values, even when there is no specification?



Guidance, Ruminations, Thoughts, Miscellaneous Points and Suggestions

Welcome!

- Kathy