LogicBase.ServerExtensions.exe Process Using Tons of Memory

  • 1.  LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 26, 2012 02:58 PM

    We're having an issue on our production server where all memory (6GB) is being used.  When we look in Task Manager, LogicBase.ServerExtensions.exe is taking up the majority of the resources (~1.5-2 GB).  We're currently cycling the LogicBase Extensions via the task tray manager, which brings this process back down to a more reasonable level, but memory begins building again and minutes later the system is starving for resources.

    What is this process used for?  How can we troubleshoot the issue?  What is the impact on running workflows when cycling the LogicBase Extensions?  We've tried looking through logs, but nothing obvious jumps out.

    Thanks,

    Brandon



  • 2.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 26, 2012 04:22 PM

    Server Extensions is required for the workflow server to run (and thus host your flows).

    Check the folders beneath <install>/Data... these are the locations of the file exchanges. If you have lots (1000s) of files in these directories, I'd check your flows and verify that you don't have some structure that loops and introduces new messages into the exchanges faster than Server Extensions can process them.

    What version of Workflow are you running?



  • 3.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 26, 2012 06:23 PM

    We just got off a call w/ support earlier and we determined that the localfilestorage-lbme.reportingqueue directory under <install>\Data had about 850,000 files in it, which accumulated over the course of 6 days.  We're going to write a Workflow to monitor the number of files in this specific directory.  Would you suggest adding additional monitoring for other directories in the Data directory?  If so, which ones, and how many files should trigger an alert?  The reportingqueue is used for writing Process Manager report data into Ensemble, but what are the other directories under Data used for?

    We're currently running 7.0 SP3, but are working on a 7.1 upgrade POC.



  • 4.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 26, 2012 11:08 PM

    We had this issue last year as well, and it turned out that some of our workflows were causing it; we had to manually remove their records from the Messages table. Our Ensemble database is huge (over 70GB) at the moment and I have an open ticket with Symantec to see if they can pinpoint the exact workflow(s) causing it.



  • 5.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 27, 2012 07:03 PM

    Aryanos,

    What you wrote is interesting. We had a somewhat similar problem in one of our production environments. Our exchange databases didn't grow as extremely as yours, but that's partly an effect of our design assumptions. As a standard we split the exchange databases: we prepare a separate database for each complex, heavily loaded process, or one for each group of several smaller and lighter processes. We had to do this because when we used one database for about 80 different web apps, too many deadlocks occurred.

    The most curious thing was that some exchange databases used by quite simple processes were extremely huge, e.g. 2-3 GB. I started analyzing this issue and made some observations which I can share.

    I gathered some statistics on the problematic databases, and it turned out that the sizes of some rows in the Messages table reached 30 MB! Naturally, almost all of that size is the content of the Message column, which contains the payload of the exchange message. In one such database we had more than 100 messages larger than 5 MB. After further analysis it turned out that the largest messages usually belong to workflow tasks (the QueueName ends with the phrase ".tasks"), which is not so surprising. I started digging deeper (debugging, reflecting over the libraries) and discovered the main reason.
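
    If you want to reproduce those statistics, a query along these lines should work (an untested sketch, assuming SQL Server and that DATALENGTH() on the Message column reflects the size of the serialized payload):

     -- The largest exchange messages by payload size.
     select top 100 MessageId, QueueName, DATALENGTH(Message) as MessageBytes
     from [Messages]
     order by DATALENGTH(Message) desc;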

    The issue happens in processes implementing massive parallelism, even if they are small and simple. The effect is caused more by count than by complexity. A typical example is a Dialog Workflow component run inside a ForEach loop. Every next task created in such a loop has a larger exchange message stored in the database than the previous one! When the loop is repeated more than 100 times, the size of a task message can become really big.

    The reason is a specific implementation detail of the Workflow execution engine. Each time the execution engine runs into a component that splits the flow (creating parallel flows), it makes a copy of the whole collection of process data and stores this copy inside the execution context. The execution context is one of the fields of the task message structure stored in the exchange database. So each next task carries inside its execution context all the copies of the whole data collection made in previous iterations. I got the final proof when I deserialized task messages read directly from the exchange database (manually, in Visual Studio).

    Is anything I wrote similar to your issue, or are your problems quite different?

    What are your observations and experience?



  • 6.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 27, 2012 07:17 PM

    A common mistake causing an endless process loop is unsecured exception handling.

    The components placed on the path after an Exception Trigger component should always be secured by an additional Exception Trigger From Specified Components component, set to handle exceptions from them, to avoid looping in case an exception happens during exception handling.

    I'm not sure, but a process logging level accidentally set to DEBUG can also have a serious impact on workflow server load.



  • 7.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Broadcom Employee
    Posted Jan 28, 2012 10:21 AM

    A somewhat simplified way of looking at it is that a message contains the process data for a task. If you split the process into multiple tasks, each of them will have its own set of process data in there. Practically, these messages should be gone when the process is finished (all tasks closed).

    This is not a detailed description of how it works but this is the idea.



  • 8.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 28, 2012 11:48 AM

    Yes, this is what I've observed as well. Last year we introduced parallel workflows for our HR department and it nearly crashed our server. There were 5 workflows, each with one parallel split, and somehow one was doing something that caused it.

    One thing I'm not sure about is how Workflow handles memory cleanup. For example, say you have a simple approval workflow where someone submits a request and it goes to an approver. All the variables stored in that workflow are in the Messages and MessageProperties tables. After the workflow ends, does all that information get flushed out and cleaned up, or does it get written somewhere else for historical reference?

    I think some of the data cleanup isn't working in my environment, because 70GB of data used by the workflows is extremely high and there are over 22,000 entries in the Messages table alone.



  • 9.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 30, 2012 07:46 AM

    You described it simply, but this doesn't show the true implicit effects, which have a great impact on platform reliability and performance.

    Each message contains the whole execution context, with all the data collections created up to that moment by the instance of the process. For each parallel task a separate data collection is generated, but this collection is stored inside the same execution context. So each next parallel task carries the burden of the data collections from all previously created tasks, which makes the exchange message size grow in an arithmetic progression.
    I've done one more simple test, using a model like the one described below.
    An Initial Task passes a data structure of about 200 kB to a loop which creates 30 Parallel Tasks. Every next task created in this parallel loop has a message about 200 kB larger than the previously generated task. The size of the last Parallel Task's message is over 7 MB! Most interesting of all, after all the parallel tasks finish (merge completion), a Final Task is generated and its size is also larger than 7 MB!
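
    To put rough numbers on that progression (simple arithmetic on the figures above, not extra measurements): if each split adds a copy of the ~200 kB collection, the k-th task's message is roughly k × 200 kB, so the 30th is about 6 MB - consistent with the observed 7 MB once the initial payload and serialization overhead are added. The total stored across all 30 task messages is then about 200 kB × (30 × 31) / 2, roughly 93 MB, all from a single 200 kB data structure.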

    I started to analyze why it works this way, and it seems that the data collections made for parallel tasks are not removed from the execution context. The only place where data collections are deleted is the ClearCallbackInfo(...) method, and that method is called during parallel task merging, but only on the current context, which is not copied to the processed item and not stored in the exchange. Yet exactly that item is used to resume the workflow after the parallel merge completes. So each new task still suffers the burden of all the undeleted data collections previously stored in the context.



  • 10.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 30, 2012 10:22 AM

    Almost as you said - all the variables are stored as an element of the binary-serialized object of the class WorkQueueItem, in the Message field of the Messages table.

    Cleaning up the exchange storage after the workflow instance completes is another story.
    Each workflow process internally uses several different queues for message storage:
     - "*.Tasks" for storing task data,
     - "*.Assignments" for storing info about the assignments of the corresponding task,
     - "*.Processes" for storing basic info about an instance of a process.
    These three can be configured by setting values in the project properties. They can easily be distinguished by the ending of the string stored in the "QueueName" column of the Messages table.
    There is also a special queue for workflow task data caching, but by default it points to the "local.workflow.task.cache" exchange config, which uses InMemory storage.
    There is another queue, "*.Archive", which can also be stored in a database and configured through the properties settings, but it is disabled by default. It is used for storing "emergency" messages: Checkpoints, Exceptions and Archives (like you said: for historical reference).
    And it doesn't end there. There is one more special queue, also configured in the properties, whose messages can also be stored in the database: the "*.externaldata" queue. It is used for storing process variables externally - not directly in the internal data structures (the Storage Preferences tab in the project settings). By default, FileDataType objects are stored this way.

    Excuse this "lecture" if you already know all these things.

    The point is that the IWorkQueue interface, implemented by the classes providing exchange services for workflow processes, has a special method named ClearWorkflow(). The LogicBaseExchangeWorkQueue exchange service class is used by default. (In Workflow 6.x the exchange service class could be selected in the project settings on the Deployment tab. In current versions its name is still written in the DeploymentInfo.config file.)
    Exactly this method is responsible for cleaning up the exchange queues after a workflow ends. In the 6.x versions of Workflow this method was broken: it didn't clean up the tasks queues, only the processes queue. The effect was surprising - tasks without instances. Even if one of the parallel flows reached the end-of-process component, the remaining tasks weren't deleted. Better still - it was quite easy to finish such orphan tasks and resurrect the zombie instance. We spent days figuring out what was happening. Finally we wrote our own class with a corrected ClearWorkflow() method.

    I don't know exactly when, but in one of the 7.x versions this was repaired. So now, any time any flow in a process reaches an end component, the whole instance and all of its tasks are cleaned up. The only thing that can still remain in the queues after a process instance ends is a message in the "*.Archive" queue, but that depends on the properties settings. It is likely that "*.externaldata" could also cause a cleanup problem, because this type of message is serviced separately from tasks and instances. I've noticed on our production environments many "*.externaldata" messages left in the queues long after their instance had finished. (And this is a real problem, because it's almost impossible to identify which instances these messages were written from - they have no descriptor with a TrackingID in the MessageProperties table.)

    My suggestion: compute statistics (an aggregation) on the QueueName column of your Messages table in two ways - by full name, and by just the suffix after the last dot (.Tasks, .Processes, etc.). This will let you estimate which types of queues your processes use and what the proportions between the different message types are. Within queues sharing the same ServiceID, the number of Assignments should always match the number of Tasks, and every Task should have a corresponding Processes message; a rough consistency check is sketched below.
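
    For that cross-check, something like this could serve as a starting point (an untested sketch, assuming SQL Server and that the ServiceID is the part of QueueName before the final dot):

     -- Per-service message counts across queue types; rows where the Tasks
     -- and Assignments counts differ deserve a closer look.
     select LEFT(QueueName, LEN(QueueName) - CHARINDEX('.', REVERSE(QueueName))) as ServicePrefix,
            SUM(case when QueueName like '%.tasks' then 1 else 0 end) as Tasks,
            SUM(case when QueueName like '%.assignments' then 1 else 0 end) as Assignments,
            SUM(case when QueueName like '%.processes' then 1 else 0 end) as Processes
     from [Messages]
     group by LEFT(QueueName, LEN(QueueName) - CHARINDEX('.', REVERSE(QueueName)))
     having SUM(case when QueueName like '%.tasks' then 1 else 0 end)
         <> SUM(case when QueueName like '%.assignments' then 1 else 0 end);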



  • 11.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 30, 2012 10:25 AM

    Thanks for researching this AnaMan. With my workflows I try to remove as much data from the workflow as possible by using the Definitions To Remove on the End components, so that only the variables I need remain in the workflow. If what you say is true about the data not being removed after the tasks are completed, then that is a big issue.

    As I mentioned, we have 5 workflows that only do one parallel split each, and for some reason this increased the size of our database massively. However, we do have a lot of instances of these workflows running, so that may explain it - but what about workflows that parallel split into hundreds?



  • 12.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 30, 2012 10:33 AM

    This is good advice, although this was not our particular problem.  Our issue was created by sheer volume.  Symantec also informed us that you have more control over these queues and how their messages are processed in 7.1.  For example, you have the ability to set how often the queue is checked for messages, and it is configurable at the millisecond level.



  • 13.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 30, 2012 10:41 AM

    On a side note here, we've found using embedded models instead of embedded rule models to be more effective for a few reasons.  As you mentioned, you're using the Definitions To Remove on the End components, which leads me to believe you're using the embedded rule model.  If you use the embedded model instead, you only pass out the information needed outside the model, reducing the risk of passing out unneeded data.  Additionally, we have seen instances where data created inside an embedded rule model is lost when leaving the embedded rule model.  This only occurs in rare situations (maybe 1/1000), but we have captured it with our logging.  Just thought I'd pass this along!



  • 14.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 30, 2012 11:15 AM

    While I do use some embedded models, the Definitions to Remove are on the End components: the point where I exit the Dialog Workflow is where I strip the data.



  • 15.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 30, 2012 11:22 AM

    Thanks for the explanation AnaMan. In our workflows I do allow users to upload attachments, and I've often wondered where these attachments go after the workflow is completed, as I always assumed they would just get flushed out when the workflow ended. So are you saying we should remove the FileDataType variables before the workflow ends so that they don't get stored in "*.externaldata" in the database?



  • 16.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 31, 2012 04:33 AM

    Yes, we also had an issue with too-frequent checking of timeouts and escalations. As a solution we applied individual time slots for each IIS application, setting different values for the "minEscalationTime" parameter.

    But in our opinion the implementation of timeout checking has one weakness: "minEscalationTime" is read and applied only once the first round of timeout processing completes successfully. Until then, checking is done every 1 minute. After restarting a server with about 80 applications, each with hundreds of tasks to process, this causes a terrible mess - all applications try to process their own tasks at the same time.

    But I've never come across any parameter in the Workflow configs that allows controlling the timing at the millisecond level.



  • 17.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 31, 2012 05:09 AM

    Removing unnecessary data is a good tactic and it works, but it doesn't resolve the problem. What about the GlobalData and Properties collections, both of which are stored in the execution context? And what about the necessary data?

    Every split in a process makes a new copy of the data collection (containing all the variables still existing in the flow) and passes it to the new flow, but the old collections are still stored in the execution context, creating a kind of overhead. Any time the flow splits, this overhead increases. It's inevitable.

    I thought that maybe splitting the loops would reduce this overhead (e.g. instead of one loop creating 100 tasks, five parallel loops creating 20 tasks each), but the results I've got are ambiguous.

    I would say that massive parallelism can be a threat to the reliability and performance of the platform.

    I bet that the tasks your process generates over a hundred iterations of a loop easily reach tens of MB. These messages must be processed and cached in server memory, serialized and deserialized - it MUST be resource-consuming.



  • 18.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 31, 2012 05:25 AM

    We have also noticed the disappearance of data, not only when leaving embedded models but also when leaving a Dialog Workflow. Usually there were two reasons: operations on data structures without taking pass-by-reference into account, and broken threads (the execution context is stored as ThreadStatic data).



  • 19.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 31, 2012 06:22 AM

    When you remove the FileDataType variables, which are ExternalReference types by default, a special CleanUp routine is called that removes the messages from the queue.

    But this routine is also called when the process ends - regardless of whether it ends successfully or by exception.

    Even so, from time to time I observe orphaned messages in "*.externaldata" queues. I count them as orphaned when their MessagePostedDate is much older than any other message in the other queues with the same ServiceID in the QueueName.
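
    For reference, a check along these lines could surface candidates (an untested sketch, assuming SQL Server and that the ServiceID is the part of QueueName before the final dot):

     -- "*.externaldata" messages posted earlier than the oldest message in any
     -- other queue sharing the same service prefix - orphan candidates.
     with msgs as (
         select MessageId, QueueName, MessagePostedDate,
                LEFT(QueueName, LEN(QueueName) - CHARINDEX('.', REVERSE(QueueName))) as ServicePrefix
         from [Messages]
     )
     select e.MessageId, e.QueueName, e.MessagePostedDate
     from msgs as e
     where e.QueueName like '%.externaldata'
       and e.MessagePostedDate < (select MIN(m.MessagePostedDate)
                                  from msgs as m
                                  where m.ServicePrefix = e.ServicePrefix
                                    and m.QueueName not like '%.externaldata');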



  • 20.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Jan 31, 2012 04:16 PM

    So, just so that I understand: should the records in the Messages table be cleaned up after a workflow has finished, or are they just left in there? It seems that the records are just being left and the CleanUp routine is not removing them, which means that whatever data was used by the process is still in the table and accumulating.



  • 21.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Feb 01, 2012 04:12 AM

    In Workflow 7.x, with the corrected ClearWorkflow() method, all records in the Messages table belonging to the "*.tasks", "*.assignments", "*.processes" and "*.externaldata" queues should be cleaned up automatically at the end of the process. This should happen in both cases: a successful finish (including a caught and handled exception) and an unhandled exception that is still caught by the execution engine. But it remains possible for an exception to occur that the engine cannot catch, for instance a thread-abort exception. If such a broken process cannot be continued, it may leave some messages in the queues. So from time to time we check the queues in search of orphaned messages.

    If the process has the property "LBExchangeArchiveFlag" set to 'true', it will always leave a single message in the "*.archive" queue, whose type depends on whether the process ended successfully or exceptionally.
    If these messages are not used for reporting or any historical reference, they can be manually deleted later.

    Regardless of the cases above, it is possible that your database is fairly free of leftover unnecessary records and mainly suffers from valid but extremely large messages from processes with extensive parallelism. This is particularly inconvenient when processes take a long time to finish, e.g. when many tasks are waiting for users to service them. In that situation there may be nothing you can do to reduce the size of the exchange database. The growth of the execution context happens at runtime - messages remaining in the queues have no influence on this issue; what matters is only the overhead (the size of all the variables present, which are copied at each new split) and the number of flow splits. The growth of each successive exchange message is just a side effect of the growing runtime execution context.
    Some optimizations are possible, like removing all unnecessary variables before the parallel tasks, or splitting parallel loops into several independently merging branches, but these are more minor workarounds than a real solution.
     

    In your place, I would first gather queue statistics to learn what types of messages are stored in the exchange DB. Does it contain "*.archive" messages, and what is their total size? Next I would check the correctness of all messages in the exchange queues: whether every task has its corresponding assignment message, whether all of them belong to an existing process message, etc. Then I would gather statistics on the sizes of the messages and analyze their correlation with particular processes (applications), with particular instances of a process, with the total count of tasks in each instance, and with the date they were stored in the exchange; a starting query is sketched below. The statistics should show whether the true reason is parallelism in the processes and how long the large messages have been stored.
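
    As a starting point for the size statistics, a query like this should do (an untested sketch, assuming SQL Server, where DATALENGTH() gives the byte size of the serialized Message payload):

     -- Per-queue message counts and payload sizes, largest consumers first.
     select QueueName,
            COUNT(*) as MessageCount,
            MAX(DATALENGTH(Message)) as MaxBytes,
            SUM(CAST(DATALENGTH(Message) as bigint)) as TotalBytes
     from [Messages]
     group by QueueName
     order by TotalBytes desc;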



  • 22.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Feb 02, 2012 09:51 AM

    Don't happen to have your cleanup/search SQL script(s) handy, do you, AnaMan? One of the things I'm afraid of doing while cleaning this up is accidentally removing legit records from the tables. Sorry beamer7296 for hijacking this thread.



  • 23.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Feb 03, 2012 09:48 AM

    Yes, indeed, it is a risky operation, but... I have long experience in message analysis.
    Of course, I've accidentally cleaned up correct records several times :-)
    We're in a somewhat more comfortable situation because, as a default, we assign different databases to groups of processes, and even to a single complicated process. This way it is much easier to back up an exchange DB in case something goes wrong.

    We don't use Process Manager; we have our own portal for task management and we treat it as a kind of reference system. Usually I check the correctness of records in both systems to be sure I'm not removing any valid records. I very often prepare a report with a list of tasks so that the users they are assigned to can check their validity. If a task cannot be executed, I estimate whether it is a logic bug that can be corrected by modifying the model, or a data error (for instance a wrong value, or even the absence of a required variable) that can't be corrected. Even in the case of data errors, sometimes, if it is a really important task, I try to repair it by reading it from the queue, deserializing it in Visual Studio, correcting the broken data, serializing it, and storing it in the queue again. This is absolute handicraft, but it works :-)

    Processes with broken tasks can be deleted (aborted), or it may be enough to wait for their timeout. But sometimes waiting for the timeout is not a good idea - for instance, when the designer didn't handle it properly, or the task was designed to regenerate itself on timeout - then the instance must be deleted or it will stay active forever. Besides the space in the database, it is also an additional load on background processing - such tasks must also have their timeouts processed. During process design we put an emphasis on forcing tasks to end in a reasonable time, to avoid long-lasting processing.

    I'm curious - how old are the oldest messages from the different queue types in your exchange database? It's not especially secret information.

     select mstp.QueueType, COUNT(*) as MessageCount, MIN(mstp.MessagePostedDate) as OldestPosted
     from (
         select ms.*,
                RIGHT(ms.QueuePhrase, LEN(ms.QueuePhrase) - CHARINDEX('.', ms.QueuePhrase) + 1) as QueueType
         from (select MessageId, QueueName, MessagePostedDate,
                      RIGHT(QueueName, 20) as QueuePhrase
               from [Messages]) as ms
     ) as mstp
     group by mstp.QueueType
     order by MessageCount desc;


  • 24.  RE: LogicBase.ServerExtensions.exe Process Using Tons of Memory

    Posted Feb 08, 2012 09:36 AM

    Thanks, I'll take a look at this when I get a chance and hopefully I'll be able to sort it out.