StatefulKnowledgeSession and multi-threaded processing

Skiddlebop
Greetings all!

My apologies in advance for dropping this problem warhead...

I have a business requirement that dictates the use of a single long-running StatefulKnowledgeSession for partial inference among separate requests. This system must scale to handle a peak load of 22 events per second; normal (average) load will be 1 event per second. Currently, there is no plan to distribute this processing; it will all be handled by a single instance. There will be at most 3GB of facts in the Rete network at any given time. Are there any statistics on Drools performance that could inform us on its feasibility to meet such load against such strict requirements? Additionally, is there any way to parallelize processing with a single StatefulKnowledgeSession? I'd sincerely appreciate any help in this regard. Please let me know if more information is needed.
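For a sense of scale, here is a small sketch of the time budget these rates imply if events are handled strictly one at a time. The class and method names are made up for illustration, and the figure is only an average budget, not a measured Drools number.

```java
// Illustrative helper (hypothetical name): average time available per event
// when events are processed strictly sequentially.
final class LoadBudget {
    static double millisPerEvent(double eventsPerSecond) {
        return 1000.0 / eventsPerSecond;
    }
}
```

At the stated peak of 22 events per second this comes to roughly 45 ms per event, and at the average of 1 event per second it is a full second.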

With Humble Gratitude and Humility,
Lucas

Re: StatefulKnowledgeSession and multi-threaded processing

lhorton
You might like to view this video, posted by Mauricio (Salaboy) this year, from their Drools workshop in Argentina.  It's about a real (production) implementation of a very high volume, high performance Drools-based system:  

http://vimeo.com/27209589

Re: [rules-users] StatefulKnowledgeSession and multi-threaded processing

salaboy
That's not me; his name is Alexandre Porcelli. He is also a
community member. Really good presentation.

On Tue, Dec 20, 2011 at 2:46 PM, lhorton <[hidden email]> wrote:

> You might like to view this video, posted by Mauricio (Salaboy) this year,
> from their Drools workshop in Argentina.  It's about a real (production)
> implementation of a very high volume, high performance Drools-based system:
>
> http://vimeo.com/27209589

--
 - CTO @ http://www.plugtree.com
 - MyJourney @ http://salaboy.wordpress.com
 - Co-Founder @ http://www.jugargentina.org
 - Co-Founder @ http://www.jbug.com.ar

 - Salatino "Salaboy" Mauricio -

_______________________________________________
rules-users mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/rules-users

Re: [rules-users] StatefulKnowledgeSession and multi-threaded processing

Greg Barton
CONTENTS DELETED
The author has deleted this message.

Re: [rules-users] StatefulKnowledgeSession and multi-threaded processing

salaboy
Sure, but it's common sense to find the right tool for each particular problem. From my perspective, he did that perfectly.

On 20/12/2011, at 22:52, Greg Barton <[hidden email]> wrote:

> Absolutely.  Anyone who wants to build a high performance rules system should watch it.

Re: [rules-users] StatefulKnowledgeSession and multi-threaded processing

Greg Barton
CONTENTS DELETED
The author has deleted this message.

Re: [rules-users] StatefulKnowledgeSession and multi-threaded processing

salaboy
You are completely right! And he is also good at communicating
complicated topics :)

On Tue, Dec 20, 2011 at 11:19 PM, Greg Barton <[hidden email]> wrote:

> Common sense doesn't become common until it's communicated. :)

Re: [rules-users] StatefulKnowledgeSession and multi-threaded processing

Skiddlebop
Thanks for the link :) 'Twas an interesting view! A few complexities still remain. I may be ill-informed about this matter so please correct me if I'm totally off-base in any of these statements.

Alexandre states that the KnowledgeSession is thread-safe, which is contrary to the documentation (if he is referring to a StatefulKnowledgeSession). However, it seems like he is actually using a StatelessKnowledgeSession and pulling his facts from Redis. Our requirements are different: we are dealing with real-time data, not historical data, and this real-time data will all be stored in a long-running StatefulKnowledgeSession. So every fact in that StatefulKnowledgeSession will be an active real-time representation of an object's state. However, we do share a similar approach in that we are also "going async" using queues. Our choice of a queue is based on a variety of considerations. The most relevant one for this conversation is that, because the StatefulKnowledgeSession is not thread-safe, we anticipate processing one record at a time (one call to fireAllRules() per event). Does this seem like a reasonable approach?
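To make the one-event-per-fireAllRules() idea concrete, here is a minimal plain-Java sketch. It is not Drools code: RuleSession is a hypothetical stand-in that models only the insert() and fireAllRules() calls mentioned in this thread, and CountingSession, SingleConsumer and process() are illustrative names. The point is that a single consumer thread owns the session, so producers only ever touch the queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the two session calls discussed in this thread.
interface RuleSession {
    void insert(Object fact);
    int fireAllRules();
}

// Toy session: records what was inserted and counts fireAllRules() calls.
class CountingSession implements RuleSession {
    final List<Object> inserted = new ArrayList<>();
    int fireCalls = 0;
    public void insert(Object fact) { inserted.add(fact); }
    public int fireAllRules() { fireCalls++; return 0; }
}

// Drains the queue on ONE thread: insert one event, fire, repeat.
class SingleConsumer implements Runnable {
    static final Object STOP = new Object(); // sentinel that ends the loop
    private final BlockingQueue<Object> queue;
    private final RuleSession session;

    SingleConsumer(BlockingQueue<Object> queue, RuleSession session) {
        this.queue = queue;
        this.session = session;
    }

    public void run() {
        try {
            for (Object event = queue.take(); event != STOP; event = queue.take()) {
                session.insert(event);
                session.fireAllRules(); // one call per event
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Convenience for trying it out: feeds events through a queue and waits for the consumer.
    static CountingSession process(List<?> events) {
        BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
        CountingSession session = new CountingSession();
        Thread consumer = new Thread(new SingleConsumer(queue, session));
        consumer.start();
        queue.addAll(events);
        queue.add(STOP);
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return session;
    }
}
```

In a real service the REST handlers would play the producer role and only enqueue events, while the consumer thread would live as long as the session does.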

I have considered inserting multiple events (as facts) per call to fireAllRules(), but I am uncertain how conflict resolution would work if, for instance, two of those facts correspond to the same field of the same object. Here's a pseudo-code example outlining my concern:

Key assumption: It is NOT possible to concurrently access a StatefulKnowledgeSession from multiple threads or separate parallel REST service calls.
 
// Assume the KnowledgeBase has already been built and sks is our long-running
// global singleton StatefulKnowledgeSession, which has continuity between
// individual REST web-service sessions.

// Begin pseudo-code snippet.

// objectA with data member "Integer A", value 1
sks.insert(objectA);
sks.insert(objectB);
sks.insert(objectC);
// objectA with data member "Integer A", value 2 (this is an update to the same
// object, which has a higher index in the chronologically sorted queue)
sks.insert(objectA);
sks.fireAllRules();

// End pseudo-code snippet.

As you can see, we enter multiple facts into the session for a single call to fireAllRules(), and objectA presents a conflict because there are two separate events relating to an update of the same object (objectA) and the same field (integerA).

Does this seem like a reasonable or feasible approach? What would happen in this scenario? Should we just try to work around it by pre-determining such conflicts?
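If we did go the route of pre-determining such conflicts, one possible sketch (plain Java, nothing Drools-specific) is to coalesce a chronologically ordered batch so that only the latest update per object and field survives. Update, Coalescer and the field names are hypothetical, and this only makes sense if the rules do not need to see the intermediate values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical update record: which object and field changed, and the new value.
final class Update {
    final String objectId;
    final String field;
    final int value;

    Update(String objectId, String field, int value) {
        this.objectId = objectId;
        this.field = field;
        this.value = value;
    }
}

final class Coalescer {
    // Keeps only the last update per (objectId, field).
    // The input list must already be in chronological order.
    static List<Update> latestPerField(List<Update> chronological) {
        Map<String, Update> latest = new LinkedHashMap<>();
        for (Update u : chronological) {
            String key = u.objectId + "#" + u.field;
            latest.remove(key); // re-insert so the survivor keeps its later position
            latest.put(key, u);
        }
        return new ArrayList<>(latest.values());
    }
}
```

Applied to the batch in the snippet above (objectA = 1, objectB, objectC, objectA = 2), this would leave objectB, objectC and the second objectA update, in that order.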

Which of the two is advised: a single event per SKS fireAllRules() call, or multiple events per fireAllRules() call? Again, performance is important, but we must temper such desires with our considerations of data integrity :)

I REALLY REALLY Appreciate the help you've provided thus far.

Thanks for all the shoes,
Lucas (Skiddlebop)

Re: [rules-users] StatefulKnowledgeSession and multi-threaded processing

Richard Calmbach
When you are dealing with events, the processing works a bit differently. First, we have to clarify: Are you inserting events or plain facts? If it has a timestamp, it has to be handled as an event. The distinction matters because in an event-driven scenario, only events would be inserted "from the outside" (i.e., from outside the session), and then plain facts (representing the system state) get inserted/updated/deleted in response to these incoming events (via rule consequences). The only time you *might* insert plain facts "from the outside" is at the beginning, before the start of event processing, to define an initial state of the system. (If anyone can think of a good reason to insert plain facts *during* event processing other than in response to events, please let us know - always willing to learn something new :-).)

In your example, rather than inserting an object (that represents system state) again with modified state, you would insert a "StateChangeEvent" into the session, and a rule would update the object (fact) in response to this event - that's what is meant by event-driven processing or event-driven architecture. Think of your system state as getting updated in response to external events, and the rule engine is in charge of performing the updates.
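As a rough illustration of that shape, here is a plain-Java sketch. In Drools the update would live in a rule consequence; here a small StateUpdater class plays that role so the example stays self-contained. TrackedObject, StateChangeEvent's fields and StateUpdater are all hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Plain fact representing current system state (hypothetical shape).
final class TrackedObject {
    final String id;
    int integerA;

    TrackedObject(String id, int integerA) {
        this.id = id;
        this.integerA = integerA;
    }
}

// Immutable event: every state change is a separate, new object with its own timestamp.
final class StateChangeEvent {
    final String targetId;
    final int newIntegerA;
    final long timestampMillis;

    StateChangeEvent(String targetId, int newIntegerA, long timestampMillis) {
        this.targetId = targetId;
        this.newIntegerA = newIntegerA;
        this.timestampMillis = timestampMillis;
    }
}

// Stands in for the rule consequence: the only place where state facts are modified.
final class StateUpdater {
    private final Map<String, TrackedObject> facts = new HashMap<>();

    void register(TrackedObject fact) {
        facts.put(fact.id, fact);
    }

    void onEvent(StateChangeEvent event) {
        TrackedObject target = facts.get(event.targetId);
        if (target != null) {
            target.integerA = event.newIntegerA;
        }
    }
}
```

The caller never re-inserts objectA; it only creates a new StateChangeEvent for each change, so two changes to the same field are simply two distinct events.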

So, if your objects (object[ABC]) are events, then the situation of inserting the same event (as in reference identity: ref1 == ref2) twice doesn't make sense. Honestly, I don't know what the rule engine would do in this case, but I'd say you have to consider the behavior undefined. Each state change should be represented by a separate, new event.

I strongly recommend calling fireAllRules() after every single event insertion (event != fact, event is a special kind of fact). It may be possible to do batched inserts of events, but the overhead of calling fireAllRules() after every single event insertion should be minimal. Keep in mind, either way the rule engine has to process each event in turn in order to maintain the event-driven semantics. In an event-driven setting, you would not find yourself doing batch inserts like this, anyway. Events are processed as they arrive. If you have multiple event streams and inevitable delays due to messaging, then it is on you as the rule author to ensure that your rules don't break due to unpredictable event ordering. I also recommend synchronizing on the session in case you are inserting events from multiple threads (I have empirical evidence that the StatefulKnowledgeSession is *not* thread-safe). In a single-event-stream scenario, where events arrive in chronological order (and even if not), you want to insert and fire immediately upon event arrival (and, yes, events could come from an asynchronous messaging queue).
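A minimal sketch of what I mean by synchronizing on the session, in plain Java. ToySession is a deliberately non-thread-safe stand-in (hypothetical, not the Drools class), and SynchronizedFeeder and feedConcurrently() are illustrative names. Insert and fire happen as one step while holding the session's monitor.

```java
// Deliberately NOT thread-safe toy session (hypothetical stand-in).
class ToySession {
    private int factCount = 0;
    private int fireCalls = 0;

    void insert(Object event) { factCount++; }
    int fireAllRules() { fireCalls++; return 0; }
    int factCount() { return factCount; }
    int fireCalls() { return fireCalls; }
}

class SynchronizedFeeder {
    private final ToySession session;

    SynchronizedFeeder(ToySession session) {
        this.session = session;
    }

    // Insert and fire as one atomic step, holding the session's monitor.
    void onEvent(Object event) {
        synchronized (session) {
            session.insert(event);
            session.fireAllRules();
        }
    }

    // Feeds threads * perThread events from several threads and waits for all of them.
    static ToySession feedConcurrently(int threads, int perThread) {
        ToySession session = new ToySession();
        SynchronizedFeeder feeder = new SynchronizedFeeder(session);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    feeder.onEvent("event");
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return session;
    }
}
```

Without the synchronized block, the unsynchronized counters in ToySession could lose updates under contention, which is the toy version of the corruption one risks with a real session.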

At least this approach has worked well for me. Happy to hear feedback from others.

-Richard


Re: [rules-users] StatefulKnowledgeSession and multi-threaded processing

laune
Where is it documented that a StatefulKnowledgeSession is not thread-safe? (It is sad that the string "thread" doesn't occur more than once in the "Expert" manual, disregarding case.)

Batch insertion of facts that contain semantically ordered data, followed by a single fireAllRules(), is not advisable. Conflict resolution may actually reverse the processing order relative to the insert order. (It's defined, but I wouldn't rely on it.)

It is not true that the Engine's behaviour w.r.t. insertion of "identequal" objects is undefined. Search the "Expert" manual for "assertion mode".
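To illustrate the distinction in plain Java (a model only, not the engine's implementation): assume one mode recognizes an already-inserted object by reference and the other by equals()/hashCode(). Reading and WorkingMemoryModel are hypothetical names, and the integer "handles" merely stand in for fact handles.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical fact with value-based equality.
final class Reading {
    final String sensor;
    final int value;

    Reading(String sensor, int value) {
        this.sensor = sensor;
        this.value = value;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Reading)) return false;
        Reading r = (Reading) other;
        return sensor.equals(r.sensor) && value == r.value;
    }

    @Override
    public int hashCode() {
        return Objects.hash(sensor, value);
    }
}

// Toy model of a working memory under the two modes.
final class WorkingMemoryModel {
    private final Map<Object, Integer> handles;
    private int nextHandle = 1;

    WorkingMemoryModel(boolean equalityMode) {
        if (equalityMode) {
            handles = new HashMap<>();         // compares with equals()/hashCode()
        } else {
            handles = new IdentityHashMap<>(); // compares by reference (==)
        }
    }

    // Returns the existing handle if the fact is already known under the current mode.
    int insert(Object fact) {
        Integer existing = handles.get(fact);
        if (existing != null) {
            return existing;
        }
        int handle = nextHandle++;
        handles.put(fact, handle);
        return handle;
    }

    int size() {
        return handles.size();
    }
}
```

In this model, inserting the same reference twice yields the same handle in both modes, while two distinct but equal objects yield two handles in identity mode and one in equality mode.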

There are excellent reasons for inserting facts in addition to events. One Application Design Pattern I present in my lecture is based on what I call the "Factory Floor" model. Imagine several gadgets of different kinds in your factory, where you process events, inquire about state and update the configuration. With the session running in fireUntilHalt(), you not only insert events happening with the gadgets, but also facts such as ReportRequest or Gadget, which aren't events. (Of course, just about anything can be made into an event, if you really want to.)

Running fireAllRules() after each insertion of events arriving over time might also be one way to go, but notice that the engine will not react to timeouts while it is idling.

-W






Re: [rules-users] StatefulKnowledgeSession and multi-threaded processing

kkamalkumar
This is a really interesting thing to discuss. We worked on similar problems, but in a different way. We need to understand stateful and stateless sessions properly. Once we adopted Drools, things became simple, and we could solve most of our problems using the stateless model. I believe the system should be lightweight in order to scale, so I recommend not dumping all the data into the run-time memory (the stateful session here). A stateful session is meant to maintain state per session for user interaction: instead of sending all the data back and forth between the UI and the rules engine, we can store the state in the session. When dealing with larger data sets, if we dump all the data into memory and the data keeps growing, the system will run into performance issues. We followed techniques similar to those mentioned in the video. We indexed all the data and kept it in a cache or Lucene (in some cases, for faster search). Rules were divided into two types: 1. data lookup rules and 2. logical rules. Data lookup rules give the criteria for fetching the data the rules need to process. Logical rules provide the result criteria.

The way I understand it, any rule engine is meant to do two things: 1. rule execution (giving the result for the given inputs of the rule), and 2. providing the result data (constraining, or data filtering). When we deal with larger data sets, constraining becomes very time-consuming, so I took constraining out of the rule engine and used a third-party tool to search based on the rule execution criteria. The advantages of this approach are: 1. You do not need to move data from the data source into run-time memory (a change of format). 2. Rule execution is simple and fast (excellent performance). 3. Scalability (the system need not struggle with filtering huge amounts of in-memory data).
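A rough plain-Java sketch of that split, under simplifying assumptions: a HashMap stands in for the external cache or search index, and a Predicate stands in for a logical rule. Account, LookupThenDecide and the region and balance fields are hypothetical and only serve the illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical record kept outside the rule engine.
final class Account {
    final String id;
    final String region;
    final double balance;

    Account(String id, String region, double balance) {
        this.id = id;
        this.region = region;
        this.balance = balance;
    }
}

final class LookupThenDecide {
    // Stand-in for the external index (cache, Lucene, ...), keyed by region.
    private final Map<String, List<Account>> byRegion = new HashMap<>();

    void index(Account account) {
        byRegion.computeIfAbsent(account.region, k -> new ArrayList<>()).add(account);
    }

    // "Data lookup" step: fetch only the candidates the decision needs.
    List<Account> lookup(String region) {
        return byRegion.getOrDefault(region, Collections.emptyList());
    }

    // "Logical" step: apply the decision criteria to the small candidate set.
    static List<Account> decide(List<Account> candidates, Predicate<Account> rule) {
        List<Account> matches = new ArrayList<>();
        for (Account candidate : candidates) {
            if (rule.test(candidate)) {
                matches.add(candidate);
            }
        }
        return matches;
    }
}
```

Only the handful of candidates returned by lookup() would ever need to reach the rule engine, which is where the memory and filtering savings come from.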


Re: [rules-users] StatefulKnowledgeSession and multi-threaded processing

pprasad
In reply to this post by salaboy
Hi,

Is it possible to insert a Redis hash map as-is as a fact into the rule engine? I am using Jedis from Java to access Redis, together with the Drools rule engine.
Also, is there a way to access a Redis hash map from a DRL file?

Thanks,
Pradeep

Re: [rules-users] StatefulKnowledgeSession and multi-threaded processing

kkamalkumar
As far as I know, you can do it. The only thing is that you will access it in the DRL through Java code; I don't think that will be an issue. But what is the reason for sending a hash map as a fact? You should have a different way of sending facts. In my observation, no rule needs that much data to make a decision; it should take name-value parameters to make the decision.
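One way to read the name-value suggestion, sketched in plain Java: wrap the fields of a single hash in a small typed fact. This assumes the hash has already been read into a Map<String, String> (which is what Jedis's hgetAll returns); AccountFact and its accessors are hypothetical names, and no Redis or Drools calls appear in the sketch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical typed fact wrapping the name-value pairs of one Redis hash.
final class AccountFact {
    private final String key;
    private final Map<String, String> fields;

    AccountFact(String key, Map<String, String> fields) {
        this.key = key;
        // Defensive copy so later changes to the source map do not alter the fact.
        this.fields = Collections.unmodifiableMap(new HashMap<>(fields));
    }

    String getKey() {
        return key;
    }

    // Name-value access for the decision logic.
    String get(String field) {
        return fields.get(field);
    }

    Map<String, String> getFields() {
        return fields;
    }
}
```

A fact like this carries one record's parameters into a decision without pulling the whole data set into the session.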

Hope this helps.

thanks
kamal.



RE: [rules-users] StatefulKnowledgeSession and multi-threaded processing

pprasad
Thanks for your prompt reply. I am working on a fraud system for a bank, and they have around 3 million records.

All my data is loaded into Redis, and I can access the keys using Jedis. How do I insert a Redis hash map as a fact in one go? I do not want to iterate over every key and insert each fact individually.

Is there a way to get an instance of the Redis hash map in Java and insert that hash map as a fact?

 
Thanks and Regards
Pradeep Prasad




Re: [rules-users] StatefulKnowledgeSession and multi-threaded processing

Stephen Masters
CONTENTS DELETED
The author has deleted this message.