Snapshot of drools engine's state

Snapshot of drools engine's state

IK81
Hi,

I am looking for a way to persist the engine's state at regular intervals. For performance reasons, I do not want to persist the state on every event insertion. Instead, I am thinking of taking a snapshot of the engine's state every X seconds, say. The events themselves are always stored in a database. After a crash or reboot I'd like to recover the engine's state by loading the snapshot and reinserting the events that happened after the snapshot's timestamp.
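
A minimal sketch of the snapshot-plus-replay idea, in plain Java with no Drools dependency. The classes `Event`, `State`, and `Snapshot` and the method `recover` are hypothetical stand-ins: the "engine state" here is just a counter per event type, and it is assumed that a snapshot covers every event up to and including its timestamp.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-ins for illustration; none of these types are Drools API. */
final class SnapshotReplaySketch {

    /** An event as stored in the database: a type and its timestamp in milliseconds. */
    static final class Event {
        final String type;
        final long timestamp;
        Event(String type, long timestamp) { this.type = type; this.timestamp = timestamp; }
    }

    /** Toy "engine state": the number of events seen per type. */
    static final class State {
        final Map<String, Integer> counts = new HashMap<>();
        void apply(Event e) { counts.merge(e.type, 1, Integer::sum); }
        State copy() {
            State s = new State();
            s.counts.putAll(counts);
            return s;
        }
    }

    /** A copy of the state plus the timestamp of the last event it includes. */
    static final class Snapshot {
        final State state;
        final long upTo;
        Snapshot(State state, long upTo) { this.state = state; this.upTo = upTo; }
    }

    /** Restores the snapshot, then replays only the events that happened after it. */
    static State recover(Snapshot snapshot, List<Event> eventLog) {
        State state = snapshot.state.copy();
        for (Event e : eventLog) {
            // Assumption: the snapshot already covers every event with timestamp <= upTo.
            if (e.timestamp > snapshot.upTo) {
                state.apply(e);
            }
        }
        return state;
    }

    public static void main(String[] args) {
        List<Event> log = new ArrayList<>();
        State live = new State();
        Snapshot snapshot = null;
        long[] times = {100, 200, 300, 400};
        String[] types = {"A", "B", "A", "A"};
        for (int i = 0; i < times.length; i++) {
            Event e = new Event(types[i], times[i]);
            log.add(e);     // events are always persisted
            live.apply(e);  // and applied to the running state
            if (i == 1) {
                snapshot = new Snapshot(live.copy(), e.timestamp); // periodic snapshot
            }
        }
        State recovered = recover(snapshot, log); // simulate a restart
        System.out.println(recovered.counts.equals(live.counts)); // prints true
    }
}
```

If several events can share one timestamp, a strict `>` comparison is not enough to decide what the snapshot already contains; a per-event sequence number would be a safer cut-off in that case.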

Are there any hints or caveats regarding this approach?

Best regards,
Ingo

Re: [rules-users] Snapshot of drools engine's state

salaboy
IMO that's a very difficult problem to solve, because if you want to do it every X seconds you can be caught in the middle of an execution. If you delay the snapshot creation until the execution ends, you will probably be caught by the next snapshot creation.
Depending on what you are doing within your rules, you can always recreate the session from scratch. You can also use rules to summarize state and persist those summaries in your external database, which can save you some time.

HTH


--
View this message in context: http://drools.46999.n3.nabble.com/Snapshot-of-drools-engine-s-state-tp4025457.html
Sent from the Drools: User forum mailing list archive at Nabble.com.
_______________________________________________
rules-users mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/rules-users



--
 - MyJourney @ http://salaboy.com
 - Co-Founder @ http://www.jugargentina.org
 - Co-Founder @ http://www.jbug.com.ar
 
 - Salatino "Salaboy" Mauricio -


Re: [rules-users] Snapshot of drools engine's state

laune
Saving every so many, and synced with, insertions might be an
alternative, perhaps with an upper limit on the frequency. This limits
recovery time as well as time spent on creating the snapshots.

-W


Re: [rules-users] Snapshot of drools engine's state

IK81
I don't think that the synchronization is a problem. I don't want to use a dedicated thread that runs the fireUntilHalt method but rather trigger the engine externally. That way I can decide whether it's time to make a new snapshot, and I avoid finding the engine in an inconsistent state. I also plan to provide the clock from an external source instead of relying on the system time, to achieve deterministic behavior when replaying the events.

What exactly do you mean by "Saving every so many, and synced with, insertions might be an alternative"?

Ingo

Re: [rules-users] Snapshot of drools engine's state

laune
On 12/08/2013, IK81 <[hidden email]> wrote:

> What do you exactly mean with "Saving every so many, and synced with,
> insertions might be an alternative,"?

Save every N insertions. They may not be spaced evenly over time, so
saving more than once in a lull is useless.
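
A minimal sketch of such a policy, in plain Java: snapshot every N insertions, but never more often than a minimum interval (the "upper limit on the frequency" mentioned earlier). The class `SnapshotPolicy` and its parameters are hypothetical, and how insertions are observed in a real session is left open.

```java
/** Hypothetical helper, not part of Drools: decides when a snapshot is due. */
final class SnapshotPolicy {
    private final int everyNInsertions;
    private final long minIntervalMillis;
    private int insertionsSinceSnapshot = 0;
    private long lastSnapshotAt = 0;
    private boolean hasSnapshot = false;

    SnapshotPolicy(int everyNInsertions, long minIntervalMillis) {
        this.everyNInsertions = everyNInsertions;
        this.minIntervalMillis = minIntervalMillis;
    }

    /** Call once per insertion; returns true if the caller should take a snapshot now. */
    boolean onInsert(long nowMillis) {
        insertionsSinceSnapshot++;
        boolean enoughInsertions = insertionsSinceSnapshot >= everyNInsertions;
        boolean intervalElapsed = !hasSnapshot
                || nowMillis - lastSnapshotAt >= minIntervalMillis;
        if (enoughInsertions && intervalElapsed) {
            insertionsSinceSnapshot = 0;
            lastSnapshotAt = nowMillis;
            hasSnapshot = true;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        SnapshotPolicy policy = new SnapshotPolicy(3, 1_000);
        long[] insertTimes = {0, 10, 20, 30, 40, 50, 2_000};
        for (long t : insertTimes) {
            System.out.println(t + " -> " + policy.onInsert(t));
        }
    }
}
```

Because the decision is only evaluated on insertions, nothing is saved during a lull, and a burst of insertions cannot trigger snapshots more often than the minimum interval allows.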

-W


Re: [rules-users] Snapshot of drools engine's state

IK81
I understand. But in my case the state might change even if I do not insert new events, because I have rules that fire if I have event A but no event B within X seconds after A. For that purpose I'll use a timer that advances the clock and calls fireAllRules without any insertion.
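
A minimal sketch of this "A but no B within X" pattern with an externally advanced clock, in plain Java rather than as a Drools rule. The class `AbsenceTracker` is hypothetical, and it assumes that A and B events are matched by a correlation key, which the thread does not specify.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical tracker, not Drools API: reports A events that saw no B in time. */
final class AbsenceTracker {
    private final long timeoutMillis;
    // correlation key -> deadline by which the matching B must have occurred
    private final Map<String, Long> deadlines = new LinkedHashMap<>();

    AbsenceTracker(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }

    void onA(String key, long timestamp) {
        deadlines.put(key, timestamp + timeoutMillis);
    }

    void onB(String key, long timestamp) {
        Long deadline = deadlines.get(key);
        if (deadline != null && timestamp <= deadline) {
            deadlines.remove(key); // B arrived in time, nothing to report
        }
    }

    /** Called by the external timer; returns keys whose deadline lies before "now". */
    List<String> advanceTo(long now) {
        List<String> fired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = deadlines.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() < now) {
                fired.add(entry.getKey());
                it.remove();
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        AbsenceTracker tracker = new AbsenceTracker(180_000); // 3 minutes
        tracker.onA("a1", 0);
        tracker.onA("a2", 1_000);
        tracker.onB("a2", 5_000);
        System.out.println(tracker.advanceTo(60_000));  // prints []
        System.out.println(tracker.advanceTo(200_000)); // prints [a1]
    }
}
```

The result depends only on the timestamps passed in, never on the system time, so feeding the same sequence of calls again produces the same firings.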

Is my assumption right that the engine will behave deterministically when I replay the events on my snapshot using an external (pseudo?) clock? I've tested it and from my observations it seemed to be deterministic, but I haven't looked at the engine's internals so far. I plan to persist my events with the timestamps of their occurrence and the timestamp when I insert them into the engine. So will I always end up with the same propagation numbers etc.?

Ingo

Re: [rules-users] Snapshot of drools engine's state

laune
On 12/08/2013, IK81 <[hidden email]> wrote:
> I understand. But in my case the state might change even if I do not
> insert new events because I have rules that fire if I have event A but
> no event B within X seconds after A. For that purpose I'll use a timer
> that advances the clock and calls the fireAllRules without any insertion.

This does not preclude saving based on event insertion count as you
can monitor inserts by listening.

But this timer introduces real time. How do you propose to replay this
when you restart your session using saved data?

>
> Is my assumption right that the engine will behave deterministic when I
> replay the events on my snapshot using an external (pseudo?) clock. I've
> tested it and from my observations it seemed to be deterministic - but I
> haven't looked at the engine's internals so far.

I think so. But then your engine will continue with a pseudo-clock.

> I plan to persist my events with the timestamps of their occurrence and
> the timestamp when I insert them into the engine.
> So will I always end up with the same propagation numbers etc.?

Only one timestamp is used as the one on which rule evaluations are based.

I may have missed something, but I don't think you can build a system
that provides continuous real-time operation for event processing by
using backup and recovery.

-W


Re: [rules-users] Snapshot of drools engine's state

IK81
I know that this timer introduces real time. I also plan to persist the timestamps at which I make these external, timer-based fireAllRules invocations. The problem is that I have to support rules that reason over the non-occurrence of events, so they have to be evaluated at times other than event insertions. I don't know how I could realize this otherwise. I'd love to have a kind of callback where the engine tells me: wake me up in X seconds so that I can re-evaluate my rules, because then my not-after-N-timeunits rules can fire.
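
One way such a "wake me up" hint could be approximated outside the engine, sketched in plain Java. This assumes the application itself knows the timeout of every absence rule and registers a deadline whenever a triggering event is inserted; the class `WakeUpPlanner` is hypothetical and not part of Drools.

```java
import java.util.OptionalLong;
import java.util.TreeMap;

/** Hypothetical planner, not Drools API: tracks when the engine must be triggered next. */
final class WakeUpPlanner {
    // deadline -> number of pending timeouts that expire at that instant
    private final TreeMap<Long, Integer> pending = new TreeMap<>();

    /** Register a deadline, e.g. timestamp of A plus the rule's timeout. */
    void schedule(long deadlineMillis) {
        pending.merge(deadlineMillis, 1, Integer::sum);
    }

    /** Remove one registration for this deadline, e.g. because B arrived in time. */
    void cancel(long deadlineMillis) {
        pending.computeIfPresent(deadlineMillis, (k, n) -> n > 1 ? n - 1 : null);
    }

    /** Earliest time at which the clock should be advanced and the rules fired again. */
    OptionalLong nextWakeUp() {
        return pending.isEmpty()
                ? OptionalLong.empty()
                : OptionalLong.of(pending.firstKey());
    }

    /** Drop every deadline at or before "now"; returns how many were dropped. */
    int expireUpTo(long nowMillis) {
        int expired = 0;
        while (!pending.isEmpty() && pending.firstKey() <= nowMillis) {
            expired += pending.pollFirstEntry().getValue();
        }
        return expired;
    }

    public static void main(String[] args) {
        WakeUpPlanner planner = new WakeUpPlanner();
        planner.schedule(180_000);
        planner.schedule(60_000);
        System.out.println(planner.nextWakeUp()); // prints OptionalLong[60000]
    }
}
```

An external scheduler could sleep until `nextWakeUp()` instead of polling at a fixed rate, and the wake-up times it acts on can be persisted alongside the events for replay.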

The reason for this approach is that I have to ensure that I process every event while tolerating restarts of my application server. Say I have a rule that fires if there is an event A but no event B within 3 minutes, and I insert event A. If the application server is rebooted in the meantime and I start with a clean session, this rule would never fire. Unfortunately, I cannot afford to persist the complete state after every event insertion for performance reasons. However, the events have to be persisted anyway. Therefore I came up with this proposal to persist the state only every N seconds or insertions as a kind of snapshot. On recovery after a reboot (or transfer of the engine to another cluster node) I load the snapshot and reinsert the events to get to the same state. However, this was just my idea of how to solve this problem. If you could give me a pointer to a better solution I'd be very thankful.

Thanks for the hint about the timestamp. But isn't the clock of the session also relevant when inserting new events (which have their own timestamp provided in the event's POJO)? That is why I thought it necessary to store both the event's timestamp (which is assigned by an external event source) and the timestamp at which I put it into the engine.

Ingo

Re: [rules-users] Snapshot of drools engine's state

laune
The timestamp that is the basis for all calculations done by the engine can be
- a regular field of the class, named in a @timestamp(fieldname) annotation
- a handle field maintained by the engine, set to the session clock
time during insertion (and not readily available to the app)

If you need to keep the one and only timestamp you'll have to use a
regular field and set it to whatever. If you need to "replay" a part
of an aborted session, you'll need to have a pseudo-clock. During the
initial replay phase, the pseudo-clock is driven by the timestamps in
the saved events. Then, you may have to insert some delayed events
that accumulate during the time between a crash and the end of the
replay phase, with the clock still driven by the timestamps in the
events. (Note that this implies that the timestamps ought to reflect
the time of the event's arrival in the system.)
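
A minimal sketch of the replay phase described above, in plain Java. `PseudoClock` here is a hypothetical stand-in that jumps to an absolute time; with Drools' own pseudo clock the equivalent step would presumably be to advance by the difference between the event's timestamp and the current clock time. The actual insertion and rule firing are only marked by a comment.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-ins for illustration; none of these types are Drools API. */
final class ReplaySketch {

    /** A manually advanced clock: time is frozen until advanceTo is called. */
    static final class PseudoClock {
        private long now;
        PseudoClock(long start) { this.now = start; }
        long now() { return now; }
        void advanceTo(long time) {
            if (time > now) {
                now = time; // never move backwards
            }
        }
    }

    /** A saved event with the timestamp of its arrival in the system. */
    static final class Event {
        final String name;
        final long timestamp;
        Event(String name, long timestamp) { this.name = name; this.timestamp = timestamp; }
    }

    /** Replays saved events in timestamp order, letting each event drive the clock. */
    static List<String> replay(PseudoClock clock, List<Event> saved) {
        List<Event> ordered = new ArrayList<>(saved);
        // List.sort is stable, so events with equal timestamps keep their saved order.
        ordered.sort(Comparator.comparingLong((Event e) -> e.timestamp));
        List<String> trace = new ArrayList<>();
        for (Event e : ordered) {
            clock.advanceTo(e.timestamp);
            // A real application would insert the event and fire the rules here.
            trace.add(e.name + "@" + clock.now());
        }
        return trace;
    }

    public static void main(String[] args) {
        PseudoClock clock = new PseudoClock(1_000); // time of the snapshot
        List<Event> saved = new ArrayList<>();
        saved.add(new Event("B", 3_000));
        saved.add(new Event("A", 2_000));
        System.out.println(replay(clock, saved)); // prints [A@2000, B@3000]
    }
}
```

Events that arrive while the replay is still running can be appended to the same list and processed the same way before the application switches back to live operation.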

I have found that sliding windows are a bit tricky when the session is
run with a pseudo-clock. Time is frozen between bumps of the clock and
so a window ends when the engine is called again after advancing the
clock. This means that a rule based on a window will not fire at the
"true" end of the window but at the time to which the clock was
advanced. This will still be correct from a logical point of view but
the consequence will execute with a delay. Therefore it may not be
possible to use windows but you should be able to work around that.
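
The size of that delay can be estimated with a little arithmetic, sketched here in plain Java. It assumes the clock is bumped at a fixed interval starting from a known time; the class `TickDelay` and its methods are hypothetical helpers, not Drools API.

```java
/** Hypothetical helper for illustration; not Drools API. */
final class TickDelay {

    /**
     * First clock bump (start + k * interval, k >= 0) at or after the deadline.
     * Assumes deadline >= start and interval > 0.
     */
    static long firstTickAtOrAfter(long start, long interval, long deadline) {
        long k = (deadline - start + interval - 1) / interval; // ceiling division
        return start + k * interval;
    }

    /** How long after the "true" end of the window the consequence executes. */
    static long firingDelay(long start, long interval, long deadline) {
        return firstTickAtOrAfter(start, interval, deadline) - deadline;
    }

    public static void main(String[] args) {
        // Clock bumped every 5 s from t = 0; window ends at t = 12 s.
        System.out.println(firingDelay(0, 5_000, 12_000)); // prints 3000
    }
}
```

With a fixed bump interval the delay always stays below one interval, so the interval bounds how late a window-based rule can fire.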

-W


Re: [rules-users] Snapshot of drools engine's state

IK81
Thanks for your reply. The way you described this is actually how my proof-of-concept works. However, I implemented it using two timestamps. First, I use the @timestamp annotation for a property of the event that is set by the event source. Second, I persist a second timestamp that reflects the time of the session's pseudo clock when I inserted the event. My naive feeling is that I need both for a reproducible replay starting from the last snapshot of the session.

Thanks for the hint concerning the sliding window. This is the drawback of controlling the clock externally: it limits the time granularity of these not-event-B-after-A rules. But in my case I can live with the restriction that a rule might fire some seconds late. I wish there were a way to register for a kind of callback telling me when to trigger the engine externally again; however, I haven't found anything related to that.

Thanks & best regards,
Ingo

Re: [rules-users] Snapshot of drools engine's state

laune
On 12/08/2013, IK81 <[hidden email]> wrote:
> Thanks for your reply. The way you described this is actually how my
> proof-of-concept works. However, I implemented it using two timestamps.
> First I use the @timestamp annotation for the property of the event. This
> property is set by the event source. Second, I persist a second timestamp
> that reflects the timestamp of the session's pseudo clock when I inserted
> the event. From my naive feeling I need both for a reproducible replay
> starting with the last snapshot of the session.

The (original) @timestamp property is the one used by the Engine,
right? Then the real or pseudo time of insertion is irrelevant
(except, perhaps, for logging, but then you don't need to save it in
the event).

Another issue. What is the recovery meant to cure? Application and/or
Drools bugs? In this case: what if the faithful recovery runs into the
same bug? If it is a HW glitch, redundancy might be the way to go. For
power failures, there's UPS...

If not losing a single beat is essential you are most likely not on
the right track with a system that can't claim to be free of errors,
as with any "community release". (Support solves problems only after
they become manifest.) OTOH, if you can, it might be better to think
about a way to start again with a minimum of disturbance. It depends
on the app domain, of course...

-W


Re: [rules-users] Snapshot of drools engine's state

IK81
OK, if this is the case then I am fine. I thought the engine might be interested not only in the event's timestamp but also in when the events are inserted into the system. But as already mentioned, this was just my naive impression; I haven't dug that deep into the source code yet.

The recovery is, as a first step, meant to keep the state across restarts of the application server. The next step is to allow migration of the session between different nodes of the AS cluster. If a node fails, then another node can use the last persisted state, replay the events and carry on with the processing. AFAIK there's no special support in Drools for running in a distributed fashion in a cluster. The requirements are not so hard that not missing a single beat is essential, but I'll try to avoid any loss of state in my design.

Ingo