sync: Send a hash of all database rules at beginning and end of sync #1409

russellhancox · 2024-08-01T16:36:34Z

No description provided.

pmarkowsky · 2024-08-01T17:11:04Z

Source/santad/DataLayer/SNTRuleTable.mm

+  CC_SHA256_Init(&sha);
+
+  [self inDatabase:^(FMDatabase *db) {
+    FMResultSet *rs = [db executeQuery:@"SELECT * FROM rules"];


You might need to exclude transitive rules here as it'll change your hash of hashes and the server will have no idea that these exist.

Also any rules added manually while a sync server didn't exist (though I don't think we track such thing today). However this is likely niche...

Another thought - there's been requests and talks in the past about per-user rules. It would be good to have a bit of an idea about how something like those might be represented here if they're ever implemented.

@pmarkowsky Good call, I've excluded transitive rules

@mlw If a sync server is added the first sync afterward is a clean sync so the hashes shouldn't matter. For per-user rules those would be implemented on the server side anyway, so that shouldn't matter here. The purpose of this is not necessarily to have the server calculate the hash and compare them but to store the hash sent in a postflight and then validate that the one sent in a preflight matches the stored value.

@mlw I'm assuming you're talking about the rumblings in the past that folks would like to block execution by the user e.g. root can run X but not user peterm.

That'd have to be implemented in the client but is an extension to rules similar to #1200.

Re: Hash of Hashes. Do you need to sort by identifier and policy to create a stable hash? Like if I get a rule out of order for some reason i'll end up with a different hash even though the set of policies is the same.

Do you need to sort by identifier and policy to create a stable hash? Like if I get a rule out of order for some reason i'll end up with a different hash even though the set of policies is the same.

Right now this hashes based on insertion order, which is stable on the client at least. The server could ensure this is stable by always ordering by creation timestamp but I'm not really expecting servers to calculate this in the same way as the client but instead just check the hash doesn't change during preflight.

Right now this hashes based on insertion order, which is stable on the client at least. The server could ensure this is stable by always ordering by creation timestamp but I'm not really expecting servers to calculate this in the same way as the client but instead just check the hash doesn't change during preflight.

That's good to know.

Not saying you should do this as the hash is probably the right tradeoff for now, but previously, I looked into using an xorfilter/bloomfilter rather than a hash to avoid worrying about ordering e.g. this single header one https://github.com/FastFilter/xor_singleheader.

The advantage to this is that you're creating a set membership data structure so order becomes irrelevant.

The process would just be:

Hash rules (policy, state, identifier) to a 64-bit integer (e.g. use murmur3)

Insert into the xorfilter

Serialize and ship the xorfilter to the server in postflight

Rehydrate the xorfilter on the server and then for all rules per user check if they're in the supplied xorfilter

Alternatively you could also receive a xorfilter from the server of all the expected rules and as you're scanning the Santa rule db check if those rules are supposed to be in the client DB. This would let you know exactly which rules were spurious.

I see 3 problems with that approach

Serializing and de-serializing the filter may require reimplementing the serialization - the Go implementation in particular doesn't support this

This sounds far more expensive on the server-side; what I've proposed requires saving a single extra field in postflight and doing a simple 64-byte string comparison during preflight. The fuse filter method would require looping over all rules in postflight to check that they're in the filter.

This will not detect extra rules that have been added. Reversing the process would find extra rules (which is better, imo) but then not detect missing rules.

Late reply, but I was referring to some unimplemented per-user rule feature. (I think this has been requested multiple times, but the only reference I found quickly was: #1280 )

This may not matter much, and we should certainly not overfit to some unimplemented feature. I could see the hash-of-hashes being user-agnostic and that would likely be just fine. I was just hoping to consider it early in case there were any obvious/easy tweaks to make to this draft that would better adapt to some new, more complex rule set (per user, CEL, etc.).

@mlw Gotcha. I don't think that should be an issue, whether we implement that as an extra field in a rule or via CEL, we would add the new field(s) to those included in each rules' hash.

Source/common/SNTFileInfo.mm

Source/common/SNTRule.mm

Source/santad/DataLayer/SNTRuleTable.mm

mlw · 2024-08-13T21:22:09Z

Source/santad/DataLayer/SNTRuleTable.mm

+    while ([rs next]) {
+      char digest[128];
+      uint32_t len = snprintf(
+        digest, 128, "%s:%d:%d:%u", [rs stringForColumn:@"identifier"].UTF8String,


nit: s/128/sizeof(digest)/

mlw · 2024-08-14T00:50:16Z

What do you think about adding the hash of hashes value to santactl status output? feels like it might be useful for debugging / sanity check purposes.

russellhancox added the sync service Issues related to the sync service / protocol label Aug 1, 2024

pmarkowsky reviewed Aug 1, 2024

View reviewed changes

tburgin reviewed Aug 9, 2024

View reviewed changes

Source/common/SNTFileInfo.mm Outdated Show resolved Hide resolved

Source/common/SNTRule.mm Outdated Show resolved Hide resolved

Source/santad/DataLayer/SNTRuleTable.mm Outdated Show resolved Hide resolved

russellhancox added 2 commits August 13, 2024 10:35

sync: Send a hash of all database rules at beginning and end of sync

a64952b

Feedback

9499599

russellhancox marked this pull request as ready for review August 13, 2024 14:38

russellhancox requested a review from a team as a code owner August 13, 2024 14:38

russellhancox added 5 commits August 13, 2024 11:59

Fix module compilation

c855f2a

Switch to crc32

12d3e19

Revert unrelated changes, fix test

b705907

Remove unused imports, fix lint

536de49

Switch to sprintf for digest

58bcc1e

mlw reviewed Aug 13, 2024

View reviewed changes

russellhancox marked this pull request as draft August 15, 2024 14:21

russellhancox closed this Sep 24, 2024

russellhancox deleted the rule-database-hash branch September 24, 2024 15:55

russellhancox restored the rule-database-hash branch September 24, 2024 15:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sync: Send a hash of all database rules at beginning and end of sync #1409

sync: Send a hash of all database rules at beginning and end of sync #1409

russellhancox commented Aug 1, 2024

pmarkowsky Aug 1, 2024

mlw Aug 1, 2024

mlw Aug 1, 2024

russellhancox Aug 1, 2024

pmarkowsky Aug 1, 2024

russellhancox Aug 1, 2024

pmarkowsky Aug 1, 2024

russellhancox Aug 1, 2024 •

edited

Loading

mlw Aug 2, 2024

russellhancox Aug 5, 2024

mlw Aug 13, 2024

mlw commented Aug 14, 2024

sync: Send a hash of all database rules at beginning and end of sync #1409

sync: Send a hash of all database rules at beginning and end of sync #1409

Conversation

russellhancox commented Aug 1, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

russellhancox Aug 1, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mlw commented Aug 14, 2024

russellhancox Aug 1, 2024 •

edited

Loading